ℝolliℵg M∀th Thr∑a∂

Also, the guy who inspired me to start the thread "Math & Music: The Severed Alliance. Some Recent Academic Approaches (Do Not Read If You Hate Drums)", who is a topologist at Lehman College, is playing a jazz piano gig tonight at Mezzrow. Would like to go sometime but probably not tonight.

Frankie Teardrop Explodes (James Redd and the Blecchs), Wednesday, 29 June 2016 23:20 (seven years ago) link

it is functional and a tool for stat analysis

― de l'asshole (flopson), Sunday, June 26, 2016 11:02 AM

R is not really that functional. It's a weird hybrid designed by people in an ad-hoc way. It's got amazing libraries and tooling, but as a language if you think "functional" you'll get confused after a while (but probably if you don't think 'functional' you'll get confused too -- lots of things just don't make much sense outside of 'it was easier to implement this way').

R.I.P. Haram-bae, the good posts goy (s.clover), Thursday, 30 June 2016 00:21 (seven years ago) link

Hadley Wickham says R is "at its heart, a functional programming language"

de l'asshole (flopson), Thursday, 30 June 2016 00:44 (seven years ago) link

idk it never bothered me that much but i only ever read 'Godel's Proof' by Ernest Nagel
When I studied this as an undergraduate, we spent a term going through Boolos and Jeffrey, but this Nagel book looks short and sweet.

Frankie Teardrop Explodes (James Redd and the Blecchs), Thursday, 30 June 2016 01:22 (seven years ago) link

Hadley Wickham is right and wrong in the same way as if you said that about, say, JavaScript. If you _can_ have closures, you _can_ be functional. But that's not how most libraries are written, and the language has lots else going on

http://jasp.ism.ac.jp/kinou2sg/contents/R-ism-dec-8-no-anim.pdf

http://r.cs.purdue.edu/pub/ecoop12.pdf

http://community.haskell.org/~ndm/temp/EGMitchell-ExperienceReport.pdf

R.I.P. Haram-bae, the good posts goy (s.clover), Thursday, 30 June 2016 01:34 (seven years ago) link
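
A minimal sketch of the "if you can have closures, you can be functional" point, written in Python rather than R or JavaScript. The names `make_scaler`, `compose`, and `shift_then_double` are made up for illustration and come from no library discussed in the thread; the same shape carries over to any language with first-class functions.

```python
from functools import reduce
from typing import Callable


def make_scaler(factor: float) -> Callable[[float], float]:
    # `factor` is captured by the inner function; that capture is the closure.
    def scale(x: float) -> float:
        return factor * x

    return scale


def compose(*funcs: Callable[[float], float]) -> Callable[[float], float]:
    # Right-to-left composition: compose(f, g)(x) == f(g(x)).
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


double = make_scaler(2.0)
shift_then_double = compose(double, lambda x: x + 1.0)

# Functions are passed around as values and applied with map, no explicit loop.
result = list(map(shift_then_double, [0.0, 1.0, 2.0]))
```

Nothing here says how idiomatic this style is in a given ecosystem, which is the caveat in the post above: a language can support it without most of its libraries being written that way.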

i read the first 50 pages of Zia Haider Rahman's book due to a favorable James Wood review, but found it insufferable

OK I looked at the Wikipedia page for this and I can't think of a time extravagant praise for something made it sound so terrible

Guayaquil (eephus!), Thursday, 30 June 2016 01:43 (seven years ago) link

from paper sclover linked

As a language, R is like French; it has an elegant core, but every rule comes with a set of ad-hoc exceptions that directly contradict it.

sick burn lol

and this seems like a good answer to Euler's question:

The R user community roughly breaks down into three groups. The largest groups are the end users. For them, R is mostly used interactively and R scripts tend to be short sequences of calls to prepackaged statistical and graphical routines. This group is mostly unaware of the semantics of R, they will, for instance, not know that arguments are passed by copy or that there is an object system (or two). The second, smaller and more savvy, group is made up of statisticians who have a reasonable grasp of the semantics but, for instance, will be reluctant to try S4 objects because they are “complex”. This group is responsible for the majority of R library development. The third, and smallest, group contains the R core developers who understand both R and the internals of the implementation and are thus comfortable straddling the native code boundary. One of the reasons for the success of R is that it caters to the needs of the first group, end users. Many of its features are geared towards speeding up interactive data analysis. The syntax is intended to be concise. Default arguments and partial keyword matches reduce coding effort. The lack of typing lowers the barrier to entry, as users can start working without understanding any of the rules of the language. The calling convention reduces the number of side effects and gives R a functional flavor.

de l'asshole (flopson), Thursday, 30 June 2016 02:21 (seven years ago) link

sorry 4 butchered formatting

first set of slides were incomprehensible (although i liked how the code was typeset in comic sans lol) to me and the paleontology one i didn't really get but the middle one seems spot on, from a skim by an extremely non-CS person. a lot of the R gods on SO constantly admit flaws and inconsistencies in the language due to weird implementation

de l'asshole (flopson), Thursday, 30 June 2016 02:24 (seven years ago) link

Was trying to remember earlier what mathematician had no hands and was going to post to this thread for help but then it finally came to me.

Frankie Teardrop Explodes (James Redd and the Blecchs), Thursday, 30 June 2016 02:49 (seven years ago) link

Hadley Wickham says R is "at its heart, a functional programming language"

I like Hadley. But he’s wrong. It’s iteration and selection throughout. You should use apply because R’s loop optimizations are horrible to non-existent.

I liked R. And I’ll still occasionally use R when it’s a collaborator’s preference. But I can’t imagine a student starting with R in 2016 when Python’s scientific community is so far ahead in practically every area.

Nonetheless, when someone asks for R advice, I usually tell them to read Patrick Burns’ “The R Inferno”:

http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

Allen (etaeoe), Thursday, 30 June 2016 12:21 (seven years ago) link

And nobody should ever use S4 objects! Woof!

Allen (etaeoe), Thursday, 30 June 2016 12:22 (seven years ago) link

I liked ggplot2. But Hadley’s post-ggplot2 work is a reminder of Maslow’s hammer. Your work suffers when you become too attached to a familiar tool.

And, frankly, ggplot2 feels archaic in 2016. gnuplot and matplotlib too.

Allen (etaeoe), Thursday, 30 June 2016 12:30 (seven years ago) link

"data science" is this meaninglessly general term that is starting to be usefully divided up into "product data science" (e.g. machine learning in the product) and "analytics" (e.g. decision science/business intelligence).

R is virtually useless in the first, but much more useful in the second, which is more traditional stats and batch/static reporting.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 12:31 (seven years ago) link

yes the work my wife is looking at is in "data analytics" particularly. the company she's looking at right now wants (in addition to a doctorate in math or stats, and English fluency) capacity with SQL, and R and/or Python and/or Excel. I lolled at Excel but I think that says well what they want.

droit au butt (Euler), Thursday, 30 June 2016 13:05 (seven years ago) link

"data science" is this meaninglessly general term that is starting to be usefully divided up into "product data science" (e.g. machine learning in the product) and "analytics" (e.g. decision science/business intelligence).

R is virtually useless in the first, but much more useful in the second, which is more traditional stats and batch/static reporting.

― 𝔠𝔞𝔢𝔨 (caek), Thursday, June 30, 2016 8:31 AM (1 hour ago)

i work in analytics but there's tonnes of ML in R

i used to lol at Excel when i was in school but it's the least pain in the ass way to just look at data quickly imo, which is extremely useful in the job

ty for R inferno, this is hilarious

de l'asshole (flopson), Thursday, 30 June 2016 13:50 (seven years ago) link

R has ML libraries, sure. so does javascript. they don't get used in product though.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 13:54 (seven years ago) link

what does that mean?

de l'asshole (flopson), Thursday, 30 June 2016 14:09 (seven years ago) link

as far as i've experienced, r doesn't get used as the backend for web apps, for collaborative filtering at web scale, for CNNs, etc. these are the use cases i mean when i say "product".

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 14:19 (seven years ago) link

you can probably do all those things in r (write an api, collaborative filtering, train a neural network, etc.), but i don't know anybody who does in production.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 14:23 (seven years ago) link

it doesn't seem that needs chez moi involve developing apps of any kind, that's for the developers afaict, not the analysts, but I dunno. from what I've read of R it seems silly to do development there.

droit au butt (Euler), Thursday, 30 June 2016 15:23 (seven years ago) link

i once got asked in an interview "what kind of data scientist are you" and it turned out he was getting at this product/production vs analyst distinction. i think it's real, and IME r definitely falls on one side of it in practice, and that's at least in part because of the design of the language (rather than mere social network effects). but to be clear there are tons of jobs where r is far and away the most useful language you can know.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 15:27 (seven years ago) link

yeah I mean we're just reading ads but it seems to me if you want a doctorate in math/stats then you're not just looking for a developer. but I dunno.

droit au butt (Euler), Thursday, 30 June 2016 15:29 (seven years ago) link

this is extremely reductive and misses out on tons of factors/complications, but gives a very rough idea of what's most valuable to know. valuable != necessary of course.

https://duu86o6n09pv.cloudfront.net/reports/2015-data-science-salary-survey.pdf

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 15:37 (seven years ago) link

huh that's interesting and helpful

here's a very stupid question: is there some recommended "certification" for having learned these tools, or can you just pick them up on your own and then list it on your cv/resumé ? my own CS degree is like 20 years old & I don't remember anything about that (& my wife doesn't have any CS degrees, just math, though she used Matlab a lot for her dissertation, in applied math). like what do self-trained people in these tools have to do to convince employers that they can use them? or will this come out in some test in an interview?

droit au butt (Euler), Thursday, 30 June 2016 15:51 (seven years ago) link

for data science, it's less of a problem to be a self taught coder in "tech" businesses than in more traditional business. the discipline is mature enough that there's a fairly good chance you end up being interviewed by someone who themselves has a strong quant but non-CS phd.

so, given a maths phd, i don't think further credentials are strictly necessary.

that said, there's a cottage industry of boot camps/recruitment things that make the transition quite a lot easier (and perhaps more lucrative), either by formally teaching stuff and providing credentials, providing an environment in which your "job" is to learn for a few weeks, or helping with applications/interviews. http://insightdatascience.com/ is the best known of these.

if your wife knows matlab already, then i recommend andrew ng's coursera machine learning course. it's intellectually interesting but it's also excellent interview prep. the only thing i didn't like about it was that the exercises were in matlab, because i had to waste time learning that. i put that (and a couple of other coursera courses) on my resume my first time out, but i don't think anyone noticed or cared about how i'd acquired the knowledge.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 16:01 (seven years ago) link

ok super, we'll have a look. she's got plenty of time for coursera courses; right now she's working through an O'Reilly book on R and it's going easily as expected.

droit au butt (Euler), Thursday, 30 June 2016 16:04 (seven years ago) link

(major caveat with any advice i give: my experience and network is all tech/startup, which is an unusual industry and is not where most of the jobs are, i.e. healthcare, insurance, finance, etc.)

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 16:04 (seven years ago) link

right, she's looking at the tech/startup industry in Paris, which is quite weird as you can imagine.

droit au butt (Euler), Thursday, 30 June 2016 16:11 (seven years ago) link

(though one startup in Paris last year hired more mathematicians in France than all universities in France combined, and this is the current target)

droit au butt (Euler), Thursday, 30 June 2016 16:12 (seven years ago) link

caek does your ilxmail work? my wife has questions for you if you'd be willing.

droit au butt (Euler), Thursday, 30 June 2016 16:44 (seven years ago) link

i read this book

http://www-bcf.usc.edu/~gareth/ISL/

which does all the examples in R. the methods are outdated but perfect for getting the intuition, and the big themes (e.g. the bias-variance tradeoff) are really well-developed. it's extremely easy and i got through it in a week. it's the baby version (created for an MBA class iirc) of Elements Of Statistical Learning, which i'm reading now

de l'asshole (flopson), Thursday, 30 June 2016 17:03 (seven years ago) link
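
For reference, the bias-variance tradeoff mentioned above is usually stated as a decomposition of expected squared test error. The sketch below assumes the standard setup $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$; the symbols $\hat{f}$, $x_0$, and $y_0$ are notation introduced here, not taken from the thread.

```latex
\mathbb{E}\!\left[\bigl(y_0 - \hat{f}(x_0)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}\bigl(\hat{f}(x_0)\bigr)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The expectation is taken over training sets and the noise in $y_0$. More flexible models tend to lower the first term and raise the second, which is the tradeoff.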

i hear v good things about ESL and ISL

euler i think so, and sure!

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 17:05 (seven years ago) link

you can probably do all those things in r (write an api, collaborative filtering, train a neural network, etc.), but i don't know anybody who does in production.

ha, having said that, i saw on twitter this talk is happening today

http://schedule.user2016.org/event/7Sq2/gradient-boosted-trees-model-deploying-r-models-into-production-environments

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 18:05 (seven years ago) link

Allen/etaeoe, what's your favorite plotting library (in any language) right now?

I use ggplot2 all day every day, and while I try to keep my eye on new developments, I haven't yet found anything else yet that lets me get what's in my mind's eye onto a realized plot as quickly and easily. Lately I've been using Plotly with it, and wrapping ggplots in ggplotly() for some quick and easy interactivity (zooming, tool tips, etc)

Dan I., Thursday, 30 June 2016 18:59 (seven years ago) link

-1 "yet"

Dan I., Thursday, 30 June 2016 19:00 (seven years ago) link

Another good applied intro-level book along the lines of ISL is Max Kuhn's Applied Predictive Modeling, which gets into some hairier stuff that other sources tend to skip like how to deal with extreme class imbalance. He also touches on response surface methodology and multiobjective optimization, which is potentially so useful but I never see anybody else talking about (then again I don't come from an engineering background). Again, though, the book is R-based, so don't read it if you hate R.

Dan I., Thursday, 30 June 2016 20:47 (seven years ago) link

Allen/etaeoe, what's your favorite plotting library (in any language) right now?

“It’s complicated.”

Typically, I use visualizations either as descriptive statistics or as figures.

When I need a descriptive statistic (e.g. histogram, Q-Q, or scatter), I’ll continue to use Seaborn from Python and ggplot2 from R. I find them too verbose. Especially when compared to R’s default plotting functions. But they work.

When I need a figure, I’ll use D3 to render an SVG suitable for publication. I’ve tried Cytoscape too. If a figure is computationally expensive to render (e.g. more than one hundred thousand observations), I’ll use SVG or WebGL directly.

I’ve used TikZ too. It works.

Everything I’ve mentioned feels inadequate. When I used ggplot2 (matplotlib too) in 2005, it was a major revelation. TikZ too. However, it’s been an insane decade for mathematics and statistics. 2005’s tools feel way too limiting for the ideas I want to express in 2016.

Conceptually, D3 is fantastic. And Mike Bostock has been a champion for articulating the transition we’re undergoing. Unfortunately, I don’t think D3 should become the default option. It feels antithetical to both standard and emerging web technologies. And it’s isolated from the larger web ecosystem (e.g. D3 uses custom selection and data-binding operations).

I think Plot.ly’s Plotly.js library is sensible as a curated collection of D3 visualizations. But venture-backed visualization software makes me nervous.

I also feel burdened by the lack of contemporary visualization tools for common problems (e.g. volumetric images).

Allen (etaeoe), Sunday, 3 July 2016 20:45 (seven years ago) link

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

Tarzan v. BMI (James Redd and the Blecchs), Sunday, 3 July 2016 20:47 (seven years ago) link

Unless you are using it to calculate Catalan numbers, of course:)

Tarzan v. BMI (James Redd and the Blecchs), Sunday, 3 July 2016 20:54 (seven years ago) link

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

Yeah. Someone should start a “statistics” (or “data science” or whatever) thread.

Allen (etaeoe), Sunday, 3 July 2016 20:59 (seven years ago) link

Unless you are using it to calculate Catalan numbers, of course:)

Or,

http://i.stack.imgur.com/ceazj.png

Allen (etaeoe), Sunday, 3 July 2016 21:00 (seven years ago) link

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

― Tarzan v. BMI (James Redd and the Blecchs), Sunday, July 3, 2016 3:47 PM (Yesterday)

Unless you are using it to calculate Catalan numbers, of course:)

― Tarzan v. BMI (James Redd and the Blecchs), Sunday, July 3, 2016 3:54 PM (Yesterday)

Maybe Catalan numbers should have their own thread, they rule

Guayaquil (eephus!), Monday, 4 July 2016 19:36 (seven years ago) link
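
Since Catalan numbers came up: a short Python sketch of two standard ways to compute them, the closed form $C_n = \binom{2n}{n}/(n+1)$ and the convolution recurrence $C_{n+1} = \sum_{i=0}^{n} C_i C_{n-i}$ with $C_0 = 1$. The function names are made up for illustration; `math.comb` needs Python 3.8 or later.

```python
from math import comb
from typing import List


def catalan(n: int) -> int:
    # Closed form: C_n = binom(2n, n) / (n + 1). The division is always exact.
    return comb(2 * n, n) // (n + 1)


def catalan_by_recurrence(count: int) -> List[int]:
    # Build C_0 .. C_{count-1} from C_{n+1} = sum_i C_i * C_{n-i}.
    values = [1]
    for n in range(count - 1):
        values.append(sum(values[i] * values[n - i] for i in range(n + 1)))
    return values[:count]


first_ten = [catalan(n) for n in range(10)]
```

The two routes agree term by term, which makes a cheap sanity check when implementing either one.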

They definitely have their own book or two.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 19:56 (seven years ago) link

During the "grande affaire" of the early twentieth-century debate on the Theory of Relativity between Albert Einstein and Henri Bergson, Paul Valéry, the French poet, diarist, and general man of ideas and letters, who corresponded with both on friendly terms, acted as a middleman on at least one occasion, accompanying Einstein on a visit in 1922 to Bergson's home.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 20:08 (seven years ago) link

Ha, wrong thread, mostly.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 20:12 (seven years ago) link

i'm against a separate 'data science' thread via apprehension of other ilxors posting their 'opinions' on it. everyone except us seems to ignore this one B-)

de l'asshole (flopson), Monday, 4 July 2016 20:54 (seven years ago) link

ive successfully avoided doing just that so far fwiw :/

( ^_^) (Lamp), Monday, 4 July 2016 21:30 (seven years ago) link

RIP Kalman. almost broke my brain trying to understand your filter in time-series stats class :-)

http://hungarytoday.hu/news/renowned-hungarian-scientis-rudolf-kalman-dies-aged-86-46732

de l'asshole (flopson), Friday, 8 July 2016 16:00 (seven years ago) link

RIP

Hare in the Gated Snare (James Redd and the Blecchs), Saturday, 9 July 2016 01:06 (seven years ago) link

My "aha" moment in getting the Kalman filter was when deriving a simple version of it myself as a special case of Bayes' theorem, iirc.

anatol_merklich, Monday, 18 July 2016 08:51 (seven years ago) link
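
A minimal sketch of that idea, assuming the simplest possible case: a scalar state with a Gaussian prior, observed directly with Gaussian noise. This is not necessarily the derivation the poster had in mind, and all function and parameter names here are made up for illustration. Under those assumptions the Kalman measurement update and the Bayes posterior for a Gaussian prior times a Gaussian likelihood give the same mean and variance.

```python
import math
from typing import Tuple


def kalman_update(prior_mean: float, prior_var: float,
                  obs: float, obs_var: float) -> Tuple[float, float]:
    # Textbook scalar measurement update with observation model y = x + noise.
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def bayes_gaussian_posterior(prior_mean: float, prior_var: float,
                             obs: float, obs_var: float) -> Tuple[float, float]:
    # Bayes' theorem for Gaussian prior and likelihood: precisions add,
    # and the posterior mean is the precision-weighted average.
    post_precision = 1.0 / prior_var + 1.0 / obs_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


# Prior N(0, 4), one observation y = 2 with noise variance 1.
kalman_result = kalman_update(0.0, 4.0, 2.0, 1.0)
bayes_result = bayes_gaussian_posterior(0.0, 4.0, 2.0, 1.0)
same = all(math.isclose(a, b) for a, b in zip(kalman_result, bayes_result))
```

The full filter adds a prediction step between updates (propagating the mean and inflating the variance by the process noise), but the update itself is just this posterior computation repeated as each observation arrives.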

