ℝolliℵg M∀th Thr∑a∂


i once got asked in an interview "what kind of data scientist are you" and it turned out he was getting at this product/production vs analyst distinction. i think it's real, and IME r definitely falls on one side of it in practice, and that's at least in part because of the design of the language (rather than mere social network effects). but to be clear there are tons of jobs where r is far and away the most useful language you can know.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 15:27 (nine years ago)

yeah I mean we're just reading ads but it seems to me if you want a doctorate in math/stats then you're not just looking for a developer. but I dunno.

droit au butt (Euler), Thursday, 30 June 2016 15:29 (nine years ago)

this is extremely reductive and misses out on tons of factors/complications, but gives a very rough idea of what's most valuable to know. valuable != necessary of course.

https://duu86o6n09pv.cloudfront.net/reports/2015-data-science-salary-survey.pdf

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 15:37 (nine years ago)

huh that's interesting and helpful

here's a very stupid question: is there some recommended "certification" for having learned these tools, or can you just pick them up on your own and then list it on your cv/resumé ? my own CS degree is like 20 years old & I don't remember anything about that (& my wife doesn't have any CS degrees, just math, though she used Matlab a lot for her dissertation, in applied math). like what do self-trained people in these tools have to do to convince employers that they can use them? or will this come out in some test in an interview?

droit au butt (Euler), Thursday, 30 June 2016 15:51 (nine years ago)

for data science, it's less of a problem to be a self taught coder in "tech" businesses than in more traditional businesses. the discipline is mature enough that there's a fairly good chance you end up being interviewed by someone who themselves has a strong quant but non-CS phd.

so, given a maths phd, i don't think further credentials are strictly necessary.

that said, there's a cottage industry of boot camps/recruitment things that make the transition quite a lot easier (and perhaps more lucrative), either by formally teaching stuff and providing credentials, providing an environment in which your "job" is to learn for a few weeks, or helping with applications/interviews. http://insightdatascience.com/ is the best known of these.

if your wife knows matlab already, then i recommend andrew ng's coursera machine learning course. it's intellectually interesting but it's also excellent interview prep. the only thing i didn't like about it was that the exercises were in matlab, because i had to waste time learning that. i put that (and a couple of other coursera courses) on my resume my first time out, but i don't think anyone noticed or cared about how i'd acquired the knowledge.

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 16:01 (nine years ago)

ok super, we'll have a look. she's got plenty of time for coursera courses; right now she's working through an O'Reilly book on R and it's going easily as expected.

droit au butt (Euler), Thursday, 30 June 2016 16:04 (nine years ago)

(major caveat with any advice i give: my experience and network is all tech/startup, which is an unusual industry and is not where most of the jobs are, i.e. healthcare, insurance, finance, etc.)

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 16:04 (nine years ago)

right, she's looking at the tech/startup industry in Paris, which is quite weird as you can imagine.

droit au butt (Euler), Thursday, 30 June 2016 16:11 (nine years ago)

(though one startup in Paris last year hired more mathematicians in France than all universities in France combined, and this is the current target)

droit au butt (Euler), Thursday, 30 June 2016 16:12 (nine years ago)

caek does your ilxmail work? my wife has questions for you if you'd be willing.

droit au butt (Euler), Thursday, 30 June 2016 16:44 (nine years ago)

i read this book

http://www-bcf.usc.edu/~gareth/ISL/

which does all the examples in R. the methods are outdated but perfect for getting the intuition, and the big themes (e.g. the bias-variance tradeoff) are really well-developed. it's extremely easy and i got through it in a week. it's the baby version (created for an MBA class iirc) of Elements Of Statistical Learning, which i'm reading now

de l'asshole (flopson), Thursday, 30 June 2016 17:03 (nine years ago)

i hear v good things about ESL and ISL

euler i think so, and sure!

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 17:05 (nine years ago)

you can probably do all those things in r (write an api, collaborative filtering, train a neural network, etc.), but i don't know anybody who does in production.

ha, having said that, i saw on twitter this talk is happening today

http://schedule.user2016.org/event/7Sq2/gradient-boosted-trees-model-deploying-r-models-into-production-environments

𝔠𝔞𝔢𝔨 (caek), Thursday, 30 June 2016 18:05 (nine years ago)

Allen/etaeoe, what's your favorite plotting library (in any language) right now?

I use ggplot2 all day every day, and while I try to keep my eye on new developments, I haven't yet found anything else yet that lets me get what's in my mind's eye onto a realized plot as quickly and easily. Lately I've been using Plotly with it, and wrapping ggplots in ggplotly() for some quick and easy interactivity (zooming, tool tips, etc)

Dan I., Thursday, 30 June 2016 18:59 (nine years ago)

-1 "yet"

Dan I., Thursday, 30 June 2016 19:00 (nine years ago)

Another good applied intro-level book along the lines of ISL is Max Kuhn's Applied Predictive Modeling, which gets into some hairier stuff that other sources tend to skip like how to deal with extreme class imbalance. He also touches on response surface methodology and multiobjective optimization, which is potentially so useful but I never see anybody else talking about (then again I don't come from an engineering background). Again, though, the book is R-based, so don't read it if you hate R.

Dan I., Thursday, 30 June 2016 20:47 (nine years ago)

Allen/etaeoe, what's your favorite plotting library (in any language) right now?

“It’s complicated.”

Typically, I use visualizations either as descriptive statistics or as figures.

When I need a descriptive statistic (e.g. histogram, Q-Q, or scatter), I’ll continue to use Seaborn from Python and ggplot2 from R. I find them too verbose. Especially when compared to R’s default plotting functions. But they work.

When I need a figure, I’ll use D3 to render an SVG suitable for publication. I’ve tried Cytoscape too. If a figure is computationally expensive to render (e.g. more than one hundred thousand observations), I’ll use SVG or WebGL directly.

I’ve used TikZ too. It works.

Everything I’ve mentioned feels inadequate. When I used ggplot2 (matplotlib too) in 2005, it was a major revelation. TikZ too. However, it’s been an insane decade for mathematics and statistics. 2005’s tools feel way too limiting for the ideas I want to express in 2016.

Conceptually, D3 is fantastic. And Mike Bostock has been a champion for articulating the transition we’re undergoing. Unfortunately, I don’t think D3 should become the default option. It feels antithetical to both standard and emerging web technologies. And it’s isolated from the larger web ecosystem (e.g. D3 uses custom selection and data-binding operations).

I think Plot.ly’s Plotly.js library is sensible as a curated collection of D3 visualizations. But venture-backed visualization software makes me nervous.

I also feel burdened by the lack of contemporary visualization tools for common problems (e.g. volumetric images).

Allen (etaeoe), Sunday, 3 July 2016 20:45 (nine years ago)

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

Tarzan v. BMI (James Redd and the Blecchs), Sunday, 3 July 2016 20:47 (nine years ago)

Unless you are using to calculate Catalan numbers, of course:)

Tarzan v. BMI (James Redd and the Blecchs), Sunday, 3 July 2016 20:54 (nine years ago)

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

Yeah. Someone should start a “statistics” (or “data science” or whatever) thread.

Allen (etaeoe), Sunday, 3 July 2016 20:59 (nine years ago)

Unless you are using to calculate Catalan numbers, of course:)

Or,

http://i.stack.imgur.com/ceazj.png

Allen (etaeoe), Sunday, 3 July 2016 21:00 (nine years ago)

Don't want to appear uncharitable, but feel like this software angle should perhaps have its own thread.

― Tarzan v. BMI (James Redd and the Blecchs), Sunday, July 3, 2016 3:47 PM (Yesterday)

Unless you are using to calculate Catalan numbers, of course:)

― Tarzan v. BMI (James Redd and the Blecchs), Sunday, July 3, 2016 3:54 PM (Yesterday)

Maybe Catalan numbers should have their own thread, they rule

Guayaquil (eephus!), Monday, 4 July 2016 19:36 (nine years ago)

They definitely have their own book or two.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 19:56 (nine years ago)

During the "grande affaire" of the earlier twentieth century debate on The Theory of Relativity between Albert Einstein and Henri Bergson, Paul Valéry, the French poet, diarist, and general man of ideas and letters, who corresponded with both on friendly terms, acted as a middleman on at least one occasion, accompanying Einstein on a visit in 1922 to Bergson's home.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 20:08 (nine years ago)

Ha, wrong thread, mostly.

My City Slang Was Gone (James Redd and the Blecchs), Monday, 4 July 2016 20:12 (nine years ago)

i'm against a separate 'data science' thread via apprehension of other ilxors posting their 'opinions' on it. everyone except us seems to ignore this one B-)

de l'asshole (flopson), Monday, 4 July 2016 20:54 (nine years ago)

ive successfully avoided doing just that so far fwiw :/

( ^_^) (Lamp), Monday, 4 July 2016 21:30 (nine years ago)

RIP Kalman. almost broke my brain trying to understand your filter in time-series stats class :-)

http://hungarytoday.hu/news/renowned-hungarian-scientis-rudolf-kalman-dies-aged-86-46732

de l'asshole (flopson), Friday, 8 July 2016 16:00 (nine years ago)

RIP

Hare in the Gated Snare (James Redd and the Blecchs), Saturday, 9 July 2016 01:06 (nine years ago)

My "aha" moment in getting the Kalman filter was when deriving a simple version of it myself as a special case of the Bayes theorem, iirc.

anatol_merklich, Monday, 18 July 2016 08:51 (nine years ago)

can you show us?

de l'asshole (flopson), Monday, 18 July 2016 17:09 (nine years ago)

Proof is left to the readers.

Death of a Disco Mystic (James Redd and the Blecchs), Monday, 18 July 2016 20:17 (nine years ago)

The proof is obvious

Miami Jeeves And The Ties That Bind (James Redd and the Blecchs), Monday, 18 July 2016 20:22 (nine years ago)

Or is it?

Miami Jeeves And The Ties That Bind (James Redd and the Blecchs), Monday, 18 July 2016 20:23 (nine years ago)

*leaves thread*

Miami Jeeves And The Ties That Bind (James Redd and the Blecchs), Monday, 18 July 2016 20:23 (nine years ago)

*time passes*

Miami Jeeves And The Ties That Bind (James Redd and the Blecchs), Monday, 18 July 2016 20:23 (nine years ago)

Yes, it's obvious

Miami Jeeves And The Ties That Bind (James Redd and the Blecchs), Monday, 18 July 2016 20:24 (nine years ago)

Been a long time, I'll see if I can reproduce the aha. :-)

anatol_merklich, Tuesday, 19 July 2016 06:06 (nine years ago)
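
For anyone who wants to try the derivation mentioned above, here is a minimal sketch of the scalar case, not necessarily the version the poster had in mind. It assumes a one-dimensional random-walk state with Gaussian noise; the function names and the random-walk dynamics are illustrative assumptions, not anything from the thread. The update step is just Bayes' theorem for a Gaussian prior and a Gaussian likelihood, and the "Kalman gain" falls out as the weight on the new observation.

```python
def bayes_update(prior_mean, prior_var, y, obs_var):
    """Posterior over x given y = x + v, v ~ N(0, obs_var), prior x ~ N(prior_mean, prior_var).

    Multiplying the Gaussian prior by the Gaussian likelihood and completing
    the square gives another Gaussian with the mean and variance below.
    """
    gain = prior_var / (prior_var + obs_var)          # the scalar Kalman gain
    post_mean = prior_mean + gain * (y - prior_mean)  # shift toward the observation
    post_var = (1.0 - gain) * prior_var               # same as 1 / (1/prior_var + 1/obs_var)
    return post_mean, post_var


def predict(mean, var, process_var):
    """Assumed dynamics: x_t = x_{t-1} + w, w ~ N(0, process_var)."""
    return mean, var + process_var


def kalman_filter(observations, prior_mean, prior_var, process_var, obs_var):
    """Alternate predict and Bayes-update steps over a sequence of observations."""
    mean, var = prior_mean, prior_var
    estimates = []
    for y in observations:
        mean, var = predict(mean, var, process_var)
        mean, var = bayes_update(mean, var, y, obs_var)
        estimates.append((mean, var))
    return estimates
```

With equal prior and observation variances the gain is 1/2, so the posterior mean lands halfway between the prior mean and the observation, which is the "aha" in miniature.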

https://twitter.com/AnalysisFact

flopson, Wednesday, 20 July 2016 16:53 (nine years ago)

http://www.johndcook.com/blog/twitter_page/

flopson, Wednesday, 20 July 2016 16:55 (nine years ago)

Euler buy this for your wife for xmas ;-) http://r4ds.had.co.nz/introduction-1.html

flopson, Friday, 22 July 2016 21:12 (nine years ago)

looks good!

droit au butt (Euler), Saturday, 23 July 2016 15:46 (nine years ago)

Still pondering making a mod request to delete my miscalculation of the first few Catalan numbers.

The New Original Human Beatbox (James Redd and the Blecchs), Saturday, 30 July 2016 03:55 (nine years ago)
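
For reference, the first few Catalan numbers are easy to check with a short script. This is a sketch using the standard closed form C_n = C(2n, n) / (n + 1); the function name `catalan` is just an illustrative choice, and `math.comb` needs Python 3.8 or later.

```python
from math import comb


def catalan(n):
    """n-th Catalan number via the closed form C(2n, n) / (n + 1)."""
    # The division is always exact, so integer division is safe here.
    return comb(2 * n, n) // (n + 1)


first_ten = [catalan(n) for n in range(10)]
print(first_ten)  # [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```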

one month passes...

FYI Artificial intelligence still has some way to go

Allen (etaeoe), Monday, 26 September 2016 19:25 (nine years ago)

https://www.youtube.com/watch?v=b0HzWMqLeiE

Berberian Begins at Home (James Redd and the Blecchs), Monday, 26 September 2016 19:29 (nine years ago)

one month passes...

In a lecture that started talking about https://en.m.wikipedia.org/wiki/Classical_Wiener_space#Classical_Wiener_measure

And finding it very hard not to bust out giggling

the klosterman weekend (s.clover), Wednesday, 23 November 2016 21:13 (nine years ago)

hope you don't run into Tits groups then

droit au butt (Euler), Wednesday, 23 November 2016 21:21 (nine years ago)

Nah this is way funnier

the klosterman weekend (s.clover), Wednesday, 23 November 2016 21:59 (nine years ago)

I asked upthread for a readable analysis book and I chanced upon a pretty good one somehow. I'm reading Abbott, Understanding Analysis just for comprehension and it's going pretty well, in that a bunch of half-remembered things from high school math suddenly seem important in light of the careful construction of R and proofs about sequences and limits. On to continuity.

slathered in cream and covered with stickers (silby), Monday, 28 November 2016 05:50 (nine years ago)

yeah, i had high school math flashbacks when i took intro analysis. "wait, haven't i done this before? oh wait, it all fits together."

Einstein, Kazanga, Sitar (abanana), Monday, 28 November 2016 06:25 (nine years ago)

