Last post I talked about how data scientists probably ought to spend some time talking about optimization (but not too much time - I need topics for my blog posts!). While I provided a basic optimization example in that post, that may not have been so interesting, and there definitely wasn't any machine learning involved.
Aug 30, 2016
Jul 20, 2016
You've studied machine learning, you're a dataframe master for massaging data, and you can easily pipe that data through a bunch of machine learning libraries.
Jan 09, 2016
In my last post, I described user- and item-based collaborative filtering, which are some of the simplest recommendation algorithms. For someone who is used to conventional machine learning classification and regression algorithms, collaborative filtering may have felt a bit off. To me, machine learning almost always deals with some function which we are trying to maximize or minimize. In simple linear regression, we minimize the mean squared distance between our predictions and the true values. Logistic regression involves maximizing a likelihood function. However, in my post on collaborative filtering, we randomly tried a bunch of different parameters (distance function, top-k cutoff) and watched what happened to the mean squared error. This sure doesn't feel like machine learning.
Nov 02, 2015
I've written before about how much I enjoyed Andrew Ng's Coursera Machine Learning course. However, I also mentioned that I thought the course was lacking a bit in the area of recommender systems. After learning basic models for regression and classification, recommender systems likely complete the triumvirate of machine learning pillars for data science.
Oct 06, 2015
This is the final part in my series on going from PhD to Data Science (parts I and II). As previously mentioned, while I was demoing my Insight project at companies, I also spent a good bit of time studying for interviews. The technical areas of study for interviews can …
Sep 29, 2015
Welcome to Part II of my journey from academic to industry data scientist. In my previous post, I wrote of my preparation leading up to the application to Insight Data Science. I will now talk about the Insight application process, the actual program, and demoing my project at companies. I …
Sep 23, 2015
The internet is awash with posts by former PhD students who have successfully transitioned into data scientist roles in industry (see here, here, here, and tangentially here). I loved reading these posts while studying for job interviews because I felt like the more I saw examples of successful transitions, the …
Nov 25, 2014
I think this post will probably conclude my Festival Chatter series on analyzing Bonnaroo tweets in Python (part 1, part 2, part 3). I've had a lot of fun messing around with this dataset, but I think it's time to move on to playing with something else. For this last …
Oct 06, 2014
In this series of posts (part 1, part 2), I have been showing how to use Python and other data scientist tools to analyze a collection of tweets related to the 2014 Bonnaroo Music and Arts Festival. So far, the investigation has been limited to summary data of the full …
Sep 09, 2014
In my previous post, I wrote about how I collected tweets about the Bonnaroo Music and Arts Festival during the entirety of the festival. There is a wide range of questions that could be answered by this dataset, like
- Do people spell worse as they become more intoxicated throughout the …