# Data Piques

## Quick and Dirty Serverless Integer Programming

We all know that Python has risen above its humble beginnings such that it now powers billion-dollar companies. Let's not forget Python's roots, though! It's still an excellent language for running quick and dirty scripts that automate some task. While this works fine for automating my own tasks because I know how to navigate the command line, it's a bit much to ask a layperson to somehow install Python and dependencies, open Terminal on a Mac (god help you if they have a Windows computer), type a random string of characters, and hit enter. Ideally, you would give the layperson a button, they hit it, and they get their result.

## Time Series for scikit-learn People (Part II): Autoregressive Forecasting Pipelines

In this post, I will walk through how to use my new library skits for building scikit-learn pipelines to fit, predict, and forecast time series data.

## Time Series for scikit-learn People (Part I): Where's the X Matrix?

When I first started to learn about machine learning, specifically supervised learning, I eventually felt comfortable with taking some input $\mathbf{X}$, and determining a function $f(\mathbf{X})$ that best maps $\mathbf{X}$ to some known output value $y$. Separately, I dove a little into time series analysis and thought of this as a completely different paradigm. In time series, we don't think of things in terms of features or inputs; rather, we have the time series $y$, and $y$ alone, and we look at previous values of $y$ to predict future values of $y$.
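One way to connect the two views is to turn the series itself into an input matrix: each row holds the few values that precede a target value. The following is only a minimal sketch in plain Python. The helper name `make_lagged_matrix` and the toy series are hypothetical and chosen for illustration; they are not taken from the post.

```python
def make_lagged_matrix(y, n_lags):
    """Return (X, target), where each row of X holds the n_lags values before its target."""
    if n_lags < 1 or n_lags >= len(y):
        raise ValueError("n_lags must be between 1 and len(y) - 1")
    # Row for time t contains y[t - n_lags], ..., y[t - 1]; the target is y[t].
    X = [list(y[t - n_lags:t]) for t in range(n_lags, len(y))]
    target = [y[t] for t in range(n_lags, len(y))]
    return X, target


# Toy series, assumed purely for illustration.
series = [10, 12, 14, 13, 15, 17]
X, target = make_lagged_matrix(series, n_lags=2)

for row, value in zip(X, target):
    print(row, "->", value)
```

With the series framed this way, the rows of `X` play the role of $\mathbf{X}$ and `target` plays the role of $y$, so any supervised regressor could in principle be fit on the pair.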

## Matrix Factorization in PyTorch

Hey, remember when I wrote those ungodly long posts about matrix factorization chock-full of gory math? Good news! You can forget it all. We have now entered the Era of Deep Learning, and automatic differentiation shall be our guiding light.

## From Analytical to Numerical to Universal Solutions

I've been making my way through the recently released Deep Learning textbook (which is absolutely excellent), and I came upon the section on Universal Approximation Properties. The Universal Approximation Theorem (UAT) essentially proves that neural networks are capable of approximating any continuous function (subject to some constraints and with upper …

## Rec-a-Sketch: a Flask App for Interactive Sketchfab Recommendations

After the long series of previous posts describing various recommendation algorithms using Sketchfab data, I decided to build a website called Rec-a-Sketch which visualizes the different algorithms' recommendations. In this post, I'll describe the process of getting this website up and running on AWS with nginx and gunicorn.

## Using Keras' Pretrained Neural Networks for Visual Similarity Recommendations

To close out our series on building recommendation models using Sketchfab data, I will venture far from the previous posts' factorization-based methods and instead explore an unsupervised, deep learning-based model. You'll find that the implementation is fairly simple and the results are remarkably promising, which is almost a smack in the face to all of that effort put in earlier.

## Learning to Rank Sketchfab Models with LightFM

In this post we're going to do a bunch of cool things following up on the last post introducing implicit matrix factorization. We're going to explore Learning to Rank, a different method for implicit matrix factorization, and then use the library LightFM to incorporate side information into our recommender. Next, we'll use scikit-optimize to be smarter than grid search for cross validating hyperparameters. Lastly, we'll see that we can move beyond simple user-to-item and item-to-item recommendations now that we have side information embedded in the same space as our users and items. Let's go!

## Intro to Implicit Matrix Factorization: Classic ALS with Sketchfab Models

Last post I described how I collected implicit feedback data from the website Sketchfab. I then claimed I would write about how to actually build a recommendation system with this data. Well, here we are! Let's build.

## Likes Out! Guerilla Dataset!

tl;dr -> I collected an implicit feedback dataset along with side-information about the items. This dataset contains around 62,000 users and 28,000 items. All the data lives here inside of this repo. Enjoy!
