Fit Predict #3: At this point, what's not written in Rust?

In this issue of the Fit Predict newsletter, we look at a Python linter written in Rust, Copilot extensions for VSCode, deep learning search ranking, and more.

Stephen Allwright
📥 What is Fit Predict?
This is a light-hearted overview of what's been going on in the world of Data Science this week. See it as your 5-minute update so that you can sound at least slightly knowledgeable at your next coffee chat ☕

Have you been forwarded this? You can subscribe here!

Hey there,

My past week or so has primarily been spent making improvements to a new customer behaviour prediction model. Whilst it's been fun, it has become a classic case of scope creep. What started out as a simple bug fix ended up being a two-week-long overhaul.

Whoops. It happened again.

So, my advice to you: if you ever find yourself thinking "oh, let me just add this quickly", put the keyboard down, go for a walk, and really think long and hard about your life choices.

I'm happy to say that the changes I made have improved model performance, but at what cost, I ask you... what cost?!

Anyway, on with the show!

🧰 Tools

The tools that will make your life that little bit easier, or at least more interesting... but either way it's fun to play with new toys.

Code Brushes

This addition to Copilot's VSCode extension helps you to modify your code in a similar way to working in a tool like Photoshop.

Ruff linter

An extremely fast Python linter, written in... you guessed it, Rust! Like everything is these days.

Kangas

This Python package will help you explore multimedia datasets.

πŸ§‘β€πŸ”¬ In practice

Stories of those who are genuinely implementing Data Science. Step aside Titanic dataset, this is the real deal.

Deep learning search ranking at Etsy

Etsy moved from a boosting model for their search rankings over to a deep learning model. This blog post is an insightful and honest look at why they made the switch, how they did it, and what the results were.

LinkedIn's feature store

LinkedIn open-sourced the feature store that they use to develop their machine learning models. In this post, they explain why they built it and how it works, which is a great reminder of how speeding up the basics of machine learning development can pay dividends as you scale.

🐦 The best of Data Twitter

Data Twitter is the best Twitter.

You can't "train" a model.

The model always exists.

It existed before you were born and it exists after your death.

You can only find the model.

"Training" is just your way of looking for the model's location in the infinite hypothesis space and binding its essence to silicon.

from @ChristophMolnar

Hello, I've been using Matplotlib for 7+ years and I still google how to do everything except plt.plot()

from @marktenenholtz

worriedly waiting until they're going to rewrite me in rust

from @vboykis

Every single day.

from @rabaath

💭 Thought-provoking

Content to inspire, or at the very least keep you informed.

I've personally never looked at a SQL file and thought it told a compelling story, but after reading this I think I might start to.

SQL Tells a Human Story - Little Miss Data
Read through your most complicated SQL files to learn the human story of how your organization functions.

What is a realistic goal or target? This is a difficult question to answer as a data professional, but this post offers a solution.

Demetri Pananos Ph.D - Forecasting Experimental Lift Using Hierarchical Bayesian Modelling
You're part of a team at a company who is tasked with improving conversion on some web page.

🔧 Updates

Did you know that your favourite Python packages actually get updated regularly and you should update your requirements.txt file?

A few other minor releases to be aware of:


💬 Enjoyed this issue? Share it

🔗 stephenallwright.com/newsletter-issue-3

Stephen Allwright

I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.