Fit Predict #5: Pubs are the conference centres of the future
In this issue of the Fit Predict newsletter, we look at Approximate Nearest Neighbors, focussed work, how Netflix chooses personalised artwork, and more.
This is a light-hearted overview of what's been going on in the world of Data Science this week. Think of it as your 5-minute update, so that you can sound at least slightly knowledgeable at your next coffee chat ☕
Have you been forwarded this? You can subscribe here!
Hey there,
Have you ever had a work conference in a pub? 🍺
No?
Well, I'm happy to report that I now have.
Last week was the aforementioned company "kick-off", and it was held in a stand-up comedy pub here in Oslo. There was something quite cosy about sitting on a stool at a bar counter, whilst listening to a talk about product strategy for our app.
Scandinavian society at work, I guess.
I have a feeling this will be one of those stories I tell people in five years when the company is five times the size. I'll sit around, telling all the new joiners about the "good old days", when we had our meetings in a pub surrounded by posters from the 50s advertising "beer for children".
Ah yes, nostalgia.
🧰 Tools
The tools that will make your life that little bit easier, or at least more interesting... but either way it's fun to play with new toys.
Annoy is an open-source project from Spotify for Approximate Nearest Neighbors search. It's written in C++ with Python bindings and optimized for memory usage.
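If you're wondering what the "approximate" part buys you, here is a minimal sketch of the general idea in plain Python: split the data with random hyperplanes into a small forest of trees, then only compare the query against the points that share a leaf with it. This is not Annoy's code or its API. Every function name below (`build_tree`, `query_tree`, `approx_nearest`) is made up for illustration, and the leaf size and number of trees are arbitrary choices.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_tree(points, ids, rng, leaf_size=8):
    """Recursively split ids with random hyperplanes until leaves are small."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    # Pick two points at random; the splitting plane sits halfway between them.
    i, j = rng.sample(ids, 2)
    p, q = points[i], points[j]
    normal = [a - b for a, b in zip(p, q)]
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]
    offset = dot(normal, midpoint)
    left = [k for k in ids if dot(normal, points[k]) <= offset]
    right = [k for k in ids if dot(normal, points[k]) > offset]
    if not left or not right:  # degenerate split (e.g. duplicate points)
        return {"leaf": ids}
    return {
        "normal": normal,
        "offset": offset,
        "left": build_tree(points, left, rng, leaf_size),
        "right": build_tree(points, right, rng, leaf_size),
    }


def query_tree(tree, point):
    """Walk down to the leaf that the point falls into."""
    while "leaf" not in tree:
        side = "left" if dot(tree["normal"], point) <= tree["offset"] else "right"
        tree = tree[side]
    return tree["leaf"]


def approx_nearest(forest, points, query, n=3):
    """Collect candidates from every tree, then rank only those by distance."""
    candidates = set()
    for tree in forest:
        candidates.update(query_tree(tree, query))
    return sorted(candidates, key=lambda k: sq_dist(points[k], query))[:n]


rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
forest = [build_tree(points, list(range(len(points))), rng) for _ in range(10)]
neighbours = approx_nearest(forest, points, points[42], n=3)
print(neighbours)
```

The trade-off is that a true neighbour can end up on the other side of a split and be missed, which is why the sketch uses several trees rather than one. More trees means better recall at the cost of more memory and build time.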
This is not a data-related tool, but it could still help you with your data work. It's a simple app that adds one visible task to your menu bar, so you can remain focused on doing one thing at a time.
🧑‍🔬 In practice
Stories of those who are genuinely implementing Data Science. Step aside, Titanic dataset, this is the real deal.
The artwork and trailers that you see on Netflix are personalised for you, which is already pretty impressive, but here they go further and explain how the data informs the creation of those creative assets.
In this article, Etsy explains how they improved their support for deep learning models. It's a good reminder of the considerations we need to make when increasing the complexity of our model architectures.
🐦 The best of Data Twitter
Data Twitter is the best Twitter.
Junior DS: Got any fatherly advice for me?
Senior DS: You will inevitably get in quarrels with stakeholders
You can get out of most of them by shouting "drift" and backpedalling away
from @untitled01ipynb
💭 Thought-provoking
Content to inspire, or at the very least keep you informed.
This blog post argues that, to improve their skills, data scientists should work in teams, as software engineers do, rather than going solo.

One big question that has arisen since ChatGPT exploded onto the scene is how to tell whether a piece of text was written by a human or generated by a model. This paper outlines a potential solution.

🔧 Updates
Did you know that your favourite Python packages actually get updated regularly and you should update your requirements.txt file?
💬 Enjoyed this issue? Share it
stephenallwright.com/newsletter-issue-5