The error “ModuleNotFoundError: No module named sklearn” is common among data scientists developing in Python. It is usually an environment issue: the scikit-learn package has not been installed correctly in the environment your code runs in. Thankfully, there are a few simple steps to go through to troubleshoot the problem and find a solution.
ModuleNotFoundError: No module named sklearn
Your error, whether in a Jupyter Notebook or in the terminal, probably looks like one of the following:
No module named 'sklearn'
ModuleNotFoundError: No module named 'sklearn'
In order to find the root cause of the problem we will go through the following potential fixes:
- Upgrade pip version
- Upgrade or install scikit-learn package
- Check if you are activating the environment before running
- Create a fresh environment
- Upgrade or install Jupyter Notebook package
Are you installing packages using the Conda or Pip package manager?
It is common for developers to use either Pip or Conda for their Python package management. It's important to know what you are using before we continue with the fix.
If you have not explicitly installed and activated Conda, then you are almost certainly using Pip. One sanity check is to run conda info in your terminal; if it returns anything, you are likely using Conda.
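If it is unclear which tool manages the interpreter you are running, a quick check from Python can help. The sketch below is a heuristic rather than an official API: it assumes that Conda environments contain a conda-meta directory under the interpreter prefix, and detect_package_manager is a hypothetical helper name chosen for illustration.

```python
import os
import shutil
import sys


def detect_package_manager(prefix=sys.prefix):
    """Guess whether the interpreter at `prefix` lives in a Conda environment.

    Heuristic (an assumption, not an official API): Conda environments keep
    their package records in a `conda-meta` directory under the prefix.
    """
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    return "pip"


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Likely package manager:", detect_package_manager())
    # shutil.which returns None when no `conda` executable is on PATH.
    print("conda on PATH:", shutil.which("conda") is not None)
```

A result of "pip" only means no Conda markers were found for this interpreter; Conda may still be installed elsewhere on the machine.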
Upgrade or install pip for Python
First things first, let's make sure we have the latest version of pip installed. We can do this by running:
pip install --upgrade pip
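A machine can have several copies of pip, each tied to a different Python interpreter, so it can help to confirm which one you just upgraded. The sketch below asks the current interpreter for its own pip by running python -m pip --version; pip_version_line is a hypothetical helper name, and the function returns None if pip is not available for this interpreter.

```python
import subprocess
import sys


def pip_version_line():
    """Return the output of `python -m pip --version` for this interpreter.

    Returns None when pip is not available for this interpreter.
    """
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # The line typically names the pip version, its location, and the Python version.
    print(pip_version_line())
```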
Upgrade or install scikit-learn package via Conda or Pip
The most common reason for this error is that the scikit-learn package is not installed in your environment or an outdated version is installed. So let’s update the package or install it if it’s missing.
# To install in the root environment
conda install -c anaconda scikit-learn
# To install in a specific environment
conda install -n MY_ENV scikit-learn
# To install in the root environment
python3 -m pip install -U scikit-learn
# To install in a specific environment
source MY_ENV/bin/activate
python3 -m pip install -U scikit-learn
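After installing, you can verify the result from the same interpreter that runs your code. Note that the package is installed under the name scikit-learn but imported as sklearn. The following is a minimal diagnostic sketch using only the standard library (importlib.metadata requires Python 3.8 or later); diagnose is a hypothetical helper name.

```python
import importlib.util
import sys
from importlib import metadata


def diagnose(import_name, distribution):
    """Report whether a module is importable and which version is installed."""
    spec = importlib.util.find_spec(import_name)
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,
        "importable": spec is not None,
        "location": spec.origin if spec is not None else None,
        "version": version,
    }


if __name__ == "__main__":
    # Installed as "scikit-learn", imported as "sklearn".
    for key, value in diagnose("sklearn", "scikit-learn").items():
        print(f"{key}: {value}")
```

If importable is False here, the package is missing from the environment that this particular interpreter belongs to, even if it is installed somewhere else on the machine.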
Activate Conda or venv Python environment
It is highly recommended that you use isolated environments when developing in Python. Because of this, one common mistake developers make is that they don't activate the correct environment before they run the Python script or Jupyter Notebook. So, let’s make sure you have your correct environment running.
For Conda environments:
conda activate MY_ENV
For virtual environments:
source MY_ENV/bin/activate
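To confirm that activation worked, you can inspect the running interpreter. This sketch relies on two environment variables set by the activation scripts (VIRTUAL_ENV for venv, CONDA_DEFAULT_ENV for Conda) and on the fact that sys.prefix differs from sys.base_prefix inside a venv; environment_summary is a hypothetical helper name.

```python
import os
import sys


def environment_summary():
    """Collect details that show which environment the interpreter runs in."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by the venv activate script; absent otherwise.
        "virtual_env": os.environ.get("VIRTUAL_ENV"),
        # Set by `conda activate`; absent otherwise.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

If the executable path does not point inside MY_ENV, the script or notebook is running outside the environment where scikit-learn was installed.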
Create a new Conda or venv Python environment with scikit-learn installed
During the development process, a developer will likely install and update many different packages in their Python environment, which can over time cause conflicts and errors.
Therefore, one way to solve the module error for sklearn is to simply create a new environment with only the packages that you require, removing all of the bloatware that has built up over time. This will provide you with a fresh start and should get rid of problems that installing other packages may have caused.
# Create the new environment with the desired packages
conda create -n MY_ENV python=3.9 scikit-learn
# Activate the new environment
conda activate MY_ENV
# Check to see if the packages you require are installed
conda list
For virtual environments:
# Navigate to your project directory
cd MY_PROJECT
# Create the new environment in this directory
python3 -m venv MY_ENV
# Activate the environment
source MY_ENV/bin/activate
# Install scikit-learn
python3 -m pip install scikit-learn
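The virtual environment can also be created from Python itself, which may be convenient in setup scripts. This is a rough equivalent of python3 -m venv MY_ENV using the standard-library venv module; create_environment is a hypothetical helper name, and installing scikit-learn into the new environment still requires the activation and pip steps above.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment, similar to `python3 -m venv env_dir`.

    Returns True if the environment's pyvenv.cfg marker file exists afterwards.
    """
    venv.create(env_dir, with_pip=with_pip)
    # Every venv has a pyvenv.cfg file at its root.
    return (Path(env_dir) / "pyvenv.cfg").is_file()


# Example usage (creates a MY_ENV directory in the current working directory):
# create_environment("MY_ENV")
```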
Upgrade Jupyter Notebook package in Conda or Pip
If you are working within a Jupyter Notebook and none of the above has worked for you, then it could be that your installation of Jupyter Notebook is faulty in some way, so an upgrade or reinstallation may be in order.
conda update jupyter
pip install -U jupyter
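Another possibility worth ruling out in a notebook is that the kernel runs a different Python interpreter than the one your terminal installs packages into. The sketch below, meant to be run in a notebook cell, builds a pip command bound to the kernel's own interpreter through sys.executable; pip_install_command is a hypothetical helper name, and the actual install step is left commented out because it changes your environment.

```python
import sys


def pip_install_command(package, upgrade=True):
    """Build a pip command bound to the interpreter running this code."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("-U")
    command.append(package)
    return command


if __name__ == "__main__":
    print("Kernel interpreter:", sys.executable)
    print("Command:", " ".join(pip_install_command("scikit-learn")))
    # To actually run the install from the notebook:
    # import subprocess
    # subprocess.run(pip_install_command("scikit-learn"), check=True)
```

If the kernel interpreter path differs from the environment you activated in the terminal, that mismatch is a likely explanation for the error.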
Best practices for managing Python packages and environments
Managing packages and environments in Python is notoriously problematic, but there are some best practices that should help you avoid the majority of package problems in the future:
- Always use separate environments for your projects and avoid installing packages to your root environment
- Only install the packages you need for your project
- Pin your package versions in your project’s requirements file
- Make sure your package manager is kept up to date
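As an illustration of pinning, the sketch below builds name==version lines for a chosen set of packages from whatever is installed in the current environment. It uses the standard-library importlib.metadata module (Python 3.8 or later); pinned_requirements and its lookup parameter are hypothetical names introduced here, and packages that are not installed are skipped.

```python
from importlib import metadata


def pinned_requirements(distributions, lookup=metadata.version):
    """Return 'name==version' lines for the installed distributions given.

    Distributions that are not installed are skipped.
    """
    lines = []
    for name in distributions:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            continue
    return lines


if __name__ == "__main__":
    # Prints one pinned line per package that is installed in this environment.
    for line in pinned_requirements(["scikit-learn", "jupyter"]):
        print(line)
```

The printed lines can be copied into your project's requirements file. Running python3 -m pip freeze gives a similar listing for every installed package.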