How to install LightGBM in Python
This post walks you through how to install LightGBM, the popular gradient-boosting library, in Python.
1. What is LightGBM?
2. Install LightGBM in Python
3. Use LightGBM in Python
What is LightGBM?
LightGBM is an open-source gradient-boosting framework developed by Microsoft for classification and regression problems.
It's an ensemble method that trains a series of decision trees sequentially, but grows them leaf-wise (i.e. vertically): each tree has many leaves, while the total number of trees stays relatively low. This approach produces a highly performant boosting model that is also fast to train.
How do I install LightGBM in Python?
There are several methods for installing LightGBM, the most common of which are:
- Pip
- Conda
- Poetry
- Homebrew (macOS)
Install LightGBM using pip
The preferred method for installing LightGBM is the pip package manager. To do this, run the following command in your terminal:
pip install lightgbm
Install LightGBM using Conda
Whilst not the preferred method, it is also possible to install LightGBM with the Conda package manager by running this in your terminal:
conda install -c conda-forge lightgbm
Install LightGBM using Poetry
If you use Poetry for your Python environment management, then you can install LightGBM by adding the package to your project dependencies like so:
cd pre-existing-project
poetry init
poetry add lightgbm
Install LightGBM using Homebrew on macOS
When on macOS, it's possible to use Homebrew to install LightGBM through the terminal:
brew install lightgbm
How do I use LightGBM in Python?
Once you have installed the LightGBM package using one of the above methods, it can be used within your Python script.
Here is a simple example of how that could look:
import lightgbm as lgb
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the features and the targets from CSV files
x = pd.read_csv('data_train.csv')
y = pd.read_csv('data_target.csv')

# Hold out 20% of the rows for validation
x_train, x_test, y_train, y_test = train_test_split(x.values, y.values, test_size=0.2)

# Wrap the splits in LightGBM Dataset objects
train_data = lgb.Dataset(x_train, label=y_train)
test_data = lgb.Dataset(x_test, label=y_test)

# Binary classification, evaluated with AUC
params = {'metric': 'auc', 'objective': 'binary'}
model = lgb.train(params,
                  train_data,
                  num_boost_round=100,
                  valid_sets=[test_data])
In this example we undertook the following steps:
- Load the feature dataset, x, and the targets, y
- Split the data into training and testing sets
- Convert the data into LightGBM Dataset objects
- Define the parameters for the model, which in our case is binary classification using auc as the metric
- Train the model
Related articles
LightGBM vs XGBoost
LightGBM vs Catboost