my notes: Light Gradient Boosting

Jan 7, 2020

LightGBM is a tree-based learning algorithm whose trees are grown leaf-wise (vertically, best-first), whereas most other tree-based models grow their trees level-wise (horizontally).

High accuracy: each split is made on the leaf with the largest loss reduction, so for the same number of leaves it can reduce more loss than a level-wise model.
Fast: trains quickly and uses less memory
Supports GPU training
Prone to overfitting: not recommended for small datasets (fewer than roughly 10,000 rows)
Many hyperparameters to tune (>100) …
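
To make the leaf-wise versus level-wise difference concrete, here is a minimal pure-Python sketch on toy 1-D data. It is not LightGBM's implementation: the squared-error loss, the exhaustive split search, and the helper names (sse, best_split, grow_leaf_wise, grow_level_wise) are all assumptions made for illustration.

```python
def sse(ys):
    """Sum of squared errors around the mean: the loss of one leaf."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(points):
    """Return (gain, left, right) for the best threshold split of one leaf."""
    points = sorted(points)
    parent = sse([y for _, y in points])
    best = (0.0, None, None)
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot place a threshold between identical x values
        left, right = points[:i], points[i:]
        gain = parent - sse([y for _, y in left]) - sse([y for _, y in right])
        if gain > best[0]:
            best = (gain, left, right)
    return best


def grow_leaf_wise(points, num_leaves):
    """Best-first growth: always split the leaf with the largest loss reduction."""
    leaves = [points]
    while len(leaves) < num_leaves:
        candidates = [(best_split(leaf), idx) for idx, leaf in enumerate(leaves)]
        (gain, left, right), idx = max(candidates, key=lambda c: c[0][0])
        if left is None:
            break  # no leaf can be improved any further
        leaves[idx:idx + 1] = [left, right]
    return leaves


def grow_level_wise(points, depth):
    """Level-wise growth: split every leaf at the current depth before going deeper."""
    leaves = [points]
    for _ in range(depth):
        next_leaves = []
        for leaf in leaves:
            gain, left, right = best_split(leaf)
            next_leaves.extend([leaf] if left is None else [left, right])
        leaves = next_leaves
    return leaves


def total_loss(leaves):
    return sum(sse([y for _, y in leaf]) for leaf in leaves)


# Toy data: flat on the left, noisy on the right.
data = list(zip(range(8), [0, 0, 0, 0, 1, 9, 2, 10]))
leaf_wise = grow_leaf_wise(data, num_leaves=4)
level_wise = grow_level_wise(data, depth=2)
print("leaf-wise loss: ", round(total_loss(leaf_wise), 3))
print("level-wise loss:", round(total_loss(level_wise), 3))
```

Both trees end with four leaves, but on this toy data the leaf-wise tree spends its splits on the noisy region and finishes with the lower loss. The same freedom to keep splitting one region is also why leaf-wise growth overfits more easily on small datasets.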

Parameters

objective: regression, binary, multiclass
metric: mae (mean absolute error), mse (mean squared error), binary_logloss (binary classification), multi_logloss (multiclass classification)
boosting: gbdt (traditional gradient boosting decision tree), rf (random forest), dart (dropouts meet multiple additive regression trees), goss (gradient-based one-side sampling)
num_boost_round: number of boosting iterations (usually 100+). A larger value can improve accuracy but slows down training
learning_rate: determines the contribution of each tree at each iteration. A low learning rate takes many iterations (slow) to converge, while a high learning rate may converge quickly but with lower accuracy (usually 0.1, 0.01, etc.)
max_depth: maximum depth of a tree. Use a smaller value to reduce overfitting
min_data_in_leaf: minimum number of records in each leaf. A higher value helps prevent overfitting but can also cause underfitting; use a lower value for imbalanced class data so that minority-class records can end up in their own leaf (usually in the 100s to 1000s for a large dataset)
num_leaves: maximum number of leaves in a tree (usually < 2^max_depth)
feature_fraction: fraction of features randomly selected for growing each tree. Sampling features per tree reduces the influence of correlated features, reduces overfitting, and speeds up training (usually 0.8)
bagging_fraction: fraction of the data randomly sampled for each bagging iteration (only takes effect when bagging_freq > 0). Prevents overfitting and speeds up training
max_bin: maximum number of discrete bins that continuous feature values are split into. A smaller value speeds up training and reduces overfitting; a larger value can be more accurate.
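
The parameters above can be collected into a single dictionary, which is a reasonable starting point. The sketch below is a hedged one: learning_rate, min_data_in_leaf and feature_fraction use the typical values from these notes, num_leaves=31 and max_bin=255 are LightGBM's documented defaults, and the remaining values (max_depth=6, bagging_fraction=0.8, 200 rounds) are my own assumptions. check_params is a hypothetical helper, not part of LightGBM.

```python
# Starting values only; tune them for your own data.
params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "boosting": "gbdt",
    "learning_rate": 0.1,
    "max_depth": 6,            # assumed value
    "num_leaves": 31,          # below 2^max_depth = 64
    "min_data_in_leaf": 100,
    "feature_fraction": 0.8,
    "bagging_fraction": 0.8,   # assumed value
    "bagging_freq": 1,         # bagging_fraction has no effect unless this is > 0
    "max_bin": 255,
}
num_boost_round = 200  # assumed value; passed to the training call


def check_params(p):
    """Hypothetical helper: return warnings for the rules of thumb in the notes."""
    warnings = []
    if p["max_depth"] > 0 and p["num_leaves"] >= 2 ** p["max_depth"]:
        warnings.append("num_leaves should be below 2^max_depth")
    for key in ("feature_fraction", "bagging_fraction"):
        if not 0.0 < p[key] <= 1.0:
            warnings.append(f"{key} must be in (0, 1]")
    if p["bagging_fraction"] < 1.0 and p.get("bagging_freq", 0) <= 0:
        warnings.append("bagging_fraction is ignored unless bagging_freq > 0")
    return warnings


issues = check_params(params)
print(issues or "no rule-of-thumb warnings")

# With lightgbm installed, X a feature matrix and y the labels:
#   import lightgbm as lgb
#   booster = lgb.train(params, lgb.Dataset(X, label=y),
#                       num_boost_round=num_boost_round)
```

Keeping num_leaves below 2^max_depth matters because a leaf-wise tree with that many leaves would be as large as a full level-wise tree of the same depth, which removes the regularising effect of the depth limit.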

Written by Cheryl
