We compare our GBDT with XGBoost (commit e38bea3) and LightGBM (commit 57ad014) on 04/20/2017.
Experiments are conducted on the Higgs dataset, a binary classification task with 28 features. The train set contains 10,500,000 samples, and the last 500,000 samples are used as the test set.
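For reference, a minimal sketch of producing such a split from the raw UCI HIGGS.csv (label in the first column; file names here are illustrative, not the benchmark's actual preprocessing):

```python
# Minimal sketch: split the UCI HIGGS.csv into the first 10,500,000 rows for
# training and the last 500,000 rows for testing. File names are illustrative.
train_size = 10_500_000

with open("HIGGS.csv") as src, \
     open("higgs.train", "w") as train, \
     open("higgs.test", "w") as test:
    for i, line in enumerate(src):
        (train if i < train_size else test).write(line)
```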
We perform the experiments on a Linux server with the following configuration:
OS: CentOS Linux release 7.1.1503
CPU: 2 * Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (32 virtual cores in total)
Memory: 128 GB
Following are the main configurations of XGBoost, LightGBM, and our GBDT. To make the comparison fair, all three models are trained with the histogram-based approximate algorithm, and the logloss on the train and test sets is monitored during training (ytk-learn computes the loss by default at every training iteration). A rough Python-API equivalent of the XGBoost and LightGBM settings is sketched after the configuration listings.
- XGBoost
nthread=16
tree_method=hist
max_bin=255
grow_policy=lossguide
objective="binary:logistic"
max_leaves=255
num_round=500
max_depth=0
eta=0.1
min_child_weight=100
gamma=0
lambda=0
alpha=0
eval_metric="logloss"
eval[train]="higgs.train"
eval[test]="higgs.test"
- LightGBM
num_threads=16
objective=binary
max_bin=255
num_leaves=255
num_trees=500
learning_rate=0.1
min_data_in_leaf=0
min_sum_hessian_in_leaf=100
metric=binary_logloss
is_training_metric=true
valid="higgs.test"
- ytk-learn
tree_maker: "data",
tree_grow_policy: "loss",
max_leaf_cnt: 255,
round_num: 500,
max_depth: -1,
loss_function: "sigmoid",
regularization.learning_rate: 0.1,
regularization.l1: 0.0,
regularization.l2: 0.0,
min_split_samples: -1,
min_child_hessian_sum: 100,
feature.approximate: [
{cols: "default", type: "sample_by_quantile", max_cnt: 255, use_sample_weight: false, alpha: 0.5}]
thread_num=16
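For readers more familiar with the Python packages than with the CLI configuration files above, the following sketch shows roughly equivalent XGBoost and LightGBM settings. It is an illustration only: the benchmark itself was run through the command-line tools and ytk-learn's own launcher, and the input files are assumed to be in a format both libraries can load directly.

```python
import xgboost as xgb
import lightgbm as lgb

# Roughly equivalent XGBoost settings (histogram algorithm, leaf-wise growth).
dtrain = xgb.DMatrix("higgs.train")
dtest = xgb.DMatrix("higgs.test")
xgb_params = {
    "nthread": 16,
    "tree_method": "hist",
    "max_bin": 255,
    "grow_policy": "lossguide",
    "objective": "binary:logistic",
    "max_leaves": 255,
    "max_depth": 0,          # no depth limit when growing leaf-wise
    "eta": 0.1,
    "min_child_weight": 100,
    "gamma": 0,
    "lambda": 0,
    "alpha": 0,
    "eval_metric": "logloss",
}
xgb_model = xgb.train(xgb_params, dtrain, num_boost_round=500,
                      evals=[(dtrain, "train"), (dtest, "test")])

# Roughly equivalent LightGBM settings.
lgb_train = lgb.Dataset("higgs.train", params={"max_bin": 255})
lgb_test = lgb.Dataset("higgs.test", reference=lgb_train)
lgb_params = {
    "num_threads": 16,
    "objective": "binary",
    "num_leaves": 255,
    "learning_rate": 0.1,
    "min_data_in_leaf": 0,
    "min_sum_hessian_in_leaf": 100,
    "metric": "binary_logloss",
}
lgb_model = lgb.train(lgb_params, lgb_train, num_boost_round=500,
                      valid_sets=[lgb_train, lgb_test])
```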
We run the XGBoost and LightGBM experiments based on boosting_tree_benchmarks, with the configurations modified to suit our needs; the experiment for our GBDT is based on higgs.
- Performance
In our GBDT, the results of each training run may differ slightly because the quantile-based bin generation is not deterministic, so we repeat the experiment three times. The following table shows the logloss and AUC (all three ytk-learn runs are listed); a small sketch of why quantile binning varies between runs follows the table.
| | XGBoost | LightGBM | GBDT in ytk-learn |
|---|---|---|---|
| logloss | train: 0.473704, test: 0.482996 | train: 0.473409, test: 0.482948 | (1) train: 0.472630, test: 0.482095 (2) train: 0.473604, test: 0.483073 (3) train: 0.473141, test: 0.482539 |
| auc | 0.845605 | 0.845612 | (1) 0.846235 (2) 0.845539 (3) 0.845923 |
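To illustrate why quantile-based binning introduces run-to-run variation, here is a minimal sketch (our own illustration, not ytk-learn's sample_by_quantile implementation) that derives bin boundaries from a random subsample of one feature; different subsamples give slightly different boundaries, and hence slightly different trees:

```python
import numpy as np

def quantile_bins(feature_values, max_cnt=255, sample_size=100_000, seed=None):
    """Approximate bin boundaries from a random subsample of one feature.

    Illustration only: subsampling makes the boundaries, and therefore the
    resulting trees, slightly different from run to run.
    """
    rng = np.random.default_rng(seed)
    sample = rng.choice(feature_values,
                        size=min(sample_size, len(feature_values)),
                        replace=False)
    quantiles = np.linspace(0, 1, max_cnt + 1)[1:-1]  # interior quantiles
    return np.unique(np.quantile(sample, quantiles))

feature = np.random.default_rng(0).normal(size=1_000_000)
print(quantile_bins(feature, seed=1)[:5])  # run-to-run differences show up here
print(quantile_bins(feature, seed=2)[:5])
```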
- Speed
The following table shows the running time of the experiments.
| | XGBoost | LightGBM | ytk-learn |
|---|---|---|---|
| load data and preprocess | 11.96s | 14.24s | 35.46s |
| train | 610.38s | 269.19s | 567.83s |
| total | 622.34s | 283.43s | 603.30s |
Loss is calculated by default during each training iteration in ytk-learn, so we also monitored the train and test loss in XGBoost and LightGBM above. Without loss monitoring, the results are listed in the following table (a sketch of evaluating only the final model offline follows it):
| | XGBoost | LightGBM |
|---|---|---|
| load data and preprocess | 5.45s | 13.46s |
| train | 372.01s | 239.96s |
| total | 377.46s | 253.42s |
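Per-iteration loss monitoring is convenient but, as the two timing tables show, it adds noticeable overhead, especially for XGBoost. A minimal alternative (our own sketch, not part of the benchmark) is to evaluate only the final model's predictions offline:

```python
from sklearn.metrics import log_loss, roc_auc_score

# y_true: 0/1 labels of the test set; y_prob: predicted probabilities from the
# final model (e.g. the output of model.predict on the test set).
def final_eval(y_true, y_prob):
    return {"logloss": log_loss(y_true, y_prob),
            "auc": roc_auc_score(y_true, y_prob)}
```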
We compare our GBDT with LightGBM (commit 353286a) on 05/18/2017.
In this experiment, we use the same Higgs data as in "Experiment on single machine" to compare how computing and communication time change with the number of machines.
OS: CentOS Linux release 7.2.1511
CPU: 2 * Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz (24 virtual cores in total)
Memory: 128 GB
Network: 1 Gigabit Ethernet.
The configuration is almost the same as in the single-machine experiment except for the thread number, bin count, and tree number.
- Configuration: bin count: 255, tree number: 50, thread number: 4
Number of machines | Tree building time in ytk-learn | Tree building time in LightGBM |
---|---|---|
1 | 196.34s | 121.84s |
2 | 140.93s | 105.02s |
3 | 133.89s | 128.56s |
4 | 135.76s | 72.97s |
5 | 149.07s | 87.7s |
6 | 165.01s | 94.31s |
7 | 171.38s | 118.57s |
8 | 185.03s | 71.51s |
9 | 196.55s | 454.87s |
- Configuration: bin count: 5000, tree number: 20, thread number: 4. We use a larger bin count mainly to compare the change in communication time.
Number of machines | Tree building time in ytk-learn | Tree building time in LightGBM |
---|---|---|
1 | 109.92s | 64.90s |
2 | 130.81s | 129.67s |
3 | 174.78s | 309.39s |
4 | 192.96s | 157.94s |
5 | 225.09s | 326.38s |
6 | 258.04s | 352.55s |
7 | 228.79s | 329.99s |
8 | 252.15s | 181.23s |
9 | 265.06s | 322.11s |
With more machines, more time is spent on communication; even if the computing time decreases, the total time may increase. From the tables above, we can see that LightGBM takes much more time when the number of machines is not a power of two.
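A rough back-of-the-envelope estimate of the histogram payload helps explain why the 5000-bin runs are so much more sensitive to the 1 Gigabit network. The sketch below assumes (our simplification, not a measurement) that each worker exchanges one gradient sum and one hessian sum per feature per bin, stored as 8-byte doubles:

```python
# Rough estimate of the histogram payload exchanged per aggregation step.
# Assumptions (illustrative, not measured): one gradient sum and one hessian
# sum per feature per bin, each stored as an 8-byte double.
def histogram_bytes(num_features=28, num_bins=255, bytes_per_stat=8, stats=2):
    return num_features * num_bins * bytes_per_stat * stats

for bins in (255, 5000):
    mb = histogram_bytes(num_bins=bins) / 1e6
    print(f"{bins:>4} bins: ~{mb:.2f} MB per histogram exchange")

# On a 1 Gbit/s (~125 MB/s) link, the 5000-bin histograms are roughly 20x
# larger, so communication starts to dominate sooner as machines are added.
```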