- Supports most operating systems: Linux, Mac OS, and Windows.
- Supports various platforms: single machine, common cluster, Hadoop, and Spark.
- No complex installation: only Java SE Runtime Environment 8 is required.
- Supports the local file system and HDFS, and provides a uniform file system interface that can easily be adapted to other file systems.
- Provides user-friendly code for online prediction.
- Multiple objectives and metrics.
- All models support L1, L2 and L1 + L2 regularization.
- Label-based instance sampling.
- All tree models (GBDT, GBST) support instance sampling and feature sampling.
- All tree models support training with initial prediction.
- Supports continuous training from a previous checkpoint.
- Weighted instance training.
- Two hyperparameter optimization methods: grid search and HOAG (automatic).
- Supports unbiased feature hash.
- Supports feature preprocessing (standardization, scaling).
- Supports count-based feature filtering.
- Provides a powerful Python-based data transformation script that transforms data lines during training without changing the original data.
- Laplace approximation in linear models (used for Thompson sampling in E&E applications).
- GBDT features: exact greedy and histogram approximation algorithms, with level-wise and leaf-wise tree growth policies. See gbdt features for more details.
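
To illustrate what "unbiased feature hash" usually refers to, here is a minimal Python sketch of the signed hashing trick: each feature name is hashed to a bucket index, and a second hash chooses a sign of +1 or -1 so that collisions cancel in expectation. This is an assumption about the general technique, not this project's implementation; the function names `signed_feature_hash` and `_hash`, the use of MD5, and the seed strings are all hypothetical choices made for the example.

```python
import hashlib


def _hash(token: str, seed: str) -> int:
    # Deterministic 64-bit hash of a token; the seed string gives
    # independent hash functions for the index and the sign.
    digest = hashlib.md5((seed + ":" + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def signed_feature_hash(features: dict, num_buckets: int) -> list:
    # Map {feature name: value} to a fixed-length dense vector.
    vec = [0.0] * num_buckets
    for name, value in features.items():
        index = _hash(name, "index") % num_buckets
        # The random sign is what makes colliding features cancel
        # in expectation instead of always adding up.
        sign = 1.0 if _hash(name, "sign") % 2 == 0 else -1.0
        vec[index] += sign * value
    return vec
```

For example, `signed_feature_hash({"age": 1.0, "city=NYC": 2.0}, 8)` returns a vector of length 8 regardless of how many distinct feature names appear in the data.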