# coding: utf-8
# ## Example: tuning xgboost for multinomial regression
# Here is a basic example of using `xgbtuner` to optimize the parameters of an xgboost model for multinomial (multi-class) prediction. First, import some modules and load the example data:
# In[3]:
import xgbtuner
import pkg_resources
import pandas as pd
import xgboost as xgb
f = pkg_resources.resource_filename('xgbtuner', 'data/fake_multinomial_data.csv')
df = pd.read_csv(f)
xdat = xgb.DMatrix(
    data=df.drop('target', axis=1).values,
    label=df['target'].values
)
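# A quick look at the example data. The `target` column holds the class labels
# (three classes here, which is why `num_class` is set to 3 below); the feature
# column names are whatever the bundled CSV uses and are not assumed.
print(df.shape)
print(df['target'].value_counts())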
# To search for the best xgboost parameters, you first have to decide what "best" means by setting up a loss function to be minimized (using the negative of the objective function if bigger is better):
# In[4]:
loss = xgbtuner.Loss(xdat)
# `Loss` is a class that holds the data and all model parameters and includes a method to evaluate the cross-validation score for a given objective function. By default, `Loss` runs xgboost's default model: a tree base learner that minimizes the mean squared error of the predictions (treating the response as numeric, not multinomial/categorical!). To use `Loss` for multinomial regression, therefore, at least some of the defaults must be changed.
#
# `Loss` groups the loss function parameters into two types:
# * `fixed_params`: overrides the xgboost defaults; any values specified here are held constant throughout the tuning process.
# * `tuning_params`: specifies the parameters that should be tuned.
#
# In[5]:
fixed_params = {
    'nfold': 5,
    'n_early_stop': 20,
    'objective': 'multi:softprob',
    'eval_metric': 'mlogloss',
    'num_class': 3
}
# Evaluate the objective on a set of params:
params = {
    'num_boost_round': 10,
    'bst:max_depth': 6,
    'bst:eta': 0.3,
    'bst:min_child_weight': 3,
}
loss.evaluate(params)
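# What `loss.evaluate` measures is, conceptually, a cross-validated score for one
# parameter setting. The sketch below illustrates that idea with xgboost's own
# `xgb.cv`; it is not `xgbtuner`'s internal code. The `bst:`-prefixed keys are
# mapped to xgboost's native names, and the control settings become function
# arguments.
cv_params = {
    'objective': 'multi:softprob',
    'eval_metric': 'mlogloss',
    'num_class': 3,
    'max_depth': 6,
    'eta': 0.3,
    'min_child_weight': 3,
}
cv_results = xgb.cv(
    cv_params,
    xdat,
    num_boost_round=10,
    nfold=5,
    early_stopping_rounds=20,
    seed=0,
)
# The quantity to minimize is the held-out mlogloss at the best round.
print(cv_results['test-mlogloss-mean'].min())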
#
#
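# In a full tuning run, the same evaluation is repeated for several candidate
# settings and the best-scoring one is kept before refitting on all of the data.
# A minimal sketch using only the `evaluate` method shown above (this assumes
# `evaluate` returns the cross-validated loss as a single number, as the
# description of `Loss` suggests):
candidates = [
    {'num_boost_round': 10, 'bst:max_depth': 4, 'bst:eta': 0.3, 'bst:min_child_weight': 3},
    {'num_boost_round': 10, 'bst:max_depth': 6, 'bst:eta': 0.1, 'bst:min_child_weight': 3},
]
scores = [loss.evaluate(p) for p in candidates]
best_params = candidates[scores.index(min(scores))]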
# # Fit the final xgboost model with the chosen parameters
# gbm = xgb.train(params, xdat, num_boost_round=params['num_boost_round'], verbose_eval=False)
#
# # Use the model to make predictions on new data
# predictions = gbm.predict(xdat)
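#
# # With 'multi:softprob', predict returns one probability per class for each row.
# # A sketch (assuming `predictions` from the commented-out example above, and the
# # three classes set via `num_class`) of turning those probabilities into labels:
# import numpy as np
# labels = np.asarray(predictions).reshape(-1, 3).argmax(axis=1)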
# In[ ]: