This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

Merge pull request #27 from IBM/ColumnProjection
Extracting Numeric/String columns (Project operator)
ppalmes authored May 20, 2021
2 parents a316655 + 519ab42 commit f5bcfe4
Showing 6 changed files with 1,122 additions and 1 deletion.
34 changes: 34 additions & 0 deletions demo/demo-lale-project.jl
@@ -0,0 +1,34 @@
using Lale
using DataFrames
using CSV

strfeat = laleoperator("Project","lale")(columns=Dict("type"=>"string"))
numfeat = laleoperator("Project","lale")(columns=Dict("type"=>"number"))

OneHotEncoder = laleoperator("OneHotEncoder")
ConcatFeatures = laleoperator("ConcatFeatures","lale")
Tree = laleoperator("DecisionTreeClassifier")
KNN = laleoperator("KNeighborsClassifier")
RF = laleoperator("RandomForestClassifier")
PCA = laleoperator("PCA")
Hyperopt = laleoperator("Hyperopt","lale")

df = CSV.read("./demo/old/credit.csv",DataFrame)
y = df[!,"class"] |> collect
X = df[:,2:end]

train_X,train_y,test_X,test_y = Lale.train_test_split(X, y,testprop=0.20)

# strfeat and numfeat are the Project operators defined at the top of this file
prep_to_numbers =
    ((strfeat >> OneHotEncoder(handle_unknown = "ignore")) & numfeat) >> ConcatFeatures
planned_orig = prep_to_numbers >> ( Tree | KNN)
lopt = LalePipeOptimizer(planned_orig,max_evals = 10,cv = 3)
laletrained = fit(lopt,train_X,train_y)
lalepred = Lale.transform(laletrained,test_X)
score(:accuracy,lalepred,test_y) |> println

pca_tree_planned = prep_to_numbers >> PCA >> RF
lopt = LalePipeOptimizer(pca_tree_planned,max_evals = 10,cv = 3)
laletrained = fit(lopt,train_X,train_y)
lalepred = Lale.transform(laletrained,test_X)
score(:accuracy,lalepred,test_y) |> println
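
Reviewer note: the demo above relies on `Project` splitting the input table into string-typed and numeric-typed columns before the string side is one-hot encoded and both sides are concatenated. The sketch below illustrates that idea in plain Python under stated assumptions; it is not Lale's implementation. The helper names (`project_by_type`, `one_hot`, `concat_features`) and the toy column names are hypothetical and chosen only for illustration.

```python
from numbers import Number


def project_by_type(table, kind):
    """Keep only the columns whose values are all strings or all numbers.

    `table` is a dict mapping column name -> list of values (a stand-in
    for a data frame; this helper is hypothetical, not part of Lale).
    """
    if kind == "string":
        keep = lambda v: isinstance(v, str)
    elif kind == "number":
        # bool is a Number subclass in Python, so exclude it explicitly
        keep = lambda v: isinstance(v, Number) and not isinstance(v, bool)
    else:
        raise ValueError(f"unsupported kind: {kind!r}")
    return {name: values for name, values in table.items()
            if all(keep(v) for v in values)}


def one_hot(table):
    """Expand each column into one 0/1 indicator column per category."""
    encoded = {}
    for name, values in table.items():
        for category in sorted(set(values)):
            encoded[f"{name}={category}"] = [1 if v == category else 0
                                             for v in values]
    return encoded


def concat_features(*tables):
    """Merge several column dicts side by side, preserving order."""
    merged = {}
    for t in tables:
        merged.update(t)
    return merged


# Made-up toy data; these are not the columns of credit.csv.
toy = {
    "housing": ["own", "rent", "own"],
    "age": [34, 51, 27],
    "amount": [1200.0, 560.5, 3100.0],
}

strings = project_by_type(toy, "string")
numbers = project_by_type(toy, "number")
features = concat_features(one_hot(strings), numbers)
```

Here `features` contains only numeric columns, which mirrors the intent of `prep_to_numbers`: every downstream estimator receives a purely numeric table.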
2 changes: 1 addition & 1 deletion demo/aif360.py → demo/old/aif360.py
@@ -3,7 +3,7 @@

import pandas as pd
pd.options.display.max_columns = None
-pd.concat([all_y, all_X], axis=1)
+df = pd.concat([all_y, all_X], axis=1)

import lale.pretty_print
lale.pretty_print.ipython_display(fairness_info)