84 compare basic fedot vs fedot with contextual mab warm start #86

Open
wants to merge 39 commits into
base: main
763581d
remove datasets list from datasets loaders
MorrisNein Nov 14, 2023
50b3ea4
introduce generic typing & index retrieval for dataset data
MorrisNein Nov 16, 2023
2b2b28b
remove meta-features cache, refactor mf extractor interface
MorrisNein Nov 16, 2023
e94f973
minor fixes
MorrisNein Nov 16, 2023
d496bcb
remove redundant constant
MorrisNein Nov 24, 2023
4148123
add persistent_cache.py
MorrisNein Nov 14, 2023
4b9c12a
add logging messages
MorrisNein Nov 15, 2023
869e5e4
finalize persistent_cache.py
MorrisNein Nov 16, 2023
b353401
finalize persistent_cache.py
MorrisNein Nov 16, 2023
ea24a17
create Dockerfile and .dockerignore
MorrisNein Apr 20, 2023
34591a5
create the experiment script & config
MorrisNein Jul 20, 2023
c4c2680
adapt to #39
MorrisNein Jul 27, 2023
3e5e7bb
add config for debugging
MorrisNein Jul 28, 2023
bbfd898
remove data leak
MorrisNein Oct 12, 2023
e4fd2ff
persist train/test datasets split
MorrisNein Oct 12, 2023
bd9697a
add final choices to the best models
MorrisNein Oct 12, 2023
b765671
add FedotHistoryLoader
MorrisNein Oct 22, 2023
dbfdfb4
add MetaLearningApproach and its children
MorrisNein Oct 22, 2023
24f33aa
set TMPDIR from script
MorrisNein Nov 3, 2023
274da9a
simplify MetaLearningApproach
MorrisNein Nov 4, 2023
5a4a9e4
set logging level of FEDOT
MorrisNein Nov 7, 2023
7c00db9
create config_light.yaml
MorrisNein Nov 10, 2023
acdf7a8
add dataset_id to description
MorrisNein Nov 10, 2023
70549e5
fix train/test split
MorrisNein Nov 13, 2023
f3d79c7
fix progress bar
MorrisNein Nov 13, 2023
5a98422
make fit unnecessary for MetaLearningApproach
MorrisNein Nov 14, 2023
02c9af4
fix n_datasets
MorrisNein Nov 14, 2023
4962931
add evaluation caching
MorrisNein Nov 15, 2023
e943c02
split config file
MorrisNein Nov 21, 2023
d4eeb69
add data split
MorrisNein Nov 21, 2023
41148fb
fix types into inner components
MorrisNein Nov 21, 2023
5195176
increase debug fedot timeout
MorrisNein Nov 21, 2023
40feb19
fix knn experiment
MorrisNein Nov 21, 2023
0913354
fix pipeline evaluation, compute fitness on test data
MorrisNein Nov 24, 2023
aa6a03d
refactor
maypink Nov 27, 2023
3ccb77d
add new infrastructure
maypink Nov 30, 2023
b47c9cc
refactor
maypink Dec 4, 2023
bc77eef
add consideration of datasets
maypink Dec 8, 2023
7a9e882
move datasets from train to test
maypink Dec 22, 2023
13 changes: 13 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Config & info files
.pep8speaks.yml
Dockerfile
LICENSE
README.md

# Unnecessary files
examples
notebooks
test

# User data
data/cache
30 changes: 30 additions & 0 deletions Dockerfile
@@ -0,0 +1,30 @@
# Download base image ubuntu 20.04
FROM ubuntu:20.04

# For apt to be noninteractive
ENV DEBIAN_FRONTEND noninteractive
ENV DEBCONF_NONINTERACTIVE_SEEN true

# Preseed tzdata, update package index, upgrade packages and install needed software
RUN truncate -s0 /tmp/preseed.cfg; \
echo "tzdata tzdata/Areas select Europe" >> /tmp/preseed.cfg; \
echo "tzdata tzdata/Zones/Europe select Berlin" >> /tmp/preseed.cfg; \
debconf-set-selections /tmp/preseed.cfg && \
rm -f /etc/timezone /etc/localtime && \
apt-get update && \
apt-get install -y nano && \
apt-get install -y mc && \
apt-get install -y python3.9 python3-pip && \
apt-get install -y git && \
rm -rf /var/lib/apt/lists/*

# Set the workdir
ENV WORKDIR /home/meta-automl-research
WORKDIR $WORKDIR
COPY . $WORKDIR

RUN pip3 install pip && \
pip install wheel && \
pip install --trusted-host pypi.python.org -r ${WORKDIR}/requirements.txt

ENV PYTHONPATH $WORKDIR
5 changes: 1 addition & 4 deletions configs/run_surrogate_model.yml
@@ -9,12 +9,9 @@ model:
model_parameters:
pipe_encoder_type: "graph_transformer"
dataset_encoder_type: "column"

dataset_params:
root_path: "./data/pymfe_meta_features_and_fedot_pipelines/all"

dataset_params:
root_path: "./data/pymfe_meta_features_and_fedot_pipelines/all"

 model_data:
-  save_dir: "./experiments/base/"
+  save_dir: "./experiments/base2/"
Original file line number Diff line number Diff line change
@@ -16,3 +16,4 @@ def main():

 if __name__ == '__main__':
     result = main()
+    print(result)
Original file line number Diff line number Diff line change
@@ -8,7 +8,7 @@
dataset_name = 'higgs'
datasets_loader = OpenMLDatasetsLoader()
dataset = datasets_loader.load_single(dataset_name, allow_name=True)
-checkpoints_dir = get_checkpoints_dir() / 'tabular'
+checkpoints_dir = get_checkpoints_dir() / 'base'
# Load surrogate model
surrogate_model = RankingPipelineDatasetSurrogateModel.load_from_checkpoint(
checkpoint_path=checkpoints_dir / 'checkpoints/best.ckpt',
2 changes: 1 addition & 1 deletion examples/6_gnn_surrogate/surrogate_optimizer_example.py
@@ -16,7 +16,7 @@
dataset_name = 'sylvine' # Specify your OpenML dataset here to get the dataset meta-features.
datasets_loader = OpenMLDatasetsLoader()
train_data = datasets_loader.load_single(dataset_name, allow_name=True)
-surrogate_knowledge_base_dir = get_checkpoints_dir() / 'tabular'
+surrogate_knowledge_base_dir = get_checkpoints_dir() / 'base'

# Load surrogate model
surrogate_model = RankingPipelineDatasetSurrogateModel.load_from_checkpoint(
2 changes: 1 addition & 1 deletion examples/knowledge_base_loading.py
@@ -19,7 +19,7 @@
 for dataset_id in train_datasets['dataset_id']:
     dataset_models = models_loader.load(
         dataset_ids=[dataset_id],  # load models just for this exact dataset.
-        fitness_metric='logloss',  # must correspond to a metric name in a knowledge base.
+        fitness_metric='logloss',  # must correspond to a metric name in a knowledge base2.
     )
     models_for_train[dataset_id] = dataset_models

Empty file added experiments/__init__.py
17 changes: 17 additions & 0 deletions experiments/fedot_warm_start/config.yaml
@@ -0,0 +1,17 @@
---
seed: 42
tmpdir: '/var/essdata/tmp'
#data_settings:
n_datasets: null # null for all available datasets
test_size: 0.25
train_timeout: 15
test_timeout: 15
#meta_learning_params:
n_best_dataset_models_to_memorize: 10
mf_extractor_params:
  groups: general
assessor_params:
  n_neighbors: 5
advisor_params:
  minimal_distance: 1
  n_best_to_advise: 5
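
The `assessor_params.n_neighbors` setting, together with the "fix knn experiment" commit, suggests that similar datasets are found by a k-nearest-neighbours lookup over meta-feature vectors. The following is a minimal sketch of such a lookup under that assumption. The function `nearest_datasets`, the Euclidean metric and the toy vectors are hypothetical illustrations and are not taken from this PR.

```python
from __future__ import annotations

import math


def nearest_datasets(
    query: list[float],
    known: dict[str, list[float]],
    n_neighbors: int,
) -> list[str]:
    """Return ids of the n_neighbors known datasets closest to the query vector."""
    # Pair each dataset id with its Euclidean distance to the query, then sort.
    distances = sorted(
        (math.dist(query, features), dataset_id)
        for dataset_id, features in known.items()
    )
    return [dataset_id for _, dataset_id in distances[:n_neighbors]]


# Toy meta-feature vectors (hypothetical values, for illustration only).
known_meta_features = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
similar = nearest_datasets([0.9, 0.9], known_meta_features, n_neighbors=2)
```

In the debug config `n_neighbors` is lowered to 2, which is consistent with this reading: with only 3 datasets and a 0.33 test share, few known datasets remain to choose neighbours from.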
17 changes: 17 additions & 0 deletions experiments/fedot_warm_start/config_debug.yaml
@@ -0,0 +1,17 @@
---
seed: 42
save_dir_prefix: debug_
#data_settings:
n_datasets: 3 # null for all available datasets
test_size: 0.33
train_timeout: 1
test_timeout: 1
#meta_learning_params:
n_best_dataset_models_to_memorize: 10
mf_extractor_params:
  groups: general
assessor_params:
  n_neighbors: 2
advisor_params:
  minimal_distance: 1
  n_best_to_advise: 5
17 changes: 17 additions & 0 deletions experiments/fedot_warm_start/config_light.yaml
@@ -0,0 +1,17 @@
---
seed: 42
tmpdir: '/var/essdata/tmp'
#data_settings:
n_datasets: 16 # null for all available datasets
test_size: 0.25
train_timeout: 15
test_timeout: 15
#meta_learning_params:
n_best_dataset_models_to_memorize: 10
mf_extractor_params:
  groups: general
assessor_params:
  n_neighbors: 5
advisor_params:
  minimal_distance: 1
  n_best_to_advise: 5
3 changes: 3 additions & 0 deletions experiments/fedot_warm_start/configs_list.yaml
@@ -0,0 +1,3 @@
- config_debug.yaml
- evaluation_config.yaml
- fedot_config.yaml
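
configs_list.yaml names three files. One plausible reading is that the experiment script loads them in order and merges them into a single settings dict. The sketch below shows such a merge, assuming later files override earlier ones. `merge_configs` and the dict literals are hypothetical stand-ins for the parsed files, not code from this PR, and YAML parsing itself is left out to keep the example stdlib-only.

```python
from __future__ import annotations

from copy import deepcopy
from typing import Any


def merge_configs(configs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed config dicts in order; later entries override earlier ones."""
    merged: dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are mappings: merge them key by key.
                merged[key] = merge_configs([merged[key], value])
            else:
                merged[key] = deepcopy(value)
    return merged


# Hand-written stand-ins for the parsed YAML files (subset of keys only).
config_debug = {"seed": 42, "n_datasets": 3, "test_size": 0.33}
evaluation_config = {"n_folds": 1, "baseline_model": "xgboost"}
fedot_config = {"fedot_params": {"problem": "classification", "seed": 42}}

config = merge_configs([config_debug, evaluation_config, fedot_config])
```

With this layout, switching from the debug run to the full run would only require replacing the first entry of the list.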
11 changes: 11 additions & 0 deletions experiments/fedot_warm_start/evaluation_config.yaml
@@ -0,0 +1,11 @@
n_folds: 1
split_seed: 0
collect_metrics:
- f1
- roc_auc
- accuracy
- neg_log_loss
- precision
baseline_model: 'xgboost'
data_test_size: 0.25
data_split_seed: 0
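
`data_test_size` and `data_split_seed` together imply a reproducible hold-out split of each dataset. The sketch below shows how a fixed seed makes such a split deterministic, using only the standard library. `split_indices` is a hypothetical helper; the PR may well use a library splitter whose rounding and shuffling differ.

```python
from __future__ import annotations

import random


def split_indices(
    n_samples: int, test_size: float, seed: int
) -> tuple[list[int], list[int]]:
    """Shuffle row indices with a fixed seed and cut off a test share."""
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_size)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Values mirror data_test_size: 0.25 and data_split_seed: 0 from the config.
train_idx, test_idx = split_indices(n_samples=100, test_size=0.25, seed=0)
```

Calling the function again with the same arguments returns the same two index lists, which is the property the "persist train/test datasets split" commit appears to rely on.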
6 changes: 6 additions & 0 deletions experiments/fedot_warm_start/fedot_config.yaml
@@ -0,0 +1,6 @@
fedot_params:
  problem: classification
  logging_level: 10
  n_jobs: -1
  show_progress: false
  seed: 42
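
Two of these values are numeric codes rather than self-explanatory settings. In Python's standard `logging` module, level 10 is `DEBUG`. For `n_jobs`, -1 conventionally means "use all available cores"; that FEDOT follows this convention is an assumption here. The sketch decodes both, and `resolve_n_jobs` is a hypothetical helper, not a FEDOT function.

```python
import logging
import os

# Values copied from fedot_config.yaml.
fedot_params = {
    "problem": "classification",
    "logging_level": 10,
    "n_jobs": -1,
    "show_progress": False,
    "seed": 42,
}


def resolve_n_jobs(n_jobs: int) -> int:
    """Translate the -1 'all cores' convention into a concrete worker count."""
    if n_jobs == -1:
        return os.cpu_count() or 1
    return n_jobs


level_name = logging.getLevelName(fedot_params["logging_level"])  # 'DEBUG'
workers = resolve_n_jobs(fedot_params["n_jobs"])
```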