piex 0.2.0
An open source project from Data to AI Lab at MIT.
Pipeline Explorer
Classes and functions to explore and reproduce the performance obtained by
thousands of MLBlocks pipelines and templates across hundreds of datasets.
Free software: MIT license
Documentation: https://HDI-Project.github.io/piex
Homepage: https://github.com/HDI-Project/piex
Overview
This repository contains a collection of classes and functions that allow a user to easily
explore the results of a series of experiments run by the MIT team using MLBlocks pipelines over
a large collection of datasets.
Along with this library we are releasing a number of fitted pipelines, their performance on
cross validation, test data and metrics. The results of these experiments were stored in a
Database and later on uploaded to Amazon S3, from where they can be downloaded and analyzed
using the Pipeline Explorer.
We will continuously add more pipelines, templates and datasets to our experiments and make
them publicly available to the community.
These can be used for the following purposes:
Find the best score found so far for a given dataset and task type (given the
search space we defined and our tuners)
Use information about pipeline performance to do meta learning
The current summary of our experiments is:

|           | # of    |
|-----------|---------|
| datasets  | 453     |
| pipelines | 2115907 |
| templates | 63      |
| tests     | 2152    |
Concepts
Before diving into the software usage, we briefly explain some concepts and terminology.
Primitives
We call the smallest computational blocks used in a Machine Learning process
primitives, which:
Can be either classes or functions.
Have some initialization arguments, which MLBlocks calls init_params.
Have some tunable hyperparameters, which have types and a list or range of valid values.
Templates
Primitives can be combined to form what we call Templates, which:
Have a list of primitives.
Have some initialization arguments, which correspond to the initialization arguments
of their primitives.
Have some tunable hyperparameters, which correspond to the tunable hyperparameters
of their primitives.
Pipelines
Templates can be used to build Pipelines by taking and fixing a set of valid
hyperparameters for a Template. Hence, Pipelines:
Have a list of primitives, which corresponds to the list of primitives of their template.
Have some initialization arguments, which correspond to the initialization arguments
of their template.
Have some hyperparameter values, which fall within the ranges of valid tunable
hyperparameters of their template.
A pipeline can be fitted and evaluated using the MLPipeline API in MLBlocks.
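The relationship between a template and its pipelines can be illustrated with a small validation check. This is only a sketch: the `is_valid_value` and `is_valid_pipeline` helpers are hypothetical and are not part of piex or MLBlocks, ranges are assumed to be inclusive, and the spec layout mirrors the tunable hyperparameter dictionaries shown in the usage section of this document.

```python
# Hypothetical helpers, not part of piex or MLBlocks: they check that a set
# of fixed hyperparameter values falls within a template's tunable ranges.

def is_valid_value(value, spec):
    """Return True if ``value`` is allowed by a single hyperparameter spec."""
    if 'values' in spec:                  # categorical: explicit list of values
        return value in spec['values']
    if 'range' in spec:                   # numeric: assumed inclusive [low, high]
        low, high = spec['range']
        return low <= value <= high
    if spec.get('type') == 'bool':
        return isinstance(value, bool)
    return True                           # nothing to check against


def is_valid_pipeline(hyperparameters, tunable):
    """Check every fixed value of every primitive against the template spec."""
    for primitive, values in hyperparameters.items():
        specs = tunable.get(primitive, {})
        for name, value in values.items():
            if name in specs and not is_valid_value(value, specs[name]):
                return False
    return True


# Toy template spec with a single primitive (illustrative values).
tunable = {
    'xgboost.XGBClassifier#1': {
        'n_estimators': {'type': 'int', 'default': 100, 'range': [10, 1000]},
        'max_depth': {'type': 'int', 'default': 3, 'range': [3, 10]},
    }
}

good = {'xgboost.XGBClassifier#1': {'n_estimators': 250, 'max_depth': 5}}
bad = {'xgboost.XGBClassifier#1': {'n_estimators': 5, 'max_depth': 5}}

print(is_valid_pipeline(good, tunable))  # True
print(is_valid_pipeline(bad, tunable))   # False
```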
Datasets
A collection of ~450 datasets was used covering 6 different data modalities and 17 task types.
Each dataset was split using a holdout method in two parts, training and testing, which were
used respectively to find and fit the optimal pipeline for each dataset, and to later on
evaluate the goodness-of-fit of each pipeline against a specific metric for each dataset.
This collection of datasets is stored in an Amazon S3 Bucket in the D3M format,
including the training and testing partitioning, and can be downloaded either using piex or from a web browser by following this link: https://d3m-data-dai.s3.amazonaws.com/index.html
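As a rough illustration of the holdout method mentioned above, the sketch below shuffles a list of rows and splits it into a training and a testing partition. The `holdout_split` helper, the 25% test fraction and the seed are assumptions made for the example. The published datasets already ship with their own partitioning, so this is not how the actual splits were produced.

```python
import random


def holdout_split(rows, test_fraction=0.25, seed=0):
    """Shuffle ``rows`` and split them into (train, test) partitions."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)          # reproducible shuffle
    n_test = int(round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


train, test = holdout_split(range(100))
print(len(train), len(test))  # 75 25
```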
What is an experiment/test?
Throughout our description we will refer to a search process as an experiment or a test.
An experiment/test is defined as follows:
It is given a dataset and a task
It is given a template
It then searches using a Bayesian tuning algorithm (using a tuner from our BTB library). The tuning
algorithm tests multiple pipelines derived from the template and tries to find the best set of
hyperparameters possible for that template on each dataset.
During the search process, the following information is stored in the database and is
available through piex:
The cross validation score obtained over the training partition by each pipeline fitted during
the search process.
The score obtained on the testing data by the best pipeline found so far, which was
evaluated in parallel at certain points in time and also stored in the database.
Each experiment was given one or more of the following configuration values:
Timeout: Maximum time that the search process is allowed to run.
Budget: Maximum number of tuning iterations that the search process is allowed to perform.
Checkpoints: List of points in time, in seconds, where the best pipeline so far was
scored against the testing data.
Pipeline: The name of the template to use to build the pipelines.
Tuner Type: The type of tuner to use, gp or uniform.
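The way these configuration values interact can be sketched with a toy search loop. Everything below is an assumption made for illustration: `run_search` is a hypothetical function, plain random sampling stands in for the BTB tuners, time is simulated (each iteration costs a fixed number of seconds) so that the example is deterministic, and the best cross validation score is recorded at each checkpoint instead of a score on the testing data.

```python
import random


def run_search(score_fn, sample_fn, budget=None, timeout=None,
               checkpoints=(), seconds_per_iteration=1.0):
    """Toy random search honoring budget, timeout and checkpoints."""
    pending = sorted(checkpoints)
    best_params, best_score = None, float('-inf')
    snapshots = []                      # (checkpoint, best score so far)
    elapsed, iterations = 0.0, 0

    while True:
        if budget is not None and iterations >= budget:
            break                       # budget: maximum number of iterations
        if timeout is not None and elapsed >= timeout:
            break                       # timeout: maximum (simulated) time
        if budget is None and timeout is None and not pending:
            break                       # nothing left to wait for

        params = sample_fn()
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score

        iterations += 1
        elapsed += seconds_per_iteration

        # Record the best pipeline found so far at every checkpoint reached.
        while pending and elapsed >= pending[0]:
            snapshots.append((pending.pop(0), best_score))

    return best_params, best_score, snapshots


rng = random.Random(0)


def sample():
    return {'x': rng.uniform(0, 1)}


def score(params):
    return -(params['x'] - 0.5) ** 2    # best possible score is 0, at x = 0.5


best, best_score, snapshots = run_search(
    score, sample, budget=50, checkpoints=[10, 25, 50])
print([checkpoint for checkpoint, _ in snapshots])  # [10, 25, 50]
```

Because the best score can only improve as the search goes on, the scores recorded at successive checkpoints never decrease.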
Getting Started
Installation
The simplest and recommended way to install the Pipeline Explorer is using pip:
pip install piex
Alternatively, you can also clone the repository and install it from source:
git clone git@github.com:HDI-Project/piex.git
cd piex
pip install -e .
Usage
The S3PipelineExplorer
The S3PipelineExplorer class provides methods to download the results of previous
test executions from S3, see which pipelines obtained the best scores and load them
as a dictionary, ready to be used by an MLPipeline.
To start working with it, it needs to be given the name of the S3 Bucket from which
the data will be downloaded.
For these examples, we will be using the ml-pipelines-2018 bucket, where the results
of the experiments run for the Machine Learning Bazaar paper can be found.
from piex.explorer import S3PipelineExplorer
piex = S3PipelineExplorer('ml-pipelines-2018')
The Datasets
The get_datasets method returns a pandas.DataFrame with information about the
available datasets, their data modalities, task types and task subtypes.
datasets = piex.get_datasets()
datasets.shape
(453, 4)
datasets.head()
|     | dataset              | data_modality | task_type      | task_subtype |
|-----|----------------------|---------------|----------------|--------------|
| 314 | 124_120_mnist        | image         | classification | multi_class  |
| 315 | 124_138_cifar100     | image         | classification | multi_class  |
| 316 | 124_153_svhn_cropped | image         | classification | multi_class  |
| 317 | 124_174_cifar10      | image         | classification | multi_class  |
| 318 | 124_178_coil100      | image         | classification | multi_class  |
datasets = piex.get_datasets(data_modality='multi_table', task_type='regression')
datasets.head()
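|     | dataset                          | data_modality | task_type  | task_subtype |
|-----|----------------------------------|---------------|------------|--------------|
| 311 | uu2_gp_hyperparameter_estimation | multi_table   | regression | multivariate |
| 312 | uu3_world_development_indicators | multi_table   | regression | univariate   |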
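The keyword-based filtering shown above can be thought of as keeping only the rows whose columns equal every given value. The sketch below reproduces that idea on plain dictionaries. The `filter_rows` helper is hypothetical and is not meant to describe how piex implements the filtering internally.

```python
def filter_rows(rows, **filters):
    """Keep only the rows whose fields equal every given keyword argument."""
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in filters.items())
    ]


# A few rows taken from the outputs above.
datasets = [
    {'dataset': '124_120_mnist', 'data_modality': 'image',
     'task_type': 'classification'},
    {'dataset': 'uu2_gp_hyperparameter_estimation',
     'data_modality': 'multi_table', 'task_type': 'regression'},
    {'dataset': 'uu3_world_development_indicators',
     'data_modality': 'multi_table', 'task_type': 'regression'},
]

selected = filter_rows(
    datasets, data_modality='multi_table', task_type='regression')
print([row['dataset'] for row in selected])
```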
The Experiments
The list of tests that have been executed can be obtained with the get_tests method.
This method returns a pandas.DataFrame that contains a row for each experiment that has been run on each dataset.
This DataFrame includes information about the dataset, the configuration used for the experiment, such as the
template, the checkpoints or the budget, and information about the execution, such as the timestamp, the exact
software version, the host that executed the test and whether there was an error or not.
Just like with get_datasets, any keyword arguments will be used to filter the results.
import pandas as pd
tests = piex.get_tests()
tests.head().T
|  | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| budget | NaN | NaN | NaN | NaN | NaN |
| checkpoints | [900, 1800, 3600, 7200] | [900, 1800, 3600, 7200] | [900, 1800, 3600, 7200] | [900, 1800, 3600, 7200] | [900, 1800, 3600, 7200] |
| commit | 4c7c29f | 4c7c29f | 4c7c29f | 4c7c29f | 4c7c29f |
| dataset | 196_autoMpg | 26_radon_seed | LL0_1027_esl | LL0_1028_swd | LL0_1030_era |
| docker | False | False | False | False | False |
| error | NaN | NaN | NaN | NaN | NaN |
| hostname | ec2-52-14-97-167.us-east-2.compute.amazonaws.com | ec2-18-223-109-53.us-east-2.compute.amazonaws.com | ec2-18-217-79-23.us-east-2.compute.amazonaws.com | ec2-18-217-239-54.us-east-2.compute.amazonaws.com | ec2-18-225-32-252.us-east-2.compute.amazonaws.com |
| image | NaN | NaN | NaN | NaN | NaN |
| insert_ts | 2018-10-24 20:05:01.872 | 2018-10-24 20:05:02.778 | 2018-10-24 20:05:02.879 | 2018-10-24 20:05:02.980 | 2018-10-24 20:05:03.081 |
| pipeline | categorical_encoder/imputer/standard_scaler/xg... | categorical_encoder/imputer/standard_scaler/xg... | categorical_encoder/imputer/standard_scaler/xg... | categorical_encoder/imputer/standard_scaler/xg... | categorical_encoder/imputer/standard_scaler/xg... |
| status | done | done | done | done | done |
| test_id | 20181024200501872083 | 20181024200501872083 | 20181024200501872083 | 20181024200501872083 | 20181024200501872083 |
| timeout | NaN | NaN | NaN | NaN | NaN |
| traceback | NaN | NaN | NaN | NaN | NaN |
| tuner_type | NaN | NaN | NaN | NaN | NaN |
| update_ts | 2018-10-24 22:05:55.386 | 2018-10-24 22:05:57.508 | 2018-10-24 22:05:56.337 | 2018-10-24 22:05:56.112 | 2018-10-24 22:05:56.164 |
| data_modality | single_table | single_table | single_table | single_table | single_table |
| task_type | regression | regression | regression | regression | regression |
| task_subtype | univariate | univariate | univariate | univariate | univariate |
| metric | meanSquaredError | rootMeanSquaredError | meanSquaredError | meanSquaredError | meanSquaredError |
| dataset_id | 196_autoMpg_dataset_TRAIN | 26_radon_seed_dataset_TRAIN | LL0_1027_esl_dataset_TRAIN | LL0_1028_swd_dataset_TRAIN | LL0_1030_era_dataset_TRAIN |
| problem_id | 196_autoMpg_problem_TRAIN | 26_radon_seed_problem_TRAIN | LL0_1027_esl_problem_TRAIN | LL0_1028_swd_problem_TRAIN | LL0_1030_era_problem_TRAIN |
| target | class | log_radon | out1 | Out1 | out1 |
| size | 24 | 160 | 16 | 52 | 32 |
| size_human | 24K | 160K | 16K | 52K | 32K |
| test_features | 7 | 28 | 4 | 10 | 4 |
| test_samples | 100 | 183 | 100 | 199 | 199 |
| train_features | 7 | 28 | 4 | 10 | 4 |
| train_samples | 298 | 736 | 388 | 801 | 801 |
pd.DataFrame(tests.groupby(['data_modality', 'task_type']).size(), columns=['count'])
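| data_modality | task_type               | count |
|---------------|-------------------------|-------|
| graph         | community_detection     | 5     |
| graph         | graph_matching          | 18    |
| graph         | link_prediction         | 2     |
| graph         | vertex_nomination       | 2     |
| image         | classification          | 57    |
| image         | regression              | 1     |
| multi_table   | classification          | 1     |
| multi_table   | regression              | 1     |
| single_table  | classification          | 1405  |
| single_table  | collaborative_filtering | 1     |
| single_table  | regression              | 430   |
| single_table  | time_series_forecasting | 175   |
| text          | classification          | 17    |
| timeseries    | classification          | 37    |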
tests = piex.get_tests(data_modality='graph', task_type='link_prediction')
tests[['dataset', 'pipeline', 'checkpoints', 'test_id']]
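|      | dataset | pipeline                                       | checkpoints             | test_id              |
|------|---------|------------------------------------------------|-------------------------|----------------------|
| 1716 | 59_umls | NaN                                            | [900, 1800, 3600, 7200] | 20181031040541366347 |
| 2141 | 59_umls | graph/link_prediction/random_forest_classifier | [900, 1800, 3600, 7200] | 20181031182305995728 |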
The Experiment Results
The results of the experiments can be seen using the get_test_results method.
These results include both the cross validation score obtained by the pipeline during
the tuning and the score obtained by this pipeline once it has been fitted
using the training data and then used to make predictions over the test data.
Just like with get_datasets, any keyword arguments will be used to filter the results,
including the test_id.
results = piex.get_test_results(test_id='20181031182305995728')
results[['test_id', 'pipeline', 'score', 'cv_score', 'elapsed', 'iterations']]
|      | test_id              | pipeline                                       | score    | cv_score | elapsed     | iterations |
|------|----------------------|------------------------------------------------|----------|----------|-------------|------------|
| 7464 | 20181031182305995728 | graph/link_prediction/random_forest_classifier | 0.499853 | 0.843175 | 900.255511  | 435.0      |
| 7465 | 20181031182305995728 | graph/link_prediction/random_forest_classifier | 0.499853 | 0.854603 | 1800.885417 | 805.0      |
| 7466 | 20181031182305995728 | graph/link_prediction/random_forest_classifier | 0.499853 | 0.854603 | 3600.005072 | 1432.0     |
| 7467 | 20181031182305995728 | graph/link_prediction/random_forest_classifier | 0.785568 | 0.860000 | 7200.225256 | 2366.0     |
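One way to read these results is to compare, checkpoint by checkpoint, the cross validation score with the score obtained on the test data. The sketch below does this with the four rows of the output above copied into plain dictionaries. It assumes that a higher score is better for this metric, which the output itself does not state.

```python
# Rows copied from the output above, one per checkpoint.
results = [
    {'elapsed': 900.255511, 'iterations': 435,
     'cv_score': 0.843175, 'score': 0.499853},
    {'elapsed': 1800.885417, 'iterations': 805,
     'cv_score': 0.854603, 'score': 0.499853},
    {'elapsed': 3600.005072, 'iterations': 1432,
     'cv_score': 0.854603, 'score': 0.499853},
    {'elapsed': 7200.225256, 'iterations': 2366,
     'cv_score': 0.860000, 'score': 0.785568},
]

# Checkpoint with the best score on the test data (assuming higher is better).
best = max(results, key=lambda row: row['score'])

# Gap between the cross validation score and the test score per checkpoint.
gaps = [round(row['cv_score'] - row['score'], 6) for row in results]

print(best['iterations'])  # 2366
print(gaps)
```

In this test, the gap between the two scores only narrows at the last checkpoint.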
The Best Pipeline
Information about the best pipeline for a dataset can be obtained using the get_best_pipeline method.
This method returns a pandas.Series object with information about the pipeline that obtained the
best cross validation score during the tuning, as well as the template that was used to build it.
Note: This call will download some data in the background the first time that it is run,
so it might take a while to return.
piex.get_best_pipeline('185_baseball')
id                            17385666-31da-4b6e-ab7f-8ac7080a4d55
dataset                                 185_baseball_dataset_TRAIN
metric                                                     f1Macro
name             categorical_encoder/imputer/standard_scaler/xg...
rank                                                      0.307887
score                                                     0.692113
template                                  5bd0ce5249e71569e8bf8003
test_id                                       20181024234726559170
pipeline         categorical_encoder/imputer/standard_scaler/xg...
data_modality                                         single_table
task_type                                           classification
Name: 1149699, dtype: object
Apart from obtaining this information, we can use the load_best_pipeline method
to load its JSON specification, ready to be used in an mlblocks.MLPipeline object.
pipeline = piex.load_best_pipeline('185_baseball')
pipeline['primitives']
['mlprimitives.feature_extraction.CategoricalEncoder',
 'sklearn.preprocessing.Imputer',
 'sklearn.preprocessing.StandardScaler',
 'mlprimitives.preprocessing.ClassEncoder',
 'xgboost.XGBClassifier',
 'mlprimitives.preprocessing.ClassDecoder']
The Best Template
Just like the best pipeline, the best template for a given dataset can be obtained using
the get_best_template method.
This returns just the name of the template that was used to build the best pipeline.
template_name = piex.get_best_template('185_baseball')
template_name
'categorical_encoder/imputer/standard_scaler/xgbclassifier'
This can be later on used to explore the template, obtaining its default hyperparameters:
defaults = piex.get_default_hyperparameters(template_name)
defaults
{'mlprimitives.feature_extraction.CategoricalEncoder#1': {'copy': True,
  'features': 'auto',
  'max_labels': 0},
 'sklearn.preprocessing.Imputer#1': {'missing_values': 'NaN',
  'axis': 0,
  'copy': True,
  'strategy': 'mean'},
 'sklearn.preprocessing.StandardScaler#1': {'with_mean': True,
  'with_std': True},
 'mlprimitives.preprocessing.ClassEncoder#1': {},
 'xgboost.XGBClassifier#1': {'n_jobs': -1,
  'n_estimators': 100,
  'max_depth': 3,
  'learning_rate': 0.1,
  'gamma': 0,
  'min_child_weight': 1},
 'mlprimitives.preprocessing.ClassDecoder#1': {}}
Or obtaining the corresponding tunable ranges, ready to be used with a tuner:
tunable = piex.get_tunable_hyperparameters(template_name)
tunable
{'mlprimitives.feature_extraction.CategoricalEncoder#1': {'max_labels': {'type': 'int',
   'default': 0,
   'range': [0, 100]}},
 'sklearn.preprocessing.Imputer#1': {'strategy': {'type': 'str',
   'default': 'mean',
   'values': ['mean', 'median', 'most_frequent']}},
 'sklearn.preprocessing.StandardScaler#1': {'with_mean': {'type': 'bool',
   'default': True},
  'with_std': {'type': 'bool', 'default': True}},
 'mlprimitives.preprocessing.ClassEncoder#1': {},
 'xgboost.XGBClassifier#1': {'n_estimators': {'type': 'int',
   'default': 100,
   'range': [10, 1000]},
  'max_depth': {'type': 'int', 'default': 3, 'range': [3, 10]},
  'learning_rate': {'type': 'float', 'default': 0.1, 'range': [0, 1]},
  'gamma': {'type': 'float', 'default': 0, 'range': [0, 1]},
  'min_child_weight': {'type': 'int', 'default': 1, 'range': [1, 10]}},
 'mlprimitives.preprocessing.ClassDecoder#1': {}}
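To give an idea of how such a specification could feed a tuner, the sketch below draws one random candidate from a dictionary with the same layout. It is a stand-in written for this example: `sample_hyperparameters` is a hypothetical helper, ranges are assumed to be inclusive, and uniform random sampling is used instead of the tuners from the BTB library.

```python
import random


def sample_hyperparameters(tunable, rng):
    """Draw one random value for every tunable hyperparameter."""
    sampled = {}
    for primitive, specs in tunable.items():
        sampled[primitive] = {}
        for name, spec in specs.items():
            if 'values' in spec:                   # categorical choice
                value = rng.choice(spec['values'])
            elif spec['type'] == 'int':            # inclusive integer range
                value = rng.randint(*spec['range'])
            elif spec['type'] == 'float':
                value = rng.uniform(*spec['range'])
            elif spec['type'] == 'bool':
                value = rng.choice([True, False])
            else:                                  # unknown type: keep default
                value = spec['default']
            sampled[primitive][name] = value
    return sampled


# A subset of the specification shown above.
tunable = {
    'sklearn.preprocessing.Imputer#1': {
        'strategy': {'type': 'str', 'default': 'mean',
                     'values': ['mean', 'median', 'most_frequent']},
    },
    'sklearn.preprocessing.StandardScaler#1': {
        'with_mean': {'type': 'bool', 'default': True},
    },
    'xgboost.XGBClassifier#1': {
        'max_depth': {'type': 'int', 'default': 3, 'range': [3, 10]},
        'learning_rate': {'type': 'float', 'default': 0.1, 'range': [0, 1]},
    },
}

candidate = sample_hyperparameters(tunable, random.Random(0))
print(candidate)
```

The resulting dictionary has the same shape as the default hyperparameters, with one entry per primitive.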
Scoring Templates and Pipelines
The S3PipelineExplorer class also allows cross validating templates and pipelines
over any of the datasets.
Scoring a Pipeline
The simplest use case is cross validating a pipeline over a dataset.
For this, we must pass the ID of the pipeline and the name of the dataset to the method score_pipeline.
The dataset can be the one that was used during the experiments or a different one.
piex.score_pipeline(pipeline['id'], '185_baseball')
(0.6921128080904511, 0.09950216269594728)
piex.score_pipeline(pipeline['id'], 'uu4_SPECT')
(0.8897656842904123, 0.037662864373452655)
Optionally, the cross validation configuration can be changed:
piex.score_pipeline(pipeline['id'], 'uu4_SPECT', n_splits=3, random_state=43)
(0.8869488536155202, 0.019475563687443638)
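The tuples returned above are not described in detail here. The sketch below assumes that they hold the mean and the standard deviation of the scores obtained across the cross validation folds, and shows how such a pair could be computed. The `k_fold_indices` and `cross_validate` helpers are hypothetical, and the "model" is a toy that predicts the mean of the training values. Only the `n_splits` and `random_state` argument names are borrowed from the calls above.

```python
import random
import statistics


def k_fold_indices(n_samples, n_splits=5, random_state=0):
    """Yield (train_indices, test_indices) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def cross_validate(fit_and_score, n_samples, n_splits=5, random_state=0):
    """Return (mean, standard deviation) of the scores across the folds."""
    scores = [
        fit_and_score(train_idx, test_idx)
        for train_idx, test_idx in k_fold_indices(
            n_samples, n_splits, random_state)
    ]
    return statistics.mean(scores), statistics.pstdev(scores)


data = [float(i) for i in range(20)]


def fit_and_score(train_idx, test_idx):
    """Toy model: predict the training mean, score with negative MAE."""
    prediction = statistics.mean(data[i] for i in train_idx)
    errors = [abs(data[i] - prediction) for i in test_idx]
    return -statistics.mean(errors)


mean_score, std_score = cross_validate(
    fit_and_score, len(data), n_splits=3, random_state=43)
print(mean_score, std_score)
```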
Scoring a Template
A Template can also be tested over any dataset by passing its name, the dataset and, optionally,
the cross validation specification. Make sure to choose a template that is relevant for
the task type and data modality for which you want to use it.
If no hyperparameters are passed, the default ones will be used:
piex.score_template(template_name, 'uu4_SPECT', n_splits=3, random_state=43)
(0.8555346666968675, 0.028343173498423108)
With this, anyone can tune the templates that we have for different task types and data
modalities using their own AutoML routine. If you choose to do so, let us know the score you
are getting and the pipeline, and we will add them to our database.
You can get the default hyperparameters and update them by setting values
in the dictionary:
hyperparameters = piex.get_default_hyperparameters(template_name)
hyperparameters['xgboost.XGBClassifier#1']['learning_rate'] = 1
piex.score_template(template_name, 'uu4_SPECT', hyperparameters, n_splits=3, random_state=43)
(0.8754554700753094, 0.019151608028236813)
History
0.2.0
Pin dependency versions to ensure reproducibility
Allow S3PipelineExplorer usage without AWS credentials
Fix minor bug in S3PipelineExplorer
0.1.1
Improved documentation
Minor bugfixes and improvements to the explorer API
0.1.0
First release on PyPI