LaNAS on NASBench-101

This folder contains everything you need to test LaNAS on NASBench-101. Before you start, please download the preprocessed NASBench-101 dataset from AlphaX (see the section "Download the dataset").

Place nasbench_dataset in LaNAS/LaNAS_NASBench101, then run:
python MCTS.py

The program stops once it finds the global optimum. The search usually takes a few hours to a day. Once it finishes, the search results are written to the last row of results.txt. Here is an example of how to interpret the result.

[[0.9313568472862244, 1], [0.9326255321502686, 47], [0.9332265059153239, 51], [0.9342948794364929, 72], [0.9343950351079305, 76], [0.93873530626297, 81], [0.9388020833333334, 224], [0.9388688604036967, 472], [0.9391693472862244, 639], [0.9407051205635071, 740], [0.9420072237650553, 831], [0.9423410892486572, 1545], [0.943175752957662, 3259]]

Each entry is a pair of the best validation accuracy found so far and the number of samples used to reach it. For example, by the 47th sample the best validation accuracy is 0.9326255321502686, and in this case LaNAS finds the best network using 3259 samples. The results of each new experiment are appended as a new row in results.txt.
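As a rough aid for reading a row programmatically, here is a minimal sketch. It assumes each row of results.txt is a Python-style list literal exactly like the example above; the helper names `parse_result_row` and `best_after` are hypothetical and not part of this release.

```python
import ast


def parse_result_row(row):
    """Parse one row of results.txt into (accuracy, sample_count) pairs.

    Assumes the row is a Python-style list literal such as
    [[0.93, 1], [0.94, 47]], as in the example shown in this README.
    """
    trace = ast.literal_eval(row.strip())
    return [(float(acc), int(n)) for acc, n in trace]


def best_after(trace, n_samples):
    """Best validation accuracy recorded within the first n_samples samples."""
    best = None
    for acc, n in trace:
        if n <= n_samples:
            best = acc  # entries are ordered, so later ones are better
    return best


# A shortened copy of the example row above.
example_row = (
    "[[0.9313568472862244, 1], [0.9326255321502686, 47], "
    "[0.9332265059153239, 51]]"
)
trace = parse_result_row(example_row)
final_acc, final_n = trace[-1]
print(f"best accuracy {final_acc:.4f} reached after {final_n} samples")

# To read the most recent experiment instead (assuming results.txt exists):
# with open("results.txt") as f:
#     trace = parse_result_row(f.readlines()[-1])
```

The last pair in a row gives the final accuracy and the total number of samples that run needed.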

We also provide the results of our past runs in our_past_results.txt, which you can use for comparisons, but feel free to reproduce the results with this release.

About NASBench-101

Please check AlphaX to see our encoding of NASBench.

About Predictor based Search Methods

Recent works show excellent results using a predictor such as a Graph Neural Network. These approaches are the same as the surrogate model used in Bayesian Optimization, except that they use different predictors. However, the main issue is that these methods need to predict every architecture in the search space to perform well, and they lack an acquisition function (as used in Bayesian Optimization) to make the trade-off between exploration and exploitation. See this repository: a simple MLP can perform well if it predicts on all the architectures in NASBench.
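To make the role of an acquisition function concrete, here is a small sketch comparing a purely greedy choice on the predicted accuracy with an upper confidence bound (UCB) choice, one common acquisition in Bayesian Optimization. The candidate names, predicted means, and uncertainties below are made-up illustrative numbers, not NASBench results, and `ucb` is a hypothetical helper rather than anything in this repository.

```python
def ucb(mean, std, kappa=2.0):
    """Upper confidence bound: predicted mean plus an exploration bonus."""
    return mean + kappa * std


# Hypothetical (predicted accuracy, predictive uncertainty) per candidate.
# These values are invented purely for illustration.
candidates = {
    "arch_a": (0.940, 0.001),
    "arch_b": (0.935, 0.010),
    "arch_c": (0.920, 0.002),
}

# Predictor only: pick the highest predicted accuracy (pure exploitation).
greedy_pick = max(candidates, key=lambda k: candidates[k][0])

# With an acquisition: uncertain candidates receive a bonus (exploration).
ucb_pick = max(candidates, key=lambda k: ucb(*candidates[k]))

print("greedy:", greedy_pick)
print("ucb:   ", ucb_pick)
```

With these numbers the greedy rule picks arch_a, while UCB picks arch_b because its larger uncertainty outweighs its slightly lower predicted accuracy.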

In LaNAS, we used an MLP to predict samples in order to assign an architecture to a partition. This is an engineering simplification and can be replaced by a hit-and-run sampler, i.e. sampling from a convex polytope.
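For reference, a generic hit-and-run sampler can be sketched as follows. This is not code from this release: it assumes the partition is a bounded polytope written as A x <= b and that a strictly interior starting point is available. The function name `hit_and_run` and the unit-square example are illustrative assumptions.

```python
import math
import random


def hit_and_run(A, b, x0, n_steps, rng=None):
    """Sample points from the bounded polytope {x : A x <= b}.

    A is a list of rows, b a list of bounds, and x0 a strictly interior
    starting point. Returns the list of visited points.
    """
    rng = rng or random.Random()
    x = list(x0)
    samples = []
    for _ in range(n_steps):
        # Draw a uniformly random direction on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]

        # Find the interval [t_lo, t_hi] where x + t * d stays feasible.
        t_lo, t_hi = -math.inf, math.inf
        for row, b_i in zip(A, b):
            a_dot_d = sum(a * v for a, v in zip(row, d))
            slack = b_i - sum(a * v for a, v in zip(row, x))
            if a_dot_d > 1e-12:
                t_hi = min(t_hi, slack / a_dot_d)
            elif a_dot_d < -1e-12:
                t_lo = max(t_lo, slack / a_dot_d)

        # Move to a uniformly random point on that feasible chord.
        t = rng.uniform(t_lo, t_hi)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


# Illustrative polytope: the unit square 0 <= x, y <= 1.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
points = hit_and_run(A, b, [0.5, 0.5], n_steps=200, rng=random.Random(0))
print(len(points), "points sampled inside the unit square")
```

Each step only needs the linear constraints that define the region, so no learned predictor is involved in deciding whether a point belongs to the partition.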