2021 Matminer/ML Update (WIP) #149
Conversation
Agreed on removing automatminer. The lesson was already long and this gives you more breathing room on the rest to make sure people understand and provide exercises that enforce the learning process. Can you clarify what a matbench lesson would cover? It's not clear to me how it fits in. |
I think that is up to @mkhorton as I am not 100% sure if he was hoping to show the website, perhaps under some mp domain e.g., If we were to include it, I was thinking to show at the end once we got done examining the model in the notebook; something like
|
No need to have it under the "MP" URL umbrella. Can you make it interactive? |
@shyamd I could, do you mean interactive in terms of having students run code themselves? Or just interacting with the website? If it is the former, it is possible but it might be another instance of too much information. In addition to the website, using Matbench to record new results requires yet another python library with its own objects etc. Would require like another 5 code cells to just get the fundamentals of how to use it. What is your thought on just having a quick postscript (like, three sentences or less of text) with an image and a quick guided tour of the matbench website instead? I.e., "this matbench thing is available and useful for X, Y, Z, here's the website and if you're interested there's comprehensive docs on how to use it" |
I don't think it's too useful. The best way to learn is by doing. Is there not a convenient function in |
Yeah, you can get them with the following:
from matminer.datasets.dataset_retrieval import load_dataset
df = load_dataset("matbench_mp_gap")
Maybe I can show that? |
Yup, that sounds good. Maybe even make it an exercise where you show them how to load data and have them recreate one of the benchmarks? They will have access to some computing on CoCalc, but it won't be a ton, so nothing too hard like re-training CGNN. |
Having them recreate one of the whole 13-task benchmarks, even with like a super basic linear regression, is for sure going to nuke whatever compute CoCalc is allotting MP - just deserializing some of the larger datasets from disk takes a while. Maybe we could just have them do a single, smaller task? Like the steels prediction, which is just 312 compositions. Then again, that requires explaining nested cross validation and such. Regardless, I could include a short section on just looking at the benchmark data and explaining how to use it. I can plan on adding some short, interactive, matbench-flavored section into this PR this weekend. Will that give you enough time to review before the workshop? |
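For readers unfamiliar with the nested cross validation mentioned above, here is a minimal sketch of the idea in plain Python. It is not Matbench's actual protocol or API: the data is synthetic (not the steels dataset), the model is a one-feature ridge regression, and every helper name (`k_fold_indices`, `fit_ridge_1d`, `cv_score`, `nested_cv`) is hypothetical. The outer loop estimates generalization error; the inner loop, which only ever sees the outer training split, picks the hyperparameter.

```python
import random
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, lam):
    """Closed-form one-feature ridge regression; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, y_bar - slope * x_bar


def mae(model, xs, ys):
    """Mean absolute error of a (slope, intercept) model."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


def split(xs, ys, fold):
    """Return (train_x, train_y, test_x, test_y) for one held-out fold."""
    held_out = set(fold)
    train = [i for i in range(len(xs)) if i not in held_out]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in fold], [ys[i] for i in fold])


def cv_score(xs, ys, lam, k):
    """Mean validation MAE of one penalty value over k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        x_tr, y_tr, x_va, y_va = split(xs, ys, fold)
        scores.append(mae(fit_ridge_1d(x_tr, y_tr, lam), x_va, y_va))
    return mean(scores)


def nested_cv(xs, ys, lams, outer_k=5, inner_k=4):
    """Outer folds score the model; inner folds choose the penalty."""
    outer_scores = []
    for fold in k_fold_indices(len(xs), outer_k):
        x_tr, y_tr, x_te, y_te = split(xs, ys, fold)
        # Hyperparameter selection uses only the outer training split.
        best_lam = min(lams, key=lambda lam: cv_score(x_tr, y_tr, lam, inner_k))
        model = fit_ridge_1d(x_tr, y_tr, best_lam)
        outer_scores.append(mae(model, x_te, y_te))
    return outer_scores


# Synthetic, already-shuffled data (an assumption for illustration only).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

outer_scores = nested_cv(xs, ys, lams=[0.0, 1.0, 100.0])
print(f"nested CV MAE: {mean(outer_scores):.3f}")
```

The folds here are contiguous because the synthetic data is already in random order; real data would normally be shuffled first with a fixed seed.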
Oh yeah, that's good timing. I'm gonna set up some slots over the next two weeks for people to practice, which is the key deadline. |
If you want to include, I can definitely move this to an MP domain before the workshop. We can chat during our meeting next week. Will leave the decision on whether to include in the workshop lesson to you. |
# Conflicts: # requirements.txt
@shyamd just added a short primer on matbench, it's currently in a kind of "bonus" section which can be left out if we're running short on time. Lmk what you think! |
# Conflicts: # requirements.txt # workshop/lessons/08_ml_matminer/matminer-notes.ipynb
Can you enable it so that we can push to your branch? It's usually an option called "Allow edits from maintainers" that you have to check in this PR. Edit: NVM my fault. |
Issues closed
Changes
Created the figrecipes PyPI package (it is no longer in matminer proper, but the teaching notebook relies on it) and added it to requirements. Hopefully this does not cause dependency hell; I tried to avoid that by taking @shyamd's `==` vs `>=` advice from here.
Updated nb code to reflect the new figrecipes imports.
Also made the graphs look better.
Need feedback before proceeding