03 - Spain #18
Replies: 12 comments
-
I tried 0.5, but that was silly. It should have been 0.3.
-
Make sure to pip install the required libraries.
-
My first thought was to just run the entire notebook to see what worked. Clicking run identified the packages that weren't installed. I didn't know how to access the terminal, so I did a [...]
Next error: I set it to 0.3 because 30% of the data should be used for testing.
Then I continued to run through and read the comments. I then changed plt.show_me_the_stars() to plt.show() to show the confusion matrix.
-
Mission Spain. In the train model step, we get the following error: InvalidParameterError: The 'test_size' parameter of train_test_split must be a float in the range (0.0, 1.0), an int in the range [1, inf) or None. Got 1.5 instead. This happens because of how we are splitting the data: [...]
The first idea that came to my mind was to modify the [...] With that value, I get a pretty decent model accuracy: Model Accuracy: 99.17%. In the evaluate model step we are calling a function [...]
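A minimal sketch of why 1.5 is rejected and 0.3 is accepted. The notebook's own data is not shown in this thread, so the `features` and `target` arrays below are hypothetical stand-ins, and `try_split` is an illustrative helper rather than anything from the lab:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data (not the lab's dataset): 100 rows, 4 features.
rng = np.random.default_rng(42)
features = rng.normal(size=(100, 4))
target = rng.integers(0, 2, size=100)


def try_split(test_size):
    """Return (train rows, test rows), or None if test_size is rejected."""
    try:
        X_train, X_test, y_train, y_test = train_test_split(
            features, target, test_size=test_size, random_state=42
        )
    except ValueError as err:
        # scikit-learn's InvalidParameterError is a ValueError subclass.
        print(f"test_size={test_size!r} rejected: {type(err).__name__}")
        return None
    return len(X_train), len(X_test)


bad = try_split(1.5)   # a float test_size must lie strictly between 0 and 1
good = try_split(0.3)  # 30% of the rows go to the test set
print(bad, good)
```

With 100 rows, the 0.3 split leaves 70 rows for training and 30 for testing.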
-
I got an error at step 15 while training the model. After searching online, I reviewed this document, https://scikit-learn.org/1.5/modules/generated/sklearn.model_selection.train_test_split.html, and set it to 0.3.
-
Insert a cell above 1
Insert a cell above (new) 1
Change to: 0.3
Comment this out or [...]
-
Had to pip install seaborn. Stretch goal: schedule a pipeline run.
-
The hardest one for me was the pipelines. First I created a requirements.txt in the same directory, under .../lab-materials/03/, using the command pip freeze > requirements.txt. Then I clicked on run as pipeline and set * as the file extension value. A non-recurring pipeline then gets created. You then need to switch the non-recurring pipeline to daily, and it should work.
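For anyone who cannot reach a terminal, here is a rough sketch that approximates `pip freeze` from inside a notebook cell. It uses the standard library's `importlib.metadata` instead of pip, so the output may differ from a real `pip freeze` (for example, for editable or URL-based installs). The helper names `freeze_lines` and `write_requirements` are made up for illustration:

```python
from importlib.metadata import distributions
from pathlib import Path


def freeze_lines():
    """Return sorted 'name==version' pins for the installed distributions."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pins to `path`, one per line, and return them."""
    lines = freeze_lines()
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Writes requirements.txt into the current working directory.
    print(f"Wrote {len(write_requirements())} pinned packages")
```

Run it from the same directory as the notebook so the file lands next to it.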
-
Need to install seaborn: pip install seaborn
Correct the test size to 0.3 (30%):
# Split the data into training and testing sets
# 30% of the data is used for testing, while the remaining 70% is used for training.
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.3, random_state=42)
And change the cmap from Blue to Blues:
# Visualize the confusion matrix...
-
I used the chat to ask for a valid value of cmap.
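The valid names can also be checked locally. A small sketch, assuming matplotlib is installed: `plt.colormaps()` lists the registered colormap names, and the helpers `is_valid_cmap` and `suggest_cmaps` are hypothetical names used only for this example:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def is_valid_cmap(name):
    """True if `name` is a registered matplotlib colormap."""
    return name in plt.colormaps()


def suggest_cmaps(name):
    """Registered colormap names that start with `name`, ignoring case."""
    prefix = name.lower()
    return [c for c in plt.colormaps() if c.lower().startswith(prefix)]


print(is_valid_cmap("Blue"))   # the value that fails in the notebook
print(is_valid_cmap("Blues"))  # the corrected value
print(suggest_cmaps("Blue"))
```

Colormap names are case-sensitive, so a near miss such as "Blue" or "blues" is rejected even though "Blues" exists.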
-
The lab requirement says to get both precision and accuracy above 98%. However, the notebook only tests and verifies accuracy, which is above 98% with a 30/70 test/train split. Precision is calculated with a different formula from the confusion matrix, precision = TP / (TP + FP), which yields only 95%. I am not sure LogisticRegression can solve this; a different model might be required, such as RF or XGBoost, or even an MLP-based architecture...
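To illustrate how accuracy can clear 98% while precision does not, here is a small sketch of both formulas. The confusion-matrix counts are invented for the example and are not taken from the lab notebook:

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of positive predictions that are actually positive."""
    return tp / (tp + fp)


# Hypothetical counts for an imbalanced test set of 1000 rows.
tp, fp, fn, tn = 19, 1, 0, 980

acc = accuracy(tp, fp, fn, tn)
prec = precision(tp, fp)
print(f"accuracy={acc:.3f} precision={prec:.3f}")
```

With few positive rows, a single false positive barely moves accuracy but costs precision several points, so the two metrics need to be checked separately.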
-
This is the thread where you can paste your notes based on the mission.