How to predict with missing data column in test dataset #2315
alexmilanov
started this conversation in
General
Replies: 1 comment 6 replies
-
Hi @alexmilanov, could you possibly post the command you are running and the error message you are experiencing so I can take a look and see what's going wrong here?
-
Hello! I'm using Ludwig v0.5.3 and experimenting with the Titanic dataset. First, I train my model on data that contains the following columns:
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
Once the model is trained, I invoke the following Python code to predict the outcome for a given Titanic passenger:
self.model.predict(dataset=self.testFilePath)
Here self.model is the trained Ludwig model mentioned above. The problem is that the test dataset is missing the Survived column:
PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
As a result, Ludwig throws an error saying that this column is missing. Here is my model description:
{
  "input_features": [
    {"name": "Pclass", "type": "category"},
    {"name": "Sex", "type": "category"},
    {"name": "Age", "type": "number", "preprocessing": {"missing_value_strategy": "fill_with_mean"}},
    {"name": "SibSp", "type": "number"},
    {"name": "Parch", "type": "number"},
    {"name": "Fare", "type": "number", "preprocessing": {"missing_value_strategy": "fill_with_mean"}},
    {"name": "Embarked", "type": "category"}
  ],
  "output_features": [
    {"name": "Survived", "type": "binary"}
  ]
}
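To narrow the error down, it can help to compare the test file's header against the features named in the model description. The following is a minimal sketch using only the Python standard library. The config and header are copied from this post (with the preprocessing entries left out), and the helper name `missing_columns` is hypothetical, not part of Ludwig.

```python
import json

# Model description from this post, reduced to names and types for brevity.
CONFIG = json.loads("""
{"input_features": [{"name": "Pclass", "type": "category"},
                    {"name": "Sex", "type": "category"},
                    {"name": "Age", "type": "number"},
                    {"name": "SibSp", "type": "number"},
                    {"name": "Parch", "type": "number"},
                    {"name": "Fare", "type": "number"},
                    {"name": "Embarked", "type": "category"}],
 "output_features": [{"name": "Survived", "type": "binary"}]}
""")

# Header line of the test dataset from this post.
TEST_HEADER = "PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked"


def missing_columns(config, header, section):
    """Return feature names from config[section] that are absent from the header."""
    present = set(header.split(","))
    return [f["name"] for f in config[section] if f["name"] not in present]


missing_inputs = missing_columns(CONFIG, TEST_HEADER, "input_features")
missing_outputs = missing_columns(CONFIG, TEST_HEADER, "output_features")
print("missing inputs:", missing_inputs)
print("missing outputs:", missing_outputs)
```

With the header above, every input feature is present and only the output feature Survived is absent, which is consistent with the error described.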
My question is: how can I use the predict method in the same way as the --only_predict parameter, so that it predicts the outcome without evaluating the model's performance? Is this possible at all?
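For reference, one workaround sometimes used when a tool insists on a column being present is to add a placeholder column to the test data before predicting. The sketch below does this with the Python standard library only. Whether Ludwig 0.5.3 accepts such a file, and whether the placeholder values are ignored during prediction, is an unverified assumption. The helper `add_placeholder_column` and the sample passenger row are hypothetical and exist only for illustration.

```python
import csv
import io


def add_placeholder_column(csv_text, column, value):
    """Return csv_text with `column` appended (filled with `value`) if it is absent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    if column in fieldnames:
        return csv_text  # column already present, nothing to do
    fieldnames.append(column)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value  # placeholder only, not a real label
        writer.writerow(row)
    return out.getvalue()


# Hypothetical one-row test file using the header from this post.
SAMPLE = (
    "PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n"
    '892,3,"Doe, Mr. John",male,34.5,0,0,000000,7.83,,Q\n'
)

patched = add_placeholder_column(SAMPLE, "Survived", "0")
header = patched.splitlines()[0]
print(header)
```

The patched text could then be written to a file and passed to predict in the same way as the original test file. Any metrics computed against the placeholder values would be meaningless.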