
More extensive unit tests for run to / from xml #465

Open
janvanrijn opened this issue May 3, 2018 · 3 comments
Comments

@janvanrijn
Member

Make sure that the tests ensure that the functions _create_description_xml() and _create_run_from_xml() are each other's exact complements (a minimal round-trip sketch follows the list below).

e.g.,

  • a parameter free classifier
  • a classifier with multiple components
  • a tagged run
  • runs that come from the server (and thus have global evaluation measures, measures per sample/fold set)
  • ... feel free to add
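
For illustration (not part of the original issue), a minimal round-trip sketch of the complement property could look like the snippet below, assuming a unittest.TestCase context like the one used in the tests posted later in this thread; the run id 100 is a placeholder for any existing run. Note that, as discussed further down, _create_description_xml() does not yet write server-generated fields, so this exact assertion is currently expected to fail for runs downloaded from the server.

    def test_run_xml_round_trip(self):
        # hypothetical sketch; run id 100 is a placeholder for any existing run
        run_xml = openml._api_calls._perform_api_call('run/100')
        # deserialize the description into a run object, then serialize it back
        run_obj = openml.runs.functions._create_run_from_xml(run_xml)
        run_xml_roundtrip = run_obj._create_description_xml()
        # if the two functions are exact complements, the XML survives unchanged
        self.assertEqual(run_xml, run_xml_roundtrip)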
@janvanrijn
Member Author

I added two unit tests:

  • test_run_description_deserialize_serialize
  • test_run_description_deserialize_serialize

The first one is deactivated, as the Python API is indeed not ready for this yet (although it's not a bad thing; it doesn't seem to miss any obvious fields).
The second one runs nicely (I had to add something to the server for that).

@mfeurer
Collaborator

mfeurer commented Jun 19, 2018

I just wanted to have a look at this issue because getting this right is crucial to the success of OpenML. However, I can't find your tests. Could you please post a link to the tests you added so that I can extend them?

@janvanrijn
Member Author

Great catch.

They are actually in the PR that will be deleted.


    @unittest.skip('Function _create_description_xml does not write server fields yet.')
    def test_run_description_deserialize_serialize(self):
        openml.config.server = self.production_server
        run_id = list(openml.evaluations.list_evaluations(function='predictive_accuracy',
                                                          flow=[7707], size=1).keys())[0]

        run_xml_orig = openml._api_calls._perform_api_call('run/%d' % run_id)
        run_obj_orig = openml.runs.functions._create_run_from_xml(run_xml_orig)
        run_xml_prime = run_obj_orig._create_description_xml()
        # TODO: _create_description_xml does not add run id, uploader, etc
        self.assertEqual(run_xml_orig, run_xml_prime)

    def test_run_description_deserialize_serialize(self):
        model = DecisionTreeClassifier(max_depth=1)
        task = openml.tasks.get_task(119)
        run_orig = openml.runs.run_model_on_task(task, model)
        run_orig = run_orig.publish()
        run_new = openml.runs.get_run(run_orig.run_id)

        # evaluations might not be aligned: the original run has some locally generated
        # measures, while the downloaded run may or may not have obtained server evaluations yet
        run_orig.evaluations = None
        run_new.evaluations = None
        run_orig.fold_evaluations = None
        run_new.fold_evaluations = None
        run_orig.sample_evaluations = None
        run_new.sample_evaluations = None

        self.assertEqual(run_orig._create_description_xml(), run_new._create_description_xml())

We should merge these two functions into the development branch.

mfeurer closed this as completed Jul 19, 2018
mfeurer reopened this Jul 19, 2018
mfeurer added a commit that referenced this issue Sep 20, 2018