expand test column dtypes to full scale #492

hussain-jafari · 2025-02-21T22:07:15Z

Description

Category: feature
JIRA issue: MIC-5866

Move test_column_dtypes to release suite framework.

Testing

Ran the tests on five datasets (acs, cps, wic, ssa, and census).

stevebachmeier · 2025-02-24T15:48:51Z

tests/integration/release/test_release.py

@@ -49,6 +48,25 @@ def test_row_noising_omit_row_or_do_not_respond(
    run_omit_row_or_do_not_respond_tests(dataset_name, config, original_data, noised_data)


+def test_column_dtypes(


I've asked this multiple times but remind me again - how will we run this test as it previously existed (on sample data) but NOT during release-testing? i.e. we need to continue running the previous test every night like we currently are.

hussain-jafari · 2025-02-24

The fixtures are set up to read in sample data if we don't run pytest with the --release flag.
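For readers unfamiliar with this pattern, here is a minimal sketch of how a `--release` flag can switch a fixture between sample and full-scale data. It is not the project's actual conftest: only the `--release` flag name comes from the reply above, while `choose_data_source` and the `"sample"` / `"full_scale"` labels are hypothetical names used for illustration.

```python
# conftest.py (illustrative sketch, not the repository's real conftest)


def pytest_addoption(parser):
    """Register the --release command-line flag with pytest."""
    parser.addoption(
        "--release",
        action="store_true",
        default=False,
        help="Run against full-scale data instead of sample data.",
    )


def choose_data_source(config) -> str:
    """Return which data a fixture should load for this pytest run.

    `config` is the pytest config object; inside a fixture it would be
    available as `request.config`. The return labels are hypothetical.
    """
    return "full_scale" if config.getoption("--release") else "sample"
```

Under this arrangement, a plain `pytest` invocation (such as a nightly run) would keep using sample data, and only a run started with `pytest --release` would load the full-scale datasets.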

rmudambi · 2025-02-24T17:31:13Z

tests/integration/release/test_release.py

+            # str dtype is 'object'
+            # Check that they are actually strings and not some other
+            # type of object.
+            actual_types = noised_data[col.name].dropna().apply(type)


Is using apply here vectorized? I don't think it is, but maybe we don't have any other options?

hussain-jafari · 2025-02-24

I believe it's not vectorized, but it might be the fastest way according to this answer:

https://stackoverflow.com/questions/55754713/fastest-way-to-find-all-data-types-in-a-pandas-series
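As a rough illustration of the check being discussed, the sketch below applies `type` element-wise to an object-dtype column and reports anything that is not a `str`. The helper name `non_string_types` and the example data are assumptions made for this sketch; only the `dropna().apply(type)` call is taken from the diff.

```python
import pandas as pd


def non_string_types(series: pd.Series) -> set:
    """Return the set of non-str Python types found among non-null values.

    `apply(type)` calls `type` once per element in Python, so it is not
    vectorized; it is used here because an 'object' dtype alone does not
    prove that the values are strings.
    """
    actual_types = series.dropna().apply(type)
    return set(actual_types.unique()) - {str}


# Hypothetical example columns: one clean, one with a stray integer.
clean = pd.Series(["a", None, "b"], dtype=object)
mixed = pd.Series(["a", 1, None], dtype=object)

assert non_string_types(clean) == set()
assert non_string_types(mixed) == {int}
```

Returning the offending types rather than a bare boolean makes a failing assertion easier to diagnose on full-scale data.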

move test to release folder

b8125b6

hussain-jafari requested review from albrja, patricktnast, rmudambi and stevebachmeier as code owners February 21, 2025 22:07

albrja approved these changes Feb 21, 2025

stevebachmeier reviewed Feb 24, 2025

rmudambi approved these changes Feb 24, 2025

hussain-jafari merged commit 53d213e into epic/full_scale_testing Feb 24, 2025
11 checks passed

hussain-jafari deleted the hjafari/feature/MIC-5866_expand_test_column_dtypes branch February 24, 2025 20:49
