data integrity test: SIMBAD-resolvable sources? #64

arjunsavel · 2024-06-11T20:10:25Z

Do we want to test whether sources are SIMBAD-resolvable?

arjunsavel · 2024-06-11T20:10:49Z

Pretty sure tests will fail — still working on this.

arjunsavel · 2024-07-09T20:21:59Z

We want to do this in the ingest_sources function instead. Running it in the regular test suite would make the tests take too long, and the tests would fail whenever SIMBAD is down.

arjunsavel · 2024-07-09T20:22:17Z

Run the SIMBAD check as part of a set of monthly tests?

arjunsavel · 2024-07-09T20:34:59Z

todo: make this a ~monthly check.

kelle · 2024-07-09T20:54:34Z

In SIMPLE, we called it scheduled_checks.py so it's not auto-discovered by pytest.

Here's also the action file to use for inspiration. This runs every 30 days:
https://github.com/SIMPLE-AstroDB/SIMPLE-db/blob/main/.github/workflows/scheduled-tests.yml
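For reference, a minimal sketch of why the file name matters. It assumes pytest's default `python_files` patterns (`test_*.py` and `*_test.py`) and no project-level override. The helper `is_autodiscovered` and the file name `test_integrity.py` are hypothetical, for illustration only.

```python
from fnmatch import fnmatchcase

# pytest's default collection patterns for test modules (assumes the
# project has not overridden `python_files` in its configuration).
DEFAULT_PYTEST_PATTERNS = ("test_*.py", "*_test.py")


def is_autodiscovered(filename, patterns=DEFAULT_PYTEST_PATTERNS):
    """Return True if `filename` matches any of the collection patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# `test_integrity.py` is a made-up name; `scheduled_checks.py` is the
# file discussed in this thread.
for name in ("test_integrity.py", "scheduled_checks.py"):
    print(name, is_autodiscovered(name))
```

Because `scheduled_checks.py` matches neither pattern, a plain `pytest` run skips it, while the scheduled workflow can still run it by passing the file path explicitly.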

kelle

This looks super cool, but I don't really understand the difference between the two tests. I think it will be clearer if you make the initial query a fixture that's only executed once.

kelle · 2024-08-14T19:13:20Z

tests/scheduled_checks.py

+        # Examine DB for each input, displaying results when more than one source matches
+        t = db.search_object(
+            simbad_names, output_table="Sources", fmt="astropy", fuzzy_search=False
+        )
+        if len(t) != 1:
+            duplicate_rows += [row]


Wow, this is a very thorough test. This is making sure every possible SIMBAD identifier also identifies only one database source, right? This seems extra to me. It seems like just checking that the first identifier matches only one database source should be enough.

I think it currently does the latter? simbad_results is only 3 rows for the 3 sources in the database.

I don't see how simbad_results can be only 3 rows. A quick search for WASP-76 yielded about a dozen names. The logic here should loop over each one.

Actually, I see why it's 3 rows now: the DB has 3 rows and that's what SIMBAD returns. Each row can have multiple names, so for example simbad_names can contain ['BD+01 316b', 'WASP-76b'] for a single row, and search_object considers both of them when searching (since it can take a list as input).
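To make the row-versus-name distinction concrete, here is a rough, self-contained sketch of the check as described above. It does not use astrodbkit. The `search_object` below is a hypothetical in-memory stand-in for `db.search_object`, and the dictionaries are made-up data that reuse the names mentioned in this thread.

```python
def search_object(names, sources):
    """Hypothetical stand-in for db.search_object: return every source
    whose primary name or aliases overlap the given list of names."""
    wanted = set(names)
    return [s for s in sources if wanted & ({s["source"]} | set(s["aliases"]))]


def find_ambiguous_rows(simbad_rows, sources):
    """Collect SIMBAD rows whose names match zero or several DB sources."""
    duplicate_rows = []
    for row in simbad_rows:
        # All names for one SIMBAD row are searched together, as a list.
        matches = search_object(row["names"], sources)
        if len(matches) != 1:
            duplicate_rows.append(row)
    return duplicate_rows


# Made-up data: one SIMBAD row per database source, several names per row.
sources = [
    {"source": "WASP-76b", "aliases": ["BD+01 316b"]},
    {"source": "HD 209458b", "aliases": []},
]
simbad_rows = [
    {"names": ["BD+01 316b", "WASP-76b"]},
    {"names": ["HD 209458b"]},
]

clean = find_ambiguous_rows(simbad_rows, sources)

# Ingesting the alias as a separate source makes the first row ambiguous.
with_duplicate = sources + [{"source": "BD+01 316b", "aliases": []}]
flagged = find_ambiguous_rows(simbad_rows, with_duplicate)
```

In this sketch `clean` is empty, and `flagged` contains only the first SIMBAD row, since its two names now resolve to two different database sources.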

tests/scheduled_checks.py

kelle

Just trying to wrap my head around this test so I ended up making some readability comments.

Co-authored-by: Kelle Cruz <[email protected]>

arjunsavel · 2024-08-15T00:35:52Z

thanks for all the comments!

tests/scheduled_checks.py

dr-rodriguez · 2024-08-16T21:06:43Z

Tried running in CodeSpaces but this doesn't run as-is, so some more iteration is needed; see the comments above.

dr-rodriguez · 2024-08-16T21:16:38Z

Pushed a fix to the various issues I noticed above. The script runs successfully for me in CodeSpaces.

.github/workflows/run_scheduled_tests.yml

Co-authored-by: Kelle Cruz <[email protected]>

arjunsavel · 2024-10-22T19:53:07Z

todo: rename the files to be the source names

add data integrity test. simbad-resolve sources

ea79a31

arjunsavel marked this pull request as draft June 11, 2024 20:10

arjunsavel added 5 commits June 11, 2024 16:12

change data to actually have real sources

644a755

use real, not fake sources!

837e705

only submit to simbad once

43b2f71

check for aliases, too.

1d8f01d

make the data simbad resolvable!

b5b6325

arjunsavel marked this pull request as ready for review July 9, 2024 20:12

arjunsavel requested review from dr-rodriguez and kelle July 9, 2024 20:12

arjunsavel marked this pull request as draft July 9, 2024 20:34

make the simbad resolve and alias check a scheduled check!

b967c24

arjunsavel marked this pull request as ready for review July 31, 2024 18:27

kelle requested changes Aug 1, 2024

dr-rodriguez reviewed Aug 2, 2024

dr-rodriguez reviewed Aug 2, 2024

arjunsavel added 3 commits August 14, 2024 11:28

combine into single test

4705829

more accurate docstring, remove duplicate count var

b670c7c

do not break during the loop

0473c72

arjunsavel requested review from kelle and dr-rodriguez August 14, 2024 15:33

kelle reviewed Aug 14, 2024

kelle reviewed Aug 14, 2024

kelle reviewed Aug 14, 2024

kelle reviewed Aug 14, 2024

kelle reviewed Aug 14, 2024

arjunsavel and others added 3 commits August 14, 2024 20:28

Update tests/scheduled_checks.py

58ac3dc

Co-authored-by: Kelle Cruz <[email protected]>

Update tests/scheduled_checks.py

ccb93df

Co-authored-by: Kelle Cruz <[email protected]>

Update tests/scheduled_checks.py

e7c2009

Co-authored-by: Kelle Cruz <[email protected]>

dr-rodriguez reviewed Aug 16, 2024

dr-rodriguez reviewed Aug 16, 2024

Fix for breaking errors

2e3e4fd

dr-rodriguez approved these changes Aug 16, 2024

kelle reviewed Oct 22, 2024

arjunsavel and others added 2 commits October 22, 2024 15:51

update scheduled checks to astrodbkit

e70a705

Update .github/workflows/run_scheduled_tests.yml

8687065

Co-authored-by: Kelle Cruz <[email protected]>
