bug: jsonschema.exceptions.ValidationError when loading information_schema #437

Open
ReubenFrankel opened this issue Jun 6, 2024 · 1 comment
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

ReubenFrankel commented Jun 6, 2024

Overview

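While debugging this issue from Slack, I hit a jsonschema.exceptions.ValidationError when loading information_schema views with target-jsonl. The record carries an integer ordinal_position, but the stream schema declares that property as string or null: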

2024-06-06 23:40:34,704 | INFO     | tap-postgres.information_schema-columns | Beginning full_table sync of 'information_schema-columns'...
2024-06-06 23:40:34,704 | INFO     | tap-postgres.information_schema-columns | Tap has custom mapper. Using 1 provided map(s).
Traceback (most recent call last):
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/loaders/target-jsonl/venv/bin/target-jsonl", line 8, in <module>
    sys.exit(main())
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/loaders/target-jsonl/venv/lib/python3.8/site-packages/target_jsonl.py", line 92, in main
    state = persist_messages(
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/loaders/target-jsonl/venv/lib/python3.8/site-packages/target_jsonl.py", line 54, in persist_messages
    validators[o['stream']].validate((o['record']))
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/loaders/target-jsonl/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 130, in validate
    raise error
jsonschema.exceptions.ValidationError: 1 is not of type 'string', 'null'

Failed validating 'type' in schema['properties']['ordinal_position']:
    {'type': ['string', 'null']}

On instance['ordinal_position']:
    1
2024-06-06 23:40:35,979 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 1.2741827964782715, "tags": {"stream": "information_schema-columns", "context": {}, "status": "failed"}}
2024-06-06 23:40:35,979 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 49, "tags": {"stream": "information_schema-columns", "context": {}}}
2024-06-06 23:40:35,979 | ERROR    | tap-postgres.information_schema-columns | An unhandled error occurred while syncing 'information_schema-columns'
Traceback (most recent call last):
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/extractors/tap-postgres/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1190, in sync
    for _ in self._sync_records(context=context):
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/extractors/tap-postgres/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1113, in _sync_records
    self._write_record_message(record)
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/extractors/tap-postgres/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 856, in _write_record_message
    self._tap.write_message(record_message)
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/extractors/tap-postgres/venv/lib/python3.8/site-packages/singer_sdk/io_base.py", line 164, in write_message
    singer_write_message(message)
  File "/home/reuben/Documents/taps/tap-postgres/.meltano/extractors/tap-postgres/venv/lib/python3.8/site-packages/singer_sdk/_singerlib/messages.py", line 244, in write_message
    sys.stdout.flush()
BrokenPipeError: [Errno 32] Broken pipe
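
The validation failure can be mimicked in isolation. The sketch below is a stdlib-only stand-in for the `type` check reported in the traceback; it is not jsonschema's implementation, and `JSON_TYPES` and `check_type` are hypothetical names. It suggests that an integer `ordinal_position` cannot satisfy a schema that only allows `string` or `null`, while the same value would pass if `integer` were permitted.

```python
# Simplified stand-in for JSON Schema "type" validation (assumption: only
# the "type" keyword matters here, as in the traceback above).
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def check_type(value, allowed):
    """Return True if value matches at least one of the allowed JSON types."""
    for name in allowed:
        # bool is a subclass of int in Python, but JSON Schema keeps them apart.
        if isinstance(value, bool) and name in ("integer", "number"):
            continue
        if isinstance(value, JSON_TYPES[name]):
            return True
    return False


# Schema fragment and record value taken from the error message.
declared = ["string", "null"]
record = {"ordinal_position": 1}

print(check_type(record["ordinal_position"], declared))             # False
print(check_type(record["ordinal_position"], ["integer", "null"]))  # True
```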

Reproduce

docker run --rm -e POSTGRES_HOST_AUTH_METHOD=trust -p 5432:5432 postgres
git clone git@github.com:MeltanoLabs/tap-postgres.git
cd tap-postgres
meltano install
meltano invoke tap-postgres | meltano invoke target-jsonl

Workaround

Deselect the information_schema streams (related to #54):

meltano select tap-postgres --exclude 'information_schema-*'
meltano select tap-postgres --all

Or just select the public schema:

meltano select tap-postgres 'public-*'
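
To illustrate which streams each pattern would keep, here is a rough sketch using Python's `fnmatch`. It assumes glob-style matching on stream names of the form `<schema>-<table>` (as in the log above). Every stream name other than `information_schema-columns` is hypothetical, `select` is a made-up helper, and this is not Meltano's selection code.

```python
from fnmatch import fnmatch

streams = [
    "information_schema-columns",  # seen in the log above
    "information_schema-tables",   # hypothetical
    "public-users",                # hypothetical
]


def select(names, include="*", exclude=None):
    """Keep names matching `include` and not matching `exclude` (glob patterns)."""
    return [
        name
        for name in names
        if fnmatch(name, include)
        and not (exclude is not None and fnmatch(name, exclude))
    ]


# Exclude every information_schema stream.
print(select(streams, exclude="information_schema-*"))  # ['public-users']
# Or keep only the public schema.
print(select(streams, include="public-*"))              # ['public-users']
```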
@edgarrmondragon
Member

imo information_schema should be skipped by default during schema inspection.
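
One way this could look, as a sketch only: drop system schemas from the list before inspecting them. `DEFAULT_EXCLUDED_SCHEMAS` and `schemas_to_inspect` are hypothetical names, not part of tap-postgres or the Singer SDK.

```python
# Hypothetical default: schemas to skip unless the user opts in.
DEFAULT_EXCLUDED_SCHEMAS = frozenset({"information_schema"})


def schemas_to_inspect(all_schemas, excluded=DEFAULT_EXCLUDED_SCHEMAS):
    """Return the schema names to inspect, preserving order and skipping excluded ones."""
    return [name for name in all_schemas if name not in excluded]


print(schemas_to_inspect(["information_schema", "public"]))  # ['public']
```

Passing `excluded=frozenset()` would restore the current behaviour of inspecting everything.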
