
[destination-postgres] not finishing loading 30Gb from IBM-DB2 #51580

Open

mosonyi opened this issue Jan 16, 2025 · 2 comments
Labels
area/connectors (Connector related issues), community, connectors/destination/postgres, team/destinations (Destinations team's backlog), type/bug (Something isn't working)

mosonyi commented Jan 16, 2025

Connector Name

destination-postgres

Connector Version

2.0.0

What step the error happened?

During the sync

Relevant information

I’m trying to load a huge table with many attributes and records. Unfortunately, the calculation of the _airbyte_meta column takes a lot of time. If I run the query generated by Airbyte, but without _airbyte_meta, the data loads into the Postgres DB quickly.

Facts

  • Airbyte version: 1.3.1, running in Kubernetes.
  • I created a connection with Full Sync on only one table.
  • The source DB table (IBM DB2) has a downloaded size of around 30 GB and contains 9.04 million records across 27 columns.
  • The destination DB is Postgres 16.

Steps to reproduce

  1. Downloading from the source into Airbyte's internal store (which is Postgres, table airbyte_internal.lz_ipv_rd_raw__stream_tablename) succeeds.
  2. The insert-into-destination step (into Postgres) starts but never ends. Airbyte sends the SQL command to Postgres; I can see it in the list of running processes in PostgreSQL (see the sketch after this list).
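
For example, a query against the standard pg_stat_activity view (the filter below is only illustrative) is how I spot the long-running statement on the destination side:

```sql
-- Illustrative only: list active statements on the destination DB to find
-- the long-running insert issued by Airbyte.
SELECT pid,
       now() - query_start AS runtime,
       state,
       left(query, 200)    AS query_preview
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY runtime DESC;
```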

Workaround

If I delete 5 million records from airbyte_internal.lz_ipv_rd_raw__stream_tablename and start the sync process again, it completes successfully.
(This way I perform a kind of batched load; a sketch of the approach is shown below.)
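
Roughly, the batching step looks like the sketch below. It assumes the standard Airbyte raw-table columns _airbyte_raw_id and _airbyte_extracted_at; the table name is from my connection and the batch size is the one I used:

```sql
-- Sketch of the manual batching: remove roughly half of the raw records,
-- run the sync, then reload the removed records and sync again.
DELETE FROM airbyte_internal.lz_ipv_rd_raw__stream_tablename
WHERE _airbyte_raw_id IN (
    SELECT _airbyte_raw_id
    FROM airbyte_internal.lz_ipv_rd_raw__stream_tablename
    ORDER BY _airbyte_extracted_at
    LIMIT 5000000
);
```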

Our investigation

I got the SQL query from the Postgres log:

  1. I tried to run it without modification -> no success; it runs seemingly forever, the same as when I run the sync from Airbyte's UI.
  2. I edited the SQL query: if I remove the _airbyte_meta column, the query runs and the data loads. (A simplified sketch of the query shape follows this list.)
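
To show what I mean, here is a simplified, hand-written approximation of the query shape (it is not the exact SQL that Airbyte generates, and the destination table and the col_a/col_b columns are made up). The per-row _airbyte_meta JSON is built from checks on every column, and that is the expression whose removal makes the query finish:

```sql
-- Simplified approximation of the typing/deduping insert; NOT the exact
-- Airbyte-generated statement. _airbyte_data is the raw JSONB column.
INSERT INTO public.tablename (col_a, col_b, _airbyte_raw_id, _airbyte_extracted_at, _airbyte_meta)
SELECT
    (_airbyte_data ->> 'col_a')                                    AS col_a,
    CASE WHEN _airbyte_data ->> 'col_b' ~ '^-?[0-9]+(\.[0-9]+)?$'
         THEN (_airbyte_data ->> 'col_b')::numeric
    END                                                            AS col_b,
    _airbyte_raw_id,
    _airbyte_extracted_at,
    -- building this JSON object per row, per column, is the slow part
    jsonb_build_object(
        'changes',
        CASE WHEN _airbyte_data ->> 'col_b' !~ '^-?[0-9]+(\.[0-9]+)?$'
             THEN jsonb_build_array(jsonb_build_object(
                      'field',  'col_b',
                      'change', 'NULLED',
                      'reason', 'DESTINATION_TYPECAST_ERROR'))
             ELSE '[]'::jsonb
        END
    )                                                              AS _airbyte_meta
FROM airbyte_internal.lz_ipv_rd_raw__stream_tablename;
```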

If you need the original SQL, I can send it to you privately, because it contains sensitive data.

Relevant log output

There is no relevant log output; the process appears to be doing nothing.

Contribute

  • Yes, I want to contribute

marcosmarxm (Member) commented

@mosonyi would it be possible to upgrade the connector to the latest version? Since 2.1.0, destination-postgres has resumable full refresh enabled, which allows a failed sync to be stopped and restarted.

marcosmarxm changed the title from "Huge data load to Postgresql fails" to "[destination-postgres] not finishing loading 30Gb from IBM-DB2" on Jan 16, 2025

mosonyi (Author) commented Jan 17, 2025

@marcosmarxm I tried today with destination-postgres v2.4.1, as that is the latest version I found. Unfortunately, there was no change.
