Respect _MAX_RECORDS_LIMIT #24
Comments
@qbatten any chance you want to make a PR for this?
Ability to limit the number of rows would be very useful for my team. We need to ingest some large tables and can't read them all in one go due to our postgres replication timeout.
@tayloramurphy It's a quick one; I can take this one. SQLAlchemy and …
@techtangents This issue doesn't address what you're after, but I ended up implementing it anyway. Here's the PR: #393. I think the "right" way to approach this is to look at the connection pool (#394). That's harder to do, though, as we'd still want to fail eventually if Postgres times out, and it'd require more effort to get in place.
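(Not what #394 actually proposes, just an illustration of the connection-pool angle: SQLAlchemy can pre-ping and recycle pooled connections, and psycopg2 passes TCP keepalive settings through to libpq, which tackles the timeout symptom without capping the row count. The URL and numbers below are placeholders.)

```python
# Illustrative sketch only -- one possible shape of "tune the connection pool",
# not the change in #394. The URL and timeout values are placeholders.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://user:password@host:5432/dbname",
    pool_pre_ping=True,   # check a connection is alive before handing it out
    pool_recycle=1800,    # drop and replace connections older than 30 minutes
    connect_args={
        # libpq TCP keepalive options, passed through by psycopg2
        "keepalives": 1,
        "keepalives_idle": 60,
        "keepalives_interval": 10,
        "keepalives_count": 5,
    },
)
```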
@visch apologies and thank you. To be clear, we'd just want it to read, say, 500k records and then stop. The next time we run the tap, it'd read another 500k records and stop again. This is what the "limit" parameter on pipelinewise-tap-postgres does.
@visch I'm not a Python guy, but that PR looks like exactly what we want. Really appreciate you jumping in and helping with this.
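(To make the behavior described above concrete, here's a toy sketch in plain Python, not tap code, of "emit at most N rows per run, then resume from a bookmark on the next run", which is roughly what a per-run limit combined with an incremental replication key gives you. The `run_tap_once` helper and the in-memory rows are made up for illustration.)

```python
# Toy illustration of "read up to N rows per run, resume from a bookmark next run".
# run_tap_once and the in-memory rows are hypothetical, not part of tap-postgres.
from itertools import islice


def run_tap_once(all_rows, bookmark, limit):
    """Emit up to `limit` rows past the bookmark; return (batch, new bookmark)."""
    unread = (row for row in all_rows if row["id"] > bookmark)
    batch = list(islice(unread, limit))
    new_bookmark = batch[-1]["id"] if batch else bookmark
    return batch, new_bookmark


rows = [{"id": i} for i in range(1, 13)]  # pretend source table with 12 rows
bookmark = 0
for run in (1, 2, 3):
    batch, bookmark = run_tap_once(rows, bookmark, limit=5)
    print(f"run {run}: emitted {len(batch)} rows, bookmark now {bookmark}")
# run 1: 5 rows (ids 1-5), run 2: 5 rows (ids 6-10), run 3: 2 rows (ids 11-12)
```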
Closes #24. I wanted to get rid of the `get_records` function altogether, but with this addition we have to keep the override.
Since we're overriding `get_records` from `SQLStream`, let's add these two lines so we maintain the same behavior as `SQLStream`.
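(For anyone landing here later, a rough sketch of what an overridden `get_records` that still respects `_MAX_RECORDS_LIMIT` could look like, using the Singer SDK's `SQLStream`/`SQLConnector` helpers. This is simplified and not the exact code in the PR; the class name and query construction are illustrative and may differ from what tap-postgres actually does.)

```python
# Simplified sketch of a get_records override that keeps SQLStream's
# _MAX_RECORDS_LIMIT handling. Not the exact PR code.
from __future__ import annotations

import typing as t

from singer_sdk import SQLStream


class PostgresStream(SQLStream):
    def get_records(self, context: dict | None) -> t.Iterable[dict]:
        selected_column_names = self.get_selected_schema()["properties"].keys()
        table = self.connector.get_table(
            full_table_name=self.fully_qualified_name,
            column_names=selected_column_names,
        )
        query = table.select()

        if self.replication_key:
            replication_key_col = table.columns[self.replication_key]
            query = query.order_by(replication_key_col)
            start_val = self.get_starting_replication_key_value(context)
            if start_val:
                query = query.where(replication_key_col >= start_val)

        # Presumably the kind of thing the review comment means: keep parity
        # with SQLStream by capping the query when a limit is set.
        if self._MAX_RECORDS_LIMIT is not None:
            query = query.limit(self._MAX_RECORDS_LIMIT)

        with self.connector._connect() as conn:
            for row in conn.execute(query):
                yield dict(row._mapping)
```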
brief convo about this here