Respect _MAX_RECORDS_LIMIT #24
Comments
@qbatten any chance you want to make a PR for this?
Ability to limit the number of rows would be very useful for my team. We need to ingest some large tables and can't read them all in one go due to our postgres replication timeout.
@tayloramurphy It's a quick one; I can take this one. SQLAlchemy and …
@techtangents This issue doesn't address what you're after, but I ended up implementing it anyway. Here's the PR: #393. I think the "right" way to approach this is to look at the connection pool (#394). That's harder to do, though, as we'd still want to fail eventually if Postgres times out, and it'd require more effort to get in place.
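(Not what #394 actually proposes, just an illustration of the connection-pool angle: SQLAlchemy can pre-ping and recycle pooled connections, and psycopg2 passes TCP keepalive settings through to libpq, which tackles the timeout symptom without capping the row count. The URL and numbers below are placeholders.)

```python
# Illustrative sketch only -- one possible shape of "tune the connection pool",
# not the change in #394. The URL and timeout values are placeholders.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://user:password@host:5432/dbname",
    pool_pre_ping=True,   # check a connection is alive before handing it out
    pool_recycle=1800,    # drop and replace connections older than 30 minutes
    connect_args={
        # libpq TCP keepalive options, passed through by psycopg2
        "keepalives": 1,
        "keepalives_idle": 60,
        "keepalives_interval": 10,
        "keepalives_count": 5,
    },
)
```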
@visch apologies and thank you. To be clear, we'd just want it to read, say, 500k records and then stop. The next time we run the tap, it'd read another 500k records and stop again. This is what the "limit" parameter on pipelinewise-tap-postgres does.
@visch I'm not a Python guy, but that PR looks like exactly what we want. Really appreciate you jumping in and helping with this.
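(To make the behavior described above concrete, here's a toy sketch in plain Python, not tap code, of "emit at most N rows per run, then resume from a bookmark on the next run", which is roughly what a per-run limit combined with an incremental replication key gives you. The `run_tap_once` helper and the in-memory rows are made up for illustration.)

```python
# Toy illustration of "read up to N rows per run, resume from a bookmark next run".
# run_tap_once and the in-memory rows are hypothetical, not part of tap-postgres.
from itertools import islice


def run_tap_once(all_rows, bookmark, limit):
    """Emit up to `limit` rows past the bookmark; return (batch, new bookmark)."""
    unread = (row for row in all_rows if row["id"] > bookmark)
    batch = list(islice(unread, limit))
    new_bookmark = batch[-1]["id"] if batch else bookmark
    return batch, new_bookmark


rows = [{"id": i} for i in range(1, 13)]  # pretend source table with 12 rows
bookmark = 0
for run in (1, 2, 3):
    batch, bookmark = run_tap_once(rows, bookmark, limit=5)
    print(f"run {run}: emitted {len(batch)} rows, bookmark now {bookmark}")
# run 1: 5 rows (ids 1-5), run 2: 5 rows (ids 6-10), run 3: 2 rows (ids 11-12)
```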
Closes #24. I wanted to get rid of the `get_records` function altogether, but with this addition we have to keep the override.
Since we're overriding `get_records` from `SQLStream`, let's add these two lines so we maintain the same behavior as `SQLStream`.
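(For anyone landing here later, a rough sketch of what an overridden `get_records` that still respects `_MAX_RECORDS_LIMIT` could look like, using the Singer SDK's `SQLStream`/`SQLConnector` helpers. This is simplified and not the exact code in the PR; the class name and query construction are illustrative and may differ from what tap-postgres actually does.)

```python
# Simplified sketch of a get_records override that keeps SQLStream's
# _MAX_RECORDS_LIMIT handling. Not the exact PR code.
from __future__ import annotations

import typing as t

from singer_sdk import SQLStream


class PostgresStream(SQLStream):
    def get_records(self, context: dict | None) -> t.Iterable[dict]:
        selected_column_names = self.get_selected_schema()["properties"].keys()
        table = self.connector.get_table(
            full_table_name=self.fully_qualified_name,
            column_names=selected_column_names,
        )
        query = table.select()

        if self.replication_key:
            replication_key_col = table.columns[self.replication_key]
            query = query.order_by(replication_key_col)
            start_val = self.get_starting_replication_key_value(context)
            if start_val:
                query = query.where(replication_key_col >= start_val)

        # Presumably the kind of thing the review comment means: keep parity
        # with SQLStream by capping the query when a limit is set.
        if self._MAX_RECORDS_LIMIT is not None:
            query = query.limit(self._MAX_RECORDS_LIMIT)

        with self.connector._connect() as conn:
            for row in conn.execute(query):
                yield dict(row._mapping)
```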
brief convo about this here