LIMIT is not pushed down #233

Open · 2 tasks done
adriangb opened this issue May 31, 2024 · 4 comments

@adriangb
What happens?

If I run SELECT * FROM pg.very_large_table LIMIT 1;, the query issued to Postgres does not include a LIMIT, so the entire table is copied over.

To Reproduce

D ATTACH '' AS pg (TYPE POSTGRES);
D SELECT * FROM pg.some_table; -- warm up schema cache
D SET pg_debug_show_queries = true;
D SELECT * FROM pg.schema_name.very_large_table LIMIT 1;
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ

        COPY (SELECT <lots of columns> FROM "schema_name"."very_large_table" ) TO STDOUT (FORMAT binary);

I don't see any pagination happening either.

My network usage goes through the roof as well (indicating lots of data is being copied).
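
For comparison, this is roughly the query I would have expected the scanner to emit if the limit were pushed down (my guess at the shape, with the column list elided as above; not output I actually observed):

        COPY (SELECT <lots of columns> FROM "schema_name"."very_large_table" LIMIT 1) TO STDOUT (FORMAT binary);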

OS: macOS

PostgreSQL Version: 15 (Timescale)

DuckDB Version: v0.10.4-dev124 cf5b770ccb

DuckDB Client: CLI

Full Name: Adrian Garcia Badaracco

Affiliation: Pydantic

Have you tried this on the latest main branch?
  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
  • I agree

@adriangb (Author)

I'll note that on other tables in this database the CTID scan is used. But it seems like it still copies everything over, maybe just in parallel.

@adriangb (Author)

If I enable filter pushdown and push down a filter that hits an index and returns no results, the query is fast. But I would have thought the scanner should also be able to push down a LIMIT when there are no aggregations.
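
For reference, this is roughly how I tested the filter-pushdown path; indexed_col and the value are placeholders for my actual indexed column and a value that matches no rows:

D SET pg_experimental_filter_pushdown = true;
D SELECT * FROM pg.schema_name.very_large_table WHERE indexed_col = 'no_such_value';
-- with pushdown enabled, the emitted COPY query includes the WHERE clause, so Postgres can answer it from the index and returns quickly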

@Mytherin (Contributor)

Thanks for reporting! This is currently an expected restriction: LIMIT is not pushed down into Postgres yet. You can use postgres_query as a workaround:

select * from postgres_query('pg', 'select * from very_large_table limit 1');

@adriangb (Author)

Thanks for clarifying, that makes sense 😄. I guess I'll have to do something like WITH data AS (SELECT * FROM postgres_query('pg', 'select * from very_large_table limit 1')) SELECT <aggregations, etc> FROM data ....

Feel free to close this issue if there's no plan to implement this or leave it open to track if you think it's a valid feature request.
