docs: add standalone partition pages in user guide #3192

Open
wants to merge 6 commits into base: main


braaannigan
Contributor

Description

This page is a standalone user guide page that focuses on working with partitioned datasets.

Related Issue(s)

closes #3191


github-actions bot commented Feb 4, 2025

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@braaannigan changed the title from "docs: Add standalone partition pages in user guide" to "docs: add standalone partition pages in user guide" on Feb 4, 2025
(
    dt.merge(
        source=source_data,
        predicate="target.country = source.country AND target.num = source.num",
Collaborator


When you merge on partitions and you are certain the source data only holds one or a few known partitions, you should add an explicit partition predicate, especially when streaming mode is enabled by default.

Contributor Author


Hi @ion-elgreco, can I confirm I understand your suggestion: the code is fine, but the text should clarify that including partitions in the predicate is very important for performance when only a subset of partitions is likely to be matched?

Contributor Author


@ion-elgreco I've re-worked this entire section to emphasise the points as I understood them

Collaborator


I meant: `s.id = t.id AND t.id IN (1,2,3)`

Contributor Author


@ion-elgreco Ah, I didn't know that. We're all learning something!

I've re-worked that section accordingly
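
For readers following this thread, the suggestion can be sketched roughly as below: keep the join condition and append an explicit `IN (...)` filter on the partition column, built from the distinct partition values present in the source data. The helper names `sql_literal` and `build_partition_predicate` are hypothetical and not part of delta-rs; `dt` and `source_data` are taken from the snippet under review, and the commented-out merge call assumes the Python `DeltaTable.merge` API with `source_alias` and `target_alias` arguments.

```python
def sql_literal(value):
    """Render a Python value as a simple SQL literal (numbers bare, strings quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    # Escape embedded single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def build_partition_predicate(join_predicate, partition_column, partition_values,
                              target_alias="target"):
    """Append an explicit partition filter on the target to a merge join predicate."""
    distinct = sorted(set(partition_values))
    if not distinct:
        raise ValueError("source data contains no partition values")
    in_list = ", ".join(sql_literal(v) for v in distinct)
    return f"{join_predicate} AND {target_alias}.{partition_column} IN ({in_list})"


# Hypothetical source partition values; in practice these would come from source_data.
predicate = build_partition_predicate(
    "target.country = source.country AND target.num = source.num",
    partition_column="country",
    partition_values=["US", "CA", "US"],
)
print(predicate)

# Assumed usage with delta-rs (not executed here):
# dt.merge(
#     source=source_data,
#     predicate=predicate,
#     source_alias="source",
#     target_alias="target",
# ).when_matched_update_all().when_not_matched_insert_all().execute()
```

The idea is that the explicit filter names the affected partitions up front, rather than relying on them being inferred from the source data.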

docs/usage/working-with-partitions.md (outdated, resolved)
Liam Brannigan and others added 5 commits February 6, 2025 11:03
Signed-off-by: Liam Brannigan
Signed-off-by: Liam Brannigan
Signed-off-by: Liam Brannigan
Signed-off-by: Liam Brannigan
Signed-off-by: Martin Andersson
Signed-off-by: Liam Brannigan
@github-actions bot added the binding/python label (Issues for the Python package) Feb 6, 2025
@github-actions bot removed the binding/python label (Issues for the Python package) Feb 7, 2025
Development

Successfully merging this pull request may close these issues.

User guide: Add standalone page for working with partitions
3 participants