Null handling issue with Partial upserts and partialUpsertStrategies #14924
Comments
All, is there any workaround at all to achieve this?
Hi @raghukn, the current behavior is intentional, and it is what most use cases want. For example:
Having said that, I can see how your requirement is also valid. I'm wondering why full upsert doesn't work for you. cc: @Jackie-Jiang for comment on whether there should be a mode in partial upsert to support this.
Thanks @mayankshriv. Below is my use case (denormalizing a table and its extension table):
Now, in the Pinot realtime table, I have the union of all columns from Tab1 and TabX1. Since these two tables share the same PK value, I can merge the rows using the partial update behavior. It works fine, except that I am unable to remove any previously set value by sending NULL in the payload. I agree that what I am trying is a remote/complex case, but there must be a way to UNSET values even with simple PARTIAL UPDATE flows, without needing to send the whole row every time.
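To make the use case above concrete, here is a minimal standalone sketch of merging two sources into one wide row by primary key. It does not use Pinot's classes; the column names (`pk`, `name`, `ext_note`), the `event` helper, and the `merge` function are all hypothetical, and the merge simply assumes the "ignore incoming null" behavior described in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DenormalizedUpsertSketch {
    // Builds one event over the union schema (hypothetical columns).
    // Columns that the emitting source table does not own are left null.
    static Map<String, Object> event(int pk, String name, String extNote) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("pk", pk);
        row.put("name", name);
        row.put("ext_note", extNote);
        return row;
    }

    // Assumed behavior: a non-null incoming value overwrites,
    // a null incoming value keeps whatever was stored before.
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> incoming) {
        Map<String, Object> merged = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> column : incoming.entrySet()) {
            if (column.getValue() != null) {
                merged.put(column.getKey(), column.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Event from Tab1, then event from TabX1 with the same PK.
        Map<String, Object> row = event(1, "alice", null);
        row = merge(row, event(1, null, "vip"));
        System.out.println(row); // {pk=1, name=alice, ext_note=vip}

        // Attempt to unset ext_note by sending null: the old value survives.
        row = merge(row, event(1, null, null));
        System.out.println(row); // {pk=1, name=alice, ext_note=vip}
    }
}
```

The second merge shows the limitation being reported: under this assumed rule, a null in the payload cannot clear a value that an earlier event set.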
Raghu, the partial update handler is a simple handler that takes the old record and the new record and returns a new record. Maybe you can add another handler; I think the handler is already configurable.
@kishoreg that sounds good. If there is a handler that I can implement to get different behavior for NULL values, that is sufficient. Let me explore this.
@raghukn thanks for reporting the issue. If you want the behavior "incoming null not ignored", you can create a new merger. The current overwrite merger works as "Overwrite if new value not null".
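The difference between the two merger behaviors mentioned here can be sketched per column as below. This is only an illustration under stated assumptions: `MergerSketch`, `overwriteIfNotNull`, and `overwriteAlways` are hypothetical names and are not Pinot's merger API.

```java
final class MergerSketch {
    // Mirrors the described "Overwrite if new value not null" rule.
    static Object overwriteIfNotNull(Object previous, Object incoming) {
        return incoming != null ? incoming : previous;
    }

    // Hypothetical null-honoring variant: the incoming value always wins,
    // so an incoming null unsets the stored value.
    static Object overwriteAlways(Object previous, Object incoming) {
        return incoming;
    }

    public static void main(String[] args) {
        System.out.println(overwriteIfNotNull("Pune", null)); // Pune
        System.out.println(overwriteAlways("Pune", null));    // null
    }
}
```

One design consequence, assuming a column omitted from the payload reaches the merger as null: the null-honoring variant would also wipe values for omitted columns, so it would only suit tables where every event carries every column it intends to keep.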
Hi, can you assign this to me? I'm interested in working on it.
Hi @deemoliu @raghukn, I can create a quick PR for this as I've gone through the code. I believe there are two solutions:
Let me know which one to move forward with. I'm in favor of
@himanish-star, solution 1 works great, as this is a partial update scenario and you will be updating only the provided values (including null).
Raising a PR right away.
@raghukn I've updated the PR to use
@himanish-star I would still consider it a bug that a NULL value sent to the partial update flow is ignored. NULL is sent for that column, and one would expect the target column to be updated with it. But anyway, if there is a way to set a column to NULL, that is good for my use case. Thanks!
@himanish-star please go with solution 2, because the current behavior of the "Overwrite" merger is expected in many use cases. @raghukn one of the most common use cases of partial upsert is to fill in the missing field values for an existing row (primary key). Solution 1 will break it.
Hi @deemoliu, I've updated the PR with approach 2 and I've also added unit tests. I'll update the docs once the PR is checked in.
As described in the partial update behavior - https://docs.pinot.apache.org/basics/data-import/upsert
The following case is described:
Partial update is very desirable. But this behavior of ignoring fields that are present in the payload with a null value is not desirable. The fact that someone is sending a null value shows that they intend to overwrite the existing value. Otherwise they would not have sent this column/value in the first place.
How do we get this to work so that an incoming null is not ignored?