[oneDPL][ranges] support size limit for output for merge algorithm #1942
Co-authored-by: Sergey Kopienko <[email protected]> Co-authored-by: Dan Hoeflinger <[email protected]>
Co-authored-by: Dmitriy Sobolev <[email protected]>
All my feedback has been resolved, so LGTM. We have an agreement to target some change from #2003 for 2022.9.0.
I believe that some others (@SergeyKopienko @dmitriy-sobolev @mmichel11) may have been more involved in reviewing and looking at details in changes from their feedback recently, so I suggest getting approval also from one or more of them.
Finally, we should wait for green CI (other than the docs job, whose failure looks systemic and unrelated).
___merge_path_out_lim(_Tag, _ExecutionPolicy&& __exec, _It1 __it_1, _Index1 __n_1, _It2 __it_2, _Index2 __n_2,
                      _OutIt __it_out, _Index3 __n_out, _Comp __comp)
{
    return __serial_merge_out_lim(__it_1, __it_1 + __n_1, __it_2, __it_2 + __n_2, __it_out, __it_out + __n_out, __comp);
}
This is funny :) @MikeDvorskiy, you explained to me that this function takes indexes as parameters to avoid computing the indexes twice. But the serial implementation does not really use indexes; it switches back to iterators :) which only confirms my impression that ___merge_path_out_lim should use iterators as its parameters.
I said that I wanted to avoid computing the new output size twice:
const _Index3 __n_out = std::min<_Index3>(__n_1 + __n_2, std::ranges::size(__out_r));
The parallel version of ___merge_path_out_lim uses __n_1, __n_2, and __n_out. Since we have already computed these values in __pattern_merge, I just pass them into ___merge_path_out_lim.
I also kept in mind that __serial_merge_out_lim may be called from the iterator-based merge in the future. That's why I kept iterators in the signature of __serial_merge_out_lim.
The thing is, you do not need __n_out for the serial version. In the serial loop, it's not a problem to check on each iteration if the end of the output is reached.
Basically, only the parallel version of ___merge_path_out_lim needs the const _Index3 __n_out = std::min<_Index3>(__n_1 + __n_2, std::ranges::size(__out_r)); logic.
So we can move that logic inside the function (and pass iterators from __pattern_merge).
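As a rough sketch of that suggestion, the clamped output size can be derived from iterator pairs inside the function that needs it, so the caller passes only iterators. The helper name clamped_merge_output_size is hypothetical and is not part of oneDPL; it only mirrors the std::min expression quoted above.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical helper, not part of oneDPL: computes how many elements a
// size-limited merge would write, given only iterator pairs.
template <typename It1, typename It2, typename OutIt>
std::size_t
clamped_merge_output_size(It1 first1, It1 last1, It2 first2, It2 last2,
                          OutIt out_first, OutIt out_last)
{
    const auto n1 = static_cast<std::size_t>(std::distance(first1, last1));
    const auto n2 = static_cast<std::size_t>(std::distance(first2, last2));
    const auto out_capacity = static_cast<std::size_t>(std::distance(out_first, out_last));
    // Same idea as std::min(__n_1 + __n_2, std::ranges::size(__out_r)).
    return std::min(n1 + n2, out_capacity);
}
```

For random-access iterators, std::distance is a constant-time subtraction, so computing the sizes inside the function adds no meaningful cost.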
Co-authored-by: Alexey Kukanov <[email protected]>
LGTM
re-approving after minor changes
[oneDPL][ranges] support size limit for output for merge algorithm.
The change follows https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3179r2.html#range_as_output
Update: The PR was changed to draft status because of a design issue with the different return types of the merge patterns: __result_and_scratch_storage/__result_and_scratch_storage_base. One option is to have a single common __result_and_scratch_storage type for all needs (at least for the dpcpp merge patterns).
Update 2: the issue mentioned above has been resolved.