
feat(multithread): add rayon & par_iter for multithreaded execution #180

Open · wants to merge 5 commits into base: main
Conversation

NishantJoshi00

Description

This is a feature-gated implementation of multi-threading for this project, using rayon to perform the parallel execution.

This PR is raised against #128.

Reasoning

Below are some of the decisions made while building this solution. Feel free to comment on any of them, as I might have missed something along the way.

Reason for using #[cfg(...)]

  • In the current implementation, none of the smart pointers used are thread-safe. Thread safety wasn't a requirement until now, and these pointers are more performant than their thread-safe alternatives (Rc<T> vs. Arc<T>). Some users may want multi-threading while others prefer the existing single-threaded behavior, and if all the smart pointers were made thread-safe unconditionally, that could hamper the performance of the single-threaded path. This is avoided by using #[cfg(...)] gates on variables and arguments.
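A minimal sketch of the cfg-gating idea: a conditional `use` alias keeps the cheaper `Rc` in the default build and swaps in `Arc` only when a parallel feature is enabled. The feature name `parallel` and the alias name `Shared` are assumptions for illustration, not the PR's actual identifiers.

```rust
// Without the "parallel" feature (hypothetical name), Shared is the
// cheaper, non-thread-safe Rc; with it, Shared becomes Arc.
#[cfg(feature = "parallel")]
use std::sync::Arc as Shared;
#[cfg(not(feature = "parallel"))]
use std::rc::Rc as Shared;

fn main() {
    let s: Shared<str> = Shared::from("cargo-hack");
    // Cloning only bumps a reference count in either configuration.
    let t = Shared::clone(&s);
    assert_eq!(&*t, "cargo-hack");
    println!("ok");
}
```

The rest of the code can then use `Shared<T>` uniformly, so only this one spot needs the cfg gates.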

Reason for TargetDirPool

  • Adding multi-threading to the implementation itself isn't difficult; the main problem is the locking mechanism cargo has in place on the target directory. To work around it, multiple target directories are created and assigned to jobs for execution, then reused to take advantage of caching.
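A hypothetical sketch of what such a pool could look like: each job checks out an idle target directory (so concurrent cargo invocations never contend for the same lock) and returns it afterwards so later jobs reuse its build cache. The struct layout and method names here are assumptions, not the PR's actual code.

```rust
use std::path::PathBuf;
use std::sync::Mutex;

struct TargetDirPool {
    idle: Mutex<Vec<PathBuf>>, // directories not currently in use
    root: PathBuf,             // parent under which pool dirs are minted
    next_id: Mutex<usize>,     // counter for naming fresh directories
}

impl TargetDirPool {
    fn new(root: PathBuf) -> Self {
        Self { idle: Mutex::new(Vec::new()), root, next_id: Mutex::new(0) }
    }

    /// Take an idle directory, or mint a fresh one if none is free.
    fn take(&self) -> PathBuf {
        if let Some(dir) = self.idle.lock().unwrap().pop() {
            return dir; // reuse keeps this directory's build cache warm
        }
        let mut id = self.next_id.lock().unwrap();
        *id += 1;
        self.root.join(format!("target-{}", *id))
    }

    /// Return a directory so the next job can reuse its cache.
    fn put(&self, dir: PathBuf) {
        self.idle.lock().unwrap().push(dir);
    }
}

fn main() {
    let pool = TargetDirPool::new(PathBuf::from("/tmp/cargo-hack-pool"));
    let a = pool.take();
    pool.put(a.clone());
    let b = pool.take();
    assert_eq!(a, b); // the returned directory is handed out again
    println!("reused {}", b.display());
}
```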

Reason for rayon

  • While the traditional multi-threading model was always under consideration, rayon has some added benefits: it is lightweight, guarantees data-race freedom, and provides great abstractions.

Ignored tests: 2

  • Here I have ignored 2 test cases because the order of the output no longer matches what the assertions expect. I wasn't able to write replacement tests; please feel free to modify them as necessary.

@taiki-e
Owner

taiki-e commented Jan 29, 2023

Thanks for the PR!

  • I would prefer to make this an option that can be opt-in with an environment variable or CLI option, rather than a feature flag. If the user needs to reinstall cargo-hack to try parallelized cargo-hack, that would be a pain.
  • What does the output from parallelized cargo-hack look like? I'm concerned that if warnings/errors from different threads are alternately displayed, the messages may not be readable to the user.

@NishantJoshi00
Author

NishantJoshi00 commented Feb 1, 2023

I agree with what you are presenting. I actually started working on the second issue you raised a little while before opening this PR, but sadly wasn't able to include those changes here yet.

However, I am working on it and will add those changes to this branch. As for the first issue, I was trying to achieve the same thing, but to preserve the binary's current performance, some of the functionality used cannot support multi-threading. I acknowledge this tradeoff had to be made, but I wasn't able to measure how much of a performance penalty the atomically safe wrappers introduce.

I'll try to benchmark it in the meantime; theoretically, it will cause some slowdown when the application is not running in multi-threaded mode, which is what that change was meant to avoid. Still, I'll try to find other ways to achieve something similar, or a workaround for the feature-gated solution.
Thanks for the feedback!

@taiki-e
Owner

taiki-e commented Feb 5, 2023

performance bottleneck

I haven't measured this, but

  • As for Rc -> Arc, the number of clones is not very large and I think it should not cause problems.
  • As for the TargetDirPool, that's not a problem, since we don't need to access it if parallelization is not enabled.
  • Mutex for Progress may cause a performance bottleneck because the lock is held even while outputting commands.
    However, the only field of Progress that changes after parallelization is count, so instead of wrapping the entire Progress in a Mutex, using Arc<AtomicUsize> for count and incrementing it with fetch_add(1, Relaxed) should probably be enough to lower its cost.
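A sketch of the suggested change: the mutable counter lives in an `Arc<AtomicUsize>` and is bumped with a relaxed `fetch_add`, so no lock is held while a command's output is printed. The surrounding job loop is illustrative.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::sync::Arc;
use std::thread;

fn main() {
    let count = Arc::new(AtomicUsize::new(0));
    let total = 8; // total number of commands to run (illustrative)

    let handles: Vec<_> = (0..total)
        .map(|_| {
            let count = Arc::clone(&count);
            thread::spawn(move || {
                // fetch_add returns the previous value, so +1 yields
                // this job's 1-based position in the progress display.
                let done = count.fetch_add(1, Relaxed) + 1;
                format!("running {}/{}", done, total)
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(count.load(Relaxed), total);
    println!("all {} commands counted", total);
}
```

`Relaxed` ordering is sufficient here because the counter is only a progress display; no other memory accesses need to synchronize with it.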

@Pat-Lafon

Hello! I was wondering about this kind of functionality and came across this PR. I'm curious about its status? I'm sure it would be super useful for projects with larger sets of features (where cargo-hack is needed the most!).

@NishantJoshi00
Author

Hey @Pat-Lafon, I haven't been able to address the comments left by @taiki-e yet, but I'm planning to pick this up as soon as possible. I did come up with a solution for unifying the logs the application emits, but haven't implemented it in this branch yet. I'll try to pick it up in the very near future.
