Git Blame Cache #1848
Comments
Adding a bit of context here, as I worked on this with @holodorum. I'm working on a product that tracks code ownership and code age for all files in many repos. For that, we'd like to have very fast git blame statistics. The solution we came up with is to create checkpoints at specific commits and store the full blame coverage for each. When we want to calculate the blame for a newer commit, we only have to go back as far as the latest checkpoint. The cache stores chunks as (commit_id, start_line, end_line) tuples. We're storing the chunks in RocksDB, so I think it's too specific to include in gitoxide itself, but we were wondering if there is some appetite to restructure the gix-blame code to make it easy to add a cache.
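To make the checkpoint idea concrete, here is a minimal sketch of the chunk layout described above. It is not the authors' implementation: `BlameChunk`, `BlameCheckpoints` and `covers_all_lines` are hypothetical names, an in-memory `BTreeMap` stands in for RocksDB, and the half-open, zero-based line ranges are an assumption (the comment does not say whether `end_line` is inclusive).

```rust
use std::collections::BTreeMap;

/// One blamed range: lines `start_line..end_line` attributed to `commit_id`.
/// Assumption: zero-based lines, `end_line` exclusive.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BlameChunk {
    /// Hex object id kept as a string for simplicity (hypothetical choice).
    pub commit_id: String,
    pub start_line: u32,
    pub end_line: u32,
}

/// Hypothetical in-memory stand-in for the RocksDB store.
/// Key: (checkpoint commit id, file path). Value: the file's blame chunks.
#[derive(Default)]
pub struct BlameCheckpoints {
    entries: BTreeMap<(String, String), Vec<BlameChunk>>,
}

impl BlameCheckpoints {
    /// Record the full blame of `path` as of the `checkpoint` commit.
    pub fn store(&mut self, checkpoint: &str, path: &str, chunks: Vec<BlameChunk>) {
        self.entries
            .insert((checkpoint.to_owned(), path.to_owned()), chunks);
    }

    /// Fetch the cached blame of `path` at `checkpoint`, if any.
    pub fn lookup(&self, checkpoint: &str, path: &str) -> Option<&[BlameChunk]> {
        self.entries
            .get(&(checkpoint.to_owned(), path.to_owned()))
            .map(Vec::as_slice)
    }
}

/// Returns true if the chunks are sorted, non-empty, contiguous and
/// together cover exactly the lines `0..total_lines`.
pub fn covers_all_lines(chunks: &[BlameChunk], total_lines: u32) -> bool {
    let mut next = 0;
    for chunk in chunks {
        if chunk.start_line != next || chunk.end_line <= chunk.start_line {
            return false;
        }
        next = chunk.end_line;
    }
    next == total_lines
}
```

A check like `covers_all_lines` is one way to verify that a checkpoint really holds the "full blame coverage" of a file before relying on it as a stopping point.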
Thanks a lot for sharing - I am very interested in this line of work! Making the code more cache-friendly and generally having more eyes concerned with performance on it is very much in my interest, so please feel free to join the efforts in making As for the implementation of a cache, there already is Please note that making such changes should serve an actual product or tool so they serve a purpose/satisfy a demand, while having stakeholders that are interested in keeping it functioning.
Thanks for your response! I've submitted an initial pull request that could speed up the blame process by using an existing blame as a starting point. In the full implementation, we perform a tree diff between the commit for which we have a cache and the new commit. The files that were modified or added can then be blamed using the updated I wasn't aware of
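The tree-diff step mentioned above could look roughly like the following. This is only a sketch under stated assumptions: a tree is modelled as a plain map from file path to blob id rather than through any gitoxide API, and `Tree`, `BlamePlan` and `plan_blame` are hypothetical names. It also assumes that a file whose blob id is identical in both commits can reuse its cached blame; a file that was changed and later reverted between the two commits would need separate handling.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened tree: file path -> blob id (hex string).
pub type Tree = BTreeMap<String, String>;

/// Which files can reuse the cached blame and which must be blamed again.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct BlamePlan {
    pub reuse: Vec<String>,
    pub reblame: Vec<String>,
}

/// Compare the tree of the cached commit with the tree of the new commit.
/// Files that were added or modified go to `reblame`; files with an
/// unchanged blob id go to `reuse`; deleted files drop out entirely.
pub fn plan_blame(cached: &Tree, target: &Tree) -> BlamePlan {
    let mut plan = BlamePlan::default();
    for (path, blob) in target {
        match cached.get(path) {
            Some(old_blob) if old_blob == blob => plan.reuse.push(path.clone()),
            _ => plan.reblame.push(path.clone()),
        }
    }
    plan
}
```

Only the paths in `reblame` need a new blame run, and each of those runs can stop at the cached commit instead of walking the whole history.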
@Byron It's definitely serving an actual problem for my product (in stealth), but it's a tiny bootstrapped startup so I can't guarantee any long-term commitment yet :) Please be aware of that when reviewing / accepting changes from our side.
Thanks for being so upfront about it, much appreciated! Accepting cache support in some shape or form should, and this is my hope, also mean that you can submit fixes and improvements to the core algorithm in the future, without having to fork and maintain your own.
Summary 💡
Would there be any interest in a git blame cache? I came across this thread, where this is discussed along with some links to existing implementations.
This thesis also implemented something similar for a company.
With the recent improvements in speed achieved by @cruessler, I think this might be a nice addition to beat git, and I would be interested in giving the implementation a shot.
Motivation 🔦
No response