
Added benchmark notebook to generate videos and run benchmarks #285

Closed
wants to merge 11 commits

Conversation

@ahmadsharif1 (Contributor) commented Oct 23, 2024

Add a script to run latency benchmarks and chart them using matplotlib.

Add the resulting chart to the README.md file.
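For readers skimming the PR, here is a minimal, hypothetical sketch of the kind of measure-and-plot loop such a script performs; the `decode_first_frame` placeholder and the decoder name are illustrative, not the actual code in this PR:

```python
# Hypothetical sketch of a latency benchmark, not the script added by this PR.
import timeit

import matplotlib.pyplot as plt


def decode_first_frame():
    """Placeholder for the operation under test, e.g. a decoder call."""
    sum(range(1_000))  # stand-in work so the timing is non-trivial


benchmarks = {"decoder_a": decode_first_frame}  # made-up decoder name
results_ms = {}
for name, fn in benchmarks.items():
    # timeit.repeat returns total seconds per repeat; take the best repeat
    # and divide by the call count to estimate per-call latency.
    times = timeit.repeat(fn, number=100, repeat=5)
    results_ms[name] = min(times) / 100 * 1e3

fig, ax = plt.subplots()
ax.bar(list(results_ms), list(results_ms.values()))
ax.set_ylabel("latency (ms)")
fig.savefig("benchmark_latency.png")
```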

@facebook-github-bot added the "CLA Signed" label on Oct 23, 2024 (this label is managed by the Meta Open Source bot)
@ahmadsharif1 marked this pull request as ready for review on October 23, 2024 at 20:35
````diff
@@ -118,6 +118,16 @@ The instructions below assume you're on Linux.
 pip install torchcodec
 ```
+
+## Benchmark results
+
+The following results were obtained by running the [benchmark_decoders.ipynb](./benchmarks/decoders/benchmark_decoders.ipynb) on a lightly-loaded 22-core machine. We first get the operation latency for various seek and decode patterns in a loop from a single Python thread.
````
A Member commented on this line:
Our README and other files restrict the line length to ~80 characters; we should try to keep it that way (it also makes review easier, since comments can be located on specific lines).

@NicolasHug (Member) left a comment:

Thanks @ahmadsharif1

General comments:

  • ipynb files are pretty difficult to manage with version-control systems like git: changing a single line generates tons of JSON churn, which makes reviewing quite difficult. It might be best to convert that script to a pure .py file but keep its notebook flavor by starting cells with lines that begin with #%%, e.g. the same thing we do for our simple_example.py. Modern IDEs are capable of interpreting those as notebooks (see the sketch after this list).
  • The plots are great, but the resulting image is pretty long and takes a massive amount of real estate in the README (see it on your fork here: https://github.com/ahmadsharif1/torchcodec/blob/bench1/README.md). It might be worth moving that image to a separate file, e.g. BENCHMARKS.md, that we can link to directly from the README?
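For reference, here is a minimal sketch of that #%% cell style; the file name and cell contents are made up, not the actual benchmark code:

```python
# benchmark_decoders.py -- a plain .py file that IDEs render as a notebook.
# Each "#%%" marker starts a new cell (VS Code, PyCharm, and Spyder all
# understand this convention, and the file still runs as an ordinary script).

#%% [markdown]
# # Decoder benchmarks
# This cell renders as markdown in notebook-aware editors.

#%%
import timeit

elapsed = timeit.timeit("sum(range(1000))", number=1000)  # one code cell

#%%
print(f"{elapsed:.4f} s")  # a separate cell
```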

Comments on the plot:

  • the FPS scale is only plotted at the very bottom, which makes it hard to get a sense of the actual FPS values for the first few plots
  • the "decoder" legend is repeated on the left and on every single plot. It might be best to keep only one so as not to overload the plots (see the plotting sketch below)

@scotts (Contributor) commented Oct 24, 2024

I think this benchmark is great, as are the results, but it's way too much for the README. For the README, we want to pick one experiment that we think is convincing to a new potential user, and show our performance there with one simple bar chart. As @NicolasHug suggests, we can always link to more detailed info from the README, but I don't think we want to include this image in the README itself.

If we compare our repo to an academic paper, the README is the introduction. Performance graphs sometimes appear in intros, but usually as motivation and teasers for how great the rest of the paper is. Such graphs are usually easy to understand without a detailed reading of the text, and they make one motivating point.

For other GitHub repos, I think the way Decord and TorchTune present performance is closer to the approach we want to take.

With all that said, I think we can just pick one experiment out of this set, and display that on the README. For now, we can just link to the larger image. In the future, we'll want an article in our docs explaining the benchmark.

@ahmadsharif1 (Contributor, Author)

Thanks for the comments. For now, I can pick just one experiment and show its results.

Later on, we can publish more experiments in a separate directory with docs.
