Notes for Research Software Hour
Our program will evolve with time and based on feedback from the audience.
Each week, we want to cover a mix: something simple, something advanced, something on software and programming, and something on scientific computing and Linux.
Here is how we track ideas and how you can contribute:
- We use this README to save ideas for topics. They do not have to be concrete yet; "half-baked" ideas are fine, and you are most welcome to send us a pull request with more.
- We use the issue tracker to develop ideas further, collect suggestions, and prepare them for segments that we will use in future shows. You can comment on existing issues and submit new issues.
- Suggest your own code and get constructive feedback and suggestions for improvements.
Types of segments, from least preparation to most (example prep time)
- Q&A (0m)
- Stump us (0m)
- Poll to decide spontaneous topic (0m)
- "Commercial break" (2m)
- Point out other good stuff to listen to (5m)
- Normal pair programming (15m)
- Demonstrate a tool which we recently learned about
- Teach something (20m)
- Review some code (20m)
- Interviews (1h)
Our first focus is research software skills, such as:
- A tour of the Rust programming language
- Profiling code for CPU and memory bottlenecks
- Workflow management
- Pair programming with our normal work
- CMake best practices
- Demonstrating a few Python packages that we use all the time
- Sample data: how would we analyze it? Make a repo, load, reformat, wrangle, plot.
- What are our favorite visualization tools and why?
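As a taste of what the profiling topic could look like, here is a minimal sketch using only the Python standard library (`cProfile` for CPU time, `tracemalloc` for memory). The function `slow_sum_of_squares` and the helpers `profile_cpu` and `profile_memory` are hypothetical names made up for illustration; a real segment would profile actual research code.

```python
import cProfile
import io
import pstats
import tracemalloc


def slow_sum_of_squares(n):
    """Deliberately naive example workload: builds a full list before summing."""
    squares = [i * i for i in range(n)]
    return sum(squares)


def profile_cpu(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


def profile_memory(func, *args):
    """Run func under tracemalloc and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    value, report = profile_cpu(slow_sum_of_squares, 100_000)
    print(report)
    value, peak = profile_memory(slow_sum_of_squares, 100_000)
    print(f"peak traced memory: {peak / 1e6:.1f} MB")
```

Measuring CPU and memory in separate runs keeps the two measurements from distorting each other, since both profilers add overhead.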
There are tons of related tools for working with code:
- Go through git usage
- Adding automated testing to a repo
- Moving documentation from X to readthedocs
- Working with pip/conda for Python
- How we use GitHub and GitLab
- Jupyter, and Jupyter pitfalls and how to compensate
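For the automated testing topic, a first step could be as small as the sketch below. It assumes pytest as the test runner, which collects functions named `test_*` from files named `test_*.py`. The file name `test_stats.py` and the function `mean` are hypothetical stand-ins for code from a real repo.

```python
# test_stats.py (hypothetical file name; pytest collects files named test_*.py)


def mean(values):
    """Arithmetic mean; raises ValueError on empty input."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


def test_mean_of_known_values():
    # A hand-checked case: (1 + 2 + 3 + 4) / 4
    assert mean([1, 2, 3, 4]) == 2.5


def test_mean_rejects_empty_input():
    # Plain try/except keeps this runnable without importing pytest.
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

In a real repo the function would live in its own module and the test file would import it. Once such tests exist, a CI service can run them on every push.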
We can take the code from a listener, go through it, discuss and give suggestions for improvements. Suggest by opening an issue in this repository or contacting one of us.
- The code should have an open license, so that we can distribute it to our viewers and add it to our materials. In return, we distribute our improvements back to you. The review will be streamed on a public, CC-BY licensed stream.
- We should be able to explain the gist of it quickly, and provide some interesting feedback to our community.
- We will link to the code, so that others can check it out themselves - both before and after.
- You can join us on stream and discuss with us live, but you can also remain anonymous if you want.
Go through sections of other courses, such as
- CodeRefinery
- High-performance computing
Linux/Unix is the key to what we do, and having a good grasp of it will improve almost all of your work (whatever operating system you use):
- porting to an HPC system / using our HPC systems
- bash, ssh, shell scripting, etc.
- tmux/screen
- getopts
- git-pr
- tldr
Everything is data, and sometimes your software problems are actually data problems:
- Data management
- Data repositories
- Data formats, for example the Feather format
- Optimizing data I/O
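To illustrate why the choice of data format matters for I/O, here is a rough sketch that writes the same numbers as text and as raw binary and compares the file sizes. Feather and similar formats need third-party libraries, so this sketch swaps in the standard library `array` module as a stand-in for a binary format. All function names and file names are hypothetical.

```python
import array
import os
import tempfile


def write_text(path, values):
    """One value per line, as decimal text."""
    with open(path, "w") as handle:
        for value in values:
            handle.write(repr(value) + "\n")


def read_text(path):
    with open(path) as handle:
        return [float(line) for line in handle]


def write_binary(path, values):
    """Raw doubles in native byte order (not portable across architectures)."""
    with open(path, "wb") as handle:
        array.array("d", values).tofile(handle)


def read_binary(path, count):
    data = array.array("d")
    with open(path, "rb") as handle:
        data.fromfile(handle, count)
    return data.tolist()


def compare_sizes(values):
    """Write values both ways to a temporary directory; return (text_bytes, binary_bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        text_path = os.path.join(tmp, "values.txt")
        binary_path = os.path.join(tmp, "values.bin")
        write_text(text_path, values)
        write_binary(binary_path, values)
        # Both files must give back exactly what was written.
        assert read_text(text_path) == values
        assert read_binary(binary_path, len(values)) == values
        return os.path.getsize(text_path), os.path.getsize(binary_path)


if __name__ == "__main__":
    text_bytes, binary_bytes = compare_sizes([i / 7 for i in range(1000)])
    print(f"text: {text_bytes} bytes, binary: {binary_bytes} bytes")
```

The raw binary file here carries no metadata at all, which is exactly what self-describing formats such as Feather add on top.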
Good software leads to good open science, and vice versa:
- How to get a DOI
- How to open-source your code
- ReproHack
- Docker containers: how to run an image, how to containerize something
- Getting your Python project to PyPI
We answer any questions our viewers may have. Try to stump us!
- TIL: collections of short stuff someone has learned, 1-3 minutes each: https://github.com/jbranchaud/til/