sections/reproducibility-containers.qmd (+27 -3)
@@ -4,12 +4,35 @@ title: "Reproducibility and Containers"
 
 ## Learning Objectives
 
-- Think about dependency management, reproducibility, and software
-- Become familiar with containers as a tool to improve computational reproducibility
+- Think about dependency management, reproducibility and software management
 - Discuss how the techniques from this class can improve reproducibility
 
+
 - Slides: [Accelerating synthesis science through reproducible science](../images/2022-09-repro-sci.pdf)
 
+## Summary
+
+In this course we reviewed many tools and techniques for how to make research more reproducible and scalable. These tools focus around three main areas: the environment, the data, and the code.
+
+### Reproducible Environments
+
+- Virtual environments with `venv` and `virtualenvwrapper`
+- Python dependencies with `requirements.txt`
+- Containers with Docker
+
+### Accessible Data
+
+- Publishing with the Arctic Data Center
+- Formats for large datasets: NetCDF and Zarr
+
+### Scalable Python
+
+- Parallel with `concurrent.futures`, `parsl`, `dask`
+- N-dimensional data access with `xarray`
+- Geospatial analysis with `geopandas` and `rasterio`
+- Software design and python packages
+
+
 ## Software collapse
 
 ::: {layout-ncol="2"}
@@ -44,11 +67,12 @@ This approach combines the evolving approach to using a `Dockerfile` to precisel
 
 :::
 
+
+
 ## Discussion
 
 To wrap up this week, let's kick off a discussion with a couple of key questions.
 
 - As we learn new tools for scalable and reproducible computing, what can we as software creators do to improve robustness and ease maintenance of our packages?
 - Given the fragility of software ecosystems, are you worried about investing a lot of time in learning and building code for proprietary cloud systems? Can we compel vendors to keep their systems open?
-- How much can technological solutions such as containers truly address the issues around depencies, maintainability, and sustainability of scientific software?
 - What relative fraction of the research budget should funders invest in software infrastructure, data infrastructure, and research outcomes?