Public link to this material: https://github.com/tum-db/indexed-algebra
This repository contains the scripts required to recreate all benchmarks for our VLDB 2023 paper "Asymptotically Better Query Optimization Using Indexed Algebra".
The following steps explain how to reproduce the results in the Evaluation section, i.e., Table 2 and Figures 12, 13, 14, and 16.
# Extract umbra binaries
tar -xf umbra.tar.xz
cd umbra
# Load benchmark databases. The following scripts download and generate the
# datasets, before they load them into Umbra. These should take a couple of
# minutes each.
scripts/tpch/dbgen.sh
scripts/tpcds/dbgen.sh
scripts/job/dbgen.sh
# Execute the measurements in Umbra
# This will take about 30 minutes
bin/sql '' measure.sql
cd ..
The measurements should now be in an opt.csv file inside the umbra directory.
The numbers used in the paper are located in data/opt.csv, which can be used in the following example to generate the figures.
The included R scripts generate the LaTeX figures.
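Before the results are copied over the paper's numbers, it can be useful to confirm that the measurement run actually produced a usable file. The following is a minimal sketch, not part of the repository: `check_csv` is a hypothetical helper name, and the only assumption about the file is that it should exist and be non-empty.

```shell
#!/bin/sh
# Sketch: confirm that a measurement CSV exists and is non-empty.
# check_csv is a hypothetical helper, not a script shipped with the repository.
check_csv() {
    file="$1"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # Report the line count so a truncated run is easy to spot
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "$file: $lines lines"
}

# Example call from the repository root, assuming the run wrote umbra/opt.csv:
# check_csv umbra/opt.csv
```

The same helper could be pointed at any of the other CSV files that the later steps produce.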
# (optional) copy results to data directory
cp umbra/opt.csv data/opt.csv
# Install R dependencies
R --vanilla --interactive < <(echo "install.packages(c('data.table', 'cowplot', 'plyr', 'ggplot2', 'this.path', 'RColorBrewer', 'sqldf', 'tikzDevice', 'xtable'))")
# Generate the figures
./scripts/unnestingComparison.r
The images subdirectory should now contain four .tikz files containing the figures.
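The expected number of generated files can be checked with a short sketch like the one below. `count_tikz` is a hypothetical helper, not part of the repository; it assumes the figures are written directly into the images directory rather than into subfolders. If other figure scripts have already been run, the count may be higher than four.

```shell
#!/bin/sh
# Sketch: count the .tikz files directly inside a directory (default: images).
# count_tikz is a hypothetical helper, not a script shipped with the repository.
count_tikz() {
    dir="${1:-images}"
    n=0
    for f in "$dir"/*.tikz; do
        # The glob stays unexpanded when nothing matches, so test each entry
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example call from the repository root:
# count_tikz images
```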
The following steps show how to reproduce the measurements of Figures 15a and 15b. The scripts assume that you have already run the previous steps to extract Umbra and install the R dependencies.
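These prerequisites can be checked up front with a small sketch such as the following. `missing_cmds` is a hypothetical helper, not part of the repository; it only reports which of the given commands are not on the PATH and does not verify that the R packages themselves are installed.

```shell
#!/bin/sh
# Sketch: print every command from the argument list that is not on the PATH.
# missing_cmds is a hypothetical helper, not a script shipped with the repository.
missing_cmds() {
    for cmd in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Example calls from the repository root, using the tools named in this README:
# missing_cmds tar R latexmk
# [ -x umbra/bin/sql ] || echo "umbra/bin/sql not found; extract umbra.tar.xz first"
```

An empty output means all listed tools were found.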
# Measure the tableau public workload
# This takes about 10 minutes
scripts/TableauPublic/measure.sh
# Then measure the small TPC-H end-to-end evaluation
# This takes about 1 minute
scripts/umbra_small_tpch.sh > smallTPCH.csv
The scripts/TableauPublic directory should now contain an opt.csv and an execution.csv file, and there should be a smallTPCH.csv file in the current directory. The following R script generates the LaTeX figures:
# (optional) copy results to data directory
cp scripts/TableauPublic/opt.csv data/tableaupublicopt.csv
cp scripts/TableauPublic/execution.csv data/tableaupublicexecution.csv
cp smallTPCH.csv data/smallTPCH.csv
# Generate the figures
./scripts/interactiveWorkloads.r
The following steps explain how to reproduce Figure 17.
# This assumes an installation of the measured systems on your machine.
# Please refer to their documentation on how to install them.
# Measure all systems
./dbmsComparison/measure.sh
The measurements should now be in a dbs.csv file.
The numbers used in the paper are located in data/dbs.csv, which we use in the following example to generate the figure.
# (optional) copy results to data directory
cp dbs.csv data/dbs.csv
# For R dependencies see above
# Generate the figure
./scripts/dbmsComparison.r
The images subdirectory should now contain a .tikz file containing the figure.
To conveniently render the generated figures, we provide a small LaTeX wrapper around the generated TikZ files.
cd images
latexmk -pdf figures.tex
# figures.pdf contains a rendered PDF