# Evaluation pipeline
Aggregates raw per-run measurements into figures (PDF + tikz) and a pgfkeys
`values.tex` for the paper.
## Run
With Docker (reproducible toolchain):

    docker compose up --build

Outputs land in `out/`:
- `out/values.tex` -- all `\val{...}` keys for the paper
- `out/<experiment>/*.pdf` -- per-experiment figures
- `out/<experiment>/*.tex` -- tikz versions (via `make tikz`)
Without Docker, you need R (with renv), Python 3.12+, uv, and LaTeX (for
tikzDevice's metric probe). Then run:

    make all

## Targets
- `make all` (= `figures values`)
- `make figures` -- PDFs for all experiments
- `make tikz` -- tikz `.tex` for all experiments
- `make values` -- regenerate `out/values.tex` only
- `make derive` -- aggregate raw data into `derived/<experiment>/*.csv`
- `make sanity` -- shape + NaN + solution-coverage checks on derived CSVs
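
The three kinds of checks named for `make sanity` could look roughly like the
sketch below. It is not the repository's actual implementation: the function
name `check_derived_csv`, the `solution` column, and the solution names used
in the usage note are all assumptions made for illustration.

```python
import csv
import io
import math


def _is_nan(cell):
    """True if the cell parses as a floating-point NaN."""
    try:
        return math.isnan(float(cell))
    except ValueError:
        return False  # non-numeric text is not treated as NaN here


def check_derived_csv(text, expected_solutions, solution_column="solution"):
    """Return a list of human-readable problems found in one derived CSV.

    `solution_column` and `expected_solutions` are hypothetical; adapt them
    to the real column layout of the derived files.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    problems = []

    for lineno, row in enumerate(body, start=2):
        # Shape: every data row must have as many fields as the header.
        if len(row) != len(header):
            problems.append(
                f"line {lineno}: expected {len(header)} fields, got {len(row)}"
            )
        # NaN: flag empty cells and cells that parse to NaN.
        for name, cell in zip(header, row):
            if cell.strip() == "" or _is_nan(cell):
                problems.append(f"line {lineno}: column {name!r} is empty or NaN")

    # Solution coverage: every expected solution must appear at least once.
    if solution_column not in header:
        problems.append(f"missing column {solution_column!r}")
    else:
        idx = header.index(solution_column)
        seen = {row[idx] for row in body if len(row) > idx}
        for missing in sorted(set(expected_solutions) - seen):
            problems.append(f"no rows for solution {missing!r}")
    return problems
```

Calling `check_derived_csv(text, ["cake", "fq"])` on the contents of one
derived CSV returns an empty list when all three checks pass.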
## Configuration
`RAW_DATA_ROOT` controls where the aggregator reads the raw per-run measurements from.
Precedence (highest first):
1. command line: `make RAW_DATA_ROOT=/mnt/data ...`
2. environment: `RAW_DATA_ROOT=/mnt/data make ...`
3. `local.mk` (copy from `local.mk.example`)
4. Makefile default: `../raw_data`
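
The precedence order above can be modeled in a few lines of Python. This is
only an illustration of the lookup order, not how the Makefile implements it;
the function `resolve_raw_data_root` and its parameters are hypothetical, and
the sketch treats empty values as unset, which may differ from what make does.

```python
import os
from typing import Mapping, Optional

MAKEFILE_DEFAULT = "../raw_data"  # default stated in this README


def resolve_raw_data_root(
    cli_value: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
    local_mk_value: Optional[str] = None,
) -> str:
    """Return the first value that is set, highest precedence first."""
    env = os.environ if environ is None else environ
    candidates = [
        cli_value,                  # 1. make RAW_DATA_ROOT=... on the command line
        env.get("RAW_DATA_ROOT"),   # 2. environment variable
        local_mk_value,             # 3. value read from local.mk
    ]
    for candidate in candidates:
        if candidate:  # simplification: unset and empty both fall through
            return candidate
    return MAKEFILE_DEFAULT         # 4. Makefile default
```

For example, `resolve_raw_data_root(None, {"RAW_DATA_ROOT": "/mnt/data"}, "/srv/raw")`
picks the environment value because no command-line value is given.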
For Docker, the same variable picks the host path that gets bind-mounted as
`/raw_data` inside the container:

    RAW_DATA_ROOT=/mnt/data docker compose up

## Layout

    analysis/       Python: aggregation, sanity, gen_values, plugins
      values/       one plugin per metric family (cpu, rtt, idt, ...)
    figures/        R: ggplot scripts, common.R, renv.lock
    out/            generated (gitignored)
    derived/        intermediate CSVs (gitignored)

## Plugins
`analysis/values/*.py` each expose `compute(derived) -> (keys, sources)`.
Keys are pgfkeys paths (e.g. `datacenter-fq/sender-cpu/cake/mean-pct`);
`gen_values.py` merges them, wraps numbers in `\qty{}{}` / `\num{}` by
suffix, and writes one `values.tex`. Add a new metric by dropping a new
plugin into `analysis/values/`.
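
A minimal sketch of what such a plugin might look like, assuming `derived` is
the path to the `derived/` directory and that `keys` and `sources` are dicts
keyed by pgfkeys path. The file name `sender_cpu.csv`, the columns `solution`
and `cpu_pct`, the helper `wrap_value`, and the suffix-to-unit table are all
hypothetical; the real interface and the wrapping rules in `gen_values.py` may
differ.

```python
import csv
import statistics
from collections import defaultdict
from pathlib import Path

EXPERIMENT = "datacenter-fq"   # experiment name from the example key above
CSV_NAME = "sender_cpu.csv"    # hypothetical derived file name

# Hypothetical mapping from key suffix to siunitx unit.
UNIT_BY_SUFFIX = {"-pct": r"\percent", "-ms": r"\milli\second"}


def compute(derived):
    """Return (keys, sources) for the mean sender CPU load per solution.

    Assumes the CSV has `solution` and `cpu_pct` columns (illustrative only).
    """
    csv_path = Path(derived) / EXPERIMENT / CSV_NAME
    samples = defaultdict(list)
    with csv_path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            samples[row["solution"]].append(float(row["cpu_pct"]))

    keys, sources = {}, {}
    for solution, values in sorted(samples.items()):
        key = f"{EXPERIMENT}/sender-cpu/{solution}/mean-pct"
        keys[key] = statistics.fmean(values)
        sources[key] = str(csv_path)  # record where the number came from
    return keys, sources


def wrap_value(key, value):
    """Wrap a number in \\qty{}{} if the key suffix implies a unit, else \\num{}."""
    for suffix, unit in UNIT_BY_SUFFIX.items():
        if key.endswith(suffix):
            return rf"\qty{{{value:g}}}{{{unit}}}"
    return rf"\num{{{value:g}}}"
```

Returning `sources` alongside `keys` lets the merged output be traced back to
the derived CSV each number was computed from.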