Benchmark

This document describes how the benchmarking and performance testing of Ribasim is handled. In Ribasim, the benchmarking includes regression tests on the test models and performance regression tests on the production models.

The idea of the regression tests on the test models is to run the models with various solvers, with both a sparse and a dense Jacobian, and compare the outputs. These tests may involve the production models in the future, and a runtime performance test is planned as the next step (issue #1698).

The idea of the performance regression tests on the production models is to test the runtime performance of the production models. They report whether new code changes degrade the models' performance or cause runs to fail.

1 Benchmarking of the test models

1.1 Benchmark the ODE solvers

The benchmarking of the ODE solvers is done by running the test models with different ODE solvers and solver settings and comparing the output with the benchmark.

The settings include toggling the sparse and autodiff solver options. Currently, four test models undergo the regression tests: trivial, basic, pid_control and subnetwork_with_sources.
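
As an illustration, a run matrix over these models and settings could look like the following Python sketch. The directory layout and per-combination config file names are assumptions, not the actual repository layout, and the ribasim CLI is assumed to be on the PATH.

```python
import itertools
import subprocess

# The four test models under regression testing (from the text above).
MODELS = ["trivial", "basic", "pid_control", "subnetwork_with_sources"]

# Run each model with every combination of the sparse and autodiff
# solver settings. The per-combination TOML paths are hypothetical;
# in practice each combination needs a config with matching solver settings.
for model, sparse, autodiff in itertools.product(MODELS, (True, False), (True, False)):
    toml = f"generated_testmodels/{model}/ribasim_sparse={sparse}_autodiff={autodiff}.toml"
    subprocess.run(["ribasim", toml], check=True)
```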

The benchmark reference is the output of a run of the test models with the default solver settings. The output files basin.arrow and flow.arrow are used for comparison. Different margins are set for the comparison of the outputs, and the benchmark is considered passed if the output is within the margin. Since we are still in the process of evaluating the performance of different solvers, the margins are subject to change.
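
A minimal sketch of such a comparison, assuming the outputs are Arrow IPC files readable with pandas, and using placeholder tolerances (the actual margins differ per file and are still being tuned):

```python
import numpy as np
import pandas as pd

def within_margin(result_path: str, reference_path: str,
                  rtol: float = 1e-2, atol: float = 1e-5) -> bool:
    """Check whether a run's output is within the margin of the benchmark.

    The tolerances here are placeholders; the real margins are per-file
    and subject to change.
    """
    result = pd.read_feather(result_path)
    reference = pd.read_feather(reference_path)
    numeric = result.select_dtypes("number").columns
    return bool(np.allclose(result[numeric], reference[numeric],
                            rtol=rtol, atol=atol))

# Compare both output files against the benchmark reference;
# the directory names are assumptions.
for name in ("basin.arrow", "flow.arrow"):
    ok = within_margin(f"results/{name}", f"benchmark/{name}")
    print(name, "passed" if ok else "failed")
```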

The regression tests are run on a weekly basis.

2 Benchmarking of the production model

The performance regression tests on the production models are done by running the production models with the new code changes and comparing the runtime performance with a reference run. The reference is the output of a run of the production models with the default solver settings. The output file basin_state.arrow, which records the end states of the basins, is used for comparison. Since the development of the models is still ongoing, the benchmark is subject to change.
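
A hedged sketch of such a check: time a full production run and compare the recorded end states against the reference. The model path, directory names and the level column are assumptions about the output layout, not confirmed details of the production setup.

```python
import subprocess
import time

import pandas as pd

# Time a full production model run (the model path is hypothetical).
start = time.perf_counter()
subprocess.run(["ribasim", "production_model/ribasim.toml"], check=True)
runtime = time.perf_counter() - start

# Compare the recorded end states of the basins with the reference run;
# the "level" column name is an assumption about basin_state.arrow.
state = pd.read_feather("production_model/results/basin_state.arrow")
reference = pd.read_feather("benchmark/basin_state.arrow")
max_diff = (state["level"] - reference["level"]).abs().max()
print(f"runtime: {runtime:.1f} s, max end-state difference: {max_diff:.3g}")
```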

The performance regression tests are currently run on a weekly basis.