Usage

1 Configuration file

Ribasim has a single configuration file, which is written in the TOML format. It contains settings, as well as paths to other input and output files. Ribasim expects the GeoPackage database database.gpkg as well as optional Arrow input files to be available in the input_dir.

# start- and endtime of the simulation
# can also be set to a date-time like 1979-05-27T07:32:00
starttime = 2019-01-01 # required
endtime = 2021-01-01   # required

# Coordinate Reference System
# The accepted strings are documented here:
# https://proj.org/en/9.4/development/reference/functions.html#c.proj_create
crs = "EPSG:4326"      # required

# input files
input_dir = "."         # required
results_dir = "results" # required

ribasim_version = "2025.4.0" # required

# Specific tables can also go into Arrow files rather than the database.
# For large tables this can benefit from better compressed file sizes.
# This is optional, tables are retrieved from the database if not specified in the TOML.
[basin]
time = "basin/time.arrow"

[interpolation]
flow_boundary = "block"   # optional, default "block", can otherwise be "linear"
block_transition_period = 0  # optional, default 0

[allocation]
timestep = 86400         # optional (required if experimental.allocation = true), default 86400
[allocation.source_priority]
user_demand = 1000       # optional, default 1000
flow_boundary = 2000     # optional, default 2000
level_boundary = 3000    # optional, default 3000
basin = 4000             # optional, default 4000
subnetwork_inlet = 5000  # optional, default 5000

[solver]
algorithm = "QNDF"  # optional, default "QNDF"
saveat = 86400      # optional, default 86400, 0 saves every timestep, inf saves only at start- and endtime
dt = 60.0           # optional, remove for adaptive time stepping
dtmin = 0.0         # optional, default 0.0
dtmax = 0.0         # optional, default length of simulation
force_dtmin = false # optional, default false
abstol = 1e-5       # optional, default 1e-5
reltol = 1e-5       # optional, default 1e-5
water_balance_abstol = 1e-3 # optional, default 1e-3
water_balance_reltol = 1e-2 # optional, default 1e-2
maxiters = 1e9      # optional, default 1e9
sparse = true       # optional, default true
autodiff = true     # optional, default true
evaporate_mass = true  # optional, default true to simulate a correct mass balance

[logging]
# defines the logging level of Ribasim
verbosity = "info" # optional, default "info", can otherwise be "debug", "warn" or "error"

[results]
compression = true    # optional, default true, using zstd compression
compression_level = 6 # optional, default 6
subgrid = false       # optional, default false

[experimental]
# Experimental features, disabled by default
concentration = false # tracer calculations
allocation = false # allocation layer, replaced by 'first come first serve' when inactive

1.1 Solver settings

The solver section in the configuration file is entirely optional, since we aim to use defaults that will generally work well. Common reasons to modify the solver settings are to adjust the calculation or result stepsizes: dt, and saveat. If your model does not converge, or your performance is lower than expected, it can help to adjust other solver settings as well.

The default solver algorithm = "QNDF", which is a multistep method similar to Matlab’s ode15s (Shampine and Reichelt 1997). It is an implicit method that supports the default adaptive timestepping. The full list of available solvers is: QNDF, FBDF, Rosenbrock23, Rodas4P, Rodas5P, TRBDF2, KenCarp4, Tsit5, RK4, ImplicitEuler, Euler. Information on the solver algorithms can be found on the ODE solvers page.

By default Ribasim uses adaptive timestepping, though not all algorithms support adaptive timestepping. To use fixed timesteps, provide a timestep size in seconds; dt = 3600.0 corresponds to an hourly timestep. With adaptive timestepping, dtmin and dtmax control the minimum and maximum allowed dt. If a smaller dt than dtmin is needed to meet the set error tolerances, the simulation stops, unless force_dtmin is set to true. force_dtmin is off by default to ensure an accurate solution.

The default result stepsize, saveat = 86400 will save results after every day that passed. The calculation and result stepsize need not be the same. If you wish to save every calculation step, set saveat = 0. If you wish to not save any intermediate steps, set saveat = inf.

The water balance error is a measure of the error in the consistency with which the core keeps track of the water resources per Basin, for more details see here. water_balance_abstol and water_balance_reltol give upper bounds on this error, above which an error is thrown. A too large error generally indicates an error in the code or floating point truncation errors.

The Jacobian matrix provides information about the local sensitivity of the model with respect to changes in the states. For implicit solvers it must be calculated often, which can be expensive to do. There are several methods to do this. By default Ribasim uses a Jacobian derived automatically using ForwardDiff.jl with memory management provided by PreallocationTools.jl. If this is not used by setting autodiff = false, the Jacobian is calculated with a finite difference method, which can be less accurate and more expensive.

By default the Jacobian matrix is a sparse matrix (sparse = true). Since each state typically only depends on a small number of other states, this is generally more efficient, especially for larger models. The sparsity structure is calculated from the network and provided as a Jacobian prototype to the solver. For small or highly connected models it could be faster to use a dense Jacobian matrix instead by setting sparse = false.

The total maximum number of iterations maxiters = 1e9, can normally stay as-is unless doing extremely long simulations.

The absolute and relative tolerance for adaptive timestepping can be set with abstol and reltol. For more information on these and other solver options, see the DifferentialEquations.jl docs and the DifferentialEquations.jl FAQ.

Finally there’s the evaporate_mass = true setting, which determines whether mass is lost due to evaporation in water quality calculations, by default set to true. While physically incorrect, it is useful for a first correctness check on a model in terms of mass balance (Continuity tracer should always have a concentration of 1). To simulate increasing concentrations (e.g. salinity) due to evaporation, change the setting to false.

1.2 Interpolation settings

There are the following interpolation settings: - flow_boundary: The interpolation type of flow boundary timeseries. This is linear by default, but can also be set to block. - block_transition_period: When an interpolation type is set to block, this parameter determines an interval in time on either side of each data point which is used to smooth the transition between data points. See also the documentation for this interpolation type.

1.3 Allocation settings

There are the following allocation settings: - timestep: A float value in seconds which dictates the update interval for allocations; - source_priority: An integer per source type for the allocation algorithm: user_demand, boundary, level_demand, flow_demand, subnetwork_inlet. Flow boundaries and level boundaries are combined in the single category boundary. By default all nodes of the same type have the same source priority, so to obtain a strict source ordering the sources are sorted by node ID for each source priority within a subnetwork. When no default source priorities are specified, default defaults are applied (see the TOML example above).

1.4 Results settings

The following entries can be set in the configuration in the [results] section.

entry	type	description
compression	Bool	Whether to apply compression or not.
compression_level	Int	Zstandard compression level. Default is 6, higher compresses more.
subgrid	Bool	Compute and output more detailed water levels.

1.5 Logging settings

The following can be set in the configuration in the [logging] section.

entry	type	description
verbosity	String	Verbosity level: debug, info, warn, or error.

If verbosity is set to debug, the used Basin / profile dimensions (level, area and storage) are written to a CSV file in the results folder. This can be useful if you only provide 2 of the 3 columns and want to inspect the dimensions used in the computation.

The format of the CSV is: column 1 = node id, column 2 = level, column 3 = area and row 4 is storage.

Lets say you have 2 basins at node 1 and node 2. Dimensions node 1: level = [0, 1, 2], area = [2, 2, 4] and storage = [0, 2, 6], Dimensions node 1: level = [0, 1, 2], area = [4, 4, 8] and storage = [0, 4, 12].

Then the CSV will look like:

node_id	level	area	storage
1	0	2	0
1	1	2	2
1	2	4	6
2	0	4	0
2	1	4	4
2	2	8	12

1.6 Experimental features

Important

Experimental features are completely unsupported. They can break at any time and results will be wrong. Do not use them in production. If you’re interested in using an experimental feature, please contact us.

One can enable experimental features in the [experimental] section. Currently the following features can be enabled (all are disabled by default).

entry	type	description
concentration	Bool	Whether to enable tracer calculations or not.
allocation	Bool	Whether to activate the activation layer. Replaced by ‘first come first serve’ when deactivated

2 GeoPackage database and Arrow tables

The input and output tables described below all share that they are tabular files. The Node and Link tables always have to be in the GeoPackage database file, and results are always written to Apache Arrow files, sometimes also known as Feather files. All other tables can either be in the database or in separate Arrow files that are listed in the TOML as described above.

For visualization, the Node and Link tables typically have associated geometries. GeoPackage was used since it provides a standardized way to store tables with (and without) geometry columns in a SQLite database. If, like Ribasim, you can ignore the geometries, a GeoPackage is easy to read using SQLite libraries, which are commonly available. Furthermore GeoPackage can be updated in place when working on a model.

Arrow was chosen since it is standardized, fast, simple and flexible. It can be read and written by many different software packages. In Ribasim we use Arrow.jl. Results are written to Arrow, since for long runs Ribasim can produce tables with many rows. Arrow is well suited for large tabular datasets, and file size is kept small by using compression. The Arrow input files can be compressed with LZ4 or Zstd compression. Furthermore, in some of the columns, a small amount of different values are repeated many times. To reduce file sizes it may be a good idea to apply dictionary encoding to those columns. The Ribasim version that was used to create the results is written to each file in the ribasim_version schema metadata.

2.1 Table requirements

Below we give details per file, in which we describe the schema of the table using a syntax like this:

column	type	unit	restriction
node_id	Int32	-	sorted
storage	Float64	\(\text{m}^3\)	non-negative

This means that two columns are required, one named node_id, that contained elements of type Int32, and a column named storage that contains elements of type Float64. The order of the columns does not matter. In some cases there may be restrictions on the values. This is indicated under restriction.

Tables are also allowed to have rows for timestamps that are not part of the simulation, these will be ignored. That makes it easy to prepare data for a larger period, and test models on a shorted period.

When preparing the model for simulation, input validation is performed in the Julia core. The validation rules are described in the validation section.

2.2 Custom metadata

It may be advantageous to add metadata to rows. For example, basin areas might have names and objects such as weirs might have specific identification codes. Additional columns can be freely added to tables. The column names should be prefixed with meta_. They will not be used in computations or validated by the Julia core.

3 Node

Node is a table that specifies the ID and type of each node of a model. The ID must be unique among all nodes, and the type must be one of the available node types listed below.

Nodes are components that are connected together to form a larger system. The Basin is a central node type that stores water. The other node types influence the flow between Basins in some way. Counter intuitively, even systems you may think of as links, such as a canal, are nodes in Ribasim. This is because links only define direct instantaneous couplings between nodes, and never have storage of their own.

column	type	restriction
node_type	String	sorted, known node type
node_id	Int32	sorted per node_type
geom	Point	(optional)
name	String	(optional, does not have to be unique)
subnetwork_id	Int32	(optional)
source_priority	Int32	(optional, does not have to be unique)
cyclic_time	Bool	(optional, defaults to false)

Adding a point geometry to the node table can be helpful to examine models in QGIS, as it will show the location of the nodes on the map. The geometry is not used by Ribasim.

3.1 Cyclic time series

When cyclic_time is set to true for a node in the Node table, every time series associated with that node in the corresponding table(s) will be interpreted as cyclic. That is: the time series is exactly repeated left and right of the original time interval to cover the whole simulation period. For this it is validated that the first and last data values in the timeseries are the same. For instance, quarterly precipitation requires giving values for every quarter at the start of the quarter, and then the value for the first quarter again at the start of the next year.

Note that periods like months or years are not of constant length in the calendar, so over long simulation periods the timeseries can get out of sync with these periods on the calendar.

4 Link

Links define connections between nodes. The only thing that defines an link is the nodes it connects, and in what direction. There are currently 2 possible link types:

“flow”: Flows between nodes are stored on links. The effect of the link direction depends on the node type, Node types that have a notion of an upstream and downstream side use the incoming link as the upstream side, and the outgoing link as the downstream side. This means that links should generally be drawn in the main flow direction. But for instance between two LinearResistances the link direction does not affect anything, other than the sign of the flow on the link. The sign of the flow follows the link direction; a positive flow flows along the link direction, a negative flow in the opposite way.
“control”: The control links define which nodes are controlled by a particular control node. Control links should always point away from the control node. The links between the control node and the nodes it listens to are not present in Link, these are defined in DiscreteControl / condition

column	type	restriction
from_node_id	Int32	-
to_node_id	Int32	-
link_type	String	must be “flow” or “control”
geom	LineString or MultiLineString	(optional)
name	String	(optional, does not have to be unique)

Similarly to the node table, you can use a geometry to visualize the connections between the nodes in QGIS. For instance, you can draw a line connecting the two node coordinates.

5 Results

5.1 Basin - `basin.arrow`

The Basin table contains:

Results of the storage and level of each Basin, which are instantaneous values;
Results of the fluxes on each Basin, which are mean values over the saveat intervals. In the time column the start of the period is indicated.
The initial condition is written to the file, but the final state is not. It will be placed in a separate output state file in the future.
The inflow_rate and outflow_rate are the sum of the flows from other nodes into and out of the Basin respectively. The actual flows determine in which term they are counted, not the link direction.
The storage_rate is the net mean flow that is needed to achieve the storage change between timesteps.
The inflow_rate consists of the sum of all modelled flows into the basin: inflow_rate (horizontal flows into the basin, independent of link direction) + precipitation + drainage.
The outflow_rate consists of the sum of all modelled flows out of the basin: outflow_rate (horizontal flows out of the basin, independent of link direction) + evaporation + infiltration.
The balance_error is the difference between the storage_rate on one side and the inflow_rate and outflow_rate on the other side: storage_rate - (inflow_rate - outflow_rate). It can be used to check if the numerical error when solving the water balance is sufficiently small.
The relative_error is the fraction of the balance_error over the mean of the total_inflow and total_outflow.
The convergence is the scaled residual of the solver, giving an indication of which nodes converge the worst (are hardest to solve).

For a more in-depth explanation of the water balance error see here.

column	type	unit
time	DateTime	-
node_id	Int32	-
level	Float64	\(\text{m}\)
storage	Float64	\(\text{m}^3\)
inflow_rate	Float64	\(\text{m}^3/\text{s}\)
outflow_rate	Float64	\(\text{m}^3/\text{s}\)
storage_rate	Float64	\(\text{m}^3/\text{s}\)
precipitation	Float64	\(\text{m}^3/\text{s}\)
surface_runoff	Float64	\(\text{m}^3/\text{s}\)
evaporation	Float64	\(\text{m}^3/\text{s}\)
drainage	Float64	\(\text{m}^3/\text{s}\)
infiltration	Float64	\(\text{m}^3/\text{s}\)
balance_error	Float64	\(\text{m}^3/\text{s}\)
relative_error	Float64	-
convergence	Float64/Missing	-

The table is sorted by time, and per time it is sorted by node_id.

5.2 Flow - `flow.arrow`

The flow table contains calculated mean flows over the saveat intervals for every flow link in the model. In the time column the start of the period is indicated.

column	type	unit
time	DateTime	-
link_id	Int32	-
from_node_type	String	-
from_node_id	Int32	-
to_node_type	String	-
to_node_id	Int32	-
flow_rate	Float64	\(\text{m}^3/\text{s}\)
convergence	Float64/Missing	-

The table is sorted by time, and per time the same link_id order is used, though not sorted. The link_id value is the same as the fid written to the Link table, and can be used to directly look up the Link geometry. Flows from the “from” to the “to” node have a positive sign, and if the flow is reversed it will be negative. - The convergence is the scaled residual of the solver, giving an indication of which nodes converge the worst (are hardest to solve).

5.3 State - `basin_state.arrow`

The Basin state table contains the water levels in each Basin at the end of the simulation.

column	type	unit
node_id	Int32	-
level	Float64	\(\text{m}\)

To use this result as the initial condition of another simulation, see the Basin / state table reference.

5.4 DiscreteControl - `control.arrow`

The control table contains a record of each change of control state: when it happened, which control node was involved, to which control state it changed and based on which truth state.

column	type	unit
time	DateTime	-
control_node_id	Int32	-
truth_state	String	-
control_state	String	-

5.5 Allocation - `allocation.arrow`

The allocation table contains a record of allocation results: when it happened, for which node, in which allocation network, and what the demand, allocated flow and realized flow were. The realized values at the starting time of the simulation can be ignored.

column	type	unit
time	DateTime	-
subnetwork_id	Int32	-
node_type	String	-
node_id	Int32	-
priority	Int32	-
demand	Float64	\(\text{m}^3/\text{s}\)
allocated	Float64	\(\text{m}^3/\text{s}\)
realized	Float64	\(\text{m}^3/\text{s}\)

Note

The LevelDemand node allocations are listed as node type Basin. This is because one LevelDemand node can link to multiple Basins, and doesn’t receive flow by itself.

For Basins the values demand, allocated and realized are positive if the Basin level is below the minimum level given by a LevelDemand node. The values are negative if the Basin supplies due to a surplus of water.

Note

Currently the stored demand and abstraction rate are those at the allocation timepoint (and the abstraction rate is based on the previous allocation optimization). In the future these will be an average over the previous allocation timestep.

5.6 Allocation flow - `allocation_flow.arrow`

The allocation flow table contains results of the optimized allocation flow on every link in the model that is part of a subnetwork, for each time an optimization problem is solved (see also here). If in the model a primary network and subnetwork(s) are specified, there are 3 different types of optimization for the subnetwork: The column optimization_type provides the distinction between these optimization types.

internal_sources: first it will try using local sources internal to the subnetwork
collect_demands: collecting its total demand per priority (for allocating flow from the primary network to the subnetwork)
allocate: allocating flow within the subnetwork

column	type	unit
time	DateTime	-
link_id	Int32	-
from_node_type	String	-
from_node_id	Int32	-
to_node_type	String	-
to_node_id	Int32	-
subnetwork_id	Int32	-
priority	Int32	-
flow_rate	Float64	\(\text{m}^3/\text{s}\)
optimization_type	String	-

5.7 Allocation feasibility analysis - `allocation_analysis_feasibility.log`

When an allocation optimization problem turns out to be infeasible, an infeasibility analysis is performed. Some user friendly data is logged in the main log, but the full report of the analysis is written to this separate file. For details on the infeasibility analysis see here.

5.8 Allocation scaling analysis - `allocation_analysis_scaling.log`

When an allocation optimization problem turns out to be infeasible, a scaling analysis is performed. Some user friendly data is logged in the main log, but the full report of the analysis is written to this separate file. For details on the infeasibility analysis see here.

5.9 Concentration - `concentration.arrow`

If the experimental concentration feature is enabled, the results are written to concentration.arrow. This file records the Basin concentrations for each substance over time. The schema below is identical to the external Basin concentration input.

column	type	unit	restriction
node_id	Int32	-	Basin nodes only
time	DateTime	-
substance	String	-	can correspond to known Delwaq substances
concentration	Float64	\(\text{g}/\text{m}^3\)

5.10 Subgrid level - `subgrid_level.arrow`

This result file is only written if the model contains a Basin / subgrid table. See there for more information on the meaning of this output.

column	type	unit
time	DateTime	-
subgrid_id	Int32	-
subgrid_level	Float64	\(\text{m}\)

5.11 Solver statistics - `solver_stats.arrow`

This result file contains statistics about the solver, which can give an insight into how well the solver is performing over time. The data is solved by saveat (see configuration file). water_balance refers to the right-hand-side function of the system of differential equations solved by the Ribasim core.

The computation_time is the wall time in milliseconds spent on the given period. The first row tends to include compilation time as well. The dt is the size (in seconds) of the last calculation timestep (at the saveat timestep).

column	type	unit
time	DateTime	-
computation_time	Float64	\(\text{ms}\)
water_balance_calls	Int	-
linear_solves	Int	-
accepted_timesteps	Int	-
rejected_timesteps	Int	-
dt	Float64	\(\text{s}\)

References

Shampine, Lawrence F, and Mark W Reichelt. 1997. “The Matlab Ode Suite.” SIAM Journal on Scientific Computing 18 (1): 1–22.

1 Configuration file

1.1 Solver settings

1.2 Interpolation settings

1.3 Allocation settings

1.4 Results settings

1.5 Logging settings

1.6 Experimental features

2 GeoPackage database and Arrow tables

2.1 Table requirements

2.2 Custom metadata

3 Node

3.1 Cyclic time series

4 Link

5 Results

5.1 Basin - basin.arrow

5.2 Flow - flow.arrow

5.3 State - basin_state.arrow

5.4 DiscreteControl - control.arrow

5.5 Allocation - allocation.arrow

5.6 Allocation flow - allocation_flow.arrow

5.7 Allocation feasibility analysis - allocation_analysis_feasibility.log

5.8 Allocation scaling analysis - allocation_analysis_scaling.log

5.9 Concentration - concentration.arrow

5.10 Subgrid level - subgrid_level.arrow

5.11 Solver statistics - solver_stats.arrow

References

5.1 Basin - `basin.arrow`

5.2 Flow - `flow.arrow`

5.3 State - `basin_state.arrow`

5.4 DiscreteControl - `control.arrow`

5.5 Allocation - `allocation.arrow`

5.6 Allocation flow - `allocation_flow.arrow`

5.7 Allocation feasibility analysis - `allocation_analysis_feasibility.log`

5.8 Allocation scaling analysis - `allocation_analysis_scaling.log`

5.9 Concentration - `concentration.arrow`

5.10 Subgrid level - `subgrid_level.arrow`

5.11 Solver statistics - `solver_stats.arrow`