Usage
1 Configuration file
Ribasim has a single configuration file, which is written in the TOML format. It contains settings, as well as paths to other input and output files. Ribasim expects the GeoPackage database database.gpkg as well as optional Arrow input files to be available in the input_dir.
# start- and endtime of the simulation
# can also be set to a date-time like 1979-05-27T07:32:00
starttime = 2019-01-01 # required
endtime = 2021-01-01 # required
# Coordinate Reference System
# The accepted strings are documented here:
# https://proj.org/en/9.4/development/reference/functions.html#c.proj_create
crs = "EPSG:4326" # required
# input files
input_dir = "." # required
results_dir = "results" # required
ribasim_version = "2024.11.0" # required
# Specific tables can also go into Arrow files rather than the database.
# For large tables this can benefit from better compressed file sizes.
# This is optional, tables are retrieved from the database if not specified in the TOML.
[basin]
time = "basin/time.arrow"
[allocation]
timestep = 86400 # optional (required if use_allocation = true), default 86400
use_allocation = false # optional, default false
[solver]
algorithm = "QNDF" # optional, default "QNDF"
saveat = 86400 # optional, default 86400, 0 saves every timestep, inf saves only at start- and endtime
dt = 60.0 # optional, remove for adaptive time stepping
dtmin = 0.0 # optional, default 0.0
dtmax = 0.0 # optional, default length of simulation
force_dtmin = false # optional, default false
abstol = 1e-7 # optional, default 1e-7
reltol = 1e-7 # optional, default 1e-7
water_balance_abstol = 1e-3 # optional, default 1e-3
water_balance_reltol = 1e-2 # optional, default 1e-2
maxiters = 1e9 # optional, default 1e9
sparse = true # optional, default true
autodiff = false # optional, default false
evaporate_mass = true # optional, default true to simulate a correct mass balance
[logging]
# defines the logging level of Ribasim
verbosity = "info" # optional, default "info", can otherwise be "debug", "warn" or "error"
[results]
# These results files are always written
compression = true # optional, default true, using zstd compression
compression_level = 6 # optional, default 6
[experimental]
# Experimental features, disabled by default
concentration = false # tracer calculations
1.1 Solver settings
The solver section in the configuration file is entirely optional, since we aim to use defaults that will generally work well. Common reasons to modify the solver settings are to adjust the calculation or result stepsizes: dt and saveat. If your model does not converge, or your performance is lower than expected, it can help to adjust other solver settings as well.
The default solver is algorithm = "QNDF", a multistep method similar to Matlab’s ode15s (Shampine and Reichelt 1997). It is an implicit method that supports the default adaptive timestepping. The full list of available solvers is: QNDF, FBDF, Rosenbrock23, Rodas4P, Rodas5P, TRBDF2, KenCarp4, Tsit5, RK4, ImplicitEuler, Euler. Information on the solver algorithms can be found on the ODE solvers page.
By default Ribasim uses adaptive timestepping, though not all algorithms support it. To use fixed timesteps, provide a timestep size in seconds; dt = 3600.0 corresponds to an hourly timestep. With adaptive timestepping, dtmin and dtmax control the minimum and maximum allowed dt. If a smaller dt than dtmin is needed to meet the set error tolerances, the simulation stops, unless force_dtmin is set to true. force_dtmin is off by default to ensure an accurate solution.
The default result stepsize, saveat = 86400, saves results once per simulated day. The calculation and result stepsize need not be the same. If you wish to save every calculation step, set saveat = 0. If you wish to not save any intermediate steps, set saveat = inf.
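To make the effect of saveat concrete, the following sketch lists the timestamps implied by a fixed result stepsize. The function save_times is a hypothetical helper, not a Ribasim API, and how the core treats a final partial interval is an assumption here (the end time is always included).

```python
import math
from datetime import datetime, timedelta


def save_times(starttime: datetime, endtime: datetime, saveat: float) -> list[datetime]:
    """Return the result timestamps implied by a saveat value in seconds.

    saveat = inf keeps only the start and end time. saveat = 0 means every
    solver timestep, which cannot be listed in advance, so it is rejected.
    """
    if saveat <= 0:
        raise ValueError("saveat = 0 saves every solver timestep; times are not known in advance")
    if math.isinf(saveat):
        return [starttime, endtime]
    duration = (endtime - starttime).total_seconds()
    n_intervals = int(duration // saveat)
    times = [starttime + timedelta(seconds=i * saveat) for i in range(n_intervals + 1)]
    if times[-1] != endtime:
        times.append(endtime)  # assumption: the end time is always saved
    return times


start = datetime(2019, 1, 1)
end = datetime(2019, 1, 4)
print(len(save_times(start, end, 86400)))     # daily output over three days
print(len(save_times(start, end, math.inf)))  # start and end only
```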
The water balance error is a measure of how consistently the core keeps track of the water resources per Basin; for more details see here. water_balance_abstol and water_balance_reltol give upper bounds on this error, above which an error is thrown. An error that is too large generally indicates a bug in the code or floating point truncation errors.
The Jacobian matrix provides information about the local sensitivity of the model with respect to changes in the states. For implicit solvers it must be calculated often, which can be expensive to do. There are several methods to do this. By default Ribasim uses a Jacobian derived automatically using ForwardDiff.jl with memory management provided by PreallocationTools.jl. If this is disabled by setting autodiff = false, the Jacobian is calculated with a finite difference method, which can be less accurate and more expensive.
By default the Jacobian matrix is a sparse matrix (sparse = true). Since each state typically only depends on a small number of other states, this is generally more efficient, especially for larger models. The sparsity structure is calculated from the network and provided as a Jacobian prototype to the solver. For small or highly connected models it could be faster to use a dense Jacobian matrix instead by setting sparse = false.
The total maximum number of iterations, maxiters = 1e9, can normally stay as-is unless doing extremely long simulations.
The absolute and relative tolerance for adaptive timestepping can be set with abstol and reltol. For more information on these and other solver options, see the DifferentialEquations.jl docs.
Finally there’s the evaporate_mass = true setting, which determines whether mass is lost due to evaporation in water quality calculations; it is set to true by default. While physically incorrect, it is useful for a first correctness check on a model in terms of mass balance (the Continuity tracer should always have a concentration of 1). To simulate increasing concentrations (e.g. salinity) due to evaporation, change the setting to false.
1.2 Allocation settings
Currently there are the following allocation settings:
- use_allocation: a boolean which says whether allocation should be used or not;
- timestep: a float value in seconds which dictates the update interval for allocations.
1.3 Results settings
The following entries can be set in the configuration in the [results] section.
entry | type | description |
---|---|---|
compression | Bool | Whether to apply compression or not. |
compression_level | Int | Zstandard compression level. Default is 6, higher compresses more. |
subgrid | Bool | Compute and output more detailed water levels. |
1.4 Logging settings
The following can be set in the configuration in the [logging] section.
entry | type | description |
---|---|---|
verbosity | String | Verbosity level: debug, info, warn, or error. |
1.5 Experimental features
Experimental features are completely unsupported. They can break at any time and results will be wrong. Do not use them in production. If you’re interested in using an experimental feature, please contact us.
One can enable experimental features in the [experimental] section. Currently the following features can be enabled (all are disabled by default).
entry | type | description |
---|---|---|
concentration | Bool | Whether to enable tracer calculations or not. |
2 GeoPackage database and Arrow tables
The input and output tables described below all share that they are tabular files. The Node and Edge tables always have to be in the GeoPackage database file, and results are always written to Apache Arrow files, sometimes also known as Feather files. All other tables can either be in the database or in separate Arrow files that are listed in the TOML as described above.
For visualization, the Node and Edge tables typically have associated geometries. GeoPackage was used since it provides a standardized way to store tables with (and without) geometry columns in a SQLite database. If, like Ribasim, you can ignore the geometries, a GeoPackage is easy to read using SQLite libraries, which are commonly available. Furthermore GeoPackage can be updated in place when working on a model.
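As a minimal sketch of reading a table with a plain SQLite library while ignoring geometry, the example below uses Python's built-in sqlite3 module. It assumes the table is named Node and has node_type and node_id columns, as described in the Node section; the helper read_node_table is hypothetical, and an in-memory database stands in for database.gpkg so that the sketch runs without a model on disk.

```python
import sqlite3


def read_node_table(connection: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (node_type, node_id) rows from the Node table, skipping the geometry."""
    cursor = connection.execute(
        "SELECT node_type, node_id FROM Node ORDER BY node_type, node_id"
    )
    return cursor.fetchall()


# Stand-in for sqlite3.connect("database.gpkg"): a minimal Node table in memory.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Node (node_id INTEGER, node_type TEXT, geom BLOB)")
connection.executemany(
    "INSERT INTO Node (node_id, node_type) VALUES (?, ?)",
    [(2, "LinearResistance"), (1, "Basin"), (3, "Basin")],
)

rows = read_node_table(connection)
print(rows)  # [('Basin', 1), ('Basin', 3), ('LinearResistance', 2)]
```

Because the geometry column is never selected, no GIS library is needed for this kind of access.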
Arrow was chosen since it is standardized, fast, simple and flexible. It can be read and written by many different software packages. In Ribasim we use Arrow.jl. Results are written to Arrow, since for long runs Ribasim can produce tables with many rows. Arrow is well suited for large tabular datasets, and file size is kept small by using compression. The Arrow input files can be compressed with LZ4 or Zstd compression. Furthermore, in some of the columns, a small number of distinct values are repeated many times. To reduce file sizes it may be a good idea to apply dictionary encoding to those columns. The Ribasim version that was used to create the results is written to each file in the ribasim_version schema metadata.
2.1 Table requirements
Below we give details per file, in which we describe the schema of the table using a syntax like this:
column | type | unit | restriction |
---|---|---|---|
node_id | Int32 | - | sorted |
storage | Float64 | \(m^3\) | non-negative |
This means that two columns are required, one named node_id that contains elements of type Int32, and one named storage that contains elements of type Float64. The order of the columns does not matter. In some cases there may be restrictions on the values. This is indicated under restriction.
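The example schema above can be read as a small set of checks. The following sketch applies them to plain Python lists; validate_storage_table is a hypothetical helper for illustration only, and the actual validation is performed by the Julia core.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def validate_storage_table(node_id: list[int], storage: list[float]) -> list[str]:
    """Check the example schema: node_id is Int32 and sorted, storage is non-negative."""
    errors = []
    if len(node_id) != len(storage):
        errors.append("columns have different lengths")
    if any(not INT32_MIN <= value <= INT32_MAX for value in node_id):
        errors.append("node_id does not fit in Int32")
    if any(a > b for a, b in zip(node_id, node_id[1:])):
        errors.append("node_id is not sorted")
    if any(value < 0 for value in storage):
        errors.append("storage contains negative values")
    return errors


print(validate_storage_table([1, 2, 5], [0.0, 10.5, 3.0]))  # []
print(validate_storage_table([2, 1], [1.0, -1.0]))
```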
Tables are also allowed to have rows for timestamps that are not part of the simulation; these will be ignored. That makes it easy to prepare data for a longer period, and test models on a shorter period.
When preparing the model for simulation, input validation is performed in the Julia core. The validation rules are described in the validation section.
2.2 Custom metadata
It may be advantageous to add metadata to rows. For example, basin areas might have names and objects such as weirs might have specific identification codes. Additional columns can be freely added to tables. The column names should be prefixed with meta_. They will not be used in computations or validated by the Julia core.
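A small sketch of how a tool might separate metadata columns from the columns that take part in computations, based only on the meta_ prefix. The helper split_meta_columns and the column names meta_name and meta_code are made up for this example.

```python
def split_meta_columns(columns: list[str]) -> tuple[list[str], list[str]]:
    """Split column names into (core, metadata) using the meta_ prefix."""
    core = [name for name in columns if not name.startswith("meta_")]
    meta = [name for name in columns if name.startswith("meta_")]
    return core, meta


core, meta = split_meta_columns(["node_id", "storage", "meta_name", "meta_code"])
print(core)  # ['node_id', 'storage']
print(meta)  # ['meta_name', 'meta_code']
```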
3 Node
Node is a table that specifies the ID and type of each node of a model. The ID must be unique among all nodes, and the type must be one of the available node types listed below.
Nodes are components that are connected together to form a larger system. The Basin is a central node type that stores water. The other node types influence the flow between Basins in some way. Counterintuitively, even systems you may think of as edges, such as a canal, are nodes in Ribasim. This is because edges only define direct instantaneous couplings between nodes, and never have storage of their own.
column | type | restriction |
---|---|---|
node_type | String | sorted, known node type |
node_id | Int32 | sorted per node_type |
geom | Point | (optional) |
name | String | (optional, does not have to be unique) |
subnetwork_id | Int32 | (optional) |
Adding a point geometry to the node table can be helpful to examine models in QGIS, as it will show the location of the nodes on the map. The geometry is not used by Ribasim.
4 Edge
Edges define connections between nodes. The only thing that defines an edge is the nodes it connects, and in what direction. There are currently 2 possible edge types:
- “flow”: Flows between nodes are stored on edges. The effect of the edge direction depends on the node type. Node types that have a notion of an upstream and downstream side use the incoming edge as the upstream side, and the outgoing edge as the downstream side. This means that edges should generally be drawn in the main flow direction. But for instance between two LinearResistances the edge direction does not affect anything, other than the sign of the flow on the edge. The sign of the flow follows the edge direction; a positive flow flows along the edge direction, a negative flow in the opposite way.
- “control”: The control edges define which nodes are controlled by a particular control node. Control edges should always point away from the control node. The edges between the control node and the nodes it listens to are not present in Edge; these are defined in DiscreteControl / condition.
column | type | restriction |
---|---|---|
from_node_type | String | - |
from_node_id | Int32 | - |
to_node_type | String | - |
to_node_id | Int32 | - |
edge_type | String | must be “flow” or “control” |
geom | LineString or MultiLineString | (optional) |
name | String | (optional, does not have to be unique) |
subnetwork_id | Int32 | (optional, denotes source in allocation network) |
Similarly to the node table, you can use a geometry to visualize the connections between the nodes in QGIS. For instance, you can draw a line connecting the two node coordinates.
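The role of edge direction and edge type can be sketched as follows. The Edge tuple and the flow_neighbors helper are hypothetical and only mirror the columns of the table above; they show how the upstream and downstream side of a node follow from its incoming and outgoing flow edges, while control edges are left out.

```python
from typing import NamedTuple

EDGE_TYPES = {"flow", "control"}


class Edge(NamedTuple):
    from_node_id: int
    to_node_id: int
    edge_type: str


def flow_neighbors(edges: list[Edge], node_id: int) -> tuple[list[int], list[int]]:
    """Return (upstream, downstream) node IDs of a node, using flow edges only."""
    for edge in edges:
        if edge.edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge_type: {edge.edge_type!r}")
    upstream = [e.from_node_id for e in edges if e.edge_type == "flow" and e.to_node_id == node_id]
    downstream = [e.to_node_id for e in edges if e.edge_type == "flow" and e.from_node_id == node_id]
    return upstream, downstream


edges = [
    Edge(1, 2, "flow"),     # incoming flow edge of node 2
    Edge(2, 3, "flow"),     # outgoing flow edge of node 2
    Edge(4, 2, "control"),  # control edge pointing away from control node 4
]
print(flow_neighbors(edges, 2))  # ([1], [3])
```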
5 Results
5.1 Basin - basin.arrow
The Basin table contains:
- Results of the storage and level of each Basin, which are instantaneous values;
- Results of the fluxes on each Basin, which are mean values over the saveat intervals. In the time column the start of the period is indicated.
- The initial condition is written to the file, but the final state is not. It will be placed in a separate output state file in the future.
- The inflow_rate and outflow_rate are the sum of the flows from other nodes into and out of the Basin respectively. The actual flows determine in which term they are counted, not the edge direction.
- The storage_rate is the net mean flow that is needed to achieve the storage change between timesteps.
- The total_inflow consists of the sum of all modelled flows into the basin: inflow_rate (horizontal flows into the basin, independent of edge direction) + precipitation + drainage.
- The total_outflow consists of the sum of all modelled flows out of the basin: outflow_rate (horizontal flows out of the basin, independent of edge direction) + evaporation + infiltration.
- The balance_error is the difference between the storage_rate on one side and the total_inflow and total_outflow on the other side: storage_rate - (total_inflow - total_outflow). It can be used to check if the numerical error when solving the water balance is sufficiently small.
- The relative_error is the fraction of the balance_error over the mean of the total_inflow and total_outflow.
For a more in-depth explanation of the water balance error see here.
column | type | unit |
---|---|---|
time | DateTime | - |
node_id | Int32 | - |
storage | Float64 | \(\text{m}^3\) |
level | Float64 | \(\text{m}\) |
inflow_rate | Float64 | \(\text{m}^3/\text{s}\) |
outflow_rate | Float64 | \(\text{m}^3/\text{s}\) |
storage_rate | Float64 | \(\text{m}^3/\text{s}\) |
precipitation | Float64 | \(\text{m}^3/\text{s}\) |
evaporation | Float64 | \(\text{m}^3/\text{s}\) |
drainage | Float64 | \(\text{m}^3/\text{s}\) |
infiltration | Float64 | \(\text{m}^3/\text{s}\) |
balance_error | Float64 | \(\text{m}^3/\text{s}\) |
relative_error | Float64 | - |
The table is sorted by time, and per time it is sorted by node_id.
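The two error columns can be recomputed from the other columns of a single row as a check. The sketch below assumes that the total inflow is the horizontal inflow_rate plus precipitation and drainage, and that the total outflow is the horizontal outflow_rate plus evaporation and infiltration; the function water_balance is a hypothetical helper, and returning 0 for the relative error when there is no flow is a choice made here, not documented Ribasim behaviour.

```python
def water_balance(
    storage_rate: float,
    inflow_rate: float,
    outflow_rate: float,
    precipitation: float,
    drainage: float,
    evaporation: float,
    infiltration: float,
) -> tuple[float, float]:
    """Return (balance_error, relative_error) for one Basin row, all rates in m3/s."""
    total_inflow = inflow_rate + precipitation + drainage
    total_outflow = outflow_rate + evaporation + infiltration
    balance_error = storage_rate - (total_inflow - total_outflow)
    mean_flow = (total_inflow + total_outflow) / 2
    # Assumption: report a relative error of 0 when there is no flow at all.
    relative_error = balance_error / mean_flow if mean_flow != 0 else 0.0
    return balance_error, relative_error


balance_error, relative_error = water_balance(
    storage_rate=0.75,
    inflow_rate=1.0,
    outflow_rate=0.5,
    precipitation=0.25,
    drainage=0.25,
    evaporation=0.25,
    infiltration=0.125,
)
print(balance_error, relative_error)
```

In this made-up row the net modelled flow is 0.625 while the storage changes at 0.75, so the balance error is 0.125.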
5.2 Flow - flow.arrow
The flow table contains calculated mean flows over the saveat intervals for every flow edge in the model. In the time column the start of the period is indicated.
column | type | unit |
---|---|---|
time | DateTime | - |
edge_id | Int32 | - |
from_node_type | String | - |
from_node_id | Int32 | - |
to_node_type | String | - |
to_node_id | Int32 | - |
flow_rate | Float64 | \(\text{m}^3/\text{s}\) |
The table is sorted by time, and per time the same edge_id order is used, though not sorted. The edge_id value is the same as the fid written to the Edge table, and can be used to directly look up the Edge geometry. Flows from the “from” to the “to” node have a positive sign, and if the flow is reversed it will be negative.
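The sign convention can be turned into an explicit source and destination per row. This is a minimal sketch; actual_direction is a hypothetical helper that works on the from_node_id, to_node_id and flow_rate values of one row.

```python
def actual_direction(from_node_id: int, to_node_id: int, flow_rate: float) -> tuple[int, int, float]:
    """Return (source, destination, magnitude) for one flow result row.

    A positive flow_rate runs along the edge direction, a negative one against it.
    """
    if flow_rate >= 0:
        return from_node_id, to_node_id, flow_rate
    return to_node_id, from_node_id, -flow_rate


print(actual_direction(1, 2, 0.5))   # (1, 2, 0.5)
print(actual_direction(1, 2, -0.5))  # (2, 1, 0.5)
```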
5.3 State - basin_state.arrow
The Basin state table contains the water levels in each Basin at the end of the simulation.
column | type | unit |
---|---|---|
node_id | Int32 | - |
level | Float64 | \(\text{m}\) |
To use this result as the initial condition of another simulation, see the Basin / state table reference.
5.4 DiscreteControl - control.arrow
The control table contains a record of each change of control state: when it happened, which control node was involved, to which control state it changed and based on which truth state.
column | type |
---|---|
time | DateTime |
control_node_id | Int32 |
truth_state | String |
control_state | String |
5.5 Allocation - allocation.arrow
The allocation table contains a record of allocation results: when it happened, for which node, in which allocation network, and what the demand, allocated flow and realized flow were. The realized values at the starting time of the simulation can be ignored.
column | type |
---|---|
time | DateTime |
subnetwork_id | Int32 |
node_type | String |
node_id | Int32 |
priority | Int32 |
demand | Float64 |
allocated | Float64 |
realized | Float64 |
The LevelDemand node allocations are listed as node type Basin. This is because one LevelDemand node can link to multiple Basins, and doesn’t receive flow by itself.
For Basins the values demand, allocated and realized are positive if the Basin level is below the minimum level given by a LevelDemand node. The values are negative if the Basin supplies due to a surplus of water.
Currently the stored demand and abstraction rate are those at the allocation timepoint (and the abstraction rate is based on the previous allocation optimization). In the future these will be an average over the previous allocation timestep.
5.6 Allocation flow - allocation_flow.arrow
The allocation flow table contains results of the optimized allocation flow on every edge in the model that is part of a subnetwork, for each time an optimization problem is solved (see also here). If in the model a main network and subnetwork(s) are specified, there are 2 different types of optimization for the subnetwork: collecting its total demand per priority (for allocating flow from the main network to the subnetwork), and allocating flow within the subnetwork. The column collect_demands provides the distinction between these two optimization types.
column | type |
---|---|
time | DateTime |
edge_id | Int32 |
from_node_type | String |
from_node_id | Int32 |
to_node_type | String |
to_node_id | Int32 |
subnetwork_id | Int32 |
priority | Int32 |
flow_rate | Float64 |
collect_demands | Bool |
5.7 Subgrid level - subgrid_level.arrow
This result file is only written if the model contains a Basin / subgrid table. See there for more information on the meaning of this output.
column | type |
---|---|
time | DateTime |
subgrid_id | Int32 |
subgrid_level | Float64 |
5.8 Solver statistics - solver_stats.arrow
This result file contains statistics about the solver, which can give an insight into how well the solver is performing over time. The data is saved at the saveat interval (see configuration file). water_balance refers to the right-hand-side function of the system of differential equations solved by the Ribasim core.
column | type |
---|---|
time | DateTime |
water_balance_calls | Int |
linear_solves | Int |
accepted_timesteps | Int |
rejected_timesteps | Int |