Ribasim has a single configuration file, which is written in the TOML format. It contains settings, as well as paths to other input and output files. Ribasim expects the GeoPackage database database.gpkg as well as optional NetCDF input files to be available in the input_dir.
# start- and endtime of the simulation
# can also be set to a date-time like 1979-05-27T07:32:00
starttime = 2019-01-01 # required
endtime = 2021-01-01 # required

# Coordinate Reference System
# The accepted strings are documented here:
# https://proj.org/en/9.4/development/reference/functions.html#c.proj_create
crs = "EPSG:4326" # required

# input and results directories relative to the TOML file
input_dir = "input" # required
results_dir = "results" # required

ribasim_version = "2025.6.0" # required

# Specific tables can also go into NetCDF files rather than the database.
# For large tables this can benefit from better compressed file sizes.
# This is optional, tables are retrieved from the database if not specified in the TOML.
[basin]
time = "basin/time.nc"

[interpolation]
flow_boundary = "block" # optional, default "block", can otherwise be "linear"
block_transition_period = 0 # optional, default 0

[allocation]
timestep = 86400 # optional (required if experimental.allocation = true), default 86400

[allocation.route_priority]
level_boundary = 1000 # optional, default 1000
basin = 0 # optional, default 0
manning_resistance = 10 # optional, default 10
linear_resistance = 20 # optional, default 20
tabulated_rating_curve = -10 # optional, default -10
outlet = 40 # optional, default 40
pump = 50 # optional, default 50

[solver]
algorithm = "QNDF" # optional, default "QNDF"
saveat = 86400 # optional, default 86400, 0 saves every timestep, inf saves only at start- and endtime
dt = 60.0 # optional, remove for adaptive time stepping
dtmin = 0.0 # optional, default 0.0
dtmax = 0.0 # optional, default length of simulation
force_dtmin = false # optional, default false
abstol = 1e-5 # optional, default 1e-5
reltol = 1e-5 # optional, default 1e-5
water_balance_abstol = 1e-3 # optional, default 1e-3
water_balance_reltol = 1e-2 # optional, default 1e-2
maxiters = 1e9 # optional, default 1e9
sparse = true # optional, default true
autodiff = true # optional, default true
evaporate_mass = true # optional, default true to simulate a correct mass balance
depth_threshold = 0.1 # optional, default 0.1
level_difference_threshold = 0.02 # optional, default 0.02
specialize = false # optional, default false

[logging]
# defines the logging level of Ribasim
verbosity = "info" # optional, default "info", can also be "debug", "warn" or "error"

[results]
compression = true # optional, default true, using deflate compression
compression_level = 1 # optional, default 1 (0-9)
subgrid = false # optional, default false

[experimental]
# Experimental features, disabled by default
concentration = false # tracer calculations
allocation = false # allocation layer, replaced by 'first come first serve' when inactive
1.1 Solver settings
The solver section in the configuration file is entirely optional, since we aim to use defaults that will generally work well. Common reasons to modify the solver settings are to adjust the calculation or result stepsizes: dt and saveat. If your model does not converge, or your performance is lower than expected, it can help to adjust other solver settings as well.
The default solver is algorithm = "QNDF", a multistep method similar to Matlab’s ode15s (Shampine and Reichelt 1997). It is an implicit method that supports the default adaptive timestepping. The full list of available solvers is: QNDF, FBDF, Rosenbrock23, Rodas4P, Rodas5P, TRBDF2, KenCarp4, Tsit5, RK4, ImplicitEuler, Euler. Information on the solver algorithms can be found on the ODE solvers page.
By default Ribasim uses adaptive timestepping, though not all algorithms support adaptive timestepping. To use fixed timesteps, provide a timestep size in seconds; dt = 3600.0 corresponds to an hourly timestep. With adaptive timestepping, dtmin and dtmax control the minimum and maximum allowed dt. If a smaller dt than dtmin is needed to meet the set error tolerances, the simulation stops, unless force_dtmin is set to true. force_dtmin is off by default to ensure an accurate solution.
The saveat setting controls the output frequency. It is the number of seconds between timestamps in the results. The default result stepsize, saveat = 86400, saves results daily. The calculation and result stepsizes need not be the same. If you wish to save every calculation step, set saveat = 0. With the default adaptive timestepping that will result in irregular time series. If you wish to not save any intermediate steps, set saveat = inf. For output frequencies that are not of fixed length, like months or years, we suggest resampling from daily results during post-processing.
The water balance error measures how consistently the core keeps track of the water resources per Basin; for more details see here. water_balance_abstol and water_balance_reltol give upper bounds on this error, above which an error is thrown. An error that is too large generally indicates a bug in the code or floating point truncation errors.
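The resampling step suggested above could look like the following minimal sketch. It uses only the Python standard library and a hypothetical in-memory daily series; reading the values from a result file is left out, and the function name monthly_mean is an illustration, not part of Ribasim.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_mean(times, values):
    """Average values per calendar month; returns {(year, month): mean}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, v in zip(times, values):
        key = (t.year, t.month)
        sums[key] += v
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical daily series: 31 days of January at 1.0, 28 days of February at 3.0
start = datetime(2019, 1, 1)
times = [start + timedelta(days=i) for i in range(59)]
values = [1.0] * 31 + [3.0] * 28

monthly = monthly_mean(times, values)
```

Because months differ in length, grouping by calendar key rather than by a fixed number of seconds keeps each mean aligned with the calendar.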
The Jacobian matrix provides information about the local sensitivity of the model with respect to changes in the states. For implicit solvers it must be calculated often, which can be expensive to do. There are several methods to do this. By default Ribasim uses a Jacobian derived automatically using ForwardDiff.jl with memory management provided by PreallocationTools.jl. If this is not used by setting autodiff = false, the Jacobian is calculated with a finite difference method, which can be less accurate and more expensive.
By default the Jacobian matrix is a sparse matrix (sparse = true). Since each state typically only depends on a small number of other states, this is generally more efficient, especially for larger models. The sparsity structure is calculated from the network and provided as a Jacobian prototype to the solver. For small or highly connected models it could be faster to use a dense Jacobian matrix instead by setting sparse = false.
The total maximum number of iterations, maxiters = 1e9, can normally stay as-is unless you are running extremely long simulations.
The absolute and relative tolerance for adaptive timestepping can be set with abstol and reltol. For more information on these and other solver options, see the DifferentialEquations.jl docs and the DifferentialEquations.jl FAQ.
The evaporate_mass setting determines whether mass is lost due to evaporation in water quality calculations; it defaults to true. While physically incorrect, it is useful for a first correctness check on a model in terms of mass balance (the Continuity tracer should always have a concentration of 1). To simulate increasing concentrations (e.g. salinity) due to evaporation, change the setting to false.
By default specialize = false to reduce the time it takes to initialize fully. It can be enabled for long-running simulations, trading initialization speed for simulation speed. Concretely, it sets the specialization level to NoSpecialize for false, and FullSpecialize for true. Setting it to false additionally fixes the autodiff chunk size to 1, making a similar tradeoff as the specialization level.
There are two threshold parameters that control when reduction factors start smoothly reducing flow. Note that these are global settings, and cannot be set for individual nodes. depth_threshold = 0.1 is the water depth (level - profile bottom) in meters at which the low storage factor kicks in. This will limit any extraction from nearly empty Basins, and avoids drying them out completely. level_difference_threshold = 0.02 is the level difference below which several flows are reduced. Examples are approaching min_upstream_level from above and max_downstream_level from below, level difference across TabulatedRatingCurve and Outlet nodes, and UserDemand approaching the min_level from above. For details see the equations on the node reference pages, and the section on reduction factors.
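To illustrate what "smoothly reducing flow" means, here is a minimal sketch of a reduction factor that rises from 0 to 1 as the depth approaches the threshold. The cubic smoothstep shape is an assumption chosen for illustration; the exact expression used by the core is given in the section on reduction factors.

```python
def reduction_factor(x: float, threshold: float) -> float:
    """Rise smoothly from 0 at x = 0 to 1 at x = threshold.

    Assumed shape: cubic smoothstep, for illustration only.
    """
    if x <= 0.0:
        return 0.0
    if x >= threshold:
        return 1.0
    s = x / threshold
    return s * s * (3.0 - 2.0 * s)


depth_threshold = 0.1  # default from the TOML example
factors = {
    depth: reduction_factor(depth, depth_threshold)
    for depth in (0.0, 0.05, 0.1, 0.5)
}
```

A flow multiplied by such a factor is unaffected above the threshold and goes to zero continuously as the Basin empties, which is easier for the solver than an abrupt cut-off.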
1.2 Interpolation settings
There are the following interpolation settings:
flow_boundary: The interpolation type of flow boundary timeseries. This is block by default, but can also be set to linear.
block_transition_period: When an interpolation type is set to block, this parameter determines an interval in time on either side of each data point which is used to smooth the transition between data points. See also the documentation for this interpolation type.
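The difference between the two interpolation types can be sketched as follows. This is a simplified illustration: it assumes that block holds the value of the last data point at or before the requested time, that values are held constant outside the data range, and it ignores block_transition_period entirely. The function name interpolate is hypothetical.

```python
from bisect import bisect_right


def interpolate(times, values, t, method="block"):
    """Evaluate a time series at t; times must be sorted ascending."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t) - 1  # last data point at or before t
    if method == "block":
        return values[i]
    if method == "linear":
        w = (t - times[i]) / (times[i + 1] - times[i])
        return values[i] + w * (values[i + 1] - values[i])
    raise ValueError(f"unknown method: {method}")


# Hypothetical flow boundary series, times in seconds
times = [0.0, 86400.0, 172800.0]
flow = [1.0, 3.0, 2.0]

block_value = interpolate(times, flow, 43200.0, "block")
linear_value = interpolate(times, flow, 43200.0, "linear")
```

Halfway through the first day, block still returns the first value while linear returns the midpoint between the first two values.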
1.3 Allocation settings
There are the following allocation settings:
timestep: A float value in seconds which dictates the update interval for allocations;
route_priority: An integer per source type for the allocation algorithm. The prioritisable sources are: basin, level_boundary, linear_resistance, manning_resistance, tabulated_rating_curve, outlet and pump.
If you wish to set the route priority for specific nodes rather than a fallback per node type, these can be set in the Node table.
By default, all nodes of the same type have the same route priority. To obtain a strict source ordering, the sources are sorted by node ID for each route priority within a subnetwork.
When no route priorities are specified in the TOML, the default values are applied (see the TOML example above).
The way the route priorities work is as follows: The priority that is assigned to a node is interpreted as the cost it takes for water to flow through that node. The flow through each node is multiplied with the priority, and we find the water distribution solution that has the lowest cost of fulfilling the desired demands.
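As a worked illustration of this cost interpretation, the sketch below compares two hypothetical routes that could supply the same demand, using the default priorities from the TOML example. The real allocation algorithm solves an optimization problem over the whole subnetwork; this only shows how flow times priority adds up along a route. The route names and the demand value are made up.

```python
# Default route priorities, copied from the TOML example
ROUTE_PRIORITY = {
    "level_boundary": 1000,
    "basin": 0,
    "manning_resistance": 10,
    "linear_resistance": 20,
    "tabulated_rating_curve": -10,
    "outlet": 40,
    "pump": 50,
}


def route_cost(node_types, flow):
    """Cost of sending `flow` through every node on a route."""
    return sum(ROUTE_PRIORITY[node_type] * flow for node_type in node_types)


# Two hypothetical routes from a source Basin to the same demand
routes = {
    "via_pump": ["basin", "pump"],
    "via_outlet": ["basin", "outlet"],
}
demand = 2.0

costs = {name: route_cost(nodes, demand) for name, nodes in routes.items()}
cheapest = min(costs, key=costs.get)
```

With the defaults, the route through the Outlet costs 80 and the route through the Pump costs 100, so the Outlet route is preferred.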
1.4 Results settings
The following entries can be set in the configuration in the [results] section.
| entry | type | description |
|---|---|---|
| compression | Bool | Whether to apply deflate compression or not. |
| compression_level | Int | Deflate compression level (0-9). Default is 1, higher compresses more. |
| subgrid | Bool | Compute and output more detailed water levels. |
Results are written in NetCDF format with CF conventions.
1.5 Logging settings
The following can be set in the configuration in the [logging] section.
| entry | type | description |
|---|---|---|
| verbosity | String | Verbosity level: debug, info, warn, or error. |
If verbosity is set to debug, the used Basin / profile dimensions (level, area and storage) are written to a CSV file in the results folder. This can be useful if you only provide 2 of the 3 columns and want to inspect the dimensions used in the computation.
The format of the CSV is: column 1 = node id, column 2 = level, column 3 = area and column 4 = storage.
Let’s say you have 2 Basins, at node 1 and node 2. Dimensions of node 1: level = [0, 1, 2], area = [2, 2, 4] and storage = [0, 2, 6]. Dimensions of node 2: level = [0, 1, 2], area = [4, 4, 8] and storage = [0, 4, 12].
Then the CSV will look like:
| node_id | level | area | storage |
|---|---|---|---|
| 1 | 0 | 2 | 0 |
| 1 | 1 | 2 | 2 |
| 1 | 2 | 4 | 6 |
| 2 | 0 | 4 | 0 |
| 2 | 1 | 4 | 4 |
| 2 | 2 | 8 | 12 |
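To inspect such a file per Basin, the rows can be grouped by node id. The sketch below assumes a comma-separated file with a header row named as in the column description above; the actual delimiter and file name are not specified here, so the data is embedded as a string.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the CSV contents (assumed comma-separated with a header row)
csv_text = """node_id,level,area,storage
1,0,2,0
1,1,2,2
1,2,4,6
2,0,4,0
2,1,4,4
2,2,8,12
"""

# Group the profile columns per node id
profiles = defaultdict(lambda: {"level": [], "area": [], "storage": []})
for row in csv.DictReader(io.StringIO(csv_text)):
    profile = profiles[int(row["node_id"])]
    for column in ("level", "area", "storage"):
        profile[column].append(float(row[column]))
```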
1.6 Experimental features
Important
Experimental features are completely unsupported. They can break at any time and results will be wrong. Do not use them in production. If you’re interested in using an experimental feature, please contact us.
One can enable experimental features in the [experimental] section. Currently the following features can be enabled (all are disabled by default).
| entry | type | description |
|---|---|---|
| concentration | Bool | Whether to enable tracer calculations or not. |
| allocation | Bool | Whether to activate the allocation layer. Replaced by ‘first come first serve’ when deactivated. |
2 GeoPackage database and NetCDF tables
The input and output tables described below all share that they are tabular files. The Node and Link tables always have to be in the GeoPackage database file, and results are always written to NetCDF files. All other tables can either be in the database or in separate NetCDF files that are listed in the TOML as described above.
For visualization, the Node and Link tables typically have associated geometries. GeoPackage was used since it provides a standardized way to store tables with (and without) geometry columns in a SQLite database. If, like Ribasim, you can ignore the geometries, a GeoPackage is easy to read using SQLite libraries, which are commonly available. Furthermore GeoPackage can be updated in place when working on a model.
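As a sketch of reading a table with a plain SQLite library, the example below uses Python's built-in sqlite3 module. To stay self-contained it builds an in-memory database with a simplified Node table instead of opening a real model; for an actual model you would connect to the database.gpkg file in the input directory, and the table would have more columns than shown here.

```python
import sqlite3

# In-memory stand-in for database.gpkg with a simplified Node table
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Node (node_id INTEGER, node_type TEXT, geom BLOB)")
con.executemany(
    "INSERT INTO Node VALUES (?, ?, ?)",
    [(1, "Basin", None), (2, "Pump", None), (3, "Basin", None)],
)

# Select the Basin node IDs; the geometry column is simply not queried
query = "SELECT node_id FROM Node WHERE node_type = ? ORDER BY node_id"
basin_ids = [row[0] for row in con.execute(query, ("Basin",))]
con.close()
```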
NetCDF is the format used for results and external input tables. NetCDF files follow the CF conventions and are particularly useful for large time series data. NetCDF is well suited for large tabular datasets with multi-dimensional arrays, and file size is kept small by using compression. For long runs, Ribasim can produce tables with many rows, making NetCDF an excellent choice for storing results. However, note that NetCDF input tables are currently not automatically loaded in QGIS.
2.1 Using NetCDF for input tables
By default, all tables are stored in the GeoPackage database. However, you can configure individual tables to be stored in and read from external files. This is particularly useful when you want to switch quickly between different time series on the same model.
2.1.1 Writing tables to NetCDF with Python
When building a model with the Ribasim Python API, you can specify that a table should be written to an external file by setting its filepath attribute:
from pathlib import Path

import ribasim
from ribasim.nodes import basin

# Create a model
model = ribasim.Model(
    starttime="2020-01-01",
    endtime="2021-01-01",
    crs="EPSG:28992",
)

# Add a basin with profile data
model.basin.add(
    ribasim.Node(1, ribasim.geometry.Point(0, 0)),
    [basin.Profile(level=[0.0, 1.0], area=[100.0, 1000.0])],
)

# Specify that the profile table should be written to NetCDF format (recommended)
model.basin.profile.filepath = Path("profile.nc")

# Specify the input directory (where external files will be saved)
model.input_dir = Path("input")

# Write the model - this creates the TOML, database, and NetCDF file
model.write("my_model/ribasim.toml")

After running this code, you’ll have:
- my_model/ribasim.toml - configuration file referencing the external file
- my_model/input/database.gpkg - GeoPackage with Node and Link tables
- my_model/input/profile.nc - NetCDF file with Basin profile data
The paths are relative to the input_dir specified in the model configuration. When the model is written, these tables will be saved to the specified files rather than the GeoPackage database, and the TOML configuration file will reference these external files.
The resulting TOML will contain:
[basin]
profile = "profile.nc"
For time series data, NetCDF is particularly well-suited:
# Add time-varying data and write to NetCDF
model.basin.time.filepath = Path("basin-time.nc")
To revert a table back to being stored in the GeoPackage database instead of an external file, set the filepath to None:
# Store profile data back in the GeoPackage
model.basin.profile.filepath = None
2.1.2 Using result files as initial conditions
One common use case is to use the final state of a simulation as the initial condition for a new simulation. See the Basin state documentation for details on how to copy result files to the input directory and reference them in your TOML.
2.2 Table requirements
Below we give details per file, in which we describe the schema of the table using a syntax like this:
| column | type | unit | restriction |
|---|---|---|---|
| node_id | Int32 | - | sorted |
| storage | Float64 | \(\text{m}^3\) | non-negative |

This means that two columns are required: one named node_id that contains elements of type Int32, and one named storage that contains elements of type Float64. The order of the columns does not matter. In some cases there may be restrictions on the values. This is indicated under restriction.
Tables are also allowed to have rows for timestamps that are not part of the simulation; these will be ignored. That makes it easy to prepare data for a larger period, and test models on a shorter period.
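The restrictions in the example schema can be checked before running a model with a few lines of Python. This is only a sketch of the two restrictions shown above (sorted and non-negative); it is not the validation performed by the Julia core, and the function name check_table is hypothetical.

```python
def check_table(node_id, storage):
    """Return a list of violated restrictions for the example schema."""
    errors = []
    # "sorted": each node_id must be >= the one before it
    if any(a > b for a, b in zip(node_id, node_id[1:])):
        errors.append("node_id is not sorted")
    # "non-negative": no storage value may be below zero
    if any(s < 0 for s in storage):
        errors.append("storage contains negative values")
    return errors


ok = check_table([1, 1, 2], [0.0, 2.0, 6.0])
bad = check_table([2, 1], [1.0, -1.0])
```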
When preparing the model for simulation, input validation is performed in the Julia core. The validation rules are described in the validation section.
2.3 Custom metadata
It may be advantageous to add metadata to rows. For example, basin areas might have names and objects such as weirs might have specific identification codes. Additional columns can be freely added to tables. The column names should be prefixed with meta_. They will not be used in computations or validated by the Julia core.
3 Node
Node is a table that specifies the ID and type of each node of a model. The ID must be unique among all nodes, and the type must be one of the available node types listed below.
Nodes are components that are connected together to form a larger system. The Basin is a central node type that stores water. The other node types influence the flow between Basins in some way. Counterintuitively, even systems you may think of as links, such as a canal, are nodes in Ribasim. This is because links only define direct instantaneous couplings between nodes, and never have storage of their own.
| column | type | restriction |
|---|---|---|
| node_type | String | sorted, known node type |
| node_id | Int32 | sorted per node_type |
| geom | Point | (optional) |
| name | String | (optional, does not have to be unique) |
| subnetwork_id | Int32 | (optional) |
| route_priority | Int32 | (optional, does not have to be unique) |
| cyclic_time | Bool | (optional, defaults to false) |
If not set, the route_priority is fixed per node type as explained in the allocation settings.
Adding a point geometry to the node table can be helpful to examine models in QGIS, as it will show the location of the nodes on the map. The geometry is not used by Ribasim.
3.1 Cyclic time series
When cyclic_time is set to true for a node in the Node table, every time series associated with that node in the corresponding table(s) will be interpreted as cyclic. That is: the time series is exactly repeated left and right of the original time interval to cover the whole simulation period. For this it is validated that the first and last data values in the timeseries are the same. For instance, quarterly precipitation requires giving values for every quarter at the start of the quarter, and then the value for the first quarter again at the start of the next year.
Note that periods like months or years are not of constant length in the calendar, so over long simulation periods the timeseries can get out of sync with these periods on the calendar.
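The repetition of a cyclic time series can be sketched by mapping any simulation time back into the original interval with a modulo operation. The example below is an illustration under simplifying assumptions: times are plain numbers in days since the start of the year (hypothetical values), and the function names are not part of Ribasim.

```python
def wrap_time(t, t_start, t_end):
    """Map t into [t_start, t_end) by repeating the interval in both directions."""
    period = t_end - t_start
    return t_start + (t - t_start) % period


def check_cyclic(values):
    """A cyclic series must start and end with the same value."""
    if values[0] != values[-1]:
        raise ValueError("first and last values must be equal for a cyclic time series")


# Hypothetical quarterly precipitation; the first value is repeated at the end
times = [0.0, 90.0, 181.0, 273.0, 365.0]
precipitation = [1.0, 2.0, 4.0, 3.0, 1.0]
check_cyclic(precipitation)

wrapped_after = wrap_time(400.0, times[0], times[-1])   # 35 days into the cycle
wrapped_before = wrap_time(-10.0, times[0], times[-1])  # 10 days before the cycle ends
```

Since the period here is a fixed 365 days, a multi-year simulation would drift relative to the calendar in leap years, which is the effect described in the note above.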
3.2 Subnetwork ID
Nodes can only be controlled by allocation if they have a subnetwork_id. In the Node table, set subnetwork_id to an integer value, for example 1 or 2.
When Pumps and Outlets are part of a subnetwork, they can be controlled by allocation. To accomplish this, besides a subnetwork_id, they must have allocation_controlled set to true in the static table for Pump or Outlet.
4 Link
Links define connections between nodes. The only thing that defines a link is the nodes it connects, and in what direction. There are currently 4 possible link types:
“flow”: Flows between nodes are stored on links. The effect of the link direction depends on the node type. Node types that have a notion of an upstream and downstream side use the incoming link as the upstream side, and the outgoing link as the downstream side. This means that links should generally be drawn in the main flow direction. But for instance between two LinearResistances the link direction does not affect anything, other than the sign of the flow on the link. The sign of the flow follows the link direction; a positive flow flows along the link direction, a negative flow in the opposite way.
“control”: The control links define which nodes are influenced by a particular control or demand node. No water flows over these links, only information. Control links should always point away from the control or demand node. The LevelDemand and FlowDemand nodes use control links to indicate which nodes they will assign demands to.
“listen”: The listen links define which nodes are listened to by control nodes. They point from the listened node to the control node. The control node tables define listening behavior (listen_node_id, variable, weight, look-ahead), and the Python API automatically adds missing listen links when writing a model.
“observation”: Observation links point from an Observation node to the node being observed. These links are not used in the simulation core, but allow attaching time series data for reference, validation, or visualization.
| column | type | restriction |
|---|---|---|
| from_node_id | Int32 | - |
| to_node_id | Int32 | - |
| link_type | String | must be “flow”, “control”, “listen”, or “observation” |
| geom | LineString or MultiLineString | (optional) |
| name | String | (optional, does not have to be unique) |
Similarly to the node table, you can use a geometry to visualize the connections between the nodes in QGIS. For instance, you can draw a line connecting the two node coordinates.
5 Results
Results are written in NetCDF format by default, which stores data as multidimensional arrays rather than flat tables. This section describes the structure of each NetCDF result file using CDL-like notation.
5.1 Basin - basin.nc
The Basin results contain:
Results of the storage and level of each Basin, which are instantaneous values;
Results of the fluxes on each Basin, which are mean values over the saveat intervals. In the time coordinate the start of the period is indicated.
The initial condition is written to the file, but the final state is not. It will be placed in a separate output state file in the future.
The inflow_rate and outflow_rate are the sum of the flows from other nodes into and out of the Basin respectively. The actual flows determine in which term they are counted, not the link direction.
The storage_rate is the net mean flow that is needed to achieve the storage change between timesteps.
The total_inflow consists of the sum of all modelled flows into the Basin: inflow_rate (horizontal flows into the Basin, independent of link direction) + precipitation + drainage.
The total_outflow consists of the sum of all modelled flows out of the Basin: outflow_rate (horizontal flows out of the Basin, independent of link direction) + evaporation + infiltration.
The balance_error is the difference between the storage_rate on one side and the total_inflow and total_outflow on the other side: storage_rate - (total_inflow - total_outflow). It can be used to check if the numerical error when solving the water balance is sufficiently small.
The relative_error is the fraction of the balance_error over the mean of the total_inflow and total_outflow.
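The definitions above can be worked through with a small numeric example. The sketch takes the total inflow as inflow_rate + precipitation + drainage and the total outflow as outflow_rate + evaporation + infiltration, as listed above; all input values are made up, and the comparison against the default tolerances is done per tolerance because how the core combines them is not described here.

```python
# Hypothetical mean rates over one saveat interval (m3/s)
inflow_rate, precipitation, drainage = 2.0, 0.5, 0.25
outflow_rate, evaporation, infiltration = 1.5, 0.25, 0.5
storage_rate = 0.5001

total_inflow = inflow_rate + precipitation + drainage
total_outflow = outflow_rate + evaporation + infiltration

# Difference between the observed storage change and the net flow
balance_error = storage_rate - (total_inflow - total_outflow)

# Error relative to the mean of total inflow and total outflow
relative_error = balance_error / ((total_inflow + total_outflow) / 2)

# Compare against the default tolerances from the solver settings
within_abstol = abs(balance_error) <= 1e-3
within_reltol = abs(relative_error) <= 1e-2
```

Here the net flow is 0.5 while the storage changes at 0.5001, giving a balance error of about 1e-4 and a relative error of about 4e-5, both well below the defaults.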
The convergence is the scaled residual of the solver, giving an indication of which nodes converge the worst (are hardest to solve). A higher value indicates a node that is harder to solve.
For a more in-depth explanation of the water balance error see here.
The flow results contain calculated mean flows over the saveat intervals for every flow link in the model. In the time coordinate the start of the period is indicated.
The link_id value is the same as the fid written to the Link table, and can be used to directly look up the Link geometry. Flows from the “from” to the “to” node have a positive sign, and if the flow is reversed it will be negative.
The convergence is the scaled residual of the solver, giving an indication of which nodes converge the worst (are hardest to solve).
5.3 State - basin_state.nc
The Basin state file contains the water levels in each Basin at the end of the simulation.
To use this result as the initial condition of another simulation, see the Basin / state table reference.
5.4 DiscreteControl - control.nc
The control results contain a record of each change of control state: when it happened, which control node was involved, to which control state it changed and based on which truth state. For more information on control states and truth states, see the DiscreteControl reference.
The allocation results contain a record of allocation results: when it happened, for which node, in which allocation network, and what the demand, allocated flow and realized flow were. The realized values at the starting time of the simulation can be ignored.
The LevelDemand node allocations are listed as node type Basin. This is because one LevelDemand node can link to multiple Basins, and doesn’t receive flow by itself.
For Basins the values demand, allocated and realized are positive if the Basin level is below the minimum level given by a LevelDemand node. The values are negative if the Basin supplies due to a surplus of water.
Note
Currently the stored demand and abstraction rate are those at the allocation timepoint (and the abstraction rate is based on the previous allocation optimization). In the future these will be an average over the previous allocation timestep.
5.6 Allocation flow - allocation_flow.nc
The allocation flow results contain results of the optimized allocation flow on every link in the model that is part of a subnetwork, for each time an optimization problem is solved (see also here). If in the model a primary network and subnetwork(s) are specified, there are 3 different types of optimization for the subnetwork.
When an allocation optimization problem turns out to be infeasible, an infeasibility analysis is performed. Some user friendly data is logged in the main log, but the full report of the analysis is written to this separate file. For details on the infeasibility analysis see here.
When an allocation optimization problem turns out to be infeasible, a scaling analysis is performed in addition to the infeasibility analysis described above. Some user friendly data is logged in the main log, but the full report of the analysis is written to this separate file. For details on the scaling analysis see here.
This result file contains statistics about the solver, which can give an insight into how well the solver is performing over time. The data is saved by saveat (see configuration file). water_balance refers to the right-hand-side function of the system of differential equations solved by the Ribasim core.
The computation_time is the wall time in milliseconds spent on the given period. The first row tends to include compilation time as well. The dt is the size (in seconds) of the last calculation timestep (at the saveat timestep).