library(dplyr)
library(gt) # For tables
library(ggplot2)
library(readsdr)
library(tidyr)

1 Overview

This tutorial shows a workflow that integrates System Dynamics (SD) software and R using the readsdr package. This workflow combines the strengths of both tools: SD software for model building and R for data science.

2 Model

In particular, I employ the SIR model with demography (see page 26 in Keeling and Rohani (2011)) to demonstrate the functionalities of the package. This model is useful for understanding the longer-term persistence and endemic dynamics of an infectious disease.

2.1 Stock-and-flow diagram

The model is built in Stella and can be downloaded here.

2.2 Equations

The dynamics of this system (Eq (2.1)) are governed by three compartments:

  • \(S_t\): The fraction of susceptible individuals at time t. These are individuals who can contract the disease.

  • \(I_t\): The fraction of infectious individuals at time t. These individuals can transmit the disease to others.

  • \(R_t\) : The fraction of recovered or immune individuals at time t. These individuals have recovered from the disease and are permanently immune.

\[\begin{equation} \begin{aligned} \dot{S}_t &= \nu -\beta S_t I_t - \mu S_t \\ \dot{I}_t &= \beta S_t I_t - \gamma I_t - \mu I_t \\ \dot{R}_t &= \gamma I_t - \mu R_t \\ \end{aligned} \tag{2.1} \end{equation}\]

In this model, \(\nu\) and \(\mu\) denote birth and death rates, respectively. These rates are configured identically (\(\nu = \mu\)) to ensure the population size remains constant over time. Furthermore, \(\beta\) corresponds to the effective contact rate and \(\gamma\) to the recovery rate. These parameters also determine the basic reproduction number (\(\Re_0 = \frac{\beta}{\mu +\gamma}\)).

Finally, compartment \(C_t\) (Eq (2.2)) is an accounting stock that keeps track of the number of cases over the simulation period, where \(N\) denotes the population size.

\[\begin{equation} \dot{C} = N \beta S_t I_t \tag{2.2} \end{equation}\]

3 Data analysis

After building the model, we need to convert it to a format compatible with R. The function read_xmile from the readsdr package accomplishes this task by translating the XMILE code in the .stmx file into a list (mdl) of various objects.

model_path <- "https://jandraor.github.io/models/SIR_demography.stmx"
mdl        <- read_xmile(model_path)


The sd_stocks function helps us verify a model’s stocks and their initial values. It outputs a data frame that can be used for further analysis. For instance, we can use this data frame to create an HTML table showing the initial state of the system. In this example, the table reveals that at the start of the simulation, 10% of the population is susceptible to the disease and almost 90% are immune.

sd_stocks(mdl) |> gt()
name init_value
S 0.1000
I 0.0001
R 0.8999
C 0.0000


Similarly, we can use the sd_constants function to verify parameter values.

sd_constants(mdl) |> gt()
name value
par_gamma 1.000000e+00
par_beta 1.000000e+01
par_nu 2.747253e-04
par_mu 2.747253e-04
N 1.000000e+06

3.1 Simulation

Of course, the purpose of building a model is to simulate it and understand its dynamic behaviour. In R, the deSolve package is commonly employed to simulate Ordinary Differential Equation (ODE) models. See Duggan (2016) for further details. The sd_simulate function conveniently simplifies this process by acting as a wrapper. To use it, we need the deSolve_components object in mdl list.

sim_result <- sd_simulate(mdl$deSolve_components, start_time = 0, 
                          stop_time = 1000, timestep = 1/64, 
                          integ_method = "euler")


We can use the tidyverse to tidy and plot the results. See Duggan (2018) for further details.

tidy_df <- sim_result |> 
  select(time, S, I, R, C) |> 
  pivot_longer(-time) |> 
  mutate(name = factor(name, levels = c("S", "I", "R", "C")))

ggplot(tidy_df, aes(time, value)) +
  geom_line(colour = "steelblue") +
  facet_wrap(~name, scales = "free") +
  labs(x = "Week", y = "Value") +
  theme_classic() +
  theme(axis.line = element_line(colour = "grey65"),
        axis.text = element_text(colour = "grey65"))

This plot shows that the cumulative number of cases (\(C\)) grows over time. This reflects the ongoing occurrence of new cases (incidence). While this model captures the instantaneous incidence rate (denoted by \(\dot{C}\)), it is often the case that we more interested in the periodic incidence rate, especially when fitting a model to incidence data. This periodic rate corresponds to the number of cases that occur over a specific time span (e.g., 1 day, 1 week). In our model, the weekly incidence rate at time \(t\) can be defined as \(C_t - C_{t-1}\). The function sd_net_change facilitates this calculation.

weekly_incidence <- sd_net_change(sim_result, "C")

ggplot(weekly_incidence, aes(time, value)) +
  geom_line(colour = "steelblue") +
  labs(y = "Cases per week", x = "Week") +
  theme_classic() +
  theme(axis.line = element_line(colour = "grey65"),
        axis.text = element_text(colour = "grey65"))

3.2 Sensitivity analysis

Certainly, the main advantage of simulation models is their ability to test different assumptions. For instance, we can evaluate the impact of a lower contact rate on the incidence dynamics . To do so, we read the model again and override the default value using the argument const_list in the read_xmile function. We can also override initial values with stock_list. The blue line in the graph indicates the simulation with the new, lower contact rate, whereas the grey line represents the simulation with the original contact rate.

mdl2 <- read_xmile(model_path, const_list = list(par_beta  = 8))

sim_alt1 <- sd_simulate(mdl2$deSolve_components, start_time = 0, 
                        stop_time = 1000, timestep = 1/64, 
                        integ_method = "euler")

wkl_inc2 <- sd_net_change(sim_alt1, "C")

ggplot(wkl_inc2, aes(time, value)) +
  geom_line(colour = "steelblue") +
  geom_line(data = weekly_incidence, colour = "grey50") +
  labs(y = "Cases per week", x = "Week") +
  theme_classic() +
  theme(axis.line = element_line(colour = "grey65"),
        axis.text = element_text(colour = "grey65"))

In practice, we often need to test more than one value and assess the impact on multiple variables. The sd_sensitivity_run function streamlines this process by allowing us to perform sensitivity analysis efficiently. In this case, we explore two levels of initial immunity and two contact rate values. This combination leads to four different scenarios for analysis. The function can handle either a data frame containing modified constants (using const_list argument) or a data frame with initial value changes (using stock_list argument). If you provide both, they must have the same number of rows. For simplicity, we illustrate the dynamics using the instantaneous incidence rate.

c_df <- data.frame(par_beta = c(8, 8, 10, 10))

s_df <- data.frame(S = c(0.1, 0.2, 0.1, 0.2),
                   I = 1e-4) |> 
  mutate(R = S - I)

bind_cols(c_df, s_df) |> 
  mutate(Scenario = 1:4, .before = everything()) |> gt()
Scenario par_beta S I R
1 8 0.1 1e-04 0.0999
2 8 0.2 1e-04 0.1999
3 10 0.1 1e-04 0.0999
4 10 0.2 1e-04 0.1999
sens_run <- sd_sensitivity_run(mdl2$deSolve_components,
                               consts_df = c_df,
                               stocks_df = s_df,
                               start_time = 0, 
                               stop_time = 1000,
                               integ_method = "euler",
                               timestep     = 1/128)
ggplot(sens_run, aes(time, C_in, group = as.factor(iter))) +
  geom_line(colour = "steelblue") +
  facet_wrap(~iter, ncol = 1, scales = "free_y") +
  labs(y = "Instantaneous incidence rate", x = "Week") +
  theme_classic() +
  theme(axis.line = element_line(colour = "grey65"),
        axis.text = element_text(colour = "grey65"))

3.3 Policy test

Finally, we can finally test the effect of a simple intervention: reducing the effective contact rate to zero for 12 weeks, starting at week 65. The function sd_what_if_from_time facilitates this exploration.

# Intervention
itv_df <- sd_what_if_from_time(65,
                               up_to_time   = 77,
                               par_list     = list(par_beta = 0),
                               ds_inputs    = mdl$deSolve_components,
                               start_time   = 0,
                               stop_time    = 1000,
                               timestep     = 1/128,
                               integ_method = "euler")

In this plot, the blue lines indicate the intervention scenario, whereas the grey lines denote the scenario without the intervention.

tidy_itv_df <- itv_df |> select(time, par_beta, C_in) |> 
  rename('Effective contact rate' = par_beta,
         'Instantaneous incidence rate' = C_in) |> 
  pivot_longer(-time)

ref_sim <- sim_result |> select(time, par_beta, C_in) |> 
  rename('Effective contact rate' = par_beta,
         'Instantaneous incidence rate' = C_in) |> 
  pivot_longer(-time)

ggplot(tidy_itv_df, aes(time, value)) +
  geom_line(colour = "steelblue", alpha = 0.5) +
  geom_line(data = ref_sim, colour = "grey60", 
            alpha = 0.5) +
  facet_wrap(~name, scales = "free_y", ncol = 1) +
  labs(y = "Value", x = "Week") +
  theme_classic()

4 Computational environment

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.4.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Dublin
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tidyr_1.3.0        readsdr_0.3.0.9000 ggplot2_3.4.4      gt_0.9.0          
## [5] dplyr_1.1.2       
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.4      jsonlite_1.8.7    highr_0.10        compiler_4.3.1   
##  [5] tidyselect_1.2.0  stringr_1.5.0     xml2_1.3.5        jquerylib_0.1.4  
##  [9] png_0.1-8         scales_1.3.0      yaml_2.3.7        fastmap_1.1.1    
## [13] R6_2.5.1          labeling_0.4.3    generics_0.1.3    curl_5.0.2       
## [17] knitr_1.43        tibble_3.2.1      bookdown_0.35     munsell_0.5.0    
## [21] bslib_0.5.1       pillar_1.9.0      rlang_1.1.2       utf8_1.2.4       
## [25] stringi_1.7.12    cachem_1.0.8      deSolve_1.36      xfun_0.40        
## [29] sass_0.4.7        cli_3.6.1         withr_2.5.2       magrittr_2.0.3   
## [33] digest_0.6.33     grid_4.3.1        rstudioapi_0.15.0 lifecycle_1.0.4  
## [37] vctrs_0.6.5       evaluate_0.21     glue_1.6.2        farver_2.1.1     
## [41] fansi_1.0.6       colorspace_2.1-0  purrr_1.0.2       rmarkdown_2.24   
## [45] tools_4.3.1       pkgconfig_2.0.3   htmltools_0.5.6

References

Duggan, Jim. 2016. System Dynamics Modeling with R. Lecture Notes in Social Networks. Springer International Publishing. https://link.springer.com/book/10.1007/978-3-319-34043-2.
———. 2018. Input and output data analysis for system dynamics modelling using the tidyverse libraries of R.” System Dynamics Review 34 (3): 438–61. https://doi.org/10.1002/sdr.1600.
Keeling, Matt J., and Pejman Rohani. 2011. Modeling Infectious Diseases in Humans and Animals. Princeton University Press.