EcoDiet explored on a realistic example

The introduction employed a simplistic expemple of food web to familiarize the user with the basic commands and options of the EcoDiet package. Here we will use a more realistic example (although still artificial!) to run the different EcoDiet configurations, compare their results and hence higlight the complementarity in the different data used.

The data corresponds to 10 trophic groups with stomach content data, and very distinct isotopic measures.

realistic_stomach_data_path <- system.file("extdata", "realistic_stomach_data.csv",
                                           package = "EcoDiet")
realistic_stomach_data <- read.csv(realistic_stomach_data_path)
knitr::kable(realistic_stomach_data)

X	Cod	Pout	Sardine	Shrimps	Crabs	Bivalves	Worms	Zooplankton
Cod	0	0	0	0	0	0	0	0
Pout	1	0	0	0	0	0	0	0
Sardine	9	0	0	0	0	0	0	0
Shrimps	4	4	29	0	24	0	0	0
Crabs	1	24	0	0	0	0	0	0
Bivalves	0	3	0	0	11	0	0	0
Worms	16	30	0	1	24	0	0	0
Zooplankton	0	27	6	3	0	0	0	0
Phytoplankton	0	0	14	10	0	16	0	20
Detritus	0	0	0	12	19	18	18	0
full	21	30	29	19	29	27	18	20

realistic_biotracer_data_path <- system.file("extdata", "realistic_biotracer_data.csv",
                                           package = "EcoDiet")
realistic_biotracer_data <- read.csv(realistic_biotracer_data_path)
knitr::kable(realistic_biotracer_data[c(1:3, 31:33, 61:63), ])

	group	d13C	d15N
1	Cod	-12.94144	19.18913
2	Cod	-14.96070	20.23939
3	Cod	-13.77822	19.48809
31	Pout	-13.47127	18.57353
32	Pout	-13.16888	17.58714
33	Pout	-14.23085	17.38938
61	Sardine	-14.56111	16.95231
62	Sardine	-15.04729	17.15197
63	Sardine	-14.63688	16.90906

library(EcoDiet)

plot_data(biotracer_data = realistic_biotracer_data,
          stomach_data = realistic_stomach_data)

#> Warning: Use of `biotracer_data$group` is discouraged.
#> ℹ Use `group` instead.

Yes, we are aware that isotopic data is usually messier, but isn’t it a beautiful plot?

The configuration without literature data

We define the configuration we are in, and preprocess the data:

literature_configuration <- FALSE

data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

In this configuration, priors are set for each trophic link identified as plausible by the user but the priors are not informed by literature data, and are thus uninformative:

plot_prior(data, literature_configuration)

The marginal prior distributions have different shape depending on the variables:

it is flat or uniform for η, the probabilities that a trophic link exists (all the probabilities of existence are thus equiprobable),
the marginal distributions for each diet proportion Π are peaking at zero, although the joint distribution for Πs is a flat Dirichlet prior, because all the diet proportions must sum to one.

plot_prior(data, literature_configuration, pred = "Pout")

We define the model, and test if it compiles well with a few iterations and adaptation steps:

filename <- "mymodel.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1104
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 12 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[10,7], PI[5,7], PI[1,3], PI[8,7], PI[6,2], PI[3,2], PI[1,6], PI[8,2], PI[9,3], PI[8,3].
#> JAGS output for model 'mymodel.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.195 minutes at time 2025-02-24 11:59:21.218764.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.657  0.084   0.480   0.660   0.811    FALSE 1 1.000  1500
#> eta[5,1]    0.591  0.089   0.412   0.594   0.757    FALSE 1 1.002   765
#> eta[3,2]    0.086  0.058   0.012   0.073   0.231    FALSE 1 1.007   317
#> eta[6,2]    0.088  0.057   0.011   0.076   0.227    FALSE 1 1.002   834
#> eta[7,2]    0.448  0.103   0.247   0.445   0.656    FALSE 1 0.999  1500
#> eta[8,2]    0.212  0.083   0.077   0.203   0.391    FALSE 1 1.001  1143
#> eta[9,2]    0.735  0.088   0.544   0.741   0.887    FALSE 1 1.004   446
#> eta[1,3]    0.397  0.088   0.233   0.393   0.582    FALSE 1 1.007   273
#> eta[4,3]    0.649  0.082   0.486   0.650   0.792    FALSE 1 1.000  1500
#> eta[8,3]    0.804  0.072   0.645   0.813   0.916    FALSE 1 1.002   902
#> eta[9,3]    0.808  0.070   0.658   0.814   0.928    FALSE 1 1.002   878
#> eta[1,6]    0.121  0.059   0.033   0.113   0.260    FALSE 1 1.000  1500
#> eta[3,6]    0.785  0.072   0.631   0.790   0.909    FALSE 1 1.000  1500
#> eta[8,6]    0.155  0.062   0.053   0.150   0.299    FALSE 1 1.004   391
#> eta[9,6]    0.970  0.029   0.894   0.978   0.999    FALSE 1 1.004   772
#> eta[10,6]   0.872  0.059   0.738   0.880   0.966    FALSE 1 1.006   410
#> eta[5,7]    0.489  0.089   0.325   0.488   0.668    FALSE 1 1.007   281
#> eta[8,7]    0.968  0.032   0.886   0.977   0.999    FALSE 1 1.003  1500
#> eta[10,7]   0.226  0.072   0.107   0.219   0.386    FALSE 1 1.028    81
#> eta[4,8]    0.598  0.103   0.382   0.602   0.795    FALSE 1 1.003   636
#> eta[5,8]    0.516  0.107   0.308   0.517   0.723    FALSE 1 1.007   340
#> eta[9,8]    0.092  0.061   0.011   0.080   0.251    FALSE 1 1.001  1160
#> eta[10,8]   0.190  0.082   0.054   0.180   0.366    FALSE 1 1.003   871
#> eta[4,9]    0.951  0.048   0.828   0.966   0.999    FALSE 1 1.002  1500
#> eta[5,10]   0.956  0.043   0.841   0.970   0.999    FALSE 1 1.002  1500
#> PI[4,1]     0.339  0.254   0.000   0.308   0.901    FALSE 1 1.092    27
#> PI[5,1]     0.661  0.254   0.099   0.692   1.000    FALSE 1 1.092    27
#> PI[3,2]     0.100  0.184   0.000   0.005   0.663    FALSE 1 1.479    10
#> PI[6,2]     0.113  0.171   0.000   0.016   0.579    FALSE 1 1.574     8
#> PI[7,2]     0.496  0.297   0.000   0.511   0.968    FALSE 1 1.094    30
#> PI[8,2]     0.095  0.218   0.000   0.000   0.840    FALSE 1 1.207    21
#> PI[9,2]     0.196  0.169   0.000   0.165   0.581    FALSE 1 1.061    41
#> PI[1,3]     0.403  0.306   0.000   0.422   0.947    FALSE 1 1.699     6
#> PI[4,3]     0.150  0.176   0.000   0.095   0.636    FALSE 1 1.095    36
#> PI[8,3]     0.174  0.176   0.000   0.130   0.617    FALSE 1 1.166    18
#> PI[9,3]     0.273  0.248   0.000   0.215   0.839    FALSE 1 1.194    16
#> PI[1,6]     0.027  0.069   0.000   0.000   0.216    FALSE 1 1.351    17
#> PI[3,6]     0.325  0.196   0.006   0.316   0.747    FALSE 1 1.012   189
#> PI[8,6]     0.074  0.142   0.000   0.001   0.507    FALSE 1 1.145    34
#> PI[9,6]     0.318  0.213   0.013   0.297   0.791    FALSE 1 1.054    46
#> PI[10,6]    0.256  0.206   0.000   0.230   0.707    FALSE 1 1.009   209
#> PI[5,7]     0.278  0.192   0.000   0.287   0.615    FALSE 1 1.762     6
#> PI[8,7]     0.546  0.209   0.086   0.563   0.956    FALSE 1 1.625     7
#> PI[10,7]    0.177  0.281   0.000   0.000   0.875    FALSE 1 3.553     4
#> PI[4,8]     0.131  0.154   0.000   0.067   0.510    FALSE 1 1.053    44
#> PI[5,8]     0.261  0.215   0.000   0.214   0.823    FALSE 1 1.159    24
#> PI[9,8]     0.182  0.192   0.000   0.119   0.635    FALSE 1 1.058    49
#> PI[10,8]    0.426  0.278   0.000   0.424   0.999    FALSE 1 1.058    52
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  866.366 11.493 845.570 866.133 890.494    FALSE 1 1.002   687
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 65.9 and DIC = 932.308 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param="normal", parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can also plot the results for specific prey if you want a clearer figure:

plot_results(mcmc_output, data, pred = "Pout", 
             variable = "PI", prey = c("Bivalves", "Worms"))

The configuration with literature data

We now change the configuration to add literature data to the model:

literature_configuration <- TRUE

realistic_literature_diets_path <- system.file("extdata", "realistic_literature_diets.csv",
                                               package = "EcoDiet")
realistic_literature_diets <- read.csv(realistic_literature_diets_path)
knitr::kable(realistic_literature_diets)

X	Cod	Pout	Sardine	Shrimps	Crabs	Bivalves	Worms	Zooplankton	Phytoplankton	Detritus
Cod	0.0000000	0.0000000	0.0000000	0.0000000	0.0000000	0.0000000	0.0	0.0	0	0
Pout	0.4275065	0.0000000	0.0000000	0.0000000	0.0000000	0.0000000	0.0	0.0	0	0
Sardine	0.3603675	0.0000000	0.0000000	0.0000000	0.0000000	0.0000000	0.0	0.0	0	0
Shrimps	0.0300737	0.5295859	0.0002143	0.0000000	0.0082107	0.0000000	0.0	0.0	0	0
Crabs	0.1410430	0.3332779	0.0000000	0.0000000	0.0000000	0.0000000	0.0	0.0	0	0
Bivalves	0.0000000	0.0006130	0.0000000	0.0000000	0.3441081	0.0000000	0.0	0.0	0	0
Worms	0.0410093	0.1023676	0.0000000	0.0171336	0.4435377	0.0000000	0.0	0.0	0	0
Zooplankton	0.0000000	0.0341557	0.7381375	0.9121505	0.0000000	0.0000000	0.0	0.0	0	0
Phytoplankton	0.0000000	0.0000000	0.2616482	0.0000610	0.0000000	0.9966847	0.0	1.0	0	0
Detritus	0.0000000	0.0000000	0.0000000	0.0706550	0.2041434	0.0033153	1.0	0.0	0	0
pedigree	0.8000000	0.1000000	0.5000000	0.3000000	0.7000000	0.1000000	0.2	0.2	1	1

data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data,
                        literature_diets = realistic_literature_diets,
                        nb_literature = 12,
                        literature_slope = 0.5)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

Now we see that the prior distributions are informed by the literature data:

when the literature diet input is > 0, the trophic link probabilities η are shifted toward one. Here this is the case for all prey but we could imagine that the user identify a species as a plausible prey whereas it has not been observed being consumed by the predator in the literature. In that case, the literature diet of 0 would drive η toward 0.
the average prior for the diet proportions Π is directly the literature diet input.

plot_prior(data, literature_configuration)

plot_prior(data, literature_configuration, pred = "Pout")

Again, we verify that the model compiles well:

filename <- "mymodel_literature.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1594
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 13 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[10,7], PI[8,7], PI[5,7], PI[7,2], PI[9,8], PI[10,8], PI[1,3], PI[3,2], PI[8,2], PI[8,6].
#> JAGS output for model 'mymodel_literature.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.197 minutes at time 2025-02-24 11:59:35.859258.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.676  0.084   0.507   0.679   0.825    FALSE 1 1.006   284
#> eta[5,1]    0.616  0.086   0.443   0.618   0.778    FALSE 1 1.002   946
#> eta[3,2]    0.354  0.082   0.200   0.349   0.516    FALSE 1 1.006   301
#> eta[6,2]    0.354  0.080   0.209   0.351   0.518    FALSE 1 1.003   567
#> eta[7,2]    0.606  0.082   0.436   0.611   0.752    FALSE 1 1.001  1500
#> eta[8,2]    0.447  0.085   0.290   0.445   0.621    FALSE 1 1.009   339
#> eta[9,2]    0.812  0.067   0.659   0.819   0.927    FALSE 1 1.014   143
#> eta[1,3]    0.522  0.079   0.370   0.524   0.673    FALSE 1 1.005   361
#> eta[4,3]    0.718  0.073   0.567   0.723   0.842    FALSE 1 1.000  1500
#> eta[8,3]    0.849  0.055   0.725   0.857   0.938    FALSE 1 1.000  1500
#> eta[9,3]    0.851  0.056   0.724   0.857   0.941    FALSE 1 1.000  1500
#> eta[1,6]    0.156  0.064   0.057   0.149   0.300    FALSE 1 1.000  1500
#> eta[3,6]    0.786  0.070   0.639   0.790   0.906    FALSE 1 1.004   398
#> eta[8,6]    0.186  0.065   0.078   0.182   0.326    FALSE 1 1.005   331
#> eta[9,6]    0.970  0.029   0.895   0.979   0.999    FALSE 1 1.001  1500
#> eta[10,6]   0.879  0.055   0.755   0.886   0.966    FALSE 1 1.005   605
#> eta[5,7]    0.563  0.084   0.395   0.566   0.725    FALSE 1 1.007   270
#> eta[8,7]    0.972  0.028   0.891   0.980   0.999    FALSE 1 1.000  1500
#> eta[10,7]   0.359  0.078   0.214   0.358   0.514    FALSE 1 1.021   106
#> eta[4,8]    0.668  0.093   0.479   0.671   0.839    FALSE 1 1.001  1500
#> eta[5,8]    0.590  0.100   0.390   0.596   0.777    FALSE 1 1.000  1500
#> eta[9,8]    0.228  0.084   0.090   0.221   0.406    FALSE 1 1.001  1073
#> eta[10,8]   0.313  0.092   0.147   0.305   0.507    FALSE 1 1.003   736
#> eta[4,9]    0.955  0.044   0.840   0.967   0.999    FALSE 1 1.002  1500
#> eta[5,10]   0.957  0.041   0.846   0.969   0.999    FALSE 1 1.000  1500
#> PI[4,1]     0.142  0.289   0.000   0.000   1.000    FALSE 1 1.051    88
#> PI[5,1]     0.858  0.289   0.000   1.000   1.000    FALSE 1 1.051    88
#> PI[3,2]     0.110  0.226   0.000   0.002   0.930    FALSE 1 1.318    15
#> PI[6,2]     0.232  0.276   0.000   0.096   0.907    FALSE 1 1.199    16
#> PI[7,2]     0.532  0.340   0.000   0.567   0.999    FALSE 1 1.733     6
#> PI[8,2]     0.071  0.138   0.000   0.001   0.501    FALSE 1 1.226    15
#> PI[9,2]     0.054  0.097   0.000   0.006   0.347    FALSE 1 1.089    49
#> PI[1,3]     0.245  0.229   0.000   0.189   0.750    FALSE 1 1.428     8
#> PI[4,3]     0.137  0.137   0.000   0.103   0.467    FALSE 1 1.043    78
#> PI[8,3]     0.033  0.063   0.000   0.007   0.210    FALSE 1 1.089    53
#> PI[9,3]     0.584  0.250   0.096   0.600   0.994    FALSE 1 1.160    17
#> PI[1,6]     0.099  0.209   0.000   0.005   0.915    FALSE 1 1.088   441
#> PI[3,6]     0.357  0.283   0.000   0.362   0.908    FALSE 1 1.019   250
#> PI[8,6]     0.136  0.246   0.000   0.000   0.846    FALSE 1 1.222    15
#> PI[9,6]     0.323  0.272   0.000   0.278   0.979    FALSE 1 1.029    86
#> PI[10,6]    0.085  0.164   0.000   0.004   0.566    FALSE 1 1.138    40
#> PI[5,7]     0.132  0.175   0.000   0.030   0.560    FALSE 1 1.827     6
#> PI[8,7]     0.184  0.291   0.000   0.003   0.860    FALSE 1 2.901     4
#> PI[10,7]    0.684  0.413   0.000   0.933   1.000    FALSE 1 3.104     4
#> PI[4,8]     0.197  0.242   0.000   0.098   0.904    FALSE 1 1.080    41
#> PI[5,8]     0.076  0.148   0.000   0.005   0.528    FALSE 1 1.046    88
#> PI[9,8]     0.200  0.290   0.000   0.017   0.934    FALSE 1 1.577     7
#> PI[10,8]    0.528  0.356   0.000   0.583   0.998    FALSE 1 1.543     7
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  866.882 11.333 846.318 866.687 890.605    FALSE 1 1.001  1449
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 64.2 and DIC = 931.095 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param=list(nb_iter=100000, nb_burnin=50000, nb_thin=50, nb_adapt=50000), parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can save the figures as PNG using:

plot_results(mcmc_output, data, pred = "Pout", save = TRUE, save_path = ".")

Last, if you want to explore further in detail the a posteriori distribution of your parameters Π and η, you can run the following code line, which will store the values for all iterations into a data frame.

reshape_mcmc(mcmc_output, data)

- The configuration without literature data
- The configuration with literature data