Uncertainty & Sensitivity

John M. Drake

Key questions about uncertainty

  1. Which factors are most important in determining model behavior?
  2. How will the model outcome (trajectories, equilibria, etc.) change if the conditions (parameters, initial state) change?
  3. What range of outcomes is consistent with my knowledge of the observables? With my knowledge of the parameters?
  4. What processes do I need more information about and how much information do I need?
  5. If something changes (e.g. an intervention), how will the model outputs change?

Discussion: What is the difference between uncertainty and sensitivity? When does it matter?

Different kinds of uncertainty/error

Models are idealizations and subject to approximation

  • Model mis-specification (structural epistemic uncertainty)
  • Inaccurate parameterization (parameter epistemic uncertainty)
  • The propagation of intrinsic noise/stochasticity (aleatory uncertainty)

\[ \begin{aligned} \frac{dX}{dt} &= - \beta XY \\ \frac{dY}{dt} &= \beta XY - \gamma Y \\ \frac{dZ}{dt} &= \gamma Y \end{aligned} \]

Discussion: Explain how these kinds of approximation are reflected in this model.

Note: The difference between epistemic and aleatory uncertainty is somewhat semantic
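
A minimal sketch of how parameter epistemic uncertainty can show up in the model above. It integrates the three equations with a simple Euler scheme in base R and compares the epidemic peak under two candidate values of \( \beta \). The helper name sir_euler and every numerical value (rates, initial conditions, step size) are assumptions made for illustration; none comes from the source.

```r
# Euler integration of dX/dt = -beta*X*Y, dY/dt = beta*X*Y - gamma*Y, dZ/dt = gamma*Y.
# All numerical values below are illustrative assumptions.
sir_euler <- function(beta, gamma, X0 = 0.999, Y0 = 0.001, Z0 = 0,
                      t_max = 100, dt = 0.01) {
  n <- round(t_max / dt)
  out <- matrix(NA_real_, nrow = n + 1, ncol = 4,
                dimnames = list(NULL, c("time", "X", "Y", "Z")))
  out[1, ] <- c(0, X0, Y0, Z0)
  for (i in seq_len(n)) {
    X <- out[i, "X"]; Y <- out[i, "Y"]; Z <- out[i, "Z"]
    dX <- -beta * X * Y
    dY <- beta * X * Y - gamma * Y
    dZ <- gamma * Y
    out[i + 1, ] <- c(i * dt, X + dX * dt, Y + dY * dt, Z + dZ * dt)
  }
  out
}

# Two plausible values of beta, same gamma
low  <- sir_euler(beta = 0.3, gamma = 0.1)
high <- sir_euler(beta = 0.5, gamma = 0.1)
peak_low  <- max(low[, "Y"])
peak_high <- max(high[, "Y"])
c(peak_low = peak_low, peak_high = peak_high)
```

The two runs differ only in \( \beta \), so the gap between the peaks reflects uncertainty in that one parameter. The model structure and the absence of noise are held fixed.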

Simple models

In simple deterministic models with few state variables and few parameters we can often produce model visualizations to answer such questions

Invasion boundary for a model of Ebola virus transmission

A more complicated example: A model for the transmission of HIV among MSM


  • In 2000, \( \approx 30 \)% of gay men in San Francisco were infected with HIV, and \( \approx 50 \)% of these were taking combination antiretroviral therapy (ART)
  • ART was effective at reducing AIDS death rates, but does not completely eliminate infectivity
  • It was unclear whether the net effect of increased distribution of ART would increase or decrease HIV in this population


The model

[Figure: the model]

Blower, S.M. et al. 2000. A tale of two futures: HIV and antiretroviral therapy in San Francisco. Science 287:650-654.

The model

Parameter Interpretation
\( X \) Susceptibles
\( Y \) Infected (R=resistant, S=sensitive, U=untreated, T=treated)
\( \pi \) Rate at which gay men join the sexually active community
\( \mu^{-1} \) Average time during which new partners are acquired
\( c \) Average number of new partners (per year)
\( p \) Probability of a drug-resistant case transmitting drug-sensitive viruses
\( q^{-1} \) Average time for a drug-resistant infection to revert to drug-sensitive infection
\( \sigma \) Per capita effective treatment rate
\( e \) Relative efficacy of ART in treating drug-resistant infections
\( r \) Rate of emergence of resistance due to acquired infection
\( g \) Proportion of cases that give up ART per year
\( \nu \) Average rate of disease progression

Population-level treatment effect


  • It appears that ART could prevent \( \approx 15{,}000 \) cases over 20 years
  • How reliable is this result?
  • Model has 20 parameters but none is known exactly

Number of infections prevented as a function of the fraction of cases treated

Idea

Simulate from all plausible models and look at the distribution of outputs

  • How do we sample the space of plausible models?

Monte Carlo sampling from parameter distribution

A helpful approach when you have access to the distribution of parameter estimates (usually when fitting is by MCMC, but occasionally when fitting is by maximum likelihood)
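
A minimal sketch of this idea, assuming a set of parameter draws is already available. Here the draws are a stand-in generated from lognormal distributions (an assumption for illustration, not a fitted distribution), and the model output is the final epidemic size \( z \) of an SIR model with the population scaled to 1, which solves \( z = 1 - e^{-R_0 z} \) with \( R_0 = \beta/\gamma \). The function name final_size is hypothetical.

```r
set.seed(1)
n_draws <- 1000

# Stand-in for draws from a fitted parameter distribution (assumed lognormal).
draws <- data.frame(beta  = rlnorm(n_draws, meanlog = log(0.4), sdlog = 0.2),
                    gamma = rlnorm(n_draws, meanlog = log(0.1), sdlog = 0.2))

# Final epidemic size z of the SIR model: the root of z - 1 + exp(-R0 * z).
final_size <- function(beta, gamma) {
  R0 <- beta / gamma
  if (R0 <= 1) return(0)   # no epidemic below the threshold
  uniroot(function(z) z - 1 + exp(-R0 * z),
          lower = 1e-8, upper = 1, tol = 1e-10)$root
}

# Evaluate the model once per parameter draw, then summarize the outputs.
z <- mapply(final_size, draws$beta, draws$gamma)
quantile(z, probs = c(0.025, 0.5, 0.975))
```

With draws from an actual fit, each row would be one joint draw, so evaluating the model row by row preserves any correlation among the parameter estimates.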


Latin hypercube sampling


  • To determine robustness of model predictions, we require a way of exploring the output of a family of models
  • Realistic models will often have so many parameters that high-resolution exploration of their parameter space is computationally intractable
  • Latin hypercube sampling is a scheme for simulating random parameter sets that adequately cover the parameter space

Marino, S. et al. 2008. A methodology for performing global uncertainty and sensitivity analysis in systems biology. Journal of Theoretical Biology 254:178-196.
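
The stratification behind Latin hypercube sampling can be sketched in a few lines of base R. This is a basic, non-optimized version written for illustration (the function name simple_lhs is hypothetical): each dimension is cut into n equal intervals, and each interval receives exactly one point.

```r
# Basic (non-optimized) Latin hypercube sample: n points in d dimensions.
# simple_lhs is a hypothetical name used for illustration only.
simple_lhs <- function(n, d) {
  sapply(seq_len(d), function(j) {
    strata <- sample(n)          # random permutation of the n strata
    (strata - runif(n)) / n      # one uniform draw inside each stratum
  })
}

set.seed(42)
pts <- simple_lhs(10, 2)

# Stratum index (1..10) of every point, sorted within each column:
# each column should contain every index exactly once.
apply(pts, 2, function(col) sort(ceiling(col * 10)))
```

Functions such as maximinLHS in the lhs package start from the same stratified design and additionally optimize the spacing between points.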

Latin hypercube sampling in R


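library(lhs)                  # provides maximinLHS()
h <- 50                       # number of sample points
x <- runif(h)                 # independent uniform draws for comparison
y <- runif(h)
lhs_pts <- maximinLHS(h, 2)   # h points in the 2-D unit square
par(mfrow=c(1,2))             # two panels side by side
plot(x, y, type='p', main='Random Uniform', xlab='', ylab='')
plot(lhs_pts, type='p', main='LH Sampling', xlab='', ylab='')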

[Figure: side-by-side scatterplots titled 'Random Uniform' and 'LH Sampling']

Latin hypercube sampling in R (3-D)


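library(lhs)                  # provides maximinLHS()
library(scatterplot3d)        # provides scatterplot3d()
h <- 150                      # number of sample points
x <- runif(h); y <- runif(h); z <- runif(h)   # independent uniform draws
lhs_pts <- maximinLHS(h, 3)   # h points in the unit cube
par(mfrow=c(1,2))             # two panels side by side
scatterplot3d(x, y, z, type='p', main='Random Uniform', xlab='', ylab='', zlab='')
scatterplot3d(lhs_pts, type='p', main='LH Sampling', xlab='', ylab='', zlab='')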

[Figure: side-by-side 3-D scatterplots titled 'Random Uniform' and 'LH Sampling']

Using the R package lhs

The package lhs generates points in the unit d-dimensional hypercube (i.e. where every dimension is on the interval \( [0,1] \))

  • Each coordinate of a point needs to be rescaled to the interval from the corresponding parameter's minimum to its maximum
  • This can be done by “stretching” the interval using the following formula for a random parameter \( \alpha \) between \( \alpha_{min} \) and \( \alpha_{max} \), where \( U \) is the sampled coordinate on \( [0,1] \)


\[ \alpha_0 = U(\alpha_{max}-\alpha_{min})+\alpha_{min} \]
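
The rescaling formula can be applied to a whole sample matrix at once. The sketch below assumes one column per parameter. The helper name rescale_unit, the parameter names, and the ranges are all made up for illustration. It uses the lhs package when it is installed and falls back to plain uniform draws otherwise.

```r
# Rescale unit-hypercube samples U (one column per parameter) to [lower, upper].
# rescale_unit is a hypothetical helper; the ranges below are made up.
rescale_unit <- function(U, lower, upper) {
  stopifnot(ncol(U) == length(lower), length(lower) == length(upper))
  scaled <- sweep(U, 2, upper - lower, `*`)   # U * (max - min), column by column
  sweep(scaled, 2, lower, `+`)                # ... + min
}

set.seed(7)
# Use the lhs package if available; otherwise fall back to uniform draws.
U <- if (requireNamespace("lhs", quietly = TRUE)) {
  lhs::maximinLHS(100, 2)
} else {
  matrix(runif(200), ncol = 2)
}

params <- rescale_unit(U, lower = c(0.1, 5), upper = c(0.5, 20))
colnames(params) <- c("alpha", "delta")   # hypothetical parameter names
apply(params, 2, range)                   # each column stays within its range
```

For a uniform range, qunif(U[, j], min, max) gives the same result. Other quantile functions (e.g. qnorm, qlnorm) can be used in the same way to map the unit interval onto a non-uniform distribution.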

Applying lhs to the motivating example


  • Evidently, our best guesses are rather optimistic compared with the range of scenarios we believe to be plausible
  • It is not plausible that ART is counter-productive (an open question at the time of the study)

Assessing parameter importance


  • Correlation analysis can be used to investigate how model output is related to input parameters (but does not account for covariances among parameters, if there are any)
  • Partial rank correlation coefficients (PRCC) partition effects to each input variable
  • Limitation: PRCC only works when the relationship between inputs and outputs is monotonic
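
A minimal sketch of how PRCC can be computed from first principles in base R, assuming at least two input parameters. Inputs and output are rank-transformed, and for each input the linear effect of the other ranked inputs is removed from both that input and the output before correlating the residuals. The function name prcc and the toy model are assumptions made for illustration.

```r
# Partial rank correlation coefficients computed from first principles.
# X: matrix of sampled inputs (one column per parameter); y: model output.
# prcc is a hypothetical name; requires at least two input columns.
prcc <- function(X, y) {
  R  <- apply(X, 2, rank)     # rank-transform each input
  ry <- rank(y)               # rank-transform the output
  sapply(seq_len(ncol(R)), function(j) {
    others <- R[, -j, drop = FALSE]
    res_x <- resid(lm(R[, j] ~ others))   # input j with other inputs removed
    res_y <- resid(lm(ry ~ others))       # output with other inputs removed
    cor(res_x, res_y)
  })
}

set.seed(123)
n <- 500
X <- cbind(x1 = runif(n), x2 = runif(n), x3 = runif(n))
# Toy monotonic model: increases with x1, decreases with x2, ignores x3.
y <- exp(2 * X[, "x1"]) - 5 * X[, "x2"]^3 + rnorm(n, sd = 0.1)

result <- setNames(prcc(X, y), colnames(X))
round(result, 2)
```

In this toy example the coefficient for x1 should be strongly positive, the one for x2 strongly negative, and the one for x3 close to zero. In practice X would be the rescaled Latin hypercube sample and y the corresponding model outputs.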

Summary

  1. A key problem is to distinguish variability that arises from intrinsic stochasticity and uncertainty that can be mitigated through the acquisition of better information.
  2. The effect of uncertainty in model parameters can be identified through Latin hypercube sampling coupled with partial rank correlation analysis.
  3. Other methods (e.g. Sobol's index, sensitivity heat map) may be required to determine the effects of parameter interactions or the direction of effect, or when the input-output mapping is non-monotonic.

Further reading: Wu et al. 2013. Sensitivity analysis of infectious disease models: methods, advances and their application. Journal of the Royal Society Interface 10:20121018.

Acknowledgements

Presentations and exercises draw significantly from materials developed with Pej Rohani, Ben Bolker, Matt Ferrari, Aaron King, and Dave Smith used during the 2009-2011 Ecology and Evolution of Infectious Diseases workshops and the 2009-2019 Summer Institutes in Statistics and Modeling of Infectious Diseases.

Licensed under the Creative Commons attribution-noncommercial license, http://creativecommons.org/licenses/by-nc/3.0/. Please share and remix noncommercially, mentioning its origin.