Modelling viral concentration in the TNA study

## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).

## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).

## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).

## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.

1 Introduction

Trial for a therapeutic vaccine (i.e. to treat the disease, rather than prevent it). Inserted the treatment in attenuated virus from classic PPR vaccine made resistant. This virus will hopefully transport the treatment to infected cells with the real (wild) virus.

25-30 individual animals (goats) split into 5 boxes
Boxes 1, 2, 5: treatment group with vaccine administered at days -1, 0, 1 with respect to the challenge (infection with real virus).
Boxes 6, 8: negative control group, with classical PPR vaccine (not expected to have any therapeutic effect) and no treatment at all.
Response variable:
- Viral concentration : quantitative variable. Not clear how are we going to quantify it.
  
  We have measures of “Threshold cycle” (Ct) aka “Quantification cycle” (Cq) which quantifies the relative concentration of virus inversely: number of cycles necessary to find the virus DNA, running from 0 (immediately detected, thus high concentration to 40 cycles (not detected)). Lower measures correspond to higher concentrations¹. Otherwise, we can use a measured derived from Ct that directly estimates the viral charge (number of copies of viral genes). Which variable and potential transformations of them are to be determined.
  
  Asked Etienne Loire (see answer 2020-11-27). Essentially, the calibration is a linear transformation, so from a modelling perspective is equivalent. It can be transformed, though, so it has nicer statistical properties.
Measurements:
- Each animal is evaluated for both response variables almost every day from 5 to 14. There are some missing values.

2 Statistical model

Let $y(t) | b$ be the observed Cq at day $t$ for a individual in box $b \in \{1, 2, 5, 6, 8\}$ from group $g$ (either control or test). We assume a Gaussian model, whose expected value is described additively by four components:

$\mu_0$, a constant baseline mean value for the control group;
$\alpha_t$, the mean difference of test group, relative to the control;
$\theta_b$, the mean difference of box $b$, relative to the corresponding group;
$f_g(t)$, the expected evolution over time of the group $g$ centred at 0. The non-parametric curves $f_g(t)$ were implemented using thin-plate regression splines (Wood 2017).

The model equations are as follows:

\[\begin{equation} \begin{aligned} y(t) | b & \sim \mathcal{N}(\mu_b(t),\, \sigma_y) \\ \mu_b(t) & = \mu_0 + \alpha_t \mathbb{1}_{\{g(b) = \text{test}\}} + \theta_b + f_g(t) \\ \theta_b & \sim \mathcal{N}(0,\, \sigma_\theta) \\ f_g(t) & = \beta_g \phi(t) + \sum_{k=1}^K \gamma^{(g)}_k B_k(i) \\ \gamma_k & \sim \mathcal{N}(0, \sigma_\gamma), \quad k = 1, \ldots, K \end{aligned} \label{eq:model} \end{equation}\] where $B_k$ are the thin-plate regression spline basis functions of the location $i$ for the current data, with $K = 6$ and penalty order $2$ and $\phi$ is a centred linear function of $t$.

The model was fit using brms (Bürkner 2017).

3 Results

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

Figure 3.1: Posterior means by group.

$Predicted outcome by group.$

Figure 3.2: Predicted outcome by group.

## Warning: `stat(x)` was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(x)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

Figure 3.3: Average effect of the test by day.

Figure 3.4: Posterior probability that the average effect of the test is positive, by day.

Figure 3.5: Predictive effect of the test by day.

Figure 3.6: Posterior probability that the predictive effect of the test is positive, by day.

Figure 3.7: Posterior effect of the boxes, relative to the group means.

Difference between box 2 and box 5 in the test group. The posterior mean difference is of about 0.8 cycles, and the probability of a positive difference is of about 92 %

Figure 3.8: Difference between box 2 and box 5 in the test group. The posterior mean difference is of about 0.8 cycles, and the probability of a positive difference is of about 92 %

Figure 3.9: Conditional effect of the boxes and the time.

3.1 Effect of the box 2

Here we quantify the effect of the treatment applied to box2 (vaccine applied at day 0), which seems the best performing, in relation with the control group.

Figure 3.10: Average effect of the Box 2 by day.

Figure 3.11: Posterior probability that the average effect of the Box 2 is positive, by day.

4 Conclusions

The model fails to accurately capture the behaviour of the data in the beginning of the period due to the boundary at 40 cycles (Fig. 5.2, annex on diagnostics). But this is an uninteresting part, and other than that, the model seems to work well.
The results on the right extreme of the time interval should also be taken with caution due to edge effects that can make curve estimation more unstable.
The control and test groups are essentially indistinguishable (Fig. 3.1). The effect of the test, if any, is very weak.
There is some slight evidence (prob. ~ 0.8) that supports a positive average effect of the test around day 9 (Fig. 3.4). However, the magnitude of the effect, if any, is of only 2 or 3 cycles (Fig. 3.3). Furthermore, there is an equivalent support for a negative effect around day 9. So, all this could be just random fluctuation.
In contrast, at the individual level, the probability that the treatment has a positive effect is essentially the same as the probability of a negative effect (Fig. 3.6).
The global effects of the boxes relative to the group are negligible, in any case under 1 cycle in magnitude (Fig. 3.7).
Although I don’t think there is much to gain, there are some possible improvements to the model:
- Account for the bounded support of the outcome by modelling the response rescaled into (0, 1) with a zero-one inflated Beta.
- Distributional regression: the variability seems to change with grp (and perhaps with t)

5 Annex: model diagnostics and validation

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: vc ~ 1 + grp + (1 | box) + +s(t, k = 8, by = grp, id = "grp") 
##    Data: tna_data (Number of observations: 232) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Smooth Terms: 
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sds(stgrpcontrol_1)    66.96     17.04    38.79   104.81 1.00     1735     2105
## sds(stgrptest_1)       67.78     15.95    42.83   104.49 1.00     1747     2282
## 
## Group-Level Effects: 
## ~box (Number of levels: 5) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.91      0.88     0.05     3.29 1.01      787     1349
## 
## Population-Level Effects: 
##                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept          29.91      0.94    28.04    31.75 1.00     2077     1338
## grptest            -0.47      1.17    -2.73     1.95 1.00     2190     1657
## st:grpcontrol_1     5.26      9.81   -13.87    24.50 1.00     5031     2687
## st:grptest_1        2.05      9.76   -17.27    21.08 1.00     5326     2927
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     3.06      0.15     2.78     3.39 1.00     5207     2570
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.

## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.

## Warning: Argument 'pars' is deprecated. Please use 'variable' instead.

## Warning: Removed 464 rows containing missing values (`geom_segment()`).

Figure 5.1: Observed vs posterior predictive modes, and highest-density credible intervals at 50% and 89%.

## Using 10 posterior draws for ppc type 'dens_overlay_grouped' by default.

Figure 5.2: Posterior predictive simulations from the model vs. observed empirical distribution.

Figure 5.3: Horizontally-jittered residuals (mode, 50 and 89% HDI credible intervals) agains time.

Figure 5.4: Quantile residuals.

References

Bürkner, Paul-Christian. 2017. “Brms : An r Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1). https://doi.org/10.18637/jss.v080.i01.

Wood, Simon N. 2017. Generalized Additive Models : An Introduction with r. Second. Texts in Statistical Science. Chapman & Hall/CRC. http://www.worldcat.org/isbn/1498728332.

https://www.thermofisher.com/fr/fr/home/life-science/pcr/real-time-pcr/real-time-pcr-learning-center/real-time-pcr-basics/real-time-pcr-understanding-ct.html ↩︎