Phase III pilot trial for population suppression.

1 Introduction

The present pilot trial for population suppression of Anopheles arabiensis is structured over 3 distinct areas, each divided into 3 sectors.

Sector number 3, in the west bank area, was selected for the sit treatment, where releases of sterile males were performed. The other 8 sectors were used as controls.

A total of 94 sampling units, each 1 hectare in size and belonging to one of four land-class-land-use (LCLU) classes, were selected and distributed across sectors. Three adult traps were installed in each sampling unit.

A series of larvae surveys have been conducted in breeding sites within sectors.

A series of swarm surveys have been conducted in the sit sector following some of the releases of sterile males.

See the descriptive report for more details on the experimental set up and a description of the collected data.

A series of 18 ground releases of about 3k - 20k marked sterile males were performed from 11 points spread over the sit sector (Table 1.1, Figure 1.1).

Table 1.1: Release events of marked sterile males in the sit sector.

| Release number | Release date | Number of sterile males released |
|---|---|---|
| 1 | 2014-05-11 | 12300 |
| 2 | 2014-05-15 | 9700 |
| 3 | 2014-05-27 | 8400 |
| 4 | 2014-08-06 | 8100 |
| 5 | 2014-08-19 | 14744 |
| 6 | 2015-01-16 | 11505 |
| 7 | 2015-02-13 | 7450 |
| 8 | 2015-03-27 | 11480 |
| 9 | 2015-05-22 | 16600 |
| 10 | 2016-02-11 | 10000 |
| 11 | 2016-02-26 | 6900 |
| 12 | 2016-04-28 | 9000 |
| 13 | 2016-07-21 | 13700 |
| 14 | 2016-10-16 | 15000 |
| 15 | 2016-12-16 | 2900 |
| 16 | 2016-12-30 | 20000 |
| 17 | 2017-01-23 | 14000 |
| 18 | 2017-02-15 | 3000 |

Figure 1.1: Number of individuals released by date.

The purpose of the present analysis is to quantify the suppression rate of the mosquito density in the field resulting from the releases of sterile males, and to elaborate on the effectiveness of the SIT as a strategy to control the malaria vector Anopheles arabiensis.

1.1 Sterile pressure

The sterile pressure is a calculated variable that roughly estimates, on a logarithmic scale, the number of sterile individuals alive in the population at a given date, based on a hypothesised daily survival probability. It takes into account the number of sterile individuals released prior to the target date, and the time elapsed since each release. Let \(\pi\) be the daily survival probability. The number of sterile individuals alive at time \(t\) is: \[\begin{equation} z_\pi(t) = \sum_{i:\; t(i) < t} R_i \exp\big((t - t(i)) \, \log\pi\big) \end{equation}\] where \(R_i\) is the number of sterile individuals released in release \(i\), and the sum runs over the releases whose times \(t(i)\) are prior to the target time \(t\).

The sterile pressure \(P(t) = \log(z_\pi(t))\) is then calculated at each observation time. We have used \(\pi = 0.9\), which yields the temporal estimates of sterile population and pressure shown in Figure 1.2. Other values have been tested, without improved association.
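
As a rough illustration of this calculation, the following Python sketch computes the sterile population and the sterile pressure from the first three releases of Table 1.1. It assumes that each release decays geometrically with the number of whole days elapsed since its release date, in line with the description of "time elapsed since each release". The function names are hypothetical and are not taken from the analysis code.

```python
import math
from datetime import date

# First three release events from Table 1.1: (release date, number released).
RELEASES = [
    (date(2014, 5, 11), 12300),
    (date(2014, 5, 15), 9700),
    (date(2014, 5, 27), 8400),
]


def sterile_population(target, releases, daily_survival=0.9):
    """Estimated number of sterile males alive on the target date."""
    alive = 0.0
    for release_date, n_released in releases:
        days_elapsed = (target - release_date).days
        if days_elapsed > 0:  # only releases strictly before the target date
            alive += n_released * daily_survival ** days_elapsed
    return alive


def sterile_pressure(target, releases, daily_survival=0.9):
    """Log of the estimated sterile population (minus infinity if none alive)."""
    alive = sterile_population(target, releases, daily_survival)
    return math.log(alive) if alive > 0 else float("-inf")


if __name__ == "__main__":
    day = date(2014, 5, 16)
    print(sterile_population(day, RELEASES))
    print(sterile_pressure(day, RELEASES))
```

With a daily survival of 0.9, a release loses about half of its individuals in under a week, so the pressure at any date is dominated by the most recent release.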

Figure 1.2: Sterile population and corresponding pressure over time estimated as the log-number of sterile individuals in the population, assuming a daily survival probability of 0.9.

2 Exploratory analysis of the field survey data

Given the experimental design, where a series of releases of sterile males have been performed in a sit sector, we can appreciate the potential impact of the intervention in terms of:

  1. Difference in the population density in the sit sector, with respect to the other sectors.

    This requires the distinction between the variation that would be naturally expected between sectors and the excess variation that can be attributed to the impact of the intervention.

  2. Variation in the population density within the sit sector, as a function of the time since the last release of sterile males.

    Indeed, we can expect the population density to progressively drop following a release down to a minimum value from which it slowly recovers up to normal values after some time, if left alone.

Furthermore, variations in the population densities should reflect to different extents in adult captures, larvae surveys and swarmings.

In the present section we explore these two effects on the three types of surveys that have been conducted during the experiment.

2.1 Adult surveys

Figure 2.1 shows the number of adult Anopheles arabiensis captured in each trap, at each collection day, by sex and sector. Most often, the traps were empty, since they were actually resting sites and do not accumulate captures. Only the non-zero values, which make up 3% of the observations, are represented in the figure.

Figure 2.1: Number of individuals caught in adult traps, by survey date, sex, trap and sector, in a logarithmic scale. Only non-zero catches are represented, as the vast majority of surveys are 0. Sterile males were released in sector 3, highlighted in orange, at the intervention times represented with light vertical lines.

Still, most of the raw outcomes are clustered on the lower-end of the scale, with only a few extreme observations that are actually visible. Thus, we need to summarise these data from various angles in order to describe the patterns properly.

Consider the capture rates (CR), measured as the average number of adult Anopheles captured per trap in a single survey. Figure 2.2 shows the capture rate across traps and sexes by sector and date.

There seems to be an initial drop in the average catches in the first months of the study (late 2014) at the sit sector, reaching almost zero by 2015 and staying at a low level with some sporadic peaks in 2016 and 2017.

However, the individual variability is very large, with a vast majority of zeroes and a few larger values for some specific surveys. This produces very noisy averages with large standard deviations when large values occur.

Furthermore, the peaks and drops are not obviously associated with the release times.

Figure 2.2: Capture rate by survey date and sector, across sexes and traps. The sit sector is highlighted in orange with a band of ±1 SD. Vertical lines represent release times.

Figure 2.3 further summarises the data, aggregating the observations by year and displaying capture rates by year and sector.

Here again, the initial decline in the capture rate at the sit sector is apparent. Yet, the variations are still very high, and consistent with natural variation in the control groups.

Figure 2.3: Capture rate by year and sector, across sexes, traps and surveys. The sit sector is highlighted in orange with a band of ±1 SD. Vertical lines represent release times. Numbers at the right-hand side label the control sectors with highest capture rates.

Figure 2.4: Proportion of zeroes in catches by sector (points) and sex, across traps and time. Proportions in the sit sector are highlighted in orange.

Figure 2.5: Empirical distribution of the non-zero catches by sector and sex, across traps and time, in a logarithmic scale. Distributions from the sit sector are highlighted in orange.

From Figures 2.4 and 2.5, there does not seem to be evidence of any decrease in the abundance of the wild population of Anopheles arabiensis in the sit sector, beyond the natural variation across sectors.

However, the impact of the releases of sterile males could be limited in time, and thus not noticeable when outcomes are averaged across time.

We explore this question next, by looking at results as a function of time since the last release.

Figure 2.6: Proportion of zeroes in catches by sector and time since last release, across traps and sexes. Proportions in the sit sector are highlighted in orange.

There might be a trend, but the pattern is also consistent with random variation.

Figure 2.7: Distribution of non-zero catches by role and time since last release, across traps, control sectors and sexes. Distributions from the sit sector are highlighted in orange.

Figure 2.8: Average number of non-zero catches by role and time since last release, across traps, control sectors and sexes.

It is difficult to tell whether there is an impact of the release of sterile males with these few catches, even after aggregating time over blocks of 5 days.
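
The time since the last release and its aggregation into 5-day blocks can be sketched as follows in Python. This is only an illustration under stated assumptions: the release dates are the first three of Table 1.1, a survey on the same day as a release is counted as day 0, and the function names are hypothetical.

```python
from datetime import date

# First three release dates from Table 1.1.
RELEASE_DATES = [date(2014, 5, 11), date(2014, 5, 15), date(2014, 5, 27)]


def days_since_last_release(survey_date, release_dates):
    """Days elapsed since the most recent release on or before the survey date.

    Returns None when the survey precedes every release.
    """
    previous = [d for d in release_dates if d <= survey_date]
    if not previous:
        return None
    return (survey_date - max(previous)).days


def time_block(days, width=5):
    """Index of the block of `width` days that contains `days` (0 = first block)."""
    return days // width


if __name__ == "__main__":
    elapsed = days_since_last_release(date(2014, 5, 20), RELEASE_DATES)
    print(elapsed, time_block(elapsed))
```

Surveys conducted before the first release have no defined time since release and would need to be handled separately, for instance by excluding them from this part of the analysis.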

Figure 2.9: Relationship between catch and sterile pressure in the sit sector.

2.2 Larvae surveys

Figure 2.10 displays the larvae rates (number of larvae per dip) of each survey, by date and sector. Each line corresponds to one sector, and connects the larvae rates of the sector across time. In a few cases, 2 or 3 surveys were conducted in the same sector at the same date.

Figure 2.10: Larvae rate (number of larvae per dip) in breeding sites, by survey date (points) and sector (lines), in a logarithmic scale. The sit sector (#3) and the release dates are highlighted in orange.

Figure 2.11 further summarises the data, aggregating the observations by year and displaying average larvae rates by year and sector.

Here, the larvae rate in the sit sector remains more or less constant, and well within the range of variation of the control sectors.

Figure 2.11: Larvae rate (number of larvae per dip) in breeding sites, by year (points) and sector (lines), in a logarithmic scale. The sit sector (#3) is highlighted in orange.

Figure 2.12 explores the relationship between the larvae rate and the time since the last release.

There is no obvious pattern. There could be some effect in the first 5 days, but it is not conclusive.

Figure 2.12: Larvae rate by role and time since last release. The sit sector is highlighted in orange.

Figure 2.13: Relationship between larvae rates and sterile pressure in the sit sector.

2.3 Swarm surveys

Surveys in mating swarms have been conducted immediately after some of the releases. Thus, we can assume that all the released sterile males are available for capture, neglecting some short-term mortality.

Furthermore, at least 15 days (more typically, 50) passed between surveys, which makes captures of individuals released before the very last release event very unlikely.

Figure 2.14 displays the percentage of sterile males in the sample.

Figure 2.14: Proportion of sterile males in the swarming samples. The size of the points represents the sample size.

Figure 2.15 shows a basic estimate of population size at each of the swarming events, independently from each other.

A narrow band around the estimate shows an asymptotic estimation of the standard error. However, this is certainly a bad approximation due to the small sample sizes with results near or on the boundaries of the parameter space.

Figure 2.15: Chapman estimator of population size.

There are wild variations in the survey results, hardly attributable to real changes in the population size.

Take, for instance, the results from February 2016 shown in Table 2.1.

Table 2.1: Results from swarm surveys in February 2016.

| | 11 Feb. 2016 | 26 Feb. 2016 |
|---|---|---|
| N sterile males (SM) released | 10000 | 6900 |
| N SM captured | 3 | 122 |
| N wild males captured | 59 | 11 |
| Days since last survey | 264 | 15 |
| Percentage of SM | 5 | 92 |
| Adjusted sterile-wild male ratio¹ | 0 | 10 |
| Population estimate | 147515 | 617 |
| Standard Error | 1524 | 26 |

¹ Adjustment by adding 1 in both the numerator and denominator.
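
The population estimates in Table 2.1 can be reproduced with the following Python sketch. It treats the released sterile males as the marked group in the Chapman estimator and subtracts the number released to obtain the wild population. This reading is inferred from the reported figures rather than stated in the report, and the function names are hypothetical. The reported standard errors are not reproduced here.

```python
def chapman_total(n_marked, n_caught, n_recaptured):
    """Chapman estimate of the total population size (marked + unmarked)."""
    return (n_marked + 1) * (n_caught + 1) / (n_recaptured + 1) - 1


def wild_population_estimate(n_released, n_sterile_caught, n_wild_caught):
    """Chapman estimate of the wild population, excluding the released males."""
    n_caught = n_sterile_caught + n_wild_caught
    return chapman_total(n_released, n_caught, n_sterile_caught) - n_released


if __name__ == "__main__":
    print(round(wild_population_estimate(10000, 3, 59)))   # 11 Feb. 2016 -> 147515
    print(round(wild_population_estimate(6900, 122, 11)))  # 26 Feb. 2016 -> 617
```

With only 3 sterile males recaptured on 11 February, the denominator is tiny, so one sterile male more or less in the sample would shift the estimate by tens of thousands of individuals.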

In only 15 days, the swarming sample switched from being almost entirely (95%) composed of wild individuals to being almost entirely (92%) composed of sterile individuals.

It is possible that some of the sterile males released on the 11th survived 15 days and were captured on the 26th, inflating the proportion of sterile males in the second sample and biasing the estimate of the population down.

However, the most remarkable result is the population estimate of the 11th, which is the maximum of the series by an order of magnitude.

One explanation could be that the released sterile males did not reach the swarming spot for some reason. Perhaps the time between the release and the capture was too short, or the sterile mosquitoes were mostly attracted in another direction. Combined with the relatively high number of released individuals, the relatively few sterile males captured in the sample lead to an overly high estimate of the population.

There are obviously many sources of variation in these data, which make estimates quite unstable and unreliable. A more robust estimation method should take into account the whole series.

Nevertheless, the uncertainties will remain high, and it will be difficult to separate the impact of the intervention from the noise.

3 Modelling

3.1 Adult surveys

Let \(y\) be the number of Anopheles individuals of a specific sex captured in a trap during a survey. We assume that \(y\) follows a negative binomial distribution with a reciprocal dispersion parameter \(\phi\) (i.e. higher values of \(\phi\) imply reduced dispersion) and an expected value (capture rate) \(\mu\) which depends on several factors.

\[\begin{align} y & \sim \text{NB}(\mu, \phi) \\ \eta & = \log(\mu) = \alpha_0 + \beta_m \mathbb{1}_m + \mathbf{A}\pmb{\beta}_a + \beta_r \mathbb{1}_r + \beta_p\, P + u_s + u_t + s(t) \\ u_s & \sim \mathcal{N}(0, \sigma_u) \\ u_t & \sim \mathcal{N}(0, \sigma_t) \end{align}\]

where \(\alpha_0\) is a global intercept, interpreted as the average log-capture rate of female individuals in a control sector in the east bank at the beginning of the experiment; \(\beta_m\) is the additional effect of the male category of sex; \(\pmb{\beta}_a\) is a vector with the additional effects of the west and island areas and \(\mathbf{A} = (\mathbb{1}_w, \mathbb{1}_i)\) is the corresponding matrix of indicators; \(\beta_r\) is the additional effect of the sit role; \(\beta_p\) is the regression coefficient of the sterile pressure \(P\); \(u_s\) and \(u_t\) are the varying effects of the sector within an area and a role and of the trap within a sector, which are themselves variables with standard deviations \(\sigma_u\) and \(\sigma_t\) respectively; and \(s(t)\) is a smooth temporal effect that accounts for seasonal and global trends that affect all sectors equally.

We are mostly interested in \(\beta_r\), which can be interpreted as the log-average effect of the intervention, in terms of a multiplicative factor for the expected capture rate in the sit sector, with respect to a control sector in the same area. Note that this effect (of the role) is difficult to separate from the specific effect of the sector \(u_s\), for the sit sector. This is inevitable in this study design where the treatment is applied in a single sector. Nevertheless, the observations in other sectors allow the model to identify the expected dispersion \(\sigma_u\) of the sector effects, and to attribute any excess variation in the sit sector to the treatment.

We are also interested in the regression coefficient \(\beta_p\) which quantifies the relative impact of the size of the sterile population.

I have used weakly informative priors for the model parameters, following the approach and principles described in Section 9.5 of Gelman, Hill, and Vehtari (2021), with adaptations detailed below.

Specifically, I gathered a general idea of the expected ranges of variation in the log-capture rates by computing the mean \(m_0 \approx -3.3\) and standard deviation \(s_0 \approx 1.5\) of empirical log-capture rates by year and sector. These were used as parameters for the global intercept, after centring all the other predictors: \(\alpha_0^c \sim \mathcal{N}(m_0, 2.5\, s_0)\). The priors for the regression coefficients depended on the scale of variation of the corresponding variables as \(\beta_\cdot \sim \mathcal{N}(0, 2.5\,s_0 / \text{sd}(x))\). Since the standard deviations of the indicator variables associated with the categorical factors (i.e., sex, area and role) were all about 0.5, I used a common prior \(\mathcal{N}(0, 7)\) for all the corresponding coefficients, except for \(\beta_r\), which is specified below. The priors for the scales of the varying effects \(u_s\) and \(u_t\), as well as for the standard deviation of the smooth effect \(s(t)\), were \(\sigma_\cdot \sim \text{Exp}(1/s_0)\), which yields a prior marginal standard deviation of \(\sqrt{2}\,s_0\).

The prior for \(\beta_r\) is more specific, since it is a target parameter which is partially confounded with the varying effect of the third sector, as explained earlier. I used a Double-Exponential (a.k.a. Laplace) distribution, which provides more shrinkage and more closely resembles (although it is not identical to) the marginal prior for the competing sector effect. The scale parameter \(\sigma_r = s_0\) yields a prior standard deviation of \(\sqrt{2}\,s_0\). It has been chosen so that the prior standard deviations of the confounded effects match each other (Fig. 3.1).
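
The matching of the two prior standard deviations can be checked with a short derivation. This is a sketch that assumes \(\text{Exp}(1/s_0)\) denotes an exponential distribution with rate \(1/s_0\) (hence mean \(s_0\) and variance \(s_0^2\)), and that the Laplace distribution is parametrised by its scale \(\sigma_r\). For a sector effect \(u_s \mid \sigma_u \sim \mathcal{N}(0, \sigma_u)\), the law of total variance gives

\[\begin{align} \mathbb{V}[u_s] & = \mathbb{E}\big[\mathbb{V}[u_s \mid \sigma_u]\big] + \mathbb{V}\big[\mathbb{E}[u_s \mid \sigma_u]\big] = \mathbb{E}[\sigma_u^2] + 0 = s_0^2 + s_0^2 = 2\,s_0^2, \\ \mathbb{V}[\beta_r] & = 2\,\sigma_r^2 = 2\,s_0^2, \end{align}\]

so both effects have a prior standard deviation of \(\sqrt{2}\,s_0\), even though the shapes of the two marginal distributions differ.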

Figure 3.1: Prior effects of the sit treatment and the sector.

Finally, the prior for the reciprocal dispersion parameter \(\phi\) was also exponential, with rate at the corresponding empirical value. This parameter controls the dispersion in the scale of the data as \[ \mathbb{V}[y] = \mu + \mu^2 / \phi. \] Thus, I chose the rate parameter of the exponential prior for \(\phi\) as \(\lambda_0 = \frac{s_0^2 - m_0}{m_0^2} \approx 0.5.\)
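
The value of \(\lambda_0\) follows from rearranging the variance relation for \(1/\phi\) and substituting the empirical summaries. As a sketch, taking \(m_0 \approx -3.3\) and \(s_0^2 \approx 1.5^2\) in place of \(\mu\) and \(\mathbb{V}[y]\), as the expression for \(\lambda_0\) does:

\[ \frac{1}{\phi} = \frac{\mathbb{V}[y] - \mu}{\mu^2}, \qquad \lambda_0 = \frac{s_0^2 - m_0}{m_0^2} = \frac{2.25 + 3.3}{10.89} = \frac{5.55}{10.89} \approx 0.51. \]

Since an exponential distribution with rate \(\lambda_0\) has mean \(1/\lambda_0\), this corresponds to a prior mean of about 2 for \(\phi\).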

3.2 Larvae surveys

Let \(y\) be the number of larvae collected over \(n_d\) dips in a breeding site during a survey.

We assume that \(y\) follows a Log-normal distribution with a scale parameter \(\sigma\) and a location parameter \(\mu\) which depends on several factors.

\[\begin{align} y & \sim \text{LN}(\mu, \sigma) \\ \mu & = \log(n_d) + \alpha_0 + \mathbf{A}\pmb{\beta}_a + \beta_r \mathbb{1}_r + \beta_p\, P + u_s + s(t) \\ u_s & \sim \mathcal{N}(0, \sigma_u) \end{align}\]

where \(\alpha_0\) is a global intercept, interpreted as the average log-larvae rate in a control sector in the east bank at the beginning of the experiment; \(\pmb{\beta}_a\) is a vector with the additional effects of the west and island areas and \(\mathbf{A} = (\mathbb{1}_w, \mathbb{1}_i)\) is the corresponding matrix of indicators; \(\beta_r\) is the additional effect of the sit role; \(\beta_p\) is the regression coefficient of the sterile pressure \(P\); \(u_s\) is the varying effect of the sector within an area and a role, which is itself variable with standard deviation \(\sigma_u\); and \(s(t)\) is a smooth temporal effect that accounts for seasonal and global trends that affect all sectors equally.

We are mostly interested in \(\beta_r\), which can be interpreted as the log-average effect of the intervention, in terms of a multiplicative factor for the expected larvae rate in the sit sector, with respect to a control sector in the same area. Note that this effect (of the role) is difficult to separate from the specific effect of the sector \(u_s\), for the sit sector. This is inevitable in this study design where the treatment is applied in a single sector. Nevertheless, the observations in other sectors allow the model to identify the expected dispersion \(\sigma_u\) of the sector effects, and to attribute any excess variation in the sit sector to the treatment.

We are also interested in the regression coefficient \(\beta_p\) which quantifies the relative impact of the size of the sterile population.

I have used weakly informative priors for the model parameters, following the approach and principles described in Section 9.5 of Gelman, Hill, and Vehtari (2021), with adaptations detailed below.

Specifically, I gathered a general idea of the expected ranges of variation in the log-larvae rates by computing the mean \(m_0 \approx 1.3\) and standard deviation \(s_0 \approx 1.9\) of empirical log-larvae rates. These were used as parameters for the global intercept, after centring all the other predictors: \(\alpha_0^c \sim \mathcal{N}(m_0, 2.5\, s_0)\). The priors for the regression coefficients depended on the scale of variation of the corresponding variables as \(\beta_\cdot \sim \mathcal{N}(0, 2.5\,s_0 / \text{sd}(x))\). Since the standard deviations of the indicator variables associated with the area were about 0.5, I used a common prior \(\mathcal{N}(0, 10)\) for the corresponding coefficients. The priors for the scale of the varying effects \(u_s\), as well as for the standard deviation of the smooth effect \(s(t)\), were \(\sigma_\cdot \sim \text{Exp}(1/s_0)\), which yields a prior marginal standard deviation of \(\sqrt{2}\,s_0\).

The prior for \(\beta_r\) is more specific, since it is a target parameter which is partially confounded with the varying effect of the third sector, as explained earlier. I used a Double-Exponential (a.k.a. Laplace) distribution, which provides more shrinkage and more closely resembles (although it is not identical to) the marginal prior for the competing sector effect. The scale parameter \(\sigma_r = s_0\) yields a prior standard deviation of \(\sqrt{2}\,s_0\). It has been chosen so that the prior standard deviations of the confounded effects match each other (Fig. 3.2).

Figure 3.2: Prior effects of the sit treatment and the sector.

Finally, the prior for the scale parameter \(\sigma\) was also exponential, with rate at the corresponding empirical value. As \[ \mathbb{V}[\log(y)] = \sigma^2, \] I chose the rate parameter of the exponential prior for \(\sigma\) as \(\lambda_0 = 0.53 \approx 1/s_0.\)

3.3 Swarm surveys

Let \(n_i\) and \(m_i\) be respectively the number of sterile and wild individuals captured in swarm \(i\), shortly after the release of \(N_i\) sterile individuals in a population of size \(M_i\) at that time and sector.

Assuming that the captures of individuals are independent and equally likely, we have that the expected ratio of individuals from each group in the sample should equal the corresponding ratio in the population: \(\mathbb{E}[m_i/n_i] = M_i / N_i\), and thus, given the number \(n_i\) of sterile individuals captured in the swarming event \(i\), \[\begin{equation} \tag{3.1} \mathbb{E}[m_i \mid n_i] = n_i M_i / N_i \end{equation}\]

We assume a negative binomial model for the number of wild individuals given the number of sterile captures, \[\begin{equation} \tag{3.2} m_i \mid n_i,\,\pi_i \sim \text{NB}(n_i, \pi_i), \end{equation}\] with \(\pi_i = N_i / (M_i + N_i)\) the probability of capturing a sterile individual, equal to the proportion of sterile individuals in the mixed population.

Here we still rely on a probability that is likely close to the boundary of the parameter space.

With this parametrisation, the expected value of the outcome is \[ \mu = n_i (1 - \pi_i) / \pi_i = n_i \frac{M_i / (M_i + N_i)}{N_i / (M_i + N_i)} = n_i M_i / N_i, \] and the reciprocal dispersion parameter \(\phi\) is such that \[ \text{Var}(m_i \mid n_i) = n_i (1 - \pi_i) / \pi_i^2 = \mu + \mu^2 / \phi. \]
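
Solving the variance identity for \(\phi\) makes the correspondence explicit. The following short derivation uses only the expressions for the mean \(\mu = n_i (1 - \pi_i) / \pi_i\) and the variance of \(m_i \mid n_i\) given above:

\[ \frac{\mu^2}{\phi} = \frac{n_i (1 - \pi_i)}{\pi_i^2} - \frac{n_i (1 - \pi_i)}{\pi_i} = \frac{n_i (1 - \pi_i)^2}{\pi_i^2} = \frac{\mu^2}{n_i}, \qquad \text{hence} \qquad \phi = n_i. \]

In other words, under this model the reciprocal dispersion equals the number of sterile individuals captured, so samples with very few sterile captures are expected to be the most dispersed.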

The linear predictor is found through a logarithmic link function:

\[ \eta = \log(\mu) = \log(n_i / N_i) + \log(M_i) \]

and we model \(\log(M_i)\) with an offset given by the logarithm of the proportion of collected individuals from the total number released.
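
As a simple numerical illustration of the offset and of Equation (3.1), the following Python sketch inverts the expectation to obtain a plug-in value of \(M_i\) from a single swarm sample. This is not the Bayesian fit used in the analysis, only a moment-based approximation; the function names are hypothetical, the counts are taken from Table 2.1, and both counts must be strictly positive for the logarithms to exist.

```python
import math


def swarm_offset(n_sterile_caught, n_released):
    """Offset log(n_i / N_i): log-proportion of released males recaptured."""
    return math.log(n_sterile_caught / n_released)


def moment_population_estimate(n_wild_caught, n_sterile_caught, n_released):
    """Population size M_i obtained by equating m_i to n_i * M_i / N_i."""
    log_m = math.log(n_wild_caught) - swarm_offset(n_sterile_caught, n_released)
    return math.exp(log_m)


if __name__ == "__main__":
    print(moment_population_estimate(59, 3, 10000))   # 11 Feb. 2016
    print(moment_population_estimate(11, 122, 6900))  # 26 Feb. 2016
```

The two plug-in values differ by more than two orders of magnitude, which illustrates why the swarm events should not be estimated independently of each other.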

4 Results

4.1 Adult surveys

Figure 4.1: Posterior multiplicative effect of the sit intervention on the capture rate.

Figure 4.1 shows a relatively neutral global effect of the sit intervention on the expected capture rate.

As anticipated, this effect is negatively correlated with the effect of the third sector, as shown in Figure 4.2 in the logarithmic scale. The joint posterior reveals an abundance in the sit sector centred around the line of neutrality with respect to the controls, possibly with some compensation effect between the treatment and the sector effects.

Figure 4.2: Joint posterior distribution of the treatment effect and the effect of the sector 3, where the releases were performed.

Figure 4.3 shows the estimated posterior effect of the sterile pressure on the adult capture in the logarithmic scale, in contrast with the prior specification.

The value approximately represents the percent change in the capture rate for each percent increase in the sterile population.
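
This interpretation follows from the fact that both the capture rate and the sterile population enter the model in the logarithmic scale. As a brief derivation, with \(\eta = \log(\mu)\) and \(P = \log(z_\pi)\), and holding all other terms fixed:

\[ \frac{\partial \log \mu}{\partial \log z_\pi} = \beta_p, \qquad \text{so that} \qquad \mu \propto z_\pi^{\beta_p}. \]

A 1% increase in the sterile population therefore multiplies the expected capture rate by \(1.01^{\beta_p} \approx 1 + 0.01\,\beta_p\), that is, a change of approximately \(\beta_p\) percent.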

The results suggest a positive effect of the sterile pressure, meaning that a higher abundance, or at least a higher capture rate, is associated with higher volumes of sterile males.

Figure 4.3: Posterior vs. prior effect of the sterile pressure

However, this can be a consequence of a temporally-lagged effect of the sterile males. Indeed, Figure ?? shows the expected capture rate in the sit sector over time and the calculated sterile pressure scaled for comparison.

Nevertheless, the results here must be taken with great care, as the predictive performance of the model is very limited. Indeed, the posterior predictive mass is essentially concentrated at zero, as a result of the overwhelming number of zeros in the outcomes, resulting in large uncertainties and unreliable estimations (Fig. 4.4).

Figure 4.4: Posterior predictive means and 95% credible intervals vs. observed outcome for a sample of 400 observations.

In order to appreciate the impact of these effects, particularly that of the intervention, on the observed capture rate, Figure 4.5 displays the predicted capture rates under different scenarios.

Figure 4.5: Expected capture rates by sex and experimental role in a typical trap on a typical sector of the East bank, at day 300 of the experiment, with an average sterile pressure. Median and 95% credible intervals.

Figure 4.6: Expected capture rates with and without intervention by day of experiment and sex in a typical trap on a typical sector of the East bank, with an average sterile pressure. Median and 95% credible bands.

Figure 4.8 shows the relative variability of the sector effects in the logarithmic scale. Note the increased uncertainty of the third sector, due to the confounding with the sit intervention explained earlier.

Figure 4.7: Expected capture rates of adult males by sector, in the absence of intervention, for an intermediate sterile pressure (5) and day 300

Figure 4.8: Posterior effects of the sectors in the logarithmic scale. Not including the effect of the area.

Figure 4.9: Prior (in light blue) and posterior (grey) distribution for the reciprocal dispersion parameter of the negative binomial likelihood.

4.2 Larvae surveys

Figure 4.10: Posterior multiplicative effect of the sit intervention on the larvae rate.

Figure 4.10 shows a positive effect of the sit intervention on the expected larvae rate, suggesting that the abundance would be several times greater in the sit sector than it would be in the absence of treatment.

However, as in the case of the adult surveys, this effect is negatively correlated with the effect of the third sector, as shown in Figure 4.11 in the logarithmic scale. The joint posterior reveals a clearly increased abundance in the sit sector with respect to the controls, which the model seems to attribute to the treatment rather than to the sector. Still, there is considerable uncertainty and we cannot completely disentangle the two effects without imposing stricter assumptions.

Figure 4.11: Joint posterior distribution of the treatment effect and the effect of the sector 3, where the releases were performed.

Figure 4.12 shows the estimated posterior effect of the sterile pressure on the larvae rate in the logarithmic scale, in contrast with the prior specification.

The value approximately represents the percent change in the larvae rate for each percent increase in the sterile population.

The results suggest a nearly negligible effect of the sterile pressure, meaning that the abundance, or at least the larvae rate, increases very little, if anything, with the release of more sterile males.

Figure 4.12: Posterior vs. prior effect of the sterile pressure

In order to appreciate the impact of these effects, particularly that of the intervention, on the observed larvae rate, Figure 4.13 displays the predicted larvae rates under different scenarios.

Figure 4.13: Expected larvae rates by sex and experimental role in a typical trap on a typical sector of the East bank, at day 300 of the experiment, with an average sterile pressure. Median and 95% credible intervals.

Figure 4.14: Expected capture rates with and without intervention by day of experiment and sex in a typical trap on a typical sector of the East bank, with an average sterile pressure. Median and 95% credible bands.

Figure 4.16 shows the relative variability of the sector effects in the logarithmic scale. Note the increased uncertainty of the third sector, due to the confounding with the sit intervention explained earlier.

Figure 4.15: Expected larvae rates of adult males by sector, in the absence of intervention, for an intermediate sterile pressure (5) and day 300

Figure 4.16: Posterior effects of the sectors in the logarithmic scale. Not including the effect of the area.

Figure 4.17: Prior (in light blue) and posterior (grey) distribution for the scale parameter of the Log-normal likelihood.

4.3 Swarm surveys

Figure 4.18: Estimated population size over time on the sit sector. Posterior median and 95% credible band.

The uncertainties associated with the population estimates from the swarm data are overwhelmingly larger than the signal.

Nevertheless, we can say with some confidence that the wild population is relatively small, very likely of the order of hundreds of individuals.

5 Discussion of preliminary results

5.1 Adult surveys

Table 5.1 shows summary statistics by sector, pooled over all other factors (i.e. traps, date, sex).

This summary had been used in a preliminary analysis to identify sector 9 as the control sector of reference due to its similarity to the sit sector in terms of average and standard deviation of capture rates. However, these summaries change significantly after cleaning up the data (see the cleanup report) and sector number 9 is no longer the sector with the most similar capture rates.

Moreover, I doubt that selecting a control sector for its resemblance with respect to this particular measure is of any help. Instead, I’ll try to make the most of the data from all the control sectors, in order to understand the variability in a situation where the vast majority of outcomes are zero.

Table 5.1: Average and standard deviation (SD) of the capture rate (CR) by sector, with total number of traps and sampling units.
Area Sector Sampling Units N traps Avg. CR SD CR
East 1 9 27 0.01 0.15
East 2 9 27 0.01 0.12
East 3 10 30 0.08 0.67
Island 4 9 27 0.08 0.59
Island 5 14 42 0.03 0.34
Island 6 14 42 0.02 0.22
West 7 9 27 0.12 0.97
West 8 11 33 0.06 0.39
West 9 13 39 0.31 2.33

Table 5.2: Kruskal-Wallis (K-W) non-parametric tests for the homogeneity of the distribution of captures across years, by sector. The year columns give the Avg. CR (SD); the last two columns give the K-W test results.
Sector 2014 2015 2016 2017 H statistic p-value
1 0.05 (0.39) 0 (0.07) 0 (0.05) 0.01 (0.14) 14.8 0.0020
2 0.03 (0.23) 0.01 (0.12) 0 (0.07) 0.01 (0.09) 5.9 0.1187
3 0.38 (1.36) 0.04 (0.63) 0.04 (0.35) 0 (0.07) 122.6 <2e-16
4 0.16 (0.67) 0.08 (0.68) 0.01 (0.15) 0 (0) 28.4 3e-06
5 0.03 (0.23) 0.04 (0.41) 0 (0.06) 0.03 (0.22) 4.9 0.1791
6 0.04 (0.39) 0.02 (0.22) 0 (0) 0 (0) 14.7 0.0021
7 0.03 (0.22) 0.01 (0.13) 0.06 (0.52) 1.04 (2.93) 150.1 <2e-16
8 0.05 (0.41) 0.06 (0.44) 0.04 (0.23) 0.13 (0.53) 11.3 0.0100
9 0.13 (0.51) 0.47 (3.23) 0.2 (0.98) 0.06 (0.24) 1.7 0.6302

The preliminary analysis went on to test the hypothesis that the distribution of the number of individuals captured in surveys within each sector remained homogeneous across years, using the Kruskal-Wallis nonparametric test.

This analysis focused on the sit sector (#3) and the selected control sector (#9). The procedure identified significant differences in the distributions of capture rates across years in the sit sector, while the differences in the control sector were not significant. Together with the fact that the average capture rate decreased over the years in the sit sector (Fig. 5.1), this was interpreted as evidence that the intervention had a significant impact. However, the declining trend in the average capture rates of the sit sector is much less impressive when put in the context of its standard deviation.

Figure 5.1: Average capture rates (CR) in the sit and control sectors. Without (left) and with (right) standard deviation.

Here I performed the same analysis for all sectors, with the cleaned-up data. The distributions of capture rate across years in the sit sector are still significantly different according to Kruskal-Wallis and also according to a permutation test (not shown). Similarly, the corresponding distributions in sector #9 are again not significantly different.

However, several other control sectors (#1, #4, #6, #7, #8) display significant differences over the years. Only 3 control sectors (#2, #5, #9) are non-significant.

This suggests that this particular test is not reliably identifying the effect of the intervention, possibly due to the violation of some or all of the test assumptions:

  • Conditional independence of the observations. The temporal trends introduce autocorrelation. Moreover, the observations in 2014 are concentrated in the last two thirds of the year, whereas those in 2017 are concentrated in the first third. Seasonal effects can produce differences that are wrongly attributed to the year, and yet more wrongly to the intervention.

  • Homogeneity of distribution. The outcomes are assumed to follow a common distribution \(F\), possibly centred at different median values depending on the year. However, the proportion of zeros varies noticeably across years, so the shape of the distribution changes and not only its location.

    Table 5.3: Number of surveys with a given number of catches, by year, in sector 1.
    N catches 2014 2015 2016 2017
    0 224 828 724 240
    1 3 4 2 1
    2 2 0 0 1
    5 1 0 0 0

    In sector 1, for instance, 2.6% of the outcomes are greater than 0 in 2014, but only 0.5% in 2015, whereas the empirical medians are 0 for both years (Table 5.3).
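
The sector 1 figures can be checked directly from the counts in Table 5.3. The sketch below assumes that the test was applied to the per-survey catch counts, with the usual tie correction and the chi-squared approximation for the p-value; the function names are illustrative, and the closed-form tail probability is valid only for 3 degrees of freedom (four years).

```python
import math

# Number of surveys by number of catches and year in sector 1 (Table 5.3).
counts = {
    2014: {0: 224, 1: 3, 2: 2, 5: 1},
    2015: {0: 828, 1: 4},
    2016: {0: 724, 1: 2},
    2017: {0: 240, 1: 1, 2: 1},
}


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H from {group: {value: frequency}}."""
    # Pool the frequency of each distinct value over all groups.
    pooled = {}
    for freq in groups.values():
        for value, n in freq.items():
            pooled[value] = pooled.get(value, 0) + n
    n_total = sum(pooled.values())

    # Give every distinct value its mid-rank and accumulate the tie term.
    midrank, tie_term, ranked = {}, 0, 0
    for value in sorted(pooled):
        t = pooled[value]
        midrank[value] = ranked + (t + 1) / 2
        tie_term += t**3 - t
        ranked += t

    # Weighted squared deviations of the group mean ranks from (N + 1) / 2.
    deviation = 0.0
    for freq in groups.values():
        n_group = sum(freq.values())
        mean_rank = sum(midrank[v] * n for v, n in freq.items()) / n_group
        deviation += n_group * (mean_rank - (n_total + 1) / 2) ** 2

    h = 12 * deviation / (n_total * (n_total + 1))
    return h / (1 - tie_term / (n_total**3 - n_total))


def chi2_sf_3df(x):
    """Upper tail of the chi-squared distribution with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


h_stat = kruskal_wallis_h(counts)
p_value = chi2_sf_3df(h_stat)  # four years -> 3 degrees of freedom
print(f"H = {h_stat:.1f}, p-value = {p_value:.4f}")

# Share of surveys with at least one catch, by year.
for year, freq in counts.items():
    positive = 1 - freq[0] / sum(freq.values())
    print(f"{year}: {100 * positive:.1f}% of surveys with catches")
```

With these counts the sketch gives an H statistic of about 14.8 and a p-value of about 0.002, in agreement with the sector 1 row of Table 5.2, even though the median is 0 in every year: the test is reacting to changes in the small proportion of non-zero outcomes.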

In conclusion, whereas it seems clear that the distributions of capture rates within sectors are indeed variable over the years, this occurs in most sectors, even in control sectors far from the release area.

This precludes the hypothesis that the differences are due to the impact of the intervention.

Most likely, the variations are due to temporal autocorrelation, seasonal effects or other artefacts, and are present in all sectors; in some of them there were simply not enough data to detect them.

5.2 Larvae surveys

The larvae rates in the sit sector appear to drop from mid-2015 to 2017, below the majority of the observations in the control sectors (Fig. 2.10).

However, a single observation (on 2015-12-12, Table 5.4) with a low rate in a period of 18 months cannot justify, on its own, a claim about an underlying drop in the population.

Table 5.4: Larvae surveys in the sit sector. The only survey between May 2015 and December 2016 is the one on 2015-12-12.
Date Breeding site type N larvae N dips Larvae rate
2014-05-29 animal water pool 8 10 0.8
2014-06-09 animal water pool 17 3 5.7
2015-03-28 canal seepage 20 10 2.0
2015-04-13 animal water pool 31 3 10.3
2015-05-01 animal water pool 35 10 3.5
2015-12-12 canal seepage 1 5 0.2
2016-12-09 animal water pool 23 5 4.6
2017-01-23 animal water pool 13 10 1.3
2017-03-14 animal water pool 20 3 6.7
2017-04-14 animal water pool 12 6 2.0
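
The sparsity of the surveys can be verified from Table 5.4. The following sketch recomputes the larvae rates (larvae per dip) and the number of days between consecutive surveys; the variable names are illustrative.

```python
from datetime import date

# (date, N larvae, N dips) for the surveys in the sit sector (Table 5.4).
surveys = [
    (date(2014, 5, 29), 8, 10),
    (date(2014, 6, 9), 17, 3),
    (date(2015, 3, 28), 20, 10),
    (date(2015, 4, 13), 31, 3),
    (date(2015, 5, 1), 35, 10),
    (date(2015, 12, 12), 1, 5),
    (date(2016, 12, 9), 23, 5),
    (date(2017, 1, 23), 13, 10),
    (date(2017, 3, 14), 20, 3),
    (date(2017, 4, 14), 12, 6),
]

# Larvae rate: larvae collected per dip, rounded as in the table.
rates = [round(n_larvae / n_dips, 1) for _, n_larvae, n_dips in surveys]

# Days elapsed between consecutive surveys.
gaps = [
    (later[0] - earlier[0]).days
    for earlier, later in zip(surveys, surveys[1:])
]

for (day, _, _), rate in zip(surveys, rates):
    print(day.isoformat(), rate)
print("Longest gap between surveys (days):", max(gaps))
```

The survey on 2015-12-12 is separated from its neighbours by 225 and 363 days, so the apparent drop rests on that single data point.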

5.3 Swarm surveys

I have identified a few mistakes in the population estimates in the preliminary analyses. The Chapman estimator used there is:

\[ \frac{(K + 1) (n + 1)}{k + 1} - 1, \] where \(K\) is the total number of released sterile males; \(n\) is the total number of captured individuals (both sterile and wild); and \(k\) is the number of sterile males captured.

However, the results from the preliminary analyses used only the number of captured wild individuals (i.e. \(n - k\)) in place of \(n\), leading to incorrect estimates.

Moreover, the first term in the formula for the standard error of the estimator should be \((k + 1)\) instead of \((K + 1)\), although I have not been able to recover the exact numbers reported.

Finally, the Chapman estimator includes the released individuals as part of the population. In our experiment, the marked individuals are extraneous to the population, the number of individuals released varies considerably and the wild population density is not very high. I therefore consider it more appropriate to subtract the released number from the estimates, in order to obtain an estimate of the wild population size.
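
The points above can be sketched as follows. The function names and the recapture figures are hypothetical, chosen only to illustrate the arithmetic; they are not data from the swarm surveys.

```python
def chapman_total(released, captured, captured_marked):
    """Chapman estimate of the total population (wild plus released).

    released        -- K, number of marked sterile males released
    captured        -- n, total number captured (sterile and wild)
    captured_marked -- k, number of marked sterile males among the captured
    """
    return (released + 1) * (captured + 1) / (captured_marked + 1) - 1


def wild_population(released, captured, captured_marked):
    """Estimated wild population: Chapman total minus the released males."""
    return chapman_total(released, captured, captured_marked) - released


# Hypothetical recapture figures, for illustration only.
K, n, k = 10_000, 60, 50

print("total:", round(chapman_total(K, n, k)))
print("wild:", round(wild_population(K, n, k)))
# Using only the wild captures (n - k) in place of n, as in the preliminary
# analyses, gives a very different number.
print("with n - k:", round(chapman_total(K, n - k, k)))
```

With these made-up figures, substituting \(n - k\) for \(n\) changes the estimate by a large factor, which shows how sensitive the result is to the choice of inputs.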

6 Conclusions

  • The estimates from the model for the adult surveys suggest that the release of sterile males in the sit sector caused a reduction in the wild population.

    Indeed, after adjusting for a general temporal trend, and variation across areas, sectors, traps and sexes, the release times tend to coincide with higher capture rates, followed by a decrease in both sterile pressure and capture rates.

    However, the overwhelming proportion of zeroes in the observed outcomes and the large variability in the many factors at play make the predictions somewhat unreliable.

    The sit sector was located in the area where the wild population was the least abundant, making it more difficult to quantify precisely the relative changes.

  • The results from the larvae surveys suggest a positive effect of the intervention (i.e., an increased larvae rate due to the treatment), which contradicts the prior experimental expectations. Adding to the contradiction, additional sterile pressure does not seem to have any significant impact on the larvae rate.

  • Both the adult and larvae surveys present a U-shaped temporal trend. This can hardly be a seasonal effect, since the trend extends over almost 3 years.

    This can reflect either a real trend in mosquito abundance, for some unknown cause, or changes in experimental practices or protocol adherence over such a long period.

  • Both the adult and larvae surveys show a consistent picture of the area effects. This suggests that the wild population of Anopheles arabiensis is least abundant in the East area (where the sit sector is located), followed by the Island, while it is significantly more abundant in the West.

  • The swarm data is the only source of information that has the potential to provide estimates on the absolute abundance of the population.

    It allows us to confirm that the wild population in the sit sector is relatively small with respect to the number of sterile individuals released.

    This explains the enormous variability in the experimental outcomes, the overwhelming proportion of zeroes in the adult captures and, as a consequence, the difficulty of measuring the impact of the intervention on the wild population.

  • For the reasons explained above, with the present data I am not able to provide a reliable estimation of an effective suppression strategy in this sector.

Bibliography

Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. Analytical Methods for Social Research. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781139161879.