Bootstrap-t Confidence Interval

We want to approximate the sampling distribution of a pivotal quantity with bootstrap distribution so that we could construct a confidence interval. The method is similar to approximating a sampling distribution of a pivotal quantity using a t-distribution.

The Step-By-Step Approach

  1. Given a sample ๐’ฎ, an attribute a(๐’ฎ), and standard error \(\\widehat {SD}\[\\tilde a(\\mathcal S)\]\), calculate a(๐’ฎ) and standard error \(\\widehat {SD}\[\\tilde a(\\mathcal S)\]\) based on the sample.

  2. Generate B bootstrap samples s1*,โ€†โ€ฆ,โ€†sB* from sample s with replacement.

  3. For each of the B bootstrap samples, calculate a(๐’ฎB*) and error \(\\widehat {SD}\[\\tilde a(\\mathcal S\_B^\*)\]\). Find the value of \(z\_B^\*=\\frac{a(\\mathcal S\_B^\*)-a(\\mathcal S)}{\\widehat {SD}\[\\tilde a(\\mathcal S\_B^\*)\]}\)

  4. From the estimates of the sampling distribution z1*,โ€†โ€ฆ,โ€†zB*, find the quantiles clowerโ€„=โ€„Qz(p/2) and cupperโ€„=โ€„Qz(1โ€…โˆ’โ€…p/2).

  5. The 100(1โ€…โˆ’โ€…p)% bootstrap-t confidence interval is \(\\left (a(\\mathcal S)-c\_{upper} \\widehat {SD}\[\\tilde a(\\mathcal S)\], (a(\\mathcal S)-c\_{lower} \\widehat {SD}\[\\tilde a(\\mathcal S)\] \\right )\)

  • Note the positions and signs of the clower and cupper in the above definition

Casualty Age Example

In this example, when a(๐’ซ) is average age of the casualty in shark encounters, the sampling distribution of the pivotal quantity with nโ€„=โ€„8 is shown below.

First, letโ€™s define some functions to help our future computation.

casualty <- read.csv("sharks.csv")
popCasualty <- rownames(casualty)

# Put n or N in the denominator of SD
# instead of n-1 or N-1
sdn <- function( y.pop ) {
N = length(y.pop)
sqrt( var(y.pop)*(N-1)/(N) ) }
se.avg <- function(y.sam) {
sdn(y.sam)/sqrt(length(y.sam)) * sqrt((65-length(y.sam))/(65-1) )
}

popSize <- function(pop) {
if (is.vector(pop))
{if (is.logical(pop))
  
## Assume TRUE values
sum(pop) else length(pop)}
else nrow(pop)
}

getSample <- function(pop, size, replace=FALSE) {
N <- popSize(pop)
pop[sample(1:N, size, replace = replace)]
}

We now generate our sampling distribution and the bootstrap estimates of the sampling distribution. Then, we can also calculate a(๐’ฎ), $\widehat {SD}[\tilde a(\mathcal S)]$, a(๐’ฎB*), $\widehat {SD}[\tilde a(\mathcal S_B^*)]$ and zB*.

M <- 10^5
n = 8
set.seed(3421134)
samples <- sapply(1:M, FUN =function(m) sample(popCasualty, n, replace = FALSE) )
avePop <- mean(casualty[, "Age"])
avesSamp <- apply(samples, MARGIN = 2,
FUN = function(s){mean(casualty[s,"Age"])})
SEaveSamp <- apply(samples, MARGIN = 2,
FUN = function(s){se.avg(casualty[s,"Age"])})
ZPop <- (avesSamp - mean(casualty[,"Age"]))/SEaveSamp

samCasualty <- sample(popCasualty, n, replace = FALSE)
samStar <- sapply(1:M, FUN = function(m) sample(samCasualty, n, replace = TRUE))
aveSam <- mean(casualty[samCasualty, "Age"])
avesBoot <- apply(samStar, MARGIN = 2, FUN = function(s) {
mean(casualty[s, "Age"])
})
SEaveBoot <- apply(samples, MARGIN = 2, FUN = function(s) {
se.avg(casualty[s, "Age"])
})
ZBoot <- (avesBoot - aveSam)/SEaveSamp

We have our answers. The Bootstrap-t confidence interval using $\widehat {SD}[\tilde a(\mathcal S)]$ for the average age of the casualty is

samCasualtyAges = casualty[samCasualty, "Age"]
zStar.lower = quantile(ZBoot, 0.025)
zStar.upper = quantile(ZBoot, 0.975)
round(mean(samCasualtyAges) - c(zStar.upper, zStar.lower) * se.avg(samCasualtyAges),
2)

## 97.5%  2.5% 
## 19.69 40.94

And the Bootstrap-t confidence interval using $\widehat {SD}_*[\tilde a(\mathcal S)]$ for the average casualty age is

round(mean(samCasualtyAges) - quantile(ZBoot, c(0.975, 0.025)) * sd(avesBoot),
2)

## 97.5%  2.5% 
## 19.04 41.55