Dominic Pearce, The Institute of Genetics and Molecular Medicine, The University of Edinburgh

2018-04-25

```
library(survivALL)
library(Biobase)
library(knitr)
```

To check that a prognostic association is reliable, and to provide a measure of its significance, *survivALL* can perform a non-parametric bootstrapping procedure. In short, we calculate, for each point-of-separation, a distribution of expected hazard ratios (HRs) against which we can compare the observed HRs in our analysis.

To achieve this, we randomly sample our survival data with replacement and then calculate survival statistics for all points-of-separation, exactly as we would for a biomarker under investigation. By repeating this procedure thousands or tens of thousands of times, we produce our distribution of *expected* hazard ratios.
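The idea can be sketched in base R on simulated data. This is only an illustration under stated assumptions: `crude_log_hr()` is a hypothetical helper that estimates each hazard as events divided by total follow-up time, not the calculation `allHR()` performs, and the toy data are simulated rather than taken from `nki_subset`.

```r
# Toy survival data (simulated; not the NKI data set)
set.seed(42)
n <- 30
toy_time  <- rexp(n, rate = 0.1)               # follow-up times
toy_event <- rbinom(n, size = 1, prob = 0.6)   # 1 = event, 0 = censored

# Hypothetical helper: crude log hazard ratio (high vs low group) at every
# point-of-separation, with hazard approximated as events / total follow-up.
# A stand-in for illustration only; it is NOT how allHR() works.
crude_log_hr <- function(measure, time, event) {
    ord   <- order(measure)        # rank samples by the measure
    time  <- time[ord]
    event <- event[ord]
    n     <- length(measure)
    vapply(seq_len(n - 1), function(k) {
        low    <- seq_len(k)       # samples at or below the separation point
        high   <- (k + 1):n        # samples above it
        d_low  <- sum(event[low])
        d_high <- sum(event[high])
        # undefined when either group has no events
        if (d_low == 0 || d_high == 0) return(NA_real_)
        log((d_high / sum(time[high])) / (d_low / sum(time[low])))
    }, numeric(1))
}

# One column per bootstrap repeat: a randomly resampled measure replaces
# the biomarker, so each column is one draw of "expected" log HRs
n_repeats <- 200
toy_bs <- replicate(
    n_repeats,
    crude_log_hr(sample(seq_len(n), replace = TRUE), toy_time, toy_event)
)
dim(toy_bs)
```

Each row of `toy_bs` then holds the distribution of expected values for one point-of-separation, which is the shape of data the rest of this section builds with the package itself.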

```
data(nki_subset)

# Bootstrapping data should be in the format of 1 repeat per column,
# with one row per sample
n_samples <- ncol(nki_subset)
bs_mtx <- matrix(nrow = n_samples, ncol = 20)

system.time(
    for (i in seq_len(ncol(bs_mtx))) {
        # a randomly resampled measure takes the place of a real biomarker
        bs_mtx[, i] <- allHR(measure = sample(seq_len(n_samples), replace = TRUE),
                             srv = pData(nki_subset),
                             time = "t.dmfs",
                             event = "e.dmfs")
    }
)
```

Timing in seconds: user 24.313, system 0.296, elapsed 24.632.

```
kable(bs_mtx[1:20, 1:5])
```

| Repeat 1 | Repeat 2 | Repeat 3 | Repeat 4 | Repeat 5 |
|---:|---:|---:|---:|---:|
| NA | NA | NA | NA | NA |
| -0.4014227 | NA | NA | -0.4576410 | NA |
| 0.2984494 | -0.4491361 | NA | 0.0792404 | -0.8907620 |
| -0.5303545 | 0.0990314 | 0.4857405 | 0.7285488 | NA |
| -0.1702408 | 0.6413651 | 0.8452752 | 1.0275861 | -0.7785881 |
| 0.1814099 | 0.9289172 | 1.1349544 | 0.0946139 | -0.3816460 |
| -0.1740861 | -0.0305109 | 0.1757499 | 0.3099642 | -0.9078913 |
| 0.0197442 | 0.3146530 | 0.4616417 | -0.2501668 | -0.4518784 |
| -0.3744622 | 0.5120723 | -0.0896996 | -0.4760801 | -0.3932269 |
| -0.1658159 | 0.6869374 | 0.0243721 | -0.3161967 | -0.1424456 |
| -0.0318268 | 0.8520492 | 0.0243721 | -0.1669160 | 0.1284438 |
| 0.1789310 | 1.0547973 | 0.1929272 | -0.3623228 | 0.3106466 |
| 0.2946136 | 1.1722670 | 0.4310035 | -0.2210120 | 0.4851842 |
| 0.3954521 | NA | 0.6013607 | -0.0617779 | 0.6461764 |
| 0.5589154 | NA | 0.7511898 | -0.0617779 | 0.8070691 |
| 0.6965117 | NA | 0.8487477 | -0.3131932 | 0.4366925 |
| 0.3651488 | NA | 0.9855458 | -0.1528381 | 0.5390877 |
| 0.2145058 | 1.1179661 | 1.0938748 | -0.0143247 | 0.6437619 |
| 0.0621976 | 1.2047041 | 0.7007348 | -0.1846323 | 0.6883963 |
| 0.1542357 | 1.2857949 | 0.7851356 | -0.1051644 | 0.8163849 |

Having calculated our bootstrapped data, we simply hand the matrix to either the `survivALL()` or `plotALL()` function (using the `bs_dfr =` argument), which handles the subsequent significance calculations. Note that bootstrapping is slow: at the rate observed above (roughly 1.2 seconds per repeat for 20 repeats), 10,000 repeats would take over three hours on comparable hardware.
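To give a sense of what comparing observed values against a bootstrapped distribution can involve, here is a conceptual base-R sketch. It is not the package's internal calculation: the simulated matrix `sim_bs`, the hypothetical `observed` vector and the choice of a 95% empirical interval are all assumptions made purely for illustration.

```r
set.seed(1)
n_points  <- 50     # points-of-separation (rows)
n_repeats <- 1000   # bootstrap repeats (columns)

# Simulated stand-in for a bootstrap matrix such as bs_mtx
sim_bs <- matrix(rnorm(n_points * n_repeats, sd = 0.5),
                 nrow = n_points, ncol = n_repeats)

# Hypothetical observed values, one per point-of-separation
observed <- seq(-1.5, 1.5, length.out = n_points)

# Empirical 95% interval of the expected values at each point-of-separation
# (assumed threshold; NAs are ignored)
bounds <- apply(sim_bs, 1, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
lower  <- bounds[1, ]
upper  <- bounds[2, ]

# Points-of-separation where the observed value lies outside what the
# bootstrap distribution would lead us to expect by chance
outside <- observed < lower | observed > upper
which(outside)
```

In practice these steps are left to `survivALL()` and `plotALL()`; the sketch only shows why one row per point-of-separation and one column per repeat is a convenient layout.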