2.1 Data Extraction and Setup

This section will explore how we take the data from the tapestri pipeline and process it in R. This was previously done with an intermediate step through Tapestri Insights, however R package development from the Mission Bio team has now obviated that step. For the manuscript, we indeed used Tapestri Insights, but I now recommend the appraoch here. One major file we’ll work with the numeric genotype matrix, which contains numeric values representing:

NGT.csv - genotype call converted to categorical (0-reference, 1-heterozygous mutation, 2-homozygous mutation, 3-unknown). Calls are made by GATK/Haplotypecaller.