Experimentally confirmed human MTBC T cell epitope sequences were retrieved from the Immune Epitope Database on 24 April 2015. Only linear epitopes from the MTBC (ID: 77643) tested in human T cell assays, with no MHC restrictions, were selected (1,730 epitopes). Each epitope sequence was searched with BLASTP (ref. 91) against the reference strain (H37Rv) to obtain genomic coordinates. Epitopes with no coordinates in H37Rv, epitopes for which accurate coordinates could not be determined (because of multiple hits), and epitopes in repetitive regions such as PE/PPE genes, phage-related genes and transposases were excluded, yielding a final set of 1,226 epitopes. These epitopes are distributed across 304 antigens, and some of them overlap in sequence. For the sequence analysis, epitope alignments were obtained by concatenating all epitope sequences after removing sequence redundancy. Alignments of the non-epitope regions of antigens were obtained by excluding the regions described as epitopes from each respective antigen. To assess how other regions of the genome are evolving, alignments for essential and non-essential genes were also obtained (ref. 62).
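The redundancy removal and the construction of non-epitope regions described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 1-based inclusive coordinate convention and the example coordinates are all assumptions made for illustration. Overlapping epitope intervals within one antigen are merged so that each position is counted once, and the complement of the merged intervals gives the non-epitope regions of that antigen.

```python
from typing import List, Tuple

# Hypothetical convention: (start, end), 1-based and inclusive,
# expressed in the coordinate system of a single antigen.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping epitope intervals so no position is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def non_epitope_regions(
    merged: List[Interval], gene_start: int, gene_end: int
) -> List[Interval]:
    """Return the parts of [gene_start, gene_end] not covered by merged epitopes.

    Assumes `merged` is sorted, non-overlapping and lies within the gene.
    """
    gaps: List[Interval] = []
    cursor = gene_start
    for start, end in merged:
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= gene_end:
        gaps.append((cursor, gene_end))
    return gaps


# Made-up coordinates for one hypothetical antigen spanning positions 1..60.
epitopes = [(10, 20), (15, 30), (40, 50)]
merged = merge_intervals(epitopes)             # [(10, 30), (40, 50)]
gaps = non_epitope_regions(merged, 1, 60)      # [(1, 9), (31, 39), (51, 60)]
```

In this sketch the merged intervals would be concatenated across antigens to form the epitope alignment, and the gaps would form the non-epitope alignment. Whether coordinates are handled at the codon or nucleotide level is not stated in the text and is left open here.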
Alignments of epitopes, non-epitope regions of antigens, essential genes and non-essential genes were used to calculate pairwise dN/dS ratios for the L4.3/LAM, L4.6.1/Uganda, L4.10/PGG3 and L4.1.2/Haarlem sublineages. The dN/dS measures were calculated using all polymorphic sites within each sublineage and therefore reflect both within-sublineage substitutions and transient polymorphisms. Pairwise dN and dS values within each sublineage were calculated with the kaks function of the R package seqinr. To avoid undetermined pairwise dN/dS values caused by dN or dS being zero, a mean dN/dS was calculated per sequenced isolate by dividing its mean pairwise dN by its mean pairwise dS with respect to all other sequenced isolates within the same sublineage. Statistical differences between epitopes and non-epitope regions of antigens within each sublineage were assessed using Wilcoxon rank sum tests with continuity correction, as implemented in R version 3.2.2.
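The per-isolate averaging step can be sketched as follows. This is an illustrative Python sketch rather than the R code used in the study: it assumes the pairwise dN and dS values (for example, as produced by seqinr's kaks function) are already available as symmetric square matrices, and the function name and toy values are hypothetical. The ratio is taken after averaging, so a single pair with dS equal to zero does not make the isolate's value undefined.

```python
from typing import List

Matrix = List[List[float]]


def mean_dnds_per_isolate(dn: Matrix, ds: Matrix) -> List[float]:
    """For each isolate, divide its mean pairwise dN by its mean pairwise dS.

    dn[i][j] and ds[i][j] hold the pairwise values between isolates i and j
    of one sublineage; the diagonal (self-comparison) is ignored.
    """
    n = len(dn)
    if n < 2:
        raise ValueError("at least two isolates are needed")
    ratios: List[float] = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        mean_dn = sum(dn[i][j] for j in others) / len(others)
        mean_ds = sum(ds[i][j] for j in others) / len(others)
        # Only undefined if the isolate has dS == 0 against every other isolate.
        ratios.append(mean_dn / mean_ds if mean_ds > 0 else float("nan"))
    return ratios


# Toy values for three hypothetical isolates. The pair (1, 2) has dS == 0,
# so its pairwise dN/dS would be undefined, yet every isolate gets a ratio.
dn = [
    [0.0, 0.0, 0.002],
    [0.0, 0.0, 0.002],
    [0.002, 0.002, 0.0],
]
ds = [
    [0.0, 0.004, 0.004],
    [0.004, 0.0, 0.0],
    [0.004, 0.0, 0.0],
]
ratios = mean_dnds_per_isolate(dn, ds)
```

The resulting per-isolate values for the epitope and non-epitope alignments are what the rank sum test would then compare within each sublineage.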