Identification of Legionella effector domains

David Burstein; Francisco Amaro; Tal Zusman; Ziv Lifshitz; Ofir Cohen; Jack A Gilbert; Tal Pupko; Howard A Shuman; Gil Segal

This method is extracted from research article: Nat Genet, Jan 2016

Uncovering the Legionella genus effector repertoire - strength in diversity and numbers

DOI: 10.1038/ng.3481

Each Legionella effector ortholog group (LEOG) was represented by a hidden Markov model (HMM), constructed as follows. The proteins belonging to a given ortholog group were aligned with MAFFT⁷³ version 7.164b using the 'einsi' strategy. HMMs were then built from the multiple sequence alignments using hmmbuild from the HMMER suite⁷⁴ version 3.1b1.
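The two steps above (alignment, then HMM construction) can be sketched as a small Python wrapper. This is an illustrative sketch only, not the authors' pipeline: it assumes the MAFFT `einsi` wrapper script and `hmmbuild` are on the PATH, that each ortholog group is stored in its own FASTA file, and that `hmmbuild` is given the aligned FASTA directly. The function names and file-naming scheme are hypothetical.

```python
import subprocess
from pathlib import Path


def leog_commands(fasta, out_dir):
    """Return (mafft_cmd, hmmbuild_cmd, alignment_path, hmm_path) for one LEOG.

    File names are a hypothetical convention: <stem>.aln.fasta and <stem>.hmm.
    """
    fasta = Path(fasta)
    out_dir = Path(out_dir)
    aln = out_dir / (fasta.stem + ".aln.fasta")
    hmm = out_dir / (fasta.stem + ".hmm")
    # 'einsi' is MAFFT's wrapper for the E-INS-i strategy (assumed to be on PATH);
    # it writes the alignment to standard output.
    mafft_cmd = ["einsi", str(fasta)]
    # hmmbuild expects the output HMM file first, then the alignment file.
    hmmbuild_cmd = ["hmmbuild", str(hmm), str(aln)]
    return mafft_cmd, hmmbuild_cmd, aln, hmm


def build_leog_hmm(fasta, out_dir):
    """Align one ortholog group and build its HMM; returns the HMM path."""
    mafft_cmd, hmmbuild_cmd, aln, hmm = leog_commands(fasta, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with open(aln, "w") as handle:
        subprocess.run(mafft_cmd, stdout=handle, check=True)
    subprocess.run(hmmbuild_cmd, check=True)
    return hmm
```

Separating command construction from execution makes it possible to inspect or log the exact command lines before running them on every ortholog group.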

Characterized domains were identified by comparing LEOG HMMs to domain databases using hhsearch version 2.0.15 from the HH-suite⁷⁵. Specifically, hhsearch with an e-value threshold of 10⁻⁵ was used to find similarities between the LEOG HMMs and HMMs derived from the following databases: (1) NCBI's Conserved Domain Database (CDD)⁷⁶, (2) Pfam⁷⁷, and (3) SMART⁷⁸, all downloaded from the HH-suite ftp site (ftp://toolkit.genzentrum.lmu.de/pub/HH-suite/). The resulting hits were manually curated to filter out domains of unknown function and non-informative domains. Additional characterized domains were identified during the process of novel domain detection.
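As a rough illustration of the thresholding step, the sketch below filters already-parsed hits by e-value before manual curation. It does not parse hhsearch output; the `Hit` record and its fields are hypothetical, and treating the 10⁻⁵ threshold as inclusive is an assumption, since the text does not say whether the bound is strict.

```python
from dataclasses import dataclass

EVALUE_THRESHOLD = 1e-5  # threshold stated in the text


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one LEOG-HMM versus database-HMM match."""
    leog: str
    database: str  # e.g. "CDD", "Pfam" or "SMART"
    domain: str
    evalue: float


def significant_hits(hits, threshold=EVALUE_THRESHOLD):
    """Keep hits with e-value <= threshold (inclusive by assumption), best first."""
    kept = [h for h in hits if h.evalue <= threshold]
    return sorted(kept, key=lambda h: h.evalue)
```

The surviving hits would still need the manual curation described above; that step is not automated here.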

Novel domains were identified as follows. An all-against-all BLAST⁷⁹ search of all 5,885 putative Legionella effectors was performed with an e-value cutoff of 0.001. From the BLAST hits with a bit score > 40, we extracted maximal joined segments longer than 50 amino acids that were nearly non-overlapping (overlap < 10 amino acids). The extracted segments were searched using BLAST against the putative effector dataset with a bit-score threshold of 40. The hits of segments that had four or more hits were aligned and used to construct HMMs (as described above). These HMMs, representing conserved domains, were compared to each other using hhsearch. HMMs with a homology probability score ≥ 95% and an e-value < 0.01 across at least 50% of their length were designated as describing the same domain. The detected domain HMMs were scanned for coiled-coil domains using COILS⁸⁰, and domains that were ≥ 80% covered by coiled-coil domains were labeled as coiled-coil domains. The domain HMMs were further scanned against the HMM databases of CDD⁷⁶, Pfam⁷⁷, and SMART⁷⁸, and those with a homology probability score ≥ 95% and an e-value < 0.01 across at least 50% of their length were annotated according to the characterized domain (after excluding non-informative hits). The domain HMMs were then used to scan the putative effector dataset. A domain was considered a novel Legionella effector domain if it did not overlap any characterized domain and appeared in at least 80% of the members of two different ortholog groups, each composed of at least two putative effectors.
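The final decision rule in the paragraph above can be written as a short predicate. This is a minimal sketch under stated assumptions: the input structures (a set of effector IDs carrying the domain and a mapping from ortholog-group ID to member IDs) and the function name are hypothetical, and "two different ortholog groups" is read as "at least two".

```python
def is_novel_domain(domain_effectors, ortholog_groups, overlaps_characterized,
                    min_fraction=0.8, min_group_size=2, min_groups=2):
    """Apply the novel-domain criterion described in the text.

    domain_effectors: set of effector IDs in which the domain HMM was found.
    ortholog_groups: dict mapping group ID -> set of member effector IDs.
    overlaps_characterized: True if the domain overlaps a characterized domain.
    """
    if overlaps_characterized:
        return False
    supporting_groups = 0
    for members in ortholog_groups.values():
        # Groups with fewer than two putative effectors do not count.
        if len(members) < min_group_size:
            continue
        carrying = sum(1 for effector in members if effector in domain_effectors)
        # The domain must appear in at least 80% of the group's members.
        if carrying / len(members) >= min_fraction:
            supporting_groups += 1
    return supporting_groups >= min_groups
```

For example, a domain found in four of five members of one group and in both members of a second group satisfies the rule, whereas a singleton group never contributes support regardless of whether its member carries the domain.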

In the effector-domain network, each node represents an architecture, i.e., a combination of domains present in the same effector. An edge between two architecture nodes represents a domain shared by the two architectures. The size of each node is proportional to the number of putative effectors with the architecture represented by the node. The network was visualized using the igraph package⁸¹ of R⁸². The topology of the domain-architecture trees is that of the species tree built from 78 single-copy genes, as specified above. The trees were visualized using iTOL⁸³.
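The network definition above can be illustrated with a small data-structure sketch. The original network was drawn with igraph in R; the plain-Python version below only shows how nodes, node sizes, and edges could be derived. It assumes that an architecture is the unordered set of domains in an effector (ignoring order and repeats), that effectors without domains are skipped, and that one edge is emitted per shared domain. All names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_architecture_network(effector_domains):
    """Derive architecture nodes and shared-domain edges.

    effector_domains: dict mapping effector ID -> iterable of domain names.
    Returns (node_sizes, edges), where node_sizes maps each architecture
    (a frozenset of domains) to its number of effectors, and edges is a list
    of (architecture_a, architecture_b, shared_domain) tuples.
    """
    # Node size = number of putative effectors sharing the architecture.
    node_sizes = Counter(
        frozenset(domains) for domains in effector_domains.values() if domains
    )
    edges = []
    for arch_a, arch_b in combinations(node_sizes, 2):
        # One edge per domain shared by the two architectures (an assumption).
        for domain in sorted(arch_a & arch_b):
            edges.append((arch_a, arch_b, domain))
    return node_sizes, edges
```

The resulting node sizes and edge list could then be passed to any graph-drawing library, with node area scaled by the effector count.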
