Allelome.PRO, a pipeline to define allele-specific genomic features from high-throughput sequencing data Journal Article

Author(s): Andergassen, Daniel; Dotter, Christoph P; Kulinski, Tomasz M; Guenzl, Philipp M; Bammer, Philipp C; Barlow, Denise P; Pauler, Florian M; Hudson, Quanah J
Article Title: Allelome.PRO, a pipeline to define allele-specific genomic features from high-throughput sequencing data
Affiliation: IST Austria
Abstract: Detecting allelic biases from high-throughput sequencing data requires an approach that maximizes sensitivity while minimizing false positives. Here, we present Allelome.PRO, an automated user-friendly bioinformatics pipeline, which uses high-throughput sequencing data from reciprocal crosses of two genetically distinct mouse strains to detect allele-specific expression and chromatin modifications. Allelome.PRO extends approaches used in previous studies that exclusively analyzed imprinted expression to give a complete picture of the ‘allelome’ by automatically categorizing the allelic expression of all genes in a given cell type into imprinted, strain-biased, biallelic or non-informative. Allelome.PRO offers increased sensitivity to analyze lowly expressed transcripts, together with a robust false discovery rate empirically calculated from variation in the sequencing data. We used RNA-seq data from mouse embryonic fibroblasts from F1 reciprocal crosses to determine a biologically relevant allelic ratio cutoff, and define for the first time an entire allelome. Furthermore, we show that Allelome.PRO detects differential enrichment of H3K4me3 over promoters from ChIP-seq data, validating the RNA-seq results. This approach can be easily extended to analyze histone marks of active enhancers, or transcription factor binding sites, and therefore provides a powerful tool to identify candidate cis-regulatory elements genome-wide.
Journal Title: Nucleic Acids Research
Volume: 43
Issue: 21
ISSN: 0305-1048
Publisher: Oxford University Press
Date Published: 2015-12-02
Start Page: 1
End Page: 19
Copyright Statement: CC-BY 4.0
Sponsor: Austrian Science Fund [FWF P25185-B22, FWF F4302-B09, FWF W1207-B09]. Funding for open access charge: Austrian Science Fund.
DOI: 10.1093/nar/gkv727
Notes: We thank Florian Breitwieser for advice during the early stages of this project. High-throughput sequencing was conducted by the Biomedical Sequencing Facility (BSF) at CeMM in Vienna.
Open access: yes (OA journal)