Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Hierarchical Clustering of DNA K-Mer Counts in RNAseq Fastq Files Identifies Sample Heterogeneities

Version 1 : Received: 5 November 2018 / Approved: 7 November 2018 / Online: 7 November 2018 (14:24:53 CET)

A peer-reviewed article of this Preprint also exists.

Kaisers, W.; Schwender, H.; Schaal, H. Hierarchical Clustering of DNA k-mer Counts in RNAseq Fastq Files Identifies Sample Heterogeneities. Int. J. Mol. Sci. 2018, 19, 3687.

Abstract

We apply hierarchical clustering (HC) to DNA k-mer counts from multiple Fastq files. The tree structures produced by HC may reflect experimental groups and thereby indicate experimental effects, whereas clustering by preparation group indicates the presence of batch effects. Hence, HC of DNA k-mer counts may serve as an unspecific diagnostic device. To provide an easily applicable tool, we implemented sequential analysis of Fastq reads with low memory usage in an R package (seqTools) available on Bioconductor. The approach is validated by analysis of Fastq file batches containing RNAseq data. Analysis of three Fastq batches downloaded from ArrayExpress indicated experimental effects. Analysis of RNAseq data from two cell types (dermal fibroblasts and Jurkat cells) sequenced in our facility indicated the presence of batch effects. The observed batch effects were also present in reads mapped to the human genome and in reads filtered for high quality (Phred > 30). We propose that hierarchical clustering of DNA k-mer counts provides an unspecific diagnostic tool and a quality criterion for RNAseq experiments.
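
To make the idea concrete, the following is a minimal Python sketch of clustering samples by their k-mer profiles. It is not the seqTools implementation and does not use its API. The function names (`kmer_counts`, `profile`, `canberra`, `average_linkage`), the choice of Canberra distance with average linkage, the value k = 2, and the toy in-memory reads standing in for Fastq files are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, product


def kmer_counts(reads, k=2):
    """Count overlapping DNA k-mers in an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with N or other symbols
                counts[kmer] += 1
    return counts


def profile(counts, k=2):
    """Relative k-mer frequencies in a fixed order over all 4**k k-mers."""
    total = sum(counts.values()) or 1
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def canberra(u, v):
    """Canberra distance; terms with a zero denominator are skipped."""
    return sum(abs(a - b) / (a + b) for a, b in zip(u, v) if a + b > 0)


def average_linkage(profiles):
    """Agglomerative clustering; returns the tree as nested tuples of names."""
    clusters = {name: [name] for name in profiles}  # tree node -> member names

    def dist(pair):
        a, b = pair
        ds = [canberra(profiles[x], profiles[y])
              for x in clusters[a] for y in clusters[b]]
        return sum(ds) / len(ds)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=dist)  # closest pair of clusters
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


# Hypothetical toy samples: two "A" samples and two "B" samples.
samples = {
    "A1": ["ACGTACGTAC", "ACGTTACGTA"],
    "A2": ["ACGTACGTTA", "TACGTACGTA"],
    "B1": ["GGGCCCGGGC", "CCCGGGCCCG"],
    "B2": ["GGCCGGGCCC", "CCGGGCCCGG"],
}
profiles = {name: profile(kmer_counts(reads)) for name, reads in samples.items()}
tree = average_linkage(profiles)
print(tree)
```

In this toy input the two "A" samples and the two "B" samples end up in separate branches of the tree. Whether such a split reflects an experimental effect or a batch effect depends on whether the branches coincide with experimental groups or with preparation groups.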

Keywords

Hierarchical clustering; DNA; Fastq; HcKmer

Subject

Biology and Life Sciences, Biochemistry and Molecular Biology
