Biogazelle's publications

Non-coding after all: Biases in proteomics data do not explain observed absence of lncRNA translation products

Over the past decade, long non-coding RNAs (lncRNAs) have emerged as novel functional entities of the eukaryotic genome. However, the scientific community remains divided over the amount of true non-coding transcripts among the large number of unannotated transcripts identified by recent large scale and deep RNA-sequencing efforts. Here, we systematically exclude possible technical reasons underlying the absence of lncRNA-encoded proteins in mass spectrometry datasets, strongly suggesting that the large majority of lncRNAs is indeed not translated.

Verheggen et al.
Biases in proteomics data do not explain observed absence of lncRNA translation products

Benchmarking of RNA-sequencing analysis workflows using whole-transcriptome RT-qPCR expression data

RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, the question remains how individual methods perform at accurately quantifying gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon) and resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein coding genes. All methods showed high gene expression correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific gene set with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons, and were expressed at lower levels compared to genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq-based expression profiles for this specific gene set.

Everaert et al.

Zipper plot: visualizing transcriptional activity of genomic regions

Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative of transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool.

Avila Cobos et al.

Model based classification for digital PCR: your Umbrella for rain

Standard data analysis pipelines for digital PCR estimate the concentration of a target nucleic acid by digitizing the end-point fluorescence of the parallel micro-PCR reactions using an automated hard threshold. While it is known that misclassification has a major impact on the concentration estimate and substantially reduces accuracy, the uncertainty of this classification is typically ignored. We introduce a model-based clustering method to estimate the probability that the target is present (absent) in a partition conditional on its observed fluorescence and the distributional shape in no template control samples. This methodology acknowledges the inherent uncertainty of the classification and provides a natural measure of precision, both at individual partition level and at the level of the global concentration. We illustrate our method on genetically modified organism, inhibition, dynamic range and mutation detection experiments. We show that our method provides concentration estimates of similar accuracy or better than the current standard along with a more realistic measure of precision. The individual partition probabilities and diagnostic density plots further allow for some quality control. An R implementation of our method, called Umbrella, is available, providing a more objective and automated data analysis procedure for absolute dPCR quantification.
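
For context, the conventional hard-threshold estimate that the abstract contrasts with can be sketched as follows. This is a minimal illustration of the standard Poisson correction for digital PCR, not the Umbrella method itself; the function name, the example counts and the partition volume of 0.00085 µL are assumptions chosen for illustration only.

```python
import math


def dpcr_concentration(n_positive, n_total, partition_volume_ul):
    """Estimate target copies per microlitre from thresholded partition counts.

    Assumes targets are distributed over partitions according to a Poisson
    process, so the mean copies per partition is -ln(fraction negative).
    """
    if not 0 <= n_positive < n_total:
        raise ValueError("need 0 <= n_positive < n_total")
    fraction_positive = n_positive / n_total
    copies_per_partition = -math.log(1.0 - fraction_positive)
    return copies_per_partition / partition_volume_ul


# Hypothetical run: 5000 of 10000 partitions called positive,
# assumed partition volume of 0.00085 microlitre.
example_concentration = dpcr_concentration(5000, 10000, 0.00085)
```

Because every partition is forced into "positive" or "negative" before this formula is applied, any misclassification propagates directly into the estimate, which is the limitation the model-based approach above aims to address.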

Jacobs et al.

miSTAR: miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure

In microRNA (miRNA) target prediction, typically two levels of information need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single model structures insufficiently cope with this complex training data structure, consisting of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model, in which the influence of binding site context is separately modeled. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA–mRNA interactions, hereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction), displays a higher general performance and precision on top scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for evaluation of miRNA target prediction models and adopt a fairer and more stringent approach.

Van Peer et al.

Depletion of tRNA-halves enables effective small RNA sequencing of low-input murine serum samples

The ongoing ascent of sequencing technologies has enabled researchers to gain unprecedented insights into the RNA content of biological samples. MiRNAs, a class of small non-coding RNAs, play a pivotal role in regulating gene expression. The discovery that miRNAs are stably present in circulation has spiked interest in their potential use as minimally-invasive biomarkers. However, sequencing of blood-derived samples (serum, plasma) is challenging due to the often low RNA concentration, poor RNA quality and the presence of highly abundant RNAs that dominate sequencing libraries. In murine serum for example, the high abundance of tRNA-derived small RNAs called 5′ tRNA halves hampers the detection of other small RNAs, like miRNAs. We therefore evaluated two complementary approaches for targeted depletion of 5′ tRNA halves in murine serum samples. Using a protocol based on biotinylated DNA probes and streptavidin coated magnetic beads we were able to selectively deplete 95% of the targeted 5′ tRNA half molecules. This allowed an unbiased enrichment of the miRNA fraction resulting in a 6-fold increase of mapped miRNA reads and 60% more unique miRNAs detected. Moreover, when comparing miRNA levels in tumor-carrying versus tumor-free mice, we observed a three-fold increase in differentially expressed miRNAs.

Van Goethem et al.

Why non-coding RNA research for cancer is key

DNA is the hereditary code that is passed on from parents to their children. Every cell inside our body has the same code that contains the blueprint of life. If errors accumulate in the DNA of a given cell, it may lead to uncontrolled cellular growth and the formation of a tumor. During the act of reading the instructions coded in DNA, another chemical substance is produced in the cell, namely RNA. Until recently, these RNA molecules were thought to be translated into proteins, the building blocks of cells and our body. However, an entirely new class of RNA has emerged, outnumbering the classic RNA molecules that give rise to proteins. These new so-called long non-coding RNAs (lncRNAs) are not translated into proteins but still have crucial functions in health and disease, including cancer. Many of these lncRNAs are active in only very specific cell types or under very specific conditions; as such, they offer great untapped potential for developing innovative diagnostic and therapeutic applications.


Long non-coding RNA expression profiling in the NCI60 cancer cell line panel using high-throughput RT-qPCR

Long non-coding RNAs (lncRNAs) form a new class of RNA molecules implicated in various aspects of protein coding gene expression regulation. To study lncRNAs in cancer, we generated expression profiles for 1707 human lncRNAs in the NCI60 cancer cell line panel using a high-throughput nanowell RT-qPCR platform. We describe how qPCR assays were designed and validated and provide processed and normalized expression data for further analysis. Data quality is demonstrated by matching the lncRNA expression profiles with phenotypic and genomic characteristics of the cancer cell lines. This data set can be integrated with publicly available omics and pharmacological data sets to uncover novel associations between lncRNA expression and mRNA expression, miRNA expression, DNA copy number, protein coding gene mutation status or drug response.

Mestdagh et al.

Flexible analysis of digital PCR experiments using generalized linear mixed models

The use of digital PCR for quantification of nucleic acids is rapidly growing. A major drawback remains the lack of flexible data analysis tools. Published analysis approaches are either tailored to specific problem settings or fail to take into account sources of variability. We propose the generalized linear mixed models framework as a flexible tool for analyzing a wide range of experiments. We also introduce a method for estimating reference gene stability to improve accuracy and precision of copy number and relative expression estimates. We demonstrate the usefulness of the methodology on a complex experimental setup.

Vynck et al.

Melanoma addiction to the long non-coding RNA SAMMSON

Focal amplifications of chromosome 3p13–3p14 occur in about 10% of melanomas and are associated with a poor prognosis. The melanoma-specific oncogene MITF resides at the epicentre of this amplicon1. However, whether other loci present in this amplicon also contribute to melanomagenesis is unknown. Here we show that the recently annotated long non-coding RNA (lncRNA) gene SAMMSON is consistently co-gained with MITF. In addition, SAMMSON is a target of the lineage-specific transcription factor SOX10 and its expression is detectable in more than 90% of human melanomas. Whereas exogenous SAMMSON increases the clonogenic potential in trans, SAMMSON knockdown drastically decreases the viability of melanoma cells irrespective of their transcriptional cell state and BRAF, NRAS or TP53 mutational status. Moreover, SAMMSON targeting sensitizes melanoma to MAPK-targeting therapeutics both in vitro and in patient-derived xenograft models. Mechanistically, SAMMSON interacts with p32, a master regulator of mitochondrial homeostasis and metabolism, to increase its mitochondrial targeting and pro-oncogenic function. Our results indicate that silencing of the lineage addiction oncogene SAMMSON disrupts vital mitochondrial functions in a cancer-cell-specific manner; this silencing is therefore expected to deliver highly effective and tissue-restricted antimelanoma therapeutic responses.

Leucci et al.

Straightforward and sensitive RT-qPCR based gene expression analysis of FFPE samples

Fragmented RNA from formalin-fixed paraffin-embedded (FFPE) tissue is a known obstacle to gene expression analysis. In this study, the impact of RNA integrity, gene-specific reverse transcription and targeted cDNA preamplification was quantified in terms of reverse transcription quantitative polymerase chain reaction (RT-qPCR) sensitivity by measuring 48 protein coding genes on eight duplicate cultured cancer cell pellet FFPE samples and twenty cancer tissue FFPE samples. More intact RNA modestly increased gene detection sensitivity by 1.6-fold (earlier detection by 0.7 PCR cycles, 95% CI = 0.593–0.850). Application of gene-specific priming instead of whole transcriptome priming during reverse transcription further improved RT-qPCR sensitivity by a considerable 4.0-fold (earlier detection by 2.0 PCR cycles, 95% CI = 1.73–2.32). Targeted cDNA preamplification resulted in the strongest increase of RT-qPCR sensitivity and enabled earlier detection by an average of 172.4-fold (7.43 PCR cycles, 95% CI = 6.83–7.05). We conclude that gene-specific reverse transcription and targeted cDNA preamplification are adequate methods for accurate and sensitive RT-qPCR based gene expression analysis of FFPE material. The presented methods do not involve expensive or complex procedures and can be easily implemented in any routine RT-qPCR practice.
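
The fold changes quoted above follow from the cycle differences if each PCR cycle is assumed to double the amount of product. A minimal sketch of that conversion is given below; the function name and the `efficiency` parameter are illustrative assumptions, with `efficiency=1.0` standing for perfect doubling per cycle.

```python
def cycles_to_fold(delta_cq, efficiency=1.0):
    """Convert an earlier detection of `delta_cq` PCR cycles into a fold change.

    Assumes a constant amplification efficiency; 1.0 means the product
    doubles every cycle, giving fold = 2 ** delta_cq.
    """
    return (1.0 + efficiency) ** delta_cq


# Cycle differences reported in the abstract above.
reported_cycles = [0.7, 2.0, 7.43]
folds = [round(cycles_to_fold(c), 1) for c in reported_cycles]
```

Under the doubling assumption this reproduces the 1.6-fold, 4.0-fold and 172.4-fold values stated in the abstract.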

Zeka et al.
Boxplot analysis for comparison of gene expression levels in preamplified samples

RT-qPCR-based quantification of small non-coding RNAs

MicroRNAs (miRNAs) are small non-coding RNA molecules that negatively regulate messenger RNA (mRNA) translation into protein. MiRNAs play a key role in gene expression regulation, and their involvement in disease biology is well documented. This has fueled the development of numerous tools for the quantification of miRNA expression levels. These tools are based on three technologies: (microarray) probe hybridization, RNA sequencing, and reverse transcription quantitative polymerase chain reaction (RT-qPCR). In this chapter, we describe a quantification system based on RT-qPCR technology, which is currently considered the most sensitive, flexible, and accurate method for quantification of not only miRNA but also RNA expression in general. To this purpose, we have divided the protocol in three sections: reverse transcription (RT) reaction, optional preamplification (PA), and finally qPCR. Three quality control (QC) steps are implemented in this workflow for assessment of RNA extraction efficiency, sample purity (e.g., absence of inhibitors), and inter-run variations, by examining the detection level of different spike-in synthetic miRNAs. We conclude by demonstrating raw data preprocessing and normalization using expression data obtained from high-throughput miRNA profiling of human RNA samples.

Zeka F, Mestdagh P, Vandesompele J
Purified RNA concentration in serum from healthy individuals measured by NanoDrop 1000

The impact of disparate isolation methods for extracellular vesicles on downstream RNA profiling

Despite an enormous interest in the role of extracellular vesicles, including exosomes, in cancer and their use as biomarkers for diagnosis, prognosis, drug response and recurrence, there is no consensus on dependable isolation protocols. We provide a comparative evaluation of 4 exosome isolation protocols for their usability, yield and purity, and their impact on downstream omics approaches for biomarker discovery. OptiPrep density gradient centrifugation outperforms ultracentrifugation and ExoQuick and Total Exosome Isolation precipitation in terms of purity, as illustrated by the highest number of CD63-positive nanovesicles, the highest enrichment in exosomal marker proteins and a lack of contaminating proteins such as extracellular Argonaute-2 complexes. The purest exosome fractions reveal a unique mRNA profile enriched for translation, ribosome, mitochondrion and nuclear lumen function. Our results demonstrate that implementation of high purification techniques is a prerequisite to obtain reliable omics data and identify exosome-specific functions and biomarkers.

Van Deun J, Mestdagh P, Sormunen R, Cocquyt V, Vermaelen K, Vandesompele J, Bracke M, De Wever O, Hendrix A
Gene Set Enrichment Analysis of UC versus ODG

A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium

We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.

SEQC consortium, Vandesompele J, Hellemans J
probe expression as the sum of effects from transcripts with a probe-match

miRBase Tracker: keeping track of microRNA annotation changes

Since 2002, information on individual microRNAs (miRNAs), such as reference names and sequences, has been stored in miRBase, the reference database for miRNA annotation. As a result of progressive insights into the miRNome and its complexity, miRBase underwent addition and deletion of miRNA records, changes in annotated miRNA sequences and adoption of more complex naming schemes over time. Unfortunately, miRBase does not allow straightforward assessment of these ongoing miRNA annotation changes, which has resulted in substantial ambiguity regarding miRNA identity and sequence in public literature, in target prediction databases, and in content on various commercially available analytical platforms. As a result, correct interpretation, comparison and integration of miRNA study results are compromised, which we demonstrate here by assessing the impact of ignoring sequence annotation changes. To address this problem, we developed miRBase Tracker, an easy-to-use online database that keeps track of all historical and current miRNA annotation present in the miRBase database. Three basic functionalities allow researchers to keep their miRNA annotation up to date, reannotate analytical miRNA platforms and link published results with outdated annotation to the latest miRBase release. We expect miRBase Tracker to increase the transparency and annotation accuracy in the field of miRNA research.

Van Peer G, Lefever S, Anckaert J, Beckers A, Rihani A, Van Goethem A, Volders P-J, Zeka F, Ongenaert M, Mestdagh P, Vandesompele J
miRBase Tracker

Some cautionary notes on the petite “Holy Grail” of molecular diagnostics

The "Holy Grail" of molecular diagnostics is the sensitive and specific detection of a disease-associated stable biomarker in non-invasively acquired patient material. Without realizing it, we may actually have found it.

Vandesompele J, Mestdagh P

Evaluation of quantitative miRNA expression platforms in the microRNA quality control (miRQC) study

MicroRNAs are important negative regulators of protein-coding gene expression and have been studied intensively over the past years. Several measurement platforms have been developed to determine relative miRNA abundance in biological samples using different technologies such as small RNA sequencing, reverse transcription–quantitative PCR (RT-qPCR) and (microarray) hybridization. In this study, we systematically compared 12 commercially available platforms for analysis of microRNA expression. We measured an identical set of 20 standardized positive and negative control samples, including human universal reference RNA, human brain RNA and titrations thereof, human serum samples and synthetic spikes from microRNA family members with varying homology. We developed robust quality metrics to objectively assess platform performance in terms of reproducibility, sensitivity, accuracy, specificity and concordance of differential expression. The results indicate that each method has its strengths and weaknesses, which help to guide informed selection of a quantitative microRNA gene expression platform for particular study goals.

Mestdagh P, Hartmann N, Baeriswyl L, Andreasen D, Bernard N, Chen C, Cheo D, D'Andrade P, DeMayo M, Dennis L, Derveaux S, Y Feng, Fulmer-Smentek S, Gerstmayer B, Gouffon J, Grimley C, Lader E, Lee K Y, Luo S, Mouritzen P, Narayanan A, Patel S, Peiffer S,
Hierarchically clustered heatmap indicating miRNA concordance between all platform combinations

The need for transparency and good practices in the qPCR literature

Two surveys of over 1,700 publications whose authors use quantitative real-time PCR (qPCR) reveal a lack of transparent and comprehensive reporting of essential technical information. Reporting standards are significantly improved in publications that cite the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines, although such publications are still vastly outnumbered by those that do not.
Bustin S A, Benes V, Garson J A, Hellemans J, Hugget J, Kubista M, Mueller R, Nolan T, Pfaffl M W, Wittwer C T, Schjerling P, Day P J, Abreu M, Aguado B, Beaulieu J-F, Beckers A, Bogaert S, Browne J A, Carrasco-Ramiro F, Ceelen L, Ciborowski K, Cornillie
MIQE impact on commercial assays used in 2012–2013 publications

Single-Nucleotide Polymorphisms and Other Mismatches Reduce Performance of Quantitative PCR Assays

Background: Genome-sequencing studies have led to an immense increase in the number of known single-nucleotide polymorphisms (SNPs). Designing primers that anneal to regions devoid of SNPs has therefore become challenging. We studied the impact of one or more mismatches in primer-annealing sites on different quantitative PCR (qPCR)-related parameters, such as quantitative cycle (Cq), amplification efficiency, and reproducibility.

Methods: We used synthetic templates and primers to assess the effect of mismatches at primer-annealing sites on qPCR assay performance. Reactions were performed with 5 commercially available master mixes. We studied the effects of the number, type, and position of priming mismatches on Cq value, PCR efficiency, reproducibility, and yield.

Results: The impact of mismatches was most pronounced for the number of mismatched nucleotides and for their distance from the 3′ end of the primer. In addition, having 4 mismatches in a single primer or having 3 mismatches in one primer and 2 in the other was required to block a reaction completely. Finally, the degree of the mismatch effect was concentration independent for single mismatches, whereas concentration independence failed at higher template concentrations as the number of mismatches increased.

Conclusions: Single mismatches located 5 bp from the 3′ end have a moderate effect on qPCR amplification and can be tolerated. This finding, together with the concentration independence for single mismatches and the complete blocking of the PCR reaction for 3 mismatches, can help to chart mismatch behavior in qPCR reactions and increase the rate of successful primer design for sequences with a high SNP density or for homologous regions of sequence.

Lefever S, Pattyn F, Hellemans J, Vandesompele J
dCq values between results for perfect-match (PM) and mismatch (MM) reactions (rxns) as a function of MM type and master mix

Effective Alu Repeat Based RT-qPCR Normalization in Cancer Cell Perturbation Experiments

Background: Measuring messenger RNA (mRNA) levels using the reverse transcription quantitative polymerase chain reaction (RT-qPCR) is common practice in many laboratories. Using a specific set of mRNAs as internal control reference genes is considered the preferred strategy to normalize RT-qPCR data. Proper selection of reference genes is a critical issue, especially in cancer cells that are subjected to different in vitro manipulations. These manipulations may result in dramatic alterations in gene expression levels, even of assumed reference genes. In this study, we evaluated the expression levels of 11 commonly used reference genes as internal controls for normalization of 19 experiments that include neuroblastoma, T-ALL, melanoma, breast cancer, non-small cell lung cancer (NSCL), acute myeloid leukemia (AML), prostate cancer, colorectal cancer, and cervical cancer cell lines subjected to various perturbations.

Results: The geNorm algorithm in the software package qbase+ was used to rank the candidate reference genes according to their expression stability. We observed that the stability of most of the candidate reference genes varies greatly in perturbation experiments. Expressed Alu repeats show relatively stable expression regardless of experimental condition. These Alu repeats are ranked among the best reference assays in all perturbation experiments and display acceptable average expression stability values (M < 0.5).

Conclusions: We propose the use of Alu repeats as a reference assay when performing cancer cell perturbation experiments.
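
The expression stability value M referred to above can be illustrated with a short sketch. It follows the published geNorm definition (for each candidate, the average over all other candidates of the standard deviation of their log2 expression ratios across samples); it is not the qbase+ implementation, and the function name, gene labels and toy quantities are hypothetical.

```python
import math
from statistics import mean, stdev


def genorm_m(expression):
    """Return the stability value M per gene (lower M means more stable).

    `expression` maps gene name -> list of relative quantities on a linear
    scale, with all lists in the same sample order (at least two samples).
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Pairwise variation: SD of the log2 ratio across samples.
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_sd)
    return m_values


# Toy data: refA and refB keep a constant ratio, refC drifts against both.
toy = {"refA": [1.0, 2.0, 4.0], "refB": [2.0, 4.0, 8.0], "refC": [1.0, 1.0, 1.0]}
m = genorm_m(toy)
```

In this toy example refA and refB obtain the lowest M because their ratio is constant across samples. The stepwise exclusion of the least stable gene that geNorm uses to produce a ranking is left out to keep the sketch short.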

Rihani A, Van Maerken T, Pattyn F, Van Peer G, Beckers A, De Brouwer S, Kumps C, Mets E, Van der Meulen J, Rondou P, Leonelli C, Mestdagh P, Speleman F, Vandesompele J
Average expression stability values of the reference genes

Guidelines for Minimum Information for Publication of Quantitative Digital PCR Experiments

This report addresses known requirements for dPCR that have already been identified during this early stage of its development and commercial implementation. Adoption of these guidelines by the scientific community will help to standardize experimental protocols, maximize efficient utilization of resources, and enhance the impact of this promising new technology.

Hugget J, Foy C A, Benes V, Emslie K, Garson J A, Haynes R, Hellemans J, Kubista M, Mueller R, Nolan T, Pfaffl M W, Shipley G L, Vandesompele J, Wittwer C T, Bustin S A
Example of data output from a droplet dPCR instrument (Bio-Rad QX100)

Accurate RT-qPCR gene expression analysis on cell culture lysates

Gene expression quantification on cultured cells using the reverse transcription quantitative polymerase chain reaction (RT-qPCR) typically involves an RNA purification step that limits sample processing throughput and precludes parallel analysis of large numbers of samples. An approach in which cDNA synthesis is carried out on crude cell lysates instead of on purified RNA samples can offer a fast and straightforward alternative. Here, we evaluate such an approach, benchmarking Ambion’s Cells-to-CT kit with the classic workflow of RNA purification and cDNA synthesis, and demonstrate its good accuracy and superior sensitivity.

Van Peer G, Mestdagh P, Vandesompele J
DNAse treatment

miRNA expression profiling - from reference genes to global mean normalization

MicroRNAs (miRNAs) are an important class of gene regulators, acting on several aspects of cellular function such as differentiation, cell cycle control, and stemness. These master regulators constitute an invaluable source of biomarkers, and several miRNA signatures correlating with patient diagnosis, prognosis, and response to treatment have been identified. Within this exciting field of research, whole-genome RT-qPCR-based miRNA profiling in combination with a global mean normalization strategy has proven to be the most sensitive and accurate approach for high-throughput miRNA profiling (Mestdagh et al., Genome Biol 10:R64, 2009). In this chapter, we summarize the power of the previously described global mean normalization method in comparison to the multiple reference gene normalization method using the most stably expressed small RNA controls. In addition, we compare the original global mean method to a modified global mean normalization strategy based on the attribution of equal weight to each individual miRNA during normalization. This modified algorithm is implemented in Biogazelle’s qbasePLUS software and is presented here for the first time.
D’haene B, Mestdagh P, Hellemans J, Vandesompele J
Average fold change expression difference of each miRNA in neuroblastoma with respect to the MYCN amplification status

The microRNA body map: dissecting microRNA function through integrative genomics

While a growing body of evidence implicates regulatory miRNA modules in various aspects of human disease and development, insights into specific miRNA function remain limited. Here, we present an innovative approach to elucidate tissue-specific miRNA functions that goes beyond miRNA target prediction and expression correlation. This approach is based on a multi-level integration of corresponding miRNA and mRNA gene expression levels, miRNA target prediction, transcription factor target prediction and mechanistic models of gene network regulation. Predicted miRNA functions were either validated experimentally or compared to published data. The predicted miRNA functions are accessible in the miRNA bodymap, an interactive online compendium and mining tool of high-dimensional newly generated and published miRNA expression profiles. The miRNA bodymap enables prioritization of candidate miRNAs based on their expression pattern or functional annotation across tissue or disease subgroup. The miRNA bodymap project provides users with a single one-stop data-mining solution and has great potential to become a community resource.

Mestdagh et al.
Mechanistic models of miRNA-directed gene expression regulation

Measurable impact of RNA quality on gene expression results from quantitative PCR

Compromised RNA quality is suggested to lead to unreliable results in gene expression studies. Therefore, assessment of RNA integrity and purity is deemed essential prior to including samples in the analytical pipeline. This may be of particular importance when diagnostic, prognostic or therapeutic conclusions depend on such analyses. In this study, the comparative value of six RNA quality parameters was determined using a large panel of 740 primary tumour samples for which real-time quantitative PCR gene expression results were available. The tested parameters comprise the microfluidic capillary electrophoresis based 18S/28S rRNA ratio and RNA Quality Index value, the HPRT1 5′–3′ difference in quantification cycle (Cq) and HPRT1 3′ Cq value based on a 5′/3′ ratio mRNA integrity assay, the Cq value of expressed Alu repeat sequences and a normalization factor based on the mean expression level of four reference genes. Upon establishment of an innovative analytical framework to assess impact of RNA quality, we observed a measurable impact of RNA quality on the variation of the reference genes, on the significance of differential expression of prognostic marker genes between two cancer patient risk groups, and on risk classification performance using a multigene signature. This study forms the basis for further rational assessment of reverse transcription quantitative PCR based results in relation to RNA quality.

Vermeulen J, De Preter K, Lefever S, Nuytens J, De Vloed F, Derveaux S, Hellemans J, Speleman F, Vandesompele J
The effect of RNA quality on the significance of differential expression of a marker gene between tumours from two risk groups of neuroblastoma patients

RNA pre-amplification enables large-scale RT-qPCR gene-expression studies on limiting sample amounts

Vermeulen J, Derveaux S, Lefever S, De Smet E, De Preter K, Yigit N, De Paepe A, Pattyn F, Speleman F, Vandesompele J
Preservation of differential expression after pre-amplification

External oligonucleotide standards enable cross laboratory comparison and exchange of real-time quantitative PCR data

The quantitative polymerase chain reaction (qPCR) is widely utilized for gene expression analysis. However, the lack of robust strategies for cross-laboratory data comparison hinders collaboration and large multicentre studies. In this study we introduced and validated a workflow that employs universally applicable, quantifiable external oligonucleotide standards to address this problem. Using the proposed standards and data-analysis procedure, we obtained a perfect concordance between expression values from eight different genes in 366 patient samples measured on three different qPCR instruments and matching software, reagents, plates and seals, demonstrating the power of this strategy to detect and correct inter-run variation and to enable exchange of data between different laboratories, even when not using the same qPCR platform.

Vermeulen J, Pattyn F, De Preter K, Vercruysse L, Derveaux S, Mestdagh P, Lefever S, Hellemans J, Speleman F, Vandesompele J

A novel and universal method for microRNA RT-qPCR data normalization

Gene expression analysis of microRNA molecules is becoming increasingly important. In this study we assess the use of the mean expression value of all expressed microRNAs in a given sample as a normalization factor for microRNA real-time quantitative PCR data and compare its performance to the currently adopted approach. We demonstrate that the mean expression value outperforms the current normalization strategy in terms of better reduction of technical variation and more accurate appreciation of biological changes.

Mestdagh P, Van Vlierberghe P, De Weer A, Muth D, Westermann F, Speleman F, Vandesompele J
Cumulative distribution of miRNA coefficient of variation (CV) values
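
The global mean normalization described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the data layout (one dictionary of Cq values per sample) and the Cq cutoff of 35 used to decide which miRNAs count as "expressed" are all assumptions made for the example.

```python
from statistics import mean


def global_mean_normalize(cq_by_mirna, cq_cutoff=35.0):
    """Normalize one sample's miRNA Cq values to the sample's mean Cq.

    cq_by_mirna: dict mapping miRNA name -> Cq value (None if undetected).
    cq_cutoff:   hypothetical threshold; miRNAs at or above it are treated
                 as not expressed and are left out (an assumption).
    Returns a dict of delta-Cq values (Cq minus the sample mean).
    """
    expressed = {
        name: cq
        for name, cq in cq_by_mirna.items()
        if cq is not None and cq < cq_cutoff
    }
    if not expressed:
        raise ValueError("no expressed miRNAs in this sample")
    # The normalization factor is the mean Cq of all expressed miRNAs.
    sample_mean = mean(expressed.values())
    return {name: cq - sample_mean for name, cq in expressed.items()}


# Hypothetical sample: three expressed miRNAs and one above the cutoff.
sample = {"miR-a": 20.0, "miR-b": 24.0, "miR-c": 28.0, "miR-d": 38.0}
normalized = global_mean_normalize(sample)
# normalized == {"miR-a": -4.0, "miR-b": 0.0, "miR-c": 4.0}
```

Because the normalization factor is computed per sample from the data themselves, no separate set of control small RNAs is needed in this sketch; whether that suits a given experiment depends on how many miRNAs are profiled.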

The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments

BACKGROUND: Currently, a lack of consensus exists on how best to perform and interpret quantitative real-time PCR (qPCR) experiments. The problem is exacerbated by a lack of sufficient experimental detail in many publications, which impedes a reader’s ability to evaluate critically the quality of the results presented or to repeat the experiments.

CONTENT: The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines target the reliability of results to help ensure the integrity of the scientific literature, promote consistency between laboratories, and increase experimental transparency. MIQE is a set of guidelines that describe the minimum information necessary for evaluating qPCR experiments. Included is a checklist to accompany the initial submission of a manuscript to the publisher. By providing all relevant experimental conditions and assay characteristics, reviewers can assess the validity of the protocols used. Full disclosure of all reagents, sequences, and analysis methods is necessary to enable other investigators to reproduce results. MIQE details should be published either in abbreviated form or as an online supplement.

Bustin S A, Benes V, Garson J A, Hellemans J, Huggett J, Kubista M, Mueller R, Nolan T, Pfaffl M W, Shipley G L, Vandesompele J, Wittwer C T

Standardization of real-time PCR gene expression data from independent biological replicates

Gene expression analysis by quantitative reverse transcription PCR (qRT–PCR) allows accurate quantification of messenger RNA (mRNA) levels across different samples. Corrective methods for different steps in the qRT–PCR reaction have been reported; however, statistical analysis and presentation of substantially variable biological repeats present problems and are often not meaningful, for example, in a biological system such as mouse embryonic stem cell differentiation. Based on a series of sequential corrections, including log transformation, mean centering, and autoscaling, we describe a robust and powerful standardization method that can be used on highly variable data sets to draw statistically reliable conclusions.

Willems E, Leyns L, Vandesompele J
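
The three sequential corrections named in this abstract (log transformation, mean centering, autoscaling) can be sketched as below. This is an illustrative reading under stated assumptions, not the published procedure: the function name is hypothetical, base-10 logarithms are assumed, and each biological replicate is centered and scaled using its own mean and sample standard deviation.

```python
import math
from statistics import mean, stdev


def standardize_replicate(relative_quantities):
    """Standardize the expression values of one biological replicate.

    relative_quantities: positive expression values of one gene measured
    across conditions within a single replicate experiment.
    """
    # 1. Log transformation (base 10 is an assumption of this sketch).
    logged = [math.log10(q) for q in relative_quantities]
    # 2. Mean centering: subtract the replicate's own mean.
    replicate_mean = mean(logged)
    centered = [x - replicate_mean for x in logged]
    # 3. Autoscaling: divide by the replicate's own standard deviation.
    replicate_sd = stdev(centered)
    if replicate_sd == 0:
        raise ValueError("replicate shows no variation across conditions")
    return [x / replicate_sd for x in centered]


# Two hypothetical replicates with the same trend but different baselines.
replicate_1 = standardize_replicate([1.0, 10.0, 100.0])
replicate_2 = standardize_replicate([2.0, 20.0, 200.0])
```

In this example both replicates end up with nearly identical standardized profiles, which is the point of the corrections: differences in baseline between replicates are removed before the replicates are compared or averaged.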

qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data

Although quantitative PCR (qPCR) is becoming the method of choice for expression profiling of selected genes, accurate and straightforward processing of the raw measurements remains a major hurdle. Here we outline advanced and universally applicable models for relative quantification and inter-run calibration with proper error propagation along the entire calculation track. These models and algorithms are implemented in qBase, a free program for the management and automated analysis of qPCR data.

Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

Background: Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.

Results: We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data. 

Conclusions: The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.

Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F
Pairwise variation (Vn/n+1) analysis between the normalization factors NFn and NFn+1 to determine the number of control genes required for accurate normalization (arrowhead = optimal number of control genes for normalization)
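
The geometric-mean normalization factor and the pairwise variation between NFn and NFn+1 mentioned above can be sketched as follows. The function names and input layout are hypothetical, and the definition of pairwise variation used here (the standard deviation, across samples, of the log2-transformed ratio of the two normalization factors) is an assumption of this sketch rather than something stated in the abstract.

```python
import math
from statistics import stdev


def normalization_factor(control_gene_quantities):
    """Geometric mean of one sample's control-gene relative quantities."""
    logs = [math.log(q) for q in control_gene_quantities]
    return math.exp(sum(logs) / len(logs))


def pairwise_variation(nf_n, nf_n_plus_1):
    """Variation between two series of per-sample normalization factors.

    nf_n, nf_n_plus_1: normalization factors for the same samples, computed
    from n and n+1 control genes respectively. A small value suggests that
    adding the extra control gene changes the normalization very little.
    """
    log_ratios = [math.log2(a / b) for a, b in zip(nf_n, nf_n_plus_1)]
    return stdev(log_ratios)


# Hypothetical data: rows are samples, columns are control genes ordered
# from most to least stably expressed.
samples = [
    [1.0, 1.1, 0.9],
    [2.0, 1.9, 2.4],
    [0.5, 0.55, 0.4],
]
nf_2 = [normalization_factor(row[:2]) for row in samples]
nf_3 = [normalization_factor(row[:3]) for row in samples]
v_2_3 = pairwise_variation(nf_2, nf_3)
```

The geometric mean is used rather than the arithmetic mean because it is less sensitive to outlying values and to differences in abundance between the control genes.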