https://www.benchmarking.uni-freiburg.de/api.php?action=feedcontributions&user=Ckreutz&feedformat=atom
Benchmark-Wiki - User contributions [en]
2021-08-01T14:37:49Z
User contributions
MediaWiki 1.31.0
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=781
Literature Studies
2021-06-22T09:34:28Z
<p>Ckreutz: </p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets.<br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparisons of experimental techniques or platforms).<br />
<br />
Please extend this list by creating a new page and adding a link below.<br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
https://journals.tubitak.gov.tr/biology/issues/biy-21-45-2/biy-45-2-1-2008-8.pdf<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia|| [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|-<br />
| 2021 || Dowell JA || [[Benchmarking Quantitative Performance in Label-Free Proteomics]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
<br />
=== Other Studies ===<br />
https://link.springer.com/article/10.1007/s00521-021-06188-z<br />
<br />
https://www.diva-portal.org/smash/get/diva2:1568674/FULLTEXT01.pdf<br />
<br />
https://www.sciencedirect.com/science/article/pii/S2405471221002076<br />
<br />
https://www.tandfonline.com/doi/abs/10.1080/15476286.2021.1940047<br />
<br />
https://escholarship.org/content/qt4091n16g/qt4091n16g.pdf</div>
Ckreutz
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=780
Literature Studies
2021-06-22T08:08:28Z
<p>Ckreutz: </p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets.<br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparisons of experimental techniques or platforms).<br />
<br />
Please extend this list by creating a new page and adding a link below.<br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
https://journals.tubitak.gov.tr/biology/issues/biy-21-45-2/biy-45-2-1-2008-8.pdf<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|-<br />
| 2021 || Dowell JA || [[Benchmarking Quantitative Performance in Label-Free Proteomics]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
<br />
=== Other Studies ===<br />
https://link.springer.com/article/10.1007/s00521-021-06188-z<br />
<br />
https://www.diva-portal.org/smash/get/diva2:1568674/FULLTEXT01.pdf<br />
<br />
https://www.sciencedirect.com/science/article/pii/S2405471221002076<br />
<br />
https://www.tandfonline.com/doi/abs/10.1080/15476286.2021.1940047</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=779Literature Studies2021-05-01T16:49:03Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. <br /> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (rather than on comparisons of experimental techniques or platforms). <br /><br />
<br />
Please extend this list by creating a new page and adding a link below. <br /> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
https://journals.tubitak.gov.tr/biology/issues/biy-21-45-2/biy-45-2-1-2008-8.pdf<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|-<br />
| 2021 || Dowell JA || [[Benchmarking Quantitative Performance in Label-Free Proteomics]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Benchmarking_Quantitative_Performance_in_Label-Free_Proteomics&diff=778Benchmarking Quantitative Performance in Label-Free Proteomics2021-02-02T14:48:12Z<p>Ckreutz: Created page with "__NUMBEREDHEADINGS__ === Citation === Dowell JA, Wright LJ, Armstrong EA, Denu JM (2021). Benchmarking Quantitative Performance in Label-Free Proteomics. ACS Omega. [https://d..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Dowell JA, Wright LJ, Armstrong EA, Denu JM (2021). Benchmarking Quantitative Performance in Label-Free Proteomics. ACS Omega.<br />
[https://doi.org/10.1021/acsomega.0c04030 doi:10.1021/acsomega.0c04030]<br />
<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=777Literature Studies2021-02-02T14:44:13Z<p>Ckreutz: /* Omics Workflows */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
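| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />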
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
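| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ*|Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ]]<br />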
|-<br />
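| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />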
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|-<br />
| 2021 || Dowell JA || [[Benchmarking Quantitative Performance in Label-Free Proteomics]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=776Literature Studies2021-02-02T14:41:42Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=775Literature Studies2021-02-02T14:40:59Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. <br /> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). <br /><br />
<br />
Please extend this list by creating a new page and adding a link below. <br /> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=774Literature Studies2021-02-02T14:39:51Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. <br /> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). <br /><br />
<br />
Please extend this list by creating a new page and adding a link below. <br /> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
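| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />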
|}<br />
<br />
<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
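| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />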
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=773Literature Studies2021-02-02T14:38:32Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia|| [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
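| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />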
|}<br />
<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
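| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />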
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=772Literature Studies2021-02-02T14:36:24Z<p>Ckreutz: /* Results from Literature */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
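| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />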
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
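| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />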
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}<br />
<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=NormalyzerDE:_Online_Tool_for_Improved_Normalization_of_Omics_Expression_Data_and_High-Sensitivity_Differential_Expression_Analysis&diff=771NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis2021-02-02T14:34:27Z<p>Ckreutz: /* Citation */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Willforss J, Chawade A, Levander F. NormalyzerDE: Online tool for improved normalization of omics expression data and high-sensitivity differential expression analysis. Journal of proteome research. 2018 Oct 2;18(2):732-40.<br />
<br />
[https://doi.org/10.1021/acs.jproteome.8b00523 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=The_effects_of_nonignorable_missing_data_on_label-free_mass_spectrometry_proteomics_experiments&diff=770The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments2021-02-02T14:33:15Z<p>Ckreutz: /* Citation */</p>
<hr />
<div>=== Citation ===<br />
O'Brien JJ, Gunawardena HP, Paulo JA, Chen X, Ibrahim JG, Gygi SP, Qaqish BF (2018). The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments. Ann Appl Stat. 12(4):2075-95.<br />
[https://doi.org/10.1214/18-AOAS1144 doi: 10.1214/18-AOAS1144]<br />
<br />
=== Summary ===<br />
This paper analyzes how missing data affect parameter contrasts and introduces a Bayesian selection model that addresses these effects and recovers interblock information. The proposed model is compared with other imputation strategies and with complete-case analyses.<br />
<br />
=== Study outcomes ===<br />
The introduced selection model for proteomics (SMP) tries to capture the missing data mechanisms of the specific dataset.<br />
<br />
==== Outcome O1 ====<br />
The SMP model improves accuracy, depth of discovery and interval coverage (Figures 1, 2, 3).<br />
<br />
==== Outcome O2 ====<br />
The mixed model and two-way ANOVA, which rely on intrablock estimation, outperform the one-way ANOVA and other imputation methods (Min, Mean, Svd, Knn), which rely on interblock information, on all datasets (Figures 1, 2, 3).<br />
<br />
==== Further outcomes ====<br />
Missing data leads to contrast bias between conditions.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
Imputation performance is analyzed separately for proteins whose contrasts are estimable and for those whose contrasts are inestimable.<br />
<br />
Nine approaches are compared: SMP, one-way and two-way ANOVA, mean, column minimum, peptide minimum, SVD, kNN, and mixture model. Most of them are quite simple models.<br />
<br />
Both accuracy and interval coverage are assessed.<br />
<br />
=== Further comments and aspects ===<br />
<br />
The data simulation favors the SMP model.<br />
<br />
=== References ===<br />
Model similar to:<br />
<br />
Luo R, Colangelo CM, Sessa WC, Zhao H. Bayesian Analysis of iTRAQ Data with Nonrandom Missingness: Identification of Differentially Expressed Proteins. Stat Biosci. 1(2):228-45.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_comparative_study_of_evaluating_missing_value_imputation_methods_in_label-free_proteomics&diff=769A comparative study of evaluating missing value imputation methods in label-free proteomics2021-02-02T14:32:00Z<p>Ckreutz: /* Citation */</p>
<hr />
<div>=== Citation ===<br />
Jin L, Bi Y, Hu C, Qu J, Shen S, Wang X, Tian Y (2021). A comparative study of evaluating missing value imputation methods in label-free proteomics. Scientific reports, 11(1), 1-11.<br />
[https://doi.org/10.1038/s41598-021-81279-4 doi: 10.1038/s41598-021-81279-4]<br />
<br />
=== Summary ===<br />
<br />
=== Study outcomes ===<br />
<br />
==== Outcome O1 ====<br />
<br />
==== Outcome O2 ====<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_comparative_study_of_evaluating_missing_value_imputation_methods_in_label-free_proteomics&diff=768A comparative study of evaluating missing value imputation methods in label-free proteomics2021-02-02T14:31:24Z<p>Ckreutz: </p>
<hr />
<div>=== Citation ===<br />
Jin L, Bi Y, Hu C, Qu J, Shen S, Wang X, Tian Y (2021). A comparative study of evaluating missing value imputation methods in label-free proteomics. Scientific reports, 11(1), 1-11.<br />
[https://doi.org/10.1038/s41598-021-81279-4 doi: 10.1038/s41598-021-81279-4]<br />
<br />
<br />
=== Summary ===<br />
<br />
=== Study outcomes ===<br />
<br />
==== Outcome O1 ====<br />
<br />
==== Outcome O2 ====<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_comparative_study_of_evaluating_missing_value_imputation_methods_in_label-free_proteomics&diff=767A comparative study of evaluating missing value imputation methods in label-free proteomics2021-02-02T14:25:15Z<p>Ckreutz: Created page with "=== Citation === O'Brien JJ, Gunawardena HP, Paulo JA, Chen X, Ibrahim JG, Gygi SP, Qaqish BF (2018). The effects of nonignorable missing data on label-free mass spectrometry..."</p>
<hr />
<div>=== Citation ===<br />
O'Brien JJ, Gunawardena HP, Paulo JA, Chen X, Ibrahim JG, Gygi SP, Qaqish BF (2018). The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments. Ann Appl Stat. 12(4):2075-95.<br />
[https://doi.org/10.1214/18-AOAS1144 doi: 10.1214/18-AOAS1144]<br />
<br />
<br />
=== Summary ===<br />
<br />
=== Study outcomes ===<br />
<br />
==== Outcome O1 ====<br />
<br />
==== Outcome O2 ====<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=766Literature Studies2021-02-02T14:23:43Z<p>Ckreutz: /* Imputation methods for missing values */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|-<br />
| 2021 || Jin L || [[A comparative study of evaluating missing value imputation methods in label-free proteomics]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Simultaneous_Improvement_in_the_Precision,_Accuracy_and_Robustness_of_Label-free_Proteome_Quantification_by_Optimizing_Data_Manipulation_Chains&diff=765Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains2020-11-08T17:23:55Z<p>Ckreutz: Created page with "__NUMBEREDHEADINGS__ === Citation === Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L. Simultaneous improvement in the precision, accuracy, and r..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L. Simultaneous improvement in the precision, accuracy, and robustness of label-free proteome quantification by optimizing data manipulation chains. Molecular & Cellular Proteomics. 2019 Aug 1;18(8):1683-99.<br />
<br />
[https://doi.org/10.1074/mcp.RA118.001169 Permanent link to the paper]<br />
<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=764Literature Studies2020-11-08T17:21:18Z<p>Ckreutz: /* Omics Workflows */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|-<br />
| 2020 || Tang || [[Simultaneous Improvement in the Precision, Accuracy and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=763
Literature Studies
2020-03-09T12:56:48Z
<p>Ckreutz: /* Identifying differential features */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2014 || Ching || [[Power analysis and sample size estimation for RNA-Seq differential expression]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=762Literature Studies2020-03-09T12:20:03Z<p>Ckreutz: /* Identifying differential features */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|-<br />
| 2020 || Li || [[Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=The_impact_of_sample_imbalance_on_identifying_differentially_expressed_genes&diff=761The impact of sample imbalance on identifying differentially expressed genes2020-03-04T13:42:33Z<p>Ckreutz: Created page with "__NUMBEREDHEADINGS__ === Citation === Kun Yang, Jianzhong Li and Hong Gao, The impact of sample imbalance on identifying differentially expressed genes, 2006, BMC Bioinformati..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Kun Yang, Jianzhong Li and Hong Gao, The impact of sample imbalance on identifying differentially expressed genes, 2006, BMC Bioinformatics, 7(Suppl 4):S8.<br />
<br />
[https://doi.org/10.1186/1471-2105-7-S4-S8 Permanent link to the paper]<br />
<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=760Literature Studies2020-03-04T13:31:38Z<p>Ckreutz: /* Identifying differential features */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the field of molecular biology (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2006 || Yang || [[The impact of sample imbalance on identifying differentially expressed genes]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Predicting_Breast_Cancer_Survivability_Using_Data_Mining_Techniques&diff=759Predicting Breast Cancer Survivability Using Data Mining Techniques2020-02-28T16:49:12Z<p>Ckreutz: /* Citation */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Sarvestani, A. S., Safavi, A. A., Parandeh, N. M., & Salehi, M., Predicting breast cancer survivability using data mining techniques, 2010, 2nd International Conference on Software Technology and Engineering, 2, V2-227.<br />
<br />
[https://doi.org/10.1109/ICSTE.2010.5608818 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
"The performance of ... self organizing map (SOM), radial basis function network (RBF), general regression neural network (GRNN) and probabilistic neural network (PNN) are tested both on the Wisconsin breast cancer data (WBCD) and on the Shiraz Namazi Hospital breast cancer data (NHBCD)."<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Predicting_Breast_Cancer_Survivability_Using_Data_Mining_Techniques&diff=758Predicting Breast Cancer Survivability Using Data Mining Techniques2020-02-28T16:49:02Z<p>Ckreutz: Created page with "__NUMBEREDHEADINGS__ === Citation === Sarvestani, A. S., Safavi, A. A., Parandeh, N. M., & Salehi, M., Predicting breast cancer survivability using data mining techniques, 20..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Sarvestani, A. S., Safavi, A. A., Parandeh, N. M., & Salehi, M., Predicting breast cancer survivability using data mining techniques, 2010, 2nd International Conference on Software Technology and Engineering, 2, V2-227.<br />
<br />
[https://doi.org/10.1109/ICSTE.2010.5608818 Permanent link to the paper]<br />
<br />
<br />
=== Summary ===<br />
"The performance of ... self organizing map (SOM), radial basis function network (RBF), general regression neural network (GRNN) and probabilistic neural network (PNN) are tested both on the Wisconsin breast cancer data (WBCD) and on the Shiraz Namazi Hospital breast cancer data (NHBCD)."<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
You can describe general design aspects here.<br />
The study designs for describing specific outcomes are listed in the following subsections:<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome On ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=757Literature Studies2020-02-28T16:46:02Z<p>Ckreutz: </p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data from the molecular biology field (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=756Literature Studies2020-02-28T16:45:24Z<p>Ckreutz: Remove Harper, it is not based on high-througput data</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Comparison_of_statistical_methods_for_classification_of_ovarian_cancer_using_mass_spectrometry_data&diff=755Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data2020-02-28T16:41:23Z<p>Ckreutz: /* Outcome O1 */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Baolin Wu, Tom Abbott, David Fishman, Walter McMurray, Gil Mor, Kathryn Stone, David Ward, Kenneth Williams, Hongyu Zhao, Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data, 2003, Bioinformatics, 19(13), 1636–1643.<br />
<br />
[https://doi.org/10.1093/bioinformatics/btg210 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
The following classification methods were assessed in the context of mass spectrometry (MS) data:<br />
* linear discriminant analysis<br />
* quadratic discriminant analysis<br />
* k-nearest neighbor classifier<br />
* bagging and boosting classification trees<br />
* support vector machine<br />
* random forest (RF)<br />
<br />
The aim was to classify serum samples as ovarian cancer or control. <br />
Performance was assessed by cross-validation.<br />
<br />
=== Study outcomes ===<br />
Prediction errors were around 10-20%.<br />
==== Outcome O1 ====<br />
* Overall, the methods perform better if RF is used for feature selection (with few exceptions)<br />
* Overall, the methods perform better if 25 features are used for classification instead of 15 (with few exceptions)<br />
* Some approaches have a large variance in performance (Bagging, ARC); others have rather small variance (NN, SVM)<br />
* The .632+ estimator yielded slightly lower error estimates (better performance) and smaller variances than 10-fold CV.<br />
* Overall, RF had the best performance when feature selection was performed using RF<br />
<br />
Outcomes O1 are presented as Figures 4 and 5 in the original publication.<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
A single data set containing measurements from 47 patients with ovarian cancer and from 44 normal patients was used for the analysis. <br />
<br />
Two validation methods were used:<br />
* 10-fold cross-validation<br />
* Bootstrap with the .632+ estimator<br />
<br />
For classification, 15 or 25 features were selected by either t-statistics or random forests (RF).<br />
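<br />
As a rough illustration of the .632+ estimator named above, the following sketch combines the training (resubstitution) error with the out-of-bootstrap error, weighting the latter more heavily the stronger the apparent overfitting. It is not taken from the paper: the function names <code>no_information_rate</code> and <code>err632plus</code> are hypothetical, the error rates are assumed to be computed elsewhere, and clipping the out-of-bootstrap error at the no-information rate is a simplifying assumption.

```python
from collections import Counter


def no_information_rate(y_true, y_pred):
    """Error rate expected if labels and predictions were independent.

    gamma = sum_k p_k * (1 - q_k), with p_k the observed and q_k the
    predicted proportion of class k.
    """
    n = len(y_true)
    p = Counter(y_true)
    q = Counter(y_pred)  # a Counter returns 0 for classes never predicted
    return sum((p[k] / n) * (1.0 - q[k] / n) for k in p)


def err632plus(err_train, err_oob, gamma):
    """Sketch of the .632+ error estimate (hypothetical helper).

    err_train: error on the training data (resubstitution error)
    err_oob:   mean error on samples left out of the bootstrap draws
    gamma:     no-information error rate
    """
    # Assumption: the out-of-bootstrap error is clipped at gamma.
    err_oob_c = min(err_oob, gamma)
    # Relative overfitting rate R in [0, 1]; set to 0 if undefined.
    if err_oob_c > err_train and gamma > err_train:
        r = (err_oob_c - err_train) / (gamma - err_train)
    else:
        r = 0.0
    # The weight grows from 0.632 (no overfitting) to 1 (full overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_train + w * err_oob_c
```

With no overfitting (R = 0) the result reduces to the plain .632 estimate, 0.368 times the training error plus 0.632 times the out-of-bootstrap error.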
<br />
=== Further comments and aspects ===<br />
The data were generated at a time when MS data quality and data processing were still in development. Data processing tools for raw spectra, such as MaxQuant and OpenMS, were not yet available. <br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Comparison_of_statistical_methods_for_classification_of_ovarian_cancer_using_mass_spectrometry_data&diff=754Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data2020-02-28T16:41:10Z<p>Ckreutz: Created page with "__NUMBEREDHEADINGS__ === Citation === Baolin Wu, Tom Abbott, David Fishman, Walter McMurray, Gil Mor, Kathryn Stone, David Ward, Kenneth Williams, Hongyu Zhao, Comparison of s..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Baolin Wu, Tom Abbott, David Fishman, Walter McMurray, Gil Mor, Kathryn Stone, David Ward, Kenneth Williams, Hongyu Zhao, Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data, 2003, Bioinformatics, 19(13), 1636–1643.<br />
<br />
[https://doi.org/10.1093/bioinformatics/btg210 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
The following classification methods were assessed in the context of mass spectrometry (MS) data:<br />
* linear discriminant analysis<br />
* quadratic discriminant analysis<br />
* k-nearest neighbor classifier<br />
* bagging and boosting classification trees<br />
* support vector machine<br />
* random forest (RF)<br />
<br />
The aim was to classify serum samples as ovarian cancer or control. <br />
Performance was assessed by cross-validation.<br />
<br />
=== Study outcomes ===<br />
Prediction errors were around 10-20%.<br />
==== Outcome O1 ====<br />
* Overall, the methods perform better if RF is used for feature selection (with few exceptions)<br />
* Overall, the methods perform better if 25 features are used for classification instead of 15 (with few exceptions)<br />
* Some approaches have a large variance in performance (Bagging, ARC); others have rather small variance (NN, SVM)<br />
* The .632+ estimator yielded slightly lower error estimates (better performance) and smaller variances than 10-fold CV.<br />
* Overall, RF had the best performance when feature selection was performed using RF<br />
<br />
Outcomes O1 are presented as Figures 4 and 5 in the original publication. <br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
A single data set containing measurements from 47 patients with ovarian cancer and from 44 normal patients was used for the analysis. <br />
<br />
Two validation methods were used:<br />
* 10-fold cross-validation<br />
* Bootstrap with the .632+ estimator<br />
<br />
For classification, 15 or 25 features were selected by either t-statistics or random forests (RF).<br />
<br />
=== Further comments and aspects ===<br />
The data were generated at a time when MS data quality and data processing were still in development. Data processing tools for raw spectra, such as MaxQuant and OpenMS, were not yet available. <br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=753Literature Studies2020-02-28T16:18:18Z<p>Ckreutz: /* Omics Workflows */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Harper || [[A review and comparison of classification algorithms for medical decision making]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2010 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
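| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />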
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
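| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />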
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=752
Literature Studies
2020-02-28T16:16:42Z
<p>Ckreutz: /* Preprocessing high-throughput data */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. <br /> <br />
<br />
The focus is on computational methods for analyzing experimental data (rather than on comparisons of experimental techniques or platforms). <br /><br />
<br />
Please extend this list by creating a new page and adding a link below. <br /> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Harper || [[A review and comparison of classification algorithms for medical decision making]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2014 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
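| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />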
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
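| 2015 || Cleary || [[Comparing Variant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />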
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || Merino || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || Vieth || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || Krishnan || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || Freudenberg || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || Shippy || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=751 Literature Studies 2020-02-28T16:11:58Z <p>Ckreutz: /* Classification */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2003 || Wu || [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
|-<br />
| 2005 || Harper || [[A review and comparison of classification algorithms for medical decision making]]<br />
|-<br />
| 2005 || Bellaachia || [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
|}<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2014 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
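| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />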
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Brock GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
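| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* |Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ]]<br />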
|-<br />
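| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />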
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || Goebels || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || Autio || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || Fan || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || Li || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || Bullard || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || Risso || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || McCall || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || Dillies || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || Chawade || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=750 Literature Studies 2020-02-28T16:10:31Z <p>Ckreutz: /* Identifying differential features */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2006 || Guo || [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
|-<br />
| 2014 || Su || [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
|-<br />
| 2017 || van Ooijen || [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
|-<br />
| 2017 || Wang || [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
|-<br />
| 2017 || Wreczycka || [[Strategies for analyzing bisulfite sequencing data]]<br />
|-<br />
| 2018 || Tran || [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
|}<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
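| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />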
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Brock GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=749
Literature Studies
2020-02-28T16:07:47Z
<p>Ckreutz: /* Identifying sets of features (e.g. gene set analyses) */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || Ackermann || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || Tintle || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || Mathur || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || Geistlinger || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>
Ckreutz
https://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=748
Literature Studies
2020-02-28T16:06:40Z
<p>Ckreutz: /* Identifying sets of features (e.g. gene set analyses) */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2009 || || [[A general modular framework for gene set enrichment analysis]]<br />
|-<br />
| 2009 || || [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
|-<br />
| 2018 || || [[Gene set analysis methods: a systematic comparison]]<br />
|-<br />
| 2020 || || [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
|}<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
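| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />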
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
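| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />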
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=747Literature Studies2020-02-28T16:05:20Z<p>Ckreutz: /* Dimension reduction */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
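* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium|A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium]]<br />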
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
''' 2009 '''</br><br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
''' 2018 '''</br><br />
* [[Gene set analysis methods: a systematic comparison]]<br />
''' 2020 '''</br><br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Janecek || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
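| 2015 || Fernández-Gutiérrez || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />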
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
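| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />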
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=746Literature Studies2020-02-28T16:04:45Z<p>Ckreutz: /* Identifying differential regions (e.g. DMRs) */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || Peters || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || Bhasin || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || Jühling || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || Kolde || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || Ayyala || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || Gaspar || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || Condon || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || Catoni || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || Gong || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
'''2009'''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
'''2018'''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
'''2020'''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || First Author || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
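| 2015 || First Author || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />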
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
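| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />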
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=745Literature Studies2020-02-28T16:02:21Z<p>Ckreutz: /* Identifying differential regions (e.g. DMRs) */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2015 || First Author || [[De novo identification of differentially methylated regions in the human genome]]<br />
|-<br />
| 2015 || First Author || [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
|-<br />
| 2015 || First Author || [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
|-<br />
| 2016 || First Author || [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
|-<br />
| 2016 || First Author || [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
|-<br />
| 2017 || First Author || [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
|-<br />
| 2018 || First Author || [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
|-<br />
| 2018 || First Author || [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
|-<br />
| 2018 || First Author || [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
|}<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
'''2009'''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
'''2018'''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
'''2020'''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || First Author || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
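| 2015 || First Author || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />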
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
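| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />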
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=744Literature Studies2020-02-28T15:59:32Z<p>Ckreutz: /* Dimension reduction */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
''' 2015 '''<br />
* [[De novo identification of differentially methylated regions in the human genome]]<br />
* [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
* [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
''' 2016 '''<br />
* [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
* [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
''' 2017 '''<br />
* [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
''' 2018 '''<br />
* [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
* [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
* [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
''' 2009 '''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
''' 2018 '''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
''' 2020 '''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || First Author || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
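| 2015 || First Author || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />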
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
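| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />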
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || Bolstad || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || Gentzel || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || Irizarry || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=743Literature Studies2020-02-28T15:58:45Z<p>Ckreutz: /* Dimension reduction */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Outcomes of benchmarking studies from the literature are collected here. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
''' 2015 '''<br />
* [[De novo identification of differentially methylated regions in the human genome]]<br />
* [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
* [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
''' 2016 '''<br />
* [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
* [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
''' 2017 '''<br />
* [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
''' 2018 '''<br />
* [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
* [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
* [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
''' 2009 '''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
''' 2018 '''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
''' 2020 '''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
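| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* |Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ]]<br />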
|-<br />
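| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />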
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=742Literature Studies2020-02-28T15:58:10Z<p>Ckreutz: /* Dimension reduction */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here, outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (rather than on comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
''' 2015 '''<br />
* [[De novo identification of differentially methylated regions in the human genome]]<br />
* [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
* [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
''' 2016 '''<br />
* [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
* [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
''' 2017 '''<br />
* [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
''' 2018 '''<br />
* [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
* [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
* [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
''' 2009 '''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
''' 2018 '''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
''' 2020 '''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data|Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
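| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* |Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ]]<br />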
|-<br />
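| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />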
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=741Literature Studies2020-02-28T15:56:44Z<p>Ckreutz: /* Dimension reduction */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
''' 2015 '''<br />
* [[De novo identification of differentially methylated regions in the human genome]]<br />
* [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
* [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
''' 2016 '''<br />
* [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
* [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
''' 2017 '''<br />
* [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
''' 2018 '''<br />
* [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
* [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
* [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
'''2009'''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
'''2018'''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
'''2020'''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || || [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
|-<br />
| 2015 || || [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
|}<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Beal || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Dondelinger || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Penas || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Degasperi || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Fröhlich || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schälte || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Loos || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Stapor || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Villaverde || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Hass || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Villaverde || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Lines || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Stapor || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Wu || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Pitt || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Loos || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Clairon || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schmiester || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Castro || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
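| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines|Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines]]<br />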
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines|A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias]]<br />
|-<br />
| 2003 || || [[Preprocessing of tandem mass spectrometric data to support automatic protein identification]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2006 || Du P || [[Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2010 || Armananzas R || [[Peakbin selection in mass spectrometry data using a consensus approach with estimation of distribution algorithms]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2011 || Zhang ZM || [[Peak alignment using wavelet pattern matching and differential evolution]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2013 || García-Torres M || [[Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data]]<br />
|-<br />
| 2013 || Horvatovich P || [[Bioinformatics and Statistics: LC‐MS (/MS) Data Preprocessing for Biomarker Discovery]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2014 || Aggio RB || [[Identifying and quantifying metabolites by scoring peaks of GC-MS data]]<br />
|-<br />
| 2014 || Cox J || [[Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2015 || Wang T || [[A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data]]<br />
|-<br />
| 2015 || Lu J || [[Improved Peak Detection and Deconvolution of Native Electrospray Mass Spectra from Large Protein Complexes]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2016 || Zheng Y || [[An improved algorithm for peak detection in mass spectra based on continuous wavelet transform]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Toward_a_gold_standard_for_benchmarking_gene_set_enrichment_analysis&diff=712Toward a gold standard for benchmarking gene set enrichment analysis2020-02-25T15:57:57Z<p>Ckreutz: </p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Geistlinger, L., Csaba, G., Santarelli, M., Ramos, M., Schiffer, L., Law, C., ... & Zimmer, R., Toward a gold standard for benchmarking gene set enrichment analysis, 2020, Briefings in Bioinformatics, 0, 1-12 <br />
<br />
[https://doi.org/10.1093/bib/bbz158 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses are combinations of several analysis modules. <br />
This paper investigates the performance of ten prominent approaches.<br />
Biological plausibility based on co-citation databases is used for assessment.<br />
<br />
=== Study outcomes ===<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
Runtimes are as follows:<br />
<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* "75 expression datasets investigating 42 human diseases"<br />
* microarray and RNAseq data<br />
* pre-existing benchmark data sets<br />
* 10 methods:<br />
** ORA<br />
** GLOBALTEST<br />
** GSEA<br />
** SAFE<br />
** GSA<br />
** SAMGS<br />
** ROAST<br />
** CAMERA<br />
** PADOG<br />
** GSVA<br />
* "Gene set relevance rankings for each disease were constructed by querying the MalaCards database. MalaCards scores genes for disease relevance based on experimental evidence and co-citation in the literature."<br />
* "A nominal significance level of 0.05" is used (without correction with respect to multiple testing). This was also common in other benchmark studies.<br />
* The "type I error rate was evaluated by randomization of the sample labels" of the microarray data set.<br />
* "Random gene sets of increasing set size were analyzed to assess whether enrichment methods are affected by geneset size." For this purpose, 100 "random gene sets of defined sizes {5,10,25,50,100,250,500}" were sampled.<br />
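<br />
The random gene sets described in the last item can be sketched as follows. This is only an illustration, not the code used in the study: the gene universe <code>universe</code>, the helper <code>sample_random_gene_sets</code> and the seed are hypothetical, and the sketch assumes that 100 sets are drawn for each of the quoted sizes.<br />

```python
import random

# Hypothetical gene universe; the study would draw from the genes measured in a dataset.
universe = [f"gene{i}" for i in range(1, 20001)]

SET_SIZES = [5, 10, 25, 50, 100, 250, 500]  # set sizes quoted from the paper
N_SETS_PER_SIZE = 100                        # assumption: 100 random sets per size


def sample_random_gene_sets(genes, sizes, n_per_size, seed=0):
    """Return {size: [gene_set, ...]}; each set is drawn without replacement."""
    rng = random.Random(seed)
    return {
        size: [rng.sample(genes, size) for _ in range(n_per_size)]
        for size in sizes
    }


random_sets = sample_random_gene_sets(universe, SET_SIZES, N_SETS_PER_SIZE)
```

Each enrichment method would then be applied to every random set, and the fraction of sets called significant would be compared across set sizes.<br />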
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome O ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
An R package (GSEABenchmarkeR) is available that seems to enable similar analyses. <br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Toward_a_gold_standard_for_benchmarking_gene_set_enrichment_analysis&diff=704Toward a gold standard for benchmarking gene set enrichment analysis2020-02-25T15:41:31Z<p>Ckreutz: /* Citation */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Geistlinger, L., Csaba, G., Santarelli, M., Ramos, M., Schiffer, L., Law, C., ... & Zimmer, R., Toward a gold standard for benchmarking gene set enrichment analysis, 2020, Briefings in Bioinformatics, 0, 1-12 <br />
<br />
[https://doi.org/10.1093/bib/bbz158 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* "75 expression datasets investigating 42 human diseases"<br />
* microarray and RNAseq data<br />
* pre-existing benchmark data sets<br />
* 10 methods:<br />
** ORA<br />
** GLOBALTEST<br />
** GSEA<br />
** SAFE<br />
** GSA<br />
** SAMGS<br />
** ROAST<br />
** CAMERA<br />
** PADOG<br />
** GSVA<br />
* "Gene set relevance rankings for each disease were constructed by querying the MalaCards database. MalaCards scores genes for disease relevance based on experimental evidence and co-citation in the literature."<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome O ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_general_modular_framework_for_gene_set_enrichment_analysis&diff=703A general modular framework for gene set enrichment analysis2020-02-25T15:40:52Z<p>Ckreutz: </p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
M Ackermann and K Strimmer, A general modular framework for gene set enrichment analysis, 2009, BMC Bioinformatics, 10:47.<br />
<br />
[https://doi.org/10.1186/1471-2105-10-47 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses have a modular structure, i.e. they consist of <br />
# gene level statistics <br />
# gene level significance assessment<br />
# gene set statistics<br />
# gene set significance assessment<br />
# statistical conclusion<br />
<br />
Alternatively, steps 1.-3. might be replaced by a single global test.<br />
<br />
In this paper, 261 different variants of gene set enrichment procedures were evaluated based on simulated and experimental data.<br />
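<br />
The modular structure can be illustrated with a minimal sketch that chains one choice per module: a pooled-variance t statistic (module 1), the square transformation, the mean as gene set statistic (module 3), and sample label permutation for significance (module 4). All function names, the toy data and the parameter values are assumptions made for this illustration; this is not the authors' implementation.<br />

```python
import random
from statistics import mean, variance


def t_statistic(x, y):
    """Module 1: two-sample t statistic with pooled variance for one gene."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (pooled * (1 / nx + 1 / ny)) ** 0.5


def gene_set_statistic(data, labels, gene_set):
    """Square transformation of the gene-level t, then the mean over the set."""
    squared = []
    for gene in gene_set:
        group1 = [v for v, lab in zip(data[gene], labels) if lab == 1]
        group0 = [v for v, lab in zip(data[gene], labels) if lab == 0]
        squared.append(t_statistic(group1, group0) ** 2)
    return mean(squared)


def permutation_p_value(data, labels, gene_set, n_perm=200, seed=1):
    """Module 4: p-value obtained by permuting the sample labels."""
    rng = random.Random(seed)
    observed = gene_set_statistic(data, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_set_statistic(data, shuffled, gene_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 20 genes, 10 vs. 10 samples, unit-variance noise.
rng = random.Random(0)
labels = [1] * 10 + [0] * 10
shifts = [2.0] * 5 + [-2.0] * 5 + [0.0] * 10  # 5 up, 5 down, 10 unregulated
data = [[rng.gauss(shift * lab, 1.0) for lab in labels] for shift in shifts]

mixed_set = list(range(10))     # contains up- and downregulated genes
null_set = list(range(10, 20))  # contains only unregulated genes

p_mixed = permutation_p_value(data, labels, mixed_set)
p_null = permutation_p_value(data, labels, null_set)

# Module 5: statistical conclusion at a significance level of 0.05.
mixed_is_significant = p_mixed < 0.05
```

Exchanging a single function (for example the transformation or the set statistic) yields another variant of the procedure, which is the sense in which the paper enumerates combinations.<br />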
<br />
=== Study outcomes ===<br />
==== Outcome O1: Gene level statistics ====<br />
* The choice of the gene-level statistics (t, moderated t, or correlation) does NOT have a great impact <br />
* t statistic, moderated t, and correlation fail to find gene sets that contain up- and downregulated genes<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication. <br />
<br />
==== Outcome O2: Transformation of the gene level statistics ====<br />
* The transformation of the gene level statistic has a substantial impact<br />
* Transformations help to find gene sets that contain up- and downregulated genes<br />
* Combination of square transformation and rank transformation shows the best overall performance<br />
* Binary transformation (i.e. using a cutpoint) and FDRs decrease the performance<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication.<br />
<br />
==== Outcome O3: Gene set statistics ====<br />
* "mean and the maxmean statistic produce ... overall very good results"<br />
* "median and the Wilcoxon test are primarily advantageous if the competitive null hypothesis is tested, or if there are many outliers in the data"<br />
* "conditional FDR ... vary strongly with the choice of the gene-level statistic, transformation and permutation approach."<br />
* The ES score showed a rather weak performance<br />
<br />
Outcome O3 is presented as Table 3 in the original publication. <br />
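<br />
To make the difference between some of these gene set statistics concrete, the following sketch compares mean, median and maxmean on untransformed gene-level scores. The scores are made-up numbers, and <code>maxmean</code> is written here in its unsigned form (the larger of the averaged positive parts and the averaged negative parts), which is an assumption about the exact variant used in the paper.<br />

```python
from statistics import mean, median


def maxmean(scores):
    """Larger of the average positive part and the average negative part."""
    n = len(scores)
    positive_part = sum(max(s, 0.0) for s in scores) / n
    negative_part = sum(max(-s, 0.0) for s in scores) / n
    return max(positive_part, negative_part)


# Made-up gene-level scores for two gene sets.
mixed = [3.0, 2.5, 3.5, -3.0, -2.5, -3.5]   # up- and downregulated genes
one_sided = [3.0, 2.5, 3.5, 0.1, -0.2, 0.0]  # upregulated and unregulated genes

summary = {
    name: {"mean": mean(s), "median": median(s), "maxmean": maxmean(s)}
    for name, s in {"mixed": mixed, "one_sided": one_sided}.items()
}
```

In the mixed set, positive and negative scores cancel in the mean and the median, whereas maxmean remains clearly above zero.<br />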
<br />
==== Outcome O4: Significance assessment ====<br />
* The parametric approach has the best power but is overoptimistic if the assumption of statistical independence is violated<br />
* Permutation seems to slightly outperform resampling<br />
* "restandardization procedure performs very similar to resampling"<br />
<br />
Outcome O4 is presented as Table 4 in the original publication. <br />
<br />
==== Outcome O5: Global approaches ====<br />
* The performance of the globaltest procedure "is not better than that of the less sophisticated univariate methods" but "is computationally a little bit faster".<br />
* For Hotelling's T2-test:<br />
** an "overall poor" performance was obtained<br />
** "the uncorrelated sets are found with the same reliability as with univariate approaches. However, ... the sets with correlation ... are hardly detected."<br />
** shows "improved performance with sample label permutation as opposed to gene sampling."<br />
<br />
Outcome O5 is presented in Table 5 (globaltest) and Table 6 (Hotelling's T2) of the original publication.<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* 100 data sets were simulated<br />
* The simulated data sets have 600 features (genes) and 20 samples (10 vs. 10)<br />
* The data were simulated with normally distributed noise with variance equal to one<br />
* 520 genes were considered as uninformative (delta=0, rho=0)<br />
* Altogether, nine different simulation data sets were generated that consist of the following combinations:<br />
** Gene sets with different levels of differential expression (delta \in {0, 0.75, 1, -1}) were simulated<br />
** Gene sets with varying levels of intra-group correlation (rho \in {0, 0.6, -0.6}) were simulated<br />
** Gene sets that contain regulated and unregulated genes (half/half) were generated as well as gene sets that contain up- and downregulated genes.<br />
* "The gene set statistic ES was not combined with a binary transformation since the latter does not allow a sensible ranking of the genes." <br />
* In total <br />
** 3 gene level statistics × <br />
** 5 transformations × <br />
** 6 gene set statistics × <br />
** 3 significance assessments <br />
** minus 9 insensible combinations<br />
** = 261 (in total) variants of gene set analyses were considered<br />
* The authors count how frequently the p-values that assess significance at the gene-set level are below a significance level 0.05<br />
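<br />
The count of 261 variants can be checked by enumerating all combinations and dropping those that pair the ES statistic with the binary transformation. In this sketch only the labels <code>binary</code> and <code>ES</code> matter for the count; the remaining transformation labels and all significance labels are placeholders, not names taken from the paper.<br />

```python
from itertools import product

gene_level_statistics = ["t", "moderated t", "correlation"]
# Only "binary" is relevant for the exclusion rule; T2..T5 are placeholder labels.
transformations = ["binary", "T2", "T3", "T4", "T5"]
gene_set_statistics = ["mean", "maxmean", "median", "ES", "conditional FDR", "Wilcoxon"]
# Placeholder labels for the three significance assessments entering the count.
significance_assessments = ["S1", "S2", "S3"]

all_combinations = list(product(
    gene_level_statistics, transformations, gene_set_statistics, significance_assessments
))

# ES is not combined with the binary transformation: 3 x 1 x 1 x 3 = 9 exclusions.
variants = [
    combo for combo in all_combinations
    if not (combo[1] == "binary" and combo[2] == "ES")
]

n_total = len(all_combinations)       # 3 * 5 * 6 * 3
n_excluded = n_total - len(variants)
n_variants = len(variants)
```

The enumeration reproduces 3 × 5 × 6 × 3 = 270 combinations, of which 9 are excluded, leaving 261.<br />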
<br />
<br />
==== Design for Outcome O1: Gene level statistics ====<br />
* The authors consider the impact of the selected approach for module 1 (see summary above)<br />
* Three approaches were considered: t, moderated t and correlation<br />
* These approaches were evaluated for five different transformations (see O2)<br />
<br />
* Multiple other approaches <br />
* The authors already provide the important hint that the dependency on the gene level test statistic might be more relevant for smaller sample size (e.g. 3 vs 3) <br />
<br />
==== Design for Outcome O2: Transformation of the gene level statistics ====<br />
* The outcome was generated for five different transformations (and three gene level statistics)<br />
<br />
==== Design for Outcome O3: Gene set statistics ====<br />
* Six gene set statistics were investigated:<br />
** mean<br />
** maxmean<br />
** median<br />
** ES<br />
** conditional FDR<br />
** Wilcoxon<br />
* These analyses were performed for the moderated t statistic (gene level) and by using the quadratic transformation. For significance assessment, resampling was applied.<br />
<br />
==== Design for Outcome O4: Significance assessment ====<br />
* Four different approaches for assessing significance at the gene set level were evaluated: <br />
** parametric<br />
** resampling<br />
** permutation<br />
** restandardization<br />
* This analysis was performed by using the moderated t as the gene level statistic in combination with a quadratic transformation and the mean as the gene set statistic<br />
<br />
==== Design for Outcome O5: Global approaches ====<br />
* globaltest and Hotelling's T2-test with a shrinkage covariance matrix were considered<br />
<br />
=== Further comments and aspects ===<br />
* Simulation is NOT based on characteristics or gene sets derived from real data <br />
* The paper provides very comprehensive outcomes in terms of combinations of approaches<br />
* After the paper was published, another type of gene set statistic appeared that is based on the Kolmogorov-Smirnov test. This approach is applied, e.g., in GSEA.<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Toward_a_gold_standard_for_benchmarking_gene_set_enrichment_analysis&diff=698Toward a gold standard for benchmarking gene set enrichment analysis2020-02-25T15:38:51Z<p>Ckreutz: Created page with "__ NUMBEREDHEADINGS__ === Citation === Geistlinger, L., Csaba, G., Santarelli, M., Ramos, M., Schiffer, L., Law, C., ... & Zimmer, R., Toward a gold standard for benchmarking..."</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
Geistlinger, L., Csaba, G., Santarelli, M., Ramos, M., Schiffer, L., Law, C., ... & Zimmer, R., Toward a gold standard for benchmarking gene set enrichment analysis, 2020, Briefings in Bioinformatics, 0, 1-12 <br />
<br />
[https://doi.org/10.1093/bib/bbz158 Permanent link to the paper]<br />
<br />
<br />
=== Summary ===<br />
Briefly describe the scope of the paper, i.e. the field of research and/or application.<br />
<br />
=== Study outcomes ===<br />
List the paper results concerning method comparison and benchmarking:<br />
==== Outcome O1 ====<br />
The performance of ...<br />
<br />
Outcome O1 is presented as Figure X in the original publication. <br />
<br />
==== Outcome O2 ====<br />
...<br />
<br />
Outcome O2 is presented as Figure X in the original publication. <br />
<br />
==== Outcome On ====<br />
...<br />
<br />
Outcome On is presented as Figure X in the original publication. <br />
<br />
==== Further outcomes ====<br />
If intended, you can add further outcomes here.<br />
<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* "75 expression datasets investigating 42 human diseases"<br />
* microarray and RNAseq data<br />
* pre-existing benchmark data sets<br />
* 10 methods:<br />
** ORA<br />
** GLOBALTEST<br />
** GSEA<br />
** SAFE<br />
** GSA<br />
** SAMGS<br />
** ROAST<br />
** CAMERA<br />
** PADOG<br />
** GSVA<br />
* "Gene set relevance rankings for each disease were constructed by querying the MalaCards database. MalaCards scores genes for disease relevance based on experimental evidence and co-citation in the literature."<br />
<br />
==== Design for Outcome O1 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
==== Design for Outcome O2 ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
... <br />
<br />
==== Design for Outcome O ====<br />
* The outcome was generated for ...<br />
* Configuration parameters were chosen ...<br />
* ...<br />
<br />
=== Further comments and aspects ===<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=Literature_Studies&diff=692Literature Studies2020-02-25T15:31:04Z<p>Ckreutz: /* Identifying sets of features (e.g. gene set analyses) */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
{| class="wikitable"<br />
|-<br />
! Page summary<br />
|-<br />
| Here outcomes of benchmarking studies from the literature are collected. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments which were performed independently of the publication of a new approach. Studies which are not neutral are put in brackets. </br> <br />
<br />
The focus is on computational methods for analyzing experimental data (instead of comparing experimental techniques or platforms). </br><br />
<br />
Please extend this list by creating a new page and adding a link below. </br> <br />
Use the '''[[Guidelines_for_Summarizing_a_Literature_Study|guidelines described here]]'''.<br />
|}<br />
<br />
== Results from Literature ==<br />
<br />
=== Classification ===<br />
''' 2003 '''</br><br />
* [[Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data]]<br />
''' 2005 '''</br><br />
* [[A review and comparison of classification algorithms for medical decision making]]<br />
''' 2016 '''</br><br />
* [[Predicting Breast Cancer Survivability Using Data Mining Techniques]]<br />
<br />
=== Selection of Differential Features and Regions ===<br />
==== Identifying differential features ====<br />
''' 2006 '''</br><br />
* [[Rat toxicogenomic study reveals analytical consistency across microarray platforms]]<br />
''' 2010 '''</br><br />
* [[A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the sequencing Quality control consortium]]<br />
''' 2017 '''</br><br />
* [[Identification of differentially expressed peptides in high-throughput proteomics data]]<br />
* [[In-depth method assessments of differentially expressed protein detection for shotgun proteomics data with missing values]]<br />
* [[Strategies for analyzing bisulfite sequencing data]]<br />
''' 2018 '''</br><br />
* [[Identification of Differentially Methylated Sites with Weak Methylation Effects]]<br />
<br />
==== Identifying differential regions (e.g. DMRs) ====<br />
''' 2015 '''<br />
* [[De novo identification of differentially methylated regions in the human genome]]<br />
* [[MethylAction: detecting differentially methylated regions that distinguish biological subtypes]]<br />
* [[metilene: Fast and sensitive calling of differentially methylated regions from bisulfite sequencing data]]<br />
''' 2016 '''<br />
* [[seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data]]<br />
* [[Statistical methods for detecting differentially methylated regions based on MethylCap-seq data]]<br />
''' 2017 '''<br />
* [[DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data]]<br />
''' 2018 '''<br />
* [[Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus]]<br />
* [[DMRcaller: a versatile R/Bioconductor package for detection and visualization of differentially methylated regions in CpG and non-CpG contexts]]<br />
* [[MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)]]<br />
<br />
==== Identifying sets of features (e.g. gene set analyses) ====<br />
''' 2009 '''<br />
<br />
* [[A general modular framework for gene set enrichment analysis]]<br />
* [[Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16]]<br />
<br />
''' 2018 '''<br />
<br />
* [[Gene set analysis methods: a systematic comparison]]<br />
<br />
''' 2020 '''<br />
* [[Toward a gold standard for benchmarking gene set enrichment analysis]]<br />
<br />
==== Dimension reduction ====<br />
''' 2008 '''</br><br />
* [[On the Relationship Between Feature Selection and Classification Accuracy]]<br />
''' 2015 '''</br><br />
* [[Comparing feature selection methods for highdimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data]]<br />
<br />
=== Imputation methods for missing values ===<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1996 || Schenker || [[Partially parametric techniques for multiple imputation]]<br />
|-<br />
| 1999 || Hastie T || [[Imputing Missing Data for Gene Expression Arrays]]<br />
|-<br />
| 2001 || Troyanskaya || [[Missing value estimation methods for DNA microarrays]]<br />
|-<br />
| 2002 || Engels J || [[Imputation of missing longitudinal data: a comparison of methods]]<br />
|-<br />
| 2003 || Oba || [[A Bayesian missing value estimation method for gene expression profile data]]<br />
|-<br />
| 2005 || Scholz || [[Nonlinear PCA: a missing data approach]]<br />
|-<br />
| 2007 || Stacklies || [[pcaMethods—a bioconductor package providing PCA methods for incomplete data]]<br />
|-<br />
| 2007 || Verboven || [[Sequential imputation for missing values]]<br />
|-<br />
| 2008 || Shaffer GN || [[Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes]]<br />
|-<br />
| 2011 || Templ || [[Iterative stepwise regression imputation using standard and robust methods]]<br />
|-<br />
| 2012 || Hrydziuszko O || [[Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline]]<br />
|-<br />
| 2012 || Stekhoven || [[MissForest—non-parametric missing value imputation for mixed-type data]]<br />
|-<br />
| 2013 || Taylor || [[Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies]]<br />
|-<br />
| 2013 || Waljee || [[Comparison of imputation methods for missing laboratory data in medicine]]<br />
|-<br />
| 2014 || Shah || [[Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study]]<br />
|-<br />
| 2014 || Rodwell || [[Comparison of methods for imputing limited-range variables: a simulation study]]<br />
|-<br />
| 2014 || Morris || [[Tuning multiple imputation by predictive mean matching and local residual draws]]<br />
|-<br />
| 2014 || Doove L || [[Recursive partitioning for missing data imputation in the presence of interaction effects]]<br />
|-<br />
| 2015 || Webb-Robertson BJM || [[Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics]]<br />
|-<br />
| 2016 || Folch-Fortuny A || [[Assessment of maximum likelihood PCA missing data imputation]]<br />
|-<br />
| 2016 || Lazar C || [[Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies]]<br />
|-<br />
| 2016 || Yin X || [[Multiple imputation and analysis for high-dimensional incomplete proteomics data]]<br />
|-<br />
| 2018 || Wei R || [[Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data]]<br />
|-<br />
| 2018 || Poyatos R || [[Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information]]<br />
|-<br />
| 2018 || O'Brien JJ || [[The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments]]<br />
|}<br />
<br />
=== ODE-based Modelling ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2001 || Schenker || [[Ways to Fit a PK Model with Some Data Below the Quantification Limit]]<br />
|-<br />
| 2008 || Balsa-Canto || [[Hybrid optimization method with general switching strategy for parameter estimation]]<br />
|-<br />
| 2011 || Tashkova || [[Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis]]<br />
|-<br />
| 2013 || Raue || [[Lessons Learned from Quantitative Dynamical Modeling in Systems Biology]]<br />
|-<br />
| 2013 || Schenker || [[ODE parameter inference using adaptive gradient matching with Gaussian processes]]<br />
|-<br />
| 2017 || Ballnus || [[Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems]]<br />
|-<br />
| 2017 || Henriques || [[Data-driven reverse engineering of signaling pathways using ensembles of dynamic models]]<br />
|-<br />
| 2017 || Melicher || [[Fast derivatives of likelihood functionals for ODE based models using adjoint-state method]]<br />
|-<br />
| 2017 || Schenker || [[Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy]]<br />
|-<br />
| 2017 || Schenker || [[Performance of objective functions and optimization procedures for parameter estimation in system biology models]]<br />
|-<br />
| 2017 || Schenker || [[Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks]]<br />
|-<br />
| 2018 || Schenker || [[Evaluation of Derivative-Free Optimizers for Parameter Estimation in Systems Biology]]<br />
|-<br />
| 2018 || Schenker || [[Hierarchical optimization for the efficient parametrization of ODE models]]<br />
|-<br />
| 2018 || Schenker || [[Input-dependent structural identifiability of nonlinear systems]]<br />
|-<br />
| 2018 || Schenker || [[Optimization and profile calculation of ODE models using second order adjoint sensitivity analysis]]<br />
|-<br />
| 2019 || Schenker || [[A comparison of methods for quantifying prediction uncertainty in systems biology]]<br />
|-<br />
| 2019 || Schenker || [[Benchmark problems for dynamic modeling of intracellular processes]]<br />
|-<br />
| 2019 || Schenker || [[Benchmarking optimization methods for parameter estimation in large kinetic models]]<br />
|-<br />
| 2019 || Schenker || [[Efficient computation of steady states in large-scale ODE models of biochemical reaction networks]]<br />
|-<br />
| 2019 || Schenker || [[Full observability and estimation of unknown inputs, states and parameters of nonlinear biological models]]<br />
|-<br />
| 2019 || Schenker || [[Mini-batch optimization enables training of ODE models on large-scale datasets]]<br />
|-<br />
| 2019 || Schenker || [[Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach]]<br />
|-<br />
| 2019 || Schenker || [[Parameter estimation in models of biological oscillators: an automated regularised estimation approach]]<br />
|-<br />
| 2019 || Schenker || [[Robust calibration of hierarchical population models for heterogeneous cell populations]]<br />
|-<br />
| 2019 || Schenker || [[Scalable nonlinear programming framework for parameter estimation in dynamic biological system models]]<br />
|-<br />
| 2019 || Schenker || [[Tracking for parameter and state estimation in possibly misspecified partially observed linear Ordinary Differential Equations]]<br />
|-<br />
| 2020 || Schenker || [[An application of Conditional RobustCalibration (CRC) to ordinary differential equations (ODEs) models in computational systems biology: a comparison of two sampling strategies]]<br />
|-<br />
| 2020 || Schenker || [[Efficient parameterization of large-scale dynamic models based on relative measurements]]<br />
|-<br />
| 2020 || Schenker || [[Testing structural identifiability by a simple scaling method]]<br />
|}<br />
<br />
=== Omics Workflows ===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 2008 || Neuweger H || [[MeltDB: a software platform for the analysis and integration of metabolomics experiment data]]<br />
|-<br />
| 2008 || Barla A || [[Machine learning methods for predictive proteomics]]<br />
|-<br />
| 2009 || Xia J || [[MetaboAnalyst: a web server for metabolomic data analysis and interpretation]]<br />
|-<br />
| 2013 || Weisser H || [[An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics]]<br />
|-<br />
| 2014 || Cox J || [[Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ* ]]<br />
|-<br />
| 2015 || || [[ComparingVariant Call Files for Performance Benchmarkingof Next-Generation Sequencing Variant Calling Pipelines]]<br />
|-<br />
| 2016 || Tyanova S || [[The MaxQuant computational platform for mass spectrometry–based shotgun proteomics]]<br />
|-<br />
| 2016 || Röst HL || [[OpenMS: a flexible open-source software platform for mass spectrometry data analysis]]<br />
|-<br />
| 2017 || || [[A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies]]<br />
|-<br />
| 2018 || Välikangas T || [[A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation]]<br />
|-<br />
| 2019 || || [[A Systematic Evaluation of Single CellRNA-Seq Analysis Pipelines]]<br />
|-<br />
| 2019 || || [[Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays]]<br />
|}<br />
<br />
=== Preprocessing high-throughput data===<br />
{| class="wikitable sortable"<br />
|-<br />
! Year || First Author || Title<br />
|-<br />
| 1999 || Perkins DN || [[Probability-based protein identification by searching sequence databases using mass spectrometry data]]<br />
|-<br />
| 2003 || || [[A comparison of normalization methods for high density oligonucleotide array data based on variance and bias ]]<br />
|-<br />
| 2005 || || [[Comparison of Affymetrix GeneChip Expression Measures]]<br />
|-<br />
| 2005 || Meleth S || [[The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins]]<br />
|-<br />
| 2005 || || [[Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays]]<br />
|-<br />
| 2006 || || [[Using RNA sample titrations to assess microarray platform performance and normalization techniques]]<br />
|-<br />
| 2006 || Wang P || [[Normalization regarding non-random missing values in high-throughput mass spectrometry data]]<br />
|-<br />
| 2007 || Carvalho B || [[Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data]]<br />
|-<br />
| 2007 || Cannataro M || [[MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid]]<br />
|-<br />
| 2008 || || [[Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix]]<br />
|-<br />
| 2009 || || [[Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations]]<br />
|-<br />
| 2009 || Mar JC || [[Data-driven normalization strategies for high-throughput quantitative RT-PCR]]<br />
|-<br />
| 2009 || Vakhrushev SY || [[Software platform for high-throughput glycomics]]<br />
|-<br />
| 2010 || || [[Consistency of predictive signature genes and classifiers generated using different microarray platforms]]<br />
|-<br />
| 2010 || || [[Detecting and correcting systematic variation in large-scale RNA sequencing data]]<br />
|-<br />
| 2010 || || [[Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments]]<br />
|-<br />
| 2010 || || [[Normalization of RNA-seq data using factor analysis of control genes or samples]]<br />
|-<br />
| 2011 || || [[Affymetrix GeneChip microarray preprocessing for multivariate analyses]]<br />
|-<br />
| 2012 || || [[A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis]]<br />
|-<br />
| 2014 || || [[Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets]]<br />
|-<br />
| 2014 || Zhou X || [[Prevention, diagnosis and treatment of high-throughput sequencing data pathologies]]<br />
|-<br />
| 2014 || Coble JB || [[Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery]]<br />
|-<br />
| 2015 || Caraus I || [[Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions]]<br />
|-<br />
| 2015 || Tam S || [[Optimization of miRNA-seq data preprocessing]]<br />
|-<br />
| 2015 || Rafiei A || [[Comparison of peak‐picking workflows for untargeted liquid chromatography/high‐resolution mass spectrometry metabolomics data analysis]]<br />
|-<br />
| 2015 || Chawade A || [[Data processing has major impact on the outcome of quantitative label-free LC-MS analysis]]<br />
|-<br />
| 2016 || Yi L || [[Chemometric methods in data processing of mass spectrometry-based metabolomics: A review]]<br />
|-<br />
| 2016 || Tsuji J || [[Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data]]<br />
|-<br />
| 2016 || Li B || [[Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis]]<br />
|-<br />
| 2017 || Li B || [[NOREVA: normalization and evaluation of MS-based metabolomics data]]<br />
|-<br />
| 2018 || Mazoure B || [[Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening]]<br />
|-<br />
| 2018 || Li Z || [[Comprehensive evaluation of untargeted metabolomics data processing software in feature detection, quantification and discriminating marker selection]]<br />
|-<br />
| 2018 || Willforss J || [[NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis]]<br />
|}</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_general_modular_framework_for_gene_set_enrichment_analysis&diff=684A general modular framework for gene set enrichment analysis2020-02-25T15:26:51Z<p>Ckreutz: /* Design for Outcome O4: Significance assessment */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
M Ackermann and K Strimmer, A general modular framework for gene set enrichment analysis, 2009, BMC Bioinformatics, 10:47.<br />
<br />
[https://doi:10.1186/1471-2105-10-4 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses have a modular structure, i.e. they consist of <br />
# gene level statistics <br />
# gene level significance assessment<br />
# gene set statistics<br />
# gene set significance assessment<br />
# statistical conclusion<br />
<br />
Alternatively, steps 1.-3. might be replaced by a single global test.<br />
<br />
In this paper, 261 different variants of gene set enrichment procedures were evaluated based on simulated and experimental data.<br />
<br />
=== Study outcomes ===<br />
==== Outcome O1: Gene level statistics ====<br />
* The choice of the gene-level statistics (t, moderated t, or correlation) does NOT have a great impact <br />
* t statistic, moderated t, and correlation fail to find gene sets that contain up- and downregulated genes<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication. <br />
<br />
==== Outcome O2: Transformation of the gene level statistics ====<br />
* The transformation of the gene level statistic has a substantial impact<br />
* Transformations help to find gene sets that contain up- and downregulated genes<br />
* Combination of square transformation and rank transformation shows the best overall performance<br />
* Binary transformation (i.e. using a cutpoint) and FDRs decrease the performance<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication.<br />
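<br />
A small numerical illustration of why a transformation helps for gene sets with both up- and downregulated genes: with the mean as gene set statistic, signed gene level statistics cancel, whereas their squares do not. This is only a sketch with made-up toy values; the function name <code>gene_set_mean</code> is hypothetical and not taken from the paper.<br />

```python
from statistics import mean


def gene_set_mean(gene_stats, transform=lambda s: s):
    """Mean gene set statistic of (optionally transformed) gene level statistics."""
    return mean(transform(s) for s in gene_stats)


def square(s):
    return s * s


# Toy gene level statistics (assumed values, for illustration only).
mixed_set = [3.0, -3.0, 2.5, -2.5]   # half upregulated, half downregulated
null_set = [0.3, -0.2, 0.1, -0.4]    # no differential expression

raw_mixed = gene_set_mean(mixed_set)             # signed values cancel
squared_mixed = gene_set_mean(mixed_set, square)
squared_null = gene_set_mean(null_set, square)

print(raw_mixed, squared_mixed, squared_null)
```

The untransformed mean of the mixed set is zero and therefore indistinguishable from a null set, while the squared statistics separate the two sets clearly.<br />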
<br />
==== Outcome O3: Gene set statistics ====<br />
* "mean and the maxmean statistic produce ... overall very good results"<br />
* "median and the Wilcoxon test are primarily advantageous if the competitive null hypothesis is tested, or if there are many outliers in the data"<br />
* "conditional FDR ... vary strongly with the choice of the gene-level statistic, transformation and permutation approach."<br />
* The ES score showed a rather weak performance<br />
<br />
Outcomes O3 are presented as Table 3 in the original publication. <br />
<br />
==== Outcome O4: Significance assessment ====<br />
* The parametric approach has the best power but is overoptimistic if the assumption of statistical independence is violated<br />
* Permutation seems to slightly outperform resampling<br />
* "restandardization procedure performs very similar to resampling"<br />
<br />
Outcomes O4 are presented as Table 4 in the original publication. <br />
<br />
==== Outcome O5: Global approaches ====<br />
* The performance of the globaltest procedure "is not better than that of the less sophisticated univariate methods" but "is computationally a little bit faster".<br />
* For Hotelling's T2-test:<br />
** an "overall poor" performance was obtained<br />
** "the uncorrelated sets are found with the same reliability as with univariate approaches. However, ... the sets with correlation ... are hardly detected."<br />
** shows "improved performance with sample label permutation as opposed to gene sampling."<br />
<br />
Outcome O5 is presented in Table 5 for the global test and in Table 6 for Hotelling's T2 in the original publication.<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* 100 data sets were simulated<br />
* The simulated data sets have 600 features (genes) and 20 samples (10 vs. 10)<br />
* The data was simulated with normally distributed noise with variance equal to one<br />
* 520 genes were considered as uninformative (delta=0, rho=0)<br />
* Altogether, nine different simulation data sets were generated that consist of the following combinations:<br />
** Gene sets with different levels of differential expression (delta \in {0, 0.75, 1, -1}) were simulated<br />
** Gene sets with varying levels of intra-group correlation (rho \in {0, 0.6, -0.6}) were simulated<br />
** Gene sets that contain regulated and unregulated genes (half/half) were generated as well as gene sets that contain up- and downregulated genes.<br />
* "The gene set statistic ES was not combined with a binary transformation since the latter does not allow a sensible ranking of the genes." <br />
* In total <br />
** 3 gene level statistics × <br />
** 5 transformations × <br />
** 6 gene set statistics × <br />
** 3 significance assessments <br />
** minus 9 insensible combinations<br />
** = 261 (in total) variants of gene set analyses were considered<br />
* The authors count how frequently the p-values that assess significance at the gene-set level are below a significance level 0.05<br />
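<br />
The simulation setting listed above can be sketched as follows. This is not the authors' code: the shared-factor construction for correlated genes, the set size of 20 genes, and the function name <code>simulate_gene_set</code> are assumptions made for illustration, and the sketch only covers non-negative rho.<br />

```python
import math
import random


def simulate_gene_set(n_genes, delta, rho, n_per_group=10, rng=None):
    """Simulate genes (rows) x samples (columns) with unit-variance normal noise.

    The first n_per_group columns form group 1; the remaining columns form
    group 2 and are shifted by delta. A shared per-sample factor induces a
    pairwise correlation rho between genes (assumed construction, rho >= 0).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("this sketch only supports 0 <= rho < 1")
    rng = rng or random.Random()
    n_samples = 2 * n_per_group
    shared = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    rows = []
    for _ in range(n_genes):
        row = []
        for j in range(n_samples):
            noise = (math.sqrt(rho) * shared[j]
                     + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            shift = delta if j >= n_per_group else 0.0
            row.append(noise + shift)
        rows.append(row)
    return rows


rng = random.Random(0)
uninformative = simulate_gene_set(520, delta=0.0, rho=0.0, rng=rng)
# One example gene set; the size of 20 genes is an assumption.
example_set = simulate_gene_set(20, delta=1.0, rho=0.6, rng=rng)
data = uninformative + example_set

print(len(data), len(data[0]))
```

Only one informative gene set is generated here, so the matrix has fewer than the 600 genes of the original design; further sets would be appended in the same way.<br />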
<br />
<br />
==== Design for Outcome O1: Gene level statistics ====<br />
* The authors consider the impact of the selected approach for module 1 (see summary above)<br />
* Three approaches were considered: t, moderated t and correlation<br />
* These approaches were evaluated for five different transformations (see O2)<br />
<br />
* Multiple other approaches <br />
* The authors already provide the important hint that the dependency on the gene level test statistic might be more relevant for smaller sample size (e.g. 3 vs 3) <br />
<br />
==== Design for Outcome O2: Transformation of the gene level statistics ====<br />
* The outcome was generated for five different transformations (and three gene level statistics)<br />
<br />
==== Design for Outcome O3: Gene set statistics ====<br />
* Six gene set statistics were investigated:<br />
** mean<br />
** maxmean<br />
** median<br />
** ES<br />
** conditional FDR<br />
** Wilcoxon<br />
* These analyses were performed for the moderated t statistic (gene level) and by using the quadratic transformation. For significance assessment, resampling was applied.<br />
<br />
==== Design for Outcome O4: Significance assessment ====<br />
* Four different approaches for assessing significance at the gene set level were evaluated: <br />
** parametric<br />
** resampling<br />
** permutation<br />
** restandardization<br />
* This analysis was performed by using the moderated t as the gene level statistic in combination with a quadratic transformation and the mean as the gene set statistic<br />
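<br />
As an illustration of one of these approaches, sample label permutation for a gene set can be sketched as below. This is a simplified sketch, not the procedure of the paper: it uses the ordinary two-sample t statistic in place of the moderated t (a deliberate swap, since the moderated t requires shrinkage across genes), squares it, and takes the mean as the gene set statistic. All function names and the toy data are hypothetical.<br />

```python
import math
import random
from statistics import mean, variance


def t_statistic(values, labels):
    """Pooled-variance two-sample t statistic for one gene."""
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    pooled = (((len(g0) - 1) * variance(g0) + (len(g1) - 1) * variance(g1))
              / (len(g0) + len(g1) - 2))
    return (mean(g1) - mean(g0)) / math.sqrt(pooled * (1 / len(g0) + 1 / len(g1)))


def set_statistic(gene_set, labels):
    """Mean of the squared gene level statistics."""
    return mean(t_statistic(row, labels) ** 2 for row in gene_set)


def permutation_p_value(gene_set, labels, n_perm=200, rng=None):
    """P-value from permuting sample labels (not genes)."""
    rng = rng or random.Random()
    observed = set_statistic(gene_set, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(gene_set, shuffled) >= observed:
            hits += 1
    # The +1 counts the observed labelling itself and avoids a p-value of zero.
    return (hits + 1) / (n_perm + 1)


rng = random.Random(1)
labels = [0] * 10 + [1] * 10
# Toy gene set: 5 genes, group 2 shifted by 3 standard deviations (assumed values).
toy_set = [[rng.gauss(0.0, 1.0) + (3.0 if lab == 1 else 0.0) for lab in labels]
           for _ in range(5)]
p_value = permutation_p_value(toy_set, labels, n_perm=200, rng=rng)
print(p_value)
```

Permuting sample labels keeps the correlation between genes intact, which is the reason why it does not rely on the independence assumption of the parametric approach.<br />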
<br />
==== Design for Outcome O5: Global approaches ====<br />
* globaltest and Hotelling's T2-test with a shrinkage covariance matrix were considered<br />
<br />
=== Further comments and aspects ===<br />
* Simulation is NOT based on characteristics or gene sets derived from real data <br />
* The paper provides very comprehensive outcomes in terms of combinations of approaches<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_general_modular_framework_for_gene_set_enrichment_analysis&diff=683A general modular framework for gene set enrichment analysis2020-02-25T15:26:42Z<p>Ckreutz: /* Further comments and aspects */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
M Ackermann and K Strimmer, A general modular framework for gene set enrichment analysis, 2009, BMC Bioinformatics, 10:47.<br />
<br />
[https://doi:10.1186/1471-2105-10-4 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses have a modular structure, i.e. they consist of <br />
# gene level statistics <br />
# gene level significance assessment<br />
# gene set statistics<br />
# gene set significance assessment<br />
# statistical conclusion<br />
<br />
Alternatively, steps 1.-3. might be replaced by a single global test.<br />
<br />
In this paper, 261 different variants of gene set enrichment procedures were evaluated based on simulated and experimental data.<br />
<br />
=== Study outcomes ===<br />
==== Outcome O1: Gene level statistics ====<br />
* The choice of the gene-level statistics (t, moderated t, or correlation) does NOT have a great impact <br />
* t statistic, moderated t, and correlation fail to find gene sets that contain up- and downregulated genes<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication. <br />
<br />
==== Outcome O2: Transformation of the gene level statistics ====<br />
* The transformation of the gene level statistic has a substantial impact<br />
* Transformations help to find gene sets that contain up- and downregulated genes<br />
* Combination of square transformation and rank transformation shows the best overall performance<br />
* Binary transformation (i.e. using a cutpoint) and FDRs decrease the performance<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication.<br />
<br />
==== Outcome O3: Gene set statistics ====<br />
* "mean and the maxmean statistic produce ... overall very good results"<br />
* "median and the Wilcoxon test are primarily advantageous if the competitive null hypothesis is tested, or if there are many outliers in the data"<br />
* "conditional FDR ... vary strongly with the choice of the gene-level statistic, transformation and permutation approach."<br />
* The ES score showed a rather weak performance<br />
<br />
Outcome O3 is presented as Table 3 in the original publication. <br />
<br />
==== Outcome O4: Significance assessment ====<br />
* The parametric approach has the best power but is overoptimistic if the assumption of statistical independence is violated<br />
* Permutation seems to slightly outperform resampling<br />
* "restandardization procedure performs very similar to resampling"<br />
<br />
Outcome O4 is presented as Table 4 in the original publication. <br />
<br />
==== Outcome O5: Global approaches ====<br />
* The performance of the globaltest procedure "is not better than that of the less sophisticated univariate methods" but "is computationally a little bit faster".<br />
* For Hotelling's T2-test:<br />
** an "overall poor" performance was obtained<br />
** "the uncorrelated sets are found with the same reliability as with univariate approaches. However, ... the sets with correlation ... are hardly detected."<br />
** shows "improved performance with sample label permutation as opposed to gene sampling."<br />
<br />
Outcome O5 is presented in Table 5 for the global test and in Table 6 for Hotelling's T2 in the original publication.<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* 100 data sets were simulated<br />
* The simulated data sets have 600 features (genes) and 20 samples (10 vs. 10)<br />
* The data were simulated with normally distributed noise with variance equal to one<br />
* 520 genes were considered uninformative (delta=0, rho=0)<br />
* Altogether, nine different simulation data sets were generated that consist of the following combinations:<br />
** Gene sets with different levels of differential expression (delta \in {0, 0.75, 1, -1}) were simulated<br />
** Gene sets with varying levels of intra-group correlation (rho \in {0, 0.6, -0.6}) were simulated<br />
** Gene sets that contain regulated and unregulated genes (half/half) were generated as well as gene sets that contain up- and downregulated genes.<br />
* "The gene set statistic ES was not combined with a binary transformation since the latter does not allow a sensible ranking of the genes." <br />
* In total <br />
** 3 gene level statistics × <br />
** 5 transformations × <br />
** 6 gene set statistics × <br />
** 3 significance assessments <br />
** minus 9 non-sensible combinations (ES with the binary transformation: 3 gene level statistics × 3 significance assessments)<br />
** = 261 (in total) variants of gene set analyses were considered<br />
* The authors count how frequently the p-values that assess significance at the gene-set level are below a significance level 0.05<br />
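<br />
The count of 261 variants can be checked by enumeration. The sketch below assumes that the nine excluded combinations are exactly those pairing ES with the binary transformation, as the quotation above suggests. The labels for the transformations other than "binary" and for the three significance assessments are placeholders, since this page does not name them unambiguously.

```python
from itertools import product

gene_level_statistics = ["t", "moderated t", "correlation"]
# Only "binary" matters for the exclusion; the other labels are placeholders.
transformations = ["square", "rank", "binary", "other_a", "other_b"]
gene_set_statistics = ["mean", "maxmean", "median", "ES", "conditional FDR", "Wilcoxon"]
# Placeholder labels for the three significance assessments in the count.
significance_assessments = ["assessment_1", "assessment_2", "assessment_3"]

all_combinations = list(
    product(
        gene_level_statistics,
        transformations,
        gene_set_statistics,
        significance_assessments,
    )
)

# Drop ES combined with the binary transformation (3 x 3 = 9 combinations).
variants = [
    combo
    for combo in all_combinations
    if not (combo[1] == "binary" and combo[2] == "ES")
]

print(len(all_combinations), len(variants))
```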
<br />
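A simulation of this kind could look like the following sketch. It assumes that delta shifts the mean of the first group and that correlation is induced through a shared per-sample factor; both are illustrative choices, not details taken from the paper. The gene set size of 20 is hypothetical, and negative rho is not handled.

```python
import random


def simulate_gene_set(n_genes, delta, rho, n_per_group=10, seed=0):
    """Return (group 1, group 2), each a list of per-gene sample values.

    Group 1 is shifted by delta. Genes share one factor per sample, so any
    two genes have correlation rho and each value has unit variance.
    """
    if not 0 <= rho < 1:
        raise ValueError("this sketch only handles 0 <= rho < 1")
    rng = random.Random(seed)
    groups = []
    for shift in (delta, 0.0):
        # One shared standard-normal factor per sample induces the correlation.
        shared = [rng.gauss(0, 1) for _ in range(n_per_group)]
        genes = [
            [
                shift + rho ** 0.5 * shared[s] + (1 - rho) ** 0.5 * rng.gauss(0, 1)
                for s in range(n_per_group)
            ]
            for _ in range(n_genes)
        ]
        groups.append(genes)
    return groups[0], groups[1]


# 520 uninformative genes plus one hypothetical 20-gene set (delta=1, rho=0.6).
background_1, background_2 = simulate_gene_set(520, delta=0.0, rho=0.0, seed=1)
set_1, set_2 = simulate_gene_set(20, delta=1.0, rho=0.6, seed=2)
```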
<br />
==== Design for Outcome O1: Gene level statistics ====<br />
* The authors consider the impact of the approach selected for module 1 (see summary above)<br />
* Three approaches were considered: t, moderated t and correlation<br />
* These approaches were evaluated for five different transformations (see O2)<br />
<br />
* Multiple other approaches <br />
* The authors already provide the important hint that the dependency on the gene level test statistic might be more relevant for smaller sample sizes (e.g. 3 vs. 3) <br />
<br />
==== Design for Outcome O2: Transformation of the gene level statistics ====<br />
* The outcome was generated for five different transformations (and three gene level statistics)<br />
<br />
==== Design for Outcome O3: Gene set statistics ====<br />
* Six gene set statistics were investigated:<br />
** mean<br />
** maxmean<br />
** median<br />
** ES<br />
** conditional FDR<br />
** Wilcoxon<br />
* These analyses were performed for the moderated t statistic (gene level) using the quadratic transformation. For significance assessment, resampling was applied.<br />
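<br />
Three of the listed gene set statistics are simple enough to sketch directly. The maxmean definition below follows the commonly used form (mean of the positive parts versus mean of the negative parts, both averaged over all genes in the set) and returns the larger value without sign, which is an assumption about the convention. Function names are hypothetical.

```python
import statistics


def mean_statistic(scores):
    """Average of the (transformed) gene-level scores in the set."""
    return statistics.mean(scores)


def median_statistic(scores):
    """Median of the gene-level scores; less sensitive to outliers."""
    return statistics.median(scores)


def maxmean_statistic(scores):
    """Larger of the mean positive part and the mean negative part."""
    n = len(scores)
    positive_part = sum(max(s, 0.0) for s in scores) / n
    negative_part = sum(max(-s, 0.0) for s in scores) / n
    return max(positive_part, negative_part)


example_scores = [2.0, -1.0, 1.0, -4.0]
print(mean_statistic(example_scores))
print(median_statistic(example_scores))
print(maxmean_statistic(example_scores))
```

For untransformed scores of mixed sign, the mean can cancel out while maxmean does not. After a square transformation all scores are non-negative and the two coincide.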
<br />
==== Design for Outcome O4: Significance assessment ====<br />
* Four different approaches for assessing significance at the gene set level were evaluated: <br />
** parametric<br />
** resampling<br />
** permutation<br />
** restandardization<br />
* This analysis was performed by using the moderated t as the gene level statistic in combination with a quadratic transformation and the mean as the gene set statistic<br />
<br />
<br />
==== Design for Outcome O5: Global approaches ====<br />
* globaltest and Hotelling's T2-test with a shrinkage covariance matrix were considered<br />
<br />
=== Further comments and aspects ===<br />
* Simulation is NOT based on characteristics or gene sets derived from real data <br />
* The paper provides very comprehensive outcomes in terms of combinations of approaches<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_general_modular_framework_for_gene_set_enrichment_analysis&diff=682A general modular framework for gene set enrichment analysis2020-02-25T15:26:30Z<p>Ckreutz: /* References */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
M Ackermann and K Strimmer, A general modular framework for gene set enrichment analysis, 2009, BMC Bioinformatics, 10:47.<br />
<br />
[https://doi.org/10.1186/1471-2105-10-47 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses have a modular structure, i.e. they consist of <br />
# gene level statistics <br />
# gene level significance assessment<br />
# gene set statistics<br />
# gene set significance assessment<br />
# statistical conclusion<br />
<br />
Alternatively, steps 1.-3. might be replaced by a single global test.<br />
<br />
In this paper, 261 different variants of gene set enrichment procedures were evaluated based on simulated and experimental data.<br />
<br />
=== Study outcomes ===<br />
==== Outcome O1: Gene level statistics ====<br />
* The choice of the gene-level statistics (t, moderated t, or correlation) does NOT have a great impact <br />
* t statistic, moderated t, and correlation fail to find gene sets that contain up- and downregulated genes<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication. <br />
<br />
==== Outcome O2: Transformation of the gene level statistics ====<br />
* The transformation of the gene level statistic has a substantial impact<br />
* Transformations help to find gene sets that contain up- and downregulated genes<br />
* Combination of square transformation and rank transformation shows the best overall performance<br />
* Binary transformation (i.e. using a cutpoint) and FDRs decrease the performance<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication.<br />
<br />
==== Outcome O3: Gene set statistics ====<br />
* "mean and the maxmean statistic produce ... overall very good results"<br />
* "median and the Wilcoxon test are primarily advantageous if the competitive null hypothesis is tested, or if there are many outliers in the data"<br />
* "conditional FDR ... vary strongly with the choice of the gene-level statistic, transformation and permutation approach."<br />
* The ES score showed a rather weak performance<br />
<br />
Outcome O3 is presented as Table 3 in the original publication. <br />
<br />
==== Outcome O4: Significance assessment ====<br />
* The parametric approach has the best power but is overoptimistic if the assumption of statistical independence is violated<br />
* Permutation seems to slightly outperform resampling<br />
* "restandardization procedure performs very similar to resampling"<br />
<br />
Outcome O4 is presented as Table 4 in the original publication. <br />
<br />
==== Outcome O5: Global approaches ====<br />
* The performance of the globaltest procedure "is not better than that of the less sophisticated univariate methods" but "is computationally a little bit faster".<br />
* For Hotelling's T2-test:<br />
** an "overall poor" performance was obtained<br />
** "the uncorrelated sets are found with the same reliability as with univariate approaches. However, ... the sets with correlation ... are hardly detected."<br />
** shows "improved performance with sample label permutation as opposed to gene sampling."<br />
<br />
Outcome O5 is presented in Table 5 for the global test and in Table 6 for Hotelling's T2 in the original publication.<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* 100 data sets were simulated<br />
* The simulated data sets have 600 features (genes) and 20 samples (10 vs. 10)<br />
* The data were simulated with normally distributed noise with variance equal to one<br />
* 520 genes were considered uninformative (delta=0, rho=0)<br />
* Altogether, nine different simulation data sets were generated that consist of the following combinations:<br />
** Gene sets with different levels of differential expression (delta \in {0, 0.75, 1, -1}) were simulated<br />
** Gene sets with varying levels of intra-group correlation (rho \in {0, 0.6, -0.6}) were simulated<br />
** Gene sets that contain regulated and unregulated genes (half/half) were generated as well as gene sets that contain up- and downregulated genes.<br />
* "The gene set statistic ES was not combined with a binary transformation since the latter does not allow a sensible ranking of the genes." <br />
* In total <br />
** 3 gene level statistics × <br />
** 5 transformations × <br />
** 6 gene set statistics × <br />
** 3 significance assessments <br />
** minus 9 non-sensible combinations (ES with the binary transformation: 3 gene level statistics × 3 significance assessments)<br />
** = 261 (in total) variants of gene set analyses were considered<br />
* The authors count how frequently the p-values that assess significance at the gene-set level are below a significance level 0.05<br />
<br />
<br />
==== Design for Outcome O1: Gene level statistics ====<br />
* The authors consider the impact of the approach selected for module 1 (see summary above)<br />
* Three approaches were considered: t, moderated t and correlation<br />
* These approaches were evaluated for five different transformations (see O2)<br />
<br />
* Multiple other approaches <br />
* The authors already provide the important hint that the dependency on the gene level test statistic might be more relevant for smaller sample sizes (e.g. 3 vs. 3) <br />
<br />
==== Design for Outcome O2: Transformation of the gene level statistics ====<br />
* The outcome was generated for five different transformations (and three gene level statistics)<br />
<br />
==== Design for Outcome O3: Gene set statistics ====<br />
* Six gene set statistics were investigated:<br />
** mean<br />
** maxmean<br />
** median<br />
** ES<br />
** conditional FDR<br />
** Wilcoxon<br />
* These analyses were performed for the moderated t statistic (gene level) using the quadratic transformation. For significance assessment, resampling was applied.<br />
<br />
==== Design for Outcome O4: Significance assessment ====<br />
* Four different approaches for assessing significance at the gene set level were evaluated: <br />
** parametric<br />
** resampling<br />
** permutation<br />
** restandardization<br />
* This analysis was performed by using the moderated t as the gene level statistic in combination with a quadratic transformation and the mean as the gene set statistic<br />
<br />
<br />
==== Design for Outcome O5: Global approaches ====<br />
* globaltest and Hotelling's T2-test with a shrinkage covariance matrix were considered<br />
<br />
=== Further comments and aspects ===<br />
* Simulation is NOT based on characteristics or gene sets derived from real data <br />
* The paper provides very comprehensive outcomes in terms of combinations of approaches<br />
<br />
<br />
=== References ===</div>Ckreutzhttps://www.benchmarking.uni-freiburg.de/index.php?title=A_general_modular_framework_for_gene_set_enrichment_analysis&diff=681A general modular framework for gene set enrichment analysis2020-02-25T15:26:13Z<p>Ckreutz: /* Outcome O5: Global approaches */</p>
<hr />
<div>__NUMBEREDHEADINGS__<br />
=== Citation ===<br />
M Ackermann and K Strimmer, A general modular framework for gene set enrichment analysis, 2009, BMC Bioinformatics, 10:47.<br />
<br />
[https://doi.org/10.1186/1471-2105-10-47 Permanent link to the paper]<br />
<br />
=== Summary ===<br />
Gene set analyses have a modular structure, i.e. they consist of <br />
# gene level statistics <br />
# gene level significance assessment<br />
# gene set statistics<br />
# gene set significance assessment<br />
# statistical conclusion<br />
<br />
Alternatively, steps 1.-3. might be replaced by a single global test.<br />
<br />
In this paper, 261 different variants of gene set enrichment procedures were evaluated based on simulated and experimental data.<br />
<br />
=== Study outcomes ===<br />
==== Outcome O1: Gene level statistics ====<br />
* The choice of the gene-level statistics (t, moderated t, or correlation) does NOT have a great impact <br />
* t statistic, moderated t, and correlation fail to find gene sets that contain up- and downregulated genes<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication. <br />
<br />
==== Outcome O2: Transformation of the gene level statistics ====<br />
* The transformation of the gene level statistic has a substantial impact<br />
* Transformations help to find gene sets that contain up- and downregulated genes<br />
* Combination of square transformation and rank transformation shows the best overall performance<br />
* Binary transformation (i.e. using a cutpoint) and FDRs decrease the performance<br />
<br />
Outcomes O1 and O2 are presented as Table 2 in the original publication.<br />
<br />
==== Outcome O3: Gene set statistics ====<br />
* "mean and the maxmean statistic produce ... overall very good results"<br />
* "median and the Wilcoxon test are primarily advantageous if the competitive null hypothesis is tested, or if there are many outliers in the data"<br />
* "conditional FDR ... vary strongly with the choice of the gene-level statistic, transformation and permutation approach."<br />
* The ES score showed a rather weak performance<br />
<br />
Outcome O3 is presented as Table 3 in the original publication. <br />
<br />
==== Outcome O4: Significance assessment ====<br />
* The parametric approach has the best power but is overoptimistic if the assumption of statistical independence is violated<br />
* Permutation seems to slightly outperform resampling<br />
* "restandardization procedure performs very similar to resampling"<br />
<br />
Outcome O4 is presented as Table 4 in the original publication. <br />
<br />
==== Outcome O5: Global approaches ====<br />
* The performance of the globaltest procedure "is not better than that of the less sophisticated univariate methods" but "is computationally a little bit faster".<br />
* For Hotelling's T2-test:<br />
** an "overall poor" performance was obtained<br />
** "the uncorrelated sets are found with the same reliability as with univariate approaches. However, ... the sets with correlation ... are hardly detected."<br />
** shows "improved performance with sample label permutation as opposed to gene sampling."<br />
<br />
Outcome O5 is presented in Table 5 for the global test and in Table 6 for Hotelling's T2 in the original publication.<br />
<br />
==== Further outcomes ====<br />
<br />
=== Study design and evidence level ===<br />
==== General aspects ====<br />
* 100 data sets were simulated<br />
* The simulated data sets have 600 features (genes) and 20 samples (10 vs. 10)<br />
* The data were simulated with normally distributed noise with variance equal to one<br />
* 520 genes were considered uninformative (delta=0, rho=0)<br />
* Altogether, nine different simulation data sets were generated that consist of the following combinations:<br />
** Gene sets with different levels of differential expression (delta \in {0, 0.75, 1, -1}) were simulated<br />
** Gene sets with varying levels of intra-group correlation (rho \in {0, 0.6, -0.6}) were simulated<br />
** Gene sets that contain regulated and unregulated genes (half/half) were generated as well as gene sets that contain up- and downregulated genes.<br />
* "The gene set statistic ES was not combined with a binary transformation since the latter does not allow a sensible ranking of the genes." <br />
* In total <br />
** 3 gene level statistics × <br />
** 5 transformations × <br />
** 6 gene set statistics × <br />
** 3 significance assessments <br />
** minus 9 non-sensible combinations (ES with the binary transformation: 3 gene level statistics × 3 significance assessments)<br />
** = 261 (in total) variants of gene set analyses were considered<br />
* The authors count how frequently the p-values that assess significance at the gene-set level are below a significance level 0.05<br />
<br />
<br />
==== Design for Outcome O1: Gene level statistics ====<br />
* The authors consider the impact of the approach selected for module 1 (see summary above)<br />
* Three approaches were considered: t, moderated t and correlation<br />
* These approaches were evaluated for five different transformations (see O2)<br />
<br />
* Multiple other approaches <br />
* The authors already provide the important hint that the dependency on the gene level test statistic might be more relevant for smaller sample sizes (e.g. 3 vs. 3) <br />
<br />
==== Design for Outcome O2: Transformation of the gene level statistics ====<br />
* The outcome was generated for five different transformations (and three gene level statistics)<br />
<br />
==== Design for Outcome O3: Gene set statistics ====<br />
* Six gene set statistics were investigated:<br />
** mean<br />
** maxmean<br />
** median<br />
** ES<br />
** conditional FDR<br />
** Wilcoxon<br />
* These analyses were performed for the moderated t statistic (gene level) using the quadratic transformation. For significance assessment, resampling was applied.<br />
<br />
==== Design for Outcome O4: Significance assessment ====<br />
* Four different approaches for assessing significance at the gene set level were evaluated: <br />
** parametric<br />
** resampling<br />
** permutation<br />
** restandardization<br />
* This analysis was performed by using the moderated t as the gene level statistic in combination with a quadratic transformation and the mean as the gene set statistic<br />
<br />
<br />
==== Design for Outcome O5: Global approaches ====<br />
* globaltest and Hotelling's T2-test with a shrinkage covariance matrix were considered<br />
<br />
=== Further comments and aspects ===<br />
* Simulation is NOT based on characteristics or gene sets derived from real data <br />
* The paper provides very comprehensive outcomes in terms of combinations of approaches<br />
<br />
<br />
=== References ===<br />
The list of cited or related literature is placed here.</div>Ckreutz