Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics.
Webb-Robertson, B.-J. M.; Wiberg, H. K.; Matzke, M. M.; Brown, J. N.; Wang, J.; McDermott, J. E.; Smith, R. D.; Rodland, K. D.; Metz, T. O.; Pounds, J. G.; Waters, K. M.; et al. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. J. Proteome Res. 2015, 14 (5), 1993−2001. DOI
Evaluation of performance and caveats of 9 imputation algorithms applied to an LC-MS data set.
List the paper results concerning method comparison and benchmarking:
Most imputation methods perform well, but no single algorithm or imputation strategy (single-value, local, global) outperforms the others; in some cases, no imputation at all gives better results in the subsequent classification analysis.
Local similarity-based approaches, such as least-squares adaptive (LSA) or regularized expectation maximization (REM), are in general the most accurate and robust methods.
With left-censored data, the number of missing values depends strongly on peptide intensity (Figure 1).
The 'best' imputation method depends strongly on the data and on the goal of the downstream analysis, so generally advantageous methods are hard to define.
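The link between left-censoring and peptide intensity can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reproduction of Figure 1 of the paper: the detection limit `lod`, the log2 abundance distribution, and the bin edges are all hypothetical values chosen for illustration.

```python
import random

def simulate_left_censored(n_peptides=2000, n_samples=10, lod=20.0, seed=0):
    """Return (true_mean, missing_fraction) for each simulated peptide."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_peptides):
        true_mean = rng.gauss(22.0, 2.0)  # hypothetical log2 abundance
        values = [rng.gauss(true_mean, 1.0) for _ in range(n_samples)]
        # Left-censoring: values below the detection limit are not observed.
        n_missing = sum(1 for v in values if v < lod)
        results.append((true_mean, n_missing / n_samples))
    return results

def missing_fraction_by_bin(results, edges):
    """Average missing fraction of the peptides within each intensity bin."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        fracs = [f for m, f in results if lo <= m < hi]
        out.append(sum(fracs) / len(fracs) if fracs else float("nan"))
    return out

results = simulate_left_censored()
edges = [16.0, 20.0, 22.0, 24.0, 28.0]
by_bin = missing_fraction_by_bin(results, edges)
for lo, hi, frac in zip(edges[:-1], edges[1:], by_bin):
    print(f"log2 intensity {lo:.0f}-{hi:.0f}: {frac:.2f} missing")
```

In this toy setting the missing fraction falls as the mean intensity rises, which is the pattern that makes missingness informative rather than random for low-abundance peptides.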
Study design and evidence level
Three single-value approaches (LOD1, LOD2, RTI), five local similarity approaches (KNN, LLS, LSA, REM, MBI), and two global-structure approaches (PPCA, BPCA) were evaluated, which allows comparison and discussion of the different imputation strategies. They were applied to three real datasets of different types and species, representing a broad range of biological applications.
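To make the difference between a single-value and a local similarity strategy concrete, the following sketch compares the two on synthetic data. It is an illustration under stated assumptions, not the paper's benchmark: the row-minimum fill is a simplified stand-in for an LOD-style substitution (the paper's exact LOD1/LOD2 definitions are not reproduced here), the KNN variant is a plain neighbour average, and values are masked completely at random, which tends to favour local methods. All function names are hypothetical.

```python
import math
import random

def make_data(n_groups=5, per_group=20, n_samples=8, seed=1):
    """Toy log2 intensities: peptides in a group share a sample profile."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_groups):
        profile = [22.0 + rng.gauss(0.0, 1.5) for _ in range(n_samples)]
        for _ in range(per_group):
            rows.append([p + rng.gauss(0.0, 0.3) for p in profile])
    return rows

def mask_random(rows, frac=0.15, seed=2):
    """Replace a random fraction of entries with None (missing)."""
    rng = random.Random(seed)
    return [[None if rng.random() < frac else v for v in row] for row in rows]

def impute_row_min(masked):
    """Single-value approach: fill each gap with the row's observed minimum."""
    global_min = min(v for row in masked for v in row if v is not None)
    out = []
    for row in masked:
        observed = [v for v in row if v is not None]
        fill = min(observed) if observed else global_min
        out.append([fill if v is None else v for v in row])
    return out

def distance(a, b):
    """Root-mean-square difference over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def impute_knn(masked, k=5):
    """Local approach: average column j over the k most similar rows."""
    out = [list(row) for row in masked]
    for i, row in enumerate(masked):
        for j, v in enumerate(row):
            if v is not None:
                continue
            # Candidate neighbours must have column j observed.
            cands = [(distance(row, other), other[j])
                     for m, other in enumerate(masked)
                     if m != i and other[j] is not None]
            cands.sort(key=lambda c: c[0])
            nearest = [val for _, val in cands[:k]]
            out[i][j] = sum(nearest) / len(nearest)
    return out

def rmse(truth, masked, imputed):
    """Root-mean-square error over the entries that were masked out."""
    sq = [(t - x) ** 2
          for t_row, m_row, x_row in zip(truth, masked, imputed)
          for t, m, x in zip(t_row, m_row, x_row) if m is None]
    return math.sqrt(sum(sq) / len(sq))

truth = make_data()
masked = mask_random(truth)
rmse_min = rmse(truth, masked, impute_row_min(masked))
rmse_knn = rmse(truth, masked, impute_knn(masked))
print(f"row-minimum RMSE: {rmse_min:.2f}")
print(f"KNN RMSE:         {rmse_knn:.2f}")
```

Because the masking here is random rather than intensity-dependent, the comparison says nothing about left-censored data, where a low-value substitution may be the more appropriate choice.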
Further comments and aspects
The list of cited or related literature is placed here.