Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies

Latest revision as of 11:50, 25 February 2020

Citation

Lazar, C., Gatto, L., Ferro, M., Bruley, C., and Burger, T. (2016): Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. Journal of Proteome Research, 15:1116–1125.

Permanent link to the article: https://doi.org/10.1021/acs.jproteome.5b00981


Summary

In this paper, five imputation algorithms are evaluated as a function of the number of missing values and of how random the missingness is. The aim is to derive practical guidelines for choosing an imputation method that accounts for the specific type of missingness mechanism.

Study outcomes

The following outcomes can be drawn from the paper:

Outcome O1

Imputation performs better with fewer missing values.
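
As a rough illustration of how such a performance statement can be checked on simulated data, the sketch below scores an imputation by the root mean square error (RMSE) on the entries that were artificially removed. The choice of RMSE and the function name `imputation_rmse` are assumptions made for this example; the paper's exact error measure is not restated on this page.

```python
import numpy as np

def imputation_rmse(truth, imputed, missing_mask):
    """RMSE between true and imputed values, restricted to the entries
    that were artificially set to missing (hypothetical helper)."""
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    mask = np.asarray(missing_mask, dtype=bool)
    diff = imputed[mask] - truth[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy example: one entry was removed and imputed as 6 instead of 4.
truth = [[1.0, 2.0], [3.0, 4.0]]
imputed = [[1.0, 2.0], [3.0, 6.0]]
mask = [[False, False], [False, True]]
error = imputation_rmse(truth, imputed, mask)
```

Only the masked entries enter the score, so observed values that an imputation method leaves untouched do not dilute the error.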

Outcome O2

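Imputation methods can be divided into methods devoted to values missing not at random (MNAR) and methods devoted to values missing completely at random (MCAR) (see Figures 2 and 3). Depending on the MNAR ratio of a specific data set, one should prefer either an MNAR-devoted or an MCAR-devoted method (see Figure 4).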

Outcome O3

MNAR-devoted methods perform worse as the number of missing values increases and as a larger share of them is missing at random (see Figures 2 and 3).

MCAR-devoted methods perform worse as the number of missing values increases and as a larger share of them is missing not at random (see Figures 2 and 3).

Outcome O4

On average, MCAR-devoted methods outperform MNAR-devoted methods. MCAR-devoted methods are therefore recommended if the missingness mechanism is not known.

Outcome O5

Peptide-level imputation is more accurate than protein-level imputation (Figure 6).

Study design and evidence level

The results are based on simulated as well as real data, and imputation was applied at the protein level as well as at the peptide level, which makes the results sound and reliable.

Missing values were introduced at 11 rates of missing values (MV), each combined with 11 ratios of MNAR values. This results in 121 simulated data sets, which give a broad representation of different missingness mechanisms.
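
The 11 × 11 design can be sketched as follows. This is a simplified illustration, not the paper's simulation procedure: the concrete rate values (`mv_rates`, `mnar_ratios`), the helper name `add_missing_values`, and the crude MNAR mechanism (removing the lowest intensities outright) are all assumptions made for this example.

```python
import numpy as np

def add_missing_values(x, mv_rate, mnar_ratio, rng):
    """Return a copy of x in which a fraction mv_rate of entries is NaN.

    A share mnar_ratio of the removed entries is taken from the lowest
    intensities (a crude stand-in for left-censoring, i.e. MNAR); the
    rest is drawn uniformly at random from the remaining entries (MCAR).
    """
    out = np.array(x, dtype=float)       # independent, contiguous copy
    n_total = int(round(mv_rate * out.size))
    n_mnar = int(round(mnar_ratio * n_total))
    n_mcar = n_total - n_mnar
    flat = out.ravel()                   # view onto out
    # MNAR part: remove the n_mnar smallest values
    lowest = np.argsort(flat)[:n_mnar]
    flat[lowest] = np.nan
    # MCAR part: remove n_mcar of the still-observed values at random
    remaining = np.flatnonzero(~np.isnan(flat))
    chosen = rng.choice(remaining, size=n_mcar, replace=False)
    flat[chosen] = np.nan
    return out

# Assumed grids: 11 MV rates and 11 MNAR ratios give 121 combinations.
mv_rates = np.linspace(0.0, 0.5, 11)
mnar_ratios = np.linspace(0.0, 1.0, 11)
grid = [(mv, mnar) for mv in mv_rates for mnar in mnar_ratios]

rng = np.random.default_rng(0)
complete = rng.normal(loc=25.0, scale=2.0, size=(100, 6))  # toy log-intensities
simulated = add_missing_values(complete, mv_rate=0.3, mnar_ratio=0.5, rng=rng)
```

Keeping the complete matrix alongside each simulated one is what allows imputed values to be compared with the known truth afterwards.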

Imputation was performed with three MCAR-devoted methods (kNN, SVDimpute, MLE) and two MNAR-devoted methods (MinDet, MinProb). This is a small selection, but it is sufficient to show the performance difference between MCAR-devoted and MNAR-devoted methods.
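
To give an idea of what an MNAR-devoted method does, the sketch below implements two simple left-censoring imputations in the spirit of MinDet and MinProb. It is an illustration under stated assumptions, not the implementation used in the paper: the function names, the use of a low per-sample quantile `q` as the "minimum", and the spread estimate (median of per-row standard deviations) are choices made for this example.

```python
import numpy as np

def impute_min_det(x, q=0.01):
    """Deterministic minimum: replace each NaN by a low quantile of the
    observed values in the same column (sample)."""
    out = np.array(x, dtype=float)
    low = np.nanquantile(out, q, axis=0)   # one value per sample
    rows, cols = np.nonzero(np.isnan(out))
    out[rows, cols] = low[cols]
    return out

def impute_min_prob(x, q=0.01, scale=1.0, rng=None):
    """Probabilistic minimum: replace each NaN by a random draw from a
    Gaussian centred on the column's low quantile."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(x, dtype=float)
    low = np.nanquantile(out, q, axis=0)
    # Assumed spread: median of the per-row (per-protein) standard deviations
    sd = np.nanmedian(np.nanstd(out, axis=1, ddof=1)) * scale
    rows, cols = np.nonzero(np.isnan(out))
    out[rows, cols] = rng.normal(loc=low[cols], scale=sd)
    return out

# Rows are proteins (or peptides), columns are samples.
x = np.array([[20.0, 21.0, 19.5],
              [25.0, np.nan, 24.0],
              [np.nan, 22.0, 23.0],
              [27.0, 26.5, np.nan]])
det = impute_min_det(x, q=0.0)   # q=0 uses the column minimum itself
prob = impute_min_prob(x, rng=np.random.default_rng(1))
```

Both functions fill missing entries with values at the low end of the intensity range, which fits left-censored (MNAR) data but systematically underestimates values that are missing completely at random.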

Further comments and aspects

References

Webb-Robertson, B.-J. M.; Wiberg, H. K.; Matzke, M. M.; Brown, J. N.; Wang, J.; McDermott, J. E.; Smith, R. D.; Rodland, K. D.; Metz, T. O.; Pounds, J. G.; Waters, K. M.; et al. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. J. Proteome Res. 2015, 14 (5), 1993–2001.