Difference between revisions of "Toward a gold standard for benchmarking gene set enrichment analysis"

[https://doi.org/10.1093/bib/bbz158 Permanent link to the paper]
 
Latest revision as of 15:57, 25 February 2020

Citation

Geistlinger, L., Csaba, G., Santarelli, M., Ramos, M., Schiffer, L., Law, C., ... & Zimmer, R., Toward a gold standard for benchmarking gene set enrichment analysis, 2020, Briefings in Bioinformatics, 0, 1-12

Permanent link to the paper

Summary

Gene set analyses combine several analysis modules. This paper investigates the performance of ten prominent approaches. Methods are assessed by the biological plausibility of their results, judged against gene-disease relevance scores derived from co-citation databases.

Study outcomes

Outcome O1

The performance of ...

Outcome O1 is presented as Figure X in the original publication.

Outcome O2

...

Outcome O2 is presented as Figure X in the original publication.

Outcome On

...

Outcome On is presented as Figure X in the original publication.

Further outcomes

Runtimes are as follows:


Study design and evidence level

General aspects

  • "75 expression datasets investigating 42 human diseases"
  • microarray and RNA-seq data
  • pre-existing benchmark data sets
  • 10 methods:
    • ORA
    • GLOBALTEST
    • GSEA
    • SAFE
    • GSA
    • SAMGS
    • ROAST
    • CAMERA
    • PADOG
    • GSVA
  • "Gene set relevance rankings for each disease were constructed by querying the MalaCards database. MalaCards scores genes for disease relevance based on experimental evidence and co-citation in the literature."
  • "A nominal significance level of 0.05" is used (without correction for multiple testing). This was also common in other benchmark studies.
  • The "type I error rate was evaluated by randomization of the sample labels" of the microarray data set.
  • "Random gene sets of increasing set size were analyzed to assess whether enrichment methods are affected by geneset size." For this purpose, 100 "random gene sets of defined sizes {5,10,25,50,100,250,500}" were sampled.
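The last three design points (nominal significance level, label randomization, and random gene sets of fixed sizes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene identifiers, and the toy labels are hypothetical, and reading "100 random gene sets" as 100 sets per size is an assumption. No enrichment method is run here; the sketch only shows how the randomized inputs and the fraction of nominally significant results could be produced.

```python
import random

SET_SIZES = [5, 10, 25, 50, 100, 250, 500]  # set sizes quoted in the text
SETS_PER_SIZE = 100  # assumption: 100 random gene sets per size
ALPHA = 0.05         # nominal significance level quoted in the text


def sample_random_gene_sets(genes, sizes=SET_SIZES, per_size=SETS_PER_SIZE, seed=0):
    """Draw `per_size` random gene sets (without replacement) for each size."""
    rng = random.Random(seed)
    return {size: [rng.sample(genes, size) for _ in range(per_size)] for size in sizes}


def permute_labels(labels, seed=0):
    """Return a shuffled copy of the sample labels; group sizes are preserved."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def fraction_significant(p_values, alpha=ALPHA):
    """Fraction of gene sets whose uncorrected p-value falls below `alpha`."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical toy inputs: 1000 placeholder gene IDs, 5 controls and 5 cases.
genes = [f"gene{i}" for i in range(1000)]
labels = [0] * 5 + [1] * 5

gene_sets = sample_random_gene_sets(genes)
shuffled = permute_labels(labels, seed=1)
```

Under permuted labels, applying `fraction_significant` to the p-values returned by an enrichment method would estimate its type I error rate at the nominal level; repeating this per set size would show whether that rate depends on gene set size.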

Design for Outcome O1

  • The outcome was generated for ...
  • Configuration parameters were chosen ...
  • ...

Design for Outcome O2

  • The outcome was generated for ...
  • Configuration parameters were chosen ...
  • ...

...

Design for Outcome On

  • The outcome was generated for ...
  • Configuration parameters were chosen ...
  • ...

Further comments and aspects

An R package (GSEABenchmarkeR) is available that seems to enable similar analyses.

References

The list of cited or related literature is placed here.