-
Hi, thank you for developing this amazing tool! I have a few questions about the appropriate expression data for GSEA input. According to the documentation, the input txt file can contain FPKM, expected counts, TPM, etc., with no specific notes about normalization. However, the Broad Institute emphasizes the importance of cross-sample normalization for standard GSEA (https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Using_RNA-seq_Datasets_with_GSEA). Could someone clarify whether GSEApy handles normalization automatically, or should I follow the original GSEA's recommendations and manually use normalized counts as input? Additionally, does the requirement change depending on whether I use phenotype permutation or gene_set permutation? For context, in some cases I only have 3 biological replicates in each of the pos and neg groups, which limits me to gene_set permutation. Your insights would be immensely valuable. Thank you in advance for your help!
-
GSEA only ranks your genes using the expression values you provide; the normalization method is your own choice. You should follow the original GSEA's recommendations and manually supply normalized counts/expression as input. Yes, if you only have 3 biological replicates per group, gene_set permutation is your only choice: phenotype permutation needs many more samples to simulate the null distribution, and 6 in total is far too few.
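To make "normalize before GSEA" concrete, here is a minimal sketch of one common cross-sample scheme, median-of-ratios scaling (the idea behind DESeq2's size factors), written in plain Python. The sample names, the toy counts, and the helper names `size_factors` and `normalize` are all made up for illustration; in practice you would probably use an established package rather than this hand-rolled version.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts: dict mapping sample name -> list of raw counts,
            with genes in the same order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Only genes with a nonzero count in every sample have a usable geometric mean.
    usable = [g for g in range(n_genes) if all(counts[s][g] > 0 for s in samples)]
    if not usable:
        raise ValueError("no gene has nonzero counts in all samples")
    # Log of the per-gene geometric mean across samples.
    log_geo_mean = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    # Size factor = median ratio of a sample's counts to the geometric means.
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_geo_mean[g] for g in usable))
        for s in samples
    }


def normalize(counts, factors):
    """Divide every raw count by its sample's size factor."""
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Hypothetical toy data: pos_2 is pos_1 sequenced twice as deep.
raw = {
    "pos_1": [10, 20, 30, 0],
    "pos_2": [20, 40, 60, 0],
    "neg_1": [30, 10, 50, 5],
}
factors = size_factors(raw)
normalized = normalize(raw, factors)
print(factors)
print(normalized)
```

After scaling, `pos_1` and `pos_2` carry the same values, because the depth difference between them has been removed. A table normalized along these lines is what you would then pass as the expression input; with 3 vs 3 replicates that would mean calling `gseapy.gsea` with `permutation_type="gene_set"`, as far as I understand the GSEApy API.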