Commit

workshop
dw-thomson committed Oct 9, 2024
1 parent fd82735 commit 2039b5e
Showing 8 changed files with 16,617 additions and 56 deletions.
54 changes: 26 additions & 28 deletions content/DifferentialAbundance.md
Original file line number Diff line number Diff line change
@@ -20,42 +20,18 @@ nf-core/differentialabundance is a bioinformatics pipeline that can be used to a
- for this reason there is a completed run in ~/workshop/nfDifferentialAbundance/, where you can go through the motions
- a good chance to demonstrate the **resume** feature of nextflow


```bash
mkdir ~/workshop/nfDifferentialAbundance2
cd ~/workshop/nfDifferentialAbundance2
mkdir ~/workshop/DifferentialAbundance
cd ~/workshop/DifferentialAbundance
ls -l
```
***optional*** you can use the nf-core launch command to build a launch command, but these instructions will be using a 'nextflow run' command

The minimal input requirements are
1. Sample sheet
- containing the sample information, metadata and group relationships

2




nf-core launch
```
####
nf-core/DifferentialAbundance
There are a few ways of installing an nf-core pipeline, but installation happens automatically when you use a *nextflow run nf-core/* command.
You can check the ~/.nextflow/assets folders to see what is already installed
```
```bash
ls -l ~/.nextflow/assets/nf-core/
```
If you don't see the pipeline, you can pull it from the nf-core website.
If you don't see the pipeline, you can pull it from the nf-core website.
```bash
nextflow pull nf-core/differentialabundance
```
@@ -64,6 +40,28 @@ but this happeds automatically when running the nextflow run nf-core/differentia
![nfpull](nextflowpull.png)


To identify the requirements of the pipeline, see the usage documentation:
https://nf-co.re/differentialabundance/1.5.0/docs/usage/
The minimal input requirements are
1. A sample sheet, containing the sample information, metadata and group relationships.
2. A counts table, such as that made with the RNAseq pipeline.
3. A transcript length table. This is output from the RNAseq pipeline and allows for more accurate normalisation based on transcript length.
4. A profile, referring to the config file and the experiment type, for example rnaseq,singularity.
5. A GTF file. Ideally this is the same genome reference that was used in the mapping step of the nf-core/RNAseq run.
6. Optional: a GSEA gene set for GSEA analysis. This can be downloaded from https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp


For those who just want to get the result, the previous run can be resumed with the following command
```bash
cd ~/workshop/nfDifferentialAbundance
sh nf_differentialabundance.sh
```

#### multiQC summary
[link to multiQC](../SAGC_Workshop_RNAseq.html)



# R Shiny App
- one of the key outputs from the nf-core/DifferentialAbundance pipeline is a Shiny app.
- This runs R processes in the background and presents the data as a site.
225 changes: 225 additions & 0 deletions content/DifferentialAbundance.md.save
@@ -0,0 +1,225 @@
+++
title = 'nf-core/DifferentialAbundance'
+++

![DifferentialAbundance_pipeline](DifferentialAbundance_pipeline.png)

nf-core/differentialabundance is a bioinformatics pipeline that can be used to analyse data represented as matrices, comparing groups of observations to generate differential statistics and downstream analyses. The pipeline supports RNA-seq data such as that generated by the nf-core rnaseq workflow, and Affymetrix arrays via .CEL files. Other types of matrix may also work with appropriate changes to parameters, and PRs to support additional specific modalities are welcomed.

1. Optionally generate a list of genomic feature annotations using the input GTF file (if a table is not explicitly supplied).
2. Cross-check matrices, sample annotations, feature set and contrasts to ensure consistency.
3. Run differential analysis over all contrasts specified.
4. Optionally run a differential gene set analysis.
5. Generate exploratory and differential analysis plots for interpretation.
6. Optionally build and (if specified) deploy a Shiny app for fully interactive mining of results.
7. Build an HTML report based on R markdown, with interactive plots (where possible) and tables.

# Set-up run
***for those with more confidence*** this run can be prepared from scratch in a new directory.
- the pipeline will take ~40 mins to run
- for this reason there is a completed run in ~/workshop/nfDifferentialAbundance/, where you can go through the motions
- a good chance to demonstrate the **resume** feature of nextflow

```bash
mkdir ~/workshop/DifferentialAbundance
cd ~/workshop/DifferentialAbundance
ls -l
```
***optional*** you can use the nf-core launch command to build a launch command, but these instructions will be using a 'nextflow run' command

You can check the ~/.nextflow/assets folder to see what is already installed:
```bash
ls -l ~/.nextflow/assets/nf-core/
```
If you don't see the pipeline, you can pull it from the nf-core website.
```bash
nextflow pull nf-core/differentialabundance
```
This also happens automatically when you use the *nextflow run nf-core/differentialabundance* command.
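
As a minimal sketch of the check described above, the following reports whether the pipeline is already in the local cache before you decide to pull it. The names `ASSETS_DIR` and `pipeline_cached` are made up for this example; the sketch assumes the cache lives in `~/.nextflow/assets` unless the `NXF_ASSETS` environment variable points elsewhere.

```bash
# Sketch only: report whether a pipeline is already in Nextflow's local cache.
# ASSETS_DIR and pipeline_cached are hypothetical names used for illustration.
ASSETS_DIR="${NXF_ASSETS:-$HOME/.nextflow/assets}"

pipeline_cached() {
    # $1 is a pipeline name such as nf-core/differentialabundance
    [ -d "$ASSETS_DIR/$1" ]
}

if pipeline_cached nf-core/differentialabundance; then
    echo "cached: nextflow run will use the local copy"
else
    echo "not cached: run 'nextflow pull nf-core/differentialabundance'"
fi
```

Either branch is harmless, since the script only prints a message and never downloads anything itself.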

![nfpull](nextflowpull.png)


To identify the requirements of the pipeline, see the usage documentation:
https://nf-co.re/differentialabundance/1.5.0/docs/usage/
The minimal input requirements are
1. A sample sheet, containing the sample information, metadata and group relationships.
2. A counts table, such as that made with the RNAseq pipeline.
3. A transcript length table. This is output from the RNAseq pipeline and allows for more accurate normalisation based on transcript length.
4. A profile, referring to the config file and the experiment type, for example rnaseq,singularity.
5. A GTF file. Ideally this is the same genome reference that was used in the mapping step of the nf-core/RNAseq run.
6. Optional: a GSEA gene set for GSEA analysis. This can be downloaded from https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp
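
The inputs listed above might be combined into a launch command along the following lines. This is a sketch, not the contents of the workshop's own kickoff script: every file name is a placeholder (the counts and length table names assume the salmon outputs of an nf-core/rnaseq run), `outs` is assumed as the output directory, and the contrasts file shown later on this page is passed alongside the sample sheet. The optional GSEA settings are left out; check the usage page for the parameters that apply to your pipeline version. The command is only assembled and printed here, so you can review it before running it.

```bash
# Sketch only: assemble (but do not start) a nextflow run command.
# All file names are placeholders; substitute the paths from your own run.
cmd="nextflow run nf-core/differentialabundance \
    -r 1.5.0 \
    -profile rnaseq,singularity \
    --input SampleSheet.csv \
    --contrasts contrasts.csv \
    --matrix salmon.merged.gene_counts.tsv \
    --transcript_length_matrix salmon.merged.gene_lengths.tsv \
    --gtf genome.gtf \
    --outdir outs \
    -resume"

# Print the command for review; paste it into the terminal to launch it.
echo "$cmd"
```

Including `-resume` means that rerunning the same command in the same directory reuses cached results from completed steps where possible.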


For those who just want to get the result, the previous run can be resumed with the following command
```bash
cd ~/workshop/nfDifferentialAbundance
sh nf_differentialabundance.sh
```

#### multiQC summary
[link to multiQC](../SAGC_Workshop_RNAseq.html)



# R Shiny App
- One of the key outputs from the nf-core/DifferentialAbundance pipeline is a Shiny app.
- This runs R processes in the background and presents the data as a site.
- One way to run the app is to start it from RStudio. To do this, download the app directory and use the following R code to open it.

## Launching the App on your local R Studio
***option 1***

The easiest way to run your R Shiny app is with RStudio.

Firstly, you will need to ***download*** the nf-core/DifferentialAbundance outputs which you have generated to your PC using scp.

```bash
IP=[yourIPaddress]

mkdir -p ~/workshop_RNAseq/nfDifferentialAbundance
scp -r workshop@$IP:/home/workshop/workshop/nfDifferentialAbundance/outs/ ~/workshop_RNAseq/nfDifferentialAbundance/

cd ~/workshop_RNAseq/nfDifferentialAbundance/outs
```
Open RStudio and paste the following R code. This requires a few packages to be installed in RStudio, including the ***shinyngs*** package. The installation lines only need to be run once, so hash them out after the first run.

Within R studio, in the top left panel you can paste this code
```r
##To run shiny App
# run this in R studio

# install packages
install.packages("remotes")
remotes::install_github("pinin4fjords/shinyngs")
library(remotes)
library(shinyngs)
library(markdown)

# if you have downloaded the app locally, set the working directory to the app directory
# if you downloaded it to the directory described above, it will be under '~/workshop_RNAseq/nfDifferentialAbundance/outs'
setwd("~/workshop_RNAseq/nfDifferentialAbundance/outs/shinyngs_app/SAGC_Workshop_RNAseq")

#####
esel <- readRDS("data.rds") # you need to navigate to the 'shinyngs_app' directory
app <- prepareApp("rnaseq", esel)
shiny::shinyApp(app$ui, app$server)
```

## Using Rstudio via Nectar
***option 2***

You can also run RStudio from a virtual machine. In some cases this is preferable when dealing with large datasets, where you can control the compute resources used.

| setting | Description |
|------|----------------------------------------------------------------------------------------|
|Host | use the instance IP address, or the hostname [hostname].[project].cloud.edu.au |
|Login | use the username you created in the Configure Application dialog |


You should see the following login page. Log in with your password.\
![Nectar Rstudio](Nectar_Rstudio.png)


Within R studio, in the top left panel you can paste this code, making sure the path to the app directory is correct.
```r
##To run shiny App
# run this in R studio

#######################
# install packages
#install.packages("remotes")
#if (!require("BiocManager", quietly = TRUE))
# install.packages("BiocManager")
#BiocManager::install(c("SummarizedExperiment", "GSEABase", "limma"), force = TRUE)
#remotes::install_github('pinin4fjords/shinyngs', force = TRUE)

library(remotes)
library(shinyngs)
library(markdown)

#######################

# navigate to where the shiny app is
# if you are working in RStudio running on your Nectar instance, this should be in the 'outs' directory from where you ran the nf-core/DifferentialAbundance pipeline
setwd("/home/workshop/workshop/nfDifferentialAbundance/outs/shinyngs_app/SAGC_Workshop_RNAseq/")

#####
esel <- readRDS("data.rds") # you need to navigate to the 'shinyngs_app' directory
app <- prepareApp("rnaseq", esel)
shiny::shinyApp(app$ui, app$server)
```

# Interactive RNAseq data analysis
The previous step is worth the hassle, because it gives you access to the R Shiny App with all your results available.

![](RshinyHomepage.png)

From here, you will see all the parameters you set in your *nextflow run* kickoff script, and you will have access to the data analysis.

Once you've made it to this point, take some time to navigate around. You will see interactive versions of many key differential expression analysis tools. In the background, the app is running in your RStudio session.

Now that we have done the combined work of what would otherwise take days of writing R code for DESeq2 and ggplot2, there is a fair bit to unpack and understand.\
We'll walk through some of the key analyses and results.

![](Rshiny_volcano.png)



![](RshinyExperimentalData.png)
You can see from the 'Experimental data' page that all of the information came from the sample sheet provided to nf-core/DifferentialAbundance.

```bash
cd /home/workshop/workshop/nfDifferentialAbundance/
cat SampleSheet.csv
```
sample,fastq_1,fastq_2,treatment,cellline,condition
Acontrol1,,,control,A,Acontrol
Acontrol2,,,control,A,Acontrol
Acontrol3,,,control,A,Acontrol
Acontrol4,,,control,A,Acontrol
Atreated1,,,treated,A,Atreated
Atreated2,,,treated,A,Atreated
Atreated3,,,treated,A,Atreated
Atreated4,,,treated,A,Atreated
Bcontrol1,,,control,B,Bcontrol
Bcontrol2,,,control,B,Bcontrol
Bcontrol3,,,control,B,Bcontrol
Bcontrol4,,,control,B,Bcontrol
Btreated1,,,treated,B,Btreated
Btreated2,,,treated,B,Btreated
Btreated3,,,treated,B,Btreated
Btreated4,,,treated,B,Btreated

```bash
cat contrasts.csv
```
id,variable,reference,target
cellline,cellline,A,B
treatedVScontrol1,condition,Acontrol,Atreated
treatedVScontrol2,condition,Bcontrol,Btreated

You can add as many extra columns as you like to the sample sheet, and each one provides further possible comparisons in the app.
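
Judging from the two files shown above, each contrast names a sample sheet column in `variable` and two values from that column in `reference` and `target`. A minimal sketch of a consistency check is given below. `check_contrasts` is a hypothetical helper written for this page, not part of the pipeline, and it assumes simple comma-separated files with a header row and no quoted fields.

```bash
# Sketch only: confirm each contrast refers to a sample sheet column and to
# values that occur in it. check_contrasts is a hypothetical helper.
check_contrasts() {
    # $1 = sample sheet CSV, $2 = contrasts CSV (both with a header row)
    awk -F',' '
        # Sample sheet header: remember the position of every column name
        NR == FNR && FNR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Sample sheet rows: record every (column, value) pair that occurs
        NR == FNR { for (name in col) seen[name, $(col[name])] = 1; next }
        # Contrasts header: skip
        FNR == 1 { next }
        # Contrasts rows: id, variable, reference, target
        {
            if (!($2 in col)) { print $1 ": unknown column " $2; bad = 1; next }
            if (!(($2, $3) in seen)) { print $1 ": reference " $3 " not found"; bad = 1 }
            if (!(($2, $4) in seen)) { print $1 ": target " $4 " not found"; bad = 1 }
        }
        END { exit bad + 0 }
    ' "$1" "$2"
}

# Run the check only when both files are present in the current directory
if [ -f SampleSheet.csv ] && [ -f contrasts.csv ]; then
    check_contrasts SampleSheet.csv contrasts.csv && echo "contrasts look consistent"
fi
```

A non-zero exit status means at least one contrast points at a column or value that the sample sheet does not contain.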

![](RowMetadata.png)
Looking at the 'row metadata' page, you can see that all the gene information comes from the ***gtf*** file which we used.\
For well-annotated genomes like hg38 (human) there is a lot of extra information that can be pulled out.
```bash
head -n 10 genome.gtf
```
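
To see where the row metadata comes from, the attribute column of the GTF can be unpacked directly. The sketch below lists the gene ID and gene name of each gene record. `gtf_gene_names` is a hypothetical helper, and it assumes the annotation uses the common `gene_id` and `gene_name` attribute keys, which varies between annotation sources.

```bash
# Sketch only: list gene_id and gene_name for gene records in a GTF file.
# gtf_gene_names is a hypothetical helper; attribute keys vary by annotation source.
gtf_gene_names() {
    # $1 = path to a GTF file (tab-separated, attributes in column 9)
    awk -F'\t' '
        $0 !~ /^#/ && $3 == "gene" {
            id = ""; name = ""
            # Pull the quoted value that follows each attribute key
            if (match($9, /gene_id "[^"]*"/))   id   = substr($9, RSTART + 9,  RLENGTH - 10)
            if (match($9, /gene_name "[^"]*"/)) name = substr($9, RSTART + 11, RLENGTH - 12)
            print id "\t" name
        }
    ' "$1"
}

# Show the first few genes when the workshop GTF is in the current directory
if [ -f genome.gtf ]; then
    gtf_gene_names genome.gtf | head -n 10
fi
```

Records without a `gene_name` attribute print an empty second column, which is a quick way to spot sparsely annotated genomes.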


![normalisation](normalisation.png)


![PCA](PCA.png)

![cluster](heirachicalclustering.png)

![foldchange](foldchange.png)

![PDX1](PDX1.png)
1 change: 0 additions & 1 deletion content/sRNA.md
@@ -4,7 +4,6 @@ title = 'nf-core/smRNAseq'
https://nf-co.re/smrnaseq/2.2.1/

#### nf-core/smRNAseq
test

Pipeline summary
1. Raw read QC (FastQC)
8,169 changes: 8,169 additions & 0 deletions public/SAGC_Workshop_RNAseq.html

Large diffs are not rendered by default.

53 changes: 28 additions & 25 deletions public/differentialabundance/index.html
@@ -159,43 +159,46 @@ <h1 id="set-up-run">



<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">mkdir ~/workshop/nfDifferentialAbundance2
</span></span><span class="line"><span class="cl"><span class="nb">cd</span> ~/workshop/nfDifferentialAbundance2
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">mkdir ~/workshop/DifferentialAbundance
</span></span><span class="line"><span class="cl"><span class="nb">cd</span> ~/workshop/DifferentialAbundance
</span></span><span class="line"><span class="cl">ls -l</span></span></code></pre></div>
<p><em><strong>optional</strong></em> you can use the nf-core launch command to build a launch command, but these instructions will be using a &rsquo;nextflow run&rsquo; command</p>
<p>The minimal input requirements are</p>
<ol>
<li>Sample sheet</li>
</ol>
<ul>
<li>containing the sample information, metadata and group relationships</li>
</ul>
<p>2</p>
<p>nf-core launch</p>
<p>You can check the ~/.nextflow/assets folders to see what is already installed</p>



<pre tabindex="0"><code>


<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">ls -l ~/.nextflow/assets/nf-core/</span></span></code></pre></div>
<p>If you don&rsquo;t see the pipeline, you can pull it from the nf-core website.</p>



<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">nextflow pull nf-core/differentialabundance</span></span></code></pre></div>
<p>This also happens automatically when you use the nextflow run nf-core/differentialabundance command.</p>
<p><img src="nextflowpull.png" alt="nfpull"></p>
<p>To identify the requirements of the pipeline, see the usage documentation:
<a href="https://nf-co.re/differentialabundance/1.5.0/docs/usage/">https://nf-co.re/differentialabundance/1.5.0/docs/usage/</a>
The minimal input requirements are</p>
<ol>
<li>A sample sheet, containing the sample information, metadata and group relationships.</li>
<li>A counts table, such as that made with the RNAseq pipeline.</li>
<li>A transcript length table. This is output from the RNAseq pipeline and allows for more accurate normalisation based on transcript length.</li>
<li>A profile, referring to the config file and the experiment type, for example rnaseq,singularity.</li>
<li>A GTF file. Ideally this is the same genome reference that was used in the mapping step of the nf-core/RNAseq run.</li>
<li>Optional: a GSEA gene set for GSEA analysis. This can be downloaded from <a href="https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp">https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp</a></li>
</ol>
<p>For those who just want to get the result, the previous run can be resumed with the following command</p>



####
nf-core/DifferentialAbundance
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nb">cd</span> ~/workshop/nfDifferentialAbundance
</span></span><span class="line"><span class="cl">sh nf_differentialabundance.sh</span></span></code></pre></div>

There are a few ways of installing an nf-core pipeline, but installation happens automatically when you use a *nextflow run nf-core/* command.
You can check the ~/.nextflow/assets folders to see what is already installed</code></pre>
<p>ls -l ~/.nextflow/assets/nf-core/</p>



<pre tabindex="0"><code>If you don&#39;t see the pipeline, you can pull it from the nf-core website.
```bash
nextflow pull nf-core/differentialabundance</code></pre>
<p>This also happens automatically when you use the nextflow run nf-core/differentialabundance command.</p>
<p><img src="nextflowpull.png" alt="nfpull"></p>
<h4 id="multiqc-summary">
<a class="Heading-link u-clickable" href="/workshop_nfRNAseq/differentialabundance/#multiqc-summary">multiQC summary</a>
</h4>
<p><a href="../SAGC_Workshop_RNAseq.html">link to multiQC</a></p>



2 changes: 0 additions & 2 deletions public/srna/index.html
@@ -18,7 +18,6 @@
nf-core/smRNAseq
test
Pipeline summary
Raw read QC (FastQC)
@@ -181,7 +180,6 @@ <h1 class="Heading-title">
<h4 id="nf-coresmrnaseq">
<a class="Heading-link u-clickable" href="/workshop_nfRNAseq/srna/#nf-coresmrnaseq">nf-core/smRNAseq</a>
</h4>
<p>test</p>
<p>Pipeline summary</p>
<ol>
<li>Raw read QC (FastQC)</li>