v1.3_Tree structure of all results

This is the complete directory tree produced by an analysis.

All paths below are relative to the output folder.

Analysis of individual samples

[sample_name]                                            #1 folder per sample
├── KALLISTOBUS
│   ├── [sample_name].barcodes.txt                       #count matrix: cell barcodes
│   ├── [sample_name].genes.txt                          #count matrix: gene identifiers
│   ├── [sample_name].mtx                                #count matrix: sparse counts (Matrix Market format)
│   ├── run_info.json                                    #alignment run information
│   └── Materials_and_Methods.txt
├── QC_reads
│   └── [sample_name]_RAW.html
├── QC_droplets                                          #empty droplets are filtered
│   ├── [sample_name]_kneeplot.png
│   ├── [sample_name]_saturation_plot.png                #sequencing saturation
│   ├── [sample_name]_QChist.png                         #histogram of cell frequencies for several biases, before filtering
│   └── [sample_name]_QC_NON-NORMALIZED.rda              #seurat R object before filtering
└── [Filtering_features_counts_biases]                   #ex: "F200_C1000_M0-0.2_R0-1_G5" = Features > 200 genes, Counts > 1000 UMIs, 0% < % mitochondrial RNA < 20%, 0% < % ribosomal RNA < 100%, Genes found in > 5 cells
    └── [Filtering_Doublets]                                     #ex: "DOUBLETSFILTER_all" = doublets filtered by all methods; "DOUBLETSKEPT" = doublets not filtered
        ├── [sample_name]_FILTERED_NON-NORMALIZED.rda            #seurat R object after filtering but before normalization and dimension reduction
        ├── [sample_name]_QChist.png                             #histogram of cell frequencies for several biases, after filtering
        ├── [sample_name]_stat.txt                               #statistics from alignment to before normalization
        └── [Normalisation_Method_Corrected_Biases]              #ex: "SCTransform" = no correction; "SCTransformnFeature_RNA_percent_mt" = correction of nFeature_RNA and percent_mt
            └── [Dimension_Reduction_Method_Corrected_Biases]    #ex:"scbfa"
                ├── [sample_name]_LogNormalize_scbfa.rda         #seurat R object after normalization and dimension reduction but before clustering
                ├── [sample_name]_[assay]_[Dimension_Reduction_Method]_dims.bias.cor.png    #correlation between biases and dimensions; helps identify the biases to correct
                ├── clustree_[assay]_[Dimension_Reduction_Method]                           #helps choose the number of dimensions and the resolution for clustering
                │   ├── dimensions                               #clustree representation of the number of cells changing cluster as the number of dimensions varies, at fixed resolution
                │   │   ├── [sample_name]_[assay]_[dimension_reduction_method]11.png
                │   │   └── ...
                │   ├── louvain_resolution                       #clustree representation of the number of cells changing cluster as the resolution varies, at a fixed number of dimensions
                │   │   ├── [sample_name]_[assay]_res10.png
                │   │   └── ...
                │   └── uMAPs                                    #umap representation of a range of kept dimensions and resolution
                │       ├── [sample_name]_[assay]_uMAPs_[assay]_[Dimension_Reduction_Method]11_ALLres.png     #grouped graphs with 11 kept dimensions and all resolutions.
                │       ├── ...
                │       ├── [sample_name]_uMAP_[assay]_[Dimension_Reduction_Method]11_res0.9.png              #graph with 11 kept dimensions and a resolution of 0.9
                │       └── ...
                └── [Clustering]                                                       #ex: dims25_res0.3: kept dimensions = 25, resolution = 0.3
                    ├── cells_annotation                                               #cell type annotation according to the databases at our disposal
                    │   └── [Tool_used]                                                #ex: "singler", "clustifyr"
                    │       ├── [sample_name]_[assay]_uMAP_SR_[Database]_cells.png     #cell by cell annotation ("_cells" suffix)
                    │       ├── [sample_name]_[assay]_uMAP_CFR_[Database]_clust.png    #cluster annotation ("_clust" suffix)
                    │       └── ...
                    ├── found_markers                                                  #differential gene expression analysis by Wilcoxon
                    │   ├── [sample_name]_findmarkers_top10_cluster1_vln.png           #violinplot of top 10 marker genes for each cluster (here cluster 1)
                    │   ├── ...
                    │   ├── [sample_name]_findmarkers_top10_heatmap.png                #heatmap of top 10 marker genes by cluster, for all clusters
                    │   ├── [sample_name]_findmarkers_upset_all.png                    #UpSet plot (Venn-style diagram) of all marker genes by cluster, for all clusters
                    │   ├── [sample_name]_findmarkers_upset_top10.png                  #UpSet plot (Venn-style diagram) of top 10 marker genes by cluster, for all clusters
                    │   └── [sample_name]_[assay]_[Dimension_Reduction_Method].25_res.0.3_findmarkers_all.txt       #table of all identified marker genes
                    ├── markers
                    │   ├── [sample]_markers_SINGLE_ABCC4_uMAP.png             #umap of the expression of each gene provided by the makefile (here the ABCC4 gene)
                    │   └── ...
                    ├── [sample_name]_[assay]_[Dimension_Reduction_Method]_uMAP3d_dim25_res0.3.png       #umap in 3D
                    ├── [sample_name]_[assay]_[Dimension_Reduction_Method]_uMAP_dim25_res0.3.png         #umap in 2D
                    ├── [sample_name]_[Normalisation_Method_Corrected_Biases]_[Dimension_Reduction_Method_Corrected_Biases]_25_0.3.crb    #cerebro object based on the final seurat R object (here for 25 dimensions and a resolution of 0.3)
                    ├── [sample_name]_[Normalisation_Method_Corrected_Biases]_[Dimension_Reduction_Method_Corrected_Biases]_25_0.3.rda    #final seurat R object (here for 25 dimensions and a resolution of 0.3)
                    ├── technical                                                                               #representation of potential biases on the umap
                    │   ├── [sample_name]_technical_MULTI_ALL_uMAPs.png
                    │   └── ...
                    ├── TCR_results
                    │   ├── Clusters_analysis                                  #per cluster analysis
                    │   │   ├── aaProperties_[sample_name].png                 #lists the physicochemical properties of the amino acids that make up the receptors.
                    │   │   ├── clust_cldiv[sample_name].png                   #measures the diversity of clonotypes within each cluster (4 metrics: Shannon, inverse Simpson, Chao1, and Abundance-based Coverage Estimator (ACE))
                    │   │   ├── clust_clhomeo[sample_name].png                 #proportion of cells presenting clonotypes, grouped by clonotype frequency
                    │   │   ├── clust_clOverlap_[sample_name].png              #percentage of common clonotypes between 2 clusters
                    │   │   ├── clust_clprop[sample_name].png                  #proportion of cells with the most frequent [x: y] clonotypes (organized in rank)
                    │   │   ├── clust_quantContig_[sample_name].png            #number of different clonotypes in the sample
                    │   │   ├── Frequency_top_10_clust[clust_num]_umap[sample_name].png    #umap locating the 10 most frequent TCRs per cluster.
                    │   │   ├── overlap_[def_clonotype]_[sample_name].txt                  #percentage of common clonotypes between the clusters, with the name of the compared clusters, their number of TCRs, the number of TCRs in common and the list of these TCRs in common.
                    │   │   └── abundanceContig.png          #number of clonotypes according to the number of cells presenting each clonotype
                    │   └── Global_analysis                  #overall analysis of the sample, disregarding clusters
                    │         ├── aaProperties.png           #lists the physicochemical properties of the amino acids that make up the receptors.
                    │         ├── abundanceContig.png        #number of clonotypes according to the number of cells presenting each clonotype
                    │         ├── clhomeo.png                #proportion of cells presenting clonotypes, grouped by clonotype frequency
                    │         ├── cloneType.png              #several umaps with the location of each part of the TCR + the size of the TRA and TRB.
                    │         ├── clprop.png                 #proportion of cells with the most frequent [x: y] clonotypes (organized in rank).
                    │         ├── Frequency_top_10_umap[sample_name].png       #umap locating the 10 most frequent TCRs
                    │         ├── Frequency_top11to20_umap[sample_name].png    #umap locating the 11 to 20 most frequent TCRs
                    │         ├── Frequency_umap[sample_name].png              #distribution of the frequency of clonotypes (in gradient at the top and grouped in class at the bottom).
                    │         ├── lengthContig.png                             #the length of the TCRs (combined or separate chains).
                    │         ├── QC_quantif.png                               #QC of receptors
                    │         ├── quantUniqueContig.png                        #number of different clonotypes in the sample
                    │         └── cldiv.png                                    #measures the diversity of clonotypes within the sample (4 metrics:  Shannon, inverse Simpson,  Chao1, and Abundance-based Coverage Estimator (ACE))
                    └── BCR_results
                        ├── ...            #same as TCR
                        └── ...            #same as TCR
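
The folder names in the tree above encode the parameters of each step, so they can be read back programmatically. The following Python sketch is only an illustration: it assumes that folder names follow exactly the two examples given above ("F200_C1000_M0-0.2_R0-1_G5" and "dims25_res0.3"), and the function and field names are hypothetical, not part of the pipeline.

```python
import re
from typing import NamedTuple, Tuple


class FilterSettings(NamedTuple):
    """Thresholds encoded in a filtering folder name (hypothetical container)."""
    min_features: int                 # F: minimum number of genes per cell
    min_counts: int                   # C: minimum number of UMIs per cell
    mito_range: Tuple[float, float]   # M: allowed fraction of mitochondrial RNA
    ribo_range: Tuple[float, float]   # R: allowed fraction of ribosomal RNA
    min_cells_per_gene: int           # G: minimum number of cells per gene


# Patterns inferred from the examples in the tree; real names may differ.
_FILTER_RE = re.compile(
    r"^F(?P<features>\d+)_C(?P<counts>\d+)"
    r"_M(?P<mito_lo>[\d.]+)-(?P<mito_hi>[\d.]+)"
    r"_R(?P<ribo_lo>[\d.]+)-(?P<ribo_hi>[\d.]+)"
    r"_G(?P<cells>\d+)$"
)
_CLUSTER_RE = re.compile(r"^dims(?P<dims>\d+)_res(?P<res>[\d.]+)$")


def parse_filter_folder(name: str) -> FilterSettings:
    """Decode a folder name such as 'F200_C1000_M0-0.2_R0-1_G5'."""
    m = _FILTER_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized filtering folder name: {name!r}")
    return FilterSettings(
        min_features=int(m["features"]),
        min_counts=int(m["counts"]),
        mito_range=(float(m["mito_lo"]), float(m["mito_hi"])),
        ribo_range=(float(m["ribo_lo"]), float(m["ribo_hi"])),
        min_cells_per_gene=int(m["cells"]),
    )


def parse_clustering_folder(name: str) -> Tuple[int, float]:
    """Decode a folder name such as 'dims25_res0.3' into (dimensions, resolution)."""
    m = _CLUSTER_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized clustering folder name: {name!r}")
    return int(m["dims"]), float(m["res"])
```

For instance, parse_filter_folder("F200_C1000_M0-0.2_R0-1_G5").mito_range gives (0.0, 0.2), which matches the 0% to 20% mitochondrial RNA range described in the tree.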

Analysis of integrated/grouped samples

GROUPED_ANALYSIS 
├──INTEGRATED                                                          #analysis of integrated samples
│  └──[Name_Integration_data]
│         └── [Normalisation_Method_Corrected_Biases]                  #ex: "SCTransform" = no correction; "SCTransformnFeature_RNA_percent_mt" = correction of nFeature_RNA and percent_mt
│                 └── [Dimension_Reduction_Method_Corrected_Biases]    #ex:"scbfa"
│                     ├── ...                                          #same as "Analysis of individual samples" section
│                     └── ...                                          #same as "Analysis of individual samples" section
└── NO_INTEGRATED                                                      #analysis of grouped samples
           └── [Normalisation_Method_Corrected_Biases]                 #ex: "SCTransform" = no correction; "SCTransformnFeature_RNA_percent_mt" = correction of nFeature_RNA and percent_mt
                   └── [Dimension_Reduction_Method_Corrected_Biases]   #ex:"scbfa"
                       ├── ...                                         #same as "Analysis of individual samples" section
                       └── ...                                         #same as "Analysis of individual samples" section

Note:

Some folders may be missing: which ones are produced depends on the type of analysis and on the type of your input data.
If the analysis was carried out by a bioinformatician from the Gustave Roussy bioinformatics core facility, they may give you only the relevant subset of folders and files, to make the results easier to explore.
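
Because some folders may be absent, it can help to check what a given delivery actually contains. The sketch below is a minimal illustration in Python, assuming that clustering folders are named like the "dims25_res0.3" example and that the optional subfolders are those listed in the tree above; the function name and the list of folders are assumptions, not part of the pipeline.

```python
from pathlib import Path
from typing import Dict, List

# Subfolders of a clustering folder, as listed in the tree above.
# Their presence depends on the analysis and on the input data.
OPTIONAL_FOLDERS = (
    "cells_annotation",
    "found_markers",
    "markers",
    "technical",
    "TCR_results",
    "BCR_results",
)


def summarize_clustering_folders(output_dir) -> Dict[str, List[str]]:
    """Map each clustering folder (path relative to output_dir) to the
    optional subfolders that are missing from it."""
    root = Path(output_dir)
    summary: Dict[str, List[str]] = {}
    # Assumption: clustering folders match 'dims*_res*', as in the example.
    for folder in sorted(root.rglob("dims*_res*")):
        if not folder.is_dir():
            continue
        missing = [name for name in OPTIONAL_FOLDERS
                   if not (folder / name).is_dir()]
        summary[folder.relative_to(root).as_posix()] = missing
    return summary
```

Calling summarize_clustering_folders on the output folder returns one entry per clustering folder found; an empty list means that all the optional subfolders listed above are present.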
