
Creating visualizations for MP BioPath output

cadiachan edited this page Nov 16, 2017 · 2 revisions

make_output_files.py

This Python script takes two input files and creates a heatmap. Optionally, it also produces a t-SNE plot and a Kaplan Meier survival curve. The donors (x-axis) are hierarchically clustered automatically, as shown by the horizontal dendrogram in the heatmap, and the optional plots are based on that clustering (see the note about the Kaplan Meier curve at the end of this page for the current exception).

The number of groups (k-means) currently has to be changed directly in the R script. By default, it is 4. A command-line parameter could be added so that the user can specify this value directly.
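A minimal sketch of how such a parameter might look on the Python side. The `-g/--groups` flag, the helper names, and the idea that the R script would read the group count as its last argument are all assumptions for illustration; neither script currently supports this.

```python
import argparse

DEFAULT_GROUPS = 4  # matches the default described above


def positive_int(text):
    """argparse type: accept only integers >= 1."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("number of groups must be >= 1")
    return value


def build_group_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag: -g/--groups is NOT an existing option of the script.
    parser.add_argument("-g", "--groups", type=positive_int,
                        default=DEFAULT_GROUPS,
                        help="number of donor groups (default: 4)")
    return parser


def r_command(groups, r_script="create_visuals.R"):
    # Assumption: the R script would be changed to read the group count
    # from its last command-line argument (e.g. via commandArgs()).
    return ["Rscript", r_script, str(groups)]
```

The R script would still need a matching change to read the value instead of using its hard-coded one.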

All output files will be in .svg format.

Ensure that create_visuals.R is in the same folder as make_output_files.py (REQUIRED for make_output_files.py to run).
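If you want to verify this requirement before running, a check along the following lines could be used. This is only a sketch: the function name `find_r_script` is hypothetical and is not part of make_output_files.py.

```python
from pathlib import Path


def find_r_script(script_dir, name="create_visuals.R"):
    """Return the path to the R script, or raise if it is missing."""
    candidate = Path(script_dir) / name
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{name} must be in the same folder as make_output_files.py "
            f"(looked in {candidate.parent})"
        )
    return candidate
```

Inside a script, the folder to pass in would typically be `Path(__file__).resolve().parent`.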

Usage

For heatmap (only) output:

python make_output_files.py [MP-BioPath output file] -p [list of pathways with colours file] -o [output name prefix]

For heatmap, t-SNE plot (-t) and Kaplan Meier survival curve (-k) output:

python make_output_files.py [MP-BioPath output file] -p [list of pathways with colours file] -o [output name prefix] -t -k

For no heatmap output (-m):

python make_output_files.py [MP-BioPath output file] -p [list of pathways with colours file] -o [output name prefix] -m

Notes

You must always provide both:

  • MP-BioPath output file
  • -p --pathway_list: List of pathways with their assigned colours

Optional parameters:

  • -o --output: Prefix for the names of the output files (Default = "output")
  • -m --heatmap: Skips the heatmap file if specified (by default, the heatmap is produced)
  • -k --kaplan: Produces the Kaplan Meier survival curve if specified (Default = FALSE)
  • -t --tsne: Produces the t-SNE plot if specified (Default = FALSE)
  • -h --help: Lists all options
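The options above can be sketched as an argparse parser. This is an illustration of the documented behaviour, not the script's actual source: the attribute names (`mp_biopath_file` and so on) and the help strings are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="make_output_files.py",
        description="Create visualizations for MP-BioPath output.",
    )
    # Required inputs
    parser.add_argument("mp_biopath_file", help="MP-BioPath output file")
    parser.add_argument("-p", "--pathway_list", required=True,
                        help="list of pathways with their assigned colours")
    # Optional parameters
    parser.add_argument("-o", "--output", default="output",
                        help="prefix for output file names")
    # store_false: the value is True unless -m is given, so the heatmap
    # is produced by default and skipped when the flag is specified.
    parser.add_argument("-m", "--heatmap", action="store_false",
                        help="do not produce the heatmap")
    parser.add_argument("-k", "--kaplan", action="store_true",
                        help="produce the Kaplan Meier survival curve")
    parser.add_argument("-t", "--tsne", action="store_true",
                        help="produce the t-SNE plot")
    # -h/--help is added by argparse automatically.
    return parser
```

With this layout, `-h` stays reserved for help, which is why the heatmap switch uses `-m`.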

Note about Kaplan Meier curve:

At the time this script was written, no clinical data were available for the donors, so the Kaplan Meier curve is created from a toy data set. Check the R script for the name of that file.

As a result, the Kaplan Meier part of the script does not currently take the clustering/subgroup information from the heatmap dendrogram automatically. This will need to change once clinical data are obtained.
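To illustrate the kind of change that would be needed, here is a rough sketch in Python of how cluster labels could be joined to clinical data and a Kaplan Meier estimate computed per group. It is not how create_visuals.R works; the function names, the donor identifiers, and the data layout (donor mapped to cluster label, donor mapped to a time and event pair) are all assumptions.

```python
from collections import defaultdict


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an event.

    events[i] is True if the event was observed, False if censored.
    """
    counts = defaultdict(lambda: [0, 0])  # time -> [events, censored]
    for t, e in zip(times, events):
        counts[t][0 if e else 1] += 1

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(counts):
        observed, censored = counts[t]
        if observed:
            # Product-limit step: multiply by the fraction surviving at t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= observed + censored
    return curve


def curves_by_cluster(cluster_of, clinical):
    """cluster_of: donor -> cluster label; clinical: donor -> (time, event)."""
    grouped = defaultdict(lambda: ([], []))
    for donor, label in cluster_of.items():
        if donor in clinical:  # skip donors without clinical data
            time, event = clinical[donor]
            grouped[label][0].append(time)
            grouped[label][1].append(event)
    return {label: kaplan_meier(t, e) for label, (t, e) in grouped.items()}
```

The cluster labels would have to come from cutting the donor dendrogram into groups, and the clinical table would replace the toy data set.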