Skip to content

Commit

Permalink
Small README and help changes
Browse files Browse the repository at this point in the history
  • Loading branch information
UBrau committed Jan 24, 2024
1 parent 6efb39c commit adb1f35
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 18 deletions.
6 changes: 3 additions & 3 deletions R/vastdiff.R
Original file line number Diff line number Diff line change
Expand Up @@ -49,9 +49,6 @@ option.list <- list(
help = "Name of the replicate set B, [default is first element of --replicateB]"),
make_option(c("-i", "--input"), type = "character", default = "INCLUSION_LEVELS",
help = "Exact or Partial match to PSI table in output directory [default %default]"),
make_option(c("-n", "--nLines"), type = "integer", default = "5000",
help = "Number of lines to read/process in parallel at a time...
lower number = less memory = greater overhead [default %default]"),
make_option(c("-p", "--paired"), action="store_true", default = FALSE,
help = "Samples are paired, -a pairOneA,pairTwoA,.. -b pairOneB,pairTwoB,.. [default %default]\n
Expand Down Expand Up @@ -83,6 +80,9 @@ option.list <- list(
[general options]"),
make_option(c("-c", "--cores"), type = "integer", default = 1, metavar="int",
help="Number of cores to use for plot processing.. [default %default]"),
make_option(c("-n", "--nLines"), type = "integer", default = "5000",
help = "Number of lines to read/process in parallel at a time...
lower number = less memory = greater overhead [default %default]"),
make_option(c("-z", "--seed"), type = "integer", default = 10, metavar="int",
help="Seed the RNG for a deterministic result.. [default %default]"),
make_option(c("-v", "--verbose"), type = "logical", default = TRUE, metavar=NULL,
Expand Down
27 changes: 12 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -393,7 +393,7 @@ From release v2.0.0, VAST-TOOLS includes a new module to identify and profile an
``vast-tools`` provides two alternative modules (``compare`` and ``diff``) to perform differential splicing analyses on a reduced number of samples per group. Each module gives different functionalities.

- ``compare``: pre-filters the events based on read coverage, imbalance and other features, and simply compares average and individual dPSIs. That is, it looks for non-overlapping PSI distributions based on fixed dPSI cut-offs. For more than 3 replicates, it is likely to be too stringent.
- ``diff``: performs a statistical test to assess whether the PSI distributions of the two compared groups are signficantly different. It is possible to pre-filter the events based on the minimum number of reads per sample, but subsequent filtering is highly recommended (e.g. overlapping the results with the output of ``tidy``). For more than 5 samples per group it may also be over stringent.
- ``diff``: performs a statistical test to assess whether the PSI distributions of the two compared groups are signficantly different. It is possible to pre-filter the events based on the minimum number of reads per sample, but subsequent filtering is highly recommended (e.g. overlapping the results with the output of ``tidy``). For more than 5 samples per group it may also be overly stringent.
- When comparing multiple samples per group, an alternative approach is recommended. First, events should be pre-filtered using ``tidy`` (see [Simplifying Combine Table](#simplifying-combine-table)). This module allows to select events for which a minimum number of samples per group pass the quality controls. Then, a Mann-Whitney U-test or similar can be used to identify differentially spliced events. Finally, average dPSI per group should be calculated and a minimum difference (usually |dPSI| > 15) should be requested.

#### *compare*: Comparing PSIs Between Samples
Expand Down Expand Up @@ -446,9 +446,10 @@ Note: Sample names do not have to follow any specific convention as long as they
Probably the most important extra options to consider are ``-r PROB (--prob)``,
``-m MINDIFF (--minDiff)``, ``-e MINREADS (--minReads)``, and `-S MINSAMPLES (--minSamples)`
These represent the stringency criterion for filtering of visual output and textual
data sent to file. `-S` is the minimum number of samples for each set `-a` and `-b`
output. `-S` is the minimum number of samples for each set `-a` and `-b`
that have to have at least `-e` reads each to be considered in the downstream
statistical comparison.
statistical comparison. Values in a sample group that do not survive these setting are
set to NA in the output table and not represented in the PDF.

The ``-r`` flag represents the
minimal probability of acceptance that is required to consider a comparison to
Expand All @@ -458,13 +459,12 @@ stringency requirements.
The ``-m`` flag represents the minimum value of difference (`MV`, see example below)
between PSI in group A and PSI in group B that you will accept, such that we are are sure with at least
probability ``-r`` that there is a difference of at least ``-m``. `-m` does not
currently alter the output sent to STDOUT, but does filter what is plotted to PDF
and printed to file.
alter the output table, but does filter what is plotted to PDF.

The ``-e`` flag specifies the minimum number of reads for a sample/event to be
compared. In cases where the prior distribution has been methodically calculated
and/or is believable beyond an uninformative prior (like the uniform default),
this may not be necessary, however it is still highly recommended. The default
this may not be necessary, however it is still highly recommended. The default
value for ``-e`` is 10, though this could easily be higher.

Additionally, ``diff`` allows you to alter the parameters of the conjugate beta
Expand All @@ -478,8 +478,8 @@ it may be more appropriate to use a custom prior model that is able to more accu
reflect the lower expectation of inclusion levels.

In the case that you have paired samples, where NormalA is dependent on
PerturbationA, it is appropriate to use the ``--paired=TRUE`` flag. For
example when considering NormalA and NormalB, to compare to PerturbationA and
PerturbationA, it is appropriate to use the ``--paired=TRUE`` flag. For
example, when considering NormalA and NormalB, to compare to PerturbationA and
PerturbationB, the probability that P( joint_psi1 - joint_psi2 > ``-m`` ) is
calculated such that NormalA is only compared to PerturbationA, and then NormalB
is compared to PerturbationB. No MLE fitting is used in this case.
Expand All @@ -495,16 +495,13 @@ posterior distribution to sample, lower numbers decrease accuracy but increase
performance.

The ``diff`` command is also able to run in parallel. Specify the number of
cores to use with ``-c INT``
Obviously more cores will increase the speed of ``diff``, at the cost of increased
RAM usage.
cores to use with ``-c INT``. Obviously more cores will increase the speed of ``diff``,
at the cost of increased memory.

Using the ``-n`` flag to specify the number of lines to read/process at a time,
will set a max threshold to the RAM used by parallel processing with the ``-c``
flag. A lower number means that ``diff`` will use significantly less memory,
however by decreasing ``-n`` you have increased the number of times that the
``mclapply`` function must calculate the parallel processing overhead. The
default is 100, which works well.
however, decreasing ``-n`` increases parallelization overhead and extends run time.

*Output Format*

Expand All @@ -522,7 +519,7 @@ The text output of diff looks like:
BCORL1 | HsaEX0007940 | 0.213452 | 0.500425 | -0.286973 | 0.05

Where for example the first event HsaEX0008312 in the BOD1L gene has multireplicate point estimate
for SampleA of 0.12 and 0.7 for SampleB. While this gives an expected value for the difference of
for SampleA of 0.12 and 0.7 for SampleB. While this gives an expected value for the difference of
PSI (dPsi/ΔPSI) between SampleA and SampleB of -0.57, the minimum value (`MV`) for |ΔPSI| at 0.95 is
0.3, meaning that there is a 0.95 probability that |ΔPSI| is greater than 0.3. Use this value
to filter for events that are statistically likely to have at least a minimal difference of some
Expand Down

0 comments on commit adb1f35

Please sign in to comment.