Module for performing plotting.
This module uses pylab and matplotlib to make plots. Before running a function in this module, you should use the PylabAvailable function to determine whether pylab and matplotlib are available. Calling any other function will raise an Exception if these modules are not available. The pdf backend is used for matplotlib / pylab, which means that plots must be created as PDF files.
A few functions also utilize scipy for calculations. Before using these functions, you should use ScipyAvailable to see if scipy is available. Otherwise an exception will be raised.
PylabAvailable
CumulativeFractionPlot
Base10Formatter
SubsetPValue
SplitLabel
PlotLinearDensity
CorrelationPlot
PlotDistributionComparison
Details of each function are provided in the individual documentation strings below.
Converts a number into LaTeX formatting with scientific notation.
Takes a number and converts it to a string that can be shown in LaTeX using math mode. It is converted to scientific notation if it meets the criterion specified by exp_cutoff.
number the number to be formatted, should be a float or integer. Currently only works for numbers >= 0.
exp_cutoff convert to scientific notation if abs(math.log10(number)) >= this.
exp_decimal_digits show this many digits after the decimal if number is converted to scientific notation.
decimal_digits show this many digits after the decimal if number is NOT converted to scientific notation.
The returned value is the LaTeX string. If the number is zero, the returned string is simply ‘0’.
>>> Base10Formatter(103, 3, 1, 1)
'103.0'
>>> Base10Formatter(103.0, 2, 1, 1)
'1.0 \\times 10^{2}'
>>> Base10Formatter(103.0, 2, 2, 1)
'1.03 \\times 10^{2}'
>>> Base10Formatter(2892.3, 3, 1, 1)
'2.9 \\times 10^{3}'
>>> Base10Formatter(0.0, 3, 1, 1)
'0'
>>> Base10Formatter(0.012, 2, 1, 1)
'1.2 \\times 10^{-2}'
>>> Base10Formatter(-0.1, 3, 1, 1)
Traceback (most recent call last):
...
ValueError: number must be >= 0
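The behavior shown in the examples above can be sketched as follows. This is a minimal sketch and not the module's actual implementation: the name base10_formatter_sketch is hypothetical, and the sketch assumes the comparison against exp_cutoff uses the floor of log10(number), which is what the 0.012 example above appears to require.

```python
import math


def base10_formatter_sketch(number, exp_cutoff, exp_decimal_digits, decimal_digits):
    """Hypothetical re-creation of the documented Base10Formatter behavior."""
    if number < 0:
        raise ValueError("number must be >= 0")
    if number == 0:
        return "0"
    # Assumption: the exponent is the floor of log10(number).
    exponent = int(math.floor(math.log10(number)))
    if abs(exponent) >= exp_cutoff:
        # Scientific notation: mantissa \times 10^{exponent}
        mantissa = number / 10.0 ** exponent
        return "%.*f \\times 10^{%d}" % (exp_decimal_digits, mantissa, exponent)
    # Plain decimal notation
    return "%.*f" % (decimal_digits, number)
```

The sketch does not handle the case where rounding the mantissa produces 10.0 (for instance 9.99 with one decimal digit). How the real function treats that case is not stated here.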
Plots the correlation between two variables as a scatter plot.
The data is plotted as a scatter plot.
This function uses pylab / matplotlib. It will raise an Exception if these modules cannot be imported (if PylabAvailable() == False).
The calling variables use LaTeX format for strings. So for example, ‘$10^5$’ will print the LaTeX equivalent of this string. Similarly, certain raw text strings (such as those including underscores) will cause problems if you do not escape their LaTeX meaning. For instance, ‘x_label’ will cause a problem since an underscore is not valid outside of math mode in LaTeX, so you would need to use ‘x\_label’ to escape the underscore.
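One way to prepare plain-text labels before passing them to these functions is sketched here. The helper escape_latex_underscores is hypothetical and is not part of this module. It assumes the label contains no math-mode segments and no underscores that are already escaped.

```python
def escape_latex_underscores(text):
    """Hypothetical helper: backslash-escape every underscore in a plain-text label."""
    return text.replace("_", "\\_")


print(escape_latex_underscores("x_label"))  # prints x\_label
```

A label such as ‘$x_1$’ should not be passed through this helper, since the underscore is valid inside math mode and escaping it would change its meaning.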
CALLING VARIABLES:
Creates a cumulative fraction plot.
Takes a list of numeric data. Plots a cumulative fraction plot giving the fraction of the data points that are <= the indicated value.
datalist is a list of numbers giving the data for which we are computing the cumulative fraction plot. Raises an exception if this is an empty list.
plotfile is the name of the output plot file created by this method (such as ‘plot.pdf’). The extension must be ‘.pdf’.
title is a string placed above the plot as a title. Uses LaTeX formatting.
xlabel is the label given to the X-axis. Uses LaTeX formatting.
This function uses pylab / matplotlib. It will raise an Exception if these modules cannot be imported (if PylabAvailable() is False).
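The quantity being plotted can be illustrated without any plotting library. The sketch below is an assumption about how the cumulative fractions might be computed and is not taken from the module. The name cumulative_fractions is hypothetical.

```python
from bisect import bisect_right


def cumulative_fractions(datalist):
    """Hypothetical helper: for each distinct value x in datalist, return the
    fraction of data points that are <= x."""
    if not datalist:
        raise ValueError("datalist must not be empty")
    ordered = sorted(datalist)
    n = float(len(ordered))
    xs = sorted(set(ordered))
    # bisect_right gives the count of entries <= x in the sorted list
    fractions = [bisect_right(ordered, x) / n for x in xs]
    return xs, fractions


xs, fractions = cumulative_fractions([3, 1, 2, 2])
print(xs)         # prints [1, 2, 3]
print(fractions)  # prints [0.25, 0.75, 1.0]
```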
Compares two distributions and tests if one has a greater mean.
This function can be generally used to compare and plot two distributions. Specifically, this function creates a plot of the distributions of integers in the two distributions fullset and subset. For generating this plot, there is no actual requirement that subset be a true subset of fullset.
However, if subset is a true subset of fullset, then this function can also calculate and display the P-value for the hypothesis that the mean of subset is greater than the mean of fullset.
This function uses pylab / matplotlib. It will raise an Exception if these modules cannot be imported (if PylabAvailable() == False).
The calling variables use LaTeX format for strings. So for example, ‘$10^5$’ will print the LaTeX equivalent of this string. Similarly, certain raw text strings (such as those including underscores) will cause problems if you do not escape their LaTeX meaning. For instance, ‘x_label’ will cause a problem since an underscore is not valid outside of math mode in LaTeX, so you would need to use ‘x\_label’ to escape the underscore.
CALLING VARIABLES:
Plots linear density of variable as a function of primary sequence.
This function is designed to plot some variable (such as the number of epitopes) as a function of the primary sequence position. It creates an output PDF plot plotfile.
The data is plotted as lines. If there is more than one data series to be plotted, a legend is included.
This function uses pylab / matplotlib. It will raise an Exception if these modules cannot be imported (if PylabAvailable() == False).
The calling variables use LaTeX format for strings. So for example, ‘$10^5$’ will print the LaTeX equivalent of this string. Similarly, certain raw text strings (such as those including underscores) will cause problems if you do not escape their LaTeX meaning. For instance, ‘x_label’ will cause a problem since an underscore is not valid outside of math mode in LaTeX, so you would need to use ‘x\_label’ to escape the underscore.
CALLING VARIABLES:
Returns True if pylab/matplotlib are available, False otherwise.
You should call this function to test for the availability of the pylab/matplotlib plotting modules before using other functions in this module.
Returns True if scipy is available, False otherwise.
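An availability check of this kind is commonly written as a guarded import. The sketch below shows that general pattern. It is not the module's own code, and the name module_available is hypothetical.

```python
import importlib


def module_available(name):
    """Hypothetical helper: return True if the named module can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Guard plotting code on the result, in the same spirit as PylabAvailable()
if not module_available("matplotlib"):
    print("matplotlib is not installed; skipping plots")
```

Checking once up front lets calling code skip plotting cleanly instead of handling an exception from each plotting function.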
Splits a string with a return if it exceeds a certain length.
label a string giving the label we might split.
splitlen the maximum length of a label before we attempt to split it.
splitchar the character added when splitting a label.
If len(label) > splitlen, we attempt to split the label in the middle by adding splitchar. The label is split as close to the middle as possible while splitting at a space.
No splitting, as the label length does not exceed splitlen
>>> SplitLabel('WT virus 1', 10, '\n')
'WT virus 1'
Splitting of this label
>>> SplitLabel('WT plasmid 1', 10, '\n')
'WT\nplasmid 1'
Splitting of this label
>>> SplitLabel('mutated WT plasmid 1', 10, '\n')
'mutated WT\nplasmid 1'
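The splitting rule can be sketched as follows. This is a hypothetical re-creation (split_label_sketch is not part of the module) that reproduces the three examples above. It assumes that a tie between two equally central spaces goes to the earlier one, and that a label with no spaces is returned unchanged.

```python
def split_label_sketch(label, splitlen, splitchar):
    """Hypothetical re-creation of the documented SplitLabel behavior."""
    if len(label) <= splitlen:
        return label
    spaces = [i for i, ch in enumerate(label) if ch == " "]
    if not spaces:
        # Assumption: nothing to split on, so return the label as-is
        return label
    middle = len(label) / 2.0
    # min() returns the first of several equally close spaces
    best = min(spaces, key=lambda i: abs(i - middle))
    # Replace the chosen space with splitchar
    return label[:best] + splitchar + label[best + 1:]
```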
Computes the P-value that the mean of subset is less than or greater than the mean of fullset.
subset is a list of numbers.
fullset is a list of numbers with len(fullset) > len(subset)
nrandom is the number of random draws to use to compute the P-value.
withreplacement should be True or False. If True, the random draws are done with replacement (same value can be drawn multiple times). If False, the draws are done without replacement.
Computes the mean of the numbers in subset. Then performs nrandom draws of len(subset) samples (with or without replacement, depending on the value of withreplacement) from fullset, and determines whether the mean of the random subsets is < or >= the mean of subset. If it is <, computes the fraction of random subsets that have a mean >= that of subset. If it is >=, computes the fraction of random subsets that have a mean <= that of subset. Then returns the 2-tuple (gt_or_lt, fraction), where gt_or_lt is either “<” (if the mean of subset is >= the mean of the random subsets) or “>”. So fraction represents the one-sided P-value for the hypothesis that subset has a mean > or < that of a random subset from fullset.
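The resampling procedure can be sketched as follows. This is a hedged illustration and not the module's implementation. The name subset_p_value_sketch and the seed parameter are hypothetical, and "the mean of the random subsets" is assumed to be the average of the per-draw means. Because the exact mapping to the “<” and “>” symbols is not confirmed here, the sketch returns a boolean instead.

```python
import random


def subset_p_value_sketch(subset, fullset, nrandom, withreplacement, seed=None):
    """Hypothetical sketch: returns (subset_is_greater, fraction)."""
    rng = random.Random(seed)
    n = len(subset)
    subset_mean = sum(subset) / float(n)
    random_means = []
    for _ in range(nrandom):
        if withreplacement:
            draw = [rng.choice(fullset) for _ in range(n)]
        else:
            draw = rng.sample(fullset, n)
        random_means.append(sum(draw) / float(n))
    # Assumption: compare the average of the random-draw means to subset_mean
    overall = sum(random_means) / float(nrandom)
    if overall < subset_mean:
        # One-sided P-value for "subset mean is greater"
        count = sum(1 for m in random_means if m >= subset_mean)
        return True, count / float(nrandom)
    # One-sided P-value for "subset mean is less"
    count = sum(1 for m in random_means if m <= subset_mean)
    return False, count / float(nrandom)
```

With nrandom draws, the smallest nonzero fraction the procedure can report is 1 / nrandom, so a reported 0.0 is better read as "less than 1 / nrandom".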