NonpaR: Nonparametric Tests
Fisher Exact Parameters
Data Bin Partition Value
Fisher Exact (FE) requires that the data fall into two bins. Usually these are CGH
absent/present calls from discrete data, but if an expression value can be used as a cutoff,
the input data can be essentially continuous. The data bin partition splits the data by
classifying each value as either above the cutoff or below or equal to the cutoff.
The next two buttons allow you to associate one of your data bin names with the
values that are greater than the supplied cutoff.
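The partition rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the program's own code: the function name partition_bins, the bin labels "present" and "absent", and the sample values are all hypothetical.

```python
def partition_bins(values, cutoff, above_label="present", below_label="absent"):
    """Assign each value to one of two bins by comparing it with the cutoff.

    Values strictly greater than the cutoff get above_label; values below
    or equal to the cutoff get below_label. Labels are hypothetical names.
    """
    return [above_label if v > cutoff else below_label for v in values]


# Hypothetical expression values for one gene across five samples.
expression = [0.2, 1.5, 0.9, 2.4, 1.0]
bins = partition_bins(expression, cutoff=1.0)
print(bins)
```

Note that a value exactly equal to the cutoff (1.0 here) lands in the lower bin, matching the "below or equal to" wording above.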
Contingency Matrix Orientation
The 2x2 contingency matrix, which records how the values in an expression vector
are distributed across the sample groups and data bins, can have its columns and
rows swapped. Note that this has no impact on the reported p-values but rather
is used to describe the observed effect if a user is interested in a one-tailed result.
In most studies one generally looks for disproportionality, where a data bin is
overrepresented in one sample group or the other. In some cases, it might be more
important to ask a more specific question, such as which genes are overrepresented in
sample group A. The comparison that is of greatest interest should appear in the upper left
quadrant of the contingency matrix. The one-tailed result will focus on reporting on this
observation.
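To illustrate why orientation matters only for the one-tailed reading, here is a minimal sketch of a Fisher exact calculation on a 2x2 matrix using the hypergeometric distribution. It is not the program's implementation: the function names, the example counts, and the choice to define the two-tailed p-value as the sum of all tables no more probable than the observed one are assumptions made for the example.

```python
from math import comb, isclose


def hypergeom_pmf(a, row1, col1, n):
    """Probability of count a in the upper-left cell, given fixed margins."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)


def fisher_exact_2x2(table):
    """Return (two_tailed_p, upper_tail_p) for a table [[a, b], [c, d]].

    upper_tail_p is the probability of an upper-left count at least as
    large as the observed one, i.e. overrepresentation in that cell.
    """
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    probs = {k: hypergeom_pmf(k, row1, col1, n) for k in range(lo, hi + 1)}
    observed = probs[a]
    # Two-tailed: sum every table that is no more probable than the observed
    # one (small relative tolerance guards against floating-point ties).
    two_tailed = sum(p for p in probs.values() if p <= observed * (1 + 1e-7))
    upper_tail = sum(p for k, p in probs.items() if k >= a)
    return two_tailed, upper_tail


# Hypothetical counts: rows are sample groups A and B,
# columns are the "present" and "absent" data bins.
group_a_first = [[8, 2], [1, 5]]
group_b_first = [[1, 5], [8, 2]]  # same data with the rows swapped

two_a, upper_a = fisher_exact_2x2(group_a_first)
two_b, upper_b = fisher_exact_2x2(group_b_first)
print(isclose(two_a, two_b))  # the two-tailed p-value ignores orientation
print(upper_a, upper_b)       # the one-tailed values answer different questions
```

With group A in the upper row, the upper-tail value addresses whether "present" calls are overrepresented in group A; with the rows swapped, the same calculation addresses group B instead.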
Alpha Value
The alpha value is a p-value cutoff. p-values below this value are considered
justification for rejecting the test's null hypothesis. Lowering the alpha value, or critical p-value, makes the test
more stringent by limiting the chance of a type I error (a false positive) on any single test.
FDR: False Discovery Rate
The False Discovery Rate (FDR) reports an estimate of the fraction of false positives among a set of genes called significant.
This option uses the Benjamini-Hochberg correction, which adjusts each p-value such that the FDR for
the collection of genes with p-values less than or equal to that of a particular gene i is no greater than the corrected
p-value for gene i. The FDR thus estimates the proportion of false positives: if you call 100 genes significant and the FDR
is 0.05, then it is estimated that 5 genes or fewer are falsely called significant.
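The Benjamini-Hochberg adjustment can be sketched as follows. This is the standard step-up form of the correction written for illustration, not the program's own code; the function name benjamini_hochberg and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each by
    # m / rank and keeping the adjusted values monotone in the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw = [0.041, 0.001, 0.6, 0.008, 0.039]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted)
print(significant)  # indices of genes called significant at an FDR of 0.05
```

Calling significant every gene whose adjusted value is at or below the chosen FDR gives a set whose estimated fraction of false positives does not exceed that FDR.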
There are two options for FDR: one in which you supply an FDR prior to the analysis, and a second that provides
a user interface for balancing the number of significant calls against the FDR after the results have been computed. This second
option presents a graph of FDR vs. number of significant genes.
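The trade-off shown by such a graph can be approximated with a short sketch that lists, for each possible number of significant calls, the estimated FDR at that point. How the program computes its graph is not described here, so this assumes the curve is the Benjamini-Hochberg adjusted p-value of the k-th smallest p-value; the function names and example p-values are hypothetical.

```python
def fdr_curve(p_values):
    """Return (number_of_calls, estimated_fdr) pairs, smallest call count first.

    Assumes the estimated FDR for the k most significant genes is the
    Benjamini-Hochberg adjusted p-value of the k-th smallest p-value.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    curve = []
    running_min = 1.0
    for k in range(m, 0, -1):
        running_min = min(running_min, ordered[k - 1] * m / k)
        curve.append((k, running_min))
    curve.reverse()
    return curve


def calls_at_fdr(curve, fdr_limit):
    """Largest number of significant calls whose estimated FDR is within the limit."""
    return max((k for k, fdr in curve if fdr <= fdr_limit), default=0)


# Hypothetical raw p-values for five genes.
curve = fdr_curve([0.041, 0.001, 0.6, 0.008, 0.039])
for k, fdr in curve:
    print(k, fdr)
print(calls_at_fdr(curve, 0.05), calls_at_fdr(curve, 0.10))
```

Reading the pairs from top to bottom corresponds to moving along the graph: accepting a higher FDR admits more significant calls.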