GOPCA class

class gopca.GOPCA(matrix, configs, **kwargs)

Class for performing GO-PCA.

This class implements the GO-PCA algorithm. (The GO enrichment testing is implemented in the enrichment.GeneSetEnrichmentAnalysis class of the genometools package.) The input data consists of an expression matrix (genometools.expression.ExpMatrix) and a list of GO-PCA “configurations” (GOPCAConfig), i.e., pairs of parameter settings and gene set collections.

Parameters:
matrix

genometools.expression.ExpMatrix – The expression matrix.

configs

list of GOPCAConfig – The list of GO-PCA configurations. Each configuration consists of gene sets (represented by a GOPCAGeneSets instance) along with a set of GO-PCA parameters (GOPCAParams) to use for testing those gene sets.

num_components

int – The number of principal components to test. If set to 0, the number is determined automatically using a permutation-based algorithm.

pc_seed

int – The random number generator seed, used to generate the permutations for automatically determining the number of principal components to test.

pc_num_permutations

int – The number of permutations to use for automatically determining the number of principal components to test.

pc_zscore_thresh

float – The z-score threshold used for automatically determining the number of principal components (PCs) to test. First, the fraction of variance explained by the first PC in each permuted dataset is calculated. Then, the mean and standard deviation of those values are used to calculate a z-score for the fraction of variance explained by each PC in the real dataset. All PCs with a z-score above the specified threshold are tested.

pc_max_components

int – The maximum number of principal components (PCs) to test. This is only relevant when the algorithm for automatically determining the number of PCs to test is used. To test a fixed number of PCs, set the num_components attribute to a non-zero value.

verbose

bool – If set to True, generate more verbose output.

estimate_num_components()

Estimate the number of non-trivial PCs using a permutation test.
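
The z-score step described under pc_zscore_thresh can be sketched as follows. This is a minimal illustration, not the GO-PCA implementation: the function name count_significant_pcs and its arguments are hypothetical, the PCA and the permutations are assumed to have been computed already, the sample standard deviation is an assumption (the text above does not say which estimator is used), and treating a cap of 0 as "no cap" is likewise an assumption.

```python
from statistics import mean, stdev


def count_significant_pcs(real_fractions, permuted_first_pc_fractions,
                          zscore_thresh, max_components=0):
    """Count PCs whose explained-variance z-score exceeds a threshold.

    Hypothetical helper for illustration only.

    real_fractions: fraction of variance explained by each PC of the
        real dataset, in PC order.
    permuted_first_pc_fractions: fraction of variance explained by the
        first PC of each permuted dataset.
    max_components: optional cap on the result; 0 means no cap
        (an assumption of this sketch).
    """
    mu = mean(permuted_first_pc_fractions)
    # Sample standard deviation; the estimator is an assumption.
    sigma = stdev(permuted_first_pc_fractions)
    if sigma == 0:
        raise ValueError("permuted values have zero variance")

    # z-score of each real PC against the permutation-based null values.
    zscores = [(f - mu) / sigma for f in real_fractions]
    n = sum(1 for z in zscores if z > zscore_thresh)

    if max_components > 0:
        n = min(n, max_components)
    return n


# Made-up numbers, purely to exercise the function.
real = [0.40, 0.20, 0.06, 0.05]
permuted = [0.050, 0.052, 0.048, 0.051, 0.049]
num_pcs = count_significant_pcs(real, permuted, zscore_thresh=2.0)
num_pcs_capped = count_significant_pcs(real, permuted, zscore_thresh=2.0,
                                       max_components=2)
```

With these made-up inputs, the first three PCs lie well above the permutation mean and the fourth does not, so three PCs pass the threshold; the cap then reduces that to two, mirroring the role of pc_max_components.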

static print_signatures(signatures, maxlength=50, debug=False)

Print a list of signatures, sorted by PC and enrichment score.

run()

Perform GO-PCA.

Returns: The GO-PCA run, or None if the run failed.
Return type: GOPCARun or None

set_param(name, value)

Set a GO-PCA parameter.

Parameters:
  • name (str) – The name of the parameter.
  • value – The value of the parameter.
Return type: None

classmethod simple_setup(matrix, params, gene_sets, gene_ontology=None, **kwargs)

Initialize a GO-PCA instance with only one collection of gene sets.