Setting ENIGMA properties
The enigma.properties file contains the following fields that define ENIGMA settings (example values are filled in):
ENIGMA parameters
Directory in which the data files are located.
- expressionData=hughratios
Log-ratio expression data. File format:
UID | DESCRIPTION | condition1 | condition2 | condition3
gene a | | 0.001 | 0.033 | -0.01
gene b | | -0.045 | -0.012 | 0.03
gene c | | -0.018 | 0.02 | 0.059
P-values for over- or under-expression of genes under experimental conditions. File format:
UID | DESCRIPTION | condition1 | condition2 | condition3
gene a | | 0.001 | 1.20E-04 | 0.050
gene b | | 0.935 | 0.352 | 0.032
gene c | | 0.566 | 1.23E-05 | 0.220
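The two tables above share the same layout, so a quick consistency check before running ENIGMA can catch mismatched files. The sketch below is only an illustration: it assumes the files are tab-delimited with a single header row (the delimiter is not stated here), and the helper names `read_matrix` and `check_aligned` are hypothetical, not part of ENIGMA.

```python
import csv
import io

def read_matrix(handle, delimiter="\t"):
    """Read a UID / DESCRIPTION / conditions table.

    Returns (condition_names, {uid: [float, ...]}).
    The tab delimiter is an assumption of this sketch.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    conditions = header[2:]  # everything after UID and DESCRIPTION
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values[row[0]] = [float(cell) for cell in row[2:]]
    return conditions, values

def check_aligned(expression, pvalues):
    """Return a list of problems; an empty list means the tables line up."""
    (expr_conds, expr_vals), (p_conds, p_vals) = expression, pvalues
    problems = []
    if expr_conds != p_conds:
        problems.append("condition columns differ")
    if set(expr_vals) != set(p_vals):
        problems.append("gene identifiers differ")
    for uid, row in p_vals.items():
        if any(not 0.0 <= p <= 1.0 for p in row):
            problems.append(f"p-value outside [0, 1] for {uid}")
    return problems

# Example values copied from the two tables above.
HEADER = "UID\tDESCRIPTION\tcondition1\tcondition2\tcondition3\n"
EXPR_TEXT = HEADER + (
    "gene a\t\t0.001\t0.033\t-0.01\n"
    "gene b\t\t-0.045\t-0.012\t0.03\n"
    "gene c\t\t-0.018\t0.02\t0.059\n"
)
PVAL_TEXT = HEADER + (
    "gene a\t\t0.001\t1.20E-04\t0.050\n"
    "gene b\t\t0.935\t0.352\t0.032\n"
    "gene c\t\t0.566\t1.23E-05\t0.220\n"
)

expression = read_matrix(io.StringIO(EXPR_TEXT))
pvalues = read_matrix(io.StringIO(PVAL_TEXT))
print(check_aligned(expression, pvalues))
```

To check real files, pass open file handles instead of the `io.StringIO` objects.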
- geneDescriptionFile=SGD_features.tab
File format. Only the fourth (ORF), fifth (NAME) and sixteenth (DESCRIPTION) columns are used by the present version of ENIGMA; all other columns can be left empty.
DBREF | FEATURE_TYPE | FEATURE_QUALIFIER | ORF | NAME | ALIAS | ... | | | | | | | | ... | DESCRIPTION
S000002143 | ORF | Dubious | YAL069W | | | | | | | | | | | | Dubious open reading frame unlikely to encode a protein, based on available experimental and comparative sequence data
S000000054 | ORF | Verified | YAL058W | CNE1 | FUN48 | | | | | | | | | | Calnexin
S000000045 | ORF | Verified | YAL047C | SPC72 | LDB4 | | | | | | | | | | Component of the cytoplasmic Tub4p (gamma-tubulin) complex, binds spindle pole bodies and links them to microtubules
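Since only three of the sixteen columns are used, a custom gene description file can be checked by pulling out just those columns. The following sketch assumes a tab-delimited file without a header row; the function name `read_gene_descriptions` is hypothetical and not part of ENIGMA.

```python
import io

def read_gene_descriptions(handle):
    """Map ORF -> (NAME, DESCRIPTION) using columns 4, 5 and 16.

    Assumes tab-delimited rows and no header row (both are assumptions).
    Rows with fewer than 16 columns or an empty ORF are skipped.
    """
    genes = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 16:
            continue
        orf, name, description = fields[3], fields[4], fields[15]
        if orf:
            genes[orf] = (name, description)
    return genes

# Two rows from the table above; unused columns are left empty.
ROWS = [
    ["S000000054", "ORF", "Verified", "YAL058W", "CNE1", "FUN48"]
    + [""] * 9 + ["Calnexin"],
    ["S000002143", "ORF", "Dubious", "YAL069W"] + [""] * 11
    + ["Dubious open reading frame unlikely to encode a protein, "
       "based on available experimental and comparative sequence data"],
]
TEXT = "".join("\t".join(row) + "\n" for row in ROWS)

genes = read_gene_descriptions(io.StringIO(TEXT))
print(genes["YAL058W"])
```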
- chipData=harb04_reg_graph_0_p005_ORF.txt
File with ChIP binding data or motif data. If chipData=null, no TF binding site overrepresentation analysis will be performed. File format:
TF | Target | Binding_Sites
YMR016C | YMR016C | 14
YMR016C | YDR309C | 6
YMR016C | YKL062W | 10
'Binding_Sites' is the number of binding sites/motifs for the TF in the target's promoter. This number is not really used; if you have no information on the number of binding sites, just fill in ones.
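If only TF-target pairs are available, the third column can be filled with ones as described above. A minimal sketch, assuming a tab-delimited file with the header row shown in the table (both assumptions); `write_chip_data` is a hypothetical helper name.

```python
import io

def write_chip_data(pairs, handle):
    """Write (TF, target) pairs in the three-column layout shown above.

    The Binding_Sites column is filled with 1 for every pair, for the
    case where no binding site counts are available.
    """
    handle.write("TF\tTarget\tBinding_Sites\n")
    for tf, target in pairs:
        handle.write(f"{tf}\t{target}\t1\n")

# TF-target pairs taken from the example table above.
pairs = [("YMR016C", "YDR309C"), ("YMR016C", "YKL062W")]
buffer = io.StringIO()
write_chip_data(pairs, buffer)
print(buffer.getvalue())
```

To produce a file on disk, pass an open file handle instead of the `io.StringIO` buffer.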
- interactionData=BIOGRID-ORGANISM-Saccharomyces_cerevisiae-2.0.35.tab.txt
A BioGRID-style text file containing protein and genetic interactions. See the BioGRID website for details and downloadable interaction files.
- regulatorData=regulators.txt
Either a custom list of potential regulators, or 'regulatorData=default', in which case the regulators are selected from the GO categories specified under regulatoryGoCats. Custom list file format:
ORF | NAME | DESCRIPTION
YAL040C | CLN3 | G1/S transition of mitotic cell cycle* G1/S-specific cyclin
YAL041W | CDC24 | establishment of cell polarity (sensu Saccharomyces)* signal transducer*
YBL005W | PDR3 | transport transcription factor
- regulatoryGoCats=30528 4672 79
A space-separated list of GO category numbers from which potential regulators should be selected.
- clusterPar1=0.30
- clusterPar2=0.55
These lines define custom ENIGMA clustering parameters. If clusterPar1=NaN and clusterPar2=NaN, ENIGMA will search for optimal parameter settings using a Simulated Annealing procedure. Otherwise, ENIGMA will look for modules using the specified parameter settings. Both values lie between 0 and 1: clusterPar1 controls the spacing of the modules, and clusterPar2 controls the size and coherence of individual modules.
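The two cases described above (both values NaN, or both values between 0 and 1) can be checked before a run. The sketch below is illustrative only: `parse_cluster_params` is a hypothetical helper, and rejecting a mix of one NaN and one number is a conservative choice of this sketch, since ENIGMA's behaviour in that case is not described here.

```python
import math

def parse_cluster_params(par1, par2):
    """Interpret clusterPar1 / clusterPar2 strings.

    Returns None when both are NaN (parameter search requested),
    otherwise the pair of floats, which must lie in [0, 1].
    """
    p1, p2 = float(par1), float(par2)
    if math.isnan(p1) and math.isnan(p2):
        return None
    if math.isnan(p1) or math.isnan(p2):
        # Assumption of this sketch: a mixed setting is treated as an error.
        raise ValueError("set both parameters to NaN or both to numbers")
    for name, value in (("clusterPar1", p1), ("clusterPar2", p2)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return p1, p2

print(parse_cluster_params("0.30", "0.55"))
print(parse_cluster_params("NaN", "NaN"))
```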
p-value threshold for significant up- or down-regulation. NOTE: if pvals=false (i.e. if you don't wish to use p-values), this threshold can be set to a log-ratio cutoff for 'significant' up- or down-regulation. E.g. when 'pvals=false' is used in combination with 'pvalThreshold=1' on data in log2-ratio format, two-fold up- or down-regulation of expression will be considered 'significant'.
'true' if you want to use p-values to assess up- or down-regulation of genes under specific conditions (recommended), 'false' if you want to use a log-ratio cutoff instead. In the latter case, you should specify the same file in both the 'expressionData' and 'pvaldata' fields (see above).
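The two modes of the threshold can be illustrated with a small function. This is a sketch under stated assumptions, not ENIGMA's implementation: the function name `call_regulation` is hypothetical, the comparisons are assumed to be inclusive, and the direction of regulation is assumed to come from the sign of the log-ratio in both modes.

```python
def call_regulation(log_ratio, p_value, threshold, use_pvals):
    """Return 'up', 'down' or 'none' for one gene under one condition.

    use_pvals=True : significant when p_value <= threshold.
    use_pvals=False: significant when |log_ratio| >= threshold.
    Inclusive comparisons are an assumption of this sketch.
    """
    if use_pvals:
        significant = p_value <= threshold
    else:
        significant = abs(log_ratio) >= threshold
    if not significant or log_ratio == 0:
        return "none"
    return "up" if log_ratio > 0 else "down"

# With log2 ratios, a cutoff of 1 corresponds to a fold change of 2 ** 1 = 2.
print(call_regulation(1.0, None, 1, use_pvals=False))
print(call_regulation(-1.2, None, 1, use_pvals=False))
# With p-values, the same field acts as a significance threshold.
print(call_regulation(0.033, 1.2e-4, 0.05, use_pvals=True))
```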
Significance level for FDR correction of expression correlation p-values. Also used as FDR level for determining the set of conditions under which a module's genes show a significant response (expression up or down).
FDR level used for selecting enriched TF binding sites from chipData.
The number of Simulated Annealing runs, if clusterPar1=NaN and clusterPar2=NaN.
Begin temperature of first (rough) stage of Simulated Annealing.
End temperature of first (rough) stage of Simulated Annealing.
Size of steps in parameter space in first (rough) stage of Simulated Annealing.
Cooling rate during first (rough) stage of Simulated Annealing.
Begin temperature of second stage of Simulated Annealing.
End temperature of second stage of Simulated Annealing.
Size of steps in parameter space in second stage of Simulated Annealing.
Cooling rate during second stage of Simulated Annealing.
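To give a feel for how begin temperature, end temperature, step size and cooling rate interact, here is a generic two-stage annealing sketch over the two clustering parameters. It is not ENIGMA's procedure: the geometric cooling schedule, the Metropolis acceptance rule, the toy objective and all numeric values are assumptions chosen for illustration.

```python
import math
import random

def geometric_schedule(t_begin, t_end, cooling_rate):
    """Yield temperatures from t_begin down to t_end (geometric cooling)."""
    if not 0.0 < cooling_rate < 1.0:
        raise ValueError("cooling_rate must be between 0 and 1")
    if not 0.0 < t_end <= t_begin:
        raise ValueError("need 0 < t_end <= t_begin")
    temperature = t_begin
    while temperature >= t_end:
        yield temperature
        temperature *= cooling_rate

def anneal_stage(score, start, t_begin, t_end, step_size, cooling_rate, rng):
    """Run one annealing stage and return the best parameter pair seen."""
    current = best = start
    for temperature in geometric_schedule(t_begin, t_end, cooling_rate):
        # Propose a move of at most step_size per parameter, kept in [0, 1].
        candidate = tuple(
            min(1.0, max(0.0, p + rng.uniform(-step_size, step_size)))
            for p in current
        )
        delta = score(candidate) - score(current)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if score(current) > score(best):
            best = current
    return best

def toy_score(params):
    """Hypothetical objective with its maximum at (0.30, 0.55)."""
    return -((params[0] - 0.30) ** 2 + (params[1] - 0.55) ** 2)

rng = random.Random(0)
# Rough stage: high temperatures and large steps.
rough = anneal_stage(toy_score, (0.5, 0.5), 1.0, 0.01, 0.2, 0.95, rng)
# Second stage: starts from the rough result, lower temperatures, small steps.
fine = anneal_stage(toy_score, rough, 0.01, 0.0001, 0.02, 0.95, rng)
print(rough, fine)
```

In this sketch a cooling rate closer to 1 gives more temperature steps per stage, and a smaller step size restricts each move to a narrower neighbourhood.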
True if you want to narrow down the condition sets for the modules to the most relevant ones, false otherwise.
Cosine correlation threshold for grouping conditions into leaves. E.g. conditions in the same leaf should have a cosine correlation of at least 0.65. If you don't want to define leaves but just cluster hierarchically, use cosCorrThreshold=-1.
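As an illustration of the quantity being thresholded, the sketch below computes the cosine (uncentered) correlation between two condition profiles and compares it to the threshold. How ENIGMA builds leaves from these values is not described here; the function names and the pairwise check are assumptions of this sketch.

```python
import math

def cosine_correlation(x, y):
    """Cosine of the angle between two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def meets_threshold(x, y, cos_corr_threshold=0.65):
    """True if two condition profiles are similar enough to share a leaf."""
    return cosine_correlation(x, y) >= cos_corr_threshold

# Condition columns from the expression data example (genes a, b, c).
condition1 = [0.001, -0.045, -0.018]
condition2 = [0.033, -0.012, 0.02]
print(cosine_correlation(condition1, condition2))
print(meets_threshold(condition1, condition2))
```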
Draw figures for all modules. drawModules=eps for eps figures, drawModules=png for png figures, drawModules=null if you don't want figures to be drawn.
BiNGO parameters
True if you want to perform GO overrepresentation analysis on the gene sets of the modules, otherwise false.
True if you want to perform GO overrepresentation analysis on the condition sets of the modules, otherwise false. This option requires that your condition names start with the names of perturbed genes, separated from the rest of the condition name by ',' or '('.
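The naming requirement above can be checked with a few lines of code. This sketch extracts the part of a condition name before the first ',' or '('; the function name and the example condition names are hypothetical, and how ENIGMA itself parses the names is not documented here.

```python
import re

def perturbed_gene_prefix(condition_name):
    """Return the text before the first ',' or '(' in a condition name.

    If neither separator occurs, the whole (stripped) name is returned.
    """
    prefix = re.split(r"[,(]", condition_name, maxsplit=1)[0]
    return prefix.strip()

# Hypothetical condition names following the required convention.
for name in ["swi4 (deletion)", "YAL040C, 2 h", "CLN3(overexpression)"]:
    print(name, "->", perturbed_gene_prefix(name))
```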
FDR significance level for GO overrepresentation analysis, generally 0.05 or 0.01.
- annotationFile=S_cerevisiae_default
GO annotation file, either one of the default files or a custom file (see the BiNGO website for info on making custom annotation files). Currently allowed default files are:
S_cerevisiae_default
A_thaliana_default
S_pombe_default
T_brucei_default
C_elegans_default
D_melanogaster_default
B_rerio_default
H_sapiens_default
M_musculus_default
R_norvegicus_default
P_falsiparum_3D7_default
O_sativa_japonica_default
B_anthracis_Ames_default
S_oneidensis_MR-1_default
P_syringae_DC3000_default
C_burnetii_RSA_default
G_sulfurreducens_PCA_default
M_capsulatus_Bath_default
L_monocytogenes_4b_F2365_default
C_jejuni_RM1221_default
D_ethenogenes_195_default
S_pomeroyi_DSS-3_default
G_gallus_default
B_taurus_default
- ontologyFile=GO_Biological_Process
GO ontology file, either one of the default files or a custom file (see the BiNGO website for info on making custom ontologies). Currently allowed default files are:
GO_Biological_Process
GO_Molecular_Function
GO_Cellular_Component
GO_Full
GOSlim_Generic
GOSlim_GOA
GOSlim_Plants
GOSlim_Yeast
True if you use one of the default GO annotation files, false otherwise.
True if you use one of the default GO ontology files, false otherwise.
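For scripting around ENIGMA, the settings can be read back from a properties file with a few lines of code. The reader below is a simplified sketch: it handles only plain `key=value` lines and `#` or `!` comment lines, not the full Java properties syntax, and `parse_properties` is a hypothetical helper name. The example values are the ones shown in this section.

```python
def parse_properties(text):
    """Parse simple key=value lines into a dict of strings.

    Blank lines and lines starting with '#' or '!' are ignored.
    Line continuations and ':' separators are not supported.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

EXAMPLE = """\
# example values from this section
expressionData=hughratios
chipData=harb04_reg_graph_0_p005_ORF.txt
regulatorData=regulators.txt
regulatoryGoCats=30528 4672 79
clusterPar1=0.30
clusterPar2=0.55
annotationFile=S_cerevisiae_default
ontologyFile=GO_Biological_Process
"""

props = parse_properties(EXAMPLE)
go_categories = props["regulatoryGoCats"].split()  # space-separated list
run_chip_analysis = props["chipData"] != "null"    # null disables the analysis
print(go_categories, run_chip_analysis)
```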