Frequently asked questions

Technical questions

1. Why does this site look all messy?
That's because you're using Internet Explorer. Use Firefox instead.
To view this site, use a resolution of at least 1280x1024 in Firefox.

2. Why does Firefox ask me to cancel the running script?
Firefox thinks the script has crashed because it runs for so long. This is normal: loading the ontologies takes a while.
To fix this, type "about:config" in the Firefox address bar, press Enter, and set the value of dom.max_script_run_time to 60 or higher.

3. Why is Cytoscape not starting?
Calculations are done and loading is finished, but Cytoscape does not start. You need to allow pop-ups in your browser. You also need an up-to-date version of Java (see Tool question 9).

4. Why do I have to download Cytoscape every time I launch the Cytoscape view?
You need to make sure caching is enabled for the Java plugin. You can find out how to enable caching on this page. Make sure the cache size is big enough (50-100 MB).

Tool questions

1. Can you give me some more information about the precalculated experiment sets?
The precalculated expression compendia contain absolute expression values resulting from the preprocessing of the microarray data.
Preprocessing: RMA procedure (background correction, normalization, summarization - note that this procedure returns log2 transformed expression values)
The TAIR10 - v14 CDF (Chip Description File) downloaded from Brainarray was used to map probes to genes.

  • AtGenExpress All (425 experiments): all experiments performed by AtGenExpress
  • Microarray compendium 1 (454 experiments): collection of microarray experiments oriented towards growth, development and cell cycle studies
  • Microarray compendium 2 (192 experiments): collection of microarray experiments from which very similar experiments were removed, with similar numbers of experiments for each design type
  • Abiotic stress (256 experiments): abiotic stress series (cold, drought, genotoxic, heat, osmotic, oxidative, salt, UV-B, wounding)
  • Biotic stress (69 experiments): biotic stress series (Botrytis, Pseudomonas, Phytophthora, etc.)
  • Development (135 experiments): developmental series (different tissues, developmental stages, developmental mutants)
  • Flower (72 experiments): microarray experiments in which floral tissues are sampled
  • Genetic modification (313 experiments): microarray experiments in which transgenic lines are profiled (gene overexpression (knock-in), gene knock-out, transient transgene expression)
  • Hormone treatment (140 experiments): hormone treatment series (ABA, brassinosteroids, GA, cytokinin, etc. and inhibitors)
  • Leaf (212 experiments): microarray experiments in which leaf tissues are sampled
  • Root (258 experiments): microarray experiments in which root tissues are sampled
  • Seed (83 experiments): microarray experiments in which seed tissues are sampled
  • Stress (abiotic+biotic) (336 experiments): combination of the abiotic and biotic stress datasets
  • Whole plant (85 experiments): microarray experiments in which whole plants are sampled

Design types: MGED Ontology (MO) terms were added to describe the type of experiment performed.
  • stimulus_or_stress_design
    • abiotic_stress_design
    • biotic_stress_design
  • compound_treatment_design
    • hormone_treatment_design
  • genetic_modification_design
  • growth_condition_design
  • time_series_design
  • circadian_rhythm_design
  • development_or_differentiation_design
  • individual_genetic_characteristics_design
  • organism_part_comparison_design
  • strain_or_line_design
  • translational_bias_design


2. Why and how to use multiple precalculated expression datasets?
The correlation coefficient for two genes can vary considerably depending on the input expression dataset.

  • By selecting multiple expression datasets, co-expression in different conditions can be studied simultaneously. The expression datasets that can be selected are described in Tool question 1.

  • Co-expression links are reported when they meet the requirements set in Step 3 (Pearson correlation coefficient threshold, top most correlated genes, or both; see Tool question 8). You can report only those co-expression links that meet the requirements in all selected expression datasets, or in at least X expression datasets.

  • The correlation coefficients found in the different datasets are reported in the "dataset coefficients" attribute (see Tool question 9). On the edges, either the minimum, maximum or average correlation coefficient ("corrcoeff" attribute) over the datasets meeting the requirements is shown.
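
The combination rule described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function name combine_links, its parameters and the gene IDs in the usage note are hypothetical, and a single positive threshold is assumed for simplicity.

```python
from statistics import mean


def combine_links(coeffs_by_dataset, threshold=0.8, atleast=2, summary="average"):
    """Combine per-dataset correlations into reported co-expression links.

    coeffs_by_dataset maps a dataset name to {(gene_a, gene_b): pearson_r}.
    A link is kept when it passes the threshold in at least `atleast` datasets.
    """
    passing_by_pair = {}
    for dataset, coeffs in coeffs_by_dataset.items():
        for pair, r in coeffs.items():
            if r >= threshold:  # assumption: positive threshold only
                passing_by_pair.setdefault(pair, {})[dataset] = r

    summarise = {"minimum": min, "maximum": max, "average": mean}[summary]
    links = {}
    for pair, passing in passing_by_pair.items():
        if len(passing) >= atleast:
            links[pair] = {
                # value shown on the edge, computed over the passing datasets only
                "corrcoeff": summarise(list(passing.values())),
                "dataset coefficients": passing,
                "matching datasets": len(passing),
            }
    return links
```

For example, with a Leaf and a Root dataset in which only the pair (AT1G01010, AT1G01020) passes a 0.8 threshold in both, calling combine_links(data, threshold=0.8, atleast=2) would keep that pair alone, while atleast=1 would also keep pairs that pass in a single dataset.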


    3. What kind of localization data is used?
    Both experimentally identified and predicted localizations are used. The source of the data and the corresponding evidence code are mentioned in the Cytoscape attributes.

    Data type     Data source  Download date
    localization  SUBA         2009-06-05
    localization  IPSort       2007-11-25
    localization  LocTree      2007-11-25
    localization  MitoProt     2007-11-25
    localization  MultiLoc     2007-11-25
    localization  PeroxP       2007-11-25
    localization  Predator     2007-11-25
    localization  SubLoc       2007-11-25
    localization  SwissProt    2007-11-25
    localization  TargetP      2007-11-25
    localization  WoLF_PSORT   2007-11-25

    4. What kind of functional annotation data is used?
    Diverse resources for functional gene annotation are integrated. The source of the data and the corresponding evidence code, the number of databases and the type of data are mentioned in the Cytoscape attributes and can be viewed in the Cytoscape Attribute Browser (lower panel; see Tool question 9).

    Data type                      Data source  Download date
    biological_process (GO)        TAIR         2010-08-21
    molecular_function (GO)        TAIR         2010-08-21
    INTERPRO_domain                TAIR         2008-12-17
    PubMed ID                      TAIR         2011-04-01
    Phenotype                      TAIR         2011-05-24
    MapMan pathways and processes  MapMan       2011-06-17

    5. What kind of protein-protein interaction data is used?
    Both experimentally identified and predicted protein-protein interactions are used. The source of the data is mentioned in the Cytoscape attributes. Note that the AraNet gene-gene association data are also represented as protein-protein interactions, although these are not necessarily direct interactions. MIND0.5 contains the results of a binary split-ubiquitin interaction screen of membrane proteins and proteins curated as signaling proteins. These data need independent verification (see MIND database).

    Data type                    Data source                                                     Download date
    protein-protein_interaction  ArathReactome                                                   2009-10-12
    protein-protein_interaction  AtPID                                                           2007-12-10
    protein-protein_interaction  BioGRID                                                         2010-07-26
    protein-protein_interaction  De Bodt et al., 2009 (filtered)                                 2008-12-09
    protein-protein_interaction  De Bodt et al., 2009 (predicted)                                2008-12-09
    protein-protein_interaction  DIP                                                             2010-06-14
    protein-protein_interaction  Geisler-Lee et al., 2007 (BAR Arabidopsis Interactions Viewer)  2007-08-03
    protein-protein_interaction  IntAct                                                          2010-08-10
    protein-protein_interaction  MINT                                                            2010-07-27
    protein-protein_interaction  TAIR                                                            2009-05-27
    protein-protein_interaction  MIND0.5 (www.associomics.org)                                   2011-08-01
    protein-protein_interaction  Arabidopsis Interactome Mapping Consortium - Yeast-2-hybrid     2011-08-01
    protein-protein_interaction  EVEX text mining data - binding                                 2012-05-02
    gene-gene_association        AraNet (www.functionalnet.org/aranet)                           2011-06-21


    6. What kind of regulatory interaction data is used?
  • Regulatory interactions are retrieved from AGRIS, consisting of interactions identified through ChIP-chip, ChIP-Seq, yeast one-hybrid, EMSA, microarray analysis, ChIP-PCR, RT-PCR.
  • Interactions can be confirmed (solid edges) or unconfirmed (dashed edges) and direct (arrow) or indirect (circle as arrow head).
  • In addition, regulatory interactions are inferred from the CORNET microarray data with genetic_modification as design type (see Tool question 1). These experiments profile the transcripts of transgenic lines in which one or more genes (whether or not they encode a transcription factor) are overexpressed or mutated. Differentially expressed genes are identified using limma (Bioconductor), comparing transgenic to wild-type plants.
  • Finally, direct and indirect regulatory interactions are retrieved from the EVEX text mining resource.
  • Regulatory interactions inferred from microarray data are assumed to be unconfirmed (dashed edges) and indirect (circle as arrow head) in the Cytoscape attributes.
  • If known, it is indicated if the target gene is either activated (green arrow head) or repressed (red arrow head) in the particular regulatory interaction. See legend in Tool question 10.
  • When more than one gene is found to regulate its targets as a complex, the pairwise interactions between these genes are indicated by the "dimer" edge.
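
The visual encoding of regulatory interactions listed above can be summarised as a small lookup. The sketch below is only an illustration of that mapping; the function edge_style and its return keys are hypothetical names, not Cytoscape or CORNET identifiers, and it assumes the arrow color applies to indirect edges in the same way as to direct ones.

```python
def edge_style(confirmed, direct, effect=None):
    """Map a regulatory interaction to the edge appearance described above.

    confirmed: True for confirmed interactions, False for unconfirmed ones.
    direct:    True for direct interactions, False for indirect ones.
    effect:    "activation", "repression", or None when unknown.
    """
    colors = {"activation": "green", "repression": "red"}
    return {
        "line": "solid" if confirmed else "dashed",
        "arrow_head": "arrow" if direct else "circle",
        "arrow_color": colors.get(effect, "black"),  # black when unknown
    }
```

An interaction inferred from microarray data, which is treated as unconfirmed and indirect, corresponds to edge_style(False, False): a dashed line with a circle as arrow head.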



    7. What guidelines should I follow when setting the options in Step 3?
    • Start with stringent parameters depending on the number of query genes:
      • High correlation coefficient (e.g. 0.8)
      • Low number of neighbours (e.g. 10)
    • Gradually loosen the stringency of the parameters
    • When the co-expression network is too large, a warning is given. In that case, a text output can be chosen instead of the visual Cytoscape output.

    8. What do all the options in Step 3 mean exactly?
    Pairwise comparisons:
    All possible combinations between query genes are made. A correlation coefficient is calculated for each pair of genes.
    Neighbours:
    Add extra genes. Every query gene is compared to the complete Arabidopsis genome. A gene pair with a correlation coefficient above the chosen thresholds or a gene belonging to the top X most correlated genes is reported.
    Thresholds:
    Two thresholds can be chosen:
    1. Correlation coefficients higher and lower than a certain value (if given) are reported.
    2. The top X genes with the highest correlation coefficients are reported.
    When using multiple datasets, the same approach is followed for each expression dataset, and the results are combined or intersected based on the "atleast" parameter. For example, if atleast=2 is chosen, only co-expression links meeting the requirements in two or more datasets are reported.
    The most limiting threshold always has priority. For example, if you ask for the top 10 neighbours but only 6 meet the first threshold's requirement, only 6 are shown. If you ask for the top 10 neighbours but many more meet the first threshold's requirement, only 10 are shown.
    Relations between neighbours:
    If you want to test if the neighbours of your query gene are co-expressed, you need to choose this option.
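
The interplay between the two thresholds can be illustrated with a short Python sketch. It is not the tool's own code: the function names pearson and neighbours, the parameters min_r and top_x, and the input layout are assumptions made for this example, only a positive cut-off is handled, and every expression profile is assumed to be non-constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def neighbours(query, expression, min_r=None, top_x=None):
    """Return (gene, r) pairs for the query gene, most correlated first.

    expression maps a gene ID to its list of expression values.
    Both limits are applied, so the most limiting one decides the result.
    """
    scored = [
        (gene, pearson(expression[query], values))
        for gene, values in expression.items()
        if gene != query
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    if min_r is not None:
        scored = [(gene, r) for gene, r in scored if r >= min_r]
    if top_x is not None:
        scored = scored[:top_x]
    return scored
```

Asking for top_x=10 with min_r=0.8 returns fewer than 10 genes when only a few pass the cut-off, and exactly 10 when many more pass it, which matches the priority rule described above.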

    9. What is Cytoscape and how do I use it with this tool?
    Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. Cytoscape is launched through the JAVA™ Web Start Launcher.
    You may need to update JAVA™ for the Cytoscape Web Start: get the latest JAVA™ Runtime Environment.
    You can also download and install the latest version of Cytoscape locally on your computer.

    To view more information about an edge or a node, click on the "Select attributes" button in the Data Panel.

    Then select all the desired columns you want to be visible.
    This table is an overview of the available attributes:
    Node attributes
    ID: AGI code
    description: short TAIR functional description
    descriptionLong: verbose TAIR functional description
    gene.selected: indicates in which step of the application this gene was a query gene
    localization: localization of the gene
    localizationReferences: references and evidence codes of the localization of the gene
      Hint: Move your mouse over this column and wait for the tooltip to appear. The tooltip is nicely formatted HTML and easier to read.
    biologicalProcess: GO biological process
    molecularFunction: GO molecular function
    proteinDomains: InterPro protein domains
    MapMan: MapMan pathways and processes
    PMID: PubMed ID assigned to a certain gene by TAIR
    Phenotype: Phenotype assigned to a transgenic line of a certain gene by TAIR

    Edge attributes
    ID: label describing the relation between 2 genes: (cor) = correlation, (pp) = protein-protein interaction, (tf) = regulatory interaction
    interaction: kind of relation between genes, e.g. regulatory interactions (activation or repression, direct or indirect)
    corrcoeff: The correlation value [-1.0,1.0] between 2 genes if the edge indicates a correlation
    n_ref: how many times this protein-protein interaction is referenced
    features: detailed information about the references
      Hint: Move your mouse over this column and wait for the tooltip to appear. The tooltip is nicely formatted HTML and easier to read.
    prop: indicates if the protein-protein interaction is experimental or predicted
    dataset coefficients: individual coefficients for each dataset, if the coefficient is situated between -0.6 and 0.6, NA (not available) is returned
    matching datasets: number of datasets returning a correlation coefficient matching the selected criteria
    PPI_evidence: additional evidence from the original data source (AraNet and MIND0.5)
    evexconf: EVEX confidence values
    evexeventid: EVEX event ids which can be used to link out to the original EVEX database (right-click in attributes table > Search on web > Plants_Arabidopsis)
    evexevi: indicates whether an event is speculated and/or negated
    evexreg: specifies the type of regulation (e.g. regulation of expression)

    10. What do the colors and shapes mean in the networks in Cytoscape?
    Visualization legend (edge, arrow and node symbols are shown in the legend image):

    Correlation networks
      Edge color: see legend image
    Protein-protein interactions
      Edge color: black
      Edge width: the more references, the wider the edge
      Edge style: experimental or predicted
    Transcription factor interactions
      Edge color: black
      Edge width: the more references, the wider the edge
      Edge style: confirmed or unconfirmed
      Edge arrow: direct + activation, direct + repression, direct + unknown, or indirect
      Arrow color: green = activation, red = repression, black = unknown
    Combined edges: COR and PPI, COR and TF, TF and PPI
    Each network type has its own query gene symbol.

    Localization pie colors (the size of a pie slice represents how often a value is referenced compared to the others):
     CYTOSKELETON
     PEROXISOME
     OTHER
     CELLPLATE
     NUCLEOLUS
     CYTOSOL
     CHLOROPLAST
     NUCLEUS
     MITOCHONDRIA
     ENDOSOME
     PLASTID
     VACUOLE
     ENDOPLASMICRETICULUM
     EXTRACELLULAR
     PLASMAMEMBRANE
     GOLGI

    11. How can I compile a user-defined dataset of microarray experiments?
    There are 2 ways to do this (you will be the only user able to view this dataset):
    1. Via the Co-expression tool -> User-defined page
      • Search for the desired experiments by choosing keywords.
      • Hit the "Fetch Experiments!" button.
      • Select the experiments you actually want to use.
      • Give your set of experiments a name. (This name must be unique per user.)
      • Click the "Normalize" button.
      • After normalization is finished, click the link that appears to continue.
      • This brings you to the Co-expression tool -> Predefined page, but with your dataset already selected. (You can still save the data locally by clicking the Save icon next to it.)
    2. Via the Browse experiments page
      • Search for the desired experiments by choosing keywords.
      • Hit the "Fetch Experiments!" button.
      • Select the experiments you actually want to use.
      • Click the "Normalize" button.
      • After normalization is finished, you'll see a link to a file. Download the file.
      • You will receive a zipped file starting with norm_ followed by a random alphanumeric string.
        In this file, you'll find two text files:
        • rma_randomString.txt: contains the normalized rma data
          The first line of this file is a comment line containing the experiment name per column.
        • desc_randomString.txt: contains the description of the experiments mentioned in the rma file.
          This file is formatted as a tab delimited file. The first line contains the column headers and each row shows the information for one experiment.
      • You can use these files as input on the Co-expression tool -> Predefined page now, using the Upload own rma data option.
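
If you want to inspect the downloaded archive before uploading it, the two text files can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names (read_rma, read_descriptions, load_norm_zip) are hypothetical, the files are assumed to be UTF-8 text, the first field of the comment line is assumed to be a "#" marker, and all expression values are assumed to be numeric.

```python
import csv
import io
import zipfile


def read_rma(handle):
    """Parse RMA text: a comment line with experiment names, then one gene per row."""
    header = handle.readline().rstrip("\r\n").split("\t")
    experiments = header[1:]  # header[0] is assumed to be the '#' marker
    values = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        values[fields[0]] = [float(v) for v in fields[1:]]
    return experiments, values


def read_descriptions(handle):
    """Parse the tab-delimited description file into one dict per experiment."""
    return list(csv.DictReader(handle, delimiter="\t"))


def load_norm_zip(source):
    """Read the rma_*.txt and desc_*.txt members of a norm_* archive."""
    experiments, values, descriptions = [], {}, []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]
            if not base.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                if base.startswith("rma_"):
                    experiments, values = read_rma(text)
                elif base.startswith("desc_"):
                    descriptions = read_descriptions(text)
    return experiments, values, descriptions
```

load_norm_zip accepts either a path to the downloaded zip file or an open binary file object. It returns the experiment names, a dictionary of expression values per gene, and the description rows keyed by whatever column headers the description file provides.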

    12. How do I use my own microarray data with this tool?
    • I have raw data (CEL files):
      1. Go to the Upload page (You can edit previously uploaded experiments by going to the Edit page)
      2. Fill in the form
      3. Your experiments are now available to you through Browse experiments.
        Use them as explained in Tool question 11 (How can I compile a user-defined dataset of microarray experiments?).
      4. Your own experiments and public experiments are marked with different icons.
      5. For more details, download the tutorial.
    • I have normalized data (rma file):
      1. Go to the Co-expression tool -> Predefined page
      2. Choose the "Upload your own data" option in Step 2
      3. Browse to your normalized data file
        This file should meet the following requirements:
        • The first line is a header line with experiment names. For each column, there should be a name in the header. These names should be separated by tabs.
        • The first character on the header line should be a #, followed by a tab.
        • The first column should contain the gene IDs (AGI codes).
        • Each row represents the expression values of one gene over all experiments, separated by tabs.
    PLEASE NOTE THAT IT CAN TAKE A LONG TIME FOR A FILE TO UPLOAD OVER A SLOW CONNECTION
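
Because uploads can be slow, it may be worth checking a file against the requirements above before sending it. The following Python sketch shows one way to do that; it is not part of the tool. The function name check_rma_lines is hypothetical, the AGI pattern (for example AT1G01010, ATCG00010, ATMG00010) is an assumption about which gene IDs are accepted, and the check that all values are numeric is an extra assumption not stated in the requirements.

```python
import re

# Assumed AGI locus format: AT + chromosome (1-5, C or M) + G + five digits.
AGI = re.compile(r"^AT[1-5CM]G\d{5}$", re.IGNORECASE)


def check_rma_lines(lines):
    """Return a list of problems found in the lines of a normalized data file."""
    rows = [line.rstrip("\r\n") for line in lines if line.strip()]
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0].split("\t")
    if header[0] != "#":
        problems.append("header line must start with '#' followed by a tab")
    if len(header) < 2 or any(not name.strip() for name in header[1:]):
        problems.append("every data column needs an experiment name")

    for number, row in enumerate(rows[1:], start=2):
        fields = row.split("\t")
        if not AGI.match(fields[0]):
            problems.append(f"row {number}: '{fields[0]}' is not an AGI code")
        if len(fields) != len(header):
            problems.append(
                f"row {number}: expected {len(header) - 1} values, found {len(fields) - 1}"
            )
        for value in fields[1:]:
            try:
                float(value)
            except ValueError:
                problems.append(f"row {number}: '{value}' is not a number")
    return problems
```

A typical use would be to open the file in text mode, pass the file object to check_rma_lines, and upload only when the returned list is empty.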

    13. Where can I download raw microarray data (CEL files)?
    1. Go to the Browse experiments page.
    2. Fetch the experiments you want by choosing keywords.
    3. Select the experiments you actually want to download from the list of fetched experiments.
    4. Hit the "Download raw data" button. !!! WHEN DOWNLOADING A LARGE AMOUNT OF DATA IT IS NORMAL THAT THE PAGE TAKES A WHILE TO LOAD !!!
    5. The file you download contains all the CEL files and a text file called input_<randomString>.txt. This text file can be used, for example, as input for an R script.
      The download also contains a file called desc_<randomString>.txt, which contains the descriptions of the experiments.