Frequently Asked Questions
Platform & Annotations
Previous versions of the PLAZA platform are still available and can be reached using Menubar→Version (cf. dicots 3.0, top right next to the search bar). Older versions no longer receive updates to their data or tools. In contrast, Workbench experiments containing user-defined gene sets are still fully operational and are maintained independently per PLAZA version.
Current supported web browsers:
Some parts of the site look scrambled/different compared to the screenshots/tutorial.
I think I found an error/bug!
Please use the contact information at the bottom of each page to let us know. We're continuously updating the website and implementing more features so we'll get it fixed fast.
If you have used PLAZA for your research and would like to cite it in your paper, please use one of these references:
- Proost, S., Van Bel, M., Vaneechoutte, D., Van de Peer, Y., Inze, D., Mueller-Roeber, B., Vandepoele, K. (2015) PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Research
- Van Bel, M., Proost, S., Wischnitzki, E., Movahedi, S., Scheerlinck, C., Van de Peer, Y., Vandepoele, K. (2012) Dissecting plant genomes with the PLAZA comparative genomics platform. Plant Physiology 158:590-600
- Proost, S., Van Bel, M., Sterk, L., Billiau, K., Van Parys, T., Van de Peer, Y., Vandepoele, K. (2009) PLAZA: a comparative genomics resource to study gene and genome evolution in plants. The Plant Cell 21: 3718-3731
- Vandepoele, K., Van Bel, M., Richard, G., Van Landeghem, S., Verhelst, B., Moreau, H., Van de Peer, Y., Grimsley, N., Piganeau, G. (2013) pico-PLAZA, a genome database of microbial photosynthetic eukaryotes. Environmental Microbiology 15: 2147-2153
The PLAZA platform uses the Arabidopsis standard for gene identifiers: a prefix for the species, an indication of the chromosome, “G”, and then a gene counter, e.g. AT+1+G+00450 (AT1G00450). This standard is applied to all genes from all species; for example, the Zea mays gene GRMZM2G454425 is represented in the PLAZA platform by ZM10G17690. For Arabidopsis thaliana the original identifiers are retained, whereas for other species both the original and the PLAZA gene identifier are supported.
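The identifier layout described above can be illustrated with a small parser. This is only a sketch: the function name `split_plaza_id` is hypothetical, and the pattern assumes a two-letter species prefix and a numeric chromosome, as in the two examples given (AT1G00450 and ZM10G17690). Identifiers for other species, organelles, or unplaced scaffolds may not follow this pattern.

```python
import re

# Assumed layout: two-letter species prefix, chromosome number, "G", gene counter.
# This pattern is inferred from the examples in the text, not from a PLAZA specification.
PLAZA_ID = re.compile(r"^(?P<species>[A-Z]{2})(?P<chromosome>\d+)G(?P<counter>\d+)$")


def split_plaza_id(gene_id):
    """Split an identifier into (species prefix, chromosome, counter).

    Returns None when the identifier does not match the assumed layout.
    The counter is kept as a string so that leading zeros are preserved.
    """
    match = PLAZA_ID.match(gene_id)
    if match is None:
        return None
    return (
        match.group("species"),
        int(match.group("chromosome")),
        match.group("counter"),
    )
```

With these assumptions, `split_plaza_id("ZM10G17690")` yields the prefix `ZM`, chromosome `10` and counter `17690`, while the original maize identifier `GRMZM2G454425` is rejected because its prefix is longer than two letters.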
With PLAZA 3.0, several PLAZA releases exist simultaneously, so it is necessary to specify which version you are linking to. To do this, the version needs to be specified in the URL. Replace [version] with:
- plaza_v1 : The original release (2009, published in Plant Cell)
- plaza_v2 : The (unpublished) second version (2011)
- plaza_v2_5 : The 2012 release (published in Plant Physiology)
- plaza_v3_dicots : The 2014 dicot centric version
- plaza_v3_monocots : The 2014 monocot centric version
- pico-plaza : The public release for pico-PLAZA (2013, published in Environmental Microbiology)
In the PLAZA website all genes are identified through this PLAZA identifier and a typical gene page looks like this: http://bioinformatics.psb.ugent.be/plaza/versions/[version]/genes/view/ZM10G17690
To link from an external source to a PLAZA gene page, a special web page was created which redirects to the correct PLAZA gene page. The template for this web page is http://bioinformatics.psb.ugent.be/plaza/versions/[version]/genes/locate_gene/SPECIES/IDENTIFIER
We tried to extract as much meaningful meta-data as possible from the different resources from which we retrieved the genes (original protein identifiers, prediction or transcript identifiers, gene names, etc.). These can then be used to link to the PLAZA website.
Physcomitrella patens : http://bioinformatics.psb.ugent.be/plaza/versions/[version]/genes/locate_gene/ppa/173565 (uses the protein id provided in GFF file)
Zea mays : http://bioinformatics.psb.ugent.be/plaza/versions/[version]/genes/locate_gene/zma/GRMZM2G454425 (uses the standard Zea mays identifiers)
Arabidopsis lyrata : http://bioinformatics.psb.ugent.be/plaza/versions/[version]/genes/locate_gene/aly/Al_scaffold_0009_8 (uses prediction identifiers)
In case it is not clear how to link to the PLAZA website, please contact the site administrator.
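For anyone generating such links programmatically, the two URL templates above can be filled in as follows. This is a minimal sketch: the function names are hypothetical, the list of versions is copied from the list above, and the species codes (ppa, zma, aly) are taken from the examples. Whether a given identifier resolves depends on the meta-data stored for that species.

```python
from urllib.parse import quote

BASE_URL = "http://bioinformatics.psb.ugent.be/plaza/versions"

# Version strings as listed in this FAQ.
KNOWN_VERSIONS = {
    "plaza_v1",
    "plaza_v2",
    "plaza_v2_5",
    "plaza_v3_dicots",
    "plaza_v3_monocots",
    "pico-plaza",
}


def _check_version(version):
    if version not in KNOWN_VERSIONS:
        raise ValueError(f"unknown PLAZA version: {version!r}")


def gene_page_url(version, plaza_id):
    """URL of a gene page, addressed by its PLAZA identifier."""
    _check_version(version)
    return f"{BASE_URL}/{version}/genes/view/{quote(plaza_id, safe='')}"


def locate_gene_url(version, species, identifier):
    """URL of the redirect page, addressed by species code and external identifier."""
    _check_version(version)
    return (
        f"{BASE_URL}/{version}/genes/locate_gene/"
        f"{quote(species, safe='')}/{quote(identifier, safe='')}"
    )
```

For example, `locate_gene_url("plaza_v3_monocots", "zma", "GRMZM2G454425")` builds the Zea mays link shown above with the version filled in. The identifiers are percent-encoded so that any unusual characters in an external identifier do not break the URL path.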
How is the species tree created?
Currently, the PLAZA species tree is manually constructed using information from the National Center for Biotechnology Information taxonomy (Federhen, 2011), with additional information from the literature (Moore et al., 2010) to resolve trifurcations.
Platform & Annotations
Why isn't genome X included in your platform?
We're making an effort to include all sequenced plant organisms. We are, however, limited to published genomes.
Genome X is published but not in the platform?
For each new genome it is necessary to rebuild the gene families and the genomic colinearity, which takes some time and computational power. The genome (or an updated version) will therefore be included in the next data update.
Unpublished annotation releases
Our general policy is to rely on community annotations as much as possible, for two reasons. Firstly, these (frequently published) official releases are most of the time considered 'the reference' for a given model species, making linking to generally accepted gene identifiers straightforward. Secondly, although we realize that some of these public annotations are far from perfect (see section Construction for more information about Quality Control), we suggest that users directly contact community annotation providers (or the PI of a specific genome project) in case they can provide better genome annotations. PLAZA does not have the means to re-annotate complete plant genomes, so contacting annotation providers is probably the best way to distribute new annotations to the scientific community (and to get these improved annotations into future PLAZA versions). As a side note, please consider that objectively demonstrating that your annotation is superior to a public reference annotation is probably an important criterion.
I like this platform, can I set up a PLAZA version for myself with different organisms?
Currently, there are no plans to release the platform in such a way that it can be deployed on other systems. Please contact us for more information or collaborations.
Check Data→Download to download tab-delimited data files containing structural and functional annotation data, as well as sequence data. Please also see our FTP directory ftp://ftp.psb.ugent.be/pub/plaza/.
Apart from coding and amino acid gene sequences, it is also possible to retrieve non-coding DNA sequences (e.g. upstream or downstream). These sequence types are accessible via the Gene Page→Show Sequences or via the Workbench→Export function.
The re-distribution of large-scale PLAZA comparative genomics datasets through other online databases is not permitted. Please contact us for more information.
In the Synteny plot it seems similar segments aren't next to each other...
The clustering of the gene strings, which determines their order, is done with the maximum window size (15 genes left and right). Sometimes, when viewed with a window size of 5 genes, this gives the impression that similar regions aren't grouped together. Use the zoom option and reinspect the plot.
Why do chromosomes contain fewer genes in the WGDotplot than in the actual annotation?
This is a consequence of the detection using i-ADHoRe. i-ADHoRe remaps tandem duplicates onto a representative; tandem duplicates that aren't representatives are then removed from the gene lists, as they have a negative influence on the colinearity detection. All output generated is based on these remapped gene lists and therefore lacks some genes that were duplicated by tandem duplications.
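The effect of this remapping on gene counts can be illustrated with a toy example. The sketch below is not i-ADHoRe's implementation: the function name, the gene names and the mapping from tandem members to a representative are all hypothetical, and how i-ADHoRe chooses the representative is not covered here.

```python
def remap_tandems(gene_list, representative_of):
    """Return the gene list with non-representative tandem duplicates removed.

    gene_list: genes of one chromosome, in chromosomal order.
    representative_of: maps a tandem-array member to its representative.
        Genes missing from the mapping are treated as their own representative.
    """
    return [gene for gene in gene_list if representative_of.get(gene, gene) == gene]


# Toy chromosome with five genes; g3 and g4 are tandem duplicates of g2.
chromosome = ["g1", "g2", "g3", "g4", "g5"]
tandems = {"g3": "g2", "g4": "g2"}

remapped = remap_tandems(chromosome, tandems)
print(len(chromosome), "genes annotated,", len(remapped), "genes after remapping")
```

In this toy case the annotation holds five genes, but only `g1`, `g2` and `g5` remain in the remapped list, which is why a plot built on the remapped list shows fewer genes than the annotation.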
Why are some genes both Tandem and Block duplicates?
i-ADHoRe, the tool used to detect genomic homology, first identifies tandem duplicates (i.e. homologous genes in close proximity) before detecting homologous regions. During tandem detection, all genes within a tandem array are remapped/collapsed on a tandem representative. Consequently, when such a tandem representative is a block duplicate within a colinear region, multiple duplication events most probably shaped the genomic region under investigation. Therefore, we also flag all genes within the original tandem array as block duplicates.
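The flagging rule described above can be sketched as follows. The data structures and names are hypothetical and only illustrate the logic: every member of a tandem array is flagged as tandem, and if the array's representative is a block duplicate, the block flag is passed on to all members of that array.

```python
def flag_duplicates(tandem_arrays, block_duplicates):
    """Assign 'tandem' and 'block' flags to genes.

    tandem_arrays: maps a tandem representative to all genes in its array
        (the representative itself included).
    block_duplicates: set of genes detected as block duplicates on the
        remapped gene lists, so it only contains representatives and
        genes outside tandem arrays.
    """
    flags = {}
    for representative, members in tandem_arrays.items():
        for gene in members:
            flags.setdefault(gene, set()).add("tandem")
            # Propagate the block flag from the representative to the whole array.
            if representative in block_duplicates:
                flags[gene].add("block")
    for gene in block_duplicates:
        flags.setdefault(gene, set()).add("block")
    return flags


flags = flag_duplicates({"g2": ["g2", "g3", "g4"]}, {"g2", "g7"})
```

Here `g3` and `g4` end up flagged as both tandem and block duplicates even though only their representative `g2` took part in the colinearity detection, while `g7` is flagged as a block duplicate only.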
What organisms should I select when using the Skyline plot or WGDotplot?
When selecting the species to include, you should aim to include all species you're interested in, but as few other species as possible. If you're interested in comparing Arabidopsis with Populus, the rosids run is preferred over the all-species run. In case you're wondering why: it has to do with i-ADHoRe's statistics on higher-level multiplicons; the highest levels might be missed if extra species are included.
Why is there a normalization option in the Similarity heatmap?
Bit-scores are a highly suitable measure to delineate sub-families. However, bit-scores depend on the length of the protein and are therefore not a relative score. As a result, the diagonal is not shown in the same white shading throughout. Turning on the normalization will show a gene-by-gene normalized heatmap, in which all bit-scores are converted using the maximum score of the reference gene/line. We feel that the un-normalized heatmaps offer a more correct view, as truly homologous genes should be of similar length. Without normalization, outliers stand out.
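The idea behind the normalization can be shown with a small numeric example. The sketch assumes that "using the maximum score of the reference gene/line" means dividing every score in a row by that row's maximum. The exact formula used by the heatmap is not stated here, so treat this as an illustration only; the function name and the scores are made up.

```python
def normalize_rows(scores):
    """Divide each row of a bit-score matrix by that row's maximum score.

    scores: list of rows, one row per reference gene.
    Rows whose maximum is not positive are returned as zeros.
    """
    normalized = []
    for row in scores:
        row_max = max(row)
        if row_max > 0:
            normalized.append([score / row_max for score in row])
        else:
            normalized.append([0.0 for _ in row])
    return normalized


# Two genes of different length: the longer one has the higher self-score.
bit_scores = [
    [200.0, 100.0],
    [100.0, 400.0],
]
normalized = normalize_rows(bit_scores)
```

Before normalization the two self-scores on the diagonal differ (200 and 400). Afterwards both are 1.0, and the off-diagonal values become 0.5 and 0.25, which also shows that the normalized matrix is no longer symmetric.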
Where can I download Ks values?
First generate a WGDotplot for the species (or two species) you need Ks values for. At the bottom of the page (below the Ks distribution) there is a button to download Ks values for all pairs in the plot.
Why is scaffold X, chr_random, ... missing from the WGDotplot?
Some genomes are only partially assembled into pseudo-molecules, while the remaining portion of the genome is provided as unassembled scaffolds (in PLAZA grouped into a virtual chromosome 0). As there is usually a large number of fairly small scaffolds, these can obscure the WGDotplot and are therefore excluded when generating WGDotplots.
When running GO enrichment, what GO source should I use?
- Primary GO data refers to annotations made by the GO consortium, UniProtKB/Swiss-Prot or found using InterProScan.
- 'Primary and Orthology' comprises primary + orthology-based transfer (iOrtho and reconciled phylogenetic trees)
- 'All' covers Primary + Orthology- + Homology-based transfer.
The orthology-based method only transfers primary annotations based on experimental GO evidence codes, while the homology-based method transfers gene family GO annotations to the individual gene family members. Although the gene-GO coverage is all >> homology >> orthology > primary, GO annotations from homology- and orthology-based transfer are purely electronic. For more details, please see the PLAZA 3.0 paper, Supplementary Method 1: Projection of GO annotation.