Xenbase
Xenopus gene nomenclature
Gene Nomenclature Guidelines
Overview
Gene/RNA names and symbols are lowercase italics (pax6)
Protein symbols have the first letter capitalized and are not italicized (Pax6)
Symbols should not start with X, Xt, or Xl to indicate a Xenopus species
Official Xenopus gene names and symbols are found on Xenbase gene pages and are based on human gene nomenclature (
Orthology to human genes is assigned by synteny
laevis homeologs (sub-genome genes) are designated ".L" and ".S" for the long and short chromosomes, respectively
Legacy gene name/symbols that are no longer available are recorded as synonyms (
Xbra
is a synonym of
For gene name questions please email the Xenbase gene name coordinator
xenbase@cchmc.org
Example:
Gene Name: beta-carotene oxygenase 2
Gene Symbol: bco2
RNA Symbol: bco2
Protein Symbol: Bco2
Gene Names
Detailed Gene Nomenclature
Xenopus gene names and symbols are identical to human gene names whenever possible. A full description of the nomenclature rules used by the
Human Genome Nomenclature Committee (HGNC)
can be found at
Orthology assignments are based primarily on synteny and require more than a BLAST result before the human gene name is applied. Data for 12,000 tropicalis gene models have been generated by Dan Rokhsar; see
Orthology assignments should be approved by the HGNC and the Xenopus Gene Nomenclature Committee, and communicated by Xenbase staff.
In cases where mammalian gene names reference an original Xenopus name (chordin-like), the Xenopus name will be retained (chordin).
Gene names should not start with any characters or words that identify the gene as being from Xenopus (e.g. X, Xt, Xl, Xenopus, tropicalis, laevis).
Gene names are lower case and italics, and should only contain Latin letters and Arabic numbers. Greek letters should be spelled out (β -> beta), and Roman numerals should be changed to Arabic equivalents (IV -> 4).
Example:
beta-carotene oxygenase 2
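As a rough illustration of the character rules above, the following Python sketch spells out Greek letters, converts standalone Roman numerals to Arabic numbers, and lowercases the result. The function names, the small Greek table, and the restriction to the numerals I, V and X are assumptions made for this example; none of this is an official Xenbase tool.

```python
import re

# Illustrative subset only; extend as needed.
GREEK = {"α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta"}
ROMAN = {"I": 1, "V": 5, "X": 10}


def roman_to_int(token):
    """Convert a Roman numeral built from I, V and X to an integer."""
    total = 0
    for ch, nxt in zip(token, token[1:] + " "):
        value = ROMAN[ch]
        # A smaller numeral before a larger one is subtracted (IV -> 4).
        total += -value if nxt in ROMAN and ROMAN[nxt] > value else value
    return total


def normalize_gene_name(name):
    """Spell out Greek letters, convert Roman numerals, then lowercase."""
    for greek, spelled in GREEK.items():
        name = name.replace(greek, spelled)
    # Caveat: any standalone uppercase word made only of I, V, X is
    # treated as a Roman numeral, so review the output by hand.
    name = re.sub(r"\b[IVX]+\b", lambda m: str(roman_to_int(m.group())), name)
    return name.lower()


print(normalize_gene_name("β-carotene oxygenase II"))
```

Punctuation is left untouched here, because whether it is allowed depends on the human gene name and cannot be decided from the string alone.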
Punctuation should only be used if the human gene name uses punctuation (except paralogs / homeologs as described below).
When identity is uncertain, be cautious. Use a temporary symbol or name such as “caudal type homeobox 2 [provisional]” until more information is available, at which time the name would be changed and the [provisional] tag removed.
Pioneer species names should not be used. For example, in some species nanos3 is known as “nanos homolog 3 (Drosophila)”. In Xenopus it would simply be named “nanos homolog 3”.
Xenbase administers Xenopus gene nomenclature.
Gene names for laevis homeologs are appended with "L homeolog" or "S homeolog" to distinguish the sub-genome they are associated with.
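The homeolog rule can be sketched in Python as follows. The helper names are hypothetical, and joining the base name and the "L homeolog" / "S homeolog" suffix with a single space is an assumption; the ".L" / ".S" symbol suffix follows the designation given in the Overview.

```python
def homeolog_name(gene_name, subgenome):
    """Append the sub-genome designation to a gene name."""
    if subgenome not in ("L", "S"):
        raise ValueError("subgenome must be 'L' or 'S'")
    # Assumption: one space separates the base name from the suffix.
    return f"{gene_name} {subgenome} homeolog"


def homeolog_symbol(gene_symbol, subgenome):
    """Append '.L' or '.S' to a gene symbol."""
    if subgenome not in ("L", "S"):
        raise ValueError("subgenome must be 'L' or 'S'")
    return f"{gene_symbol}.{subgenome}"


print(homeolog_name("beta-carotene oxygenase 2", "L"))
print(homeolog_symbol("bco2", "S"))
```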
When there is no human ortholog of a new Xenopus gene or when the human gene name is provisional, new gene names will be based on consultations with the
HGNC
, the Xenopus gene nomenclature committee, and the requesting parties. Gene name requests should be sent to the Xenbase gene name coordinator
xenbase@cchmc.org
Gene Families and Paralogs
Gene families
are related sets of genes, each formed by duplication of a single ancestral gene. Genes within a gene family usually have similar biological functions.
When naming genes in gene families, a root word should be used to identify the gene as being a member of the gene family. Gene family members should be assigned increasing unique numerical identifiers. In keeping with HGNC policies, the next available number that is not already used in other species should be appended to the end without punctuation (see exceptions below).
Example:
nodal homolog 1 (nodal1), nodal homolog 2 (nodal2)
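One way to picture the "next available number" rule is the Python sketch below. It assumes the caller supplies every symbol already in use for the family across species, and it reads "next available" as the smallest positive integer not yet taken; both points are assumptions for illustration, and actual assignments are made by the nomenclature committees.

```python
import re


def next_family_symbol(root, used_symbols):
    """Suggest root + the smallest family number not already in use."""
    # Match the root, a family number, and any '.n' or '.L'/'.S' suffixes.
    pattern = re.compile(re.escape(root) + r"(\d+)(?:\.\w+)*", re.IGNORECASE)
    used = set()
    for symbol in used_symbols:
        match = pattern.fullmatch(symbol)
        if match:
            used.add(int(match.group(1)))
    number = 1
    while number in used:
        number += 1
    return f"{root}{number}"


print(next_family_symbol("nodal", ["nodal1", "nodal2"]))
```

Symbols from other species are compared case-insensitively, so a human-style uppercase symbol also reserves its number.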
Some exceptions are made for rare legacy names that have a different format.
Example:
nkx2-1
and
nkx2-2
Exceptions can also be made when there are clear subgroups within a gene family. In this case it is acceptable to append “.1”, “.2”, and so on to the end of the symbol to indicate that the genes belong to the same subfamily. This applies both to Xenopus-specific gene family expansions and to cases where there are multiple Xenopus genes relative to a single member of a mammalian gene family.
Example:
nodal homolog 3, gene 1 (nodal3.1), nodal homolog 3, gene 2 (nodal3.2)
Example:
Human Hairy and Enhancer of split 6 (HES6) is syntenic with two tropicalis genes:
hairy and enhancer of split 6, gene 1 (hes6.1)
and
hairy and enhancer of split 6, gene 2 (hes6.2)
Expanded gene families are numbered independently in each
Xenopus
species or sub-genome. Importantly, the same “.n” number designation in different species or subgenomes does not necessarily imply direct one-to-one orthology.
Example:
X. tropicalis:
bix1.1, bix1.2, bix1.3, bix1.4, bix1.5
and
bix1.6
X. laevis:
bix1.1.L, bix1.2.L, bix1.3.L, bix1.1.S, bix1.2.S
and
bix1.3.S
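To make the parts of such symbols explicit, the following Python sketch splits a symbol into root, family number, subfamily number, and sub-genome. The regular expression and the `ParsedSymbol` type are assumptions for illustration; legacy formats such as nkx2-1 are deliberately not handled and return None.

```python
import re
from typing import NamedTuple, Optional


class ParsedSymbol(NamedTuple):
    root: str
    member: Optional[int]      # family number, e.g. 1 in bix1
    subfamily: Optional[int]   # the ".n" part, e.g. 3 in bix1.3
    subgenome: Optional[str]   # "L" or "S" for laevis homeologs


_SYMBOL = re.compile(
    r"(?P<root>[a-z][a-z0-9]*?)"
    r"(?P<member>\d+)?"
    r"(?:\.(?P<subfamily>\d+))?"
    r"(?:\.(?P<subgenome>[LS]))?"
)


def parse_symbol(symbol):
    """Split a symbol into its parts, or return None if it does not fit."""
    match = _SYMBOL.fullmatch(symbol)
    if match is None:
        return None

    def to_int(value):
        return int(value) if value is not None else None

    return ParsedSymbol(
        match["root"],
        to_int(match["member"]),
        to_int(match["subfamily"]),
        match["subgenome"],
    )


print(parse_symbol("bix1.3.L"))
```

The parser only decomposes the string. Two symbols with the same subfamily number in different species or sub-genomes are not thereby shown to be orthologs.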
Complex orthologies not covered by the rules above will be resolved in a case-by-case manner in consultation with the XGNC and the HGNC.
Pseudogenes
are non-functional DNA sequences that are similar in structure to normal genes. Xenopus pseudogene names should be given the next integer within the gene family name, and the designation “pseudogene” should be appended to the end of the gene name. HGNC pseudogene naming guidelines will be applied.
Example:
fer-1-like 4 pseudogene
Note:
Genes that are pseudogenes in one species may be expressed in other species.
Gene Symbols
Gene symbols are lower case and italics, and should only contain Latin letters and Arabic numbers (unless specified below).
Gene symbols are identical to human gene symbols whenever possible.
Gene symbols should not start with X, Xt, or Xl to indicate a Xenopus species.
Symbols are short-form representations (or abbreviations) of the descriptive gene name. Symbols should also be at least three characters long, with the first character being a letter.
Gene symbols should have no spaces and punctuation should only be used if the human equivalent uses punctuation (except for paralogs or laevis homeologs as described above).
Gene symbols must be unique and should avoid matching common words or abbreviations in order to avoid problems with database searching (e.g. DNA, EGTA, PBS, CAN, GET...).
Example:
bco2
Symbols for genes in gene families should contain a base or root “word”, followed by increasing numerical identifiers.
Example:
nodal, nodal1, nodal2, nodal3
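The mechanical parts of the symbol rules above can be checked with a short Python sketch. It is not an official validator: the function name is hypothetical, the stop-list contains only the examples quoted above, and accepting "." and "-" is an assumption that stands in for the rule that punctuation must mirror the human symbol or mark paralogs and homeologs. Uniqueness and agreement with the human symbol need database lookups and are not covered.

```python
import re

# Example stop-list taken from the guideline text; a real list would be longer.
COMMON_WORDS = {"dna", "egta", "pbs", "can", "get"}


def symbol_problems(symbol):
    """Return a list of rule violations; an empty list means none were found."""
    # Ignore a trailing laevis homeolog suffix (.L or .S) before checking.
    base = re.sub(r"\.[LS]$", "", symbol)
    problems = []
    if len(base) < 3:
        problems.append("shorter than three characters")
    if not base[:1].isalpha():
        problems.append("first character is not a letter")
    if base.startswith("X"):
        problems.append("starts with an X-style species prefix")
    if base != base.lower():
        problems.append("contains uppercase letters")
    if not re.fullmatch(r"[A-Za-z0-9.\-]+", base):
        problems.append("contains characters other than letters, digits, '.' or '-'")
    if base.lower() in COMMON_WORDS:
        problems.append("matches a common word or abbreviation")
    return problems


print(symbol_problems("bco2"))
print(symbol_problems("Xbra"))
```

Returning a list of problems rather than a single boolean makes it easier to report every issue with a proposed symbol at once.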
RNA Symbols
RNA symbols are the same as gene symbols in
lowercase and italics
and match human symbol nomenclature.
Latin Letters and Arabic Numbers only.
Example:
bco2
RNA splice variants:
RNA strands that arise from splice variants of genes should use the same gene symbol as the gene, followed by -v and increasing numerical identifiers.
Example:
fzd4-v1
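A minimal Python sketch of the splice-variant rule follows. The function name is hypothetical, and starting the numbering at 1 is an assumption based on the example above.

```python
def variant_symbol(gene_symbol, number):
    """Build the RNA symbol for the given splice variant of a gene."""
    if number < 1:
        raise ValueError("variant numbers are assumed to start at 1")
    return f"{gene_symbol}-v{number}"


# Symbols for the first three variants of fzd4.
print([variant_symbol("fzd4", n) for n in range(1, 4)])
```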
Protein Symbols
Protein names and symbols are exactly the same as the gene name and symbol but have the
first letter uppercase, and are not italicized.
The word “protein” or additional terms are not included.
Example:
Bco2
Protein variants arising from alternatively spliced variants of genes should use the same symbol as the alternative transcript, including the -v and increasing numerical identifiers.
Example:
Fzd4-v1
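Deriving a protein symbol from a gene or transcript symbol is a simple string operation, sketched here in Python with a hypothetical helper name. Only the first character is changed; `str.capitalize()` is avoided because it would also lowercase the rest of the symbol, including any ".L" or ".S" suffix.

```python
def protein_symbol(gene_symbol):
    """Uppercase the first letter and leave the rest of the symbol unchanged."""
    return gene_symbol[:1].upper() + gene_symbol[1:]


print(protein_symbol("bco2"))
print(protein_symbol("fzd4-v1"))
```

Italics are a typesetting matter and are not represented in the string.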
Xenopus Gene Nomenclature Committee (2013)
Enrique Amaya
, University of Manchester, UK
Julie Baker
, Stanford University, USA
Ira Blitz
, University of California, Irvine, USA
Dale Frank
, Technion - Israel Institute of Technology, Israel
Mike Gilchrist
, The Francis Crick Institute, Mill Hill Laboratory, London UK
Matt Guille
, EXRC, University of Portsmouth, UK
Richard Harland
, University of California, Berkeley, USA
Marko Horb
, Marine Biological Laboratory, USA
Mustafa Khokha
, Yale School of Medicine, USA
Hajime Ogino
, Nara Institute of Science and Technology, Japan
Nicolas Pollet
, Institute of Systems & Synthetic Biology, France
Atsushi Suzuki
, Hiroshima University, Japan
Masanori Taira
, University of Tokyo, Japan
Gert Veenstra
, Nijmegen Center for Molecular Life Sciences, Netherlands
Peter Vize
, University of Calgary, Canada
Aaron Zorn
(Chair), Cincinnati Children's Hospital, USA
Please address all comments or questions to the nomenclature administrator at Xenbase,
xenbase@cchmc.org