♦ Quick search
Taxon search:
Initial stage: click a taxon with approximately 100 or fewer families.
Taxon stage: in the Species group list tab, select a Ucorn ID
(e.g., Ucorn20231018A1) to access the Terminal stage.
Taxon stage: in the Lineage tab, select a taxon related to the target taxon.
Species search:
Initial stage: input an NCBI Taxonomy identifier
or the scientific name of a species (genus name & species epithet only).
Species stage: in the Species group list tab, select a Ucorn ID
(e.g., Ucorn20231018A1) to access the Terminal stage.
Taxon stage: in the Lineage tab,
select a taxon related to the target species.
Gene search:
Initial stage: input a UniProt or NCBI RefSeq identifier
(e.g., P02764 or NP_032794.1).
Species stage: in the Species group list tab, select a Ucorn ID
(e.g., Ucorn20231018A1) to access the Terminal stage.
Taxon stage: in the Lineage tab,
select a taxon related to the target species.
♦ Goal of Ucorn
Unconventional species & genes
The main aim of the Ucorn database is
to enrich information on organismal species & genes.
Its first version provides gene function analyses,
such as Gene Ontology and MeSH analyses,
to help understand the functionality of such species & genes.
Multiple evolutionary time points
UniRef50 is useful for outlining genes at a single point
along the evolutionary scale.
However, a gene can be examined at multiple points along the scale.
Such points can be obtained
with an approach based on sequence homology between all gene pairs.
Considering multiple evolutionary points makes it possible
to discuss sequence homology between gene clusters.
♦ Cluster type
Type D (over domains) is a type of gene cluster with genes
in different domains.
Type C (over categories) is a type of gene cluster with genes
in different phylogenetic categories within a single domain.
Type T (typical) is a type of gene cluster with genes
in all species in a taxon.
Type S (subtypical) is a type of gene cluster with genes
in some species in a taxon.
Type A (atypical) is a type of gene cluster with genes
in different taxa within a single phylogenetic category.
Gene cluster types other than type T are unconventional & atypical
in that they are discordant with vertical transfer in phylogeny.
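The definitions above can be read as a decision procedure. The following is a minimal sketch of one such reading, not the classification code used by Ucorn. It assumes each species carries a domain, a phylogenetic category, and a taxon label, and that the full species membership of each taxon is known. The names `Placement`, `classify_cluster`, `placement`, and `taxon_members`, and all toy labels, are hypothetical.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Hypothetical position of a species: domain > category > taxon."""
    domain: str
    category: str
    taxon: str


def classify_cluster(cluster_species, placement, taxon_members):
    """Return 'D', 'C', 'A', 'T', or 'S' for the species found in one gene cluster."""
    species = set(cluster_species)
    if not species:
        raise ValueError("a gene cluster must contain at least one species")
    # Type D: genes span more than one domain.
    if len({placement[s].domain for s in species}) > 1:
        return "D"
    # Type C: one domain, but more than one phylogenetic category.
    if len({placement[s].category for s in species}) > 1:
        return "C"
    # Type A: one category, but more than one taxon.
    taxa = {placement[s].taxon for s in species}
    if len(taxa) > 1:
        return "A"
    # Single taxon: T if every species of the taxon is present, otherwise S.
    (taxon,) = taxa
    return "T" if species == taxon_members[taxon] else "S"


# Made-up toy data, for illustration only.
placement = {
    "sp1": Placement("domain1", "cat1", "taxonX"),
    "sp2": Placement("domain1", "cat1", "taxonX"),
    "sp3": Placement("domain1", "cat1", "taxonY"),
    "sp4": Placement("domain1", "cat2", "taxonZ"),
    "sp5": Placement("domain2", "cat3", "taxonW"),
}
taxon_members = {
    "taxonX": {"sp1", "sp2"},
    "taxonY": {"sp3"},
    "taxonZ": {"sp4"},
    "taxonW": {"sp5"},
}

print(classify_cluster(["sp1", "sp2"], placement, taxon_members))
```

The checks run from the broadest level (domain) to the narrowest (taxon), so a cluster is assigned the widest type that applies to it.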
♦ History
2023-12-08 The UniRef50 dataset dated 2023-10-18
was downloaded.
2024-01-16 The Ucorn database α version,
which provides organismal genes & species,
was released.
2024-03-18 The Ucorn database β version,
which relates organismal genes & species to GO and MeSH terms,
was released.
♦ Data source
UniProt
UniRef50:
"uniref50.xml" was downloaded
from its FTP site.
Proteome:
tab-formatted lists of proteomes by phylogenetic domain
were downloaded from the Proteome website.
Knowledgebase:
"idmapping.dat"
was downloaded from its FTP site.
NCBI
Taxonomy:
"new_taxdump.zip" was downloaded
from its FTP site.
PubMed:
"pubmed24h" files were downloaded
from its FTP site
to relate articles to MeSH terms.
Gene:
"gene2pubmed" was downloaded
from its FTP site
to relate articles to genes.
♦ Pipeline
The proteome lists obtained from the Proteome were
rearranged at the species level.
The file obtained from the Taxonomy was used
for information on species.
The UniRef50 file contains genes annotated at levels below species,
such as subspecies or strains.
Therefore, the UniRef50 data were processed at the species level
using the file from the previous step.
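Rolling a subspecies or strain up to its species can be sketched as a walk up the taxonomy tree. This is only an illustration under assumptions, not the Ucorn implementation. It assumes the Taxonomy file has already been loaded into two dictionaries, `parent` (taxon ID to parent taxon ID) and `rank` (taxon ID to rank name). The function name `to_species` and the taxon IDs in the toy data are made up.

```python
def to_species(tax_id, parent, rank):
    """Walk up the lineage from tax_id and return the first species-rank node.

    Returns None when no species-rank ancestor exists (e.g. for a genus).
    """
    seen = set()
    current = tax_id
    while current not in seen:
        if rank.get(current) == "species":
            return current
        seen.add(current)
        if current not in parent:
            return None
        current = parent[current]
    # A node was visited twice: the root points to itself, so stop here.
    return None


# Made-up toy taxonomy: 30 (strain) -> 20 (species) -> 10 (genus) -> 1 (root).
parent = {30: 20, 20: 10, 10: 1, 1: 1}
rank = {30: "strain", 20: "species", 10: "genus", 1: "no rank"}

print(to_species(30, parent, rank))
```

The `seen` set guards against the self-referencing root node, so the loop always terminates.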
Each gene cluster in UniRef50 is related to a set of species.
Gene clusters with the same species group were grouped together
to relate gene groups to species groups.
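The grouping in this step can be sketched by using the species set of each cluster as a dictionary key. This is a minimal sketch, assuming each cluster has already been reduced to a set of species-level taxon IDs. The function name `group_by_species_set` and the cluster and taxon IDs below are hypothetical.

```python
from collections import defaultdict


def group_by_species_set(cluster_species):
    """Map each distinct species set to the list of clusters that share it."""
    groups = defaultdict(list)
    for cluster_id, species in cluster_species.items():
        # frozenset is hashable and ignores order, so {1, 2} and {2, 1} match.
        groups[frozenset(species)].append(cluster_id)
    return dict(groups)


# Made-up toy data: cluster ID -> species-level taxon IDs.
cluster_species = {
    "clusterA": {1, 2},
    "clusterB": {2, 1},
    "clusterC": {3},
}

groups = group_by_species_set(cluster_species)
for species_group, clusters in groups.items():
    print(sorted(species_group), clusters)
```

Each key of the result corresponds to one species group, and its value to the gene group associated with it.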
The PMIDs were related to UniRef50 gene clusters
based on "gene2pubmed".
"uniref50.xml" was used
to relate UniProt protein IDs to NCBI gene IDs.
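The join described in this step can be sketched as two lookups: protein to gene, then gene to PMIDs. This sketch assumes "gene2pubmed" is tab-separated with the columns tax_id, GeneID, and PubMed_ID and a header line starting with "#". It also assumes the protein-to-gene mapping is already available as a dictionary. The function names and all IDs in the toy data are made up.

```python
from collections import defaultdict


def parse_gene2pubmed(lines):
    """Build gene ID -> set of PMIDs from gene2pubmed-style lines."""
    gene_to_pmids = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        _tax_id, gene_id, pmid = line.rstrip("\n").split("\t")[:3]
        gene_to_pmids[gene_id].add(pmid)
    return gene_to_pmids


def cluster_pmids(cluster_proteins, protein_to_gene, gene_to_pmids):
    """Collect the PMIDs reachable from each cluster via its member proteins."""
    result = {}
    for cluster_id, proteins in cluster_proteins.items():
        pmids = set()
        for protein in proteins:
            gene = protein_to_gene.get(protein)
            if gene is not None:
                pmids |= gene_to_pmids.get(gene, set())
        result[cluster_id] = pmids
    return result


# Made-up toy data, for illustration only.
lines = [
    "#tax_id\tGeneID\tPubMed_ID",
    "9999\t111\t1001",
    "9999\t111\t1002",
    "9999\t222\t1003",
]
cluster_proteins = {"clusterA": ["protA", "protB"], "clusterB": ["protC"]}
protein_to_gene = {"protA": "111", "protB": "222"}

gene_to_pmids = parse_gene2pubmed(lines)
pmids_by_cluster = cluster_pmids(cluster_proteins, protein_to_gene, gene_to_pmids)
print(pmids_by_cluster)
```

Proteins without a known gene ID, such as `protC` here, simply contribute no articles.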
The MeSH terms were related to species groups
based on the "pubmed24h" files,
which contain relationships of articles to MeSH terms.
To relate articles to gene clusters,
the file produced at the previous step was used.
The Gene Ontology (GO) terms were related to species groups
based on "idmapping.dat",
which contains relationships of gene clusters to GO terms.
In this step, the biological process aspect of GO terms was used,
because organismal gene clusters are expected to show
similar biological processes
rather than common gene functions
(of course, different gene functions can relate to a particular process).
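Restricting GO annotations to the biological process aspect can be sketched as a set intersection. This is a sketch under assumptions, not the Ucorn code. It assumes "idmapping.dat" has three tab-separated columns (accession, ID type, ID) and that GO annotations appear with the ID type "GO". It also assumes the set of biological process term IDs, `bp_terms`, has been obtained separately from the ontology. The function names and protein IDs are hypothetical.

```python
from collections import defaultdict


def parse_go_mappings(lines):
    """Build protein accession -> set of GO IDs from idmapping-style lines."""
    protein_to_go = defaultdict(set)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        # Keep only rows whose ID type is GO (assumed label).
        if len(parts) == 3 and parts[1] == "GO":
            protein_to_go[parts[0]].add(parts[2])
    return protein_to_go


def group_bp_terms(group_proteins, protein_to_go, bp_terms):
    """Return the biological process GO terms annotated to any protein in the group."""
    terms = set()
    for protein in group_proteins:
        terms |= protein_to_go.get(protein, set()) & bp_terms
    return terms


# Toy rows with made-up protein IDs. GO:0006412 (translation) is a biological
# process term; GO:0005739 and GO:0003677 belong to the other two aspects.
lines = [
    "protA\tGO\tGO:0006412",
    "protA\tGO\tGO:0005739",
    "protA\tGeneID\t111",
    "protB\tGO\tGO:0003677",
]
bp_terms = {"GO:0006412"}

protein_to_go = parse_go_mappings(lines)
print(group_bp_terms(["protA", "protB"], protein_to_go, bp_terms))
```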
♦ Publications
Ogata Y, Kitayama R: Ucorn: a database to enrich information
on organismal species and genes. Submitted.
♦ References
Queller DC, Strassmann JE: Beyond society: the evolution of organismality.
Phil. Trans. R. Soc. B, 2009, 364:3143-55.
PMID:19805423 doi: 10.1098/rstb.2009.0095
♦ Downloads
To download the main resources of the Ucorn database, click
here.
The KAGIANA Project (since 2006)
Copyright (C) 2012-2024 the Ogata Lab. All rights reserved.