UCSC Genome Bioinformatics
  Frequently Asked Questions: Assembly Releases and Versions


  List of UCSC genome releases
 

Question:
"How do UCSC's release numbers correspond to those of other organizations, such as NCBI?"

Response:

SPECIES           UCSC VERSION  RELEASE DATE  RELEASE NAME                                      STATUS

VERTEBRATES
Human             hg19          Feb. 2009     Genome Reference Consortium GRCh37                Available
                  hg18          Mar. 2006     NCBI Build 36.1                                   Available
                  hg17          May 2004      NCBI Build 35                                     Available
                  hg16          Jul. 2003     NCBI Build 34                                     Available
                  hg15          Apr. 2003     NCBI Build 33                                     Archived
                  hg13          Nov. 2002     NCBI Build 31                                     Archived
                  hg12          Jun. 2002     NCBI Build 30                                     Archived
                  hg11          Apr. 2002     NCBI Build 29                                     Archived
                  hg10          Dec. 2001     NCBI Build 28                                     Archived
                  hg8           Aug. 2001     UCSC-assembled                                    Archived
                  hg7           Apr. 2001     UCSC-assembled                                    Archived
                  hg6           Dec. 2000     UCSC-assembled                                    Archived
                  hg5           Oct. 2000     UCSC-assembled                                    Archived
                  hg4           Sep. 2000     UCSC-assembled                                    Archived
                  hg3           Jul. 2000     UCSC-assembled                                    Archived
                  hg2           Jun. 2000     UCSC-assembled                                    Archived (data set only)
                  hg1           May 2000      UCSC-assembled                                    Archived (data set only)
Cat               felCat4       Dec. 2008     NHGRI catChrV17e                                  Available
                  felCat3       Mar. 2006     Broad Institute Release 3                         Available
Chicken           galGal3       May 2006      WUSTL Gallus-gallus-2.1                           Available
                  galGal2       Feb. 2004     WUSTL Gallus-gallus-1.0                           Available
Chimp             panTro2       Mar. 2006     CGSC Build 2 Version 1                            Available
                  panTro1       Nov. 2003     CGSC Build 1 Version 1                            Available
Cow               bosTau4       Oct. 2007     Baylor College of Medicine HGSC Btau_4.0          Available
                  bosTau3       Aug. 2006     Baylor College of Medicine HGSC Btau_3.1          Available
                  bosTau2       Mar. 2005     Baylor College of Medicine HGSC Btau_2.0          Available
                  bosTau1       Sep. 2004     Baylor College of Medicine HGSC Btau_1.0          Archived
Dog               canFam2       May 2005      Broad Institute v2.0                              Available
                  canFam1       Jul. 2004     Broad Institute v1.0                              Available
Elephant          loxAfr3       Jul. 2009     Broad loxAfr3                                     Available
Fugu              fr2           Oct. 2004     JGI v4.0                                          Available
                  fr1           Aug. 2002     JGI v3.0                                          Available
Guinea pig        cavPor3       Feb. 2008     Broad cavPor3                                     Available
Horse             equCab2       Sep. 2007     Broad EquCab2                                     Available
                  equCab1       Jan. 2007     Broad EquCab1                                     Available
Lamprey           petMar1       Mar. 2007     WUSTL v3.0                                        Available
Lizard            anoCar1       Feb. 2007     Broad AnoCar1                                     Available
Marmoset          calJac3       Mar. 2009     WUSTL Callithrix_jacchus-v3.2                     Available
                  calJac1       Jun. 2007     WUSTL Callithrix_jacchus-v2.0.2                   Available
Medaka            oryLat2       Oct. 2005     NIG v1.0                                          Available
Mouse             mm9           Jul. 2007     NCBI Build 37                                     Available
                  mm8           Feb. 2006     NCBI Build 36                                     Available
                  mm7           Aug. 2005     NCBI Build 35                                     Available
                  mm6           Mar. 2005     NCBI Build 34                                     Archived
                  mm5           May 2004      NCBI Build 33                                     Archived
                  mm4           Oct. 2003     NCBI Build 32                                     Archived
                  mm3           Feb. 2003     NCBI Build 30                                     Archived
                  mm2           Feb. 2002     MGSCv3                                            Archived
                  mm1           Nov. 2001     MGSCv2                                            Archived
Opossum           monDom5       Oct. 2006     Broad Institute release MonDom5                   Available
                  monDom4       Jan. 2006     Broad Institute release MonDom4                   Available
                  monDom1       Oct. 2004     Broad Institute release MonDom1                   Available
Orangutan         ponAbe2       Jul. 2007     WUSTL Pongo_abelii-2.0.2                          Available
Panda             ailMel1       Dec. 2009     BGI-Shenzhen AilMel 1.0                           Available
Pig               susScr2       Nov. 2009     SGSC Sscrofa9.2                                   Available
Platypus          ornAna1       Mar. 2007     WUSTL v5.0.1                                      Available
Rabbit            oryCun2       Apr. 2009     Broad Institute release oryCun2                   Available
Rat               rn4           Nov. 2004     Baylor College of Medicine HGSC v3.4              Available
                  rn3           Jun. 2003     Baylor College of Medicine HGSC v3.1              Available
                  rn2           Jan. 2003     Baylor College of Medicine HGSC v2.1              Archived
                  rn1           Nov. 2002     Baylor College of Medicine HGSC v1.0              Archived
Rhesus            rheMac2       Jan. 2006     Baylor College of Medicine HGSC v1.0 Mmul_051212  Available
                  rheMac1       Jan. 2005     Baylor College of Medicine HGSC Mmul_0.1          Archived
Stickleback       gasAcu1       Feb. 2006     Broad Release 1.0                                 Available
Tetraodon         tetNig2       Mar. 2007     Genoscope v7                                      Available
                  tetNig1       Feb. 2004     Genoscope v7                                      Available
X. tropicalis     xenTro2       Aug. 2005     JGI v.4.1                                         Available
                  xenTro1       Oct. 2004     JGI v.3.0                                         Available
Zebra finch       taeGut1       Jul. 2008     WUSTL v3.2.4                                      Available
Zebrafish         danRer6       Dec. 2008     Sanger Institute Zv8                              Available
                  danRer5       Jul. 2007     Sanger Institute Zv7                              Available
                  danRer4       Mar. 2006     Sanger Institute Zv6                              Available
                  danRer3       May 2005      Sanger Institute Zv5                              Available
                  danRer2       Jun. 2004     Sanger Institute Zv4                              Archived
                  danRer1       Nov. 2003     Sanger Institute Zv3                              Archived

DEUTEROSTOMES
C. intestinalis   ci2           Mar. 2005     JGI v2.0                                          Available
                  ci1           Dec. 2002     JGI v1.0                                          Available
Lancelet          braFlo1       Mar. 2006     JGI v1.0                                          Available
S. purpuratus     strPur2       Sep. 2006     Baylor College of Medicine HGSC v. Spur 2.1       Available
                  strPur1       Apr. 2005     Baylor College of Medicine HGSC v. Spur_0.5       Available

INSECTS
A. mellifera      apiMel2       Jan. 2005     Baylor College of Medicine HGSC v.Amel_2.0        Available
                  apiMel1       Jul. 2004     Baylor College of Medicine HGSC v.Amel_1.2        Available
A. gambiae        anoGam1       Feb. 2003     IAGP v.MOZ2                                       Available
D. ananassae      droAna2       Aug. 2005     Agencourt Arachne release                         Available
                  droAna1       Jul. 2004     TIGR Celera release                               Available
D. erecta         droEre1       Aug. 2005     Agencourt Arachne release                         Available
D. grimshawi      droGri1       Aug. 2005     Agencourt Arachne release                         Available
D. melanogaster   dm3           Apr. 2006     BDGP Release 5                                    Available
                  dm2           Apr. 2004     BDGP Release 4                                    Available
                  dm1           Jan. 2003     BDGP Release 3                                    Available
D. mojavensis     droMoj2       Aug. 2005     Agencourt Arachne release                         Available
                  droMoj1       Aug. 2004     Agencourt Arachne release                         Available
D. persimilis     droPer1       Oct. 2005     Broad Institute release                           Available
D. pseudoobscura  dp3           Nov. 2004     Flybase Release 1.0                               Available
                  dp2           Aug. 2003     Baylor College of Medicine HGSC Freeze 1          Available
D. sechellia      droSec1       Oct. 2005     Broad Release 1.0                                 Available
D. simulans       droSim1       Apr. 2005     WUSTL Release 1.0                                 Available
D. virilis        droVir2       Aug. 2005     Agencourt Arachne release                         Available
                  droVir1       Jul. 2004     Agencourt Arachne release                         Available
D. yakuba         droYak2       Nov. 2005     WUSTL Release 2.0                                 Available
                  droYak1       Apr. 2004     WUSTL Release 1.0                                 Available

NEMATODES
C. brenneri       caePb2        Feb. 2008     WUSTL 6.0.1                                       Available
                  caePb1        Jan. 2007     WUSTL 4.0                                         Available
C. briggsae       cb3           Jan. 2007     WUSTL Cb3                                         Available
                  cb1           Jul. 2002     WormBase v. cb25.agp8                             Available
C. elegans        ce6           May 2008      WormBase v. WS190                                 Available
                  ce4           Jan. 2007     WormBase v. WS170                                 Available
                  ce2           Mar. 2004     WormBase v. WS120                                 Available
                  ce1           May 2003      WormBase v. WS100                                 Archived
C. japonica       caeJap1       Mar. 2008     WUSTL 3.0.2                                       Available
C. remanei        caeRem3       May 2007      WUSTL 15.0.1                                      Available
                  caeRem2       Mar. 2006     WUSTL 1.0                                         Available
P. pacificus      priPac1       Feb. 2007     WUSTL 5.0                                         Available

OTHER
Sea Hare          aplCal1       Sep. 2008     Broad Release Aplcal2.0                           Available
Yeast             sacCer2       Jun. 2008     SGD June 2008 sequence                            Available
                  sacCer1       Oct. 2003     SGD 1 Oct 2003 sequence                           Available
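
For readers who want to do this lookup in a script, the following is a minimal Python sketch. It assumes a hand-copied subset of the human rows from the table above; the dictionary and the function name release_name are hypothetical and are not part of any UCSC tool.

```python
# Hand-copied subset of the human rows in the release table (not a complete list).
HUMAN_RELEASES = {
    "hg19": "Genome Reference Consortium GRCh37",
    "hg18": "NCBI Build 36.1",
    "hg17": "NCBI Build 35",
    "hg16": "NCBI Build 34",
}


def release_name(ucsc_version):
    """Return the external release name for a UCSC version, or None if unlisted."""
    return HUMAN_RELEASES.get(ucsc_version)


print(release_name("hg18"))  # prints: NCBI Build 36.1
```

A plain dictionary is enough here because each UCSC version name maps to exactly one external release name.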



  Initial assembly release dates
 

Question:
"When will the next assembly be out?"

Response:
UCSC does not produce its own genome assemblies, but instead obtains them from standard sources. For example, the human assembly is obtained from NCBI. Because of this, you can expect us to release a new version of a genome soon after the assembling organization has released the version. A new assembly release initially consists of the genome sequence and a small set of aligned annotation tracks. Additional annotation tracks are added as they are obtained or generated. Bulk downloads of the data are typically available in the first week after the assembly is released in the browser.



  Data sources - UCSC assemblies
 

Question:
"Where does UCSC obtain the assembly and annotation data displayed in the Genome Browser?"

Response:
All the assembly data displayed in the UCSC Genome Browser are obtained from external sequencing centers. To determine the data source and version for a given assembly, see the assembly's description on the Genome Browser Gateway page or the List of UCSC Genome Releases.

The annotations accompanying an assembly are obtained from a variety of sources. The UCSC Genome Bioinformatics Group generates several of the tracks; the remainder are contributed by collaborators at other sites. Each track has an associated description page that credits the authors of the annotation.

For detailed information about the individuals and organizations who contributed to a specific assembly, see the Credits page.



  Comparison of UCSC and NCBI human assemblies
 

Question:
"How do the human assemblies displayed in the UCSC Genome Browser differ from the NCBI human assemblies?"

Response:
Recent human assemblies displayed in the Genome Browser (hg10 and higher) are identical to the NCBI assemblies.



  Differences between UCSC and NCBI mouse assemblies
 

Question:
"Is the mouse genome assembly displayed in the UCSC Genome Browser the same as the one on the NCBI website?"

Response:
The mouse genome assemblies featured in the UCSC Genome Browser are the same as those on the NCBI web site with one difference: the UCSC versions contain only the reference strain data (C57BL/6J). NCBI provides data for several additional strains in their builds.



  Accessing older assembly versions
 

Question:
"I need to access an older version of a genome assembly that's no longer listed in the Genome Browser menu. What should I do?"

Response:
In addition to the assembly versions currently available in the Genome Browser, you can access older versions of the browser through our archives. To view an older version, click the Archives link on the Genome Browser home page.



  Frequency of GenBank data updates
 

Question:
"How frequently does UCSC update its databases with new data from GenBank?"

Response:
Daily and weekly incremental updates of mRNA, RefSeq, and EST data are in place for several of the more recent Genome Browser assemblies. Assemblies that are not on an incremental update schedule are updated whenever we load a new assembly or make a major revision to a table.

Data are updated on the following schedule:

  • Native and xeno mRNA and RefSeq tracks: updated daily
  • EST data: updated weekly on Saturday morning
  • Downloadable data files: updated weekly on Saturday morning
  • Outdated sequences: removed once per quarter

Mirror sites are not required to use an incremental update process, and should not experience problems as a result of these updates.



  Coordinate changes between assemblies
 

Question:
"I noticed that the chromosomal coordinates for a particular gene that I'm looking at have changed since the last time I used your browser. What happened?"

Response:
Mixing up different assemblies is a common source of confusion, so it is important to be aware of which assembly you are looking at. Within the Genome Browser display, assemblies are labeled by organism and date. To look up the corresponding UCSC database name or NCBI build number, use the release table.

UCSC database labels are of the form hgN, panTroN, etc., where N is a number. The letters designate the organism, e.g. hg for human genome or panTro for Pan troglodytes. The number denotes the UCSC assembly version for that organism. For example, ce1 refers to the first UCSC assembly of the C. elegans genome.
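
The naming convention described above (organism letters followed by a version number) can be split apart mechanically. The Python sketch below is only an illustration: the function name split_db_name is hypothetical, and it assumes every label is a run of letters followed by a run of digits, which holds for the names in the release table.

```python
import re


def split_db_name(db):
    """Split a UCSC database label into (organism prefix, version number)."""
    # Assumed pattern: one or more letters, then one or more digits, nothing else.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", db)
    if match is None:
        raise ValueError("not a recognized database label: %r" % db)
    return match.group(1), int(match.group(2))


print(split_db_name("panTro2"))  # prints: ('panTro', 2)
print(split_db_name("ce1"))      # prints: ('ce', 1)
```

Comparing the numeric part is only meaningful between labels that share a prefix, since version numbers are assigned per organism.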

The coordinates of your favorite gene in one assembly may not be the same as those in the next release of the assembly unless the gene happens to lie on a completely sequenced and unrevised chromosome. For information on integrating data from one assembly into another, see the Converting positions between assembly versions section.



  Converting positions between assembly versions
 

Question:
"I've been researching a specific area of the human genome on the current assembly, and now you've just released a new version. Is there an easy way to locate my area of interest on the new assembly?"

Response:
See the section on converting coordinates for information on assembly migration tools.



  Missing annotation tracks
 

Question:
"Why is my favorite annotation track missing from your latest release?"

Response:
The initial release of a new genome assembly typically contains a small subset of core annotation tracks. New tracks are added as they are generated. In many cases, our annotation tracks are contributed by scientists not affiliated with UCSC who must first obtain the sequence, repeat-masked data, etc. before they can produce their tracks. If you need an annotation that has not appeared on an assembly within a month or so of its release, feel free to send an inquiry to genome@soe.ucsc.edu. Messages sent to this address will be posted to the moderated genome mailing list, which is archived on a public Web-accessible pipermail archive. This archive may be indexed by non-UCSC sites such as Google.



  What next with the human genome?
 

Question:
"Now that the human genome is 'finished', will there be any more releases?"

Response:
Rest assured that work will continue. There will be updates to the assembly over the next several years. This has been the case for all other finished (i.e. essentially complete) genome assemblies as gaps are closed. For example, the C. elegans genome has been "finished" for several years, but small bits of sequence are still being added and corrections are being made. NCBI will continue to coordinate the human genome assemblies in collaboration with the individual chromosome coordinators, and UCSC will continue to QC the assembly in conjunction with NCBI (and, to a lesser extent, Ensembl). UCSC, NCBI, Ensembl, and others will display the new releases on their sites as they become available.



  Mouse strain used for mouse genome sequence
 

Question:
"What strain of mouse was used for the Mus musculus genome?"

Response:
C57BL/6J.



  UniProt (Swiss-Prot/TrEMBL) display changes
 

Question:
"What has UCSC done to accommodate the changes to display IDs recently introduced by UniProt (aka Swiss-Prot/TrEMBL)?"

Response:
Here is a detailed description of the database changes we have made to accommodate the UniProt changes. If you are using the proteinID field in our knownGene table or the Swiss-Prot/TrEMBL display ID for indexing or cross-referencing other data, we strongly suggest you transition to the UniProt accession number. These changes will also affect anyone who is mirroring our site.

  1. The latest UniProt Knowledgebase (Release 46.0, Feb. 1st, 2005) was parsed and the results were stored in a newly created database sp050201.
  2. A corresponding database, proteins050201, was constructed based on data in sp050201 and other protein data sources.
  3. Two new symbolic database pointers, uniProt and proteome, have been created to point to the two new databases mentioned above. Some parts of our programs use the data in these two DBs.
       uniProt  ---> sp050201
       proteome ---> proteins050201
  4. The existing protein symbolic database pointers, swissProt and proteins remain unchanged. Some parts of our programs still use these two pointers and the data in their associated protein databases.
       swissProt ---> sp041115
       proteins  ---> proteins041115
  5. Two new tables, spOldNew and uniProtAlias, have been added to the proteome database.

    The spOldNew table contains three columns:
    • acc -- primary accession number
    • oldDisplayId -- old display ID
    • newDisplayId -- new display ID

    The uniProtAlias table contains four columns:
    • acc -- UniProt accession number
    • alias -- alias (could be acc, old and new display IDs, etc.)
    • aliasSrc -- source of the alias type
    • aliasSrcDate -- date of the source data

    The aliases include primary accessions, secondary accessions, new display IDs, old display IDs, and old display IDs corresponding to new secondary accessions.

  6. Three new functions have been added to kent/src/hg/spDb.c:
       char *oldSpDisplayId(char *newSpDisplayId);
       /* Convert from new Swiss-Prot display ID to old display ID */
          
       char *newSpDisplayId(char *oldSpDisplayId); 
       /* Convert from old Swiss-Prot display ID to new display ID */
    
       char *uniProtFindPrimAcc(char *id);
       /* Return primary accession given an alias. */

    The uniProtFindPrimAcc() function is enabled by the new uniProtAlias table.
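
To make the role of the uniProtAlias table concrete, here is a minimal Python sketch of an alias-to-accession lookup. It is not the code in spDb.c: it uses an in-memory SQLite table as a stand-in for the UCSC database, keeps only the four column names listed above, and fills the table with made-up placeholder rows. The function name find_primary_acc and the aliasSrc values are assumptions for illustration.

```python
import sqlite3

# In-memory stand-in for the uniProtAlias table (column names from the list above).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uniProtAlias (acc TEXT, alias TEXT, aliasSrc TEXT, aliasSrcDate TEXT)"
)

# Placeholder rows only; these are not real UniProt accessions or display IDs.
rows = [
    ("ACC0001", "ACC0001", "acc", "2005-02-01"),
    ("ACC0001", "OLD_ID_1", "oldDisplayId", "2005-02-01"),
    ("ACC0001", "NEW_ID_1", "newDisplayId", "2005-02-01"),
]
conn.executemany("INSERT INTO uniProtAlias VALUES (?, ?, ?, ?)", rows)


def find_primary_acc(connection, alias):
    """Return the accession stored for an alias, or None if the alias is unknown."""
    row = connection.execute(
        "SELECT acc FROM uniProtAlias WHERE alias = ? LIMIT 1", (alias,)
    ).fetchone()
    return row[0] if row else None


print(find_primary_acc(conn, "OLD_ID_1"))  # prints: ACC0001
```

Because old and new display IDs are both stored as aliases, the same query resolves either one to the accession, which is why indexing on the accession is the more stable choice.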

We anticipate additional changes down the road and may eventually merge the two sets of protein DB pointers into one set.

Currently, the proteinID field of the knownGene table for existing genome releases (hg15, hg16, hg17, mm3, mm4, mm5, rn2, and rn3) uses old Swiss-Prot/TrEMBL display IDs (from before Feb. 1, 2005). In the future, we may change this field to show the UniProt accession number. Should we choose not to change the content of the proteinID field, we may consider adding a new field, uniProtAcc.

If you have any questions about these changes and their impact on your work, please email us at genome@soe.ucsc.edu. Mirror sites may send questions to genome-mirror@soe.ucsc.edu. Messages sent to these addresses will be posted to the moderated mailing lists, which are archived on a public Web-accessible pipermail archive. This archive may be indexed by non-UCSC sites such as Google.