
Information


Genome sequencing of Populus

The US Department of Energy has provided resources to obtain the full genome sequence of Populus trichocarpa. The tree used for sequencing is named Nisqually-1, after a river near Seattle where it was originally collected. The Joint Genome Institute (JGI) has performed whole-genome shotgun sequencing to over 7x coverage. The assembly, annotation and analysis of the genome sequence are carried out by a group of institutions: JGI, the Oak Ridge National Laboratory, UPSC, Genome Canada and a few others. The release of the annotated sequence during 2004 will provide a tremendous resource for tree researchers worldwide.


The Populus array

Our first microarray (the wood array) with Populus cDNA clones consisted of 2995 clones from a library created from RNA prepared from the wood-forming zone. The second array contained 2667 clones from the leaf library.
The first generation of a global Populus array, POP1, also contains clones from other tissues, 13 488 clones in total, and is based on the Unigene set we obtained after sequencing about 35 000 clones. POP1 was ready in 2001. Over 2000 slides were produced and used in over 25 different biological experiments.
 
The second generation of the global Populus arrays, POP2, is now in production. It is based on the Unigene set extracted from the analysis of over 100 000 ESTs. To our knowledge, no single academic project has, independently of others, produced a more comprehensive microarray. We have ourselves sequenced all the ESTs, performed all the bioinformatics and carried out every step in the production of the Populus arrays, from which the researchers in the project now benefit fully.


The Database

This database is built from 121 495 Populus EST sequences from 19 cDNA libraries.
Sequences that are similar are grouped together into clusters (11891). A cluster should represent sequences coming from a specific gene. Each cluster has been given an annotation based on the best Arabidopsis hit and sometimes the best Swiss-Prot hit.
Functional classes are assigned from automatically derived Arabidopsis classifications. Within each cluster, sequences may be further grouped into contigs that show very high similarity.

There is a consensus sequence for each contig. Contigs may represent species variants, splice variants, etc. Clustered sequences that do not belong to a contig are referred to as singlets. Sequences that do not belong to a cluster are referred to as singletons (12767), and their annotation is based only on their own sequence. Clusters plus singletons should represent a Unigene set. Some clones have a PU number, which refers to a DNA preparation spotted on a microarray. The "Re-sequenced" sequences come from this DNA preparation.
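
The grouping described above can be illustrated with a small Python sketch. It is not the code used to build this database; the function names, the input layout (each sequence mapped to a cluster and a contig, either of which may be missing) and the example identifiers are all assumptions made for illustration. Only the two counts at the end (11891 clusters and 12767 singletons) come from this page.

```python
from collections import defaultdict


def classify(assignments):
    """Sort sequences into contigs, singlets and singletons.

    `assignments` is a hypothetical mapping:
        sequence id -> (cluster id or None, contig id or None)
    """
    clusters = defaultdict(lambda: {"contigs": defaultdict(list), "singlets": []})
    singletons = []
    for seq_id, (cluster_id, contig_id) in assignments.items():
        if cluster_id is None:
            # Not in any cluster: a singleton.
            singletons.append(seq_id)
        elif contig_id is None:
            # In a cluster but not in a contig: a singlet.
            clusters[cluster_id]["singlets"].append(seq_id)
        else:
            # In a cluster and in one of its contigs.
            clusters[cluster_id]["contigs"][contig_id].append(seq_id)
    return clusters, singletons


def unigene_size(n_clusters, n_singletons):
    """Size of the Unigene set: clusters plus singletons."""
    return n_clusters + n_singletons


# Made-up example identifiers, for illustration only.
example = {
    "est1": ("C1", "C1.1"),
    "est2": ("C1", "C1.1"),
    "est3": ("C1", None),
    "est4": (None, None),
}
clusters, singletons = classify(example)
print(unigene_size(len(clusters), len(singletons)))  # 2

# With the counts given on this page:
print(unigene_size(11891, 12767))  # 24658
```

With the figures given here, the Unigene set would hold 11891 + 12767 = 24658 entries.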

Sometimes a clone may not have a unique annotation, but there is always a main annotation for each PU number. The details of how an annotation was chosen can be viewed by clicking "show details".

The dataset is described in the publication:

Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Campaa L, Jonsson Lindvall J, Tandre K, Strauss SH, Sundberg B, Gustafsson P, Uhlen M, Bhalerao RP, Nilsson O, Sandberg G, Karlsson J, Lundeberg J, Jansson S (2004)
A Populus EST resource for plant functional genomics.
Proc Natl Acad Sci U S A. 2004 Sep 21;101(38):13951-6





Responsible for this page jan.karlsson@plantphys.umu.se