CANTATAdb 2.0 is a database of lncRNAs identified computationally in 36 plant species and 3 algae. Below are lncRNA counts per species.
We mapped reads from hundreds of RNA-Seq libraries to the corresponding plant genomes and re-annotated these genomes using gene prediction software, with known annotation data as a reference. We then applied several filters to discriminate between non-coding and protein-coding transcripts.
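The exact filters are not listed here, so the following is only a minimal sketch of what one such filter could look like, not the pipeline actually used for CANTATAdb. It assumes two commonly used criteria (a minimum transcript length and a maximum open reading frame length); the thresholds `MIN_TRANSCRIPT_NT` and `MAX_ORF_AA` and the function names are hypothetical and chosen for illustration.

```python
# Illustrative thresholds only: these are assumptions, not CANTATAdb's documented settings.
MIN_TRANSCRIPT_NT = 200   # minimum transcript length in nucleotides
MAX_ORF_AA = 100          # maximum tolerated ORF length in amino acids

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Return the length (in amino acids) of the longest ATG..stop ORF
    found in the three forward-strand reading frames."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        current = None  # number of codons since the last ATG, or None if not in an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if current is None:
                if codon == "ATG":
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                current = None
            else:
                current += 1
    return best


def is_candidate_lncrna(seq: str) -> bool:
    """Keep transcripts that are long enough and lack a long ORF."""
    return len(seq) >= MIN_TRANSCRIPT_NT and longest_orf_aa(seq) <= MAX_ORF_AA
```

A real pipeline would typically also scan the reverse strand and combine such checks with coding-potential scoring and similarity searches against known proteins.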
CANTATAdb 2.0 collects 239,631 lncRNAs predicted in 39 species. Among other data, the database provides lncRNA sequences, expression values across RNA-Seq libraries, genomic locations, hypothetical peptides encoded by lncRNAs, and BLAST search results against Swiss-Prot proteins and non-coding RNAs from NONCODE.
A number of model plant species lack comprehensive datasets of lncRNAs and their annotations, which hinders further study of these molecules. CANTATAdb 2.0 is one of the largest and most comprehensive databases of plant lncRNAs.
Any questions, remarks, suggestions? Feel free to contact us:
Website of the Laboratory of Integrative Genomics
(group of Prof. Izabela Makalowska).
Laboratory of Integrative Genomics
Institute of Anthropology
Adam Mickiewicz University in Poznan
ul. Umultowska 89, 61-674 Poznan, Poland