Sequencing Data

  Genic-enriched libraries from the six selected genotypes (each prepared with ClaI and HpaII) were sequenced on the 454 GS FLX Titanium platform (Roche). All reads were screened for organellar (plastid and mitochondrial) DNA contamination, and the contaminating reads were removed. Reads shorter than 50 bases were also excluded from further processing. The filtered, high-quality reads were assembled de novo with gsAssembler v2.5.3 using three strategies: super-assembly, genotype-wise assembly and enzyme-wise assembly. Because both enzymes target the same regions, the reads were also mapped enzyme-wise and genotype-wise using gsReferenceMapper v2.5.3. The unigene set (contigs and singletons) resulting from the super-assembly was processed further.
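
The length filter described above can be sketched in Python as follows. This is only a minimal illustration and not the pipeline that was actually used: the FASTA input, the helper names `parse_fasta` and `filter_short_reads`, the example reads, and the choice to keep reads of exactly 50 bases are all assumptions.

```python
from typing import Iterable, Iterator, List, Tuple

MIN_READ_LENGTH = 50  # assumption: reads of exactly 50 bases are kept


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from FASTA-formatted lines."""
    read_id = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if read_id is not None:
                yield read_id, "".join(chunks)
            header = line[1:].split()
            read_id = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if read_id is not None:
        yield read_id, "".join(chunks)


def filter_short_reads(
    reads: Iterable[Tuple[str, str]], min_length: int = MIN_READ_LENGTH
) -> List[Tuple[str, str]]:
    """Keep only reads whose sequence has at least min_length bases."""
    return [(rid, seq) for rid, seq in reads if len(seq) >= min_length]


# Hypothetical reads: 30, 60 (split over two lines) and 50 bases long.
example = [">read1", "A" * 30, ">read2", "ACGT" * 10, "ACGT" * 5, ">read3", "G" * 50]
kept = filter_short_reads(parse_fasta(example))
print([rid for rid, _ in kept])
```

Only the 30-base read is dropped here; the organellar screening step would need a separate alignment tool and is not covered by this sketch.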
     
   
  Sequencing Output: Summary of the 454 sequencing data generated and of quality filtering
 
Genotype  Enzyme  Total reads  Total bases  Avg. read       Longest read  Trashed reads                          Reads after
                               (Mb)         length (bases)  (bases)       Chloroplast  Mitochondrial  Too short  filtration
JKC 703   HpaII   1509044      429.1        284             634           389756       89442          23212      1075626
JKC 703   ClaI    1647001      474.5        288             630           252473       41234          20952      1364181
JKC 725   HpaII   1360299      372.9        274             704           270277       62720          20964      1054157
JKC 725   ClaI    1865501      533.4        286             976           205250       41793          23849      1624696
JKC 737   HpaII   1448726      407.7        281             806           380981       69477          20997      1037133
JKC 737   ClaI    1688291      542.9        322             672           125893       31990          11190      1535749
JKC 770   HpaII   1304640      376.1        288             1195          294435       67969          22322      971266
JKC 770   ClaI    1765469      481.1        273             1182          217653       49506          13227      1518511
MCU5      HpaII   1153230      316.8        275             1196          423654       86618          27300      688989
MCU5      ClaI    1474521      416.6        283             1188          485167       60026          29095      948433
LRA 5166  HpaII   1449002      433.9        299             645           258726       45684          17378      1164687
LRA 5166  ClaI    1703215      513.3        301             627           376497       71235          13537      1297097
Total             18368939     5298.8       288             1196          3680762      717694         244023     14280525
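
As a quick consistency check on the table above, the per-library counts can be summed and compared with the Total row, and the fraction of reads retained after filtering can be derived. The sketch below copies the "Total reads" and "Reads after filtration" columns from the table; the names `READ_COUNTS` and `retained_percent` are illustrative and not part of the original analysis.

```python
# (genotype, enzyme) -> (total reads, reads after filtration), copied from the table
READ_COUNTS = {
    ("JKC 703", "HpaII"): (1509044, 1075626),
    ("JKC 703", "ClaI"): (1647001, 1364181),
    ("JKC 725", "HpaII"): (1360299, 1054157),
    ("JKC 725", "ClaI"): (1865501, 1624696),
    ("JKC 737", "HpaII"): (1448726, 1037133),
    ("JKC 737", "ClaI"): (1688291, 1535749),
    ("JKC 770", "HpaII"): (1304640, 971266),
    ("JKC 770", "ClaI"): (1765469, 1518511),
    ("MCU5", "HpaII"): (1153230, 688989),
    ("MCU5", "ClaI"): (1474521, 948433),
    ("LRA 5166", "HpaII"): (1449002, 1164687),
    ("LRA 5166", "ClaI"): (1703215, 1297097),
}


def retained_percent(total: int, kept: int) -> float:
    """Percentage of reads that survive filtering."""
    return 100.0 * kept / total


total_reads = sum(total for total, _ in READ_COUNTS.values())
kept_reads = sum(kept for _, kept in READ_COUNTS.values())

# Both sums should reproduce the Total row of the table.
print(total_reads, kept_reads)
print(f"overall retained: {retained_percent(total_reads, kept_reads):.2f}%")

for (genotype, enzyme), (total, kept) in READ_COUNTS.items():
    print(f"{genotype:9s} {enzyme:6s} {retained_percent(total, kept):6.2f}%")
```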
 
     
     
  Genotype-wise and super-assembly: Details of the genotype-specific assemblies and the super-assembly of genic-enriched libraries using Newbler v2.5.3
 
Parameters                JKC 703  JKC 725  JKC 737  JKC 770  MCU5     LRA 5166  Super assembly
All contigs (>100 bp)     58142    61862    54731    53419    27952    63002     533271
Singletons                1233638  1438844  1325035  1287063  806670   1234938   3561858
Total bases (Mb)          377.9    428.1    427.3    378.4    233.4    398.9     1272.6
Large contigs (>500 bp)   21920    20255    17960    17657    8663     25084     215504
Largest contig size (bp)  29718    30878    30387    31003    36027    29615     24275
Average contig size (bp)  826      808      809      826      815      868       900
N50 contig size (a) (bp)  808      785      787      802      771      861       894
Aligned reads (%)         40.77    39.84    36.94    39.26    45.4     43.94     61.9
Aligned bases (%)         38.05    36.38    32.12    35.6     42.33    41.03     58.5
Inferred read error (b)   2.45     2.75     2.97     2.69     2.14     2.38      2.16
Q40 plus bases (c)        92.78    92.6     92.25    92.63    94.25    92.99     93.8
(a) N50 is the length of the smallest contig in the set of largest contigs whose combined length represents 50% of the total assembly size.
(b) Number of differences (such as indels) found in aligned reads, expressed as a percentage of numAlignedBases.
(c) Percentage of called bases that have a quality score of 40 or above.
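
The N50 definition given in the footnotes can be made concrete with a short Python sketch. It is a generic implementation of that definition, not the code Newbler uses internally; the function name `n50` and the example contig lengths are hypothetical.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are taken from largest to smallest until their combined
    length reaches at least 50% of the total assembly size; the length
    of the last (smallest) contig taken is the N50.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(contig_lengths) / 2
    running = 0
    result = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        result = length
        if running >= half_total:
            break
    return result


# Hypothetical contig lengths (bp): total = 1500, half = 750.
# 500 + 400 = 900 >= 750, so the N50 is 400.
print(n50([100, 200, 300, 400, 500]))
```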