A genomic-wide analysis of Apis mellifera: insights into small diverse high copy number ORFs
Supriya Munshaw, Robert W Cutler, Siriwat Wongsiri and Panuwan Chantawannakul
This paper develops a frequency-based analysis of every open reading frame (ORF) in the honey bee (Apis mellifera) genome, using a set of Perl algorithms developed to identify novel exonic regions. Using the observed amino acid abundances for these regions, this ORF-profile approach found a background Poisson distribution of randomly arranged regions. Above this background, small ORF regions longer than 16 amino acids were significantly overabundant. Some of these regions are similar to known sequences such as the ribosomal proteins, but several families share no similarity with any other known nucleotide or amino acid sequence. The frequency analysis was developed to search for the odorant binding proteins, which are expected to occur at high copy number yet share little sequence similarity with any other known sequence.
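The frequency-based idea described above can be illustrated with a minimal sketch. It is written in Python rather than the Perl used for the original analysis and is not the authors' implementation. It assumes that an ORF runs from the first methionine after a stop codon to the next in-frame stop, that all six reading frames are scanned, and that the expected copy number `lam` of a region under the random background is supplied by the user. The function names `find_orfs`, `copy_numbers` and `poisson_tail` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product("TCAG", repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def find_orfs(dna, min_len=16):
    """Return peptides (first M to stop) of at least min_len residues, six frames.

    Assumption: the ORF definition and the length threshold are illustrative.
    """
    dna = dna.upper()
    strands = [dna, dna.translate(_COMPLEMENT)[::-1]]  # forward, reverse complement
    peptides = []
    for strand in strands:
        for frame in range(3):
            # Drop the final segment: it is not terminated by a stop codon.
            for segment in translate(strand[frame:]).split("*")[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_len:
                    peptides.append(segment[start:])
    return peptides


def copy_numbers(dna, min_len=16):
    """Count how often each identical peptide occurs in the sequence."""
    return Counter(find_orfs(dna, min_len))


def poisson_tail(k, lam):
    """P(X >= k) for a Poisson variable with mean lam (the random background)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)
```

A region observed `k` times would be flagged as overabundant when `poisson_tail(k, lam)` falls below a chosen significance threshold. How `lam` should be derived from the amino acid abundances is left open here, since the abstract does not specify it.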