A simple optimization can improve the performance of single feature polymorphism detection by Affymetrix expression arrays

BMC Genomics. 2010 May 20;11:315. doi: 10.1186/1471-2164-11-315.

Abstract

Background: High-density oligonucleotide arrays are effective tools for genotyping numerous loci simultaneously. In small-genome species (genome size less than approximately 300 Mb), whole-genome DNA hybridization to expression arrays has been used for various applications. In large-genome species, transcript hybridization to expression arrays has been used for genotyping. Although rice is a fully sequenced model plant with a medium-sized genome (approximately 400 Mb), there are few examples of rice oligonucleotide arrays being used as a genotyping tool.

Results: We compared the single feature polymorphism (SFP) detection performance of whole-genome and transcript hybridizations on the Affymetrix GeneChip Rice Genome Array, using two rice cultivars with full genome sequences: the japonica cultivar Nipponbare and the indica cultivar 93-11. Both genomes were surveyed for all probe target sequences. Only single-copy 25-mer probes that completely matched the Nipponbare genome were extracted, and SFPs between these probes and the 93-11 sequences were predicted. We investigated optimum conditions for SFP detection in both whole-genome and transcript hybridization using the differences between perfect match (PM) and mismatch probe intensities of non-polymorphic targets, assuming that these differences are representative of those between mismatched and perfectly matched targets. Several statistical methods of SFP detection by whole-genome hybridization were compared under the optimized conditions. Causes of false positives and false negatives in SFP detection were investigated for both types of hybridization.
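As an illustration of the kind of per-probe comparison described above, the following is a minimal sketch of a SAM-style statistic computed on log2-transformed PM intensities. It is not the authors' pipeline: the function names, the fudge constant s0, the fixed threshold, and the example intensities are all assumptions made for illustration. In particular, a full significance analysis would choose the cutoff by permutation rather than use a fixed value.

```python
import math
from statistics import mean


def sam_d(pm_ref, pm_alt, s0=0.1):
    """SAM-style statistic for one probe on log2 PM intensities.

    pm_ref: replicate raw intensities for the reference genotype
            (the one the probe sequence matches perfectly).
    pm_alt: replicate raw intensities for the other genotype.
    s0:     small positive constant added to the standard error
            (hypothetical value; normally estimated from the data).
    """
    log_ref = [math.log2(x) for x in pm_ref]
    log_alt = [math.log2(x) for x in pm_alt]
    n_ref, n_alt = len(log_ref), len(log_alt)
    m_ref, m_alt = mean(log_ref), mean(log_alt)
    # Pooled sum of squared deviations across both genotypes.
    ss = sum((x - m_ref) ** 2 for x in log_ref) + sum((x - m_alt) ** 2 for x in log_alt)
    s = math.sqrt((1 / n_ref + 1 / n_alt) * ss / (n_ref + n_alt - 2))
    return (m_ref - m_alt) / (s + s0)


def call_sfps(probes, threshold=3.0, s0=0.1):
    """Return IDs of probes whose statistic exceeds a fixed threshold.

    probes maps a probe ID to (reference replicates, alternative replicates).
    The fixed threshold is an assumption made for this sketch.
    """
    return [pid for pid, (ref, alt) in probes.items() if sam_d(ref, alt, s0) > threshold]


# Made-up intensities: probe_1 hybridizes much more weakly in the second genotype.
example_probes = {
    "probe_1": ([1024, 1100, 980], [250, 260, 240]),
    "probe_2": ([500, 520, 510], [505, 515, 498]),
}
sfp_calls = call_sfps(example_probes)
```

A one-sided comparison is used here because a polymorphism in the target is expected to lower hybridization to a probe designed from the reference sequence.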

Conclusions: The optimizations allowed a more than 20% increase in true SFP detection in whole-genome hybridization and a large improvement in SFP detection performance in transcript hybridization. Significance Analysis of Microarrays (SAM) applied to log-transformed raw intensities of PM probes gave the best performance in whole-genome hybridization, detecting 22,936 true SFPs with 23.58% false positives. For transcript hybridization, stable SFP detection was achieved for highly expressed genes, and about 3,500 SFPs were detected at high sensitivity (> 50%) in both shoot and young panicle transcripts. The high SFP detection performance of both genome and transcript hybridizations indicates that microarrays for a complex genome (e.g., that of Oryza sativa) can be used effectively for whole-genome genotyping in mutant mapping and in the analysis of quantitative traits such as gene expression levels.
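The performance figures above can be read against sequence-predicted SFPs. The sketch below shows one way such metrics could be computed. It assumes that sensitivity is the fraction of predicted SFPs that were detected and that the false positive percentage is the fraction of detected probes that were not predicted; the abstract does not define either term, so both definitions, the function name, and the probe IDs are hypothetical.

```python
def detection_metrics(called, predicted):
    """Compare called SFP probes against sequence-predicted SFP probes.

    Returns (sensitivity, false_positive_fraction) under the assumed
    definitions: sensitivity = true calls / predicted SFPs, and
    false_positive_fraction = calls not predicted / all calls.
    """
    called, predicted = set(called), set(predicted)
    true_calls = called & predicted
    sensitivity = len(true_calls) / len(predicted) if predicted else 0.0
    false_positive_fraction = len(called - predicted) / len(called) if called else 0.0
    return sensitivity, false_positive_fraction


# Made-up probe IDs for illustration only.
called_probes = ["p1", "p2", "p3", "p4"]
predicted_probes = ["p1", "p2", "p3", "p5", "p6", "p7"]
sensitivity, fp_fraction = detection_metrics(called_probes, predicted_probes)
```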

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Plant / genetics
  • False Negative Reactions
  • False Positive Reactions
  • Gene Expression Profiling*
  • Genomics
  • Nucleic Acid Hybridization
  • Oligonucleotide Array Sequence Analysis / methods*
  • Plants / genetics
  • Polymorphism, Single Nucleotide*
  • RNA, Complementary / genetics

Substances

  • DNA, Plant
  • RNA, Complementary