
Sudoku: Brainteaser, Time Waster, Genome Sequencer

July 02, 2009 05:00 PM
by Mark E. Moran
Researchers apply the principles behind the popular math puzzle to vastly reduce the time and cost of genome sequencing.

The Scientific Merits of a Puzzle

Researchers at the Cold Spring Harbor Laboratory, a private, not-for-profit research and education institution on Long Island, N.Y., have issued a report suggesting that the principles behind Sudoku (pronounced soo-doe-koo), a popular math-based, 81-box puzzle game, may revolutionize genome sequencing and the study of medical genetics.

The report will be published as the cover story in the July 1 issue of the journal Genome Research, and is available in pre-publication form on the laboratory’s Web site, with a supporting PowerPoint presentation.

According to a press release this week from the laboratory, the researchers combined the 2,000-year-old Chinese remainder theorem with concepts from cryptology. Samples are combined into large pools arranged in rows and columns, and the remainder theorem is then used to pinpoint specific samples.

This allows tens of thousands of DNA samples to be combined, and their sequences determined, all at once. In the early days of DNA sequencing, researchers could only sequence one DNA sample at a time, and until the advent of this new technique, could only combine a few hundred samples.
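To see how a remainder theorem can single out one sample from a set of pools, consider the following minimal Python sketch. It is an illustration of the general idea only, not the laboratory's actual protocol: the pool sizes in `MODULI`, the function names, and the assumption that exactly one sample carries the mutation are all hypothetical. Each sample is placed in one pool per pooling round, chosen by the remainder of its ID, and the combination of pools that test positive identifies the sample.

```python
from math import prod

# Hypothetical pool counts, one per pooling round. They must be pairwise
# coprime for the Chinese remainder theorem to apply.
MODULI = (7, 11, 13)  # enough for 7 * 11 * 13 = 1001 distinct samples


def pools_for_sample(sample_id, moduli=MODULI):
    """Return the pool a sample joins in each round (its remainders)."""
    return tuple(sample_id % m for m in moduli)


def identify_sample(positive_pools, moduli=MODULI):
    """Recover a sample ID from the single positive pool in each round.

    Assumes exactly one carrier; several carriers would make the
    combination of positive pools ambiguous in this simple sketch.
    """
    total = prod(moduli)
    sample_id = 0
    for remainder, m in zip(positive_pools, moduli):
        partial = total // m
        # pow(x, -1, m) computes the modular inverse (Python 3.8+).
        sample_id += remainder * partial * pow(partial, -1, m)
    return sample_id % total


carrier = 642                      # illustrative sample ID
signature = pools_for_sample(carrier)
print(signature)                   # (5, 4, 5)
print(identify_sample(signature))  # 642
```

With these assumed pool sizes, 7 + 11 + 13 = 31 pooled tests are enough to tell apart 1,001 samples, which hints at why pooling can cut the amount of sequencing so sharply.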

Katherine Harmon, writing for Scientific American, explains that in DNA sequencing, the theorem behind Sudoku “is used to organize data points with coordinates in a box, but it can also be used to figure out all sorts of missing information in other domains, such as distant points sensed with high-speed radar, pieces of code, and who that attractive person was that you saw at three out of seven parties on a cruise ship.”

CSHL professor Gregory Hannon, Ph.D., leader of the team that invented the “Sudoku” approach, estimates that it could reduce the cost of a sequencing project that currently costs $10 million to between $50,000 and $80,000. 

Hannon says the method has the potential to analyze specific parts of the genomes of a particular population and identify individuals who carry mutations of a gene that may lead to disease, a process known as genotyping. Hannon’s team has already begun to explore this potential with an organization that has collected DNA from Orthodox Jewish communities, which have unusually high incidences of Tay-Sachs disease and cystic fibrosis.

Yaniv Erlich, a graduate student in the Hannon laboratory and first author on the paper, told Newsday he hopes that “DNA Sudoku” will be “helpful to larger clinical applications such as HLA typing, which is used to determine compatibility of organ donors.”

History of Sudoku

HowStuffWorks explains that the goal of Sudoku is to fill a series of nine-square rows, nine-square columns and nine-square boxes with the numbers one through nine, using each number only once in each row, column or box. It explains that it’s “the interaction between the rows, columns and boxes that tells you where the numbers need to go.”
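The rule can be stated compactly in code. The Python sketch below checks whether a completed grid obeys it; the function name and the sample grid are illustrative assumptions, and the check presumes a full 9-by-9 grid of integers.

```python
def is_valid_solution(grid):
    """Check that each row, column and 3x3 box holds the digits 1-9 once.

    Assumes grid is a list of nine rows of nine integers.
    """
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
        for r in (0, 3, 6)
        for c in (0, 3, 6)
    ]
    # Nine cells containing all nine digits means no digit repeats.
    return all(group == digits for group in rows + cols + boxes)


# A valid grid built from a simple shifting pattern (illustrative only).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
          for r in range(9)]
print(is_valid_solution(solved))  # True
```

Swapping any two different digits within a single row of `solved` keeps that row valid but breaks two columns, which is the interaction between rows, columns and boxes that the puzzle relies on.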

When Sudoku became a craze in England in 2005, David Smith wrote an article for The Guardian explaining the history of Sudoku. Smith says its origins stem from the work of Swiss mathematician Leonhard Euler, who in 1783 devised a precursor to Sudoku that he called “Latin Squares.”

The game in its current form was published by Dell Puzzles of Manhattan beginning in 1979. Five years later, it became a sensation in Japan, and 20 years later, it made inroads into major newspapers in Great Britain and the United States.
