Archive for GWAS

Personal Genome Project Sees Whole-Genome Sequencing as ‘Increasingly a Viable Option’


By Julia Karow

This article was originally published Oct. 14.

Organizers of Harvard Medical School’s Personal Genome Project said that as the cost of DNA sequencing declines, they are considering whole-genome sequencing rather than exome sequencing for the second phase of the study, PGP-100.

The project has already added to its website results from the genome of its founder and principal investigator, George Church, whose genome was recently sequenced by Complete Genomics.

Launched in 2007 with 10 participants, the PGP aims to sequence the genomes of 100,000 people and to correlate their genotypes with trait information. In April, the project said that it plans to scale up to 100 participants for its second phase (see In Sequence 5/5/2009).

In a newsletter e-mailed last week to individuals interested in the study, PGP organizers said that they have been closely monitoring the decrease in cost of whole human genome sequencing “because it will impact our sequencing strategy for the PGP-100.”

The cost has already fallen to less than $50,000 per genome, PGP said, “with some speculating that the arrival of $5,000 genomes is imminent,” a reference to Complete Genomics’ $5,000 human genome sequencing service, which is scheduled to launch in January.

The project’s initial strategy, according to the organizers, was to focus on the exome, since it is “information rich” and seemed “a more economical alternative” to whole-genome sequencing.

“However, the cost of exome sequencing has not fallen as rapidly as whole-genome sequencing,” they noted, and as a result, for the PGP-100, “the decision to pursue whole genome is increasingly a viable option.”

It will depend, though, on factors such as the project’s ability to raise funding as well as “the willingness of sequencing companies to publicly showcase their technologies through sponsorship of PGP-100 genomes.”

According to its website, the PGP is funded by donations from individuals, Google, Orbimed, the COUQ Foundation, a grant from the Broad Institute, technology development grants from the Department of Energy and NIH, and in-kind support from various organizations. PersonalGenomes.org, a 501(c)3 charitable organization, seeks to raise $1.5 million in donations for the project this year from foundations, private companies, and individuals.

The project has already posted results from an analysis of Church’s genome, which was recently sequenced by Complete Genomics. Church told In Sequence last month that the company has committed to sequencing nine additional PGP genomes, though it was unclear whether it will charge the PGP for its services (see In Sequence 9/15/2009).

For the interpretation of the genomic information, the PGP is using Trait-o-matic, an open-source tool developed in house, which automatically identifies, filters, and annotates genetic variants. The project plans to use the software to generate research reports that contain variants “that may be of potential significance” and has already generated prototypes of such reports for its first 10 participants, based for nine of them on partial exome sequence data.

Future releases of Trait-o-matic will “enable a community of volunteers to annotate and interpret integrated genomic and trait datasets from the PGP.”


The Technology Pilot

If you’ve been wondering how the 1,000 Genomes Project is doing, here’s an account from Dan Koboldt at his MassGenomics blog about last week’s meeting at Baylor where participants discussed the “Pilot 3” phase of the project. “Unlike pilots 1 and 2, which emphasized whole genome sequencing to low or high coverage, respectively, in Pilot 3, the exons of 1,000 genes (~1.5 Mbp total) were selectively targeted for sequencing by capture technologies,” Koboldt writes. The team is also checking data across platforms and pipelines. “Overall, the Pilot 3 variant calls are looking good – dbSNP concordances in the 70-80% range or higher, and transition/transversion ratios of about 3-3.50 – and consistent across 454 and Solexa data from multiple centers,” he writes.
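For readers unfamiliar with the two quality metrics Koboldt cites, here is a minimal Python sketch of how they can be computed. It is not the project's actual pipeline. It assumes variant calls are available as (reference base, alternate base) pairs and as (chromosome, position) sites, and the function names and toy data are hypothetical.

```python
# Illustrative sketch only; function names and toy data are hypothetical.
BASES = set("ACGT")

# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snps):
    """Return transitions / transversions for (ref, alt) single-base pairs.

    Returns None when there are no transversions, since the ratio is undefined.
    """
    ti = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        # Skip anything that is not a simple single-base substitution.
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else None


def known_site_concordance(called_sites, known_sites):
    """Return the fraction of called sites that also appear in a known-variant set."""
    called = set(called_sites)
    if not called:
        return 0.0
    return len(called & set(known_sites)) / len(called)


# Toy data: three transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
# Toy data: three of the four called sites are in the known set.
toy_called = [("chr1", 100), ("chr1", 200), ("chr2", 50), ("chr2", 75)]
toy_known = [("chr1", 100), ("chr2", 50), ("chr2", 75)]

print(ti_tv_ratio(toy_snps))                          # 3.0
print(known_site_concordance(toy_called, toy_known))  # 0.75
```

In this sketch a higher ratio simply means that transitions dominate the call set, and the concordance is the share of calls already catalogued in a reference database such as dbSNP.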


First genome-wide, single-base-resolution maps

Joe Ecker is senior author on a paper that provides the first genome-wide, single-base-resolution maps of methylated cytosines in a mammalian genome. Comparing human embryonic stem cells and fetal fibroblasts, they found “widespread differences” between the two. Notably, almost one-quarter of all methylation in embryonic stem cells was in a non-CG context, which suggests that embryonic stem cells may use different methylation mechanisms to affect gene regulation, they write.

Several opinion pieces appear this week. One checks in with experts about their concerns regarding the stimulus grants, another looks at the open-source Polymath Project, and a third, by Cameron Neylon, examines the potential of Google’s open-source collaboration tool, Google Wave.

A special insight section explores the changing landscape of neuroscience research. Says an editorial, “The experimental landscape has changed markedly over the past few years, given the technological advances in molecular genetics, optogenetics and functional imaging.” Articles cover molecular genetics and imaging technologies for circuit-based neuroanatomy, neuroscience and systems biology, and multimodal techniques for diagnosing Alzheimer’s disease.

Research led by Joel Levine, a neuroscientist from the University of Toronto, has determined that Drosophila melanogaster flies use a single chemical to communicate gender and sibling identity in order to pick the right sex partners. By inserting a transgene into the fly’s genome that killed cells that produced these special hydrocarbon signaling chemicals, they report that hydrocarbon-free male flies attempted copulating with each other, says a story at the BBC. Check out the accompanying video, too.


Genetics Suggest Population Expansion in Africa Began in Stone Age

July 29, 2009

NEW YORK (GenomeWeb News) – Modern human populations started expanding some 40,000 years ago, according to a paper that appeared online today in PLoS ONE.

Researchers from the University of Arizona and the University of California at San Francisco used multi-locus sequence analysis to assess genetic signatures found in nearly 200 individuals from seven populations around the world. Their results suggest human population expansions in Africa started about 40,000 years ago during the Stone Age — a more recent expansion time than that predicted from previous studies.

“[B]oth hunter-gathers (San and Biaka) and food-producers (Mandenka and Yorubans) best fit models with population growth beginning in the Late Pleistocene,” senior author Michael Hammer, a genetics researcher at the University of Arizona, and his co-authors wrote. “These dates are concurrent with the appearance of the Late Stone Age in Africa, supporting the hypothesis that population growth played a significant role in the evolution of Late Pleistocene human cultures.”

Previous studies based on mitochondrial DNA, Y-chromosome data, or autosomal microsatellites provided a broad range of estimates about when modern human population expansion began, dating as far back as a few hundred thousand years ago. But such estimates often conflict with one another and are based on one or a few sequences that may be under selective pressure, the researchers explained.

In an effort to generate more reliable data for teasing apart human population history, Hammer and his team used Sanger sequencing to re-sequence roughly 6,000 bases of nuclear DNA from each of about 20 autosomal non-coding regions for 184 individuals.

These regions were selected because they were sites with many crossing-over events but were also far from protein-coding genes and unlikely to be under selection. By looking at all of the regions together, Hammer told GenomeWeb Daily News, it’s possible to overcome the noise detected at any single region.

The individuals tested belonged to seven different populations: San, Biaka, Mandenka, Yoruban, French Basque, Han Chinese, and Melanesian.

When the team analyzed their data using multi-locus analysis, they found evidence suggesting that both hunter-gatherer populations (such as the San from Namibia and the Biaka from the Central African Republic) and food-producer populations (such as the Mandenka from Senegal and Yorubans from Nigeria) began expanding roughly 40,000 years ago during the Late Pleistocene period.

That predates the advent of farming in Africa, Hammer noted, and is consistent with archeological evidence suggesting there was a burst of populations interacting and sharing tools and cultural innovations at that time.

Overall, the team concluded that human populations in Africa began a ten-fold expansion some 36,000 years ago. Their data hint that expansion may have been a tad earlier and faster in the hunter-gatherer population — about a 13-fold expansion starting about 41,000 years ago — than in the food-producing populations, which expanded approximately seven-fold starting some 31,000 years ago.

In the future, the team plans to do additional studies looking at more populations from different parts of the world. And, Hammer said, they also hope to employ next-generation sequencing technology to look at even more regions in the genome.


GWAS and Differences in DNA Between Tissues

Posted by Bob Grant
www.the-scientist.com/blog
Recent findings may spell trouble for genome-wide association studies based on DNA obtained through blood samples: Genetic material may vary between blood cells and other tissues in a single individual, a study in the July issue of Human Mutation reports.


The study “raises a very interesting question,” Howard Edenberg, director of the Indiana University School of Medicine’s center for medical genomics, told The Scientist. Many genome-wide association studies — especially studies on systemic diseases such as diabetes and atherosclerosis — depend solely upon DNA harvested from blood samples to identify genes associated with medical conditions. But this study “suggests that looking only at blood, you may miss some things.”

Searching for the genes behind a fatal condition called abdominal aortic aneurysm (AAA), researchers from McGill University in Montreal found that complementary DNA from diseased abdominal aortic tissue did not match genomic DNA from leukocytes in blood from the same patient. “We did not expect to find a difference in the tissue [genes] compared to the leukocyte [genes],” said endocrinologist Morris Schweitzer, who led the study.

Schweitzer and his team uncovered three single nucleotide polymorphisms (SNPs) in samples of diseased tissue from 31 AAA patients that were not present in matching blood samples. They also tested five aortic and blood samples from normal individuals and found the same discrepancy. Schweitzer said that the apparent genetic difference between different cells in the body may cast some doubt on genome-wide association studies that only use DNA from blood samples to infer disease states. “I think they may not be accurate because they might not reflect what’s in the tissue,” he said, adding that researchers should look upon such genetic results “very carefully and very trepidatiously.”

Edenberg, who was not involved with the study but who conducts genome-wide association studies to explore the genetic roots of alcoholism and bipolar disorder, said that while the findings are interesting, they are very preliminary. “If they’re correct about this, and there are these genomic differences between tissues and blood at certain alleles, then we’re missing some things,” he said. Edenberg explained that experimenters generally take into account that such studies are somewhat “underpowered” in terms of their ability to catch every genetic indicator of disease. Schweitzer’s results, he noted, may add another layer to this consideration, but do not suggest that genome-wide association studies would turn up false positives, or blood-based genes mistakenly attributed to a particular disease.

Sudha Seshadri, a Boston University neurologist who was not involved in the study, told The Scientist that though the McGill group’s results are important, they do not negate genome-wide association data that scientists have already gathered. “I don’t think [the study] says much about the usefulness or validity of genome-wide association studies as they are being done in cohorts around the world.” Genome-wide studies on diabetes, for example, have identified about 16 genes that are related (in varying degrees) to the disease, said Seshadri, who collaborates on the Framingham Heart Study, a six-decade longitudinal study on more than 5,000 people that has more recently included genomic data.

“I think I would have suggested a few more experiments, personally,” Edenberg added. In particular, he pointed to the fact that the McGill researchers were comparing complementary DNA from aortic tissue to genomic DNA from blood. “At the moment,” he said, the discrepancy “seems relatively compatible with RNA editing [rather] than with a genomic issue.” The study should have compared genomic DNA from the aortic tissues with the genomic blood DNA, and cDNA from both cell types, Edenberg said.

Schweitzer said his group is currently working on this experiment and “should have results probably in a couple of weeks.” He noted that differences between tissue and blood DNA may account for the relatively low levels of association turned up by most genome-wide association studies. Of all the genome-wide association studies that have been conducted, he said, “No one has really found that one miracle gene that really points to something.”

Seshadri, however, said it’s hasty to dismiss the value of such studies. “I think [the authors] make some provocative statements that express a viewpoint, but not a widely-accepted viewpoint,” she said. “It’s far too early in the process of genome-wide association studies to conclude that they have not been fruitful.”


Canadian Initiative Developing Platform to Map Human Interactome, Eyes International Consortium

This story originally ran on July 1 and has been updated to include additional comments.

By Tony Fong

A multi-million dollar effort to create a technology platform to map the human interactome is underway in Canada with an eye to making it international.
Last month the Canada Foundation for Innovation awarded C$9.16 million ($7.89 million) to a national initiative to create a technology platform, bringing the total funding for the project to C$22.9 million ($19.7 million).

A total of 12 universities throughout Canada will be working on the interactome project.

Once the national technology platform becomes operational, the plan is to bring in institutions and partners from around the globe in an international push to create a complete set of cellular interaction networks.

In an interview with ProteoMonitor this week, Benoit Coulombe, who is heading the Canadian work and is a professor and director of the Proteomics Discovery Platform at the Institut de Recherches Cliniques de Montreal, said that the national technology platform comprises the 12 universities along with their instruments, methods, workflows, and expertise in elucidating the human interactome.

Much of the funding will be directed at purchasing new equipment and renovating facilities. The C$9.16 million in funding from CFI, an independent corporation created by the Canadian government, is for infrastructure. The remaining C$13.74 million, which comes from other partners such as the province of Quebec and companies such as Thermo Fisher Scientific, also will be used for infrastructure costs, not operational expenses, Coulombe said.

Among the new equipment that will be purchased are: Thermo Fisher’s Orbitrap mass spectrometers; Illumina’s Genome Analyzer and Applied Biosystems’ SOLiD second-generation DNA sequencing platforms; robotic liquid handlers; confocal microscopes; and other instruments.

While the 12 universities are already mapping the human interactome, the national initiative brings them together in a collaborative mode that can lead to greater efficiency, more reliable results, and generally better science, Coulombe said.

“The idea of this technology platform is that we put together 12 universities across Canada … that already have activities in protein-protein interaction or interactome studies,” he said. In a virtual manner, “these 12 institutions [will now] sit around the same table and plan their activities relating to protein-protein, protein-RNA interaction studies, et cetera. … Now we have a coordinated platform and now we can plan the equipment [and] the technology pipeline that we want to run.”

New methods development, especially in computational approaches, will also be part of the initiative.

The schools involved in the effort are IRCM, which is affiliated with the University of Montreal; the Centre for Cellular and Biomolecular Research at the University of Toronto; the Samuel Lunenfeld Research Institute at the Mt. Sinai Hospital; the Ottawa Institute of Systems Biology at the University of Ottawa; the Université de Sherbrooke; Dalhousie University; the University of Victoria; the University of British Columbia; the University of Manitoba; the Institut de Recherche en Immunologie et en Cancérologie at the University of Montreal; McGill University; and the Université Laval.

Because each participating institution has its own area of expertise, the initiative will allow researchers to tap into information that they otherwise might not have access to, Coulombe said. In addition, the organizational structure will facilitate interlaboratory work among the participants, which could improve reproducibility, he added.

When different schools perform a similar experiment, it will be important that common standard operational procedures are in place and followed “so that the data that comes out of the many sites…are comparable,” Coulombe said.

“The only way to achieve this is through communication between the sites. So if some of the sites combine their efforts in [a] project, we have to be able to tell the funders that when we do the same type of experiments in different locations, we’re doing it in a way that the data can be compared, is reproducible, [and] is complementary but can be put together,” he said. “So this is one of the important virtues of this type of platform.”

The initiative is currently performing a multi-site pilot project comparing affinity purification techniques. Each site, using similar equipment and analytical methods for the same proteins, is generating data, which will then be analyzed to determine what steps need to be taken to resolve differences between different labs.

In addition, they are investigating methods aside from mass-spec based technologies to monitor protein-protein interactions such as yeast 2-hybrids and luminescence-based mammalian interactome technology, or LUMIER, Coulombe said.

Within six months, most of the new equipment should be installed and the national platform should be “90 percent operational.” In a year, “we plan to have operational funding for at least one big interactome project,” he said.

If that happens, it would be one of the few examples of such a project. While there have been calls in the past for a large-scale human interactome mapping effort, such proposals have failed to take flight and most of the current work has been confined to individual labs. According to Tony Pawson of the Samuel Lunenfeld Research Institute and a participant in the Canadian effort, only about 5 percent of the human interactome has been mapped to date.

The most prominent proponent of a coordinated interlaboratory approach to describing the human interactome has been Marc Vidal, an associate professor of genetics at the Harvard Medical School, who in 2006 published an article in The Scientist advocating for a $100 million investment into a large-scale human interactome mapping effort. While the funding agencies never took him up on his advice, a number of smaller individual efforts have been started since then, he told ProteoMonitor.

The Center for Cancer Systems Biology at the Dana Farber Cancer Institute, of which Vidal is director, has also adopted the Human Interactome Mapping Project as its flagship project.

“We’re not quite there yet … if you were to compare us to the genome sequencing project at its peak, but it’s definitely starting to crystalize a bit,” he said. “People are getting together, people are publishing four, five, six groups together. … I also think that the field as a whole is already past the single lab, single R01 [stage].”

In January, he and a cadre of other collaborators published a series of articles in Nature Methods describing research into the interactomes of various organisms.

The Systems Biology Center New York has also been exploring the idea of a Quantitative Human Interactome Project to “experimentally obtain kinetic constants for cellular interactions between all of the proteins encoded by the human genome and construct a database of these parameters,” according to a report it released in March 2008.

Coulombe said that the Canadian initiative is the only one he knows of that pulls together the resources of so many institutions and directs it at the human interactome.

But at a time when other similar projects, such as mapping the human proteome, have failed to gain any traction, and protein-protein interactions within the human model are still poorly understood, are Coulombe and his peers jumping ahead of themselves with their ambitions to map the human interactome, which looks not only at protein-protein interactions but also at protein-DNA and protein-RNA interactions?

They don’t see it that way. Pawson said that the technology has reached the stage where “it’s really feasible to think about doing these things on a large scale, and also very importantly, people who use different approaches … are starting to talk to each other much more extensively.”

Indeed, while the funding announced last week focuses on building the national technology platform, Coulombe and others in the initiative are already looking ahead to a large-scale effort that would involve researchers from across the globe to map the human interactome. That effort is called the International Interactome Initiative, or I3.

“This is one of the projects that we hope will be supported by the platform,” Coulombe said. “The national platform is the technology platform in Canada that will serve in the international interactome initiative.”

The Canadian initiative and the proposed I3 plan come out of a project called the Human Proteotheque Initiative that Coulombe has been working on for several years to chart protein interactions that regulate cell growth, differentiation, and disease progression [see PM 08/02/07].

“What you see now [with I3] is the evolution of this initiative,” Coulombe said. “We’re building our way to the interactome.”

He and others involved in trying to get I3 off the ground have created a steering committee “that includes key players in the interactome field from the US, from Europe and from Canada,” that is exploring funding opportunities for the project and setting scientific objectives, Coulombe said, adding that he hopes to have funding for I3 secured next year so that research can begin in early 2011.

“With this international consortium, we feel that if we have appropriate funding, by joining efforts and technologies such as affinity purifications, mass spectrometry, yeast 2-hybrids, protein complementation assays, LUMIER … in five years we [could] have a draft map of the interactome with pretty much full coverage,” Coulombe said.


What can DNA tell us? Place your bets now

08 July 2009 by Lewis Wolpert and Rupert Sheldrake
Magazine issue 2716


From Newton to Hawking, scientists love wagers. Now Lewis Wolpert has bet Rupert Sheldrake a case of fine port that: “By 1 May 2029, given the genome of a fertilised egg of an animal or plant, we will be able to predict in at least one case all the details of the organism that develops from it, including any abnormalities.” If the outcome isn’t obvious, then the Royal Society will be asked to adjudicate. Watch this space…

Lewis Wolpert

I HAVE entered into this wager with Rupert Sheldrake because of my interest in the details of how embryos develop, and how our understanding of this process will progress. In my latest book, How We Live and Why We Die, I suggest that it will one day be possible to predict from an embryo’s genome how it will develop, and I believe it is possible for this to happen in the next 20 years.

I am, in fact, being a little over-keen because 40 years is a more likely time frame for such a breakthrough. Cells and embryos are extremely complicated: for their size, embryonic cells are the most complex structures in the universe.

Animals develop from a single cell, a fertilised egg, which divides to produce cells that will form the embryo. How that egg develops into an embryo and newborn animal is controlled by genes in the chromosomes. These genes are passive: they do nothing, just provide the code for proteins. It is proteins that determine how cells behave. While the DNA in every cell contains the code for all the proteins in all the cells, it is the particular proteins produced in particular cells that determine how those cells behave.

Every cell of the embryo contains many copies of several thousand different proteins. These proteins have a plethora of functions: acting as enzymes to break down and build other molecules, providing structures for the cell, interacting with each other, and many more. The complexity of the interactions between millions of molecules is amazing.

As the proteins determine how the cells behave, it is their activity that causes the embryo to develop. Underlying this process, though, are the genes, as they control which proteins are made – including some proteins that activate specific genes. It is essential that there is this control over which cells continue to divide, and of mechanisms to pattern the embryo so that different cells develop into different structures, such as the brain or limbs.

There is a huge incentive to understand these processes and so be able to work out the development of an embryo given only its genome. This ability could pave the way for regenerative medicine by allowing scientists to program stem cells to become structures that could replace damaged parts of the body.

To win the bet, we will have to be able to predict the behaviour of almost all the cells in the embryo. In a small worm, say the nematode Caenorhabditis elegans, there are 959 cells, making it the ideal model to solve this problem. It is a major challenge, but advances in cell biology, systems biology and computing will take us there.

Rupert Sheldrake

LEWIS WOLPERT’s faith in the predictive power of the genome is misplaced. Genes enable organisms to make proteins, but do not contain programs or blueprints, or explain the development of embryos.

The problems begin with proteins. Genes code for the linear sequences of amino acids in proteins, which then fold up into complex three-dimensional forms. Wolpert’s wager presupposes that the folding of proteins can be computed from first principles, given the sequence of amino acids specified by the genes. So far, this has proved impossible. As in all bottom-up calculations, there is a combinatorial explosion. For example, by random folding, the amino-acid chain of the enzyme ribonuclease, a small protein, could adopt more than 10^40 different shapes, which would take billions of years to explore. In fact, it folds into its habitual form in 2 minutes.
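The arithmetic behind this argument can be sketched in a few lines of Python. This is a rough illustration only: the sampling rate of 10^13 shapes per second is an assumed figure that does not come from the essay, and the shape count is taken as 10^40 (ten to the fortieth power).

```python
# Back-of-the-envelope illustration; ASSUMED_RATE is an assumption, not a figure from the essay.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # one Julian year in seconds

N_SHAPES = 1e40                # number of possible shapes quoted in the essay
ASSUMED_RATE = 1e13            # shapes tried per second (assumed for illustration)
OBSERVED_FOLD_SECONDS = 2 * 60 # folding time quoted in the essay


def years_to_try_all(n_shapes, rate_per_second):
    """Years needed to try every shape once at a constant rate."""
    return n_shapes / rate_per_second / SECONDS_PER_YEAR


def shapes_tried_in(seconds, rate_per_second):
    """Number of shapes that can be tried in the given time at a constant rate."""
    return seconds * rate_per_second


years = years_to_try_all(N_SHAPES, ASSUMED_RATE)
tried = shapes_tried_in(OBSERVED_FOLD_SECONDS, ASSUMED_RATE)

print(f"Exhaustive search: about {years:.1e} years")
print(f"Shapes that fit into 2 minutes: about {tried:.1e}")
```

Under that assumed rate, an exhaustive search would take on the order of 10^19 years, while only about 10^15 shapes could be tried in the observed two minutes. That gap is what the random-folding argument rests on.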

Even if we could solve protein-folding, the next stage would be to predict the structure of cells on the basis of the interactions of millions of proteins and other molecules. This would unleash a far worse combinatorial explosion, with more possible arrangements than all the atoms in the universe.

Random molecular permutations simply cannot explain how organisms work. Instead, cells, tissues and organs develop in a modular manner, shaped by morphogenetic fields, first recognised by developmental biologists in the 1920s. Wolpert himself acknowledges the importance of such fields. Among biologists, he is best known for “positional information”, by which cells “know” where they are within the field of a developing organ, such as a limb. But he believes morphogenetic fields can be reduced to standard chemistry and physics. I disagree. I believe these fields have organising abilities, or systems properties, that involve new scientific principles.



UK Academy Offers Recommendations to ‘Fulfill Promise’ of GWAS

By a GenomeWeb staff reporter

July 09, 2009

NEW YORK (GenomeWeb News) – The UK’s Academy of Medical Sciences is calling for greater investment into genome-wide association studies and has offered several recommendations for ways to ensure that GWAS findings will ultimately benefit patients.

In a report released yesterday, AMS noted that “despite the many successes and exciting potential of GWA studies, there is considerable scope to further capitalize on the opportunities and secure real benefits for healthcare,” though it adds that “fulfilling this promise will take time.”

AMS is a network of biomedical scientists from commercial and academic organizations in the UK. The report’s recommendations are drawn from an October 2008 meeting on GWAS that was supported by GlaxoSmithKline.

A key challenge for GWAS, the report notes, is “moving from a statistical indication that a gene variant or region of DNA is involved in a disease, to locating and identifying causal variants and the associated biological pathways.” This challenge “can only be met by greater integration between three historically distinct approaches to disease causality: genetic mapping, epidemiology and studies of pathophysiological mechanisms.”

In order to move knowledge gained from GWAS closer to medical use, AMS said, there is a need to identify additional factors that contribute to genetic variance, including the role of SNPs, CNVs, and epigenetics.

This will require the collection of samples from diverse populations for multiple diseases that have “some commonality of clinical datasets, patient consent and data access arrangements,” the group added.

AMS also sees a case for integrated epigenetic GWA studies that would combine sequence-based quantitative genetics and epigenome dynamics. “Crucial to this goal” will be initiatives that develop high-resolution reference epigenome maps such as the Alliance for the Human Epigenome and Disease, the report said.

The group also noted that researchers must have access to high-quality data from prospective studies as well as access to population-based samples such as the UK Biobank and disease registries.

There should also be further investment in bioinformatics and statistical methods to interpret sequence data, and tools must be developed to assess gene-gene and gene-environment effects and clinical endpoints, AMS advised.

AMS also pointed to a need for studies of differences in gene expression using diverse tissue types, and it urged the development of improved in vivo and in vitro models for assessing human causal variants.

The aim is to begin to translate the “wave of genetic findings” on common diseases into improved diagnostics, preventions, and treatments.

GWA studies also could contribute more to understanding genetic variation if their results were transferred to other populations, if re-sequencing studies were conducted to find rare variants, and if there were more cohort studies, AMS advised.

“Continued investment will be needed to translate new knowledge into benefits for patients and to ensure that the UK maintains a leading international position in this exciting area,” John Bell, president of the AMS, said in a statement.


SOLiD System

Shaf Yousaf, head of the Applied Biosystems SOLiD System, talks about improvements to the genome sequencer.


ABI Says SOLiD Found More Forms of Variation in HapMap than Illumina; No Formal Comparison Yet

July 07, 2009

By Julia Karow

Applied Biosystems’ SOLiD platform discovered more forms of genome variation in a HapMap sample with less coverage than Illumina’s Genome Analyzer did in a previous analysis of the same sample, although the overall number of variations discovered was smaller, according to an ABI researcher. However, a formal comparison of the two studies — which both used sequencing platforms that have since improved — has not been published to date.

Last month, a team of researchers led by the Life Technologies division published online in Genome Research the genome sequence and analysis of an African man, HapMap sample NA18507, based on data from the SOLiD platform (see In Sequence 6/23/2009). The same sample was sequenced on the Illumina Genome Analyzer by a group led by Illumina scientists, who published their analysis in Nature last fall (see In Sequence 11/11/2008).

Kevin McKernan, senior director of SOLiD scientific operations at ABI and the corresponding author of the Genome Research paper, said that he and his team were able to detect “more forms of variation” than the Illumina group with half the coverage, “perhaps not in number but in structural complexity.”

Overall, the Illumina team, using the Genome Analyzer I, sequenced the sample to about 40-fold average depth and reported approximately 4 million SNPs, as well as 400,000 short insertions and deletions up to 16 bases, and 5,704 structural variants ranging in size from 50 bases to more than 35 kilobases. They generated 35-base reads from libraries with 200-base and 2-kilobase inserts.

The ABI researchers, on the other hand, used the SOLiD 2.0 to sequence the sample to 18-fold haploid coverage and identified 3.87 million SNPs as well as 226,529 small intra-read indels, 5,590 large indels between mate-paired reads, 91 inversions, and 4 gene fusions. They produced 25-base and 50-base mate-paired reads with inserts up to 3.5 kilobases as well as single-end 50-base reads.

Two reads of SOLiD data were required to detect a SNP, whereas previous Illumina publications suggest three or four Illumina reads are needed for the same purpose, McKernan said, attributing this difference to the SOLiD platform’s high accuracy.
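To give a rough sense of how a minimum-read requirement interacts with sequencing depth, the following Python sketch estimates the fraction of sites that would receive at least a given number of reads. It is only an illustration under simplifying assumptions that are not taken from either paper: read depth at a site is modeled as Poisson-distributed, and at a heterozygous site each allele is assumed to receive half of the reported average depth. The function name `prob_at_least` is hypothetical.

```python
import math

def prob_at_least(k, mean_depth):
    """Return P(X >= k) for X ~ Poisson(mean_depth)."""
    # Sum the probabilities of seeing 0 .. k-1 reads, then take the complement.
    p_fewer = sum(
        math.exp(-mean_depth) * mean_depth ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - p_fewer

# Depths and read thresholds quoted in the article; the halving of depth
# per allele is an assumption made for this illustration only.
scenarios = [
    ("18-fold depth, 2 reads required", 18, 2),
    ("40-fold depth, 3 reads required", 40, 3),
    ("40-fold depth, 4 reads required", 40, 4),
]

for label, depth, min_reads in scenarios:
    per_allele_depth = depth / 2
    print(label, round(prob_at_least(min_reads, per_allele_depth), 4))
```

Real variant callers also weigh base quality, mapping quality, and platform-specific error models, so these figures are not a substitute for the formal comparison the article says has not been published.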

In addition, he said, having 50-base paired reads enabled his team to “fill in the blind spot of variation detection” — meaning deletions 20 to 100 base pairs in length — by detecting split reads and contracted or expanded mate pairs. “This blind spot reduction is a huge step forward in next-gen tools starting to completely supplant old-generation features,” he told In Sequence by e-mail last week.
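The idea of detecting indels from contracted or expanded mate pairs can be sketched in a few lines of Python. This is a minimal illustration and not the method used in the Genome Research paper: the function name `classify_pairs`, the threshold of three standard deviations, and all numbers in the usage example are hypothetical. A pair whose ends map farther apart on the reference than the library insert size suggests sequence missing from the sample (a deletion), and a pair that maps closer together suggests extra sequence in the sample (an insertion).

```python
def classify_pairs(observed_spans, expected_mean, expected_sd, n_sd=3.0):
    """Label each mate pair by how its mapped span compares with the library.

    observed_spans: distances (in bases) between mapped mates on the reference.
    expected_mean, expected_sd: insert-size distribution of the library.
    n_sd: hypothetical cutoff, in standard deviations, for calling a pair discordant.
    """
    upper = expected_mean + n_sd * expected_sd
    lower = expected_mean - n_sd * expected_sd
    labels = []
    for span in observed_spans:
        if span > upper:
            # Mates map too far apart: the sample may lack sequence here.
            labels.append("possible deletion")
        elif span < lower:
            # Mates map too close together: the sample may carry extra sequence.
            labels.append("possible insertion")
        else:
            labels.append("concordant")
    return labels

# Hypothetical library with a 1,500-base mean insert and 150-base spread.
print(classify_pairs([1500, 2600, 400], expected_mean=1500, expected_sd=150))
```

A per-pair cutoff like this only catches events larger than the spread of the insert-size distribution. Smaller events, such as the 20- to 100-base deletions mentioned above, would in practice require pooling many pairs that span the same locus or examining split reads.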

He and his colleagues were also able to resolve haplotype phases from mate-pair data, “which has never been done genome-wide before on a next-generation [sequencing] platform,” he said.

Their analysis also found that “large inserts are especially valuable for structural variation discovery,” he said. Since about half of all breakpoints are thought to be in repeats, he explained, “longer inserts increase our ability to uniquely place one end of a pair in unique sequence.”

The ABI scientists did not include a direct comparison of their results with Illumina’s in their paper, because at the time of submission — Genome Research received the manuscript Feb. 1 — only the genotype locations were publicly available for the Illumina study but not the genotype calls, according to McKernan.

He and his colleagues did compare the indels found in both studies and found that Illumina did not see the same preference for even-sized indels greater than four bases in size that “SOLiD and most other technologies” have detected in the human genome, he said.

According to David Wheeler, an associate professor at Baylor College of Medicine who had not studied the two papers in detail yet, “the results should be fairly comparable, and it would be very interesting to see whether there are any biases in the two technologies.”

Wheeler was involved in the sequencing of Jim Watson’s genome with the Roche/454 platform and, more recently, in sequencing human cancer genomes using the SOLiD technology (see In Sequence 5/12/2009). “I would really expect the two analyses to overlap substantially, but it would be definitely a comparison that should be done,” he said.

Wheeler cautioned that both technologies have improved since the data for the two studies was produced, and that if Illumina repeated its study today with its longer reads, the results would probably be better.

Though the two papers are currently the only published studies of the same human genome sample analyzed independently on two next-generation sequencing platforms, other such comparisons have been carried out.

At the Biology of Genomes conference this spring, for example, Gonçalo Abecasis, a researcher at the University of Michigan and co-chair for the analysis group of the 1000 Genomes Project, said that one of the trio samples that is part of a pilot study for the project was sequenced independently at 30-fold depth coverage on the Illumina and the SOLiD platforms. The best results, he said, emerged when both datasets were combined. “Each platform has different characteristics; none of them is uniformly better than the other,” Abecasis said at the time (see In Sequence 5/12/2009).

Illumina did not respond to a request for comment before deadline.
