Sequencing Study Highlights Genetic Diversity in Southern Africa

By a GenomeWeb staff reporter

NEW YORK (GenomeWeb News) – An international research team reported in Nature today that it has characterized five human genomes from southern Africa, identifying millions of SNPs never before found in the human population.

The American, African, and Australian researchers sequenced the full genomes of two African individuals: a member of a hunter-gatherer population in the Kalahari desert known as the Bushmen, San, or Khoisan, and a Bantu individual from South Africa — Nobel peace prize winner Archbishop Desmond Tutu. After sequencing the exomes of three other Khoisan men, the team compared all five genomes, identifying more than 1.3 million previously undetected SNPs.

During their subsequent analyses, they found genetic differences not only between southern African populations and populations from other parts of Africa and the world but also within the Khoisan population. The findings may eventually inform everything from studies of human population history and adaptations to agriculture to personalized medicine strategies in southern Africa.

“On average, there are more genetic differences between any two Bushmen in our study than between a European and an Asian,” co-lead author Stephan Schuster, a biochemistry and molecular biology researcher with Pennsylvania State University’s Center for Comparative Genomics and Bioinformatics, said in a statement.

Southern Africa is believed to be the source of modern humans and, consequently, is home to a great deal of human genetic diversity. But despite the decades-long effort to characterize the human genome and human population genetics, most studies have lacked representatives from this region, Schuster explained during a telephone briefing with reporters this morning.

In an effort to get a better sense of the genetic variation within humans, he and his team set out to characterize the genomes of individuals from the Khoisan population — thought to be the oldest modern human population. Schuster described the project at the American Society for Human Genetics meeting last fall, though this paper marks the first publication from the sequencing effort.

Archbishop Tutu, who has ancestry from Sotho-Tswana and Nguni language groups, which represent roughly 90 percent of southern Africans, also participated in the study. Tutu was a good candidate not only because of his ancestry but also because he is known to have survived polio, tuberculosis, and prostate cancer and because he is a voice for southern Africa and indigenous populations, senior author Vanessa Hayes, a cancer genetics researcher at the University of New South Wales, told reporters.

The four Khoisan men who participated in the study all came from different communities in Namibia’s Kalahari Desert. Each was the eldest member of his community.

The team sequenced the genome of a Khoisan man named !Gubi to 10.2 times coverage using paired-end sequencing with the Roche 454 GS FLX Titanium platform. !Gubi is believed to be around 86 years old and lives in the southern Kalahari on the Namibia-Botswana border. They also used a similar approach to sequence the genome of a second Khoisan man named G/aq’o, from a community in the northern Kalahari, to about two times coverage.

Meanwhile, the researchers sequenced Tutu’s genome using the Applied Biosystems SOLiD 3.0 platform, generating sequence covering the genome about 12.3 times.

For the exome sequencing portion of the study, the team captured protein-coding sequences for each of the five individuals with the NimbleGen 2.1 M array and sequenced them by Roche 454 Titanium sequencing.

The genomic and exomic sequences were verified using a range of approaches, including genotyping and whole-genome and exome sequencing with the Illumina platform, which was used to sequence !Gubi’s genome to 23.2 times coverage and Tutu’s genome to 7.2 times coverage.

When the team compared the genomes and exomes to version 18 of the human reference genome and eight personal genomes sequenced, they found 1.3 million previously undetected SNPs, including 13,146 new SNPs that alter the amino acid sequence of 7,720 genes.

!Gubi’s genome contained more SNPs than Tutu’s, though both contained more SNPs overall — and more novel SNPs — than any other individual genome sequenced so far.

And from their population-level analyses, the researchers detected at least as many genetic differences between the Khoisan and West African populations as between West African and European populations.

The team’s preliminary peek at the functional role of genes affected by new SNPs in the Khoisan population suggests that these variants tend to fall in genes involved in immune response, reproduction, and sensory perception.

“We believed that because of their extremely long lineage, their genome would be very different,” co-lead author Webb Miller, a researcher at Penn State’s Center for Comparative Genomics and Bioinformatics, told reporters. And, he said, the findings so far support that hypothesis.

This type of genetic diversity within the human genome is believed to have helped humans thrive over thousands of years, Schuster said, though he emphasized that modern human genomes from all around the world still share far more similarities than differences. “We are genetically one healthy species,” he said.

The team believes understanding human genetic diversity in southern Africa will likely be medically important, both for developing personalized medicine in this region and for identifying and understanding the roles of rare variants in human health and disease in general.

“Adding the described variants to current databases will facilitate the inclusion of southern Africans in medical researchers’ efforts, particularly when family and medical histories can be correlated with genome-wide data,” the researchers wrote.

The researchers have already started developing microarrays incorporating the newly identified southern African SNPs. For the next phase of the study, they plan to use these microarrays to genotype hundreds of individuals from southern Africa.

The Top 10 Everything of 2009

2. The Human Epigenome, Decoded

By EBEN HARRELL Tuesday, Dec. 08, 2009

The decoding of the human genome nearly a decade ago fueled expectations that an understanding of all human hereditary influences was within sight. But the connections between genes and, say, disease turned out to be far more complicated than imagined. What has since emerged is a new frontier in the study of genetic signaling known as epigenetics, which holds that the behavior of genes can be modified by environmental influences and that those changes can be passed down through generations. So people who smoke cigarettes in their youth, for example, sustain certain epigenetic changes, which may then increase the risk that their children’s children will reach puberty early. In October, a team led by Joseph Ecker at the Salk Institute in La Jolla, Calif., studied human skin and stem cells to produce the first detailed map of the human epigenome. By comparing this with the epigenomes of diseased cells, scientists will be able to work out how glitches in the epigenome may lead to cancers and other diseases. The study, which was published in the journal Nature, is a giant leap in geneticists’ quest to better understand the strange witches’ brew of nature and nurture that makes us who we are.

Personal Genome Project Sees Whole-Genome Sequencing as ‘Increasingly a Viable Option’

Personal Genome Project

By Julia Karow

This article was originally published Oct. 14.

Organizers of Harvard Medical School’s Personal Genome Project said that as the cost of DNA sequencing declines, they are considering whole-genome sequencing rather than exome sequencing for the second phase of the study, PGP-100.

The project has already added to its website results from the genome of its founder and principal investigator, George Church, whose genome was recently sequenced by Complete Genomics.

Launched in 2007 with 10 participants, the PGP aims to sequence the genomes of 100,000 people and to correlate their genotypes with trait information. In April, the project said that it plans to scale up to 100 participants for its second phase (see In Sequence 5/5/2009).

In a newsletter e-mailed last week to individuals interested in the study, PGP organizers said that they have been closely monitoring the decrease in cost of whole human genome sequencing “because it will impact our sequencing strategy for the PGP-100.”

The cost has already fallen to less than $50,000 per genome, PGP said, “with some speculating that the arrival of $5,000 genomes is imminent,” a reference to Complete Genomics’ $5,000 human genome sequencing service, which is scheduled to launch in January.

The project’s initial strategy, according to the organizers, was to focus on the exome, since it is “information rich” and seemed “a more economical alternative” to whole-genome sequencing.

“However, the cost of exome sequencing has not fallen as rapidly as whole-genome sequencing,” they noted, and as a result, for the PGP-100, “the decision to pursue whole genome is increasingly a viable option.”

It will depend, though, on factors such as the project’s ability to raise funding as well as “the willingness of sequencing companies to publicly showcase their technologies through sponsorship of PGP-100 genomes.”

According to its website, the PGP is funded by donations from individuals, Google, Orbimed, the COUQ Foundation, a grant from the Broad Institute, technology development grants from the Department of Energy and NIH, and in-kind support from various organizations. PersonalGenomes.org, a 501(c)3 charitable organization, seeks to raise $1.5 million in donations for the project this year from foundations, private companies, and individuals.

The project has already posted results from an analysis of Church’s genome, which was recently sequenced by Complete Genomics. Church told In Sequence last month that the company has committed to sequencing nine additional PGP genomes, though it was unclear whether it will charge the PGP for its services (see In Sequence 9/15/2009).

For the interpretation of the genomic information, the PGP is using Trait-o-matic, an open-source tool developed in house, which automatically identifies, filters, and annotates genetic variants. The project plans to use the software to generate research reports that contain variants “that may be of potential significance” and has already generated prototypes of such reports for its first 10 participants, based for nine of them on partial exome sequence data.

Future releases of Trait-o-matic will “enable a community of volunteers to annotate and interpret integrated genomic and trait datasets from the PGP.”

The Technology Pilot

If you’ve been wondering how the 1,000 Genomes Project is doing, here’s an account from Dan Koboldt at his MassGenomics blog about last week’s meeting at Baylor where participants discussed the “Pilot 3” phase of the project. “Unlike pilots 1 and 2, which emphasized whole genome sequencing to low or high coverage, respectively, in Pilot 3, the exons of 1,000 genes (~1.5 Mbp total) were selectively targeted for sequencing by capture technologies,” Koboldt writes. The team is also checking data across platforms and pipelines. “Overall, the Pilot 3 variant calls are looking good – dbSNP concordances in the 70-80% range or higher, and transition/transversion ratios of about 3-3.50 – and consistent across 454 and Solexa data from multiple centers,” he writes.
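For readers unfamiliar with the transition/transversion ratio Koboldt mentions, the following is a minimal Python sketch of how such a ratio could be computed from a list of (reference, alternate) base pairs. The function name and the toy SNP list are illustrative assumptions and are not taken from the 1000 Genomes pipelines.

```python
# Transitions swap a purine for a purine (A<->G) or a pyrimidine for a
# pyrimidine (C<->T); every other single-base substitution is a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def ti_tv_ratio(snps):
    """Return transitions / transversions for an iterable of (ref, alt) bases."""
    transitions = 0
    transversions = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip ambiguous calls and non-substitutions
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Hypothetical toy data: three transitions and one transversion.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
print(ti_tv_ratio(example_snps))  # 3.0
```

A ratio near 3 for coding regions, as in the quoted figures, is often read as a sign that the variant calls are not dominated by random errors, since random errors would tend to push the ratio toward 0.5.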

First genome-wide, single-base-resolution maps

Joe Ecker is senior author on a paper that provides the first genome-wide, single-base-resolution maps of methylated cytosines in a mammalian genome. Comparing human embryonic stem cells and fetal fibroblasts, they found “widespread differences” between the two, including that almost one-quarter of all methylation in embryonic stem cells was in a non-CG context, suggesting that embryonic stem cells may use different methylation mechanisms to affect gene regulation, they write.
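To make the "non-CG context" idea concrete, here is a small Python sketch that classifies a cytosine by the bases that follow it on the same strand, using the common CG / CHG / CHH convention (H stands for A, C, or T). The function names, the toy sequence, and the list of "methylated" positions are hypothetical; this is not the analysis code from the paper, and it looks at the forward strand only.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at seq[i] as 'CG', 'CHG' or 'CHH'.

    Returns None if seq[i] is not a C, if the downstream bases needed for
    the call are missing, or if they are not unambiguous A/C/G/T.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    if downstream[0] not in "ACT" or downstream[1] not in "ACGT":
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def non_cg_fraction(seq, methylated_positions):
    """Fraction of classifiable methylated cytosines that are not in a CG context."""
    contexts = [cytosine_context(seq, i) for i in methylated_positions]
    contexts = [c for c in contexts if c is not None]
    if not contexts:
        raise ValueError("no classifiable methylated cytosines")
    return sum(c != "CG" for c in contexts) / len(contexts)


# Hypothetical toy example: positions 1, 4 and 8 are treated as methylated.
toy_seq = "ACGTCAGTCTA"
print([cytosine_context(toy_seq, i) for i in (1, 4, 8)])  # ['CG', 'CHG', 'CHH']
```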

Several opinion pieces appear this week. One checks in with experts and their concerns regarding the stimulus grants, another looks at the open-source Polymath Project, while a third by Cameron Neylon examines the potential of Google’s open-source collaboration tool, Google Wave.

A special insight section explores the changing landscape of neuroscience research. Says an editorial, “The experimental landscape has changed markedly over the past few years, given the technological advances in molecular genetics, optogenetics and functional imaging.” Articles cover molecular genetics and imaging technologies for circuit-based neuroanatomy, neuroscience and systems biology, and multimodal techniques for diagnosing Alzheimer’s disease.

Research led by Joel Levine, a neuroscientist from the University of Toronto, has determined that Drosophila melanogaster flies use a single chemical to communicate gender and sibling identity in order to pick the right sex partners. By inserting a transgene into the fly’s genome that killed cells that produced these special hydrocarbon signaling chemicals, they report that hydrocarbon-free male flies attempted copulating with each other, says a story at the BBC. Check out the accompanying video, too.

Genetics Suggest Population Expansion in Africa Began in Stone Age

July 29, 2009
By Andrea Anderson

NEW YORK (GenomeWeb News) – Modern human populations started expanding some 40,000 years ago, according to a paper appearing online today in PLoS ONE.

Researchers from the University of Arizona and the University of California at San Francisco used multi-locus sequence analysis to assess genetic signatures found in nearly 200 individuals from seven populations around the world. Their results suggest human population expansions in Africa started about 40,000 years ago during the Stone Age — a more recent expansion time than that predicted from previous studies.

“[B]oth hunter-gathers (San and Biaka) and food-producers (Mandenka and Yorubans) best fit models with population growth beginning in the Late Pleistocene,” senior author Michael Hammer, a genetics researcher at the University of Arizona, and his co-authors wrote. “These dates are concurrent with the appearance of the Late Stone Age in Africa, supporting the hypothesis that population growth played a significant role in the evolution of Late Pleistocene human cultures.”

Previous studies based on mitochondrial DNA, Y-chromosome data, or autosomal microsatellites provided a broad range of estimates about when modern human population expansion began, dating as far back as a few hundred thousand years ago. But such estimates often conflict with one another and are based on one or a few sequences that may be under selective pressure, the researchers explained.

In an effort to generate more reliable data for teasing apart human population history, Hammer and his team used Sanger sequencing to re-sequence roughly 6,000 bases of nuclear DNA from each of about 20 autosomal non-coding regions for 184 individuals.

These regions were selected because they were sites with many crossing-over events but were also far from protein-coding genes and not likely to be under selection. By looking at all of the areas together, Hammer told GenomeWeb Daily News, it’s possible to overcome the noise detected at any single region.

The individuals tested belonged to seven different populations: San, Biaka, Mandenka, Yoruban, French Basque, Han Chinese, and Melanesian.

When the team analyzed their data using multi-locus analysis, they found evidence suggesting that both hunter-gatherer populations (such as the San from Namibia and the Biaka from the Central African Republic) and food-producer populations (such as the Mandenka from Senegal and Yorubans from Nigeria) began expanding roughly 40,000 years ago during the Late Pleistocene period.

That predates the advent of farming in Africa, Hammer noted, and is consistent with archeological evidence suggesting there was a burst of populations interacting and sharing tools and cultural innovations at that time.

Overall, the team concluded that human populations in Africa began a ten-fold expansion some 36,000 years ago. Their data hint that expansion may have been a tad earlier and faster in the hunter-gatherer population — about a 13-fold expansion starting about 41,000 years ago — than in the food-producing populations, which expanded approximately seven-fold starting some 31,000 years ago.
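As a rough way to compare the three estimates above, the fold changes and start dates can be converted into average per-year growth rates. The Python sketch below assumes, purely for illustration, that each population grew at a constant exponential rate from the stated start date to the present; the article does not say which growth model the authors fitted, so the resulting rates should be read as back-of-the-envelope figures only.

```python
import math


def exponential_growth_rate(fold_increase, years):
    """Per-year rate r such that exp(r * years) == fold_increase.

    Assumes constant exponential growth over the whole interval,
    which is a simplifying assumption made for this sketch.
    """
    return math.log(fold_increase) / years


# (fold increase, years before present), as reported in the article.
estimates = {
    "all African populations": (10, 36_000),
    "hunter-gatherers": (13, 41_000),
    "food producers": (7, 31_000),
}

for name, (fold, years) in estimates.items():
    rate = exponential_growth_rate(fold, years)
    doubling_time = math.log(2) / rate
    print(f"{name}: r = {rate:.2e} per year, doubling time about {doubling_time:,.0f} years")
```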

In the future, the team plans to do additional studies looking at more populations from different parts of the world. And, Hammer said, they also hope to employ next-generation sequencing technology to look at even more regions in the genome.

GWAS and Differences in DNA Between Tissues

Posted by Bob Grant
[Entry posted at 20th July 2009 04:52 PM GMT]
www.the-scientist.com/blog
Recent findings may spell trouble for genome-wide association studies based on DNA obtained through blood samples: Genetic material may vary between blood cells and other tissues in a single individual, a study in the July issue of Human Mutation reports.

The study “raises a very interesting question,” Howard Edenberg, director of the Indiana University School of Medicine’s center for medical genomics, told The Scientist. Many genome-wide association studies — especially studies on systemic diseases such as diabetes and atherosclerosis — depend solely upon DNA harvested from blood samples to identify genes associated with medical conditions. But this study “suggests that looking only at blood, you may miss some things.”

Searching for the genes behind a fatal condition called abdominal aortic aneurysm (AAA), researchers from McGill University in Montreal found that complementary DNA from diseased abdominal aortic tissue did not match genomic DNA from leukocytes in blood from the same patient. “We did not expect to find a difference in the tissue [genes] compared to the leukocyte [genes],” said endocrinologist Morris Schweitzer, who led the study.

Schweitzer and his team uncovered three single nucleotide polymorphisms (SNPs) in samples of diseased tissue from 31 AAA patients that were not present in matching blood samples. They also tested five aortic and blood samples from normal individuals and found the same discrepancy. Schweitzer said that the apparent genetic difference between different cells in the body may cast some doubt on genome-wide association studies that only use DNA from blood samples to infer disease states. “I think they may not be accurate because they might not reflect what’s in the tissue,” he said, adding that researchers should look upon such genetic results “very carefully and very trepidatiously.”

Edenberg, who was not involved with the study but who conducts genome-wide association studies to explore the genetic roots of alcoholism and bipolar disorder, said that while the findings are interesting, they are very preliminary. “If they’re correct about this, and there are these genomic differences between tissues and blood at certain alleles, then we’re missing some things,” he said. Edenberg explained that experimenters generally take into account that such studies are somewhat “underpowered” in terms of their ability to catch every genetic indicator of disease. Schweitzer’s results, he noted, may add another layer to this consideration, but do not suggest that genome-wide association studies would turn up false positives, or blood-based genes mistakenly attributed to a particular disease.

Sudha Seshadri, a Boston University neurologist who was not involved in the study, told The Scientist that though the McGill group’s results are important, they do not negate genome-wide association data that scientists have already gathered. “I don’t think [the study] says much about the usefulness or validity of genome-wide association studies as they are being done in cohorts around the world.” Genome-wide studies on diabetes, for example, have identified about 16 genes that are related (in varying degrees) to the disease, said Seshadri, who collaborates on the Framingham Heart Study, a six-decade longitudinal study on more than 5,000 people that has more recently included genomic data.

“I think I would have suggested a few more experiments, personally,” Edenberg added. In particular, he pointed to the fact that the McGill researchers were comparing complementary DNA from aortic tissue to genomic DNA from blood. “At the moment,” he said, the discrepancy “seems relatively compatible with RNA editing [rather] than with a genomic issue.” The study should have compared genomic DNA from the aortic tissues with the genomic blood DNA, and cDNA from both cell types, Edenberg said.

Schweitzer said his group is currently working on this experiment and “should have results probably in a couple of weeks.” He noted that differences between tissue and blood DNA may account for the relatively low levels of association turned up by most genome-wide association studies. Of all the genome-wide association studies that have been conducted, he said, “No one has really found that one miracle gene that really points to something.”

Seshadri, however, said it’s hasty to dismiss the value of such studies. “I think [the authors] make some provocative statements that express a viewpoint, but not a widely-accepted viewpoint,” she said. “It’s far too early in the process of genome-wide association studies to conclude that they have not been fruitful.”

UK Academy Offers Recommendations to ‘Fulfill Promise’ of GWAS

By a GenomeWeb staff reporter

July 09, 2009

NEW YORK (GenomeWeb News) – The UK’s Academy of Medical Sciences is calling for greater investment into genome-wide association studies and has offered several recommendations for ways to ensure that GWAS findings will ultimately benefit patients.

In a report released yesterday, AMS noted that “despite the many successes and exciting potential of GWA studies, there is considerable scope to further capitalize on the opportunities and secure real benefits for healthcare,” though it adds that “fulfilling this promise will take time.”

AMS is a network of biomedical scientists from commercial and academic organizations in the UK. The report’s recommendations are drawn from an October 2008 meeting on GWAS that was supported by GlaxoSmithKline.

A key challenge for GWAS, the report notes, is “moving from a statistical indication that a gene variant or region of DNA is involved in a disease, to locating and identifying causal variants and the associated biological pathways.” This challenge “can only be met by greater integration between three historically distinct approaches to disease causality: genetic mapping, epidemiology and studies of pathophysiological mechanisms.”

In order to move knowledge gained from GWAS closer to medical use, AMS said, there is a need to identify additional factors that contribute to genetic variance, including the role of SNPs, CNVs, and epigenetics.

This will require the collection of samples from diverse populations for multiple diseases that have “some commonality of clinical datasets, patient consent and data access arrangements,” the group added.

AMS also sees a case for integrated epigenetic GWA studies that would combine sequence-based quantitative genetics and epigenome dynamics. “Crucial to this goal” will be initiatives that develop high-resolution reference epigenome maps such as the Alliance for the Human Epigenome and Disease, the report said.

The group also noted that researchers must have access to high-quality data from prospective studies as well as access to population-based samples such as the UK Biobank and disease registries.

There should also be further investment in bioinformatics and statistical methods to interpret sequence data, and tools must be developed to assess gene-gene and gene-environment effects and clinical endpoints, AMS advised.

AMS also pointed to a need for studies of differences in gene expression using diverse tissue types, and it urged the development of improved in vivo and in vitro models for assessing human causal variants.

The aim is to begin to translate the “wave of genetic findings” on common diseases into improved diagnostics, preventions, and treatments.

GWA studies also could contribute more to understanding genetic variation if their results were transferred to other populations, if re-sequencing studies were conducted to find rare variants, and if there were more cohort studies, AMS advised.

“Continued investment will be needed to translate new knowledge into benefits for patients and to ensure that the UK maintains a leading international position in this exciting area,” John Bell, president of the AMS, said in a statement.

SOLiD System

Shaf Yousaf, head of the Applied Biosystems SOLiD System, talks about improvements to the genome sequencer.

ABI Says SOLiD Found More Forms of Variation in HapMap than Illumina; No Formal Comparison Yet

July 07, 2009

By Julia Karow

Applied Biosystems’ SOLiD platform discovered more forms of genome variation in a HapMap sample with less coverage than Illumina’s Genome Analyzer did in a previous analysis of the same sample, although the overall number of variations discovered was smaller, according to an ABI researcher. However, a formal comparison of the two studies — which both used sequencing platforms that have since improved — has not been published to date.

Last month, a team of researchers led by the Life Technologies division published online in Genome Research the genome sequence and analysis of an African man, HapMap sample NA18507, based on data from the SOLiD platform (see In Sequence 6/23/2009). The same sample was sequenced on the Illumina Genome Analyzer by a group led by Illumina scientists, who published their analysis in Nature last fall (see In Sequence 11/11/2008).

According to Kevin McKernan, senior director of SOLiD scientific operations at ABI and the corresponding author of the Genome Research paper, he and his team were able to detect “more forms of variation” than the Illumina group with half the coverage, “perhaps not in number but in structural complexity.”

Overall, the Illumina team, using the Genome Analyzer I, sequenced the sample to about 40-fold average depth and reported approximately 4 million SNPs, as well as 400,000 short insertions and deletions up to 16 bases, and 5,704 structural variants ranging in size from 50 bases to more than 35 kilobases. They generated 35-base reads from libraries with 200-base and 2-kilobase inserts.

The ABI researchers, on the other hand, used the SOLiD 2.0 to sequence the sample to 18-fold haploid coverage and identified 3.87 million SNPs as well as 226,529 small intra-read indels; 5,590 large indels between mate-paired reads, 91 inversions, and 4 gene fusions. They produced 25-base and 50-base mate-paired reads with inserts up to 3.5-kilobases as well as single-end 50-base reads.
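The coverage figures in the two studies can be related to read counts with the usual approximation coverage = (number of reads × read length) / genome size. The Python sketch below applies it to the numbers quoted above. The 3.1-billion-base genome size is an assumption (the article does not give one), the SOLiD run is simplified to 50-base reads even though 25-base reads were also used, and the results ignore unmapped or duplicate reads.

```python
# Assumed haploid human genome size in bases; not stated in the article.
GENOME_SIZE = 3.1e9


def reads_needed(coverage, read_length, genome_size=GENOME_SIZE):
    """Reads required so that reads * read_length / genome_size equals coverage."""
    return coverage * genome_size / read_length


def coverage_from_reads(n_reads, read_length, genome_size=GENOME_SIZE):
    """Average depth obtained from n_reads reads of the given length."""
    return n_reads * read_length / genome_size


illumina_reads = reads_needed(coverage=40, read_length=35)
solid_reads = reads_needed(coverage=18, read_length=50)  # simplification: all reads 50 bases

print(f"Illumina, 40-fold with 35-base reads: about {illumina_reads / 1e9:.1f} billion reads")
print(f"SOLiD, 18-fold with 50-base reads: about {solid_reads / 1e9:.1f} billion reads")
```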

Two reads of SOLiD data were required to detect a SNP, whereas previous Illumina publications suggest three or four Illumina reads are needed for the same purpose, McKernan said, attributing this difference to the SOLiD platform’s high accuracy.

In addition, he said, having 50-base paired reads enabled his team to “fill in the blind spot of variation detection” — meaning deletions 20 to 100 base pairs in length — by detecting split reads and contracted or expanded mate pairs. “This blind spot reduction is a huge step forward in next-gen tools starting to completely supplant old-generation features,” he told In Sequence by e-mail last week.
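The "contracted or expanded mate pairs" idea can be illustrated with a short Python sketch: if the two ends of a pair map farther apart on the reference than the library's insert size predicts, the sample may carry a deletion between them, and if they map closer together, it may carry an insertion. The function name, the threshold of three standard deviations, and the example numbers are all assumptions made for this illustration. This is not the SOLiD analysis pipeline, and real callers typically require many supporting pairs rather than judging a single pair.

```python
def classify_mate_pair(mapped_distance, expected_insert, insert_sd, k=3.0):
    """Label one mate pair by how far its mapped distance departs from the
    expected library insert size, using a k-standard-deviation threshold."""
    deviation = mapped_distance - expected_insert
    if deviation > k * insert_sd:
        # Ends map too far apart on the reference: sequence present in the
        # reference may be missing from the sample.
        return "expanded (possible deletion)"
    if deviation < -k * insert_sd:
        # Ends map too close together: the sample may contain extra sequence.
        return "contracted (possible insertion)"
    return "concordant"


# Hypothetical library: 3,500-base inserts with a 100-base standard deviation.
for distance in (3_450, 4_000, 3_000):
    print(distance, classify_mate_pair(distance, expected_insert=3_500, insert_sd=100))
```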

He and his colleagues were also able to resolve haplotype phases from mate-pair data, “which has never been done genome-wide before on a next-generation [sequencing] platform,” he said.

Their analysis also found that “large inserts are especially valuable for structural variation discovery,” he said. Since about half of all breakpoints are thought to be in repeats, he explained, “longer inserts increase our ability to uniquely place one end of a pair in unique sequence.”

The ABI scientists did not include a direct comparison of their results with Illumina’s in their paper, because at the time of submission — Genome Research received the manuscript Feb. 1 — only the genotype locations were publicly available for the Illumina study but not the genotype calls, according to McKernan.

He and his colleagues did compare the indels found in both studies and found that Illumina did not see the same preference for even-sized indels greater than four bases in size that “SOLiD and most other technologies” have detected in the human genome, he said.

According to David Wheeler, an associate professor at Baylor College of Medicine who had not studied the two papers in detail yet, “the results should be fairly comparable, and it would be very interesting to see whether there are any biases in the two technologies.”

Wheeler was involved in the sequencing of Jim Watson’s genome with the Roche/454 platform and, more recently, in sequencing human cancer genomes using the SOLiD technology (see In Sequence 5/12/2009). “I would really expect the two analyses to overlap substantially, but it would be definitely a comparison that should be done,” he said.

Wheeler cautioned that both technologies have improved since the data for the two studies was produced, and that if Illumina repeated its study today with its longer reads, the results would probably be better.

Though the two papers are currently the only published studies of the same human genome sample analyzed independently on two next-generation sequencing platforms, they are not the only such studies.

At the Biology of Genomes conference this spring, for example, Gonçalo Abecasis, a researcher at the University of Michigan and co-chair for the analysis group of the 1000 Genomes Project, said that one of the trio samples that is part of a pilot study for the project was sequenced independently at 30-fold depth coverage on the Illumina and the SOLiD platforms. The best results, he said, emerged when both datasets were combined. “Each platform has different characteristics; none of them is uniformly better than the other,” Abecasis said at the time (see In Sequence 5/12/2009).

Illumina did not get back before deadline with comment for this article.
