My personal and professional adventure into Science

The last wild horses on Earth – Przewalski

Yesterday our story about the Przewalski’s horses was published in Current Biology (Cell Press). The article describes the world’s last remaining truly wild horses, which comprise only around 2,100 individuals worldwide. Eleven Przewalski’s horses were sequenced and compared to the genomes of 28 domesticated horses to provide a more detailed look at the endangered species, which is shown to have diverged from domesticated horses ~45,000 years ago. Finally, the genomic impact of ~110 years of captivity is also assessed, revealing reduced heterozygosity, increased inbreeding, and variable introgression of domestic alleles, ranging from non-detectable to as much as 31.1%.
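For readers unfamiliar with the terms, here is a minimal Python sketch of what "reduced heterozygosity" and "increased inbreeding" mean numerically. It is only an illustration and not the method used in the paper: the function names, the toy genotypes and the expected heterozygosity of 0.4 are all made up for the example. It uses the textbook relation F = 1 - H_obs / H_exp.

```python
from typing import Sequence, Tuple

Genotype = Tuple[int, int]  # the two allele codes at one site, e.g. (0, 1)


def observed_heterozygosity(genotypes: Sequence[Genotype]) -> float:
    """Fraction of called sites where the two alleles differ."""
    if not genotypes:
        raise ValueError("need at least one called site")
    het_sites = sum(1 for a, b in genotypes if a != b)
    return het_sites / len(genotypes)


def inbreeding_coefficient(h_obs: float, h_exp: float) -> float:
    """F = 1 - H_obs / H_exp: the relative deficit of heterozygotes."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


# Toy genotypes, invented for illustration (not data from the study).
toy_sites = [(0, 1), (0, 0), (1, 1), (0, 1), (0, 0), (1, 1), (0, 0), (0, 0)]
h_obs = observed_heterozygosity(toy_sites)      # 2 heterozygous sites out of 8
f = inbreeding_coefficient(h_obs, h_exp=0.4)    # h_exp is an assumed value
print(f"H_obs = {h_obs:.3f}, F = {f:.3f}")
```

The fewer heterozygous sites an individual carries relative to what is expected for the population, the closer F gets to 1.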

For a great Photo story check out the Discovery news website.



Copyright: Ludovic Orlando

Copyright: Museum of domesticated animals, “Julius Kühn” at University of Halle-Wittenberg


Résumé of the Elixir Conference, 27-28 August 2015

Remember to follow the Elixir conference on Twitter

The first annual Danish Bioinformatics Conference, arranged by the Danish Elixir hub, is now under way in the old docks in Odense. If you are not attending the conference, you can still follow it on Twitter via the hashtag #elixirdk.

The program is very exciting, and so far we have had talks on Systems Biology and Medical Informatics, Proteomics Informatics and RNA Bioinformatics.

Tomorrow the talks will cover Population Genetics, Medical Genomics and an Industrial view on Danish Bioinformatics: Challenges and Opportunities. At the end of the conference there will also be a presentation about Computerome and cloud computing, which I am really looking forward to.

Elixir Denmark – First Danish Bioinformatics Conference in Odense on August 27-28, 2015

I have just signed up for the First Danish Bioinformatics Conference, which will take place in Odense, Denmark on August 27-28, 2015, and I can only urge you to do the same. The conference is organized by ELIXIR Denmark.

Who should join?
Well, everyone interested in bioinformatics, as the conference is relevant to all bioinformaticians who want to be updated on the latest developments within the fields of Protein, RNA and DNA Bioinformatics, Medical Genomics, Population Genetics, and Systems Biology.

When is the deadline for registering?
The deadline for registering is 10 August, but if you register before 15 July, the registration fee is a bit lower (DKK 1250 | approx. EUR 167) than after 15 July (DKK 1687 | approx. EUR 226). Students (including PhD students) attend free of charge!

If you want to present a poster the deadline is June 28. All the details, including program, can be found here: DKBiC-2015-leaflet4.pdf

So go to the website to register and then I’ll see you there 🙂


Solar Eclipse 2015 transmitted live

Tomorrow one of nature’s most spectacular phenomena will occur: a total solar eclipse. The Moon passes between the Earth and the Sun, and day will be turned into darkness in some places on Earth. Check here to see simulations of how large a percentage of the solar eclipse you will be able to see:

It is very important that you wear special solar eclipse glasses if you follow the eclipse outdoors. Special safety glasses are needed to avoid eye damage. Your camera may also be destroyed if you don’t use special filters.

If you didn’t get hold of any of these items, you can follow the event live here: Live transmission from DTU Space, according to a post from the DTU Library.

You can also watch it here or here.


Image from:

CEBio Metagenomics course February 2015

A great course has now come to an end, and I have finally gotten some sleep again. It was much needed. Unfortunately the forecast now says rain, rain, rain, just when I was hoping to get out and get a little tanned…

The course was a free course, organized by Guilherme Oliveira, on the topic of Metagenomics.

The course was aimed at students, post-docs and researchers new to metagenomics. It provided training in: experimental design, statistical considerations, sampling methods, data generation for 16S, shotgun and transcriptome studies, data analysis for diversity and functional studies, metadata analysis, and single-cell genomics.

Speakers included teachers from CEBio, Colombia, Egypt, USA and Denmark, a fantastic team that Guilherme once again had gathered. The list of instructors can be found here: Metagenomics Course 2015 _ instructors.cebio and the course schedule can be found here: Course Program _ metagenome.cebio.

Below are some pictures from throughout the course.

Our Severe Childhood Malaria paper is out in Cell Host & Microbe

Merry Christmas Everyone !
I just got an early Christmas present: I had not noticed until now that our Malaria paper was actually published in Cell Host & Microbe on 4 December. I am a little surprised that the journal did not send me an email…

Anyways, I am very happy to be involved in such important research so I hope you will enjoy reading the paper as much as I will 🙂

You can download the paper here: Structural Conservation Despite Huge Sequence Diversity Allows EPCR Binding by the PfEMP1 Family Implicated in Severe Childhood Malaria.



Clinton K.Y. Lau, Louise Turner, Jakob S. Jespersen, Edward D. Lowe, Bent Petersen, Christian W. Wang, Jens E.V. Petersen, John Lusingu, Thor G. Theander, Thomas Lavstsen, Matthew K. Higgins. Structural Conservation Despite Huge Sequence Diversity Allows EPCR Binding by the PfEMP1 Family Implicated in Severe Childhood Malaria. Cell Host & Microbe, available online 4 December 2014, ISSN 1931-3128.


The PfEMP1 family of surface proteins is central for Plasmodium falciparum virulence and must retain the ability to bind to host receptors while also diversifying to aid immune evasion. The interaction between CIDRα1 domains of PfEMP1 and endothelial protein C receptor (EPCR) is associated with severe childhood malaria. We combine crystal structures of CIDRα1:EPCR complexes with analysis of 885 CIDRα1 sequences, showing that the EPCR-binding surfaces of CIDRα1 domains are conserved in shape and bonding potential, despite dramatic sequence diversity. Additionally, these domains mimic features of the natural EPCR ligand and can block this ligand interaction. Using peptides corresponding to the EPCR-binding region, antibodies can be purified from individuals in malaria-endemic regions that block EPCR binding of diverse CIDRα1 variants. This highlights the extent to which such a surface protein family can diversify while maintaining ligand-binding capacity and identifies features that should be mimicked in immunogens to prevent EPCR binding.


Diversity and Conservation in the CIDRα1 Domains (A) The 14 completely conserved residues in CIDRα1 domains, shown as red sticks on the HB3var03 CIDRα1 structure. Residues with a property entropy score of less than 0.2 (but not totally conserved) are orange, and those with scores of 0.2–0.3 are yellow. The inset shows a surface representation in the same orientation and colors, showing that conserved residues cluster in the domain center. (B) A sequence logo showing variation in CIDRα1 residues that directly contact EPCR. (C and D) Structure of the EPCR-binding surface of the HB3var03 CIDRα1 domain. Residues shown as sticks make direct interactions with EPCR.
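To give a feel for what a "property entropy score" in the caption above measures, here is a rough Python sketch. It is an assumption-laden illustration, not the scoring used in the paper: the grouping of amino acids into six property classes and the normalisation to the range 0 to 1 are my own hypothetical choices. The idea is that a column of an alignment scores low when its residues share the same physico-chemical property, even if the letters themselves differ.

```python
import math
from collections import Counter

# Hypothetical grouping of amino acids by property (an assumption for
# illustration; the paper may group residues differently).
PROPERTY_GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
AA_TO_GROUP = {aa: group for group, aas in PROPERTY_GROUPS.items() for aa in aas}


def property_entropy(column: str) -> float:
    """Shannon entropy of property classes in one alignment column,
    normalised to 0..1 by the log of the number of classes.
    Gaps and unknown symbols are ignored."""
    classes = [AA_TO_GROUP[aa] for aa in column.upper() if aa in AA_TO_GROUP]
    if not classes:
        raise ValueError("column has no recognised residues")
    total = len(classes)
    entropy = sum(
        (n / total) * math.log(total / n) for n in Counter(classes).values()
    )
    return entropy / math.log(len(PROPERTY_GROUPS))


# Toy columns, invented for illustration.
for col in ("LLLLLL", "ILVILV", "KKDD", "LKDSFG"):
    print(col, round(property_entropy(col), 3))
```

A column such as ILVILV varies in sequence but stays within one property class, so it scores zero here, which is the kind of "conserved in shape and bonding potential despite sequence diversity" signal the abstract describes.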




Prehistoric genomes reveal the genetic foundation and cost of horse domestication

Our second PNAS paper on horse domestication 🙂

Here is a Danish press release from CBS.

Below is the press release from GeoGenetics.

Reshaping the horse through millennia

THE COST OF DOMESTICATION. Whole genome sequencing of modern and ancient horses unveils the genes that have been selected by humans in the process of domestication over the last 5,500 years, but also reveals the cost of this domestication. A new study led by the Centre for GeoGenetics at the University of Copenhagen, in collaboration with scientists from 11 international universities, reports that a significant part of the genetic variation in modern domesticated horses could be attributed to interbreeding with the descendants of a now extinct population of wild horses. This population was distinct from the only surviving wild horse population, that of the Przewalski’s horses. The study has been published in the scientific journal Proceedings of the National Academy of Sciences (PNAS).

The domestication of the horse some 5,500 years ago ultimately revolutionized human civilization and societies. Horses facilitated transportation as well as the circulation of ideas, languages and religions. Horses also revolutionized warfare with the advent of chariotry and mounted cavalry and beyond the battlefield horses greatly stimulated agriculture. However, the domestication of the horse and the subsequent encroachment of human civilization also resulted in the near extinction of wild horses.

Man catching a domestic Mongolian horse with a lasso in Khomiin Tal, Mongolia. Copyright: Ludovic Orlando.

The only surviving wild horse population, the Przewalski’s horses from Mongolia, descends from a mere 13 individuals, preserved only through a massive conservation effort.

As a consequence of this massive loss of genetic diversity, the effects of horse domestication through times have been difficult to unravel on a molecular level. Says Dr. Ludovic Orlando, Associate Professor at the Centre for GeoGenetics, who led this work:

– The classical way to evaluate the evolutionary impact of domestication consists of comparing the genetic information present amongst wild animals and their living domesticates. This approach is ill suited to horses as the only surviving population of wild horses has experienced a massive demographic decline in the 20th century. We therefore decided to sequence the genome of ancient horses that lived prior to domestication to directly assess what pre-domestication horses looked like genetically.

Recent advances in ancient DNA research have opened the door for reconstructing the genomes of ancient individuals. In 2013, Ludovic Orlando and his team succeeded in decoding the genome of a ~700,000 year-old horse, which represents the oldest genome sequenced to date. This time, the researchers focused on much more recent horse specimens, dating from ~16,000 and ~43,000 years ago. These were carefully selected to unambiguously predate the beginning of domestication, some 5,500 years ago. The bone fossils were excavated in the Taymyr Peninsula, Russia, where arctic conditions favor the preservation of DNA.

The human reshaping of the horse

While the horse contributed to reshaping human civilization, humans in turn reshaped the horse to fit their diverse needs and the diverse environments they lived in. This transformation left specific signatures in the genomes of modern horses, which the ancient genomes helped reveal. The scientists were able to detect a set of 125 candidate genes involved in a wide range of physical and behavioral traits, by comparing the genomes of the two ancient horses with those of the Przewalski’s horse and five breeds of domesticated horses. Says Dr. Dan Chang, post-doctoral researcher at the UCSC Paleogenomics Lab and co-leading author of the study:

– Our selection scans identified genes that were already known to evolve under strong selection in horses. This provided a nice validation of our approach.

Dr. Beth Shapiro, head of the UCSC Paleogenomics Lab continues:

– We provide the most extensive list of gene candidates that have been favored by humans following the domestication of horses. This list is fascinating as it includes a number of genes involved in the development of muscle and bones. This probably reveals the genes that helped in utilizing horses for transportation.

And Dr. Ludovic Orlando from the Centre for GeoGenetics at the University of Copenhagen concludes:

– Perhaps even more exciting as it represents the hallmark of animal domestication, we identify genes controlling animal behavior and the response to fear. These genes could have been the key for turning wild animals into more docile domesticated forms.

The ‘cost of domestication’ in horses

However, the reshaping of the horse genome during their domestication also had significant negative impacts. This was apparent in the increasing levels of inbreeding found amongst domesticates, but also through an enhanced accumulation of deleterious mutations in their genomes relative to the ancient wild horses. This finding supports an earlier theory coined ‘the cost of domestication’, which predicted increasing genetic loads in domesticates compared to their wild ancestors. Says Professor Laurent Excoffier, University of Bern and group leader at the Swiss Institute for Bioinformatics:

– Domestication is generally associated with repeated demographic crashes. Yet, mutations that negatively impact genes are not eliminated by selection and can even increase in frequency when populations are small. Domestication thus generally comes at a cost, as deleterious mutations can accumulate in the genome. This had already been shown for rice and dogs. Horses now provide another example of this phenomenon.

This is something that was only detectable in the horse in comparison to the ancient genomes, as Przewalski’s horses were found to show a proportion of deleterious mutations similar to domesticated horses. Says Hákon Jónsson, PhD-student at the Centre for GeoGenetics, co-leading author of the study:

– The recent near extinction of the Przewalski’s horse population resulted in the persistence of deleterious mutations in the population, following the same mechanism that once led to the accumulation of deleterious mutations in the genomes of domesticated horses. What is striking is that a similar order of magnitude was reached even though this occurred in a much shorter time scale than domestication.
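The drift argument in the quotes above (deleterious mutations escape selection when populations are small) can be made concrete with a small Python sketch. This is my own illustration, not the analysis from the paper: it uses Kimura's classical diffusion approximation for the fixation probability of a new mutation with additive selection, and the function name, the selection coefficient of -0.01 and the population sizes are assumptions chosen for the example.

```python
import math


def fixation_probability(n_e: int, s: float) -> float:
    """Kimura's diffusion approximation for a single new mutation in a
    diploid population of effective size n_e.

    s is the additive selection coefficient (s < 0 means deleterious);
    the starting frequency is 1 / (2 * n_e).
    """
    p0 = 1.0 / (2 * n_e)
    if s == 0:
        return p0  # neutral limit: fixation chance equals starting frequency
    # u = (1 - exp(-4*Ne*s*p0)) / (1 - exp(-4*Ne*s)), written with expm1
    # for numerical accuracy; the minus signs cancel in the ratio.
    return math.expm1(-4 * n_e * s * p0) / math.expm1(-4 * n_e * s)


# Assumed, illustrative values: a mildly deleterious mutation (s = -0.01)
# in populations of different effective size.
for n_e in (50, 500, 5000):
    u = fixation_probability(n_e, s=-0.01)
    relative = u / fixation_probability(n_e, s=0.0)
    print(f"Ne={n_e:>5}: fixation chance relative to neutral = {relative:.3g}")
```

In the small population the harmful mutation still fixes at a sizeable fraction of the neutral rate, while in the large populations selection removes it almost every time. That is the mechanism behind the "cost of domestication" under this simple model.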

Mongolian horses in Khomiin Tal, Mongolia. Copyright: Ludovic Orlando.

An ancient contribution to the present

In addition, comparison of the ancient and modern genomes revealed that the ancient individuals contributed a significant amount of genetic variation to the modern population of domesticated horses, but not to the Przewalski’s horses. This suggests that restocking from a wild population descended from the ancient horses occurred during the domestication processes that ultimately led to the modern domesticated horses. Mikkel Schubert, PhD student at the Centre for GeoGenetics, co-leading author of the study, concludes:

– This confirms previous findings that wild horses were used to restock the population of domesticated horses during the domestication process. However, as we sequenced whole genomes, we can estimate how much of the modern horse genome has been contributed through this process. Our estimate suggests that at least 13%, and potentially up to as much as 60%, of the modern horse genome has been acquired by restocking from the extinct wild population. That we identified the population that contributed to this process demonstrates that it is possible to identify the ancestral genetic sources that ultimately gave rise to our domesticated horses.


Online publication in Proceedings of the National Academy of Sciences, 15 December 2014: ‘Prehistoric genomes reveal the genetic foundation and cost of horse domestication’.

Deciphering the evolutionary history of horses, asses and zebras

I am proud to say that we now have a new paper published in PNAS, a project that was a continuation of the horse project published last year (A genome world record – A 700.000 year old horse gets its genome sequenced).

The following is the press release from GeoGenetics:

A whole-genome dataset of horses, zebras, and asses unveils their evolutionary origins, revealing genes underlying their specific adaptations and multiple instances of gene flow despite drastic differences in chromosomal structure.

The study, which was carried out by an international team of researchers, revealed not only a highly dynamic population history over the last 4.5 million years, but also uncovered multiple cases of gene-flow across a number of species showing extensive chromosomal rearrangements. This suggests that the multiple changes in the structure of chromosomes did not preclude species interbreeding, in contrast to what previous models of speciation have assumed for equids.

The study was led by the Centre for GeoGenetics at the University of Copenhagen, in collaboration with scientists from 8 additional international institutes, including the University of Kentucky and the University of California, Berkeley. It was published in the scientific journal PNAS on 1 December 2014.

Equids, an iconic group of mammals

Horses, zebras, and asses represent the only living members of the equid family, all of which belong to a single genus, Equus. This family originated from a dog-sized ancestor that lived in North America some 55 million years ago and flourished into a large number of species during the Tertiary period. Some crossed the Bering Strait and discovered the Old World, where they diversified their diet, mixing grazing and browsing or becoming full grazers.

Zebra in the Zoological Gardens of Copenhagen. Photo: Frank Rønsholt.

Their evolutionary history is well documented in the paleontological record and represents a textbook example of evolution. Most of this past diversity is now gone and all extant equids originate from a most recent common ancestor that lived ~4.5 million years ago. Two such species, horses and donkeys, were domesticated some 5,500 years ago and greatly impacted human history. Says Dr Ludovic Orlando, Assistant Professor at the Centre for GeoGenetics, who led the study:

– Equids are fascinating. A large number of species radiated within the last 4.5 million years, 7 of which are still living. They expanded from the Americas into Eurasia and Africa. They colonized very different environments, from the cold Arctic to the tropics. In comparison, hominins and gorillas, chimpanzees and bonobos arose within the last 13 million years. With more species emerging in a shorter time period and surviving until now, equids offer an ideal model for understanding the evolutionary process in greater detail.

Whole-genome sequencing unravels the equid family tree

With the publication of a whole-genome dataset encompassing all living members of Equus, researchers at the Danish Centre for GeoGenetics aimed to cast light upon the recent history of this iconic group of quadrupeds. The research team took advantage of state-of-the-art methods in DNA sequencing to characterize high-quality genomes from zoo animals of the Somali wild ass, the onager, the Tibetan kiang and all three living species of zebras (the plains zebra, the mountain zebra and Grévy’s zebra). The team did not limit its efforts to living species but also reconstructed the genome of a now extinct equine species, the quagga zebra. Says Andaine Seguin-Orlando, PhD student at the Centre for GeoGenetics, co-leading author of the study:

– The quagga zebra was once found in great numbers in South Africa but was hunted to extinction at the beginning of the 20th century. This enigmatic species showed a characteristic zebra stripe pattern only in the front part of the body in contrast to other zebras. Ancient DNA research actually started in 1984 following the sequencing of a short DNA fragment from the extinct quagga zebra.

Says Dr Ludovic Orlando:

– It is amazing to realize how much progress has been made in only 30 years and that we are now able to characterize their entire genome sequence. The genome confirmed the early work from the 1980s, showing that the extinct quagga was closely related to plains zebras. It also revealed some specific genes that were selected in each population after their split, some 300,000 years ago.

By taking advantage of this dataset to reconstruct a phylogenetic tree based on almost 20,000 protein-coding genes, the scientists at the Centre for GeoGenetics were indeed able to resolve the evolutionary relationship of the surviving equids and to identify the evolutionary genetic changes that occurred on each individual lineage. Says Mikkel Schubert, PhD student at the Centre for GeoGenetics, co-leading author of the study:

– Having whole genomes in hand, we were not limited in sequence information. This allowed us to reconstruct the phylogeny and date population divergences with much greater confidence than has previously been possible.

Says Ernest Bailey, Professor at the Gluck Equine Research Center, who participated in the study:

– Genetic changes at SLC9A4, a channel involved in pH regulation known to counter adverse environmental conditions, appear to have been adaptive in plains zebras. This might partly explain how this species can manage to cope with its wide range of environmental conditions. Adaptive changes at SEMA5A are unique to the extinct quagga zebra, and could be responsible for cranial and cognitive traits specific to this group.

Says Professor Rasmus Nielsen, from the Department of Integrative Biology at UC Berkeley:

– By revealing what changes were selected along the different branches of the evolutionary tree, such studies start to reveal the genetic makeup of each species. It advances our understanding of what genetic changes make the different species look, live and behave differently.

Surprising interbreeding despite apparent genetic barriers

The genomes of the different equine species did not solely diverge at the sequence level, as numerous large-scale rearrangements have been found in their chromosomes. Says Dr Teri Lear from the Gluck Equine Research Center, who carried out cytogenetic characterization of the onager as part of the study:

– In contrast to humans and chimpanzees which show quite similar karyotypes, the different equine species show extreme differences in their chromosome numbers. It can range from 16 pairs in the Hartmann’s zebra, to 33 pairs in Przewalski’s horses. It is much more plastic than what has been found between humans and chimpanzees, which only differ by one pair of chromosomes. In addition, many chromosomal segments have been shuffled around.

This rapid accumulation of chromosomal changes has long been thought to act as a barrier to interbreeding between species and could therefore have represented speciation drivers, genetically isolating populations and ultimately leading to the emergence of a new species. However, when the research team looked for traces of admixture in their genome dataset, the results came as a surprise: several species show evidence of admixture, suggesting that such massive chromosomal rearrangements did not totally abolish their capacity to reproduce with each other. Says Hákon Jónsson, PhD student at the Centre for GeoGenetics, co-leading author of the study:

– Because of the massive differences in the genome organization, we expected to find no evidence for admixture between equine species. Yet, we find multiple cases of gene flow, implying that cross-species hybridization resulted in fertile offspring in evolutionary times. The most prominent signal concerns the Somali wild ass and Grévy’s zebra, two territorial species whose geographic ranges used to overlap in Eastern Africa. This is in stark contrast with previous models assuming that karyotypic changes would completely stop reproduction between species.

Professor Rasmus Nielsen and Professor Eske Willerslev conclude:

– That mules and hinnies are generally sterile is perhaps the most popular example for illustrating the concept of species in biology classes. Genomics tells us that the species barrier is not always watertight and now invites us to reinvestigate important biological questions, such as what drives the origin of species.

International team maps ‘big bang’ of bird evolution

Last Thursday the Science paper on bird evolution came out, and it has my name on it 🙂

Below is the press release from the consortium and here is ours from DTU in Danish.

Genes reveal deep histories of bird origins, feathers, flight and song

The genomes of modern birds tell a story of how they emerged and evolved after the mass extinction that wiped out dinosaurs and almost everything else 66 million years ago. That story is now coming to light, thanks to an ambitious international collaboration that has been underway for four years.

The first findings of the Avian Phylogenomics Consortium are being reported nearly simultaneously in 23 papers — eight papers in a Dec. 12 special issue of Science and 15 more in Genome Biology, GigaScience and other journals. The full set of papers in Science and other journals can be accessed at and

Scientists already knew that the birds that survived the mass extinction experienced a rapid burst of evolution. But the family tree of modern birds has confused biologists for centuries, and the molecular details of how birds arrived at the spectacular biodiversity of more than 10,000 species are barely known.

To resolve these fundamental questions, a consortium led by Guojie Zhang of the National Genebank at BGI in China and the University of Copenhagen, Erich D. Jarvis of Duke University and the Howard Hughes Medical Institute and M. Thomas P. Gilbert of the Natural History Museum of Denmark, has sequenced, assembled and compared full genomes of 48 bird species. The species include the crow, duck, falcon, parakeet, crane, ibis, woodpecker, eagle and others, representing all major branches of modern birds.

Jon Fjeldså, Natural History Museum of Denmark

“BGI’s strong support and four years of hard work by the entire community have enabled us to answer numerous fundamental questions to an unprecedented scale,” said Guojie Zhang. “This is the largest whole genomic study across a single vertebrate class to date. The success of this project can only be achieved with the excellent collaboration of all the consortium members.”

“Although an increasing number of vertebrate genomes are being released, to date no single study has deliberately targeted the full diversity of any major vertebrate group,” added Tom Gilbert. “This is precisely what our consortium set out to do. Only with this scale of sampling can scientists truly begin to fully explore the genomic diversity within a full vertebrate class.”

“This is an exciting moment,” said neuroscientist Erich Jarvis. “Lots of fundamental questions now can be resolved with more genomic data from a broader sampling. I got into this project because of my interest in birds as a model for vocal learning and speech production in humans, and it has opened up some amazing new vistas on brain evolution.”

This first round of analyses suggests some remarkable new ideas about bird evolution. The first flagship paper published in Science presents a well-resolved new family tree for birds, based on whole-genome data. The second flagship paper describes the big picture of genome evolution in birds. Six other papers in the special issue of Science describe how vocal learning may have independently evolved in a few bird groups and in the human brain’s speech regions; how the sex chromosomes of birds came to be; how birds lost their teeth; how crocodile genomes evolved; ways in which singing behavior regulates genes in the brain; and a new method for phylogenetic analysis with large-scale genomic data.

The Avian Phylogenomics Consortium has so far involved more than 200 scientists hailing from 80 institutions in 20 countries, including the BGI, the University of Copenhagen, Duke University, the University of Texas at Austin, the Smithsonian Museum, the Chinese Academy of Sciences, Louisiana State University and many others.

A Clearer Picture of the Bird Family Tree

Previous attempts to reconstruct the avian family tree using partial DNA sequencing or anatomical and behavioral traits have met with contradiction and confusion. Because modern birds split into species early and in such quick succession, they did not evolve enough distinct genetic differences at the genomic level to clearly determine their early branching order, the researchers said. To resolve the timing and relationships of modern birds, the consortium authors used whole-genome DNA sequences to infer the bird species tree.

“In the past, people have been using 10 to 20 genes to try to infer the species relationships,” Jarvis said. “What we’ve learned from doing this whole-genome approach is that we can infer a somewhat different phylogeny [family tree] than what has been proposed in the past. We’ve figured out that protein-coding genes tell the wrong story for inferring the species tree. You need non-coding sequences, including the intergenic regions. The protein coding sequences, however, tell an interesting story of proteome-wide convergence among species with similar life histories.”

This new tree resolves the early branches of Neoaves (new birds) and supports conclusions about some relationships that have been long-debated. For example, the findings support three independent origins of waterbirds. They also indicate that the common ancestor of core landbirds, which include songbirds, parrots, woodpeckers, owls, eagles and falcons, was an apex predator, which also gave rise to the giant terror birds that once roamed the Americas.

The whole-genome analysis dates the evolutionary expansion of Neoaves to the time of the mass extinction event 66 million years ago that killed off all dinosaurs except some birds. This contradicts the idea that Neoaves blossomed 10 to 80 million years earlier, as some recent studies suggested.

Based on this new genomic data, only a few bird lineages survived the mass extinction. They gave rise to the more than 10,000 Neoaves species that comprise 95 percent of all bird species living with us today. The freed-up ecological niches caused by the extinction event likely allowed rapid species radiation of birds in less than 15 million years, which explains much of modern bird biodiversity.

Increasingly sophisticated and more affordable genomic sequencing technologies and the advent of computational tools for reconstructing and comparing whole genomes have allowed the consortium to resolve these controversies with better clarity than ever before, the researchers say.

With about 14,000 genes per species, the size of the datasets and the complexity of analyzing them required several new approaches to computing evolutionary family trees. These were developed by computer scientists Tandy Warnow at the University of Illinois at Urbana-Champaign, Siavash Mirarab, a student at the University of Texas at Austin, and Alexis Stamatakis at the Heidelberg Institute for Theoretical Studies. Their algorithms required the use of parallel processing supercomputers at the Munich Supercomputing Center (LRZ), the Texas Advanced Computing Center (TACC) and the San Diego Supercomputer Center (SDSC).

“The computational challenges in estimating the avian species tree used around 300 years of CPU time, and some analyses required supercomputers with a terabyte of memory,” Warnow said.

The bird project also had support from the Genome 10K Consortium of Scientists (G10K), an international science community working toward rapidly assessing genome sequences for 10,000 vertebrate species.

“The Avian Genomics Consortium has accomplished the most ambitious and successful project that the G10K Project has joined or endorsed,” said G10K co-leader Stephen O’Brien, who co-authored a commentary on the bird sequencing project appearing in GigaScience.

A Genomic Perspective of Avian Evolution and Biodiversity

For all their biological intricacies, birds are surprisingly light on DNA. A study led by Zhang, Cai Li and the consortium authors found that, compared to other reptile genomes, avian genomes contain less repetitive DNA and lost thousands of genes in their early evolution after birds split from other reptiles.

“Many of these genes have essential functions in humans, such as in reproduction, skeleton formation and lung systems,” Zhang said. “The loss of these key genes may have a significant effect on the evolution of many distinct phenotypes of birds. This is an exciting finding, because it is quite different from what people normally think, which is that innovation is normally created by new genetic material, not the loss of it. Sometimes, less is more.”

From the whole chromosome level to the order of genes, this group found that the genomic structure of birds has stayed remarkably the same among species for more than 100 million years. The molecular evolution rate across all bird species is also slower compared to mammals.

Yet some genomic regions display relatively faster evolution in species with similar lifestyles or phenotypes, such as those involving vocal learning. This pattern, called convergent evolution, may be the underlying mechanism that explains how distant bird species evolved similar phenotypes independently. Zhang said these analyses on particular gene families begin to explain how birds evolved a lighter skeleton, a distinct lung system, dietary specialties, color vision, as well as colorful feathers and other sex-related traits.

Important Lessons

The new studies have shed light on several other questions about birds, including:

How did vocal learning evolve? Eight studies in the package examined the subject of vocal learning. According to new evidence in the two flagship papers, vocal learning evolved independently at least twice, and was associated with convergent evolution in many proteins. A Science study led by Andreas Pfenning, Alexander Hartemink, Jarvis and others at Duke, in collaboration with researchers at the Allen Institute for Brain Science in Seattle and the RIKEN Institute in Japan, found that the specialized song-learning brain circuitry of vocal learning birds (songbirds, parrots and hummingbirds) and human brain speech regions have convergent changes in the activity of more than 50 genes. Most of these genes are involved in forming neural connections. Osceola Whitney, Pfenning and Anne West, also of Duke, found in another Science study that singing is associated with the activation of 10 percent of the expressed genome, with diverse activation patterns in different song-learning regions of the brain, controlled by epigenetic regulation of the genome. Duke’s Mukta Chakraborty and others found in a PLoS ONE study that parrots have a song system within a song system, with the surrounding song system unique to them. This might explain their greater ability to imitate human speech. In a BMC Genomics study, Morgan Wirthlin, Peter Lovell and Claudio Mello from Oregon Health & Science University found unique genes in the song-control brain regions of songbirds.

The XYZW of sex chromosomes. Just as the sex of humans is determined by the X and Y chromosomes, the sex of birds is controlled by the Z and W chromosomes. The W makes birds female, just as the Y makes humans male. Most mammals share a similar evolutionary history of the Y chromosome, which now contains many degenerated genes that no longer function and only a few active genes related to “maleness.” A Science study led by Qi Zhou and Doris Bachtrog from the University of California, Berkeley, and Zhang found that half of bird species still contain substantial numbers of active genes in their W chromosomes. This challenges the classic view that the W chromosome is a “graveyard of genes” like the human Y.

This group also found that bird species are at drastically different states of sex chromosome evolution. For example, the ostrich and emu, which belong to one of the older branches of the bird family tree, have sex chromosomes resembling their ancestors. Yet some modern birds such as the chicken and zebra finch have sex chromosomes that contain few active genes. This opens a new set of questions on how the diversity of sex chromosomes may drive the diversity of sex differences in the outward appearance of various bird species. Peacocks and peahens are dramatically different; male and female crows are indistinguishable.

How did birds lose their teeth? In a Science study led by Robert Meredith from Montclair State University and Mark Springer from the University of California, Riverside, a comparison between the genomes of living bird species and those of vertebrate species that have teeth identified key mutations in the parts of the genome that code for enamel and dentin, the building blocks of teeth. The evidence suggests that five tooth-related genes were disabled within a short time period in the common ancestor of modern birds more than 100 million years ago.

What’s the connection between birds and dinosaurs? Unlike mammals, birds (along with reptiles, fish and amphibians) have a large number of tiny microchromosomes. These smaller packages of gene-rich material are thought to have been present in their dinosaur ancestors. A study of genome karyotype structure in BMC Genomics analyzed whole genomes of the chicken, turkey, Peking duck, zebra finch and budgerigar. It found the chicken has the most similar overall chromosome pattern to an avian ancestor, which was thought to be a feathered dinosaur. This work was led by Darren Griffin and Michael Romanov from the University of Kent, and by Dennis Larkin and Marta Farré from the Royal Veterinary College, University of London.

Another study in Science examined birds’ closest living relatives, the crocodiles. This team, led by Ed Green and Benedict Paten from the University of California, Santa Cruz, David Ray from Texas Tech University and Ed Braun from the University of Florida, found that crocodiles have one of the slowest-evolving genomes. The researchers were able to infer the genome sequence of the common ancestor of birds and crocodilians (archosaurs) and therefore all dinosaurs, including those that went extinct 66 million years ago.

Do differences in gene trees versus species trees matter? In the phylogenomics flagship study by Jarvis and others, the consortium found that no gene tree has a history exactly the same as the species tree, partly due to a process called incomplete lineage sorting. Another Science study, led by Tandy Warnow at the University of Texas and the University of Illinois, and her student Siavash Mirarab, developed a new computational approach called “statistical binning.” They used this approach to show it does not matter much that the gene trees differ from the species tree because they were able to infer the first coalescent-based, genome-scale species tree, combining gene trees with similar histories to accurately infer a species tree.

Do bird genomes carry fewer virus sequences than other species? Mammalian genomes harbor a diverse set of genomic “fossils” of past viral infections called “endogenous viral elements” (EVEs). A study published in Genome Biology led by Jie Cui of Duke-NUS Graduate Medical School in Singapore, Edward Holmes of the University of Sydney and Zhang, found that bird species had 6-13 times fewer EVE infections in their past than mammals. This finding is consistent with the fact that birds have smaller genomes than mammals. It also suggests birds may either be less susceptible to viral invasions or better able to purge viral genes.

When did colorful feathers evolve? Elaborate, colorful feathers are thought to be evolutionarily advantageous, giving a male bird in a given species an edge over his competitors when it comes to mating. Zhang’s flagship paper in Science, which is further analyzed by Matthew Greenwold and Roger Sawyer from the University of South Carolina in a companion study in BMC Evolutionary Biology, found that genes involved in feather coloration evolved more quickly than other genes in eight of 46 bird lineages. Waterbirds have the lowest number of beta keratin feather genes, landbirds have more than twice as many, and in domesticated pet and agricultural bird species, there are eight times more of these genes.

What happens to species facing extinction or recovering from near-extinction? Birds are like the proverbial canaries in the coal mine because of their sensitivity to environmental changes that cause extinction. In a Genome Biology study led by Shengbin Li, Cheng Cheng and Jun Yu from Xi’an Jiaotong University, Huanming Yang from BGI and Jarvis, researchers analyzed the genomes of species that have recently gone nearly extinct, including the crested ibis in Asia and the bald eagle in the Americas. They found that genes that break down environmental toxins have a higher rate of mutations in these species and that there is lower diversity of immune system genes in endangered species. In a recovering crested ibis population, genes involved in brain function and metabolism are evolving more rapidly. The researchers found more genomic diversity in the recovering population than was expected, giving greater hope for species conservation.

How do penguins adapt to the cold and hostile Antarctic environment? Penguins have many distinct morphological features that set them apart from other birds. They are flightless, with specialized wings and skin. Two Antarctic penguin genomes have been presented in GigaScience in a study led by Zhang from BGI and David Lambert from Griffith University, revealing insights into how these birds have been able to adapt to the cold and hostile Antarctic environment. By sequencing the genomes of Adélie and emperor penguins, researchers have revealed the genetic basis of adaptations related to their feathers, wings, eyes and lipid metabolism. They have also revealed the penguins’ historical population changes in response to climate change and glaciation, and estimated that the current species of penguins first appeared around 60 million years ago.

The Start of Something Bigger

This sweeping genome-level comparison of an entire class of life is being powered by frozen bird tissue samples collected over the past 30 years by museums and other institutions around the world. Samples are sent as fingernail-sized chunks of frozen flesh mostly to Duke University and University of Copenhagen for DNA separation. Most of the genome sequencing and critical initial analyses of the genomes have then been conducted by BGI.

The avian genome consortium is now creating a database that will be made publicly available in the future for scientists to study the genetic basis of complex avian traits.

Setting up the pipeline for the large-scale study of whole genomes — collecting and organizing tissue samples, extracting the DNA, analyzing its quality, sequencing and managing torrents of new data — has been a massive undertaking. But the scientists say their work should help inform other major efforts for the comprehensive sequencing of vertebrate classes. To encourage other researchers to dig through this ‘big data’ and discover new patterns that were not seen in small-scale data before, the avian genome consortium has released the full dataset to the public in GigaScience, and in NCBI, ENSEMBL and CoGe databases.

To maximize the use of this wealth of data, rather than wait until the publication of these papers, the project has set a great example of early data release, with unpublished datasets released to the community over a four-year period. Demonstrating the huge interest in birds, the final genomes were released via Twitter over the spring, leading to much discussion on social media and a doubling of the number of users of the GigaScience database.

Under the leadership of Dave Burt, the National Avian Research Facility at the Roslin Institute and Edinburgh University, UK, has created genome browser databases based on the ENSEMBL model for 48 species.

This project received its main financial support from BGI and the China National GeneBank, as well as from the U.S. National Institutes of Health, the U.S. National Science Foundation, the Howard Hughes Medical Institute, the Lundbeck Foundation and the Danish National Research Foundation, and support from the many other sources of funding for the consortium’s individual scientists.

Other leaders in the Avian Phylogenomics Project include, but are not limited to, Tandy Warnow of the University of Illinois; Stephen O’Brien, David Haussler and Oliver Ryder of the Genome 10K consortium; Peter Houde of New Mexico State University; Edward Braun of the University of Florida; Joel Cracraft of the American Museum of Natural History; David Mindell of the University of California, San Francisco; Alexandros Stamatakis of the Heidelberg Institute for Theoretical Studies and Karlsruhe Institute of Technology; Jon Fjeldsa and Carsten Rahbek of the University of Copenhagen; Scott Edwards of Harvard University; David Burt of the Roslin Institute of Edinburgh University; Gary Graves of the Smithsonian Institution; Robb Brumfield of Louisiana State University; Agostinho Antunes of the Universidade do Porto in Portugal; Darren Griffin of the University of Kent; Dennis Larkin from the Royal Veterinary College, University of London; Qi Zhou of the University of California, Berkeley; and Wang Jun of BGI.
