The cells of a complex organism differ in appearance and perform specific functions, yet all of them contain the same complete genome sequence.

Understanding a genome involves identifying the protein-coding genes and the RNA responsible for producing proteins, the building blocks of cells. Of equal importance is understanding how the production of proteins is orchestrated in each mature cell type or tissue, as well as during development. Key to this understanding is the identification and description, also called annotation, of the gene regulatory regions in the genome, together with genes that ultimately produce regulatory RNA molecules rather than proteins. Variation affecting disease, morphology and behaviour in dog breeds occurs more often in regulatory regions than in protein-coding genes.

Of specific interest are gene enhancers, regulatory regions that have attracted much attention in the last year or two. They can be detected across the whole genome only with recent technologies and are considered very important in giving cells their specific identity.

Functional Annotation of the Dog Genome

The genomes of thousands of species are now available and show that a substantial part of the protein-coding genes are similar even between simple and complex organisms. Instead, the complexity of higher organisms is encoded in the regulatory regions of genomes and in the complexity of regulatory RNA, which scales with organism complexity. The human genome reference sequence, for example, has been available since 2001 with only minor changes until today. At the same time, a range of large-scale international projects aiming to identify the gene regulatory regions in the human genome are still ongoing. They demonstrate the strong need to further understand the regulation of the human genome, specifically in relation to the complexity of the human brain. These projects include the ENCODE, FANTOM and Roadmap Epigenomics projects. The genomes of model organisms require improved annotation for the very same reasons. One example is the DANIO-CODE project for zebrafish.

The dog reference genome sequence has been available since 2005 and has greatly facilitated gene discovery in hundreds of conditions (OMIA). However, canine genome annotation of many genes and non-protein-coding regulatory RNA is lacking or very incomplete. This limits our ability to use the dog as a model for the many complex traits that are often affected by regulatory variation in morphology, behaviour and disease, such as common epilepsy and psychiatric diseases, which are our research priority (Steenbeek et al. 2016). We cannot compare the dog brain to the human brain if we don’t understand the complexity of the dog brain and the regulatory elements in the dog genome.

This project will generate the critical resources to better utilize various canine models to advance both dog and human health.

Additionally, the project has access to fresh tissues from various parts of the brain in both dogs and wolves and aims to compare their gene expression profiles. Any differences might shed light on an outstanding question: which neuronal pathways were altered during domestication.