Overview — THE DOG GENOME ANNOTATION (DoGA) PROJECT

Understanding the Genome

Cells in complex organisms differ widely in structure and function, yet they share essentially the same genome sequence. To fully understand a genome, we must identify:

Protein-coding genes
RNA molecules that facilitate protein synthesis

The Role of Gene Regulation

An essential part of understanding a genome is knowing how protein production is regulated across cell types and during development. This involves:

Identifying and annotating gene regulatory regions
Exploring non-coding genes that produce regulatory RNAs

Research indicates that differences in disease, morphology, and behavior among dog breeds are often linked to changes in regulatory regions rather than in protein-coding genes.

Focus on Gene Enhancers

Gene enhancers are regulatory regions of particular interest, and they have only recently become detectable through new technologies. These elements play a crucial role in defining cell identity.

Comparative Genomics

Genomes are now available from thousands of species. They reveal that while many protein-coding genes are conserved, the complexity of higher organisms is largely encoded in their regulatory regions and regulatory RNAs. Annotation efforts in other species reflect this:

The human genome reference sequence has been available since 2001, and projects such as ENCODE, FANTOM, and Roadmap Epigenomics have focused on identifying its regulatory regions.
The DANIO-CODE project aims to improve genome annotation for zebrafish.

The Dog Genome

The dog reference genome sequence was first released in 2005, based on a Boxer. Together with its later update (CanFam3.1), it has facilitated gene discovery in various conditions.
Despite updates, over 20,000 gaps in the reference genome still obscure many regulatory elements.
Recent advances in long-read sequencing have substantially improved genome continuity, yielding high-quality reference genomes for various dog breeds and for the dingo.

Challenges in Genome Annotation

Despite this progress, annotation of the dog genome remains incomplete, particularly for:

Regulatory elements essential for gene expression
Non-protein coding RNAs

This lack of information hinders the use of dogs as models for studying complex traits influenced by regulatory variation, including:

Morphology
Behavior
Diseases (e.g., epilepsy, psychiatric disorders)

The DoGA Consortium

To address these challenges, the Dog Genome Annotation (DoGA) Consortium was formed with goals that include:

Establishing a comprehensive tissue biobank of dogs and wolves
Identifying and annotating functional regions using advanced technologies
Creating a gene expression atlas covering more than 100 tissues

These efforts aim to facilitate gene discovery in regulatory regions and enhance understanding of gene regulation in dogs, wolves, and humans.

Research Priorities

Understanding the complexity of the dog brain and its regulatory elements is crucial for:

Comparing canine and human brains
Advancing research on common complex traits and their underlying genetic factors

Future Directions

The project will generate critical resources that make canine models more useful in research, ultimately contributing to advances in both dog and human health. It will also analyze gene expression profiles from fresh brain tissues of dogs and wolves. Differences between the two could provide insights into the neuronal pathways affected during domestication.