The human genome contains around 20,000 genes, encoding all the biological instructions that our cells need for life. Yet less than 2% of our DNA is made up of genes, while the rest is so-called ‘junk DNA’ (more accurately called non-coding DNA).
These parts of the genome contain many important functional elements, including more than a million ‘control switches’ that turn genes on and off at the right time and in the right place, and control the levels of gene activity in different parts of the body.
When, where and how much your genes are activated has the power to significantly impact how your cells work and your overall health.
So it shouldn’t be surprising that genetic variations between populations – which are more common in non-coding DNA than in genes – can have a significant impact on gene activity in people from different backgrounds.
Understanding the differences in how our genes are interpreted relies upon building genetic databases containing information from a diverse range of people.
In a recent article published in Nature Reviews Genetics, lead author Dr Deepti Gurdasani and her colleagues at the University of Cambridge discuss why diversity in genomic databases is essential for predicting how our genes impact our health.
One part of their review highlights how a lack of diversity in genomic databases limits our understanding of gene regulation. In turn, this affects our ability to make accurate predictions about how genes are expressed in different populations and the impact of these differences on our health.
Gene regulation makes your eyes see and your stomach digest (and not the other way round!)
Nearly every cell in your body contains the same genome – roughly 6 billion ‘letters’ of DNA, encoding around 20,000 genes.
But different cells serve different purposes, and they need to activate the right genes at the right times. It would not be helpful if the cells in your limbs changed into liver halfway through development in the womb, or if your eyes suddenly started making acid like the cells in your stomach.
When genes are activated (or ‘expressed,’ to use the technical term), cells use the instructions encoded within them to make specific protein molecules. And it’s these proteins that make up the body and fulfil the functions of life.
For example, liver cells produce enzymes that process nutrients and break down toxins, skin cells make sturdy keratin proteins to protect us, and nerve cells make messenger molecules that transmit signals and allow us to think.
Non-coding DNA is essential for gene regulation
Exactly how much of our non-coding DNA is truly junk is a hotly contested question in genetic research. But at least some of it plays a vital role in controlling and regulating the activity of our genes.
When a gene is switched on, a complex molecular machine called RNA polymerase creates multiple copies of the instructions encoded within it. These copied instructions are then sent out to ‘factories’ in the cell, known as ribosomes, which then make the appropriate protein.
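For readers who like to see ideas in code, here is a minimal Python sketch of the copy-then-build process described above. It is a toy model under stated assumptions, not real bioinformatics: the 15-letter ‘gene’ is made up, the codon table lists only the five codons this example needs (the real genetic code has 64), and the function names transcribe and translate are illustrative choices. The copies made by RNA polymerase are messenger RNA (mRNA), which uses the letter U where DNA uses T.

```python
# Toy model: a gene is copied into mRNA, then read three letters at a time.
# Only the codons used in this example are listed; the real genetic code has 64.
CODON_TABLE = {
    "AUG": "Met",  # start codon, also codes for methionine
    "GGC": "Gly",
    "UUU": "Phe",
    "AAA": "Lys",
    "UAA": None,   # stop codon: no amino acid, reading ends here
}


def transcribe(coding_strand: str) -> str:
    """Copy a DNA sequence (written as the coding strand) into mRNA."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon
            break
        protein.append(amino_acid)
    return protein


gene = "ATGGGCTTTAAATAA"  # hypothetical 15-letter 'gene', for illustration only
mrna = transcribe(gene)
protein = translate(mrna)

print(mrna)               # AUGGGCUUUAAAUAA
print("-".join(protein))  # Met-Gly-Phe-Lys
```

In a real cell, the control switches described in this article decide whether and how often the copying step runs at all. The sketch leaves that part out and simply assumes the gene is switched on.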
Some sections of non-coding DNA act as ‘ON’ switches, providing a site for RNA polymerase to attach and begin copying. Other sections provide places for regulatory proteins to bind and increase, decrease, or stop the copying process.
In this way, non-coding DNA acts a bit like a massive bank of control switches that activate, repress or tweak the activity levels of all the genes in a cell.
We don’t fully understand the role of non-coding DNA in gene expression and its implications for our health. But errors and variations in the coding sections of DNA account for less than half of genetic disorders, so changes in non-coding DNA may be responsible for many effects on health and disease risk.
Our non-coding DNA is more diverse than our genes
Not only is there much more non-coding DNA than genes in the human genome, there’s also much more genetic variation within it.
While coding DNA only varies by around 0.025% from person to person, non-coding DNA can vary by as much as 4%.
The big question is: how much does this variation matter?
While most of the differences in non-coding DNA between people probably don’t matter all that much, some changes are much more important, affecting gene activity patterns, how likely we are to develop certain diseases, and how effective drugs are for us.
For example, when embryos are developing in the womb, their cells require a protein called pancreas-specific transcription factor 1a (PTF1A) to grow a pancreas. A section of non-coding DNA close to the PTF1A gene acts as a switch that increases the expression of PTF1A in pancreas cells.
A particular change in this section of non-coding DNA prevents cells from producing enough PTF1A, and a baby with this genetic error is born without a fully functional pancreas. As a result, the baby can’t control its blood sugar effectively and requires insulin treatment.
Genome-wide association studies (GWAS) are starting to identify other variations in non-coding DNA that are linked to disease. But as we have previously discussed on our blog, most genomic databases contain DNA that comes predominantly from people with European ancestries.
As a result, predictions about health and which medicines may work best for a particular person are often wrong for people of non-European descent.
What we know about gene expression is based on European genomes, not global genomes
Understanding gene expression is critical for studying disease pathways and developing drugs that precisely target the causes of disease. Yet most current databases lack the diversity needed to make accurate genomic predictions for much of the world’s population.
Unfortunately, what we know about gene expression so far is based on how gene regulation works in people of European ancestry. As Gurdasani and her colleagues discuss in their review, European data does not accurately predict gene expression in other populations.
Here at Global Gene Corp, we believe that genomics and precision medicine should be available to all. That’s why we’re dedicated to building diverse genomic datasets that will bring the promise of precision medicine to everyone, wherever they come from and wherever they live.