The Y chromosome has been fully decoded for the first time, giving scientists their most detailed genetic insight yet into how it shapes male development and the health issues that can arise during males’ lives.
More than 100 scientists around the world contributed to the study, which was led by the US National Human Genome Research Institute and published today in the journal Nature. Their work presents the full sequence of Y chromosome DNA and identifies every gene within it, including 41 previously unknown to science.
Until now, less than half of the Y chromosome had been decoded, in contrast to the X and the 22 autosomes (non-sex chromosomes), which had already been sequenced.
Deciphering the remainder of the Y means the full set of human chromosomes has now been sequenced, and it adds 30 million new bases (the individual units that make up DNA) to the human genome reference.
The Y, along with the X, is one of only two sex chromosomes. It’s passed only from fathers to their male offspring, who usually also inherit a single X from their mother.
The Y is the key driver of male development, and its genes also influence fertility and disease in males. For this reason, the researchers behind the decoding are hopeful it will give unparalleled insight into how the Y affects male development and health.
Beyond fully decoding the Y chromosome, the project sheds light on several gene families responsible for regulating sperm production and the development of the male reproductive system.
Achieving a ‘gapless’ read of the Y chromosome
“We didn’t even know if it could be sequenced, it was so puzzling,” says University of California biologist Dr Monika Cechova, who co-led the research. “This is really a huge shift in what’s possible.”
The ‘shift’ was brought about through new techniques allowing long sequences of genetic information to be read.
Previously, scientists had been unable to fully unravel the Y chromosome because it contains large regions of highly repetitive DNA, far more than other chromosomes, that don’t code for proteins. These ‘satellite’ DNA regions were also coupled together, which created another barrier for geneticists to overcome.
By combining improved long-read DNA sequencing methods with more powerful computational tools and knowledge gained from all the other chromosomes, the scientists finally had the means to work through the highly repetitive areas.
As the US National Institutes of Health explains, these repetitive segments of code are like a single sentence of a story appearing again and again throughout a book. Effectively, the researchers have worked out where each of these recurring lines belongs.
“We didn’t know what exactly made up the missing sequence,” says Dr Adam Phillippy, head of the Genome Informatics Section at the National Human Genome Research Institute.
“It could have been very chaotic, but instead, nearly half of the chromosome is made of alternating blocks of two specific repeating sequences known as satellite DNA. It makes a beautiful, quilt-like pattern.”
Promise of insights into what ails
The researchers are also excited by the prospect of new insights into medically important parts of the human genome.
These include the azoospermia factor region, where repeating genes involved in sperm production can be disrupted, potentially affecting the way male sex cells are produced. Azoospermia, literally an absence of live sperm in semen, is a condition affecting about 1% of males.
By deciphering other repeating regions where important genes might lurk, scientists are hopeful their understanding of genetic variation in the Y chromosome will help identify the causes of particular medical issues or sites of genetic risk. As well as producing a full reference Y chromosome, the project has sequenced 43 genetically diverse Y chromosomes from different individuals.
It’s anticipated this information will be used as part of pangenome research to understand genetic diversity across the entire human population.
“When you find variation that you haven’t seen before, the hope is always that those genomic variants will be important for understanding human health,” says Phillippy. “Medically relevant genomic variants can help us design better diagnostics in the future.”