Almost every protein ever discovered is predicted by artificial intelligence

Just one year after the first data release, a new era of biological investigation has been opened.

More than 200 million protein structures have now been shared online in a free-to-access database called AlphaFold DB, which was developed by Google-owned AI business DeepMind.

The achievement opens the way for previously unimaginable avenues of scientific investigation into proteins, the building blocks of life. Researchers are enthralled.

"Determining the 3D structure of a protein used to take months or years, now it takes seconds," says cardiologist Eric Topol from the Scripps Research Translational Institute.

"We can anticipate additional biological mysteries to be solved every day with this new set of structures that illuminates nearly the entire protein universe."

In July last year, DeepMind unveiled its first batch of AlphaFold predictions in collaboration with scientists from the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI).

AlphaFold has been touted as a cutting-edge technology that would revolutionize biological research and accelerate drug discovery. A protein's 3D shape is determined by its amino acid sequence.

Long chains of amino acids fold into pleated sheets and twisting ribbons, which pack together to give each protein its shape.

By knowing the shape a protein folds into, scientists can gain a better understanding of how it functions inside cells.

AlphaFold was created to speed up the process of working out those shapes, and its database now contains more than 200 million predicted protein structures from plants, bacteria, animals, and other organisms.

"That hope has become a reality far faster than we had expected," DeepMind CEO Demis Hassabis said in a statement on the latest data release.

Researchers have already used AlphaFold's first batch of predictions to deepen their understanding of deadly illnesses such as malaria, opening the way for improved vaccines, and to unravel mysteries about gigantic proteins that have gone unresolved for decades.

The predictions have also helped identify never-before-seen enzymes that might help upcycle plastic waste.

"AlphaFold has sent ripples through the molecular biology community," said Sameer Velankar, a structural biologist who leads EMBL-EBI's Protein Data Bank.

"Over a thousand scientific papers on a wide spectrum of research topics using AlphaFold structures have been published in the last year alone; I have never seen anything like it."

Velankar added: "This is just the outcome of 1 million predictions. Imagine the impact of having over 200 million protein structure predictions freely available in the AlphaFold database."

Although the open-source AlphaFold software has been available to researchers since its release last year, having millions of predicted protein structures at their disposal in a searchable database will undoubtedly expedite research.

Around one-third of the more than 214 million predictions have been classified as highly accurate, comparable to protein structures obtained using standard experimental techniques, such as X-ray crystallography and cryo-electron microscopy.

Scientists have long meticulously derived molecular structures from the fuzzy pictures these methods produce, perhaps the most famous example being Rosalind Franklin's interpretation of helical DNA.

AlphaFold's accuracy varies, and its predictions may be less reliable for rarer proteins that scientists know little about. In some cases, its predicted structures may instead be used to help make sense of experimental data.

Despite the enormous data dump, there is still a lot that AlphaFold doesn't capture, including how proteins interact once assembled.

Microorganisms that aren't yet included in the database also represent an untapped source of powerful compounds, since scientists have cataloged only a small portion of all microbial life on Earth.

Several scientists have raised concerns about the AlphaFold database and its staggering 23 terabytes of information, which might be difficult for certain research teams to access due to the expense of computing power and cloud-based storage.

Nevertheless, the potential benefits to human health, which DeepMind says it has carefully weighed against potential bioethical hazards, are so great they are practically unimaginable.

"I expect that this latest update will result in an explosion of new and exciting discoveries in the months and years ahead," said a structural biologist and senior scientist at EMBL-EBI. "This is all due to the fact that the data is freely available for everyone to use."

The AlphaFold database will be updated by DeepMind and EMBL-EBI periodically.
