Researchers in the United States and Taiwan have presented an unusual artificial intelligence method for developing entirely new proteins that have never previously been seen in nature.
The team used machine learning to derive “musical scores” from the structures of proteins. These scores can then be used to train deep learning neural networks to design completely novel proteins.
Essential building blocks for life
Proteins are essential for all cellular processes and, after water, are the second most abundant type of molecule found in human tissues. Enzymes, for example, catalyze the biochemical reactions required for metabolism; cell-signaling proteins are essential for controlling multiple cellular activities; and antibodies are vital for combating pathogens such as bacteria and viruses. Proteins are also essential for cell-to-cell interaction or cell adhesion, and for controlling the cell cycle that produces new cells.
As the building blocks of life, proteins have long been researched in efforts to create new molecules with desired functions, activity and processes.
Designing new proteins
Proteins are composed of amino acids that form polypeptides – long sequences of amino acids that are linked by peptide bonds. However, the overall 3D structure, which determines a protein’s function, is significantly more complex.
Historically, scientists have developed new proteins by copying existing proteins or by altering the amino acids that a protein is composed of. However, this process is time-consuming and predicting the impact that altering amino acids has on protein structure is challenging.
Computational modeling techniques such as physicochemical simulations have also been developed that can generate models of 3D protein structure based on the amino acid sequence.
Now, Markus J. Buehler (Massachusetts Institute of Technology) and colleague Chi Hua Yu in Taiwan have used concepts from music theory to translate the chemical structure of proteins into sounds that can be used in machine learning to design completely new proteins.
About machine learning
Machine learning is a type of artificial intelligence in which computers automatically analyze and learn from data, identify patterns and make decisions without being explicitly programmed for the task and with only minimal human input.
As reported in APL Bioengineering, a publication from the American Institute of Physics, Buehler and colleagues have used the distinct vibrational frequencies of each amino acid in a protein to train a machine-learning algorithm to design new proteins.
Since each of the twenty amino acids that form a protein has its own distinct vibrational frequency, the whole chemical structure of a protein can be represented audibly using key aspects of music theory such as melody and rhythm.
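The idea of mapping residues to pitches can be illustrated with a short Python sketch. The mapping here is purely hypothetical: it assigns the twenty standard amino acids, ordered by one-letter code, to consecutive semitones starting at 220 Hz. That choice is an assumption made for illustration and is not the set of vibrational frequencies used in the study. The names `AMINO_ACIDS`, `note_frequency` and `sonify` are likewise invented for this example.

```python
# Toy sonification: one tone per amino acid (hypothetical mapping,
# NOT the vibrational frequencies used in the study).

# One-letter codes of the twenty standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def note_frequency(index, base_hz=220.0):
    """Frequency of the note `index` semitones above `base_hz`
    in twelve-tone equal temperament."""
    return base_hz * 2 ** (index / 12)


# Assumed lookup table: amino acid -> tone frequency in Hz.
TONE_HZ = {aa: note_frequency(i) for i, aa in enumerate(AMINO_ACIDS)}


def sonify(sequence):
    """Translate an amino acid sequence into a list of tone frequencies."""
    tones = []
    for aa in sequence.upper():
        if aa not in TONE_HZ:
            raise ValueError(f"unknown amino acid code: {aa!r}")
        tones.append(TONE_HZ[aa])
    return tones
```

Under this toy mapping, `sonify("AP")` yields two tones an octave apart, because P sits twelve semitones above A in the assumed ordering. A fuller representation would also need to encode rhythm, which this sketch leaves out.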
The resulting “musical scores,” which are generated according to how a protein folds, were fed into a deep learning neural network.
These artificial neural networks are computational algorithms that mimic the behavior of interconnected human neurons, processing and learning from large amounts of information in a way similar to how the brain processes information.
"These networks learn to understand the complex language folded proteins speak at multiple time scales," explained Buehler. "And once the computer has been given a seed of a sequence, it can extrapolate and design entirely new proteins by improvising from this initial idea, while considering various levels of musical variations -- controlled through a temperature parameter -- during the generation."
Next, Buehler and colleagues took the newly designed proteins and compared them with information on known proteins held in a large database.
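A novelty check of this kind can be pictured with a simplified Python sketch. It is not the comparison procedure used in the study: database searches normally rely on sequence alignment tools such as BLAST, whereas this example only counts matching positions between sequences of equal length. The function names `percent_identity` and `closest_match` and the example sequences are hypothetical.

```python
# Simplified novelty check (illustrative only; real searches use
# alignment tools and handle sequences of different lengths).


def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def closest_match(candidate, known_sequences):
    """Return (sequence, identity) for the most similar known sequence
    of the same length, or None if there is no such sequence."""
    same_length = [s for s in known_sequences if len(s) == len(candidate)]
    if not same_length:
        return None
    best = max(same_length, key=lambda s: percent_identity(candidate, s))
    return best, percent_identity(candidate, best)
```

A designed sequence whose closest match has low identity would be a candidate for a protein not yet recorded in the database, although a position-by-position count like this one misses similarities that an alignment would find.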
The study “paves the way for making entirely new biomaterials”
By applying molecular characterization and dynamics techniques, the team showed that their new approach had indeed designed proteins that have never been seen in nature.
After finding that the newly designed proteins appeared to be stable and folded, Buehler and colleagues developed a machine-learning algorithm that could take the audible musical representations, derived from the sound waves the proteins generated, and use them to create matter, an achievement that Buehler says “paves the way for making entirely new biomaterials.”
Alternatively, "perhaps you find an enzyme in nature and want to improve how it catalyzes or come up with new variations of proteins altogether," he suggests. By adjusting a setting such as the temperature parameter, for example, the algorithm can be prompted to create more mutations, which could then be evaluated to identify the ones that contribute to the most effective enzymes.
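In generative models, temperature commonly refers to a sampling parameter that rescales the model's scores before the next element is chosen: low values favor the most likely option, while high values spread the choice more evenly and so produce more variation. The Python sketch below shows that general technique under stated assumptions. It is not the authors' implementation, and the scores, candidate residues and function names (`softmax_with_temperature`, `sample_next`) are hypothetical.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities, sharpened (T < 1)
    or flattened (T > 1) by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(options, scores, temperature, rng=random):
    """Draw one option at random according to temperature-scaled scores."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(options, weights=probs, k=1)[0]


# Hypothetical scores a trained model might assign to three candidate residues.
candidates = ["A", "G", "L"]
scores = [2.0, 1.0, 0.5]

conservative = softmax_with_temperature(scores, 0.5)  # close to the top choice
exploratory = softmax_with_temperature(scores, 2.0)   # more evenly spread
```

With these made-up scores, the top candidate receives a larger share of the probability at a temperature of 0.5 than at 2.0, which is why raising the temperature leads to more varied, mutation-rich output.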
Buehler also suggests that the musical sounds generated by the new proteins could be used to help compose classical music.
“In the evolution of proteins over thousands of years, nature also gives us new ideas for how sounds can be combined and merged.”
Markus J. Buehler, Massachusetts Institute of Technology
Sources:
Buehler M and Yu C. Sonification based de novo protein design using artificial intelligence, structure prediction, and analysis using molecular modeling. APL Bioengineering 2020. DOI: 10.1063/1.5133026
Machine Learning. The Royal Society 2019. Available at: https://royalsociety.org/topics-policy/projects/machine-learning/
Machine Learning. SAS 2020. Available at: https://www.sas.com/en_us/insights/analytics/machine-learning.html