Artificial intelligence predicts gene activity inside human cells

Researchers at Columbia University in the United States have developed a new method using artificial intelligence to accurately predict the activity of genes within any human cell. 

The study, published in the journal Nature, describes an approach that could transform our understanding of diseases ranging from cancer to genetic disorders. It enables rapid and accurate detection of biological processes, and it makes large-scale computational experiments efficient enough to augment traditional laboratory methods.

The researchers believe that accurately predicting a cell's activity will significantly change how basic biological processes are understood. The advance could turn biology from a field that describes seemingly random events into one that predicts the systems governing cell behavior.

To achieve this, the researchers trained a machine-learning model to predict which genes are active in specific human cells. Knowing which genes are expressed reveals a cell’s identity and how it functions.

Model training

Previous models were trained on specific cancer cell lines, which weren’t representative of normal cells. In the new study, the researchers changed approach: they trained a machine-learning model on gene expression data from more than a million cells taken from normal human tissues. The training data also included genome sequences and information on which parts of the genome are accessible and expressed.

This process resembles how systems like ChatGPT function. First, a training dataset helps the system discover underlying rules, which it then applies to new cases.

When trained on over 1.3 million human cells, the model accurately predicted gene activity in previously unseen cell types. Its predictions closely aligned with experimental data.
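
The idea of testing on previously unseen cell types can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two input features (a chromatin accessibility value and a sequence-derived score) and the three cell types are hypothetical, and a small linear model stands in for the researchers' far larger machine-learning model, whose details the article does not give. The sketch fits the model on two cell types and then checks how well its predictions correlate with the "measured" expression in a third type that was held out of training.

```python
import random

random.seed(0)  # fixed seed so the toy data are reproducible


def make_cells(n, access_range):
    """Synthetic (features, expression) pairs for one hypothetical cell type."""
    rows = []
    for _ in range(n):
        access = random.uniform(*access_range)  # toy chromatin accessibility
        motif = random.uniform(0.0, 1.0)        # toy sequence-derived score
        # One shared rule across all cell types, plus measurement noise.
        expr = 2.0 * access + 1.0 * motif + 0.5 + random.gauss(0.0, 0.1)
        rows.append(((access, motif), expr))
    return rows


def fit_linear(rows, lr=0.3, steps=2000):
    """Fit expr ~ w0*access + w1*motif + b by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(steps):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            err = w[0] * x[0] + w[1] * x[1] + b - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Train on two cell types; the third is never seen during fitting.
train = make_cells(100, (0.0, 0.5)) + make_cells(100, (0.2, 0.8))
held_out = make_cells(100, (0.5, 1.0))

weights, bias = fit_linear(train)
predicted = [weights[0] * x[0] + weights[1] * x[1] + bias for x, _ in held_out]
measured = [y for _, y in held_out]
held_out_corr = pearson(predicted, measured)
print(f"correlation on the held-out cell type: {held_out_corr:.3f}")
```

The toy model generalizes only because the same rule links features to expression in every synthetic cell type; the held-out evaluation is the step that would expose a model that had merely memorized its training cells.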

The researchers realized the AI’s potential when they tasked it with uncovering hidden biology in diseased cells, in one case an inherited form of childhood leukemia. The AI predicted that the mutations disrupt the interaction between two transcription factors that determine the fate of leukemic cells. Laboratory experiments confirmed the prediction.

Understanding the impact of these mutations helps reveal the specific mechanisms driving the disease. The new computational methods also allow researchers to explore the “dark matter” of the genome, meaning the regions that don’t encode known genes.

Most mutations found in cancer patients lie within these dark regions. Because they don’t affect protein function, they have received little attention. The researchers are now investigating various cancer types, from brain to blood cancers, and studying how cells transform as cancer develops.