Artificial intelligence in scientific research: developments, challenges, future

Artificial Intelligence (AI) is transforming scientific research: AI systems like AlphaFold, RoseTTAFold, and ESMFold have revolutionized protein structure prediction. Now, large language models (LLMs) like ChatGPT are democratizing scientific writing and coding. However, AI use also raises ethical concerns, including potential misuse and authorship issues.

The use of AI in the life sciences recently came into public focus with neural networks such as AlphaFold and RoseTTAFold, which can predict a protein’s three-dimensional structure from its amino acid sequence, a longstanding challenge in molecular biology. These tools are now accelerating many areas of science by expediting and simplifying hypothesis generation and verification in the wet lab. However, challenges remain, such as predicting protein interactions with RNA, DNA, or small molecules, as well as intrinsic disorder and conformational states.
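
To illustrate how accessible structure prediction has become, below is a minimal Python sketch that submits an amino acid sequence to the public ESMFold API of the ESM Metagenomic Atlas and saves the predicted structure as a PDB file. The endpoint URL reflects the service as documented at the time of writing, and the example sequence is purely illustrative.

    # Minimal sketch: sequence-to-structure prediction via the public
    # ESMFold API (assuming https://api.esmatlas.com/foldSequence/v1/pdb/
    # is still available).
    import requests

    # Illustrative example sequence; substitute any protein of interest.
    sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"

    # POST the raw amino acid sequence; the service returns the predicted
    # structure as PDB-formatted text.
    response = requests.post(
        "https://api.esmatlas.com/foldSequence/v1/pdb/",
        data=sequence,
        timeout=120,
    )
    response.raise_for_status()

    # Save the prediction for inspection in a structure viewer.
    with open("prediction.pdb", "w") as handle:
        handle.write(response.text)

The resulting prediction.pdb file can then be opened in any standard structure viewer, such as PyMOL or ChimeraX.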

With the release of LLMs like ChatGPT, the GPT code interpreter, and GitHub’s Copilot, the next wave of AI systems is now rolling towards us. LLMs are democratizing access to coding by providing an intuitive, language-based interface for interacting with code and data. They generate code snippets, explain complex coding concepts in plain language, assist with debugging, and help optimize code. However, they can make mistakes, especially with complex or unfamiliar coding tasks, requiring scientists to verify the generated code, for example with small hand-written tests such as the sketch below.
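
As a concrete illustration of such verification, the following Python sketch pairs a hypothetical LLM-generated helper function (gc_content is a stand-in example, not output from any particular model) with a few hand-written assertions against known answers, the kind of lightweight sanity check a scientist can run before trusting generated code.

    # Sketch: verifying LLM-generated code before use. The function below
    # stands in for a hypothetical ChatGPT-suggested helper; the assertions
    # act as a small hand-written sanity-check suite.

    def gc_content(sequence: str) -> float:
        """Return the GC fraction of a DNA sequence (LLM-generated stand-in)."""
        sequence = sequence.upper()
        if not sequence:
            raise ValueError("empty sequence")
        return (sequence.count("G") + sequence.count("C")) / len(sequence)

    # Checks against known answers; if the generated code were wrong,
    # these assertions would fail immediately.
    assert gc_content("GGCC") == 1.0
    assert gc_content("ATAT") == 0.0
    assert abs(gc_content("ATGC") - 0.5) < 1e-9
    print("all checks passed")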

LLMs can also overcome language barriers and assist researchers who are less proficient in English in effectively conveying their findings, methods, and conclusions, thereby facilitating better global scientific communication. For native speakers, LLMs can increase efficiency by generating coherent and contextually relevant text, reducing the time and effort required to produce a research paper.

It is often argued that using AI to generate scientific texts raises ethical concerns, including potential misuse and authorship issues. However, future rulemaking should consider that LLMs are tools used by human researchers. The user should therefore be ultimately responsible for any text generated with the help of an AI, including ensuring that the text is accurate, meaningful, and not plagiarized.

Future AI tools will have access to the world’s scientific literature, and it is likely that at some point these models will be capable of hypothesis generation and experimental design. Such systems could reduce the cognitive biases that influence human-driven research. However, this development will likely also shift the role of scientists from generating hypotheses to primarily falsifying machine-generated hypotheses. Moreover, review articles will likely lose value, as they can be generated instantly and on demand.

In summary, the benefits of AI-driven research will be significant, but these tools will likely also transform the scientific process substantially. As science is always at the forefront of innovation, we cannot ignore these advancements. Now is the time to embrace AI technology and establish guidelines that maintain the integrity of science without obstructing its potential benefits.

The text was written by Jens Bosse with the support of ChatGPT 4. The conversation he had with ChatGPT can be read here. However, the opinions and conclusions expressed in this text are his alone.