Artificial intelligence is rapidly changing the landscape of scientific discovery. In a groundbreaking move, a virtual scientific conference called Agents4Science 2025 allowed AI agents to take the lead on research tasks – from formulating hypotheses to analyzing data and even conducting peer reviews. This experiment aimed to test AI’s capabilities in scientific research while maintaining human oversight.
A New Approach to Scientific Collaboration
For the first time, a scientific conference welcomed paper submissions from any scientific field, but with one significant condition: AI had to do most of the work. The virtual event, held October 22, marked a radical departure from traditional scientific publishing.
The conference featured AI agents – systems that combine large language models with specialized tools and databases to perform multistep tasks. From generating research questions to analyzing data and providing initial peer reviews, these AI systems took the lead. Human researchers then stepped in to evaluate the most promising submissions.
In total, 48 papers out of 314 submissions advanced to the final stage. Each paper had to detail the specific ways humans and AI collaborated throughout the research and writing process.
“This represents an interesting paradigm shift,” explained James Zou, a computer scientist at Stanford University and co-organizer of the conference. “People are starting to explore using AI as a co-scientist.”
Pushing the Boundaries of AI in Science
Most scientific journals and conferences currently prohibit AI coauthors and restrict AI use by human reviewers. These policies aim to guard against known problems, such as AI systems generating inaccurate information (“hallucinations”).
However, these restrictions create a significant knowledge gap: we simply don’t know how capable AI actually is at scientific work. That’s exactly what Agents4Science aimed to explore; its organizers framed the conference as an experiment, with all materials made publicly available for study.
During the virtual meeting, human researchers presented AI-assisted work spanning diverse fields including economics, biology, and engineering.
Collaboration in Action
Economist Min Min Fong of the University of California, Berkeley, and her team collaborated with AI to study car-towing data from San Francisco. Their study found that waiving high towing fees helped low-income residents keep their vehicles.
“AI was really great at helping us with computational acceleration,” Fong noted. However, she emphasized the need for careful oversight: “you have to be really careful when working with AI.”
A concrete example emerged when the AI repeatedly cited the wrong date for when San Francisco’s fee-waiving rule took effect. Fong had to verify this information against the original source to correct the error. “The core scientific work still remains human-driven,” she concluded.
Expert Perspectives
Computational astrophysicist Risa Wechsler of Stanford University, who participated in the peer review process, offered a balanced perspective. While acknowledging the technical correctness of the papers, she expressed skepticism about AI’s current capabilities.
“The papers were technically correct,” Wechsler said, “but neither particularly interesting nor significant.” She expressed excitement about AI’s potential for research but remained unconvinced that current AI systems can “design robust scientific questions.” Additionally, she noted that AI’s technical capabilities can sometimes “mask poor scientific judgment.”
The Road Ahead
The Agents4Science conference represents a crucial step in understanding the evolving relationship between humans and AI in scientific research. Rather than replacing researchers, these AI agents appear to be functioning as powerful tools that can accelerate certain aspects of the scientific process.
However, human oversight remains essential, particularly for tasks requiring nuanced judgment, creative insight, and ethical consideration – areas where human researchers continue to play a vital role.
The experiment demonstrates both the potential and the limitations of current AI systems in scientific contexts. As AI capabilities continue to evolve, this collaborative approach may become increasingly common, fundamentally reshaping how scientific discovery is conducted.