The AI-Researcher project, officially titled “AI-Researcher: Autonomous Scientific Innovation,” is a system for automated scientific discovery developed by HKUDS. It aims to automate the research process end to end, changing how research is traditionally carried out.

The work was accepted as a Spotlight paper at NeurIPS 2025. AI-Researcher is designed to run with full autonomy, orchestrating the complete research workflow from initial concept development through to final publication.

Core Capabilities and the Comprehensive Research Ecosystem

AI-Researcher uses a set of AI agents to cover the entire scientific lifecycle. It integrates six primary research functions:

  1. Literature Review: Conducts comprehensive analysis and synthesis of existing research.
  2. Idea Generation: Systematically gathers, organizes, and formulates novel research directions.
  3. Algorithm Design and Implementation: Develops methodologies and transforms abstract ideas into functional implementations.
  4. Algorithm Validation and Refinement: Automates testing, performance evaluation, and iterative optimization.
  5. Result Analysis: Delivers advanced interpretation of experimental data and insights.
  6. Manuscript Creation: Automatically generates polished, full-length academic papers.

The End-to-End Scientific Automation System

The AI-Researcher pipeline automates the research lifecycle in three phases, with the goal of removing manual intervention at every step.

Phase 1: Literature Review & Idea Generation 📚💡

This phase focuses on identifying research needs and formulating novel concepts:

  • Resource Collector: Systematically gathers comprehensive research materials across multiple scientific domains, utilizing automated collection from major academic databases (such as arXiv, IEEE Xplore, ACM Digital Library, and Google Scholar), code platforms (like GitHub and Hugging Face), and open datasets.
  • Resource Filter: Evaluates and selects high-impact resources—including quality papers, well-maintained code implementations, and benchmark datasets—based on metrics like citation count, code maintenance, and data completeness.
  • Idea Generator: Formulates novel research directions by leveraging the identified resources, comprehensively analyzing current methodological limitations, mapping emerging technological trends, and exploring uncharted research territories.
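
The Resource Filter step above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Resource fields, the threshold values, and the function names are hypothetical and are not taken from the AI-Researcher codebase. It only shows how the three metrics named above (citation count, code maintenance, data completeness) could each gate one kind of resource.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """A candidate resource. All field names are illustrative assumptions."""
    name: str
    kind: str  # "paper", "code", or "dataset"
    citation_count: int = 0                       # used for papers
    days_since_last_commit: Optional[int] = None  # used for code
    data_completeness: float = 0.0                # used for datasets, in [0, 1]


def passes_filter(
    r: Resource,
    min_citations: int = 50,        # hypothetical threshold
    max_stale_days: int = 365,      # hypothetical threshold
    min_completeness: float = 0.9,  # hypothetical threshold
) -> bool:
    """Apply the one metric that matches the resource kind."""
    if r.kind == "paper":
        return r.citation_count >= min_citations
    if r.kind == "code":
        return (
            r.days_since_last_commit is not None
            and r.days_since_last_commit <= max_stale_days
        )
    if r.kind == "dataset":
        return r.data_completeness >= min_completeness
    return False  # unknown kinds are dropped


def filter_resources(resources: List[Resource]) -> List[Resource]:
    """Keep only the resources that pass their kind-specific check."""
    return [r for r in resources if passes_filter(r)]


candidates = [
    Resource("paper-a", "paper", citation_count=120),
    Resource("paper-b", "paper", citation_count=3),
    Resource("repo-a", "code", days_since_last_commit=30),
    Resource("repo-b", "code", days_since_last_commit=900),
    Resource("dataset-a", "dataset", data_completeness=0.97),
]
selected = [r.name for r in filter_resources(candidates)]
print(selected)
```

A real filter would likely combine several signals per resource rather than one hard threshold each; the single-metric check is used here only to keep the sketch short.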

Phase 2: Algorithm Design, Implementation & Validation 🧪💻

This phase follows a strict iterative cycle (Design → Implementation → Validation → Refinement):

  • Design Phase: Conceptual development of novel algorithmic ideas, establishing theoretical foundations, and careful planning of the implementation strategy.
  • Implementation Phase: Transforming abstract concepts into concrete code implementations, developing functional modules, and establishing a robust testing environment.
  • Validation Phase: Executing comprehensive, systematic tests to evaluate algorithm performance, collect metrics, and document findings to ensure rigorous implementation verification.
  • Refinement Phase: An iterative cycle involving identifying areas for improvement, optimizing code efficiency, and implementing necessary enhancements based on validation results.
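
The Design → Implementation → Validation → Refinement cycle can be sketched as a simple control loop. This is a minimal illustration under stated assumptions, not the project's actual implementation: the four stages are passed in as plain callables, and research_loop, target_score, and max_rounds are hypothetical names introduced here.

```python
from typing import Any, Callable, List, Tuple


def research_loop(
    design: Callable[[], Any],
    implement: Callable[[Any], Any],
    validate: Callable[[Any], int],
    refine: Callable[[Any, int], Any],
    target_score: int,
    max_rounds: int = 5,
) -> Tuple[Any, List[int]]:
    """Run design once, then repeat implement -> validate -> refine.

    Stops when validation reaches target_score or max_rounds is exhausted.
    Returns the final plan and the validation score of every round.
    """
    plan = design()
    history: List[int] = []
    for _ in range(max_rounds):
        artifact = implement(plan)   # turn the plan into something testable
        score = validate(artifact)   # measure how well it performs
        history.append(score)
        if score >= target_score:
            break
        plan = refine(plan, score)   # adjust the plan using the result
    return plan, history


# Toy stages: the "plan" is an integer and each refinement increases it by one.
final_plan, scores = research_loop(
    design=lambda: 1,
    implement=lambda plan: plan,
    validate=lambda artifact: artifact * 10,
    refine=lambda plan, score: plan + 1,
    target_score=30,
)
print(final_plan, scores)
```

Bounding the loop with max_rounds is a design choice of this sketch: it guarantees termination even when validation never reaches the target.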

Phase 3: Paper Writing ✍️📝

The final stage is handled by the Writer Agent. This agent automatically generates full-length academic papers by integrating the research ideas, motivations, newly designed algorithm frameworks, and the algorithm validation results. It uses a hierarchical writing approach to produce a polished manuscript.

Flexible User Input Mechanisms

The AI-Researcher system is designed to accept user input queries at two distinct levels:

  1. Level 1: Detailed Idea Description: Users provide a comprehensive description of a specific research idea. The system processes this input and develops an implementation strategy that follows the user’s explicit requirements. Level 1 tasks involve extensive surveying and experiments.
  2. Level 2: Reference-Based Ideation: This level is simpler for the user, who submits reference papers without a specific idea. The system analyzes the references to generate and develop novel research concepts, following a typical query format: “I have some reference papers, please come up with an innovative idea and implement it with these papers”.
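
The difference between the two input levels can be sketched as follows. This is an illustrative assumption about how a query might be assembled, not the system's actual request format: build_query and the returned dictionary keys are hypothetical. The Level 2 prompt text is the typical query quoted above.

```python
from typing import Dict, List, Optional, Union

# Typical Level 2 query, as quoted in the text above.
LEVEL2_PROMPT = (
    "I have some reference papers, please come up with an innovative idea "
    "and implement it with these papers"
)

Query = Dict[str, Union[int, str, List[str]]]


def build_query(references: List[str], idea: Optional[str] = None) -> Query:
    """Build a Level 1 query when an idea is given, otherwise a Level 2 query."""
    if idea is not None and idea.strip():
        # Level 1: the user's own detailed idea drives the run.
        return {"level": 1, "prompt": idea.strip(), "references": list(references)}
    if not references:
        # Level 2 has nothing to work from without reference papers.
        raise ValueError("Level 2 queries need at least one reference paper")
    return {"level": 2, "prompt": LEVEL2_PROMPT, "references": list(references)}


level1 = build_query(["Paper A"], idea="Combine method X with loss Y.")
level2 = build_query(["Paper A", "Paper B"])
print(level1["level"], level2["level"])
```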

Rigorous Evaluation: The Comprehensive Benchmark Suite

To objectively assess the academic capabilities of AI research agents and the quality of the scholarly work they produce, the developers created a Comprehensive Benchmark Suite.

Key innovations ensuring thorough and reliable evaluation include:

  • Expert-Level Ground Truth: The benchmark leverages human expert-written papers as ground truth references, setting a high-quality standard for validation.
  • Multi-Domain Coverage: The benchmark spans four major research domains: Computer Vision (CV), Natural Language Processing (NLP), Data Mining (DM), and Information Retrieval (IR).
  • Transparency in Evaluation: The methodology for building the benchmark is fully open-sourced, including processed datasets, data collection pipelines, and processing code.
  • Evaluator Agents: Specialized Evaluator Agents assess each work across multiple dimensions, with tasks organized into two levels according to how much of the idea is provided up front.

The key evaluation metrics utilized are:

  1. Novelty: Assessing the innovation and uniqueness of the work.
  2. Experimental Comprehensiveness: Evaluating the design, execution, and rigor of the experiments.
  3. Theoretical Foundation: Measuring the strength of the theoretical background.
  4. Result Analysis: Analyzing the depth and accuracy of result interpretation.
  5. Writing Quality: Reviewing the clarity, coherence, and structure of the written report.
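
As a rough illustration of how ratings on these five dimensions could be combined, the sketch below averages per-metric ratings from several evaluators. The source does not specify a rating scale or an aggregation rule, so the numeric ratings, the unweighted mean, and the names METRICS and aggregate are all assumptions made for this example.

```python
from typing import Dict, List

# The five dimensions listed above, written as identifier-style keys.
METRICS = (
    "novelty",
    "experimental_comprehensiveness",
    "theoretical_foundation",
    "result_analysis",
    "writing_quality",
)


def aggregate(ratings: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over all evaluators, then add an overall mean."""
    if not ratings:
        raise ValueError("at least one evaluator rating is required")
    for rating in ratings:
        missing = [m for m in METRICS if m not in rating]
        if missing:
            raise KeyError(f"rating is missing metrics: {missing}")
    summary = {m: sum(r[m] for r in ratings) / len(ratings) for m in METRICS}
    # Unweighted mean across dimensions (an assumption of this sketch).
    summary["overall"] = sum(summary[m] for m in METRICS) / len(METRICS)
    return summary


evaluator_a = {m: 2.0 for m in METRICS}
evaluator_a["novelty"] = 4.0
evaluator_b = {m: 2.0 for m in METRICS}
summary = aggregate([evaluator_a, evaluator_b])
print(summary)
```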

Accessibility and Technical Deployment

AI-Researcher is intended to be an easy-to-use AI research assistant.

The platform offers several features to minimize technical barriers:

  • Minimal Domain Expertise Needed: The tool simplifies the research process by autonomously identifying critical research gaps, proposing innovative approaches, and executing the entire pipeline.
  • Effortless Research Kickoff: Researchers can kickstart their journey simply by providing a list of relevant papers, eliminating the need for complex initial ideas or configurations.
  • Multi-LLM Provider Support: The system integrates with leading language model providers such as Claude, OpenAI, DeepSeek, and others, allowing researchers to select suitable AI capabilities.
  • Deployment: A Web GUI (released in May 2025) is available. For a custom installation, researchers can manage packages with uv or install via Docker using the provided image tjbtech1/airesearcher:v1t.

The environment configuration allows users to set various parameters, including the completion model (COMPLETION_MODEL), container configuration, and task-specific details such as the research category (e.g., vq, gnn, recommendation) and instance ID.
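
A minimal sketch of reading such settings from the environment follows. Only COMPLETION_MODEL is a variable name taken from the text above. CATEGORY and INSTANCE_ID are hypothetical names chosen for this example, as are RunConfig, load_config, and the placeholder values, so the project's own configuration remains the authority on the real keys.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RunConfig:
    completion_model: str  # from COMPLETION_MODEL (named in the text)
    category: str          # e.g. "vq", "gnn", "recommendation"
    instance_id: str       # identifies the task instance


def load_config(env: Mapping[str, str] = os.environ) -> RunConfig:
    """Read the run settings from a mapping of environment variables.

    CATEGORY and INSTANCE_ID are assumed names, used here for illustration.
    """
    required = ("COMPLETION_MODEL", "CATEGORY", "INSTANCE_ID")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    return RunConfig(
        completion_model=env["COMPLETION_MODEL"],
        category=env["CATEGORY"],
        instance_id=env["INSTANCE_ID"],
    )


# Passing a plain dict keeps the example independent of the real environment.
example_env = {
    "COMPLETION_MODEL": "example-provider/example-model",  # placeholder value
    "CATEGORY": "vq",
    "INSTANCE_ID": "example_instance",  # placeholder value
}
config = load_config(example_env)
print(config)
```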