Alibaba’s New AI Training Method Cuts Search Costs by Nearly 90%


  • Alibaba’s ZEROSEARCH employs a two-step simulation strategy to eliminate API costs
  • Independent evaluator Artificial Analysis ranks Qwen3-235B-A22B first in affordability among top models


Alibaba has introduced ZEROSEARCH, a groundbreaking approach to training large language models (LLMs) that significantly reduces the cost of teaching AI to perform search tasks. The method eliminates expensive API calls to commercial search engines by training models to simulate search behavior, cutting training expenses by nearly 90% and making advanced AI search functionality more accessible.

“By drastically reducing the costs involved in training LLMs to simulate search engine behavior, we are enabling developers and businesses—especially small and medium-sized enterprises—to independently develop their own reinforcement learning (RL) frameworks without costly search engine interactions,” said Huang Fei, Head of Alibaba’s Tongyi Natural Language Processing Lab. “ZEROSEARCH is a major milestone in democratizing large-scale RL technologies by enhancing affordability without compromising performance.”

Lowering Barriers for Developers and Businesses

Retrieving relevant information is crucial for improving LLMs’ reasoning and response accuracy. Traditional reinforcement learning (RL) approaches to teaching this skill require hundreds of thousands of interactions with live search engines through costly API requests, making training prohibitively expensive and limiting scalability. In addition, the inconsistent quality of results returned by search engines often undermines the effectiveness of training.

Simulating Search Effectively and Affordably

To address these challenges, Alibaba’s ZEROSEARCH employs a two-step simulation strategy that eliminates the need for costly API calls:

First, the team applied lightweight supervised fine-tuning to transform the LLM into a retrieval module capable of generating relevant documents in response to user queries, mimicking the behavior of real search engines.

Second, during the RL phase, researchers utilized a curriculum-based rollout strategy. This method progressively reduces the quality of simulated documents generated, challenging the model to adapt and continuously improve its performance.
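The article itself does not include code, but the curriculum-based rollout idea can be illustrated with a short sketch. The Python snippet below is an assumption-laden illustration, not Alibaba's implementation: the linear ramp in noise_probability, the prompt wording, and the sim_llm callable (standing in for the supervised fine-tuned retrieval module) are all hypothetical. It shows the core mechanic of replacing a paid search API with a simulated one whose document quality is deliberately degraded as RL training progresses.

import random

def noise_probability(step: int, total_steps: int,
                      p_start: float = 0.0, p_end: float = 0.75) -> float:
    # Fraction of deliberately low-quality documents; assumed linear ramp
    # from p_start at step 0 to p_end at the final training step.
    frac = min(step / max(total_steps, 1), 1.0)
    return p_start + frac * (p_end - p_start)

def simulated_search(query: str, step: int, total_steps: int, sim_llm) -> str:
    # sim_llm is a hypothetical callable wrapping the fine-tuned retrieval
    # module; it takes the place of a commercial search engine API call.
    quality = "noisy" if random.random() < noise_probability(step, total_steps) else "useful"
    prompt = f"Generate a {quality} search result document for the query: {query}"
    return sim_llm(prompt)

if __name__ == "__main__":
    # Stand-in model for demonstration only.
    def fake_sim_llm(prompt: str) -> str:
        return f"[simulated document for: {prompt}]"

    for step in (0, 500, 1000):
        doc = simulated_search("who introduced ZEROSEARCH?", step, 1000, fake_sim_llm)
        print(step, round(noise_probability(step, 1000), 2), doc)

In a training loop, documents generated this way would take the place of real search results, so the policy model learns to cope with progressively less reliable retrievals, which is the stated goal of the curriculum.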

In testing, models trained with ZEROSEARCH matched or exceeded the performance of those trained against real search engine APIs. A Qwen2.5-7B retrieval module delivered performance comparable to Google Search, while a larger 14B module surpassed it, all while reducing training costs by 88%.

Alibaba’s Continued Commitment to Open and Affordable AI

Beyond cost-efficient training, Alibaba has open-sourced multiple AI models across a range of sizes, languages, and modalities, helping global developers build custom AI solutions affordably.


Independent evaluations by Artificial Analysis, a widely recognized benchmarking organization for AI models and API providers, placed Alibaba’s latest LLM, Qwen3-235B-A22B, fifth overall in Intelligence (covering math, coding, reasoning, and science) and first in affordability, with pricing significantly below competing offerings.

Learn more about Alibaba’s AI updates.

