Alibaba CEO: AI's True Potential Goes Beyond Screens, Transforming the Physical World
The 2024 APSARA Conference (云栖大会), a major annual event for China's cloud computing industry organized by Alibaba, opened on September 19 in Hangzhou.
The APSARA Conference originated in 2009 as the First China Website Development Forum. In 2011, it evolved into the Alibaba Cloud Developer Conference, and in 2015 it was officially renamed the "APSARA Conference". To date, it has been held 15 consecutive times, leading and witnessing the key moments of three successive waves of cloud computing in China.
Eddie Wu (吴泳铭), Alibaba Group CEO and Chairman and CEO of Alibaba Cloud Intelligence Group, delivered a keynote speech at the conference.
He believes that over the past 22 months, the pace of AI development has surpassed any period in history, but we are still in the early stages of AGI transformation. The greatest potential of generative AI is not in creating one or two new super apps on mobile screens but in taking over the digital world and transforming the physical world.
Wu's key points are as follows:
The speed of AI development has already surpassed any historical period, but we are still in the early stages of AGI transformation.
The investment threshold for advanced models in the next phase will be in the tens or hundreds of billions of dollars.
The greatest potential of generative AI is not in creating new super apps on mobile screens but in taking over the digital world and transforming the physical world.
Robotics will be the next industry to undergo massive changes. In the future, all movable objects will become intelligent robots.
In the future, almost all software and hardware will have inference capabilities. Their computing cores will shift to a model where GPU AI computing power is primary, supplemented by traditional CPU computing.
Over the past year, Alibaba Cloud has invested heavily in new AI computing power but still cannot meet the strong demand from customers.
People often overestimate new technological revolutions in the short term and underestimate them in the long term. These technologies grow amid skepticism, and those who hesitate may miss the major trends.
The full text of the speech is as follows:
Welcome to the 2024 APSARA Conference. This past summer, Alibaba Cloud fully supported the Paris Olympics and achieved a historic breakthrough: for the first time, cloud computing surpassed satellites to become the main broadcasting method of the Olympics. AI was also widely used at the Olympics for the first time. Today, the focus of the Conference is also AI. I mainly want to share three points:
First, over the past 22 months, the speed of AI development has surpassed any period in history, but we are still in the early stages of AGI transformation.
Large model technology is rapidly iterating, and technological usability has greatly improved. Large models now possess multi-modal capabilities in text, speech, and vision, and can begin to execute complex instructions. Last year, large models could only help programmers write simple code; today, they can directly understand requirements and complete complex programming tasks. Last year, the mathematical abilities of large models were only at the level of middle school students; today, they have reached the level of international Olympiad gold medalists and are approaching doctoral levels in many disciplines such as physics, chemistry, and biology.
At the same time, the cost of model inference has fallen exponentially, far outpacing Moore's Law. Over the past year, the price of calling the Tongyi Qianwen API on Alibaba Cloud's "Bailian" platform has dropped by 97%. The minimum cost of calling one million tokens has dropped to 0.5 yuan. Inference cost is a key factor in whether applications can take off, and Alibaba Cloud will strive to keep lowering costs.
The open-source ecosystem is flourishing. In June this year, Tongyi Qianwen open-sourced Qwen2, which quickly topped Hugging Face's global open-source model rankings. On Hugging Face, the number of Qwen native and derivative models is close to 50,000, ranking second globally. Alibaba Cloud's ModelScope community has over 10,000 models, serving more than 6.9 million developers.
All of this is just the beginning. To achieve true AGI, the next generation of models needs larger scale, more general and more broadly generalizable knowledge systems, and more complex, multi-layered logical reasoning capabilities. The investment threshold for competing in advanced models globally will reach tens or hundreds of billions of dollars. The path for AI to gain creative capabilities and help humans solve complex problems is clearly visible, which also opens up the possibility of widespread application of AI across industry scenarios.
Second, the greatest potential of AI is not on mobile screens but in taking over the digital world and transforming the physical world.
Today, many industry insiders are pondering what the biggest application of AI is, often thinking about what innovative super apps can emerge on mobile phones in the AI era. But we believe that the greatest potential of AI is definitely not on mobile screens. The greatest potential lies in penetrating the digital world, taking over the digital world, and transforming the physical world.
We cannot view the future solely from the perspective of mobile internet. Generative AI's greatest potential is not in creating new super apps on mobile screens but in taking over the digital world and transforming the physical world.
Over the past thirty years, the essence of the internet wave has been connection. The internet connected people, information, commerce, and factories, improving the world's collaboration efficiency, creating enormous value, and changing how people live. Generative AI, by contrast, creates new value by supplying productivity itself, raising the overall productivity of the entire world. The value created could be ten or even dozens of times greater than the connectivity value of the mobile internet.
We believe that generative AI will gradually penetrate and take over the digital world, and most things in the physical world will have AI capabilities, forming a new generation of products with AI abilities, and generating synergistic effects by connecting with the cloud-based AI-driven digital world.
For a long time, the focus of AI has mainly been on simulating human perception abilities, such as natural language understanding, speech recognition, and visual recognition. But the rise of generative AI has brought a qualitative leap. AI is no longer limited to perception but has, for the first time, demonstrated the power of thinking, reasoning, and creation.
Generative AI has given the world a unified language: tokens. These can be any text, code, image, video, sound, or human thoughts accumulated over thousands of years. AI models can tokenize data from the physical world to understand all aspects of the real world, such as how humans walk, run, drive vehicles, use tools, paint, compose, write, express themselves, teach, program, and even start companies. After understanding, AI can mimic humans to perform tasks in the physical world. This will bring about a new industrial revolution.
We are seeing such transformations in the automotive industry. Earlier autonomous driving technology relied on hand-written algorithmic rules, and even hundreds of thousands of lines of code could not cover all driving scenarios. With "end-to-end" large model training, AI models learn directly from massive amounts of human driving visual data, giving cars driving capabilities that surpass most drivers.
Robotics will be the next industry to undergo massive changes. In the future, all movable objects will become intelligent robots. They could be robotic arms in factories, cranes on construction sites, porters in warehouses, firefighters at fire scenes, and even pet dogs, nannies, or assistants in households.
In the future, factories will have many robots producing robots under the command of AI large models. Currently, every urban family might have one or two cars; in the future, each family may have two or three robots to help enhance efficiency in daily life.
It is foreseeable that the AI-driven digital world, connected with the physical world equipped with AI capabilities, will significantly boost the productivity of the entire world and have a revolutionary impact on the operational efficiency of the physical world.
Third, AI computing is accelerating its evolution, becoming the dominant computing architecture.
Whether we look at edge computing or the cloud, this is a very evident trend. The reconstruction of the digital and physical worlds by generative AI will bring fundamental changes to computing architecture. The CPU-dominated computing system of the past few decades is rapidly shifting to a GPU-dominated AI computing system. In the future, almost all software and hardware will have inference capabilities. Their computing cores will shift to a model where GPU AI computing power is primary, supplemented by traditional CPU computing.
We observe that in the new computing power market, over 50% of new demand is driven by AI, and AI computing power demand has become mainstream. This trend will continue to expand. Over the past year, Alibaba Cloud has invested heavily in new AI computing power but still cannot meet the strong demand from customers.
Today, nearly all the customers, developers, and CTOs we interact with are rebuilding their products with AI. Much of the new demand is driven by GPU computing power, and many existing applications are being rewritten to run on GPUs. In industries such as automotive, biomedicine, industrial simulation, weather forecasting, education, enterprise software, mobile apps, and gaming, AI computing is accelerating its penetration. Across all industries, an unseen new industrial revolution is quietly unfolding.
All industries require infrastructure with stronger performance, larger scale, and better adaptation to AI needs.
Alibaba Cloud is investing in AI technology research and development and infrastructure construction with unprecedented intensity. Our single-network cluster has expanded to the level of 100,000 GPUs, and we are rebuilding future-oriented advanced AI infrastructure across chips, servers, networks, storage, cooling, power supply, and data centers.
History shows that people often overestimate new technological revolutions in the short term and underestimate them in the long term. In the early stages of a new technology, penetration rates are relatively low and people have never experienced anything like it, so it is normal for most people to be instinctively skeptical. But new technological revolutions grow amid people's doubts, and many miss out because they hesitate.
Standing at the dawn of the AI era, I feel incredibly excited. Today, we have invited entrepreneurs and scientists from three fields: large models, autonomous driving, and robotics. They are racing to reconstruct our world with AI, and we look forward to what they have to share.
Thank you all. I hope you have a fulfilling and enjoyable APSARA Conference.