AI company Cerebras Systems aims to accelerate machine learning by building a network of interconnected supercomputers with exceptional performance.

Expanding AI Training Capacity in the Cloud

Cerebras CEO Andrew Feldman has announced his company's latest achievement: an AI supercomputer that can operate at two exaflops. The system, called Condor Galaxy 1, is planned to double in size within the next 12 weeks. By early 2024, it will be joined by two more networks of the same doubled size. The Silicon Valley company aims to keep adding Condor Galaxy installations next year until it operates a network of nine supercomputers with a combined capability of 36 exaflops.
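The capacity figures above are internally consistent; a quick sketch in Python (using only the numbers reported in the article) checks the arithmetic:

```python
# Capacity figures as reported in the article (exaflops of AI compute).
cg1_initial = 2.0            # Condor Galaxy 1 at announcement
cg1_full = cg1_initial * 2   # after the planned doubling in 12 weeks

planned_systems = 9          # supercomputers in the eventual network
network_total = 36.0         # total claimed capability

per_system = network_total / planned_systems
print(cg1_full, per_system)  # 4.0 4.0 -> each system matches a fully built-out CG-1
```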

Other AI-focused computer makers plan to build massive systems around their own specialized processors or Nvidia's latest H100 GPU. Feldman claims that, measured against these systems' size and capability, Condor Galaxy 1 is the largest of them all.

The company says its most significant advantage in building large AI supercomputers is its ability to scale up resources: training a 40-billion-parameter network takes about the same time as training a 1-billion-parameter one if 40 times more hardware is devoted to it, without requiring additional lines of code. Linear scaling is notoriously troublesome because large neural networks are difficult to divide up efficiently, but the company is confident it can scale linearly from 1 to 32 systems with a keystroke.
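The linear-scaling claim amounts to simple arithmetic: if hardware grows in proportion to model size, wall-clock training time stays roughly flat. A toy illustration with hypothetical numbers (not Cerebras benchmarks), assuming the idealized case of perfect partitioning:

```python
def training_time(params_billion, num_systems, throughput_per_system=1.0):
    """Idealized linear scaling: time ~ work / (systems * throughput).
    Real clusters rarely achieve this because partitioning a large
    network across machines adds communication overhead."""
    work = params_billion  # crude proxy for total training compute
    return work / (num_systems * throughput_per_system)

t_small = training_time(params_billion=1, num_systems=1)
t_large = training_time(params_billion=40, num_systems=40)
print(t_small == t_large)  # True: 40x the model, 40x the hardware, same time
```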

Training large neural network models has become an increasing trend, with the number of companies training them growing from 2 in 2021 to more than 100 in 2023. Aside from Cerebras, companies training models with 50 billion or more parameters include Google, Microsoft, Amazon, and Meta. The industry is largely dominated by computer clusters built around Nvidia GPUs, although some of these companies have also created their own AI silicon, as with Google's TPU series and Amazon's Trainium. Meanwhile, startups such as SambaNova and Habana compete with Cerebras by making their own AI accelerators and computers.


Features of the Condor Galaxy 1 Series

In 2022, Cerebras noticed an exponential increase in the size and computational demands of large language models (LLMs). Recognizing that LLMs will likely be the single largest opportunity in AI, the company manages and operates the Condor Galaxy 1 series, which is made available through the Cerebras Cloud.

The Condor Galaxy 1 series is owned by G42, an Abu Dhabi-based company with nine AI-based enterprises, including G42 Cloud, the largest cloud-computing provider in the Middle East. G42 is not tasked with operating the supercomputers; Cerebras runs them and can rent out resources that G42 does not use for its internal work.

The supercomputer was assembled and started up within ten days and features 32 Cerebras CS-2 computers, set to expand to 64. At the heart of each CS-2 is the Waferscale Engine-2, an AI-specific processor built on a single silicon wafer with 2.6 trillion transistors and hundreds of thousands of AI cores.
