Three chips, tailor-made by NVIDIA

NVIDIA takes "meeting customer needs" to new heights.

According to reports, the world's largest supplier of graphics chips, NVIDIA, will mass-produce three streamlined versions of artificial intelligence (AI) chips for mainland Chinese customers in the second quarter of this year.

The California-based company originally planned to launch three AI chips—H20, L20, and L2—for the Chinese market in November last year. However, the launch was postponed to 2024 as the company wanted to check if these chips complied with U.S. export controls.

According to an article published on the Canadian tech news website Wccftech, these three chips fully comply with U.S. export policies and will be produced in the second quarter of this year.

The report states that the first batch of H20 chips may be delivered to customers in the mid-to-late second quarter of 2024.

"In terms of parameters, the performance density and computing power of the H20 are in line with U.S. export policies," said a Chinese author from a Shenzhen company in an article published on Tuesday. He stated that in FP8 Tensor Core operations, the H20's speed is 296 trillion floating-point operations per second (teraflops or tflops), while the H100 is at 1979 tflops, and the H200 is at 3958 tflops. The H200 is the world's most powerful AI chip, 13 times faster than the H20.

At the same time, published reports suggest that the H20 is a carefully balanced design. SemiAnalysis analyst Dylan Patel wrote in an article published in November last year that the H20 is actually more than 20% faster than the H100 at large language model (LLM) inference, the task of generating content with models trained on very large datasets. He said that while the H100's peak speed is 6.68 times that of the H20, any performance comparison should also take into account MFU (model FLOPs utilization), that is, the share of peak throughput a chip actually achieves.

Since the H100's MFU is only 38.1%, while the H20 can reach 90%, the H20's performance in an actual multi-GPU interconnect environment is close to 50% of the H100.
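The comparison above comes down to simple arithmetic, sketched here in Python. The throughput and MFU figures are the ones quoted in this article; the helper names are illustrative, and multiplying peak throughput by MFU is a simplifying assumption, not necessarily how the analysts arrived at their numbers. Under that assumption the H20 comes out at roughly 35% of the H100's sustained throughput, lower than the "close to 50%" quoted above, so the quoted figure presumably reflects factors this sketch does not model.

```python
# Peak FP8 Tensor Core throughput (teraflops) and MFU values quoted in the article.
PEAK_FP8_TFLOPS = {"H20": 296.0, "H100": 1979.0, "H200": 3958.0}
MFU = {"H20": 0.90, "H100": 0.381}


def peak_ratio(chip: str, baseline: str = "H20") -> float:
    """How many times faster `chip` is than `baseline` on paper."""
    return PEAK_FP8_TFLOPS[chip] / PEAK_FP8_TFLOPS[baseline]


def effective_tflops(chip: str) -> float:
    """Assumed sustained throughput: peak throughput scaled by MFU."""
    return PEAK_FP8_TFLOPS[chip] * MFU[chip]


if __name__ == "__main__":
    print(f"H100 vs H20 (peak): {peak_ratio('H100'):.2f}x")
    print(f"H200 vs H20 (peak): {peak_ratio('H200'):.2f}x")
    share = effective_tflops("H20") / effective_tflops("H100")
    print(f"H20 as a share of H100 after MFU: {share:.1%}")
```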

Other technical experts have noted that the H20 has an advantage in power consumption: its thermal design power is 400 watts, lower than the H100's 700 watts.

Chinese Market

The story dates back to August 2022, when the Biden administration prohibited the export of Nvidia's A100 and H100 chips, as well as AMD's MI250, to Mainland China and Russia because of their high interconnect bandwidth of 600 GB per second or more.

To serve the Mainland China market, Nvidia subsequently launched the A800 and H800 processors, with interconnect bandwidths of 400 GB and 300 GB per second, respectively. IT experts have stated that the performance of the A800 and H800 is approximately 70% that of the A100 and H100.

On October 17, 2023, the Bureau of Industry and Security (BIS) of the U.S. Department of Commerce announced that it would use "performance" and "performance density" as new parameters to classify restricted chips. According to the new regulations, Nvidia's A800, H800, L40, L40S, and RTX 4090 chips are prohibited from being shipped to Mainland China. Nvidia hopes to fill the resulting gap by shipping the H20 to the country.

Some analysts have indicated that if the new chip can achieve 50% of the H100's speed while consuming 43% less power (400 watts versus 700 watts), it may be attractive to Chinese customers.

"Although the H20's computing power is lower than the H100's, it will be more affordable and support Nvidia's special features, such as NVLink and the CUDA platform," said Ming-Chi Kuo, a technology analyst at TF International Securities Group Limited, headquartered in Hong Kong. "Chinese customers remain highly interested in the H20 chip."

Latest Special Specification Graphics Card

Nvidia recently released a special-specification graphics card for China, the GeForce RTX 4090 D, built on the AD102-250 GPU. It replaces the flagship GeForce RTX 4090, which is restricted from export.

On October 17, 2023, the United States imposed strict restrictions on the export of AI-related chips and semiconductor manufacturing equipment to Mainland China, which restricted sales there of Nvidia's high-end gaming graphics card, the GeForce RTX 4090. In response, Nvidia decided to develop a customized GeForce RTX 4090 D, which meets U.S. export control requirements by reducing some specifications.

To comply with the latest U.S. export controls on AI chips bound for China, the RTX 4090 D must stay below a total processing performance (TPP) limit of 4800. The TPP of the RTX 4090, whether in FP8 or FP16, is 5286, about 10% above the limit. To get under it, NVIDIA would normally cut specifications from the RTX 4090, most directly by reducing the number of SMs, Tensor cores, and CUDA cores. At the same time, the RTX 4090 D must stay clearly ahead of the RTX 4080 SUPER.
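The size of that margin is easy to check. The following is a minimal sketch that uses only the two TPP figures quoted above; the function names are illustrative, and how TPP itself is derived from a chip's specifications is not modeled here.

```python
# TPP figures quoted in the article.
TPP_LIMIT = 4800.0
RTX_4090_TPP = 5286.0


def excess_over_limit(tpp: float, limit: float = TPP_LIMIT) -> float:
    """Fraction by which `tpp` exceeds `limit` (0.10 means 10% over)."""
    return tpp / limit - 1.0


def required_reduction(tpp: float, limit: float = TPP_LIMIT) -> float:
    """Fraction of its own TPP a chip must shed to come down to `limit`."""
    return max(0.0, 1.0 - limit / tpp)


if __name__ == "__main__":
    print(f"RTX 4090 exceeds the limit by {excess_over_limit(RTX_4090_TPP):.1%}")
    print(f"Reduction needed to reach the limit: {required_reduction(RTX_4090_TPP):.1%}")
```

The two percentages differ because one is measured against the limit and the other against the RTX 4090's own figure: a chip that is about 10% over the limit needs a cut of a little over 9% to come down to it.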

NVIDIA has stated that it will continue to fully comply with U.S. regulations. NVIDIA also serves Mainland Chinese customers in Singapore, including ByteDance, Tencent's international cloud business, and Alibaba Group. A company filing shows that NVIDIA's sales to customers in Singapore (including Chinese enterprises) account for about 15% of total revenue.

Both the RTX 4090 D and the RTX 4090 use TSMC's N4 process. If approved by the U.S. Department of Commerce, the RTX 4090 D could become the key to NVIDIA's market turnaround in China.

The advantage over Chinese competitors is narrowing. The H20 still has advantages over domestic Chinese AI chips in terms of performance and efficiency, but this advantage is diminishing. With policy and financial support, many domestic Chinese chip manufacturers are growing rapidly and may one day break NVIDIA's monopoly in the AI chip market.

In fact, some Mainland Chinese technology companies have already turned to using local chips.

The pressure on NVIDIA, however, shows little sign of easing. On December 2, 2023, U.S. Commerce Secretary Gina Raimondo stated at a forum that if any U.S. company redesigns its chips around a specific cut-off line in a way that enables Chinese companies to do AI, the U.S. government will control it the very next day.

It cannot be ruled out that if the H20's performance does indeed reach 50% of the H100's, Raimondo may further tighten export rules.