CnTechPost

Alibaba's Hanguang 800 AI processor has been put to use, delivering 4-11x better performance than GPUs

By Phate Zhang
Sep 19, 2020 at 9:58 PM UTC

A year ago, Alibaba unveiled its first AI chip, the Hanguang 800, billed as the most powerful AI inference chip at the time. The company recently shared an update on the chip's progress, nearly a year after its launch.

Long Xin, director of heterogeneous computing product development at Aliyun (Alibaba Cloud), said that Hanguang 800 NPU instances are now officially available to external customers and can be purchased on Aliyun without whitelist approval.

The Hanguang 800 chip itself is not sold to the public; its computing power is offered through Aliyun.

Each instance supports up to 8 NPU cores and 96 vCPUs, 384 GB of memory, and network bandwidth of up to 30 Gbit/s, Long Xin said. The instances mainly target inference acceleration for CNN-type models in data centers, serving workloads such as City Brain and image and video content review.

Long Xin said the Hanguang 800's hardware has three main characteristics:

Deep optimization of CNNs and computer vision algorithms.

Accelerated convolution and matrix multiplication, with support for transposed convolution (deconvolution), dilated (atrous) convolution, 3D convolution, interpolation, and ROI operations.

Optimization for ResNet-50, SSD/DSSD, Faster-RCNN, Mask-RCNN, DeepLab, and other models.

High energy efficiency and low latency.

High density compute and storage, greatly reducing I/O requirements.

Hardware-software co-design supports sparse compression of weights and quantized compression of computations.

The instruction set supports programmable model extensions.

In addition to INT8/INT16 quantization acceleration, the chip also covers FP16/BFP16 vector calculations that directly accelerate activation functions such as ReLU, Sigmoid, and Tanh, and it can support new activation functions in the future.
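For readers unfamiliar with INT8 quantization, the following is a minimal generic sketch of symmetric per-tensor quantization in plain Python. It is an illustration only and makes no claim about how the Hanguang 800 or Aliyun's toolchain quantizes models; the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Generic textbook scheme, NOT Alibaba's implementation (assumption).
    Returns the quantized integers and the scale needed to recover floats.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero (or empty) input, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from quantized integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.8, -1.0, 0.03, 0.4]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy traded for smaller weights and cheaper integer arithmetic.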

Long Xin emphasized that the Hanguang 800 mainly targets data centers and large edge-side deployments, focusing on inference acceleration for CNN-type models, and can be extended to other DNN models. In specific applications it currently delivers 4 to 11 times the performance of GPUs.

In pedestrian detection, Long Xin said, a 4-core Hanguang 800 can handle 100 video channels, four times the 25 channels a mainstream GPU supports for inference.

In vehicle detection, a 4-core Hanguang 800 can handle 85 video channels, 8.5 times the 10 channels a mainstream GPU supports for inference.

With the ResNet50 V2 model, used in content recognition applications such as live streaming, short videos, and product information feeds, the Hanguang 800 (4-core) reaches 20,000 FPS, about 11 times the 1,800 FPS of mainstream inference GPUs.

With the Inception V4 model, the Hanguang 800 (4-core) processes up to 5,000 FPS, about 10.8 times the 460 FPS of mainstream inference GPUs.
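The speedup ratios quoted above can be checked directly from the throughput figures. The short sketch below recomputes them; the numbers come from the article, while the names `BENCHMARKS`, `speedup`, and `speedup_table` are hypothetical and used only for illustration.

```python
# Throughput figures quoted in the article:
# (Hanguang 800 with 4 cores, mainstream GPU)
BENCHMARKS = {
    "pedestrian detection (video channels)": (100, 25),
    "vehicle detection (video channels)": (85, 10),
    "ResNet50 V2 (FPS)": (20_000, 1_800),
    "Inception V4 (FPS)": (5_000, 460),
}


def speedup(npu, gpu):
    """Return the NPU throughput as a multiple of the GPU throughput."""
    if gpu <= 0:
        raise ValueError("GPU throughput must be positive")
    return npu / gpu


def speedup_table(benchmarks):
    """Compute the speedup for every benchmark, rounded to one decimal."""
    return {
        name: round(speedup(npu, gpu), 1)
        for name, (npu, gpu) in benchmarks.items()
    }


if __name__ == "__main__":
    for name, ratio in speedup_table(BENCHMARKS).items():
        print(f"{name}: {ratio}x")
```

The four ratios span roughly 4x to 11x, matching the range in the headline. The Inception V4 ratio works out to about 10.87, which the article states as 10.8.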

Copyright © 2025 CnTechPost.