Use of GPUs in Edge AI Computing
An artificial intelligence (AI) accelerator is hardware built to speed up AI applications such as artificial neural networks and machine learning. Over the last decade, graphics processing units (GPUs) have seen increasing adoption for these applications because they efficiently perform both image processing and the mathematical calculations behind neural networks. Fortunately for AI development, GPU manufacturers such as NVIDIA are making GPUs that greatly enhance AI performance, opening the door to many new computing products at the edge. NVIDIA's products have led the market in innovation and are widely used in AI computing. However, other specialty GPU providers offer enhanced performance on specific portions of the AI workload, such as inferencing.
What is a GPU?
A modern graphics processing unit (GPU) is similar to a CPU but is built around parallel processing, handling many processes and threads at the same time. Because of this parallelism, GPUs have traditionally been used for graphics processing and rendering.
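To make the parallel model concrete, here is a minimal CUDA sketch (purely illustrative, not tied to any vendor's product): a single kernel launch spawns roughly a million threads, and each thread brightens exactly one pixel of an image.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Each GPU thread brightens exactly one pixel; thousands run concurrently.
__global__ void brighten(unsigned char *pixels, int n, int delta) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's pixel
    if (i < n)
        pixels[i] = min(pixels[i] + delta, 255);
}

int main() {
    const int n = 1 << 20;                // ~1M pixels
    unsigned char *img;
    cudaMallocManaged(&img, n);           // memory visible to CPU and GPU
    for (int i = 0; i < n; ++i) img[i] = 100;

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;   // enough blocks to cover n
    brighten<<<blocks, threads>>>(img, n, 50);   // launch ~1M threads
    cudaDeviceSynchronize();

    printf("first pixel after kernel: %d\n", img[0]);  // expect 150
    cudaFree(img);
    return 0;
}
```

Note that there is no loop over pixels on the GPU side: the launch configuration determines how much work is exposed at once, and the hardware schedules the threads across its cores.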
GPUs have been around since the 1970s, when they were primarily used in arcade games such as Sea Wolf and Space Invaders. Dedicated graphics processors did not become common in PCs until the early 1980s, when the NEC μPD7220A became one of the first graphics display processors implemented as a single Large Scale Integration (LSI) chip and, for a time, the most popular GPU. The next significant innovation came in the 1990s, when S3 Graphics introduced the S3 86C911, whose 2D acceleration delivered a massive performance increase over its competitors. Today, the GPU is one of the most crucial hardware components of computer architecture.
Initially, the purpose of a video card was to take a stream of binary data from the central processor and render images to the display. Modern graphics processing units, however, are engaged in far more complex calculations, such as big data research, machine learning, and AI.
While AI inference and training have existed since the 1990s, dedicated AI accelerators did not enter the market until around 2010, when AI workloads became much more computationally intensive. Since then, AI accelerators based on field-programmable gate arrays (FPGAs) and customized application-specific integrated circuits (ASICs) have increasingly been replaced by commercial GPUs, which offer faster time to market and lower development costs.
The GPU Market
Currently, the GPU industry is dominated by three large companies: NVIDIA, Intel, and AMD. These companies have been in the market the longest, and NVIDIA's pioneering of the discrete GPU since the late 1990s has put it in a firm leadership position. New entrants to the GPU market are picking a product niche and developing solutions designed to satisfy the appetite for higher performance at a lower cost per watt. In an image processing application, up to 90% of the power consumed can go to accessing RAM. Accordingly, new entrants in edge AI computing such as Mythic, Untether AI, Hailo, and Blaize focus on avoiding data transfer between compute and memory to drive efficiency.
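To make the memory-traffic point concrete, here is a hedged CUDA sketch of the conventional technique these newer architectures push much further: staging data in fast on-chip (shared) memory so each value is fetched from power-hungry external DRAM once per block rather than once per use. The kernel, a simple moving average, and its sizes are illustrative assumptions, not any vendor's design.

```cuda
#include <cuda_runtime.h>

#define RADIUS 3
#define BLOCK  256

// Moving average with radius 3: without staging, each input element would be
// read from DRAM up to 7 times; with shared memory it is read once per block.
__global__ void blur1d(const float *in, float *out, int n) {
    __shared__ float tile[BLOCK + 2 * RADIUS];     // on-chip staging buffer

    int g = blockIdx.x * blockDim.x + threadIdx.x; // global element index
    int l = threadIdx.x + RADIUS;                  // local index in the tile

    tile[l] = (g < n) ? in[g] : 0.0f;              // one DRAM read per thread
    if (threadIdx.x < RADIUS) {                    // load the halo elements
        int left  = g - RADIUS;
        int right = g + BLOCK;
        tile[l - RADIUS] = (left  >= 0) ? in[left]  : 0.0f;
        tile[l + BLOCK]  = (right <  n) ? in[right] : 0.0f;
    }
    __syncthreads();                               // tile fully populated

    if (g < n) {
        float sum = 0.0f;
        for (int k = -RADIUS; k <= RADIUS; ++k)
            sum += tile[l + k];                    // on-chip reads only
        out[g] = sum / (2 * RADIUS + 1);
    }
}

int main() {
    const int n = 4096;
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    blur1d<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(in, out, n);
    cudaDeviceSynchronize();
    cudaFree(in); cudaFree(out);
    return 0;
}
```

In-memory and near-memory compute parts take the same idea further, performing the arithmetic next to (or inside) the memory arrays so the data barely moves at all.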
GPU vs CPU
In the past, CPUs were used to process information for artificial intelligence. However, as GPUs have become more powerful, their ability to process far more information in parallel has made them a much better solution for AI, as the comparison below (and the sketch that follows it) illustrates.
| | CPU | GPU |
| --- | --- | --- |
| Number of cores | Tens of cores | Hundreds to thousands of cores |
| Processing focus | Low latency | High throughput |
| Processing model | Serial processing of many different tasks | Excellent parallel processing of the same task |
| Parallel tasks | Performs multiple processes at once | Performs thousands of processes at once |
| Architecture | MIMD (multiple instruction, multiple data streams) | SIMD (single instruction, multiple data streams) or SIMT (single instruction, multiple threads) |
| Cost and availability | More readily available, more widely manufactured, and cost-effective for consumer and enterprise use | Significantly more expensive, and more so for GPUs built for specific tasks such as mining or analytics |
| Compatibility | Not every system or software package is compatible with every processor | Compatible with virtually all systems |
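A minimal sketch of the latency-versus-throughput distinction in the table above, with the workload itself an illustrative assumption: the same scaling operation written as a serial CPU loop and as a CUDA kernel in which thousands of threads execute a single instruction stream (SIMT) over different elements.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

const int N = 1 << 22;  // ~4M elements

// CPU version: one core walks the data serially, one element at a time.
void scale_cpu(float *x, float s, int n) {
    for (int i = 0; i < n; ++i) x[i] *= s;
}

// GPU version: one instruction stream executed by thousands of threads at
// once, each on a different element (SIMT, high aggregate throughput).
__global__ void scale_gpu(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main() {
    float *x;
    cudaMallocManaged(&x, N * sizeof(float));
    for (int i = 0; i < N; ++i) x[i] = 1.0f;

    scale_cpu(x, 2.0f, N);                            // serial pass
    scale_gpu<<<(N + 255) / 256, 256>>>(x, 2.0f, N);  // parallel pass
    cudaDeviceSynchronize();

    printf("x[0] = %.1f\n", x[0]);  // 1.0 * 2 * 2 = 4.0
    cudaFree(x);
    return 0;
}
```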
Integrating a GPU into an application:
GPUs are typically shipped as modules so they can be integrated easily into a variety of applications. Chip-down GPU designs are intensive hardware and software projects, and GPU vendors have historically supported only Tier 1 customers and projects, encouraging everyone else to design a carrier board that integrates their modules.
Here are some common GPU module form factors:
PCIe
- This is the first and still the most common form factor for GPU modules, and it integrates easily with common PC motherboards offering up to x16 Gen5 PCIe connections. The disadvantage is that PCIe cards are large and bulky for many edge AI applications and typically require forced air cooling.
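As an example of how software sees these boards, the standard CUDA runtime can enumerate the GPUs on the PCIe bus and report each one's address and memory. The sketch below assumes an NVIDIA card and a recent CUDA toolkit; other vendors expose similar queries through their own runtimes.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Enumerate CUDA-capable GPUs and report where each sits on the PCIe bus.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        printf("GPU %d: %s  PCIe %04x:%02x:%02x  %.1f GB VRAM\n",
               i, p.name, p.pciDomainID, p.pciBusID, p.pciDeviceID,
               p.totalGlobalMem / 1073741824.0);
    }
    return 0;
}
```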
MXM (Mobile Express Module)
- As the name indicates, the MXM form factor was developed to bring graphics processing modules to smaller mobile computers such as laptops. It is also commonly used in edge small-form-factor (SFF) computing applications in markets such as military, medical, and transportation. The modules can be air cooled or conduction cooled.
M.2
- The M.2 standard replaces the mSATA standard and offers size and speed advantages for storage. Its Gen4 PCIe x4 connection to the processor also allows it to be widely used for other computing functions, including GPS, LTE, I/O, and smaller GPUs from vendors such as Hailo.
E1.S EDSFF (Enterprise and Datacenter Small Form Factor)
- E1.S is the form factor of choice for next-generation storage modules, offering greater density and performance than M.2 and other earlier form factors.
- Although it is being deployed in the datacenter server industry, it has not yet replaced M.2 as a standard in SFF Edge Computing.
- Blaize is an example of a GPU vendor deploying in the E1.S form factor, enabling up to 512 TOPS in a 1U server and 16-64 TOPS in an SFF computer.
SOM (System on Modules)
- System on Modules (SOMs) have been made popular for industrial edge computing by NVIDIA with products such as Jetson.
- The embedded ARM processors on the SOM enable a solutions provider to build a smaller, low-cost edge AI or graphics processing computer without needing a separate embedded CPU.
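From the software side, the distinguishing trait of such a SOM is that the GPU is integrated and shares physical DRAM with the ARM cores, so no PCIe copy separates host and device. Below is a minimal CUDA sketch, assuming a Jetson-class device, that detects this and allocates one buffer both sides can touch.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// On a SOM such as Jetson, the GPU shares physical DRAM with the ARM CPU.
// The `integrated` property flags this, and managed memory gives the CPU
// and GPU one shared buffer with no PCIe copy in between.
int main() {
    cudaDeviceProp p;
    cudaGetDeviceProperties(&p, 0);
    printf("%s: %s GPU\n", p.name, p.integrated ? "integrated" : "discrete");

    float *frame;
    cudaMallocManaged(&frame, 1920 * 1080 * sizeof(float)); // one video frame
    frame[0] = 0.5f;  // the CPU writes...
    // ...and a kernel launched here could read the same buffer directly.
    cudaFree(frame);
    return 0;
}
```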
Chip-Down
- For many edge platforms, it is most cost-effective to design a solution with a ‘chip-down’ GPU and a ‘chip-down’ CPU. Tauro Technologies has designed such systems for several of the GPU vendors mentioned above.
GPU integration into a carrier board:
System design takes many factors into consideration, including cooling, power, I/O, storage, processing, etc., and each system requirement is different. Typically, GPU modules that are not chip-down are integrated into a main board or carrier board that has been customized to meet the application requirements.
There are two main types of processors integrated into carrier boards:
- Intel x86: Due to the engineering complexity of Intel chip-down designs, many edge AI applications choose COM Express (COMe) or COM-HPC Client modules to simplify their carrier design projects. Although the material cost of a module is higher than that of a chip-down solution, designing with an x86-based module is more cost-effective unless the product ships in significant volumes.
- ARM-based processors: ARM processors are typically lower cost and draw less power, making them ideal for many computer vision products when paired with a GPU. Since few modules based on open standards are available, most ARM-based products are custom designed around a chip-down processor such as NXP's Arm Cortex-based i.MX or Layerscape families.
Summary:
The advent of AI is opening the door for application-specific GPUs that are finely tuned to the objectives of the project. Tauro Technologies has broad experience implementing edge computers optimized for the application. If you wish to discuss a customized high-volume platform or a system that takes advantage of commercially available hardware and tools, reach out to us.