Powering The Future Of Artificial Intelligence

AI’s rapid evolution is producing an explosion in new types of hardware accelerators for machine learning and deep learning. Some people refer to this as a “Cambrian explosion,” which is an apt metaphor for the current period of fervent innovation.

It refers to the period about 500 million years ago when essentially every biological “body plan” among multicellular animals appeared for the first time. From that point onward, these creatures, ourselves included, fanned out to occupy, exploit, and thoroughly transform every ecological niche on the planet.

The range of innovative AI hardware-accelerator architectures continues to expand. Although you may think that graphics processing units (GPUs) are the dominant AI hardware architecture, that is far from the truth.

Over the past several years, both startups and established chip vendors have introduced an impressive generation of new hardware architectures optimized for machine learning, deep learning, natural language processing, and other AI workloads.

Chief among these new AI-optimised chipset architectures, in addition to new generations of GPUs, are neural network processing units (NNPUs), field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and various related approaches that go by the collective name of neurosynaptic architectures.

Today’s AI market has no hardware monoculture equivalent to Intel’s x86 CPU, which once dominated the desktop computing space. That’s because these new AI-accelerator chip architectures are being adapted for highly specific roles in the burgeoning cloud-to-edge ecosystem, such as computer vision.

The evolution of AI-accelerator chips

To understand the rapid evolution taking place in AI-accelerator chips, it’s best to focus on three aspects of the marketplace: the tiers in which the chips are deployed, the tasks they accelerate, and the tolerances within which they must perform.

AI tiers

To see how AI accelerators are evolving, look to the edge, where new hardware platforms are being optimised to enable greater autonomy for mobile, embedded, and internet of things (IoT) devices.

Beyond the proliferation of smartphone-embedded AI processors, the most significant innovation in this regard is in AI robotics, which is permeating everything from self-driving vehicles to drones, smart appliances, and industrial IoT.

A prime example is Nvidia’s latest enhancements to its Jetson Xavier line of AI systems-on-a-chip (SoCs). Nvidia has also released the Isaac software development kit to assist with building robotics algorithms that will run on this dedicated robotics hardware.

Reflecting the complexity of intelligent robotics, the Jetson Xavier chip consists of six processing units, including a 512-core Nvidia Volta Tensor Core GPU, an eight-core Carmel Arm64 CPU, dual Nvidia deep-learning accelerators, and image, vision, and video processors. These let it handle dozens of algorithms to help robots autonomously sense environments, respond effectively, and operate safely alongside human engineers.
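
That heterogeneity matters to software, because each workload must be routed to the right engine. As a minimal sketch (assuming a PyTorch build with CUDA support, such as the ones Nvidia distributes for Jetson boards), the snippet below enumerates the GPU resources a deployment can target; reaching the deep-learning accelerators typically goes through separate toolchains such as TensorRT, which this sketch does not cover.

```python
# Minimal sketch: inspect the GPU compute units an AI SoC exposes.
# Assumes a PyTorch build with CUDA support (e.g., Nvidia's Jetson builds).
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"Device {i}: {props.name}")
        print(f"  Streaming multiprocessors: {props.multi_processor_count}")
        print(f"  Memory: {props.total_memory / 1e9:.1f} GB")
else:
    print("No CUDA-capable accelerator detected")
```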

AI tasks

AI accelerators are beginning to permeate every tier in distributed cloud-to-edge, high-performance computing, hyper-converged server, and cloud-storage architectures. A steady stream of fresh hardware innovations is coming to all these segments to support more rapid, efficient, and accurate AI processing.

AI hardware innovations are coming to market to accelerate the specific data-driven tasks of these distinct application environments. The myriad AI chipset architectures on the market reflect the diverse range of machine learning, deep learning, natural language processing, and other AI workloads, which span storage-intensive training and compute-intensive inferencing and involve varying degrees of device autonomy and person-in-the-loop interactivity.

To address the range of workloads that AI chipsets are being used to support, vendors are mixing a wide variety of technologies in their product portfolios and even in specific embedded-AI deployments, such as the SoCs that drive intelligent robotics and mobile apps.

As an example, Intel’s Xeon Phi CPU architecture has been used to accelerate AI tasks. But Intel recognizes that it cannot keep up without specialized AI-accelerator chips that let it compete head-on with Nvidia’s Volta GPUs and with the legions of vendors building NNPUs and other specialised AI chips. Thus, Intel now has a product team working on a new GPU, to be released in the next two years.

At the same time, it continues to hedge its bets with AI-optimized chipsets in several architectural categories: neural network processors (Nervana), FPGAs (Altera), computer-vision ASICs (Movidius), and autonomous-vehicle ASICs (Mobileye). It also has projects to build self-learning neuromorphic and quantum-computing chips for next-generation AI challenges.

AI tolerances

Every AI-acceleration hardware innovation must prove survivable, in terms of its ability to achieve metrics defined within the relevant operational and economic tolerances.

On operational metrics, every AI chipset must conform to the relevant constraints on form factor, energy efficiency, heat and electromagnetic emissions, and ruggedness.

On economic metrics, it must be competitive in performance and cost of ownership for the tiers and tasks in which it’s designed to be deployed. Comparative industry benchmarks will become a key factor in determining whether an AI-accelerator technology has the price-performance profile to survive in a hotly competitive marketplace.
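
The arithmetic behind those comparisons is simple, even if gathering trustworthy inputs is not. Here is a minimal sketch of the kind of price-performance and performance-per-watt calculation such benchmarks feed; all figures are illustrative placeholders, not measured results.

```python
# Minimal sketch of price-performance arithmetic for candidate accelerators.
# All numbers below are illustrative placeholders, not measured results.
candidates = [
    # (name, throughput in images/sec, power draw in watts, unit cost in USD)
    ("Accelerator A", 1200.0, 250.0, 9000.0),
    ("Accelerator B", 800.0, 75.0, 2500.0),
]

for name, throughput, watts, cost in candidates:
    perf_per_watt = throughput / watts      # operational tolerance: efficiency
    perf_per_dollar = throughput / cost     # economic tolerance: price-performance
    print(f"{name}: {perf_per_watt:.2f} img/s/W, {perf_per_dollar:.4f} img/s/$")
```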

In an industry that’s moving toward workload-optimised AI architectures, users will adopt the fastest, most scalable, most power-efficient, and lowest-cost hardware, software, and cloud platforms to run their AI tasks, including development, training, operationalisation, and inferencing, in every tier.

The diversity of AI-accelerator ASICs

AI-accelerator hardware architectures are the opposite of a monoculture. They are so diverse and evolving so rapidly that it’s hard to keep up with the relentless pace of innovation in this market.

Beyond the core AI chipset manufacturers, such as Nvidia and Intel, ASICs for platform-specific AI workloads abound.

You can see this trend in several recent news items:

  • Microsoft is preparing an AI chip for its HoloLens augmented reality headset.
  • Google has a special NNPU, the Tensor Processing Unit, which is available for AI apps on the Google Cloud Platform.
  • Amazon is reportedly working on an AI chip for its Alexa home assistant.
  • Apple is working on an AI processor that will power Siri and FaceID.
  • Tesla is building an AI processor for its self-driving electric cars.

AI-accelerator benchmark frameworks are beginning to emerge

Cross-vendor partnerships in the AI-accelerator market are growing more complicated and overlapping. For example, consider how China-based tech powerhouse Baidu is partnering separately with Intel and Nvidia.

In addition to launching its own NNPU chip for natural language processing, image recognition, and autonomous driving, Baidu is partnering with Intel on FPGA-backed AI-workload acceleration in its public cloud, an AI framework for Xeon CPUs, an AI-equipped autonomous-car platform, a computer-vision-powered retail camera, and adoption of Intel’s nGraph hardware-agnostic deep neural network compiler.

This is all on the heels of equivalent announcements with Nvidia, including plans to bring Volta GPUs to the Baidu cloud, a tweak to Baidu’s PaddlePaddle AI development framework for Volta, and rollout of Nvidia-powered AI to the Chinese consumer market.

Sorting through this bewildering range of AI-accelerator hardware options, and combinations thereof, both in the cloud and in specialised SoCs, is growing more difficult every day. Isolating the AI-accelerator hardware’s contribution to overall performance on any given task can be tricky without a flexible benchmarking framework.

Fortunately, the AI industry is developing open, transparent, and vendor-agnostic benchmarking frameworks for evaluating the comparative performance of different hardware/software stacks running diverse workloads.

MLPerf

For example, the MLPerf open source benchmark group is developing a standard suite for benchmarking the performance of machine learning software frameworks, hardware accelerators, and cloud platforms.

Available on GitHub and currently in a beta version, MLPerf provides reference implementations for some AI tasks that predominate in today’s AI deployments. It scopes the benchmarks to specific AI tasks (such as image classification) performed by specific algorithms (such as ResNet-50 v1) against specific data sets (such as ImageNet).
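
To make that scoping concrete, here is a minimal sketch in the spirit of such a reference implementation: a fixed model (ResNet-50) on a fixed task (image classification), timed on synthetic, ImageNet-shaped inputs. It assumes PyTorch and torchvision are installed, and it times inference throughput only; the actual MLPerf benchmarks time full training runs against the real data sets to a target quality.

```python
# Minimal sketch of a scoped benchmark: fixed task, fixed model, fixed input
# shape. Assumes torch and torchvision; times inference throughput only.
import time

import torch
from torchvision.models import resnet50

model = resnet50().eval()              # the fixed algorithm under test
batch = torch.randn(8, 3, 224, 224)    # synthetic ImageNet-shaped inputs

with torch.no_grad():
    model(batch)                       # warm-up pass, excluded from timing
    start = time.perf_counter()
    for _ in range(10):
        model(batch)
    elapsed = time.perf_counter() - start

print(f"Throughput: {10 * batch.shape[0] / elapsed:.1f} images/sec")
```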

The core benchmark focuses on specific hardware/software deployments, such as image-classification training jobs running in Ubuntu 16.04, Nvidia Docker, and CPython 2 on platforms built from 16 CPU chips, one Nvidia P100 GPU, and 600 gigabytes of local disk.

The MLPerf framework is flexible enough that, conceivably, the GPU-based image-classification training can be benchmarked against the same tasks running on a different hardware accelerator, such as the recently announced Baidu Kunlun chipsets, within a substantially equivalent software/hardware stack.

Other AI industry benchmarking initiatives also enable comparative performance evaluations of alternative AI-accelerator chips, as well as of the other hardware and software components in deployments addressing the same tasks, using the same models, against the same training or operational data.

These other benchmarking initiatives include DawnBench, ReQuest, the Transaction Processing Performance Council’s AI Working Group, and CEAN2D2. They are all flexible enough to be applied to any AI workload task running in any deployment tier and measured against any economic tolerance.

EEMBC Machine Learning Benchmark Suite

Reflecting the move of AI workloads to the edge, some AI benchmarking initiatives are purely focused on measuring the performance of hardware/software stacks deployed to this tier. For example, the industry alliance EEMBC recently started a new effort to define a benchmark suite for machine learning executing in optimised chipsets running in power-constrained edge devices.

Chaired by Intel, EEMBC’s Machine Learning Benchmark Suite group will use real-world machine learning workloads from virtual assistants, smartphones, IoT devices, smart speakers, IoT gateways, and other embedded/edge systems to identify the performance potential and power efficiency of processor cores used for accelerating machine learning inferencing jobs.

The EEMBC Machine Learning benchmark will measure inferencing performance, neural-net spin-up time, and power efficiency of low-, moderate-, and high-complexity inferencing tasks. It will be agnostic to machine learning front-end frameworks, back-end runtime environments, and hardware-accelerator targets. The group is working on a proof-of-concept and plans to release its initial benchmark suite by June 2019, addressing a range of neural-net architectures and use cases for edge-based inferencing.
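
The first two of those measurements are straightforward to picture. Below is a minimal sketch, assuming PyTorch and a hypothetical exported TorchScript artifact (edge_model.pt), of timing "spin-up" (loading and readying a network) and steady-state inference latency; power efficiency, the third axis, requires external instrumentation that plain Python cannot provide.

```python
# Minimal sketch of two edge measurements: spin-up time and inference latency.
# edge_model.pt is a hypothetical exported TorchScript model, not a real file.
import time

import torch

model_path = "edge_model.pt"

start = time.perf_counter()
model = torch.jit.load(model_path).eval()   # spin-up: load and ready the net
sample = torch.randn(1, 3, 96, 96)          # small, edge-sized input
with torch.no_grad():
    model(sample)                           # first inference completes spin-up
spin_up = time.perf_counter() - start

with torch.no_grad():
    start = time.perf_counter()
    for _ in range(100):
        model(sample)
    latency_ms = (time.perf_counter() - start) / 100 * 1000

print(f"Spin-up: {spin_up:.3f} s; steady-state latency: {latency_ms:.2f} ms")
```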

EEMBC Adasmark benchmarking framework

Addressing a narrower slice of the edge tier and its tasks, EEMBC’s Adasmark benchmarking framework focuses on AI-equipped smart vehicles. Distinct from the Machine Learning Benchmark Suite, this effort measures the performance of AI chips embedded in advanced driver-assistance systems.

The suite helps measure the performance of AI inferencing tasks executing in multi-device, multichip, multi-application smart-vehicle platforms. It benchmarks real-world inferencing workloads associated with highly parallel smart-vehicle applications, such as computer vision, autonomous driving, automotive surround view, image recognition, and mobile augmented reality. It measures inferencing performance across complex smart-car edge architectures, which usually include multiple specialized CPUs, GPUs, and other hardware-accelerator chipsets performing distinct tasks within a common chassis.
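
As a minimal sketch of the measurement problem Adasmark tackles, the snippet below (assuming PyTorch) times several inferencing pipelines running concurrently, each pinned to its own accelerator within one platform. The device list and the stand-in model are illustrative placeholders; the real suite exercises full ADAS workloads across the vehicle’s actual chipsets.

```python
# Minimal sketch: time concurrent inference pipelines, one per accelerator.
# The device list and model are placeholders for real ADAS workloads.
import time
from concurrent.futures import ThreadPoolExecutor

import torch

devices = ["cpu"]   # e.g. ["cuda:0", "cuda:1", "cpu"] on a multi-chip platform

def timed_inference(device: str):
    # Stand-in for one vision pipeline pinned to one accelerator.
    model = torch.nn.Conv2d(3, 16, 3).to(device)
    frame = torch.randn(1, 3, 224, 224, device=device)
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(20):
            model(frame)
    return device, (time.perf_counter() - start) / 20

with ThreadPoolExecutor() as pool:
    for device, seconds in pool.map(timed_inference, devices):
        print(f"{device}: {seconds * 1000:.2f} ms per frame")
```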

Emerging AI scenarios will require even more specialty chips

Almost certainly, other specialized AI-edge scenarios will emerge that require their own specialized chips, SoCs, hardware platforms, and benchmarks. The next great growth segment in AI chipsets may be in accelerating edge nodes for cryptocurrency mining, a use case that, alongside AI and gaming, has soaked up a lot of demand for Nvidia GPUs.

One vendor specialising in this niche is DeepBrain Chain, which recently announced a computing platform that can be deployed in distributed configurations to power high-performance processing of AI workloads and mining of cryptocurrency tokens. The mining stations come in two-, four-, and eight-GPU configurations, as well as standalone workstations and customized 128-GPU AI HPC clusters.

Before long, we are almost certain to see a new generation of AI ASICs focused on distributed cryptocurrency mining.

Specialised hardware platforms are the future of AI at every tier and for every task in the cloud-to-edge world in which we live.

InfoWorld
