Powering The Future Of Artificial Intelligence

AI’s rapid evolution is producing an explosion in new types of hardware accelerators for machine learning and deep learning. Some people refer to this as a “Cambrian explosion,” which is an apt metaphor for the current period of fervent innovation.

The term refers to the period about 500 million years ago when essentially every biological “body plan” among multicellular animals appeared for the first time. From that point onward, these creatures, ourselves included, fanned out to occupy, exploit, and thoroughly transform every ecological niche on the planet.

The range of innovative AI hardware-accelerator architectures continues to expand. Although you may think that graphics processing units (GPUs) are the dominant AI hardware architecture, that is far from the truth.

Over the past several years, both startups and established chip vendors have introduced an impressive new generation of hardware architectures optimized for machine learning, deep learning, natural language processing, and other AI workloads.

Chief among these new AI-optimised chipset architectures, in addition to new generations of GPUs, are neural network processing units (NNPUs), field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and various related approaches that go by the collective name of neurosynaptic architectures.

Today’s AI market has no hardware monoculture equivalent to Intel’s x86 CPU, which once dominated the desktop computing space. That’s because these new AI-accelerator chip architectures are being adapted for highly specific roles in the burgeoning cloud-to-edge ecosystem, such as computer vision.

The evolution of AI-accelerator chips

To understand the rapid evolution taking place in AI-accelerator chips, it’s best to focus on the marketplace opportunities and challenges as follows.

AI tiers

To see how AI accelerators are evolving, look to the edge, where new hardware platforms are being optimised to enable greater autonomy for mobile, embedded, and internet of things (IoT) devices.

Beyond the proliferation of smartphone-embedded AI processors, one of the most noteworthy areas of innovation in this regard is AI robotics, which is permeating everything from self-driving vehicles to drones, smart appliances, and industrial IoT.

A prominent example is Nvidia’s latest enhancement to its Jetson Xavier line of AI systems on a chip (SoCs). Nvidia has released the Isaac software development kit to assist with building robotics algorithms that will run on its dedicated robotics hardware.

Reflecting the complexity of intelligent robotics, the Jetson Xavier chip consists of six processing units, including a 512-core Nvidia Volta Tensor Core GPU, an eight-core Carmel Arm64 CPU, a dual Nvidia deep-learning accelerator, and image, vision, and video processors. These let it handle dozens of algorithms to help robots autonomously sense environments, respond effectively, and operate safely alongside human engineers.

AI tasks

AI accelerators are beginning to permeate every tier in distributed cloud-to-edge, high-performance computing, hyper-converged server, and cloud-storage architectures. A steady stream of fresh hardware innovations is coming to all these segments to support more rapid, efficient, and accurate AI processing.

AI hardware innovations are coming to market to accelerate the specific data-driven tasks of these distinct application environments. The myriad AI chipset architectures on the market reflect the diverse range of machine learning, deep learning, natural language processing, and other AI workloads that range from storage-intensive training to compute-intensive inferencing and involve varying degrees of device autonomy and person-in-the-loop interactivity.

To address the range of workloads that AI chipsets are being used to support, vendors are mixing a wide range of technologies in their product portfolios and even in specific embedded-AI deployments, such as the SoCs that drive intelligent robotics and mobile apps.

As an example, Intel’s Xeon Phi CPU architecture has been used to accelerate AI tasks. But Intel recognizes that it will not be able to keep up without specialized AI accelerator chips that let it compete head-on with Nvidia Volta (in GPUs) and with the legions of vendors building NNPUs and other specialised AI chips. Thus, Intel now has a product team working on a new GPU, to be released in the next two years.

At the same time, it continues to hedge its bets with AI-optimized chipsets in several architectural categories: neural network processors (Nervana), FPGAs (Altera), computer-vision ASICs (Movidius), and autonomous-vehicle ASICs (Mobileye). It also has projects to build self-learning neuromorphic and quantum computing chips for next-generation AI challenges.

AI tolerances

Every AI-acceleration hardware innovation must be survivable in terms of its ability to meet the metrics defined within the relevant operational and economic tolerances.

In operational metrics, every AI chipset must conform to the relevant constraints in terms of form factors, energy efficiency, heat and electromagnetic emission, and ruggedness.

In economic metrics, it must be competitive in performance and cost of ownership for the tiers and tasks into which it’s designed to be deployed. Comparative industry benchmarks will become a key factor in determining whether an AI-accelerator technology has the price-performance profile to survive in a hotly competitive marketplace.
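As a rough illustration of what a price-performance profile might look like in practice, the following Python sketch ranks a few accelerators by throughput per watt and throughput per dollar. The chip names and every figure are made-up placeholders, not measurements of any real product, and the two ratios are only one plausible way to express operational and economic tolerances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorProfile:
    """Hypothetical benchmark summary for one accelerator (illustrative only)."""
    name: str
    throughput: float   # inferences per second on a fixed workload
    power_watts: float  # average power draw during the run
    cost_usd: float     # assumed cost of ownership over the evaluation period

    @property
    def perf_per_watt(self) -> float:
        return self.throughput / self.power_watts

    @property
    def perf_per_dollar(self) -> float:
        return self.throughput / self.cost_usd


def rank_by(profiles, metric):
    """Return profiles sorted from best to worst on the named metric."""
    return sorted(profiles, key=lambda p: getattr(p, metric), reverse=True)


# Placeholder numbers chosen only to show that the rankings can differ.
candidates = [
    AcceleratorProfile("chip_a", throughput=1000.0, power_watts=250.0, cost_usd=10000.0),
    AcceleratorProfile("chip_b", throughput=400.0, power_watts=50.0, cost_usd=2000.0),
    AcceleratorProfile("chip_c", throughput=600.0, power_watts=100.0, cost_usd=8000.0),
]

for metric in ("perf_per_watt", "perf_per_dollar"):
    print(metric, [p.name for p in rank_by(candidates, metric)])
```

In this toy data the fastest chip is not the best on either ratio, which is the kind of trade-off that comparative benchmarks are meant to expose.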

In an industry that’s moving toward workload-optimised AI architectures, users will adopt the fastest, most scalable, most power-efficient and lowest-cost hardware, software, and cloud platforms to run their AI tasks, including development, training, operationalisation, and inferencing, in every tier.

The diversity of AI-accelerator ASICs

AI-accelerator hardware architectures are the opposite of a monoculture. They are so diverse and evolving so rapidly that it’s hard to keep up with the relentless pace of innovation in this market.

Beyond the core AI chipset manufacturers, such as Nvidia and Intel, ASICs for platform-specific AI workloads abound.

You can see this trend in several recent news items:

  • Microsoft is preparing an AI chip for its HoloLens augmented reality headset.
  • Google has a special NNPU, the Tensor Processing Unit, which is available for AI apps on the Google Cloud Platform.
  • Amazon is reportedly working on an AI chip for its Alexa home assistant.
  • Apple is working on an AI processor that will power Siri and FaceID.
  • Tesla is building an AI processor for its self-driving electric cars.

AI-accelerator benchmark frameworks are beginning to emerge

Cross-vendor partnerships in the AI-accelerator market are growing more complicated and overlapping. For example, consider how China-based tech powerhouse Baidu is partnering separately with Intel and Nvidia.

In addition to launching its own NNPU chip for natural language processing, image recognition, and autonomous driving, Baidu is partnering with Intel for FPGA-backed AI-workload acceleration in its public cloud, an AI framework for Xeon CPUs, an AI-equipped autonomous car platform, a computer-vision powered retail camera, and adoption of Intel’s nGraph hardware-agnostic deep neural network compiler.

This is all on the heels of equivalent announcements with Nvidia, including plans to bring Volta GPUs to the Baidu cloud, a tweak to Baidu’s PaddlePaddle AI development framework for Volta, and rollout of Nvidia-powered AI to the Chinese consumer market.

Sorting through this bewildering range of AI-accelerator hardware options, and combinations thereof, both on the cloud and in specialised SoCs, is growing more difficult every day. Isolating the AI-accelerator hardware’s contribution to overall performance on any given task can be tricky without a flexible benchmarking framework.

Fortunately, the AI industry is developing open, transparent, and vendor-agnostic benchmarking frameworks for evaluating the comparative performance of different hardware/software stacks in the running of diverse workloads.

MLPerf

For example, the MLPerf open source benchmark group is developing a standard suite for benchmarking the performance of machine learning software frameworks, hardware accelerators, and cloud platforms.

Available on GitHub and currently in a beta version, MLPerf provides reference implementations for some AI tasks that predominate in today’s AI deployments. It scopes the benchmarks to specific AI tasks (such as image classification) performed by specific algorithms (such as ResNet-50 v1) against specific data sets (such as ImageNet).

The core benchmark focuses on specific hardware/software deployments, such as image-classification training jobs running in Ubuntu 16.04, Nvidia Docker, and CPython 2 on platforms built from 16 CPU chips, one Nvidia P100 GPU, and 600 gigabytes of local disk.

The MLPerf framework is flexible enough so that conceivably the GPU-based image-classification training can be benchmarked against the same tasks running on a different hardware accelerator, such as the recently announced Baidu Kunlun FPGAs, but within a substantially equivalent software/hardware stack.
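The idea of holding the task, model, and data set fixed while swapping the accelerator can be sketched in a few lines of Python. This is not MLPerf's actual configuration format; the `BenchmarkSpec` class, its field names, and the `is_comparable` helper are hypothetical and exist only to show the principle.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BenchmarkSpec:
    """Hypothetical description of one benchmark run (not an MLPerf schema)."""
    task: str
    model: str
    dataset: str
    software_stack: tuple
    accelerator: str


# Values taken from the deployment described in the article.
baseline = BenchmarkSpec(
    task="image classification",
    model="ResNet-50 v1",
    dataset="ImageNet",
    software_stack=("Ubuntu 16.04", "Nvidia Docker", "CPython 2"),
    accelerator="Nvidia P100",
)

# Swap only the accelerator; "alternative accelerator" is a placeholder.
variant = replace(baseline, accelerator="alternative accelerator")


def is_comparable(a: BenchmarkSpec, b: BenchmarkSpec) -> bool:
    """Two runs are comparable here if task, model, and data set all match."""
    return (a.task, a.model, a.dataset) == (b.task, b.model, b.dataset)


print(is_comparable(baseline, variant))
```

In a real comparison the software stack would also have to be substantially equivalent, which this sketch records but does not check.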

Other AI industry benchmarking initiatives also enable comparative performance evaluations on alternate AI-accelerator chips, as well as of other hardware and software components in deployments addressing the same tasks using the same models against the same training or operational data.

These other benchmarking initiatives include DAWNBench, ReQuEST, the Transaction Processing Performance Council’s AI Working Group, and CEA N2D2. They are all flexible enough to be applied to any AI workload task running in any deployment tier and measured against any economic tolerance.

EEMBC Machine Learning Benchmark Suite

Reflecting the move of AI workloads to the edge, some AI benchmarking initiatives are purely focused on measuring the performance of hardware/software stacks deployed to this tier. For example, the industry alliance EEMBC recently started a new effort to define a benchmark suite for machine learning executing in optimised chipsets running in power-constrained edge devices.

Chaired by Intel, EEMBC’s Machine Learning Benchmark Suite group will use real-world machine learning workloads from virtual assistants, smartphones, IoT devices, smart speakers, IoT gateways, and other embedded/edge systems to identify the performance potential and power efficiency of processor cores used for accelerating machine learning inferencing jobs.

The EEMBC Machine Learning benchmark will measure inferencing performance, neural-net spin-up time, and power efficiency of low-, moderate-, and high-complexity inferencing tasks. It will be agnostic to machine learning front-end frameworks, back-end runtime environments, and hardware-accelerator targets. The group is working on a proof-of-concept and plans to release its initial benchmark suite by June 2019, addressing a range of neural-net architectures and use cases for edge-based inferencing.
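To make the three measured quantities concrete, here is a minimal Python timing harness that reports spin-up time, inferencing throughput, and inferences per joule. It is not EEMBC's benchmark code: the `measure` function, the toy "model," and the 5-watt power figure are all assumptions, and a real harness would read power from instrumentation rather than take it as a parameter.

```python
import time


def measure(load_model, infer, inputs, avg_power_watts):
    """Time model spin-up and a batch of inferences (illustrative sketch).

    avg_power_watts is an assumed, externally supplied average power draw.
    """
    start = time.perf_counter()
    model = load_model()
    spin_up_s = time.perf_counter() - start

    start = time.perf_counter()
    outputs = [infer(model, x) for x in inputs]
    # Guard against a zero reading on very coarse timers.
    elapsed_s = max(time.perf_counter() - start, 1e-9)

    count = len(outputs)
    energy_joules = avg_power_watts * elapsed_s
    return {
        "spin_up_s": spin_up_s,
        "inferences_per_s": count / elapsed_s,
        "inferences_per_joule": count / energy_joules,
    }


# Toy stand-ins for a real model loader and inference call.
result = measure(
    load_model=lambda: {"scale": 2.0},
    infer=lambda model, x: model["scale"] * x,
    inputs=range(1000),
    avg_power_watts=5.0,
)
print(result)
```

Because energy is computed as average power multiplied by elapsed time, inferences per joule here is simply throughput divided by power, so a chip can improve it either by running faster or by drawing less.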

EEMBC Adasmark benchmarking framework

Addressing a narrower scope of the edge tier and tasks, EEMBC’s Adasmark benchmarking framework focuses on AI-equipped smart vehicles. Separate from its Machine Learning Benchmark effort, EEMBC is developing this performance measurement framework for AI chips embedded in advanced driver assistance systems.

The suite helps measure the performance of AI inferencing tasks executing in multi-device, multichip, multi-application smart-vehicle platforms. It benchmarks real-world inferencing workloads associated with highly parallel smart-vehicle applications, such as computer vision, autonomous driving, automotive surround view, image recognition, and mobile augmented reality. It measures inferencing performance across complex smart-car edge architectures, which usually include multiple specialized CPUs, GPUs, and other hardware-accelerator chipsets performing distinct tasks within a common chassis.

Emerging AI scenarios will require even more specialty chips

Almost certainly, other specialized AI-edge scenarios will emerge that require their own specialized chips, SoCs, hardware platforms, and benchmarks. The next great growth segment in AI chipsets may be in accelerating edge nodes for cryptocurrency mining, a use case that, alongside AI and gaming, has soaked up a lot of demand for Nvidia GPUs.

One vendor specialising in this niche is DeepBrain Chain, which recently announced a computing platform that can be deployed in distributed configurations to power high-performance processing of AI workloads and mining of cryptocurrency tokens. The mining stations come in two-, four-, and eight-GPU configurations, as well as standalone workstations and 128-GPU customized AI HPC clusters.

Before long, we are almost certain to see a new generation of AI ASICs focused on distributed cryptocurrency mining.

Specialised hardware platforms are the future of AI at every tier and for every task in the cloud-to-edge world in which we live.

InfoWorld
