Deep Learning Hardware Heats Up with Graphcore Securing Series A Funding


Since publishing our Artificial Intelligence Market Forecasts report in August, Tractica has received a lot of interest and inquiries from established semiconductor companies and startups about how artificial intelligence (AI) will shape hardware requirements for the nearly 200 use cases identified in the report. The majority of semiconductor players have been slow to move, which has allowed NVIDIA to get ahead of the curve with its graphics processing units (GPUs). At the same time, application-specific integrated circuit (ASIC), field programmable gate array (FPGA), and digital signal processor (DSP) suppliers do not want to be left behind. As suggested in my recent “Deep Learning Hardware Battle” blog post, the evolving nature of algorithms and workloads will determine which architecture is best suited for deep learning. Deep learning is the most promising AI technology so far, delivering impressive results, especially in human perception-based use cases such as vision and language.

Architecture Customization

With many of the established players trying to catch up, and probably being held back by legacy silicon architectures, there is room for startups looking to build new silicon architectures customized for AI and deep learning applications. Graphcore is one such U.K. startup that recently came out of stealth mode and has now secured a $30 million Series A round of venture funding. The round was led by Robert Bosch Venture Capital GmbH and the Samsung Catalyst Fund, alongside Amadeus Capital Partners, C4 Ventures, Draper Esprit plc, and Foundation Capital. Graphcore was spun out of XMOS, a Bristol-based microcontroller company, which, unsurprisingly, shares many of the same investors. Graphcore has spent the last two years developing its architecture, and it has seasoned leadership in Nigel Toon (CEO) and Simon Knowles (CTO), semiconductor industry veterans whose biggest success was Icera, which sold to NVIDIA for $435 million.

Graphcore’s technology is known as the intelligent processing unit (IPU), a massively parallel, low-precision, floating-point compute architecture that can store machine learning models in the processor itself, rather than in a separate memory unit. Unlike some other architectures, such as GPUs, which are constrained by a fixed form of parallelism, the IPU is well suited to applications that depend on context and rely on temporal data, as in the case of recurrent neural networks (RNNs). According to Graphcore, its technology can increase performance for both training and inference by 10x to 100x compared to existing systems, while using less power at lower cost. Its first product, expected to launch in 2017, is aimed at the cloud, which suggests that its first step is to accelerate training of AI and deep learning algorithms. It also plans inference-focused, embedded application solutions in the near future, possibly in 2018. Some of the initial AI applications where Graphcore sees usage include autonomous vehicles, natural language processing (NLP), and personalized medicine. All of these applications are temporal in nature, with dynamic inputs that require large memory storage. Graphcore’s embedded application solution will expand these use cases to include AI-powered mobile devices and collaborative robots.
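To see why temporal workloads favor keeping the model close to the compute units, consider a vanilla RNN: every timestep reuses the same weight matrices, so re-fetching them from external memory on each step becomes the bottleneck. The sketch below (plain NumPy, purely illustrative — not Graphcore's API, and all sizes are hypothetical) shows that recurrent reuse:

```python
import numpy as np

# Illustrative sketch of a vanilla RNN step (not Graphcore's API).
# The same weights W_x and W_h are reused at every timestep, which is
# why on-chip model storage helps temporal workloads: the weights can
# stay resident instead of being re-fetched from external memory.

hidden = 256   # hypothetical hidden-state size
inputs = 128   # hypothetical input size

rng = np.random.default_rng(0)
W_x = rng.standard_normal((hidden, inputs)) * 0.01   # input-to-hidden weights
W_h = rng.standard_normal((hidden, hidden)) * 0.01   # hidden-to-hidden weights
b = np.zeros(hidden)

def rnn_step(h_prev, x):
    """h_t depends on h_{t-1}: the temporal, context-dependent part."""
    return np.tanh(W_x @ x + W_h @ h_prev + b)

h = np.zeros(hidden)
for t in range(10):                      # a short input sequence
    x_t = rng.standard_normal(inputs)
    h = rnn_step(h, x_t)                 # same W_x, W_h reused every step
```

The fixed, regular parallelism of a GPU maps well to the large matrix multiplies inside each step, but the step-to-step dependence on `h_prev` limits how much work can be batched across time, which is the kind of workload Graphcore argues its architecture handles better.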

Is Graphcore the Next Acquisition Target?

Graphcore is not alone, though, and could very soon become an acquisition target itself. Intel has already acquired Nervana, and Wave Computing, whose Dataflow Processing Unit (DPU) uses the Hybrid Memory Cube (HMC) approach, has been suggested as another hot acquisition target. Knupath is another company pushing its neural chip architecture as an alternative to GPUs, FPGAs, and central processing units (CPUs).

One interesting development has been the recent release of DeepBench, Baidu’s new open-source deep learning benchmarking tool. DeepBench compares hardware architectures, rather than AI models, by timing the basic operations that underlie deep learning workloads, which helps pinpoint where the bottlenecks lie. It would be very useful to have these architectures, both legacy and new, compared on DeepBench to provide a better understanding of the pros and cons of different hardware approaches. For now, DeepBench includes results for NVIDIA’s TitanX, M40, and TitanX Pascal, and for Intel’s Knights Landing. The next step would be to include some of the alternative architectures like Graphcore, Nervana, and Knupath.
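The benchmarking idea is straightforward: time a core operation, such as a dense matrix multiply (GEMM), at a deep-learning-relevant size and report achieved throughput. The sketch below is a schematic of that approach in plain Python — DeepBench itself is a compiled benchmark suite, and the problem size here is hypothetical, chosen only for illustration:

```python
import time
import numpy as np

# Schematic of an operations-level benchmark (not DeepBench itself):
# time a dense GEMM and convert wall time to achieved GFLOP/s.

def time_gemm(m, n, k, repeats=5):
    """Return best-of-N wall time for an (m,k) x (k,n) multiply."""
    a = np.random.rand(m, k).astype(np.float32)
    b = np.random.rand(k, n).astype(np.float32)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)
    return best

# A hypothetical problem size in the spirit of operations benchmarking.
m, n, k = 512, 512, 512
elapsed = time_gemm(m, n, k)
gflops = 2 * m * n * k / elapsed / 1e9   # a GEMM costs ~2*m*n*k flops
print(f"{m}x{k} @ {k}x{n}: {elapsed * 1e3:.2f} ms, {gflops:.1f} GFLOP/s")
```

Because the same operation sizes can be run on any hardware, numbers like these let different architectures be compared head-to-head at the operation level, independent of any particular model.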
