It’s no secret that Google has developed its own custom chips to accelerate its machine learning algorithms. The company first revealed those chips, called Tensor Processing Units (TPUs), at its I/O developer conference back in May 2016, but it never went into all that many details about them, except for saying that they were optimized around the company’s own TensorFlow machine-learning framework. Today, for the first time, it’s sharing more details and benchmarks about the project.
If you’re a chip designer, you can find all the glorious details of how the TPU works in Google’s paper. The numbers that matter most here, though, come from Google’s own benchmarks (and it’s worth keeping in mind that this is Google evaluating its own chip): the TPUs are on average 15x to 30x faster at executing Google’s regular machine learning workloads than a standard GPU/CPU combination (in this case, Intel Haswell processors and Nvidia K80 GPUs). And because power consumption counts in a data center, the TPUs also offer 30x to 80x higher TeraOps/Watt (and with faster memory in the future, those numbers will probably increase).
It’s worth noting, by the way, that these numbers are about using machine learning models in production (that is, running already-trained models), not about creating the model in the first place.
Google also notes that most architects optimize their chips for convolutional neural networks (a specific type of neural network that works well for image recognition, for example). Those networks, however, only account for about 5 percent of Google’s own data center workload, the company says, while the majority of its applications use multi-layer perceptrons.
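For readers who haven’t met a multi-layer perceptron before, here is a rough sketch of what running one in production looks like: a stack of fully connected layers, each one a weighted sum followed by a simple nonlinearity. This is a toy illustration, not Google’s code or anything specific to the TPU; the function names, the ReLU activation and the weights are all made up for the example.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    """Clamp negative values to zero (an assumed choice of activation)."""
    return [max(0.0, v) for v in values]


def mlp_forward(inputs, layers):
    """Inference only: apply each (weights, biases) layer in turn.

    ReLU is applied after every layer except the last one.
    """
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical toy network: 2 inputs -> 2 hidden units -> 1 output.
LAYERS = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, 1.0]),
    ([[2.0], [-1.0]], [0.5]),
]

print(mlp_forward([1.0, 2.0], LAYERS))
```

Nearly all of the work here is multiplying and adding, and nothing in the forward pass updates the weights, which is the difference between using a model and creating one.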
Google isn’t likely to make the TPUs available outside of its own cloud, but the company notes that it expects that others will take what it has learned and “build successors that will raise the bar even higher.”