At its I/O 2017 conference on May 17, Google announced server hardware that will bring new power to machine learning in the cloud.
The second-generation Cloud Tensor Processing Units (TPUs) are purpose-built hardware for the massive computations that machine learning at scale requires. TensorFlow, the open-source machine learning project that Google created, stands to be the big winner with the new TPUs.
Each second-generation TPU offers 180 teraflops of performance. Google has long favored scale-out architectures, and the TPUs are no exception. As part of Google Compute Engine (GCE), Google is deploying TPU pods, each composed of 64 second-generation TPUs.
With 64 devices at 180 teraflops each, the total capacity of a TPU pod is a staggering 11.5 petaflops of computing power.
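The pod figure follows directly from the per-device number. As a rough check, the short Python sketch below multiplies it out. It assumes that peak throughput scales linearly with the number of devices and that one petaflop equals 1,000 teraflops. The function name `pod_petaflops` is made up for this illustration and is not part of any Google API.

```python
# Figures quoted in the article.
TERAFLOPS_PER_TPU = 180   # peak performance of one second-generation TPU
TPUS_PER_POD = 64         # devices in one TPU pod


def pod_petaflops(tpus: int = TPUS_PER_POD) -> float:
    """Peak throughput, in petaflops, for the given number of TPUs.

    Assumes perfectly linear scaling, which is an idealization.
    """
    return tpus * TERAFLOPS_PER_TPU / 1000  # 1 petaflop = 1,000 teraflops


full_pod = pod_petaflops()
print(f"Full pod: {full_pod:.2f} petaflops")
```

The product is 11.52 petaflops, which the article rounds to 11.5. The same helper works for partial pods, for example `pod_petaflops(TPUS_PER_POD // 8)` for an eight-device slice.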
"Using these TPU pods, we've already seen dramatic improvements in training times," Google engineers Jeff Dean and Urs Holzle wrote in a blog post. "One of our new large-scale translation models used to take a full day to train on 32 of the best commercially-available GPUs — now it trains to the same accuracy in an afternoon using just one eighth of a TPU pod."
New TensorFlow Research Cloud Announced As Well
Going a step further, Google also announced the new TensorFlow Research Cloud, a cluster of 1,000 Cloud TPUs that is being made freely available to researchers.
"We hope the TensorFlow Research Cloud will allow as many researchers as possible to explore the frontier of machine learning research and extend it with new discoveries," Zak Stone, Product Manager for TensorFlow, wrote in a blog post.
Sean Michael Kerner is a senior editor at ServerWatch and InternetNews.com. Follow him on Twitter @TechJournalist.