Fastnet: A Distributed Convolutional Neural Network on GPU clusters

Convolutional neural networks (CNNs) have achieved striking results in image recognition in recent years. Unfortunately, they are extremely expensive to train: the winning CNN in the ImageNet 2013 competition took more than a week to train on a single machine with two GPUs. The goal of Fastnet is to build a distributed framework that parallelizes the training of convolutional neural networks across multiple GPUs over a fast network interconnect. Fastnet can leverage existing CNN backends such as cuda-convnet and Caffe.
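To illustrate the general idea behind this kind of parallel training, here is a minimal sketch of synchronous data-parallel SGD: each worker (one per GPU) computes a gradient on its own shard of the mini-batch, the gradients are averaged over the interconnect, and every model replica applies the same update. All function names below are illustrative assumptions, not Fastnet's actual API; a real system would use GPU kernels and network all-reduce rather than Python lists.

```python
# Sketch of synchronous data-parallel training (illustrative only,
# not Fastnet's actual API). Each "worker" stands in for one GPU
# holding a full model replica and a shard of the mini-batch.

def average_gradients(worker_grads):
    """Simulate an all-reduce: average the per-worker gradient vectors."""
    n = len(worker_grads)
    return [sum(g[i] for g in worker_grads) / n
            for i in range(len(worker_grads[0]))]

def sgd_step(weights, worker_grads, lr=0.1):
    """One synchronous step: all-reduce the gradients, then apply the
    identical SGD update on every replica so they stay in sync."""
    avg = average_gradients(worker_grads)
    return [w - lr * g for w, g in zip(weights, avg)]

# Two workers each computed a gradient on their own data shard.
weights = [1.0, 2.0]
grads = [[0.2, 0.4],   # worker 0
         [0.6, 0.0]]   # worker 1
weights = sgd_step(weights, grads)
print([round(w, 4) for w in weights])  # -> [0.96, 1.98]
```

Because the averaged gradient is identical on every replica, no parameter server is strictly required; the communication cost per step is one all-reduce of the gradient, which is why a fast interconnect matters.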

People:

Publications:

Source code: on GitHub

Acknowledgements:
We thank the NSF and the team that builds and maintains the NSF PRObE Susitna testbed. The Fastnet project uses Susitna to scale CNN training across a GPU cluster of many machines.