Machine learning and deep nets are becoming pervasive in desktop and mobile applications; however, running deep nets on small embedded devices remains a challenge. Waleed Abdulla will describe how to train deep net models on desktop GPU systems and then export them onto small embedded devices like the Raspberry Pi.
The challenges include choosing the right implementation for the embedded device (cuDNN, TensorFlow, etc.) and applying tricks that can improve performance, such as using half-precision parameters. Waleed will discuss training and deploying an end-to-end deep net on an embedded device.
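
As a rough illustration of the half-precision idea mentioned above, the sketch below stores a handful of weights as 16-bit floats instead of 32-bit floats and measures the round-trip error. It is not taken from the talk: the weight values and the helper names `pack_weights` and `unpack_weights` are made up for illustration, and it uses only Python's standard `struct` module rather than any particular deep learning framework.

```python
import struct


def pack_weights(weights, fmt):
    """Serialize a flat list of floats with a struct format code.

    'f' is IEEE 754 single precision (4 bytes per value),
    'e' is IEEE 754 half precision (2 bytes per value).
    """
    return struct.pack(f"<{len(weights)}{fmt}", *weights)


def unpack_weights(blob, fmt):
    """Deserialize a byte string written by pack_weights."""
    size = struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{len(blob) // size}{fmt}", blob))


# Hypothetical layer weights, invented for this example.
weights = [0.125, -0.5, 0.3333, 1.75, -0.0421]

full = pack_weights(weights, "f")  # 32-bit storage
half = pack_weights(weights, "e")  # 16-bit storage

# Read the half-precision values back and compare with the originals.
restored = unpack_weights(half, "e")
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("float32 bytes:", len(full))
print("float16 bytes:", len(half))
print("largest round-trip error:", max_error)
```

Halving the bytes per parameter halves the model's storage and memory traffic, at the cost of a small rounding error per weight. Whether that error is acceptable depends on the model, so accuracy would need to be checked on the target task after conversion.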