Biologically Sound Neural Networks for Embedded Systems Using OpenCL


Guest post by Anita Sobe

Presenting our paper

I. Fehervari, A. Sobe, W. Elmenreich. Biologically Sound Neural Networks for Embedded Systems Using OpenCL. Proceedings of the International Conference on NETworked sYStems (NETYS 2013), Marrakech, Morocco, Springer 2013.

in the format of a short announcement was an interesting challenge. The task was to get other researchers to read our paper with only five minutes to talk about it. Furthermore, the audience was broad, covering all areas of distributed systems. So I had to introduce spiking neural networks and the motivation for using them on a distributed embedded system before getting to our approach of implementing them with OpenCL:


Spiking Neural Network Model

Neural networks are widely used in machine learning, and many implementations exist for tasks such as image processing and general information processing. Biologically sound neural networks can be more expressive than standard artificial neural network (ANN) models because they encode information in a spike train, so the timing of the spikes also carries information.

Spiking neural networks thus have attractive properties, but emulating them requires significant computing power.
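To make the idea of a spike train concrete, here is a minimal sketch of a single leaky integrate-and-fire neuron. This is a common textbook model chosen purely for illustration, not necessarily the neuron model used in the paper; the function name simulate_lif and the parameter values (threshold, leak, reset) are assumptions.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Illustrative model only (not taken from the paper).
    Returns the list of time steps at which the neuron spiked.
    """
    potential = reset
    spike_times = []
    for t, current in enumerate(input_current):
        # Leak part of the stored potential, then integrate the new input.
        potential = leak * potential + current
        if potential >= threshold:
            spike_times.append(t)
            potential = reset  # fire, then start over from the reset value
    return spike_times


# Two constant inputs of different strength, 20 time steps each.
weak = simulate_lif([0.3] * 20)
strong = simulate_lif([0.6] * 20)
print("weak input spikes at:  ", weak)
print("strong input spikes at:", strong)
```

The stronger input makes the neuron fire more often, so the input strength is visible in the timing of the spikes rather than in a single output value.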


Example structure with 10x10x10 neurons. Typical structures are much larger and require a high number of parallel calculations.

For embedded systems, computation is a critical resource. We propose using OpenCL for massive parallelization of the neural network model. OpenCL is a framework for writing programs that run on parallel hardware such as GPUs and multi-core CPUs. But this alone is not enough: the most expensive part is updating the neurons and the state of the neighbors they influence. We therefore propose a connection model in which each neuron is connected only to its neighbors, up to a given hop distance. Using this model, we were able to simulate 1 million neurons instead of 100,000 (which is already large for typical networks). The performance gain is already excellent, but we went even further.
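A rough sketch of such a neighborhood-limited connection model on the 10x10x10 example structure follows. It assumes that "hop distance" means Manhattan distance on a 3D grid, which may differ from the definition in the paper; the helper names neighbors and count_connections are made up for this illustration.

```python
from itertools import product


def neighbors(index, shape, max_hops):
    """Return grid coordinates within max_hops of index.

    Assumption: hops are counted as Manhattan distance on a 3D grid.
    """
    x, y, z = index
    result = []
    for dx, dy, dz in product(range(-max_hops, max_hops + 1), repeat=3):
        hops = abs(dx) + abs(dy) + abs(dz)
        if hops == 0 or hops > max_hops:
            continue  # skip the neuron itself and anything too far away
        nx, ny, nz = x + dx, y + dy, z + dz
        # Neurons at the border simply have fewer neighbors.
        if 0 <= nx < shape[0] and 0 <= ny < shape[1] and 0 <= nz < shape[2]:
            result.append((nx, ny, nz))
    return result


def count_connections(shape, max_hops):
    """Total number of directed connections in the locally connected grid."""
    all_indices = product(*(range(n) for n in shape))
    return sum(len(neighbors(idx, shape, max_hops)) for idx in all_indices)


shape = (10, 10, 10)
num_neurons = shape[0] * shape[1] * shape[2]
local = count_connections(shape, max_hops=1)
full = num_neurons * (num_neurons - 1)  # every neuron connected to every other
print("local connections:", local)
print("full connections: ", full)
```

With a fixed hop distance, the number of connections grows roughly linearly with the number of neurons instead of quadratically, which is what keeps the update step manageable for large networks.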

Performance gain

OpenCL provides fast local memory that is shared within a work-group (a so-called task group) and slower global memory that is shared by all work-items. We therefore redesigned the implementation so that it uses only OpenCL's local memory. This final measure reduces latency enough to run our system with a high number of neurons on an embedded node such as a robot or a smart camera attached to a drone.
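To get a feeling for what a local-memory-only design implies, here is a back-of-the-envelope sketch. All numbers are hypothetical and not taken from the paper: 32 KiB of local memory per work-group (task group) and 16 bytes of state per neuron. The helper work_groups_needed is likewise invented for this illustration, and it ignores any extra space needed for the state of neighbors at the edge of a group.

```python
import math


def work_groups_needed(num_neurons, local_mem_bytes, bytes_per_neuron):
    """Return (neurons per work-group, number of work-groups).

    Assumes the state of every neuron in a group must fit in local memory.
    """
    per_group = local_mem_bytes // bytes_per_neuron
    if per_group == 0:
        raise ValueError("one neuron's state does not fit in local memory")
    return per_group, math.ceil(num_neurons / per_group)


# Hypothetical device: 32 KiB local memory, 16 bytes of state per neuron.
per_group, groups = work_groups_needed(1_000_000, 32 * 1024, 16)
print("neurons per work-group:", per_group)
print("work-groups needed:    ", groups)
```

The actual figures depend on the device and on how much state each neuron carries, so this is only meant to show the kind of budget calculation involved.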

