It looks like Deep (Convolutional) Neural Networks are really powerful. However, there are situations where they don’t deliver as expected. I assume many people are happy with pre-trained VGG, ResNet, YOLO, SqueezeNext, MobileNet, etc. models because they are “good enough”, even though these models break surprisingly easily on realistic problems and require enormous amounts of training data. IMHO there are much smarter approaches out there that are neglected or ignored. I don’t want to argue about why they are ignored; instead, I want to provide a list of other useful architectures.
Instead of staying with real numbers, we should have a look at complex numbers as well. Let’s remember why we use complex numbers (\(\mathbb{C}\)) or quaternions (\(\mathbb{H}\)) in the first place. The most important reason is not to solve \(x^2 = -1\). The reason we use complex numbers for everything that involves waves is that we are lazy, or rather efficient ;). Who wants to waste time writing down and solving a pile of trigonometric identities? The same is true for quaternions in robotics. Speaking in terms of computer science, we are simply using a more efficient data structure/representation. Complex-valued neural networks, as well as quaternion ones (quaternions being a hypercomplex extension of the complex numbers, for the mathematically inclined reader), seem to outperform real-valued neural networks while using fewer parameters. This makes sense, because the data structure itself helps to represent certain things in a much more useful way. The downside is that complex and quaternion neural networks can be a bit tricky to train: operations such as matrix inversion are quite slow, and therefore training is slower as well.
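To make the “lazy or efficient” point concrete, here is a tiny sketch (plain Python, nothing network-specific): rotating a 2-D point is a single complex multiplication, whereas the real-valued version forces you to spell out the sine/cosine addition formulas by hand.

```python
import cmath

def rotate(point: complex, angle: float) -> complex:
    """Rotate `point` around the origin by `angle` radians.

    One complex multiplication replaces the explicit 2x2 rotation
    matrix built from the trigonometric addition formulas.
    """
    return point * cmath.exp(1j * angle)

p = complex(1.0, 0.0)
q = rotate(p, cmath.pi / 2)  # a quarter turn: (1, 0) -> (0, 1)
```

The same idea scales up: a quaternion multiplication encodes a full 3-D rotation, which is exactly the representational shortcut these network architectures exploit.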
Name | Reference | Implementation |
---|---|---|
Depthwise Separable Convolutions | Chollet (2016) | Part of Keras |
Squeeze Layers | Iandola et al. (2016) | Code on GitHub |
Capsule Neural Networks | Sabour et al. (2017) | Code on GitHub |
Capsule Graph Neural Networks | Xinyi and Chen (2019) | Code on GitHub |
Complex Neural Networks | Trabelsi et al. (2017) | Code on GitHub |
Complex Convolutional Neural Networks | Trabelsi et al. (2017) | Code on GitHub |
Complex Convolutional LSTMs | Trabelsi et al. (2017) | Code on GitHub |
Unitary Evolution Recurrent Neural Networks | Arjovsky et al. (2015) | |
Full-Capacity Unitary Recurrent Neural Networks | Wisdom et al. (2016) | Code on GitHub |
Complex Evolutional Recurrent Neural Networks (ceRNNs) | Shafran et al. (2019) | |
Complex Gated Recurrent Units | Wolter and Yao (2018) | Code on GitHub |
Quaternion Convolutional Neural Networks | Parcollet et al. (2018) | Code on GitHub |
Quaternion Recurrent Layers | Parcollet et al. (2018) | Code on GitHub |
Quaternion LSTM Layers | Parcollet et al. (2018) | Code on GitHub |
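To illustrate what “complex-valued” means in practice, here is a minimal, hypothetical sketch of a complex dense layer (not the reference implementations linked above). It follows the common trick of storing two real weight matrices and expanding the complex product into real operations; all names are mine, and NumPy stands in for a real deep-learning framework.

```python
import numpy as np

class ComplexDense:
    """Minimal complex-valued dense layer, expanded into real-valued ops."""

    def __init__(self, in_features: int, out_features: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(in_features)
        # Two real matrices represent one complex matrix W = W_r + i*W_i.
        self.w_real = rng.normal(0.0, scale, (in_features, out_features))
        self.w_imag = rng.normal(0.0, scale, (in_features, out_features))

    def __call__(self, z: np.ndarray) -> np.ndarray:
        # (a + ib)(W_r + iW_i) = (aW_r - bW_i) + i(aW_i + bW_r)
        a, b = z.real, z.imag
        real = a @ self.w_real - b @ self.w_imag
        imag = a @ self.w_imag + b @ self.w_real
        return real + 1j * imag

layer = ComplexDense(4, 3)
z = np.ones(4) + 1j * np.zeros(4)
out = layer(z)
```

Note the parameter accounting: one complex weight costs two real numbers, yet the constrained way the four products combine is exactly the structural prior that the papers above credit for matching real-valued baselines with fewer parameters.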
Quantum neural networks and optical neural networks form a different class of neural networks: they require specialized hardware to really run them. Hence, they are not part of this list.