It looks like Deep (Convolutional) Neural Networks are really powerful. However, there are situations where they don’t deliver as expected. I assume many people are happy with pre-trained VGG, ResNet, YOLO, SqueezeNext, MobileNet, etc. models because they are “good enough”, even though these models break surprisingly easily on realistic problems and require enormous amounts of training data. IMHO there are much smarter approaches out there that are neglected or ignored. I don’t want to argue about why they are ignored; instead, I want to provide a list of other useful architectures.
Instead of staying with real numbers, we should have a look at complex numbers as well. Let’s remember why we use complex numbers (\(\mathbb{C}\)) or quaternions (\(\mathbb{H}\)) in the first place. The most important reason is not to solve \(x^2 = -1\). The reason we use complex numbers for everything that involves waves is that we are lazy, or rather efficient ;). Who wants to waste time writing down and solving a pile of trigonometric identities? The same is true for quaternions in robotics. Speaking in terms of computer science, we are simply using a more efficient data structure/representation. Complex-valued neural networks, as well as quaternion ones (quaternions being a hypercomplex extension of the complex numbers, for the mathematically inclined reader), seem to outperform real-valued neural networks while using fewer parameters. This makes sense, because the data structure itself helps to represent certain things in a much more useful way. The downside is that complex and quaternion neural networks can be a bit tricky to train: operations such as matrix inversion are quite slow, and therefore training is slower as well.
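To make the “lazy or efficient” point concrete, here is a tiny sketch (plain Python, nothing network-specific): rotating a 2-D point is a single complex multiplication, whereas the real-valued version forces you to spell out the sine/cosine addition formulas by hand.

```python
import cmath

def rotate(point: complex, angle: float) -> complex:
    """Rotate `point` around the origin by `angle` radians.

    One complex multiplication replaces the explicit 2x2 rotation
    matrix built from the trigonometric addition formulas.
    """
    return point * cmath.exp(1j * angle)

p = complex(1.0, 0.0)
q = rotate(p, cmath.pi / 2)  # a quarter turn: (1, 0) -> (0, 1)
```

The same idea scales up: a quaternion multiplication encodes a full 3-D rotation, which is exactly the representational shortcut these network architectures exploit.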
Name | Reference | Implementation |
---|---|---|
Depthwise Separable Convolutions | Chollet (2016) | Part of Keras |
Squeeze Layers | Iandola et al. (2016) | Code on GitHub |
Capsule Neural Networks | Sabour et al. (2017) | Code on GitHub |
Capsule Graph Neural Networks | Xinyi and Chen (2019) | Code on GitHub |
Complex Neural Networks | Trabelsi et al. (2017) | Code on GitHub |
Complex Convolutional Neural Networks | Trabelsi et al. (2017) | Code on GitHub |
Complex Convolutional LSTMs | Trabelsi et al. (2017) | Code on GitHub |
Unitary Evolution Recurrent Neural Networks | Arjovsky et al. (2015) | |
Full-Capacity Unitary Recurrent Neural Networks | Wisdom et al. (2016) | Code on GitHub |
Complex Evolutional Recurrent Neural Networks (ceRNNs) | Shafran et al. (2019) | |
Complex Gated Recurrent Units | Wolter and Yao (2018) | Code on GitHub |
Quaternion Convolutional Neural Networks | Parcollet et al. (2018) | Code on GitHub |
Quaternion Recurrent Layers | Parcollet et al. (2018) | Code on GitHub |
Quaternion LSTM Layers | Parcollet et al. (2018) | Code on GitHub |
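To illustrate what “complex-valued” means in practice, here is a minimal, hypothetical sketch of a complex dense layer (not the reference implementations linked above). It follows the common trick of storing two real weight matrices and expanding the complex product into real operations; all names are mine, and NumPy stands in for a real deep-learning framework.

```python
import numpy as np

class ComplexDense:
    """Minimal complex-valued dense layer, expanded into real-valued ops."""

    def __init__(self, in_features: int, out_features: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(in_features)
        # Two real matrices represent one complex matrix W = W_r + i*W_i.
        self.w_real = rng.normal(0.0, scale, (in_features, out_features))
        self.w_imag = rng.normal(0.0, scale, (in_features, out_features))

    def __call__(self, z: np.ndarray) -> np.ndarray:
        # (a + ib)(W_r + iW_i) = (aW_r - bW_i) + i(aW_i + bW_r)
        a, b = z.real, z.imag
        real = a @ self.w_real - b @ self.w_imag
        imag = a @ self.w_imag + b @ self.w_real
        return real + 1j * imag

layer = ComplexDense(4, 3)
z = np.ones(4) + 1j * np.zeros(4)
out = layer(z)
```

Note the parameter accounting: one complex weight costs two real numbers, yet the constrained way the four products combine is exactly the structural prior that the papers above credit for matching real-valued baselines with fewer parameters.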
Quantum neural networks and optical neural networks form a different class of neural networks: they require specialized hardware to really run them. Hence, they are not part of this list.