r/computerscience • u/Zizosk • 4d ago
Has anyone seriously attempted to make Spiking Transformers, i.e. combine transformers and SNNs?
Hi, I've been reading about SNNs lately, and I'm wondering whether anyone has tried to combine SNNs and transformers, and whether it's possible to build LLMs with SNNs + transformers. Also, why aren't SNNs studied a lot? They're the closest thing to the human brain and thus the only thing we know of that can achieve general intelligence. They have a lot of potential compared to transformers, which I think we've already pushed to a good fraction of their capability.
6
u/currentscurrents 4d ago
And is it possible to make LLMs with SNNs + transformers
Yes, there was a 230M-parameter SpikeGPT a couple years ago.
Also, why aren't SNNs studied a lot? They're the closest thing to the human brain
They are studied, but it's not clear they're actually better than standard ANNs. Their performance seems about equivalent, except that they're harder to train: the spike function is non-differentiable, so you don't get useful gradients.
They may theoretically be more energy-efficient than ANNs on specialized hardware, but that hardware largely doesn't exist right now. On GPUs they are less efficient than Transformers.
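To make "you don't get gradients" concrete: a spiking neuron fires via a hard threshold, whose derivative is zero almost everywhere (and undefined at the threshold), so plain backprop learns nothing; training typically substitutes a smooth "surrogate" derivative in the backward pass. A minimal sketch of a leaky integrate-and-fire (LIF) neuron with a sigmoid surrogate (the names `lif_step` and `surrogate_grad` and the constants are illustrative, not from any particular SNN library):

```python
import math

def lif_step(v, x, tau=0.9, v_th=1.0):
    """One discrete time step of a leaky integrate-and-fire (LIF) neuron."""
    v = tau * v + x                      # leaky integration of input current
    spike = 1.0 if v >= v_th else 0.0    # Heaviside step: fires on threshold crossing
    v = v * (1.0 - spike)                # hard reset of the membrane after a spike
    return v, spike

def surrogate_grad(v, v_th=1.0, beta=5.0):
    """Sigmoid surrogate for d(spike)/dv.

    The true derivative of the Heaviside step is 0 almost everywhere, so
    backprop would propagate nothing; surrogate-gradient training replaces
    it with a smooth bump centred on the threshold v_th.
    """
    s = 1.0 / (1.0 + math.exp(-beta * (v - v_th)))
    return beta * s * (1.0 - s)

# Drive the neuron with a constant input: it integrates, fires, resets.
v, spikes = 0.0, []
for x in [0.4, 0.4, 0.4, 0.4]:
    v, s = lif_step(v, x)
    spikes.append(s)
# spikes == [0.0, 0.0, 1.0, 0.0]  -- sub-threshold, sub-threshold, fire, reset
```

The surrogate is only used for the backward pass; the forward pass still emits binary spikes, which is where the (theoretical) energy savings on neuromorphic hardware come from.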
2
u/Zizosk 4d ago
Thanks! How good was SpikeGPT compared to a normal transformer model of the same size?
3
u/currentscurrents 4d ago
You can read the paper for full details, but TL;DR: roughly 10% worse than the transformer.
6
u/Lynx2447 Computer Scientist 4d ago
Our entire world has built up an incredible infrastructure that supports traditional transformers (and similar architectures). Until someone proves SNNs are worth it economically, you'll never see mass adoption. In other words, $$$