# Gaussian Error Linear Units (GELUs)
This software allows users to reproduce the results in *Gaussian Error Linear Units (GELUs)* by Dan Hendrycks and Kevin Gimpel (2016).
## GELU Approximations
The `sigmoid(1.702 * x) * x` approximation is fast but somewhat inaccurate. Meanwhile, `0.5 * x * (1 + tanh(x * 0.7978845608 * (1 + 0.044715 * x * x)))` is slower but more accurate; the constant `0.7978845608` is `sqrt(2 / pi)`.
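As a concrete reference, here is a minimal NumPy sketch comparing both approximations against the exact GELU, `x * Phi(x)`; the use of `scipy.special.erf` is an assumption for illustration, not a dependency of this repository:

```python
import numpy as np
from scipy.special import erf

def gelu_exact(x):
    # Exact GELU: x * Phi(x), where Phi is the standard Gaussian CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_sigmoid_approx(x):
    # Fast sigmoid-based approximation: sigmoid(1.702 * x) * x.
    return x / (1.0 + np.exp(-1.702 * x))

def gelu_tanh_approx(x):
    # More accurate tanh-based approximation; 0.7978845608 ~= sqrt(2 / pi).
    return 0.5 * x * (1.0 + np.tanh(0.7978845608 * x * (1.0 + 0.044715 * x * x)))

x = np.linspace(-4.0, 4.0, 1000)
# Max absolute deviation of each approximation from the exact GELU.
print(np.max(np.abs(gelu_sigmoid_approx(x) - gelu_exact(x))))
print(np.max(np.abs(gelu_tanh_approx(x) - gelu_exact(x))))
```

The tanh variant's error should come out noticeably smaller than the sigmoid variant's, matching the speed/accuracy trade-off described above.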
However, exact versions are now available in PyTorch, so these approximations are no longer necessary to obtain suitable speed.
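For instance, in recent PyTorch versions (1.12+, where `torch.nn.functional.gelu` accepts an `approximate` argument), the exact erf-based form is the default:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-4.0, 4.0, steps=9)
exact = F.gelu(x)                            # exact erf-based GELU (default)
tanh_approx = F.gelu(x, approximate="tanh")  # built-in tanh approximation
print(torch.max(torch.abs(exact - tanh_approx)))
```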
## Execution
Please install TensorFlow, Lasagne, and Python 3+.
## Citation
If you find this useful in your research, please consider citing:
```
@article{hendrycks2016gelu,
  title={Gaussian Error Linear Units (GELUs)},
  author={Hendrycks, Dan and Gimpel, Kevin},
  journal={arXiv preprint arXiv:1606.08415},
  year={2016}
}
```