Which Activation Function Should I Use?

Activation functions are an essential component of neural networks: they decide whether a neuron should be activated or not.

Essentially, the role of an activation function is to produce a non-linear decision boundary via non-linear combinations of the weighted inputs.
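To see why the non-linearity matters, here is a tiny NumPy sketch (with arbitrary shapes chosen purely for illustration) showing that two stacked layers without an activation function collapse into a single linear layer, so no non-linear decision boundary can ever emerge:

```python
import numpy as np

# Without a non-linear activation, two stacked "layers" collapse into one
# linear map: W2 @ (W1 @ x + b1) + b2 == (W2 @ W1) @ x + (W2 @ b1 + b2).
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

stacked = W2 @ (W1 @ x + b1) + b2            # two "linear layers" in a row
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)   # one equivalent linear layer
print(np.allclose(stacked, collapsed))       # True: no extra expressive power
```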

And with so many different activation functions available, a natural follow-up question would be: which activation function should I use?

In a Nutshell

Listed in the order of recommendation:

  1. Exponential Linear Unit (ELU): keras.layers.ELU(alpha=1.0)
  2. Leaky ReLU + variants: keras.layers.LeakyReLU(alpha=0.01)
    • Randomized ReLU (RReLU)
    • Parametric Leaky ReLU (PReLU)
  3. ReLU
  4. tanh
  5. sigmoid/logistic
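For concreteness, here is a minimal sketch of how the top two picks slot into a Keras model. The architecture, layer sizes, and input shape below are placeholder assumptions, not something this ranking prescribes; the point is simply that ELU and Leaky ReLU are added as their own layers right after the linear Dense layers:

```python
from tensorflow import keras

# A small dense network using the top-ranked ELU activation. Swapping
# keras.layers.ELU(alpha=1.0) for keras.layers.LeakyReLU(alpha=0.01) tries
# the second choice without touching anything else.
model = keras.Sequential([
    keras.Input(shape=(28, 28)),                   # e.g. 28x28 grayscale images
    keras.layers.Flatten(),
    keras.layers.Dense(300),                       # linear part of the layer
    keras.layers.ELU(alpha=1.0),                   # non-linearity as its own layer
    keras.layers.Dense(100),
    keras.layers.ELU(alpha=1.0),
    keras.layers.Dense(10, activation="softmax"),  # 10-class output
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
```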


  • If you care a lot about runtime performance, you may prefer Leaky ReLUs over ELUs: the exponential function the ELU uses makes it slower to compute than the ReLU and its variants.
  • If you don’t want to tweak yet another hyperparameter, you can just stick with the alpha values shown above: 0.01 for the Leaky ReLU and 1.0 for the ELU.
  • If you have spare time and computing power, you can use cross-validation to evaluate other activation functions (see the sketch after this list), in particular:
    • RReLU if your network is overfitting (its randomized alpha acts as a regularizer), or
    • PReLU if you have a huge training set (its learned alpha can overfit smaller datasets).
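The cross-validation suggestion can be approximated with a quick comparison loop. The sketch below is only an illustration, not a full k-fold setup: it trains the same architecture once per candidate activation and compares hold-out validation accuracy. The architecture, MNIST as a stand-in dataset, and the training hyperparameters are all assumptions made for the example, and RReLU is omitted because Keras has no built-in layer for it:

```python
from tensorflow import keras

def build_model(make_activation):
    """Same architecture every time; only the activation layers change."""
    return keras.Sequential([
        keras.Input(shape=(28, 28)),
        keras.layers.Flatten(),
        keras.layers.Dense(300),
        make_activation(),
        keras.layers.Dense(100),
        make_activation(),
        keras.layers.Dense(10, activation="softmax"),
    ])

# Shortlist of candidate activations to compare.
candidates = {
    "elu": lambda: keras.layers.ELU(alpha=1.0),
    "leaky_relu": lambda: keras.layers.LeakyReLU(alpha=0.01),
    "prelu": keras.layers.PReLU,  # learns alpha; best suited to large training sets
}

# MNIST is used purely so the sketch runs end to end; substitute your own data.
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train / 255.0

for name, make_activation in candidates.items():
    model = build_model(make_activation)
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer="sgd", metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=5,
                        validation_split=0.2, verbose=0)
    print(f"{name}: best val accuracy = {max(history.history['val_accuracy']):.4f}")
```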

Cheers! ☕️