I want to make a simple neural network which uses the ReLU function. Can someone give me a clue about how I can implement the function using numpy?
10 Answers
There are a couple of ways.
>>> x = np.random.random((3, 2)) - 0.5
>>> x
array([[-0.00590765,  0.18932873],
       [-0.32396051,  0.25586596],
       [ 0.22358098,  0.02217555]])
>>> np.maximum(x, 0)
array([[ 0.        ,  0.18932873],
       [ 0.        ,  0.25586596],
       [ 0.22358098,  0.02217555]])
>>> x * (x > 0)
array([[-0.        ,  0.18932873],
       [-0.        ,  0.25586596],
       [ 0.22358098,  0.02217555]])
>>> (abs(x) + x) / 2
array([[ 0.        ,  0.18932873],
       [ 0.        ,  0.25586596],
       [ 0.22358098,  0.02217555]])
Timing these with the following code:
import numpy as np
x = np.random.random((5000, 5000)) - 0.5
print("max method:")
%timeit -n10 np.maximum(x, 0)
print("multiplication method:")
%timeit -n10 x * (x > 0)
print("abs method:")
%timeit -n10 (abs(x) + x) / 2
We get:
max method:
10 loops, best of 3: 239 ms per loop
multiplication method:
10 loops, best of 3: 145 ms per loop
abs method:
10 loops, best of 3: 288 ms per loop
So the multiplication seems to be the fastest.
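If modifying `x` in place is acceptable, `np.maximum` also takes an out argument; here is a minimal sketch (note that this overwrites `x`, so only use it when the original values are no longer needed):
import numpy as np

x = np.random.random((3, 2)) - 0.5
np.maximum(x, 0, out=x)  # result is written back into x, no new array is allocated
print(x)                 # the former negatives are now 0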

- +1. I took the liberty to add some timeit results to your answer. Please feel free to edit them or revert the edit if you wish. – IVlad Aug 20 '15 at 09:17
- `np.maximum(x, 0, x)` runs fastest here. – Daniel S. May 29 '16 at 17:37
- @DanielS. For the future reader: The last `x` in `maximum(x, 0, x)` means "please change `x` in place rather than allocating a new matrix". ([source](https://numpy.org/doc/stable/reference/generated/numpy.maximum.html?highlight=max#numpy.maximum)) – ynn Jul 02 '20 at 11:13
- @DanielS. if in-place ops are an option, then there are faster in-place ops as pointed out in [Tobias's response](https://stackoverflow.com/a/46837904/3888455). – Sid Jul 04 '20 at 18:49
You can do it in a much easier way:
def ReLU(x):
    return x * (x > 0)

def dReLU(x):
    return 1. * (x > 0)
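For illustration, a minimal sketch of these in use; the comparison `x > 0` broadcasts against the scalar 0 and yields a boolean mask that acts as 0/1 in the element-wise multiplication:
import numpy as np

x = np.array([[-0.3, 0.6], [0.2, -0.1]])
print(ReLU(x))   # [[-0.   0.6] [ 0.2 -0. ]]  -> negatives become (signed) zero
print(dReLU(x))  # [[0. 1.] [1. 0.]]          -> gradient is 1 where x > 0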

- Thanks. I found this to be faster than the fancy index method. @Shital Shah Could you please explain this syntax more or share some links? – Sreeragh A R Feb 02 '20 at 07:26
- It's just broadcasting and element-wise multiplication. The `0` will automatically be turned into the same size as tensor `x`. The `bool` result will be turned into 0 or 1 and then will get multiplied elementwise. There is no magic :). – Shital Shah Feb 02 '20 at 08:03
I'm completely revising my original answer because of points raised in the other answers and comments. Here is the new benchmark script:
import time
import numpy as np

def fancy_index_relu(m):
    m[m < 0] = 0

relus = {
    "max": lambda x: np.maximum(x, 0),
    "in-place max": lambda x: np.maximum(x, 0, x),
    "mul": lambda x: x * (x > 0),
    "abs": lambda x: (abs(x) + x) / 2,
    "fancy index": fancy_index_relu,
}

for name, relu in relus.items():
    n_iter = 20
    x = np.random.random((n_iter, 5000, 5000)) - 0.5
    t1 = time.time()
    for i in range(n_iter):
        relu(x[i])
    t2 = time.time()
    print("{:>12s} {:3.0f} ms".format(name, (t2 - t1) / n_iter * 1000))
It takes care to use a different ndarray for each implementation and iteration. Here are the results:
max 126 ms
in-place max 107 ms
mul 136 ms
abs 86 ms
fancy index 132 ms

- How does np.maximum(x,0,x) take less time compared to np.maximum(0,x)? – pikachuchameleon Jan 13 '17 at 20:05
- @pikachuchameleon It is faster because it is in-place. The return value of `np.maximum(x, 0, x)` is ignored and the result is directly written to `x`. – Lenar Hoyt Jun 17 '17 at 17:48
- If in-place ops are an option, then there are faster in-place ops as pointed out in [Tobias's response](https://stackoverflow.com/a/46837904/3888455). – Sid Jul 04 '20 at 18:49
EDIT: As jirassimok has mentioned below, my function changes the data in place, and after that first pass it runs a lot faster in timeit. That is what caused the good results; it's a kind of cheating. Sorry for the inconvenience.
I found a faster method for ReLU with numpy. You can use the fancy index feature of numpy as well.
fancy index:
20.3 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> x = np.random.random((5, 5)) - 0.5
>>> x
array([[-0.21444316, -0.05676216,  0.43956365, -0.30788116, -0.19952038],
       [-0.43062223,  0.12144647, -0.05698369, -0.32187085,  0.24901568],
       [ 0.06785385, -0.43476031, -0.0735933 ,  0.3736868 ,  0.24832288],
       [ 0.47085262, -0.06379623,  0.46904916, -0.29421609, -0.15091168],
       [ 0.08381359, -0.25068492, -0.25733763, -0.1852205 , -0.42816953]])
>>> x[x < 0] = 0
>>> x
array([[ 0.        ,  0.        ,  0.43956365,  0.        ,  0.        ],
       [ 0.        ,  0.12144647,  0.        ,  0.        ,  0.24901568],
       [ 0.06785385,  0.        ,  0.        ,  0.3736868 ,  0.24832288],
       [ 0.47085262,  0.        ,  0.46904916,  0.        ,  0.        ],
       [ 0.08381359,  0.        ,  0.        ,  0.        ,  0.        ]])
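If you need to keep the original values (for example for the backward pass), a minimal sketch is to apply the fancy indexing to a copy instead of mutating x:
import numpy as np

x = np.random.random((5, 5)) - 0.5
y = x.copy()   # keep the original x intact
y[y < 0] = 0   # ReLU applied in place, but only on the copy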
Here is my benchmark:
import numpy as np
x = np.random.random((5000, 5000)) - 0.5
print("max method:")
%timeit -n10 np.maximum(x, 0)
print("max inplace method:")
%timeit -n10 np.maximum(x, 0, x)
print("multiplication method:")
%timeit -n10 x * (x > 0)
print("abs method:")
%timeit -n10 (abs(x) + x) / 2
print("fancy index:")
%timeit -n10 x[x < 0] = 0
max method:
241 ms ± 3.53 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
max inplace method:
38.5 ms ± 4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
multiplication method:
162 ms ± 3.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
abs method:
181 ms ± 4.18 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
fancy index:
20.3 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

- (+1) Your fancy method is the only method I have actually seen used before! It's not only efficient, but also perfectly describes the ReLU operation, in my opinion. – n1k31t4 Jul 05 '18 at 22:13
- This method is only faster than the others when the array has no negative numbers; your test seems fast because timeit modifies the array, so after the first loop, there are no negatives left and it runs faster. In a test that re-generated the array each time, the logical indexing assignment (`a[a < 0] = 0`) performed worst of the methods, with `np.maximum` doing best. – jirassimok Feb 09 '20 at 07:32
- @jirassimok you are right. My function will modify the data in place. And after one run it will be a lot faster. I will change my post. – Tobias Jul 13 '20 at 07:38
Richard Möhn's comparison is not fair.
As Andrea Di Biagio's comment points out, the in-place method np.maximum(x, 0, x) will modify x on the first loop.
So here is my benchmark:
import numpy as np

def baseline():
    x = np.random.random((5000, 5000)) - 0.5
    return x

def relu_mul():
    x = np.random.random((5000, 5000)) - 0.5
    out = x * (x > 0)
    return out

def relu_max():
    x = np.random.random((5000, 5000)) - 0.5
    out = np.maximum(x, 0)
    return out

def relu_max_inplace():
    x = np.random.random((5000, 5000)) - 0.5
    np.maximum(x, 0, x)
    return x
Timing it:
print("baseline:")
%timeit -n10 baseline()
print("multiplication method:")
%timeit -n10 relu_mul()
print("max method:")
%timeit -n10 relu_max()
print("max inplace method:")
%timeit -n10 relu_max_inplace()
We get the results:
baseline:
10 loops, best of 3: 425 ms per loop
multiplication method:
10 loops, best of 3: 596 ms per loop
max method:
10 loops, best of 3: 682 ms per loop
max inplace method:
10 loops, best of 3: 602 ms per loop
The in-place maximum method is only a bit faster than the maximum method, perhaps because it omits the variable assignment for 'out'. And it's still slower than the multiplication method.
And since you're implementing the ReLU function, you may have to save 'x' for backprop through ReLU, e.g.:
def relu_backward(dout, cache):
    x = cache
    dx = np.where(x > 0, dout, 0)
    return dx
So I recommend you use the multiplication method.
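For completeness, here is a minimal sketch of a forward/backward pair built on the multiplication method (the function and variable names are illustrative, not from any particular library):
import numpy as np

def relu_forward(x):
    out = x * (x > 0)
    cache = x                      # keep x for the backward pass
    return out, cache

def relu_backward(dout, cache):
    x = cache
    dx = np.where(x > 0, dout, 0)  # pass the gradient only where x was positive
    return dx

x = np.random.random((3, 3)) - 0.5
out, cache = relu_forward(x)
dx = relu_backward(np.ones_like(out), cache)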

- Why does your benchmark show that `relu_mul` is fastest, but you say `relu_max_inplace` is slightly faster? Also, why do you initialise the test matrix in each function and not just once at the beginning for each method? Your timings now include the time taken to create a matrix with 5000*5000 = 25000000 elements, roughly 200 MB in size if the default `float64` is used. `%timeit np.random.random((5000, 5000)) - 0.5` gives `273 ms ± 7.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`. That is over a third of the actual timings you posted. – n1k31t4 Jul 01 '18 at 15:16
- @n1k31t4 First, I say `relu_max_inplace` is slightly faster than `relu_max`, but the most recommended method is `relu_mul`. – ivanpp Sep 14 '18 at 05:35
- I added the initialise func `np.random.random()` intentionally, because if I don't do this, the `relu_max_inplace` method will seem to be extremely fast, like @Richard Möhn's result. @Richard Möhn's result shows that `relu_max_inplace` vs `relu_max` is 38.4ms vs 238ms per loop. It's just because the in-place method will only be executed once. Initialising the matrix in each loop avoids this situation, so the comparison is fair. – ivanpp Sep 14 '18 at 09:54
- @ivanpp I'm not sure including the random generation op in timing results is fair at all. – Sid Jul 04 '20 at 18:45
numpy doesn't have a built-in relu function, but you can define it yourself as follows:
import numpy as np

def relu(x):
    return np.maximum(0, x)
For example:
arr = np.array([[-1,2,3],[1,2,3]])
ret = relu(arr)
print(ret) # print [[0 2 3] [1 2 3]]

If we have 3 parameters (t0, a0, a1) for the ReLU, that is, we want to implement

if x > t0:
    x = x * a1
else:
    x = x * a0

we can use the following code:

X = X * (X > t0) * a1 + X * (X < t0) * a0

Here X is a matrix.
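A minimal worked example (the parameter values are made up; with t0 = 0, a1 = 1 and a0 = 0.01 this behaves like a leaky ReLU):
import numpy as np

t0, a0, a1 = 0.0, 0.01, 1.0
X = np.array([[-2.0, 3.0], [0.5, -0.1]])
X = X * (X > t0) * a1 + X * (X < t0) * a0
# -2.0 -> -0.02, 3.0 -> 3.0, 0.5 -> 0.5, -0.1 -> -0.001
print(X)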

For a single neuron:

def relu(net):
    return max(0, net)

where net is the net activity at the neuron's input (net = dot(w, x)), and dot() is the dot product of w and x (the weight vector and the input vector, respectively); dot() is a function defined in the numpy package in Python.

For the neurons in a layer with net vector:

def relu(net):
    return np.maximum(net, 0)
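A minimal usage sketch for the layer version (the weight matrix and input below are made up):
import numpy as np

def relu(net):
    return np.maximum(net, 0)

W = np.random.randn(3, 4)  # hypothetical layer: 3 neurons, 4 inputs
x = np.random.randn(4)
net = np.dot(W, x)         # net activity of the layer
print(relu(net))           # negative activations are clipped to 0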

This is a more precise implementation:
def ReLU(x):
    return abs(x) * (x > 0)
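For what it's worth, a quick sanity-check sketch showing that this returns the same values as the plain maximum formulation:
import numpy as np

x = np.random.random((4, 4)) - 0.5
print(np.allclose(abs(x) * (x > 0), np.maximum(x, 0)))  # True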

- Why? The `abs` is unnecessary given that you're stamping out all the negative components. – Matt Hancock May 02 '19 at 12:40