I was trying to build a CNN with PyTorch and ran into difficulty with max pooling. I have taken Stanford's cs231n course. As I recall, max pooling can be used as a dimensionality-reduction step: for example, if I feed a (1, 20, height, width) input to max_pool2d (assuming my batch size is 1) and use a (1, 1) kernel, I want an output of shape (1, 1, height, width), which means the kernel should slide over the channel dimension. However, the PyTorch docs say the kernel slides over height and width. Thanks to @ImgPrcSng on the PyTorch forum, who told me to use max_pool3d, which turned out to work well. But there is still a reshape operation between the output of the conv2d layer and the input of the max_pool3d layer, so it is hard to wrap everything in a single nn.Sequential. Is there another way to do this?
- In order to get good answers, you should ask good questions that show your effort in solving the problem. Also try to format your post nicely using [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) – jdhao Nov 17 '17 at 15:39
3 Answers
Would something like this work?

```python
from torch.nn import MaxPool1d
import torch.nn.functional as F


class ChannelPool(MaxPool1d):
    def forward(self, input):
        n, c, w, h = input.size()
        # flatten the spatial dims and move channels last,
        # so max_pool1d slides over the channel dimension
        input = input.view(n, c, w * h).permute(0, 2, 1)
        pooled = F.max_pool1d(
            input,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        _, _, c = pooled.size()
        pooled = pooled.permute(0, 2, 1)
        return pooled.view(n, c, w, h)
```
Or, using einops:

```python
from torch.nn import MaxPool1d
import torch.nn.functional as F
from einops import rearrange


class ChannelPool(MaxPool1d):
    def forward(self, input):
        n, c, w, h = input.size()
        pool = lambda x: F.max_pool1d(
            x,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        return rearrange(
            pool(rearrange(input, "n c w h -> n (w h) c")),
            "n (w h) c -> n c w h",
            n=n,
            w=w,
            h=h,
        )
```
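As a quick usage sketch (the class is repeated from above so the snippet is self-contained; pooling all 20 channels down to 1 assumes `kernel_size` equals the channel count):

```python
import torch
import torch.nn.functional as F
from torch.nn import MaxPool1d


class ChannelPool(MaxPool1d):
    """Max-pools over the channel dimension (same class as above)."""

    def forward(self, input):
        n, c, w, h = input.size()
        # flatten spatial dims and move channels last
        input = input.view(n, c, w * h).permute(0, 2, 1)
        pooled = F.max_pool1d(
            input,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        _, _, c = pooled.size()
        return pooled.permute(0, 2, 1).view(n, c, w, h)


layer = ChannelPool(kernel_size=20)  # kernel_size = number of input channels
x = torch.randn(1, 20, 32, 32)
out = layer(x)
print(out.shape)  # torch.Size([1, 1, 32, 32])
```

With `kernel_size` equal to the channel count, the 1-D max pool takes one maximum over all channels per spatial location, so the result matches a plain max over `dim=1`.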

– gngdb
- I think that `pooled` must be used instead of `input` in the last 3 lines – Guglie Mar 24 '20 at 17:45
- Also change the imports to `from torch.nn import MaxPool1d` and `import torch.nn.functional as F` in recent PyTorch versions. – anilsathyan7 Oct 22 '20 at 07:21
To max-pool in each coordinate over all channels, simply use the `Reduce` layer from einops:

```python
from einops.layers.torch import Reduce

max_pooling_layer = Reduce('b c h w -> b 1 h w', 'max')
```

The layer can be used in your model like any other torch module.

– Alleo
I'm not sure why the other answers are so complicated. Max pooling over the whole channel dimension to get an output with only one channel is equivalent to just taking the maximum value over that dimension:

```python
torch.amax(left_images, dim=1, keepdim=True)
```
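A quick shape check (the tensor name and sizes are illustrative):

```python
import torch

left_images = torch.randn(1, 20, 32, 32)  # (batch, channels, height, width)
pooled = torch.amax(left_images, dim=1, keepdim=True)
print(pooled.shape)  # torch.Size([1, 1, 32, 32])
```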

– Alex Li