3

I was trying to build a CNN with PyTorch and had difficulty with max pooling. I have taken Stanford's CS231n. As I recall, max pooling can be used as a dimensionality-reduction step: for example, given a (1, 20, height, width) input to max_pool2d (assuming my batch size is 1) and a (1, 1) kernel, I want an output of shape (1, 1, height, width), which means the kernel should slide over the channel dimension. However, after checking the PyTorch docs, I found that the kernel slides over height and width. Thanks to @ImgPrcSng on the PyTorch forum, who told me to use max_pool3d, which turned out to work well. But there is still a reshape operation between the output of the conv2d layer and the input of the max_pool3d layer, so it is hard to wrap everything in an nn.Sequential. Is there another way to do this?
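For reference, the max_pool3d workaround described above can be sketched like this (shapes are illustrative; the extra unsqueeze/squeeze is the reshape that gets in the way of nn.Sequential):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 20, 32, 32)                 # conv2d output: (N, C, H, W)
# add a dummy channel dim so C becomes the "depth": (N, 1, C, H, W)
pooled = F.max_pool3d(x.unsqueeze(1), kernel_size=(20, 1, 1))
out = pooled.squeeze(1)                        # back to (N, 1, H, W)
print(out.shape)                               # torch.Size([1, 1, 32, 32])
```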

Sun Chuanneng
  • In order to get good answers, you should ask good questions which show your effort in solving the problem. Also try to format your post nicely using [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) – jdhao Nov 17 '17 at 15:39

3 Answers

11

Would something like this work?

from torch.nn import MaxPool1d
import torch.nn.functional as F


class ChannelPool(MaxPool1d):
    def forward(self, input):
        n, c, w, h = input.size()
        # flatten the spatial dims and move channels last: (n, w*h, c)
        input = input.view(n, c, w * h).permute(0, 2, 1)
        # 1-d max pool over the channel dimension
        pooled = F.max_pool1d(
            input,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        _, _, c = pooled.size()  # number of channels after pooling
        # restore the original layout: (n, c, w, h)
        pooled = pooled.permute(0, 2, 1)
        return pooled.view(n, c, w, h)

Or, using einops:

from torch.nn import MaxPool1d
import torch.nn.functional as F
from einops import rearrange


class ChannelPool(MaxPool1d):
    def forward(self, input):
        n, c, w, h = input.size()
        pool = lambda x: F.max_pool1d(
            x,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        return rearrange(
            pool(rearrange(input, "n c w h -> n (w h) c")),
            "n (w h) c -> n c w h",
            n=n,
            w=w,
            h=h,
        )
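A quick sanity check of the flatten–permute–pool trick, assuming a 20-channel input (so `kernel_size=20` covers all channels in one window):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 20, 32, 32)                    # (N, C, H, W)
flat = x.view(1, 20, -1).permute(0, 2, 1)         # (N, H*W, C)
pooled = F.max_pool1d(flat, kernel_size=20)       # max over all 20 channels
out = pooled.permute(0, 2, 1).view(1, 1, 32, 32)  # (N, 1, H, W)
```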
gngdb
5

To max-pool in each spatial coordinate over all channels, simply use the `Reduce` layer from einops:

from einops.layers.torch import Reduce

max_pooling_layer = Reduce('b c h w -> b 1 h w', 'max')

This layer can be used in your model like any other torch module.

Alleo
0

I'm not sure why the other answers are so complicated. Max-pooling over the whole channel dimension to get an output with only one channel is equivalent to just taking the maximum value over that dimension:

torch.amax(left_images, dim=1, keepdim=True)
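A quick check of the shape this one-liner produces (`left_images` above is the answerer's own tensor; `x` here is an illustrative stand-in):

```python
import torch

x = torch.randn(1, 20, 32, 32)                # (N, C, H, W)
out = torch.amax(x, dim=1, keepdim=True)      # max over channels
print(out.shape)                              # torch.Size([1, 1, 32, 32])
```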
Alex Li