What is the difference between ‘SAME’ and ‘VALID’ padding in tf.nn.max_pool of tensorflow?
In my opinion, ‘VALID’ means there will be no zero padding outside the edges when we do max pool.
According to A guide to convolution arithmetic for deep learning, the pooling operator uses no padding, i.e. it corresponds to ‘VALID’ in TensorFlow.
But what is ‘SAME’ padding of max pool in tensorflow?
x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1])  # give a shape accepted by tf.nn.max_pool
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
valid_pad.get_shape() == [1, 1, 1, 1]  # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1]   # same_pad is [5., 6.]
"VALID" only ever drops the right-most columns (or bottom-most rows).
"SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Edit:
About the name:
With "SAME" padding, if you use a stride of 1, the layer’s outputs will have the same spatial dimensions as its inputs.
With "VALID" padding, there’s no “made-up” padding inputs. The layer only uses valid input data.
Padding is an operation that increases the size of the input data. In the case of 1-dimensional data you just append/prepend the array with a constant; in 2-dim you surround the matrix with these constants; in n-dim you surround your n-dim hypercube with the constant. In most cases this constant is zero, and it is then called zero-padding.
Here is an example of zero-padding with p=1 applied to a 2-d tensor:
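A minimal sketch with tf.pad (the 2×2 input values are an assumption for illustration):

import tensorflow as tf

x = tf.constant([[1., 2.],
                 [3., 4.]])
# add one row/column of zeros on every side (p=1)
padded = tf.pad(x, [[1, 1], [1, 1]])
# padded is now 4x4:
# [[0. 0. 0. 0.]
#  [0. 1. 2. 0.]
#  [0. 3. 4. 0.]
#  [0. 0. 0. 0.]]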
You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:
VALID padding. The easiest case: no padding at all. Just leave your data as it is.
SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride=1 (or for pooling) it should produce output of the same size as the input. It is called HALF because for a kernel of size k the padding is half of the full padding, i.e. p = (k - 1)/2 for an odd kernel size.
FULL padding is the maximum padding which does not result in a convolution over just padded elements. For a kernel of size k, this padding is equal to k - 1.
To use arbitrary padding in TF, you can use tf.pad()
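For example, a sketch with arbitrary (asymmetric) amounts, chosen just to show the mechanism:

import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
# 1 row of zeros on top, 2 at the bottom, 1 column on each side
padded = tf.pad(x, [[1, 2], [1, 1]])
padded.get_shape()  # (5, 5)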
VALID: Don’t apply any padding, i.e., assume that all dimensions are valid so that the input image gets fully covered by the filter and stride you specified.
SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and stride you specified. For stride 1, this will ensure that the output image size is the same as the input.
Notes
This applies to conv layers as well as max pool layers in the same way.
The term “valid” is a bit of a misnomer because things don’t become “invalid” if you drop part of the image. Sometimes you might even want that. This should probably have been called NO_PADDING instead.
The term “same” is a misnomer too because it only makes sense for a stride of 1, when the output dimension is the same as the input dimension. For a stride of 2, output dimensions will be halved, for example. This should probably have been called AUTO_PADDING instead.
In SAME (i.e. auto-pad mode), Tensorflow will try to spread padding evenly on both left and right.
In VALID (i.e. no padding mode), Tensorflow will drop right and/or bottom cells if your filter and stride don’t fully cover the input image.
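A minimal sketch of that dropping behaviour (the shapes are assumptions chosen for the example):

import tensorflow as tf

x = tf.reshape(tf.range(5, dtype=tf.float32), [1, 1, 5, 1])  # a "row" of 5 values
# kernel 2, stride 2: VALID drops the 5th value, SAME pads on the right to keep it
valid = tf.nn.max_pool(x, [1, 1, 2, 1], [1, 1, 2, 1], padding='VALID')
same = tf.nn.max_pool(x, [1, 1, 2, 1], [1, 1, 2, 1], padding='SAME')
valid.get_shape()  # (1, 1, 2, 1) -- the last column was dropped
same.get_shape()   # (1, 1, 3, 1) -- padded on the right instead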
Valid padding: The valid padding involves no zero padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((the length of the input) - (k - 1)) for kernel size k if the stride s=1.
Same or half padding:
The same padding makes the size of the outputs the same as that of the inputs when s=1. If s=1, the number of zeros padded is (k - 1).
Full padding:
The full padding means that the kernel runs over the whole input, so at the ends the kernel may meet only one input value, with zeros everywhere else. The number of zeros padded is 2(k - 1) if s=1, and the length of the output is ((the length of the input) + (k - 1)) if s=1.
Therefore, the amount of padding satisfies: (valid) <= (same) <= (full)
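A worked example (the numbers are chosen just for illustration): for an input of length 5, kernel size k=3, and stride s=1:
valid: output length = 5 - (3 - 1) = 3, with no zeros added;
same: output length = 5, with 3 - 1 = 2 zeros added;
full: output length = 5 + (3 - 1) = 7, with 2(3 - 1) = 4 zeros added.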
Padding on/off. Determines the effective size of your input.
VALID: No padding. Convolution etc. ops are only performed at locations that are “valid”, i.e. not too close to the borders of your tensor. With a kernel of 3×3 and image of 10×10, you would be performing convolution on the 8×8 area inside the borders.
SAME: Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are provided when that neighborhood extends outside the original tensor to allow that operation to work also on border values. With a kernel of 3×3 and image of 10×10, you would be performing convolution on the full 10×10 area.
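A sketch of those shapes (the filter values are an assumption; only the shapes matter here):

import tensorflow as tf

image = tf.ones([1, 10, 10, 1])  # 10x10 single-channel image
kernel = tf.ones([3, 3, 1, 1])   # 3x3 filter

valid = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='VALID')
same = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding='SAME')
valid.get_shape()  # (1, 8, 8, 1)   -- only the inner area
same.get_shape()   # (1, 10, 10, 1) -- borders zero-padded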
VALID padding: this means zero padding, i.e. no padding at all. Hopefully there is no confusion.
x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.],
                 [7., 8., 9.],
                 [7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print(valid_pad.get_shape())  # output --> (1, 2, 1, 1)
SAME padding: This is kind of tricky to understand in the first place because we have to consider two conditions separately as mentioned in the official docs.
Let’s take the input as n_i, the output as n_o, the padding as p, the stride as s, and the kernel size as k (only a single dimension is considered).
Case 01: n_i mod s = 0 → p = max(k - s, 0)
Case 02: n_i mod s ≠ 0 → p = max(k - (n_i mod s), 0)
p is calculated such that it takes the minimum value needed for padding. Since the value of n_o is known (n_o = ceil(n_i / s)), the value of p can be found using the formula p = (n_o - 1) * s + k - n_i.
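These cases can be checked with a small helper (the function name is mine, not a TensorFlow API):

def same_pad_amount(n_i, k, s):
    # total padding along one dimension for 'SAME', per the formulas above
    if n_i % s == 0:
        return max(k - s, 0)
    return max(k - (n_i % s), 0)

# for the 4x3 input above with a 2x2 kernel and stride 2:
same_pad_amount(4, 2, 2)  # 0 -> the height needs no padding
same_pad_amount(3, 2, 2)  # 1 -> one extra column of zeros (added on the right)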
To sum up, ‘valid’ padding means no padding. The output size of the convolutional layer shrinks depending on the input size & kernel size.
On the contrary, ‘same’ padding means using padding. When the stride is set to 1, the output size of the convolutional layer stays the same as the input size, because a certain number of ‘0-borders’ are appended around the input data when calculating the convolution.
Tensorflow 2.0 Compatible Answer: Detailed explanations about “Valid” and “Same” padding have been provided above.
However, I will specify the different pooling functions and their respective commands in Tensorflow 2.x (>= 2.0), for the benefit of the community.
Functions in 1.x:
Max Pooling => tf.nn.max_pool, tf.keras.layers.MaxPool2D
Average Pooling => None in tf.nn, tf.keras.layers.AveragePooling2D
Functions in 2.x:
Max Pooling => tf.nn.max_pool if used in 2.x, and tf.compat.v1.nn.max_pool_v2 or tf.compat.v2.nn.max_pool if migrated from 1.x to 2.x;
tf.keras.layers.MaxPool2D if used in 2.x, and tf.compat.v1.keras.layers.MaxPool2D, tf.compat.v1.keras.layers.MaxPooling2D, tf.compat.v2.keras.layers.MaxPool2D, or tf.compat.v2.keras.layers.MaxPooling2D if migrated from 1.x to 2.x.
Average Pooling => tf.nn.avg_pool2d or tf.keras.layers.AveragePooling2D if used in TF 2.x, and tf.compat.v1.nn.avg_pool_v2, tf.compat.v2.nn.avg_pool, tf.compat.v1.keras.layers.AveragePooling2D, tf.compat.v1.keras.layers.AvgPool2D, tf.compat.v2.keras.layers.AveragePooling2D, or tf.compat.v2.keras.layers.AvgPool2D if migrated from 1.x to 2.x.
For more information about Migration from Tensorflow 1.x to 2.x, please refer to this Migration Guide.
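A minimal 2.x usage sketch of the two equivalent styles (the pooling parameters are chosen just for illustration; note that Keras layers take lower-case 'same'/'valid'):

import tensorflow as tf

x = tf.ones([1, 4, 4, 1])
out1 = tf.nn.max_pool(x, ksize=2, strides=2, padding='SAME')                  # functional op
out2 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding='same')(x)   # Keras layer
out1.shape, out2.shape  # (1, 2, 2, 1), (1, 2, 2, 1)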