where
Kernel is 10x10, all values equal to 0.01 | Kernel is 20x20, all values equal to 0.0025 | Kernel is 20x1, all values equal to 0.05 |
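Each of these blurring kernels holds the same value in every cell, and the values sum to 1 (for example, 10 × 10 × 0.01 = 1), so every output pixel is the plain average of the pixels under the kernel. A minimal sketch in dependency-free Python follows. `box_kernel` and `convolve` are hypothetical helper names, and computing the output only where the kernel fully fits is an assumption, not necessarily how the original examples treat image borders.

```python
def box_kernel(rows, cols):
    """Return a rows x cols kernel whose equal entries sum to 1."""
    value = 1.0 / (rows * cols)
    return [[value] * cols for _ in range(rows)]


def convolve(image, kernel):
    """Slide the kernel over the image, keeping only positions where it fully fits.

    The kernel is not flipped; for the symmetric kernels used here that
    makes no difference.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A dark 5x5 image with one bright pixel in the middle.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 90

# Every 3x3 window contains the bright pixel, so its value is averaged
# into all nine output pixels.
blurred = convolve(image, box_kernel(3, 3))
```

A larger kernel averages over a wider neighborhood, which is why the 20x20 kernel blurs more than the 10x10 one, and a 20x1 kernel blurs along one direction only.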
We can also achieve a sharpening effect.
From Wikipedia
And we can achieve an edge effect.
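The sharpening and edge effects come from kernels with negative entries. The sketch below assumes the commonly cited 3x3 kernels (a center weight of 5 or 8 surrounded by -1's); these may not be the exact kernels behind the images in these notes, and `convolve` is a hypothetical helper that only computes positions where the kernel fully fits.

```python
# Assumed kernels: the sharpening kernel sums to 1, the edge kernel sums to 0.
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

EDGE = [[-1, -1, -1],
        [-1, 8, -1],
        [-1, -1, -1]]


def convolve(image, kernel):
    """Apply the kernel at every position where it fully fits in the image."""
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A flat region, and a region with a vertical step from dark to bright.
flat = [[10] * 5 for _ in range(5)]
step = [[10, 10, 10, 50, 50, 50] for _ in range(5)]

sharp_flat = convolve(flat, SHARPEN)  # flat areas keep their value
edge_flat = convolve(flat, EDGE)      # flat areas go to zero
sharp_step = convolve(step, SHARPEN)  # the step is exaggerated
edge_step = convolve(step, EDGE)      # only the step survives
```

Because the edge kernel sums to 0, uniform regions cancel out and only changes in intensity remain. The sharpening kernel sums to 1, so uniform regions are unchanged while differences between neighbors are amplified.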
Erosion with a 4x4 kernel of 1’s | Lena in binary | Binary image eroded by 4x4 kernel of 1’s |
Dilation with a 4x4 kernel of 1’s | Lena in binary | Binary image dilated by 4x4 kernel of 1’s |
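Erosion and dilation with a kernel of 1's reduce to taking the minimum or maximum of the binary pixels under the kernel. The following is a minimal sketch in plain Python; `morph`, `erode`, and `dilate` are hypothetical names, and both the anchor position (`ksize // 2`) and the choice to ignore pixels outside the image are assumptions.

```python
def morph(image, ksize, combine):
    """Apply `combine` (min or max) to the ksize x ksize window around each pixel."""
    height, width = len(image), len(image[0])
    anchor = ksize // 2  # assumed anchor position inside the kernel
    output = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Collect the pixels under the kernel, skipping any outside the image.
            window = [image[rr][cc]
                      for rr in range(r - anchor, r - anchor + ksize)
                      for cc in range(c - anchor, c - anchor + ksize)
                      if 0 <= rr < height and 0 <= cc < width]
            output[r][c] = combine(window)
    return output


def erode(image, ksize=4):
    """A pixel stays 1 only if every pixel under the kernel is 1."""
    return morph(image, ksize, min)


def dilate(image, ksize=4):
    """A pixel becomes 1 if any pixel under the kernel is 1."""
    return morph(image, ksize, max)


# A 10x10 binary image containing a 4x4 block of 1's.
image = [[0] * 10 for _ in range(10)]
for r in range(3, 7):
    for c in range(3, 7):
        image[r][c] = 1

eroded = erode(image)    # the block shrinks to a single pixel
dilated = dilate(image)  # the block grows to 7x7
```

Erosion shrinks white regions and removes small specks; dilation grows white regions and fills small holes.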
An affine transformation is any transformation that can be specified as a linear transformation (matrix multiplication) plus a translation (vector addition).
Affine transformations include: rotations, translations, scaling, and combinations of these.
\begin{equation} A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} \quad B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} \end{equation}
\begin{equation} M = [A \; B] = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \end{bmatrix} \end{equation}
Then we take a point, [x, y], append a 1, and multiply by M:
\begin{equation} T = M \times [x, y, 1]^T = \begin{bmatrix} a_{00}x + a_{01}y + b_{00} \\ a_{10}x + a_{11}y + b_{10} \end{bmatrix} \end{equation}
The identity transformation:
\begin{equation} M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \end{equation}
Scale the image to half its size:
\begin{equation} M = \begin{bmatrix} 0.5 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix} \end{equation}
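Applying a 2x3 affine matrix to a point is just the matrix product with [x, y, 1]. A minimal sketch, where `apply_affine` is a hypothetical helper name and the translation values in the last matrix are made up for illustration:

```python
def apply_affine(M, x, y):
    """Multiply the 2x3 matrix M by the column vector [x, y, 1]."""
    new_x = M[0][0] * x + M[0][1] * y + M[0][2]
    new_y = M[1][0] * x + M[1][1] * y + M[1][2]
    return new_x, new_y


IDENTITY = [[1, 0, 0],
            [0, 1, 0]]

HALF_SCALE = [[0.5, 0, 0],
              [0, 0.5, 0]]

# Hypothetical example: half scale plus a shift of (40, 20).
SHIFTED_HALF_SCALE = [[0.5, 0, 40],
                      [0, 0.5, 20]]

print(apply_affine(IDENTITY, 100, 60))            # (100, 60)
print(apply_affine(HALF_SCALE, 100, 60))          # (50.0, 30.0)
print(apply_affine(SHIFTED_HALF_SCALE, 100, 60))  # (90.0, 50.0)
```

The third column of the matrix is the translation: it is added after the scaling, so the same scaled image simply lands at a different place.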
Here is a 90-degree rotation:
\begin{eqnarray}
M &=& \begin{bmatrix}
\cos(\pi/2) & \sin(\pi/2) & (1-\cos(\pi/2))x_c - \sin(\pi/2)y_c \\
-\sin(\pi/2) & \cos(\pi/2) & \sin(\pi/2)x_c + (1-\cos(\pi/2))y_c
\end{bmatrix} \\
&=& \begin{bmatrix}
0 & 1 & x_c - y_c \\
-1 & 0 & x_c + y_c
\end{bmatrix}
\end{eqnarray}
where (x_c, y_c) is the center of rotation.
Here is a 45-degree rotation:
\begin{equation}
M = \begin{bmatrix}
\cos(\pi/4) & \sin(\pi/4) & (1-\cos(\pi/4))x_c - \sin(\pi/4)y_c \\
-\sin(\pi/4) & \cos(\pi/4) & \sin(\pi/4)x_c + (1-\cos(\pi/4))y_c
\end{bmatrix}
\end{equation}
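Both rotation matrices follow the same pattern, so they can be built from the angle and the point (x_c, y_c). The sketch below uses hypothetical helper names (`rotation_matrix`, `apply_affine`) and an arbitrary example center of (50, 30). It also checks that the point (x_c, y_c) is left where it is, which is what the third column is for.

```python
import math


def rotation_matrix(theta, xc, yc):
    """Build the 2x3 matrix for a rotation by theta radians about (xc, yc)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [
        [cos_t, sin_t, (1 - cos_t) * xc - sin_t * yc],
        [-sin_t, cos_t, sin_t * xc + (1 - cos_t) * yc],
    ]


def apply_affine(M, x, y):
    """Multiply the 2x3 matrix M by the column vector [x, y, 1]."""
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


xc, yc = 50.0, 30.0  # arbitrary example center
M90 = rotation_matrix(math.pi / 2, xc, yc)
M45 = rotation_matrix(math.pi / 4, xc, yc)

# Up to floating-point rounding, M90 is [[0, 1, xc - yc], [-1, 0, xc + yc]].
print(M90)

# The center maps to itself under both rotations (up to rounding).
print(apply_affine(M90, xc, yc))
print(apply_affine(M45, xc, yc))
```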
Scaled, no translation | Scaled, with translation | 45-degree rotation plus translation |
\begin{equation} M = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & 1.0 \end{bmatrix} \end{equation}
\begin{equation}
[x', y', w]^T = M \times [x, y, 1]^T =
\begin{bmatrix}
a_{00}x + a_{01}y + a_{02} \\
a_{10}x + a_{11}y + a_{12} \\
a_{20}x + a_{21}y + 1.0
\end{bmatrix}
\end{equation}
Then we divide out w, so the transformed point is (x'/w, y'/w).
The example on the left uses this matrix:
\begin{equation}
M = \begin{bmatrix}
0.53009 & -0.47993 & 79.0 \\
-0.31048 & 0.50159 & 59.0 \\
-0.00094 & -0.00131 & 1.0
\end{bmatrix}
\end{equation}
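To make the divide step concrete, here is a minimal sketch that applies this 3x3 matrix to a pixel coordinate. `apply_perspective` is a hypothetical helper name, and the second sample point is arbitrary.

```python
M = [[0.53009, -0.47993, 79.0],
     [-0.31048, 0.50159, 59.0],
     [-0.00094, -0.00131, 1.0]]


def apply_perspective(M, x, y):
    """Multiply M by [x, y, 1] and divide the first two results by the third."""
    xp = M[0][0] * x + M[0][1] * y + M[0][2]
    yp = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    return xp / w, yp / w


# At the origin w is exactly 1.0, so the result is just the third column.
print(apply_perspective(M, 0, 0))  # (79.0, 59.0)

# Farther from the origin w drops below 1, so the division stretches the result.
print(apply_perspective(M, 100, 100))
```

The bottom row is what separates this from an affine transformation: if both of its first two entries were 0, w would always be 1 and the division would do nothing.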
Because a perspective transformation is a 3x3 matrix with eight unknown entries, it is fully determined by four point correspondences.
We could, for example, take a skewed view from a surveillance camera, and figure out the actual plane being viewed (assuming we are monitoring a flat area), by just figuring out the ground coordinates of four pixels in the video.
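Each correspondence (x, y) → (u, v) gives two linear equations in the eight unknown entries of M (the bottom-right entry is fixed at 1.0), so four correspondences give an 8x8 linear system. The sketch below solves that system with plain Gaussian elimination. All function names are hypothetical, the sample points are made up, and this is one straightforward way to do it, not necessarily what the course's code does.

```python
def apply_perspective(M, x, y):
    """Multiply M by [x, y, 1] and divide out w."""
    xp = M[0][0] * x + M[0][1] * y + M[0][2]
    yp = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    return xp / w, yp / w


def solve_linear(A, b):
    """Solve A h = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (e.g. collinear points)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def perspective_from_points(src, dst):
    """Recover the 3x3 matrix from exactly four (x, y) -> (u, v) pairs."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u * (a20*x + a21*y + 1) = a00*x + a01*y + a02, and likewise for v.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


# Made-up check: map the corners of a 100x100 square with a known matrix,
# then recover the matrix from the four correspondences alone.
TRUE_M = [[0.53009, -0.47993, 79.0],
          [-0.31048, 0.50159, 59.0],
          [-0.00094, -0.00131, 1.0]]
src = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
dst = [apply_perspective(TRUE_M, x, y) for x, y in src]
estimated = perspective_from_points(src, dst)
```

The four points must be in general position: if three of them lie on one line, the system has no unique solution and `solve_linear` raises an error.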
Of course, surveillance videos often have low resolution, so we probably cannot pick out the exact ground coordinates of any pixels. We can instead approximate the affine transformation; that is, we can find the transformation that minimizes the error. Four point correspondences are no longer enough; we need many more, maybe 50 or 100, and the more the better. (Perhaps we can find these automatically.) Then different transformations can be tested, and we can pick the one that minimizes the error between the chosen ground coordinates and the ground coordinates predicted by the transformation. This learning can be achieved with a single-layer neural network.
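A single linear layer with no activation function computes W × [x, y, 1]^T, which is exactly an affine transformation, so training it by gradient descent on squared error finds the best-fitting 2x3 matrix. The sketch below is illustrative only: the "true" matrix, the noise level, the learning rate, and the step count are all made-up assumptions, `fit_affine` is a hypothetical name, and the coordinates are scaled to the range 0 to 1 so that a fixed learning rate converges (pixel coordinates would need normalizing first).

```python
import random


def fit_affine(src, dst, learning_rate=0.3, steps=1000):
    """Fit a 2x3 weight matrix by gradient descent on mean squared error."""
    W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    n = len(src)
    for _ in range(steps):
        grad = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
        for (x, y), target in zip(src, dst):
            inputs = (x, y, 1.0)
            for i in range(2):
                predicted = sum(W[i][j] * inputs[j] for j in range(3))
                error = predicted - target[i]
                for j in range(3):
                    grad[i][j] += 2.0 * error * inputs[j] / n
        for i in range(2):
            for j in range(3):
                W[i][j] -= learning_rate * grad[i][j]
    return W


# Made-up data: 64 grid points, mapped by a known matrix, with small noise
# standing in for imprecise ground coordinates.
rng = random.Random(0)
TRUE_M = [[0.5, 0.2, 0.1],
          [-0.3, 0.6, 0.4]]
src = [(i / 7.0, j / 7.0) for i in range(8) for j in range(8)]
dst = [(TRUE_M[0][0] * x + TRUE_M[0][1] * y + TRUE_M[0][2] + rng.gauss(0, 0.01),
        TRUE_M[1][0] * x + TRUE_M[1][1] * y + TRUE_M[1][2] + rng.gauss(0, 0.01))
       for x, y in src]

W = fit_affine(src, dst)  # W ends up close to TRUE_M despite the noise
```

Because no single correspondence is trusted exactly, the noise averages out across the points, which is why more correspondences give a better estimate. A true perspective transformation includes the division by w, which a single linear layer cannot represent, so for a strongly skewed view this fit is only an approximation.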
Grab the source code for all of these examples: https://github.com/joshuaeckroth/teach-computer-vision