Convolutions From Scratch

In this section we implement 2D convolution using only NumPy and compare the results against scipy.signal.convolve2d. I wrote both a 4-loop and a 2-loop implementation of convolution with "same" zero-padding, and used "same" padding with scipy.signal.convolve2d as well.
We use each approach to convolve a portrait with a 9x9 box filter, and with finite difference filters in the x (Dx) and y (Dy) directions.
The outputs look practically identical, except that for the Dx and Dy outputs the responses appear inverted between the NumPy implementations and the scipy implementation (consistent with one computing correlation rather than true convolution, since convolution flips the kernel). Since I used the same type of padding, the borders of the images look the same as well. In terms of runtime, the 4-loop approach is the slowest, followed by the 2-loop approach, with scipy being the most optimized.
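The 2-loop version can be sketched as follows (a minimal sketch assuming grayscale float images; the function name `conv2d_same` is mine, not from the report's code):

```python
import numpy as np
from scipy.signal import convolve2d

def conv2d_same(image, kernel):
    """2D convolution with 'same' zero-padding: two loops over output pixels."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad so each output pixel has a full kernel-sized window.
    padded = np.pad(image, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)), mode="constant")
    flipped = kernel[::-1, ::-1]  # flip the kernel: convolution, not correlation
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

img = np.random.rand(32, 32)
box = np.ones((9, 9)) / 81.0
assert np.allclose(conv2d_same(img, box), convolve2d(img, box, mode="same"))
```

The 4-loop version replaces the `np.sum` with two more explicit loops over the kernel indices, which is where most of its extra runtime comes from.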
Filter | 4 loops | 2 loops | SciPy
---|---|---|---
Box Filter | ![]() | ![]() | ![]()
Dx | ![]() | ![]() | ![]()
Dy | ![]() | ![]() | ![]()
Gradients and Edges
In this section we compare three methods of producing gradient and edge images.
The first method uses a finite difference filter to find the partial derivatives of the image,
combines them into the gradient magnitude of the image, and finally binarizes the gradient with a chosen threshold.
The second method convolves the image with a Gaussian kernel to smooth it before
applying the same steps. The third method takes the partial derivatives of the Gaussian itself before
applying it to the original image, so that we apply only one convolution instead of chaining two.
Step | Finite Difference | Gaussian + Finite Difference | Derivative of Gaussian
---|---|---|---
Input | ![]() | ![]() | ![]()
Dx | ![]() | ![]() | ![]()
Dy | ![]() | ![]() | ![]()
Gradient Magnitude | ![]() | ![]() | ![]()
Binarized Edges | ![]() | ![]() | ![]()


Comparing the outputs, we see the most difference in the binarized results. I chose a threshold of 0.06 for all three methods so they can be compared fairly. This threshold gave a good balance, finding most edges in the Gaussian-smoothed outputs while still removing some noise from the finite difference output. Even so, there is noticeably more noise in the result without Gaussian blurring. Lastly, the results of the second and third methods appear essentially identical, as expected, since convolution is associative.
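Because convolution is associative, smoothing then differentiating should match a single pass with a derivative-of-Gaussian filter. A quick numerical check of this equivalence (kernel size and sigma here are hypothetical; border pixels are trimmed because the intermediate "same" crop perturbs them):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=9, sigma=1.5):
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)  # separable 2D Gaussian

img = np.random.rand(64, 64)
G = gaussian_kernel()
Dx = np.array([[1.0, -1.0]])  # finite difference in x

# Method 2: smooth first, then differentiate (two convolutions).
m2 = convolve2d(convolve2d(img, G, mode="same"), Dx, mode="same")

# Method 3: differentiate the Gaussian once ('full' mode keeps every tap),
# then apply the single derivative-of-Gaussian filter.
dog_x = convolve2d(G, Dx)  # 9x10 DoG kernel
m3 = convolve2d(img, dog_x, mode="same")

# Interior pixels agree to floating-point precision.
assert np.allclose(m2[8:-8, 8:-8], m3[8:-8, 8:-8])
```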
Image Sharpening
In this section we explore image sharpening (unsharp masking): we blur an image, then subtract the blurred version from the original, which isolates the image's high frequencies. Adding these high frequencies back to the original image gives the sharpened image.
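A minimal sketch of this procedure, assuming grayscale float images in [0, 1] (kernel size and sigma are my placeholders):

```python
import numpy as np
from scipy.signal import convolve2d

def sharpen(img, alpha=1.0, ksize=9, sigma=2.0):
    """Unsharp masking: high = img - blur(img); sharpened = img + alpha * high."""
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()
    G = np.outer(g, g)  # separable 2D Gaussian
    blurred = convolve2d(img, G, mode="same", boundary="symm")
    high = img - blurred          # the image's high frequencies
    return np.clip(img + alpha * high, 0.0, 1.0)
```

The same operation collapses into a single convolution with the kernel (1 + alpha) * identity - alpha * G, the classic unsharp mask filter.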



For this result, we see how the image changes depending on how much we "sharpen" it (controlled by the parameter alpha). The more we sharpen the image, the darker the edges get, looking almost unnatural if we sharpen too much.



For this second result, we blur the image and then attempt to sharpen it again. The approach is similar, but the high frequencies are taken from, and added back to, the blurred image rather than the original. The details/high frequencies recovered this way are clearly much fainter, so I had to use a higher alpha value when adding them back (alpha=2 for Taj, but alpha=10 for Yosemite).




Hybrid Images
In this section we create hybrid images by taking the high frequencies of one image and averaging them with the low frequencies of another. The result is that the high-frequency image is emphasized up close, whereas the low-frequency image is more prominent from far away. Admittedly, I think the high frequencies in my hybrid images look a bit weak; I would try amplifying or normalizing them to help with this. I also think keeping the images in color makes the results more muddled, so I would be interested in reproducing them in grayscale.









For this next pair of images, I illustrate the full process: alignment, Fourier transforms, filtered results, and the final image. For the cutoff frequencies, I chose sigma_high=4 and sigma_low=6.
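A minimal sketch of the hybrid construction for a pre-aligned grayscale pair, using SciPy's gaussian_filter as the low-pass (the averaging follows the description above; amplifying `high` would strengthen the weak details noted earlier):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_hybrid(im_high, im_low, sigma_high=4.0, sigma_low=6.0):
    """High frequencies of im_high averaged with low frequencies of im_low."""
    low = gaussian_filter(im_low, sigma_low)               # keep coarse structure
    high = im_high - gaussian_filter(im_high, sigma_high)  # keep fine detail
    return np.clip((low + high) / 2.0, 0.0, 1.0)
```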












Gaussian and Laplacian Stacks
In this section we create Gaussian and Laplacian stacks of images. To create a Gaussian stack, we essentially repeatedly convolve an image with a Gaussian kernel. We then use the Gaussian stack to construct the Laplacian stack by taking the difference between each pair of consecutive layers in the Gaussian stack. The last layer of the Laplacian stack is simply the last layer of the Gaussian stack, so both stacks end up with the same number of layers.
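A minimal sketch, assuming grayscale float images and SciPy's gaussian_filter (the level count and sigma are placeholders). Including the last Gaussian layer means the Laplacian stack telescopes back to the original image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_stacks(img, levels=5, sigma=2.0):
    """Gaussian stack via repeated blurring (no downsampling); Laplacian stack
    from differences of consecutive Gaussian layers."""
    gstack = [img.astype(float)]
    for _ in range(levels - 1):
        gstack.append(gaussian_filter(gstack[-1], sigma))
    lstack = [gstack[i] - gstack[i + 1] for i in range(levels - 1)]
    lstack.append(gstack[-1])  # last Laplacian layer = last Gaussian layer
    return gstack, lstack

# Sanity check: summing the Laplacian stack reconstructs the image exactly.
img = np.random.rand(64, 64)
_, lstack = build_stacks(img)
assert np.allclose(sum(lstack), img)
```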


Image Blending
We use Gaussian and Laplacian stacks to blend two images together.
We create a Laplacian stack for each image, as well as a Gaussian stack for our mask.
Combining these stacks level by level with the smoothed mask gives a smooth transition between our images.
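The level-by-level combination can be sketched as follows (grayscale float images assumed; the level count and sigma are my placeholders):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=5, sigma=2.0):
    """Multiresolution blending: mix the two Laplacian stacks using the
    Gaussian stack of the mask, then sum the blended layers."""
    def gaussian_stack(x):
        stack = [x.astype(float)]
        for _ in range(levels - 1):
            stack.append(gaussian_filter(stack[-1], sigma))
        return stack

    def laplacian_stack(x):
        g = gaussian_stack(x)
        return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

    l1, l2, gm = laplacian_stack(im1), laplacian_stack(im2), gaussian_stack(mask)
    # At each level, the (progressively blurrier) mask weights the two images.
    out = sum(m * a + (1 - m) * b for m, a, b in zip(gm, l1, l2))
    return np.clip(out, 0.0, 1.0)
```

With a mask of all ones this returns the first image unchanged, since each Laplacian stack telescopes back to its source image.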
In these first examples we use a simple vertical mask.
Orapple:




Green Blossom:



In these next examples we use irregular masks, taking advantage of the solid background of one image to binarize it into a mask.
Kirby Moon:





Lebron dunking emoji:



