Computer Vision Algorithms

Tuesday, 12 May 2015

K-Means Clustering

Step (1): K-means clustering method starts with random points for the cluster centers.
Step (2): Cluster membership is assigned to each data point based on the distance to the nearest cluster center.
Step (3): Calculate the new cluster center by taking the mean of the data points that belongs to a cluster. Repeat this for each cluster.

Step (4): Repeat Steps (2) and (3) until the cluster centers are stable or below a given threshold.

Distance metrics to use: Euclidean distance and Mahalanobis distance.
Membership assignment: binary decision (0 or 1), soft decision (probability).

Color Constancy

Color constancy is defined as the ability to discount the effect of the illuminant in an image. Human visual system has the remarkable ability to perceive the color of an object largely invariant to illumination conditions instantaneously and effortlessly. However, attempts to achieve the same level of color constancy in machine vision system shown to be difficult to achieve. This is due to the lack of our understanding of the neural processing involved in the visual perception. The human visual system achieves illuminant invariant color perception from massively parallel and complex feedback connections.

There are a number of color constancy algorithms have been proposed: white patch, gray world, gamut mapping, color by correlation, etc.

White patch algorithm:
This algorithm assumes that there is a white or bright patch in a scene that reflects the entire light falls on its surface. This means the color of the reflected light from this bright surface is the color of the light source. This is used as an estimate of the illuminant to correct for the illuminant.

There are two versions to this algorithm:
(1) Obtain the illuminant estimate from the RGB values of the brightest pixel.
(2) Obtain the illuminant estimate by computing the maximum RGB values in the entire image (this is also known ans maxworld algorithm).

Disadvantage: Noise in the single bright pixel could lead to inaccurate estimate of the illuminant, Brightest pixel could saturate the image sensor (clipped pixels).

Gray world algorithm:
This algorithm assumes that the statistical mean color of a scene isachromatic. Based on this assumption, this algorithm obtains an estimate of the illuminant color by calculating the mean color of an image.

Disadvantage: Mean color of a scene is not always achromatic for example a scene dominated by blue sky of green grass/trees could lead to incorrect estimate for the illuminant.

The following code is an implementation of the greyworld algorithm in OpenCV:

#include <iostream> //preprocessor directive

#include<vector>

#include "opencv2/opencv.hpp" //include OpenCV libraries

//Main function (starting point of execution)
int main(int argc, char argv[])

{
// Read the image from file (replace the image file name in your code)

cv::Mat imgIn = cv::imread("snow.jpg");

//Check if the image is empty

if ( imgIn.empty() )

{
//Print error message

std::cout << "Error occured when loading the image" << std::endl;

//Return negative 1 to indicate an error has occured in the program

return -1;

}
//Sum the colour values in each channel

cv::Scalar sumImg=sum(imgIn);
//normalise by the number of pixes in the image to obtain an extimate for the illuminant

cv::Scalar illum=sumImg/(imgIn.rows*imgIn.cols);
// Split the image into different channels

std::vector<cv::Mat> rgbChannels(3);

cv::split(imgIn, rgbChannels);
//Assign the three colour channels to CV::Mat variables for processing

cv::Mat redImg=rgbChannels[2];

cv::Mat graanImg=rgbChannels[1];

cv::Mat blueImg=rgbChannels[0];

//calculate scale factor for normalisation you can use 255 instead

double scale=(illum(0)+illum(1)+illum(2))/3;

//correct for illuminant (white balancing)

redImg=redImg*scale/illum(2);

graanImg=graanImg*scale/illum(1);

blueImg=blueImg*scale/illum(0);
//Assign the processed channels back into vector to use in the cv::merge() function

rgbChannels[0]=blueImg;

rgbChannels[1]=graanImg;

rgbChannels[2]=redImg;
cv::Mat imgOut; //to hold the output image

/// Merge the processed colour channels

merge(rgbChannels, imgOut);
//Create a window

cv::namedWindow("My Window", 1);

//Display the image

imshow("My Window", imgOut);

// Wait until a key is pressed

cv::waitKey(0);
//Return 0 to indicate normal run of the program

return 0;

}

Expectation–maximization (EM) algorithm

Expectation maximization (EM) algorithms is an iterative algorithm applied to problems that does not have a closed form solution. In other words if the problem is under constrained then the EM algorithm finds a maximum likelihood (ML) or maximum a posteriori (MAP) estimates of model parameters and the function. The EM algorithm has two steps (1) Expectation (E) step and (2) Maximization (M) step. The algorithm iterates between these two steps (E step and M step) until it finds the optimum solution. In the E step the algorithm estimates the function for the expectation of the likelihood based on the current estimates of the model parameters. In the M step the algorithm computes the model parameters that maximizes the expected likelihood found in the previous E step.

In an unsupervised clustering problem, if one applies hard decision instead of a soft decision for membership function in an EM setting then this becomes equivalent to K-means algorithm. In other words, if one applied probability to to assign cluster membership to each data point then in this case the K-means clustering becomes equivalent to the EM algorithm.

K-means with soft decision => EM algorithm

EM algorithm in clustering with hard decision => K-means clustering

Sunday, 22 February 2015

Bilateral filter

Bilateral filter is a non-linear filter used for many applications in computational photography including image smoothing, denoising, tone mapping, relighting and image blur. Bilateral filter smooths an image while keeping the edges preserved. To calculate the filtered value, the filter uses (1) euclidean distance and (2) perceptual difference between the current pixel and its local neighborhood.The euclidean distance is calculated from the two-dimensional image coordinates of the center pixel and the pixels in the local neighborhood. The perceptual difference is the difference between the tristimulus values (RGB) of the center pixel and the pixels in the neighborhood. To be more perceptually meaningful colour difference convert the image into a perceptually uniform colour space (CIELab or CIELuv) and calculate the colour difference. In calculating colour difference, you can use the chromaticity (or any feature that is invariant to illuminant such as hue, chromaticity, normalised rgb values) of the image to smooth edges caused by the illumination while preserving the edges caused by the reflectance.

An intuitive explanation is that bilateral filter compares the colour difference in a local neighborhood and averages only the neighbouring pixels that have similar colour value as that of the center pixel. This process avoids smoothing any edges in the image.

Simple Implementation: Use a threshold value to decide whether to include a neighborhood pixel in the process of averaging. In particular calculate the colour difference of all the pixels in the local neighborhood of a pixel and average only the neighborhood pixels with colour difference smaller than the predefined threshold. This should result in a reasonably smoothed image with edges preserved. To improve the quality of the smoothed image vary the threshold, vary the size of the neighborhood and repeat the smoothing step a few times and find the best parameters (colour difference threshold, neighborhood and number of repeats) that gives the best result.

Better Implementation: Instead of using simple threshold use a kernel function such as a Gaussian function to weight the euclidean distance and perceptual difference. This should result in a better result. To further improve the results try different kernel functions and find the one gives the best result.

For more information :http://people.csail.mit.edu/sparis/bf_course/