Sunday, June 3, 2012

Sudoku Solver - Part 2

Hi,

This is a continuation of the article: Sudoku Solver - Part 1

So let's start implementing.

Load the image :

Below is the image I used to work with.

Original Image

So, first we import the necessary libraries.

import cv2
import numpy as np

Then we load the image and convert it to grayscale.

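img = cv2.imread('sudoku.jpg')
if img is None:  # cv2.imread returns None (no exception) when the file cannot be read
    raise IOError("could not read 'sudoku.jpg'")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)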

Image Pre-processing :

I have done just noise removal (a Gaussian blur) and adaptive thresholding. That turned out to be enough, so I haven't done anything extra.

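gray = cv2.GaussianBlur(gray, (5, 5), 0)
# Gaussian-weighted 11x11 neighbourhood, offset 2; inverted so that grid lines and digits come out white
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)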

Below is the result :

Result of adaptive thresholding

Now two questions may arise :

1) Why is smoothing needed here?
2) Why adaptive thresholding? Why not normal thresholding using cv2.threshold()?

Find the answers here : Some Common Questions

Find Sudoku Square and Corners :

Now we find the Sudoku border. For that, we make a practical assumption : the biggest square in the image is the Sudoku square. In short, the image should be taken close to the Sudoku, as you can see in the input image of the demo.

This implies that the image should either contain only one square, the Sudoku square, or, if it contains others, the Sudoku square must be the biggest. If this condition does not hold, the method fails.

This is because we find the Sudoku square by finding the biggest blob (an independent connected region) in the image. If the biggest blob is something other than the Sudoku, that blob gets processed instead, so keep an eye on this.

We start by finding contours in the thresholded image:

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Now we find the biggest blob, i.e. the blob with the maximum area.

For this, we first compute the area of each blob and filter by it: a blob is considered for further processing only if its area is greater than a particular value (here, 100). If so, we approximate its contour with cv2.approxPolyDP(), which removes unwanted coordinate values from the contour and keeps only the corners. If the number of corners is four, it is a square (strictly speaking, any quadrilateral). If it has the maximum area among all detected squares, it is our Sudoku square.

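biggest = None
max_area = 0
for cnt in contours:
    area = cv2.contourArea(cnt)
    if area > 100:  # skip tiny blobs
        peri = cv2.arcLength(cnt, True)  # perimeter of the closed contour
        # allow a deviation of up to 2% of the perimeter when simplifying
        approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
        if area > max_area and len(approx) == 4:
            biggest = approx
            max_area = area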

To show the difference between the original contour and the approximated contour, I have drawn both on the image (using the cv2.drawContours() function). The red line is the original contour, the green line is the approximated contour, and the corners are marked with blue circles.

Border and corners detected

Look at the top edge of the Sudoku. The original contour (red line) follows the edge of the square and is curved. The approximated contour (green line) turns it into a straight line.

Now, a simple question may arise. What is the benefit of filtering contours by area? Why remove the small ones? In simple words, it is done to speed up the program: for every contour that fails the area test, we skip the cv2.arcLength() and cv2.approxPolyDP() calls. The gain is small (in the range of a few milliseconds), but even that helps those who want to implement this in real time. For more explanation, visit : Some Common Questions

Summary :

So, in this section, we have found the boundary of the Sudoku. The next part is the image transformation. I will explain it in the next post.

Until then, I would like to hear your feedback, questions, etc.

With Regards
ARK