Friday, June 22, 2012

Sudoku Solver - Part 3

Hi friends,

Recently I got busy with some other projects, so I couldn't post the remaining part.

In the last article, we found the four corners of the Sudoku border. In this session, we will correct the perspective of the Sudoku and make it straight, with a uniform size, for further work.

So what do we have now? We have the four corners of the Sudoku square. These are the corners, i.e. the contents of approx when printed:

[[[491 68]]
[[ 73 84]]
[[ 34 516]]
[[520 522]]]

This is a 4x1x2 array. But check out the values. The first row is the TOP-RIGHT corner. The second row is the TOP-LEFT corner. The third row is the BOTTOM-LEFT corner. Finally, the fourth one is the BOTTOM-RIGHT corner.

The problem is that there is no guarantee that, for the next image, the corners will be found in this same order. And why do we need a fixed order?

In the next step, we have to map these points onto another square of size 450x450, such that the point at [491,68] will be at [449,0] in the new image, and similarly for the other points. If the order changes, we get rotated or even mirrored images.

Therefore, we have to keep them in a uniform order. I used this order: [ TOP-LEFT, TOP-RIGHT, BOTTOM-RIGHT, BOTTOM-LEFT ]. You can use whatever you like, as long as the destination points (given later) are listed in the same order.

I did it as follows.

First take the sum of the x and y coordinates of each corner. TOP-LEFT has the least sum, and BOTTOM-RIGHT has the maximum sum. Now find the difference, i.e. y-x. TOP-RIGHT has the minimum and BOTTOM-LEFT has the maximum. (This assumes the Sudoku is not rotated too far in the image.) It is written as a function. For this, we need to reshape 'approx' to (4,2).

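import numpy as np

def rectify(h):
    # Reorder the 4 corners as [TOP-LEFT, TOP-RIGHT, BOTTOM-RIGHT, BOTTOM-LEFT]
    h = h.reshape((4,2))
    hnew = np.zeros((4,2),dtype = np.float32)

    # x+y is smallest at TOP-LEFT and largest at BOTTOM-RIGHT
    add = h.sum(1)
    hnew[0] = h[np.argmin(add)]
    hnew[2] = h[np.argmax(add)]

    # y-x is smallest at TOP-RIGHT and largest at BOTTOM-LEFT
    diff = np.diff(h,axis = 1)
    hnew[1] = h[np.argmin(diff)]
    hnew[3] = h[np.argmax(diff)]

    return hnew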

This function should be included at the beginning of the code. Now let's come back to the end of the code.

approx=rectify(approx)
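
To see the ordering at work, here is a small stand-alone sketch that repeats the same sum and difference steps on the corners printed earlier. The variable names (pts, sums, diffs, ordered) are only for this illustration and are not part of the solver code.

```python
import numpy as np

# Corners exactly as printed earlier in this post (shape 4x1x2).
approx = np.array([[[491, 68]], [[73, 84]], [[34, 516]], [[520, 522]]])

pts = approx.reshape((4, 2))
sums = pts.sum(axis=1)                 # x + y  -> 559, 157, 550, 1042
diffs = np.diff(pts, axis=1).ravel()   # y - x  -> -423, 11, 482, 2

ordered = np.zeros((4, 2), dtype=np.float32)
ordered[0] = pts[np.argmin(sums)]    # TOP-LEFT:     smallest x + y
ordered[1] = pts[np.argmin(diffs)]   # TOP-RIGHT:    smallest y - x
ordered[2] = pts[np.argmax(sums)]    # BOTTOM-RIGHT: largest x + y
ordered[3] = pts[np.argmax(diffs)]   # BOTTOM-LEFT:  largest y - x

print(ordered)
```

The rows come out as [73, 84], [491, 68], [520, 522], [34, 516], which is the TOP-LEFT, TOP-RIGHT, BOTTOM-RIGHT, BOTTOM-LEFT order we wanted.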

Now we have the 4 points in order. Next we need the corresponding points to which they should be mapped. I took a 450x450 image, and took the points as below:

h = np.array([ [0,0],[449,0],[449,449],[0,449] ],np.float32)

Hope it is clear to you. We want the point [ 73 84] to be at [0 0] in the new image, the point [520 522] to be at [449,449], and so on.

Why an image of 450x450? Now there is no particular reason, I just picked it.

And earlier? Yeah, it had some significance back then, when I was first trying to develop this. But later I understood it was not needed, and that there are better ways to do it, so I gave up the idea but retained the size. (Well, that is a little story.)

Okay, but why should all images be resized to this? You will understand it in the OCR part. Digits are found using their height and width, so we need all digits to have the same size, whatever input image we give.

So now we have the input array (approx) and output array (h). Next we apply the perspective correction.

retval = cv2.getPerspectiveTransform(approx,h)
warp = cv2.warpPerspective(gray,retval,(450,450))
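
If you are curious what the 3x3 matrix in retval represents, the sketch below solves for such a matrix using only NumPy, from the same four point pairs, and checks that each corner lands where we asked. The helpers solve_homography and apply_homography are names I made up for this illustration. They are not OpenCV functions, and I am not claiming this is how OpenCV computes it internally. It is just the standard linear system for a perspective transform with the last matrix entry fixed to 1.

```python
import numpy as np

def solve_homography(src, dst):
    """Return a 3x3 matrix H (with H[2,2] = 1) that maps 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    coeffs = np.linalg.solve(np.array(A, dtype=np.float64),
                             np.array(b, dtype=np.float64))
    return np.append(coeffs, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Map an (N, 2) array of points through H, including the perspective divide."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Ordered corners from this post and the 450x450 destination square.
src = np.array([[73, 84], [491, 68], [520, 522], [34, 516]], dtype=np.float64)
dst = np.array([[0, 0], [449, 0], [449, 449], [0, 449]], dtype=np.float64)

H = solve_homography(src, dst)
print(np.round(apply_homography(H, src), 3))  # should reproduce dst
```

Up to floating-point error, I would expect this matrix to agree with the one returned by cv2.getPerspectiveTransform for the same points. cv2.warpPerspective then uses the matrix to fill every pixel of the 450x450 output.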

The result is shown below:


There are some defects at the top. To correct those as well, we need some extra effort, which I will explain in another article.

So we are ready to do the OCR work. In the next article, I will explain the OCR training.

With Regards,
ARK