
While trying to read credit card information with an OCR engine, I realized that the object in the image always has perspective distortion. We therefore have to correct the perspective of the object before trying to read the text. This turns out to be a difficult task, although it looks simple at first.
For the impatient: download the source code and a test image.
In this post I will show you how to perform automatic perspective correction for quadrilateral objects such as a credit card, a playing card, or a sheet of paper. This is the first step before extracting the text with an OCR engine.
I use the following algorithm to recover the distorted rectangular object:
- Get the edge map.
- Detect lines with Hough transform.
- Get the corners by finding intersections between lines.
- Check if the approximate polygonal curve has 4 vertices.
- Determine top-left, bottom-left, top-right, and bottom-right corner.
- Apply the perspective transformation.
Consider the playing card shown below. We will segment the playing card and perform automatic perspective correction to obtain the normal view. Note that I simplified the problem by making the playing card the only quadrilateral object in the scene.

Figure 1. (a) A playing card with perspective distortion (b) The recovered playing card.
The technique should also be applicable to quadrilateral objects other than playing cards.
Get the edge map
The first step is getting the edge map from the source image. The
edge map is required for finding line segments with the Hough transform
in the next step.
Get the edge map using the Canny operator:
// bw is the grayscale source image; Canny overwrites it with the edge map
cv::Canny(bw, bw, 100, 100, 3);
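The two threshold arguments (both 100 above) drive Canny's hysteresis step: pixels whose gradient reaches the higher threshold start an edge, and pixels that only reach the lower threshold are kept when they connect to such an edge. With both thresholds set to the same value, this reduces to a single cut-off. The sketch below illustrates the idea on a 1D gradient profile. Note that `hysteresis1D` is a hypothetical helper written for this illustration; it is not how OpenCV implements the step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: hysteresis thresholding on a 1D gradient profile.
// Samples >= high are strong edges; samples >= low are kept only if they are
// connected to a strong edge through other samples >= low.
std::vector<bool> hysteresis1D(const std::vector<float>& grad, float low, float high)
{
    assert(low <= high);
    const std::size_t n = grad.size();
    std::vector<bool> edge(n, false);
    for (std::size_t i = 0; i < n; i++)
    {
        if (grad[i] < high || edge[i])
            continue;                     // not strong, or already part of an edge
        edge[i] = true;
        // Grow to the left while the neighbours stay above the low threshold
        for (std::size_t j = i; j > 0 && grad[j - 1] >= low; j--)
            edge[j - 1] = true;
        // Grow to the right in the same way
        for (std::size_t j = i + 1; j < n && grad[j] >= low; j++)
            edge[j] = true;
    }
    return edge;
}
```

Lowering the first threshold below the second would let weaker parts of the card outline survive as long as they touch a strong edge, which may help when some edges of the card have low contrast.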
Detect lines with Hough transform
We will use the probabilistic Hough transform rather than the standard Hough transform to find line segments. In my experiments the probabilistic Hough transform yields fewer line segments, but with higher accuracy, than the standard Hough transform.
std::vector<cv::Vec4i> lines;
cv::HoughLinesP(bw, lines, 1, CV_PI/180, 70, 30, 10);
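For reference, the arguments passed to cv::HoughLinesP above are the distance resolution (1 pixel), the angle resolution (1 degree), the accumulator threshold (70 votes), the minimum line length (30 pixels), and the maximum gap allowed between points on the same line (10 pixels). The voting idea behind the Hough transform can be sketched in plain C++ as below. This is a simplified version of the standard transform, not OpenCV's code, and the names `Pt`, `Peak`, and `strongestLine` are made up for the illustration. The probabilistic variant additionally samples only a subset of the edge pixels and returns segment end points.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x, y; };                       // hypothetical edge pixel
struct Peak { int rho; int thetaDeg; int votes; };

// Every edge pixel votes for all lines rho = x*cos(theta) + y*sin(theta)
// passing through it, using 1-pixel and 1-degree bins. The bin with the
// most votes is the most prominent line.
Peak strongestLine(const std::vector<Pt>& edgePixels, int maxRho)
{
    assert(maxRho > 0);
    const double pi = 3.14159265358979323846;
    const int numTheta = 180;
    const int numRho = 2 * maxRho + 1;         // rho ranges over [-maxRho, maxRho]
    std::vector<int> acc(static_cast<std::size_t>(numTheta) * numRho, 0);
    Peak best = {0, 0, 0};
    for (const Pt& p : edgePixels)
    {
        for (int t = 0; t < numTheta; t++)
        {
            const double theta = t * pi / 180.0;
            const int rho = static_cast<int>(
                std::lround(p.x * std::cos(theta) + p.y * std::sin(theta)));
            if (rho < -maxRho || rho > maxRho)
                continue;
            const int votes = ++acc[static_cast<std::size_t>(t) * numRho + (rho + maxRho)];
            if (votes > best.votes)
                best = {rho, t, votes};
        }
    }
    return best;
}
```

In this picture, the threshold of 70 means that a line is only reported once its bin has collected more than 70 votes.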
If we visualize the line segments on the source image, we will get this image:
Notice that the line segments cover only part of the corresponding edges. To get the quadrilateral object, we need to obtain the corner points, i.e. the intersections of the lines through these segments.
Expand the line segments to fit the image.
Update (Jan 19, 2013): as Feng Chao pointed out, expanding the lines is not needed for finding the intersections of the line segments. I’ll leave it for visualization only.
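// Needed for visualization only
for (size_t i = 0; i < lines.size(); i++)
{
    cv::Vec4i v = lines[i];
    if (v[0] == v[2])
    {
        // Vertical segment: the slope is undefined, so extend it from top to bottom
        lines[i][1] = 0;
        lines[i][3] = src.rows;
        continue;
    }
    float slope = ((float)v[1] - v[3]) / (v[0] - v[2]);
    lines[i][0] = 0;
    lines[i][1] = slope * -v[0] + v[1];
    lines[i][2] = src.cols;
    lines[i][3] = slope * (src.cols - v[2]) + v[3];
}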
Now we can compute the intersections from the expanded line segments.
Get the corners by finding intersections between lines
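From Wikipedia, the intersection point P of the line L1 through (x1, y1) and (x2, y2) and the line L2 through (x3, y3) and (x4, y4) is given by:
$$P_x = \frac{(x_1 y_2 - y_1 x_2)(x_3 - x_4) - (x_1 - x_2)(x_3 y_4 - y_3 x_4)}{(x_1 - x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 - x_4)}$$
$$P_y = \frac{(x_1 y_2 - y_1 x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 y_4 - y_3 x_4)}{(x_1 - x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 - x_4)}$$
The two lines are parallel when the denominator is zero. Now we can loop over the lines vector and pass each pair of line segments to the equation above.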
cv::Point2f computeIntersect(cv::Vec4i a, cv::Vec4i b)
{
int x1 = a[0], y1 = a[1], x2 = a[2], y2 = a[3];
int x3 = b[0], y3 = b[1], x4 = b[2], y4 = b[3];
if (float d = ((float)(x1-x2) * (y3-y4)) - ((y1-y2) * (x3-x4)))
{
cv::Point2f pt;
pt.x = ((x1*y2 - y1*x2) * (x3-x4) - (x1-x2) * (x3*y4 - y3*x4)) / d;
pt.y = ((x1*y2 - y1*x2) * (y3-y4) - (y1-y2) * (x3*y4 - y3*x4)) / d;
return pt;
}
else
return cv::Point2f(-1, -1);
}
...
std::vector<cv::Point2f> corners;
for (int i = 0; i < lines.size(); i++)
{
for (int j = i+1; j < lines.size(); j++)
{
cv::Point2f pt = computeIntersect(lines[i], lines[j]);
if (pt.x >= 0 && pt.y >= 0)
corners.push_back(pt);
}
}
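Note that the loop above keeps every intersection with non-negative coordinates, so two nearly parallel lines can contribute a point far beyond the right or bottom border of the image. A possible refinement, sketched here in plain C++ with stand-in types (`Point` for cv::Point2f, `Segment` for cv::Vec4i, and the hypothetical function `cornersInside`), is to also check the upper bounds. It assumes the image size is known, for example from src.cols and src.rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x, y; };              // stand-in for cv::Point2f
struct Segment { int x1, y1, x2, y2; };    // stand-in for cv::Vec4i

// Intersect the infinite lines through two segments.
// Returns false when the lines are parallel.
bool intersect(const Segment& a, const Segment& b, Point& out)
{
    // Work in double so the products cannot overflow int
    const double x1 = a.x1, y1 = a.y1, x2 = a.x2, y2 = a.y2;
    const double x3 = b.x1, y3 = b.y1, x4 = b.x2, y4 = b.y2;
    const double d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4);
    if (d == 0.0)
        return false;
    const double p = x1 * y2 - y1 * x2;
    const double q = x3 * y4 - y3 * x4;
    out.x = static_cast<float>((p * (x3 - x4) - (x1 - x2) * q) / d);
    out.y = static_cast<float>((p * (y3 - y4) - (y1 - y2) * q) / d);
    return true;
}

// Collect only the intersections that fall inside a width x height image.
std::vector<Point> cornersInside(const std::vector<Segment>& lines, int width, int height)
{
    assert(width > 0 && height > 0);
    std::vector<Point> corners;
    for (std::size_t i = 0; i < lines.size(); i++)
    {
        for (std::size_t j = i + 1; j < lines.size(); j++)
        {
            Point pt;
            if (intersect(lines[i], lines[j], pt) &&
                pt.x >= 0 && pt.x <= width && pt.y >= 0 && pt.y <= height)
                corners.push_back(pt);
        }
    }
    return corners;
}
```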
Visualizing the corner points,
Check if the approximate polygonal curve has 4 vertices
There is a chance that the object being observed is not a quadrilateral. We check this by approximating a polygonal curve from the corner points. For a quadrilateral, the approximated curve will have 4 vertices.
std::vector<cv::Point2f> approx;
cv::approxPolyDP(cv::Mat(corners), approx,
cv::arcLength(cv::Mat(corners), true) * 0.02, true);
if (approx.size() != 4)
{
std::cout << "The object is not quadrilateral!" << std::endl;
return -1;
}
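The third argument of cv::approxPolyDP is the approximation accuracy, set here to 2% of the perimeter: points that deviate from the simplified curve by less than this distance are dropped. OpenCV documents the function as using the Douglas-Peucker algorithm. The following is a minimal sketch of that algorithm for an open polyline, with hypothetical names (`P2`, `distToLine`, `rdp`, `simplify`); it does not handle closed curves the way the call above does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct P2 { double x, y; };   // hypothetical point type

// Perpendicular distance from p to the line through a and b
double distToLine(P2 p, P2 a, P2 b)
{
    const double dx = b.x - a.x, dy = b.y - a.y;
    const double len = std::hypot(dx, dy);
    if (len == 0.0)
        return std::hypot(p.x - a.x, p.y - a.y);
    return std::fabs(dy * (p.x - a.x) - dx * (p.y - a.y)) / len;
}

// Recursively keep the point farthest from the chord first-last
// whenever it deviates by more than eps.
void rdp(const std::vector<P2>& pts, std::size_t first, std::size_t last,
         double eps, std::vector<P2>& out)
{
    double maxDist = 0.0;
    std::size_t idx = first;
    for (std::size_t i = first + 1; i < last; i++)
    {
        const double d = distToLine(pts[i], pts[first], pts[last]);
        if (d > maxDist) { maxDist = d; idx = i; }
    }
    if (maxDist > eps)
    {
        rdp(pts, first, idx, eps, out);
        rdp(pts, idx, last, eps, out);
    }
    else
    {
        out.push_back(pts[first]);   // the end point is added by the next span or the caller
    }
}

std::vector<P2> simplify(const std::vector<P2>& pts, double eps)
{
    assert(eps >= 0.0);
    if (pts.size() < 2)
        return pts;
    std::vector<P2> out;
    rdp(pts, 0, pts.size() - 1, eps, out);
    out.push_back(pts.back());
    return out;
}
```

With a small tolerance, a noisy L-shaped path collapses to its three real vertices, which is the same effect that reduces the many intersection points of a card outline to four.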
Determine top-left, bottom-left, top-right, and bottom-right corner
Given the four corner points, we have to determine the top-left,
bottom-left, top-right, and bottom-right corner so we can apply the
perspective correction. The illustration is shown below.

Figure 2. Match the corners with the destination image.
To determine the top-left, bottom-left, top-right, and bottom-right corners, we’ll use the simplest method:
- Get the mass center.
- Points with a smaller y-coordinate than the mass center are the top points; the others are the bottom points.
- Of the two top points, the one with the smaller x-coordinate is the top-left. The other is the top-right.
- Of the two bottom points, the one with the smaller x-coordinate is the bottom-left. The other is the bottom-right.
void sortCorners(std::vector<cv::Point2f>& corners, cv::Point2f center)
{
    std::vector<cv::Point2f> top, bot;
    for (size_t i = 0; i < corners.size(); i++)
    {
        if (corners[i].y < center.y)
            top.push_back(corners[i]);
        else
            bot.push_back(corners[i]);
    }
    // This method assumes exactly two corners on each side of the center;
    // otherwise leave the corners untouched instead of reading out of bounds
    if (top.size() != 2 || bot.size() != 2)
        return;
    cv::Point2f tl = top[0].x > top[1].x ? top[1] : top[0];
    cv::Point2f tr = top[0].x > top[1].x ? top[0] : top[1];
    cv::Point2f bl = bot[0].x > bot[1].x ? bot[1] : bot[0];
    cv::Point2f br = bot[0].x > bot[1].x ? bot[0] : bot[1];
    corners.clear();
    corners.push_back(tl);
    corners.push_back(tr);
    corners.push_back(br);
    corners.push_back(bl);
}
...
// Get mass center
cv::Point2f center(0,0);
for (int i = 0; i < corners.size(); i++)
center += corners[i];
center *= (1. / corners.size());
sortCorners(corners, center);
Now the corners are sorted, i.e.
corners[0] = top-left,
corners[1] = top-right,
corners[2] = bottom-right, and
corners[3] = bottom-left.
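The center-split method relies on exactly two corners lying above the mass center, which may not hold when the card is rotated strongly. One alternative, shown here only as a sketch, is to sort the corners by their angle around the mass center. In image coordinates, where y grows downward, ascending angles give a clockwise order, and for a roughly upright card that order starts at the top-left corner. `Pt2` is a stand-in for cv::Point2f and `sortClockwise` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Pt2 { float x, y; };   // stand-in for cv::Point2f

// Order the four corners of a convex quadrilateral by angle around their centroid.
std::vector<Pt2> sortClockwise(std::vector<Pt2> corners)
{
    assert(corners.size() == 4);
    float cx = 0.0f, cy = 0.0f;
    for (const Pt2& p : corners) { cx += p.x; cy += p.y; }
    cx /= static_cast<float>(corners.size());
    cy /= static_cast<float>(corners.size());
    std::sort(corners.begin(), corners.end(),
              [cx, cy](const Pt2& a, const Pt2& b) {
                  return std::atan2(a.y - cy, a.x - cx) < std::atan2(b.y - cy, b.x - cx);
              });
    return corners;
}
```

For a strongly rotated card the result is still clockwise, but the first corner is no longer guaranteed to be the top-left one, so the output may come out rotated by a multiple of 90 degrees.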
Apply the perspective transformation
OpenCV provides cv::warpPerspective to apply a perspective transformation to an image. The function accepts a source image, a destination image, and a transformation matrix. The transformation matrix is the relation between the two images, as illustrated in Figure 2. We obtain the transformation matrix from the corner points of the object above and the corner points of the destination image.
// Define the destination image
cv::Mat quad = cv::Mat::zeros(300, 220, CV_8UC3);
// Corners of the destination image
std::vector<cv::Point2f> quad_pts;
quad_pts.push_back(cv::Point2f(0, 0));
quad_pts.push_back(cv::Point2f(quad.cols, 0));
quad_pts.push_back(cv::Point2f(quad.cols, quad.rows));
quad_pts.push_back(cv::Point2f(0, quad.rows));
// Get transformation matrix
cv::Mat transmtx = cv::getPerspectiveTransform(corners, quad_pts);
// Apply perspective transformation
cv::warpPerspective(src, quad, transmtx, quad.size());
cv::imshow("quadrilateral", quad);
The quadrilateral object is now transformed to the “normal view”, similar to Figure 1b.
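The matrix returned by cv::getPerspectiveTransform is a 3x3 matrix that maps each source corner to its destination corner in homogeneous coordinates. To make that concrete, here is a small sketch of how a single point is mapped through such a matrix. `Vec2` and `applyPerspective` are hypothetical names for this illustration, and the matrix is passed as a plain array rather than a cv::Mat.

```cpp
#include <cassert>

struct Vec2 { double x, y; };   // hypothetical point type

// Map a point through a 3x3 perspective matrix: multiply by the matrix in
// homogeneous coordinates, then divide by the third component w.
Vec2 applyPerspective(const double m[3][3], Vec2 p)
{
    const double w = m[2][0] * p.x + m[2][1] * p.y + m[2][2];
    assert(w != 0.0);   // w == 0 would send the point to infinity
    return { (m[0][0] * p.x + m[0][1] * p.y + m[0][2]) / w,
             (m[1][0] * p.x + m[1][1] * p.y + m[1][2]) / w };
}
```

When the last row is (0, 0, 1) the division has no effect and the mapping is affine; non-zero entries in the first two positions of that row are what produce the perspective effect.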
Summary
In this tutorial I have shown you how to apply perspective correction to a quadrilateral object in an image, as the first step before passing the image to an OCR engine. The full source code is available on GitHub.
Copied from:
http://opencv-code.com/tutorials/automatic-perspective-correction-for-quadrilateral-objects/