Reading Numbers Off Number Plates with Machine Learning

In this tutorial, you will build a basic Automatic License/Number Plate Recognition (ANPR) system using OpenCV and Python.

ANPR is one of the most requested topics here on the PyImageSearch blog.

I've covered it in detail inside the PyImageSearch Gurus course, and this blog post also appears as a chapter in my upcoming Optical Character Recognition book. If you enjoy the tutorial, you should definitely take a look at the book for more OCR educational content and case studies!

Automatic License/Number Plate Recognition systems come in all shapes and sizes:

  • ANPR performed in controlled lighting conditions with predictable license plate types can use basic image processing techniques.
  • More advanced ANPR systems use dedicated object detectors, such as HOG + Linear SVM, Faster R-CNN, SSDs, and YOLO, to localize license plates in images.
  • State-of-the-art ANPR software utilizes Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) to aid in better OCR'ing of the text from the license plates themselves.
  • And even more advanced ANPR systems use specialized neural network architectures to pre-process and clean images before they are OCR'd, thereby improving ANPR accuracy.

Automatic License/Number Plate Recognition is further complicated by the fact that it may need to operate in real time.

For example, suppose an ANPR system is mounted on a toll road. It needs to be able to detect the license plate of each car passing by, OCR the characters on the plate, and then store this data in a database so the owner of the vehicle can be billed for the toll.

Several compounding factors make ANPR incredibly challenging, including finding a dataset you can use to train a custom ANPR model! Large, robust ANPR datasets that are used to train state-of-the-art models are closely guarded and rarely (if ever) released publicly:

  • These datasets contain sensitive identifying information related to the vehicle, driver, and location.
  • ANPR datasets are tedious to curate, requiring an incredible investment of time and staff hours to annotate.
  • ANPR contracts with local and federal governments tend to be highly competitive. Because of that, it's often not the trained model that is valuable, but instead the dataset that a given company has curated.

For that reason, you'll see ANPR companies acquired not for their ANPR system but for the data itself!

In this tutorial we'll be building a basic Automatic License/Number Plate Recognition system. By the end of this guide, you'll have a template/starting point to use when building your own ANPR projects.

To learn how to build a basic Automatic License Plate Recognition system with OpenCV and Python, just keep reading.

Looking for the source code to this post?

Jump Right To The Downloads Section

OpenCV: Automatic License/Number Plate Recognition (ANPR) with Python

My first run-in with ANPR was about six years ago.

After a grueling three-day marathon consulting project in Maryland, where it did nothing but rain the entire time, I hopped on I-95 to drive back to Connecticut to visit friends for the weekend.

It was a beautiful summer day. Sun shining. Not a cloud in the sky. A soft breeze blowing. Perfect. Of course, I had my windows down, my music turned up, and I had totally zoned out, not a care in the world.

I didn't even notice when I drove by a small gray box discreetly positioned along the side of the highway.

Two weeks later … I got the speeding ticket in the mail.

Sure enough, I had unknowingly driven past a speed-trap camera doing 78 MPH in a 65 MPH zone.

That speeding camera caught me with my foot on the pedal, quite literally, and it had the pictures to prove it too. There it was, clear as day! You could see the license plate number on my old Honda Civic (before it got burnt to a crisp in an electrical fire).

Now, here's the ironic part. I knew exactly how their Automatic License/Number Plate Recognition system worked. I knew which image processing techniques the developers used to automatically localize my license plate in the image and extract the plate number via OCR.

In this tutorial, my goal is to teach you one of the quickest ways to build such an Automatic License/Number Plate Recognition system.

Using a bit of OpenCV, Python, and Tesseract OCR knowledge, you could help your homeowners' association monitor cars that come and go from your neighborhood.

Or maybe you want to build a camera-based (radar-less) system that determines the speed of cars that drive past your house using a Raspberry Pi. If the car exceeds the speed limit, you can analyze the license plate, apply OCR to it, and log the license plate number to a database. Such a system could help reduce speeding violations and create better neighborhood safety.

In the first part of this tutorial, you'll learn and define what Automatic License/Number Plate Recognition is. From there, we'll review our project structure. I'll then show you how to implement a basic Python class (aptly named PyImageSearchANPR) that will localize license plates in images and then OCR the characters. We'll wrap up the tutorial by examining the results of our ANPR system.

What is Automated License/Number Plate Recognition (ANPR/ALPR)?

Figure 1: An example of a real-time Automatic License/Number Plate Recognition system (image source: Chem on Pinterest).

Automatic License/Number Plate Recognition (ANPR/ALPR) is a process involving the following steps:

  • Step #1: Detect and localize a license plate in an input image/frame
  • Step #2: Extract the characters from the license plate
  • Step #3: Apply some form of Optical Character Recognition (OCR) to recognize the extracted characters

ANPR tends to be an extremely challenging subfield of computer vision, due to the vast diversity and assortment of license plate types across states and countries.

License plate recognition systems are further complicated by:

  • Dynamic lighting conditions including reflections, shadows, and blurring
  • Fast-moving vehicles
  • Obstructions

Additionally, large and robust ANPR datasets for training/testing are difficult to obtain due to:

  1. These datasets containing sensitive, personal information, including the time and location of a vehicle and its driver
  2. ANPR companies and government entities closely guarding these datasets as proprietary information

Therefore, the first part of an ANPR project is usually to collect data and amass enough example plates under various conditions.

So let's assume we don't have a license plate dataset (quality datasets are hard to come by). That rules out deep learning object detection, which means we're going to have to exercise our traditional computer vision knowledge.

I agree that it would be nice if we had a trained object detection model, but today I want you to rise to the occasion.

Before long, we'll be able to ditch the training wheels and consider working for a toll technology company, red-light camera integrator, speed ticketing system, or parking garage ticketing firm in which we need 99.97% accuracy.

Given these limitations, we'll be building a basic ANPR system that you can use as a starting point for your own projects.

Configuring your OCR development environment

In this tutorial, we'll use OpenCV, Tesseract, and PyTesseract to OCR number plates automatically. But before we get ahead of ourselves, let's first learn how to install these packages.

I recommend installing Python virtual environments and OpenCV before moving forward.

We are going to use a combination of pip, virtualenv, and virtualenvwrapper. My pip install opencv tutorial will help you get up and running with these tools, as well as the OpenCV binaries installed in a Python virtual environment.

You will also need imutils and scikit-image for today's tutorial. If you're already familiar with Python virtual environments and the virtualenv + virtualenvwrapper tools, simply install the following packages via pip:

$ workon {your_env} # replace with the name of your Python virtual environment
$ pip install opencv-contrib-python
$ pip install imutils
$ pip install scikit-image

Then it's time to install Tesseract and its Python bindings. If you haven't already installed the Tesseract/PyTesseract software, please follow the instructions in the "How to install Tesseract 4" section of my blog post OpenCV OCR and text recognition with Tesseract. This will configure and confirm that Tesseract OCR and PyTesseract bindings are ready to go.

Note: Tesseract should be installed on your system (not in a virtual environment). macOS users should NOT execute any system-level brew commands while they are inside a Python virtual environment. Please deactivate your virtual environment first. You can always workon your environment again to install more packages, such as PyTesseract.

Project structure

If you haven't done so, go to the "Downloads" section and grab both the code and dataset for today's tutorial. You'll need to unzip the archive to find the following:

$ tree --dirsfirst
.
├── license_plates
│   ├── group1
│   │   ├── 001.jpg
│   │   ├── 002.jpg
│   │   ├── 003.jpg
│   │   ├── 004.jpg
│   │   └── 005.jpg
│   └── group2
│       ├── 001.jpg
│       ├── 002.jpg
│       └── 003.jpg
├── pyimagesearch
│   ├── anpr
│   │   ├── __init__.py
│   │   └── anpr.py
│   └── __init__.py
└── ocr_license_plate.py

5 directories, 12 files

The project folder contains:

  • license_plates : Directory containing two sub-directories of JPG images
  • anpr.py : Contains the PyImageSearchANPR class responsible for localizing license/number plates and performing OCR
  • ocr_license_plate.py : Our main driver Python script, which uses our PyImageSearchANPR class to OCR entire groups of images

Now that we have the lay of the land, let's walk through our two Python scripts, which locate and OCR groups of license/number plates and display the results.

Implementing ANPR/ALPR with OpenCV and Python

We're ready to start implementing our Automatic License Plate Recognition script.

As I mentioned earlier, we'll keep our code neat and organized using a Python class appropriately named PyImageSearchANPR. This class provides a reusable means for license plate localization and character OCR operations.

Open anpr.py and let's get to work reviewing the script:

# import the necessary packages
from skimage.segmentation import clear_border
import pytesseract
import numpy as np
import imutils
import cv2

class PyImageSearchANPR:
	def __init__(self, minAR=4, maxAR=5, debug=False):
		# store the minimum and maximum rectangular aspect ratio
		# values along with whether or not we are in debug mode
		self.minAR = minAR
		self.maxAR = maxAR
		self.debug = debug

If you've been following along with my previous OCR tutorials, you might recognize some of our imports. Scikit-image's clear_border function may be unfamiliar to you, though; this function helps clean up the borders of images.

Our PyImageSearchANPR class begins on Line 8. The constructor accepts three parameters:

  • minAR : The minimum aspect ratio used to detect and filter rectangular license plates, which has a default value of 4
  • maxAR: The maximum aspect ratio of the license plate rectangle, which has a default value of 5
  • debug : A flag to indicate whether we should display intermediate results in our image processing pipeline

The aspect ratio range (minAR to maxAR) corresponds to the typical rectangular dimensions of a license plate. Keep the following considerations in mind if you need to alter the aspect ratio parameters:

  • European and international plates are often longer and not as tall as United States license plates. In this tutorial, we're not considering U.S. license/number plates.
  • Sometimes, motorcycles and large dumpster trucks mount their plates sideways; this is a true edge case that would have to be considered for a highly accurate license plate system (one we won't consider in this tutorial).
  • Some countries and regions allow for multi-line plates with a near 1:1 aspect ratio; again, we won't consider this edge case.

Each of our constructor parameters becomes a class variable on Lines 12-14 so the methods in the class can access them.
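To make the aspect-ratio filter concrete, here is a minimal pure-Python sketch of the check these parameters drive; the bounding boxes are made-up examples, not output from a real image:

```python
# hypothetical bounding boxes as (x, y, w, h) tuples; in the real
# pipeline these come from cv2.boundingRect on candidate contours
candidates = [
    (120, 80, 200, 45),  # ar ~= 4.44 -> plausible plate
    (10, 10, 60, 60),    # ar = 1.0   -> too square
    (0, 0, 640, 40),     # ar = 16.0  -> far too wide
]

minAR, maxAR = 4, 5

# keep only boxes whose width/height ratio falls in [minAR, maxAR]
plates = [(x, y, w, h) for (x, y, w, h) in candidates
          if minAR <= w / float(h) <= maxAR]

print(plates)  # only the 200x45 box survives
```

Only the box with an aspect ratio between 4 and 5 passes, which is exactly the pruning the class performs later on real contours.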

Debugging our computer vision pipeline

With our constructor ready to go, let's define a helper function to display results at various points in the image processing pipeline when in debug mode:

	def debug_imshow(self, title, image, waitKey=False):
		# check to see if we are in debug mode, and if so, show the
		# image with the supplied title
		if self.debug:
			cv2.imshow(title, image)

			# check to see if we should wait for a keypress
			if waitKey:
				cv2.waitKey(0)

Our helper function debug_imshow (Line 16) accepts three parameters:

  • title : The desired OpenCV window title. Window titles should be unique; otherwise OpenCV will replace the image in the same-titled window rather than creating a new one.
  • image: The image to display within the OpenCV GUI window.
  • waitKey : A flag indicating whether the display should wait for a keypress before completing.

Lines 19-24 display the debugging image in an OpenCV window. Typically, the waitKey boolean will be False. However, in this tutorial we have set it to True so we can inspect debugging images and dismiss them when we are ready.

Locating potential license plate candidates

Our first ANPR method helps us find the license plate candidate contours in an image:

	def locate_license_plate_candidates(self, gray, keep=5):
		# perform a blackhat morphological operation that will allow
		# us to reveal dark regions (i.e., text) on light backgrounds
		# (i.e., the license plate itself)
		rectKern = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5))
		blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rectKern)
		self.debug_imshow("Blackhat", blackhat)

Our locate_license_plate_candidates method expects two parameters:

  • gray: This function assumes that the driver script will provide a grayscale image containing a potential license plate.
  • keep : We'll only return up to this many sorted license plate candidate contours.

We're now going to make a generalization to help us simplify our ANPR pipeline. Let's assume from here forward that most license plates have a light background (typically it is highly reflective) and a dark foreground (characters).

I realize there are plenty of cases where this generalization does not hold, but let's continue working on our proof of concept, and we can make accommodations for inverted plates in the future.

Lines 30 and 31 perform a blackhat morphological operation to reveal dark characters (letters, digits, and symbols) against light backgrounds (the license plate itself). As you can see, our kernel has a rectangular shape of 13 pixels wide x 5 pixels tall, which corresponds to the shape of a typical international license plate.

If your debug option is on, you'll see a blackhat visualization similar to the one in Figure 2 (bottom):

Figure 2: OpenCV's blackhat morphological operator highlights the license plate numbers against the rest of the photo of the rear end of the car. You can see that the license plate numbers "pop" as white text against the dark background and most of the background noise is washed out.

As you can see from above, the license plate characters are clearly visible!

In our next step, we'll find regions in the image that are light and may contain license plate characters:

		# next, find regions in the image that are light
		squareKern = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
		light = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, squareKern)
		light = cv2.threshold(light, 0, 255,
			cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
		self.debug_imshow("Light Regions", light)

Using a small square kernel (Line 35), we apply a closing operation (Line 36) to fill small holes and help us identify larger structures in the image. Lines 37 and 38 perform a binary threshold on our image using Otsu's method to reveal the light regions in the image that may contain license plate characters.
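Otsu's method chooses the threshold automatically by maximizing the between-class variance of the intensity histogram. OpenCV handles this internally when the cv2.THRESH_OTSU flag is passed; the following pure-NumPy sketch (an illustration of the idea, not OpenCV's exact implementation) shows how that threshold is found:

```python
import numpy as np

def otsu_threshold(gray):
    # histogram of 8-bit pixel intensities
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = gray.size
    sum_all = float(np.dot(np.arange(256), hist))
    sum_b = 0.0   # running intensity sum of the background class
    w_b = 0       # running pixel count of the background class
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b              # background mean
        m_f = (sum_all - sum_b) / w_f  # foreground mean
        # between-class variance; Otsu maximizes this quantity
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a strongly bimodal image the returned threshold separates the two intensity clusters, which is why Otsu's method works so well for light plates with dark characters.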

Figure 3 shows the effect of the closing operation combined with Otsu's inverse binary thresholding. Notice how the regions where the license plate is located are almost one big white surface.

Figure 3: OpenCV is used to perform a closing and threshold operation as a pre-processing pipeline step for Automatic License/Number Plate Recognition (ANPR) with Python.

Figure 3 shows the region that includes the license plate standing out.

The Scharr gradient will detect edges in the image and emphasize the boundaries of the characters in the license plate:

		# compute the Scharr gradient representation of the blackhat
		# image in the x-direction and then scale the result back to
		# the range [0, 255]
		gradX = cv2.Sobel(blackhat, ddepth=cv2.CV_32F,
			dx=1, dy=0, ksize=-1)
		gradX = np.absolute(gradX)
		(minVal, maxVal) = (np.min(gradX), np.max(gradX))
		gradX = 255 * ((gradX - minVal) / (maxVal - minVal))
		gradX = gradX.astype("uint8")
		self.debug_imshow("Scharr", gradX)

Using cv2.Sobel, we compute the Scharr gradient magnitude representation in the x-direction of our blackhat image (Lines 44 and 45). We then scale the resulting intensities back to the range [0, 255] (Lines 46-49).
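Those scaling lines are a standard min-max normalization back to an 8-bit range; this small standalone NumPy example (with a made-up float array standing in for the real gradient image) shows their effect:

```python
import numpy as np

# a toy float "gradient" standing in for gradX
gradX = np.array([[-3.0, 0.0], [1.5, 3.0]], dtype=np.float32)

# take the magnitude, then min-max scale to [0, 255] and convert to
# an 8-bit image, mirroring Lines 46-49 of the tutorial
gradX = np.absolute(gradX)
(minVal, maxVal) = (np.min(gradX), np.max(gradX))
gradX = 255 * ((gradX - minVal) / (maxVal - minVal))
gradX = gradX.astype("uint8")

print(gradX)
```

The smallest magnitude maps to 0, the largest to 255, and everything else lands proportionally in between, so subsequent thresholding always operates on a full-contrast 8-bit image.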

Figure 4 demonstrates an emphasis on the edges of the license plate characters:

Figure 4: Applying Scharr's algorithm in the x-direction emphasizes the edges in our blackhat image as another ANPR image processing pipeline step.

As you can see above, the license plate characters appear noticeably different from the rest of the image.

We can now smooth to group the regions that may contain boundaries to license plate characters:

		# blur the gradient representation, applying a closing
		# operation, and threshold the image using Otsu's method
		gradX = cv2.GaussianBlur(gradX, (5, 5), 0)
		gradX = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, rectKern)
		thresh = cv2.threshold(gradX, 0, 255,
			cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
		self.debug_imshow("Grad Thresh", thresh)

Here we apply a Gaussian blur to the gradient magnitude image (gradX) (Line 54). Again we apply a closing operation (Line 55) and another binary threshold using Otsu's method (Lines 56 and 57).

Figure 5 shows a contiguous white region where the license plate characters are located:

Figure 5: Blurring, closing, and thresholding operations using OpenCV and Python result in a contiguous white region on top of the license plate/number plate characters.

At first glance, these results look chaotic. The license plate region is somewhat defined, but there are many other large white regions as well. Let's see if we can eliminate some of the noise:

		# perform a series of erosions and dilations to clean up the
		# thresholded image
		thresh = cv2.erode(thresh, None, iterations=2)
		thresh = cv2.dilate(thresh, None, iterations=2)
		self.debug_imshow("Grad Erode/Dilate", thresh)

Lines 62 and 63 perform a series of erosions and dilations in an effort to denoise the thresholded image:

Figure 6: Erosions and dilations with OpenCV and Python clean up our thresholded image, making it easier to detect our license plate characters for our ANPR system.

As you can see in Figure 6, the erosion and dilation operations cleaned up a lot of noise in the previous result from Figure 5. We clearly aren't done yet though.

Let's add another step to the pipeline, in which we'll put our light regions image to use:

		# take the bitwise AND between the threshold result and the
		# light regions of the image
		thresh = cv2.bitwise_and(thresh, thresh, mask=light)
		thresh = cv2.dilate(thresh, None, iterations=2)
		thresh = cv2.erode(thresh, None, iterations=1)
		self.debug_imshow("Final", thresh, waitKey=True)

Back on Lines 35-38, we devised a method to highlight lighter regions in the image (keeping in mind our established generalization that license plates will have a light background and dark foreground).

This light image serves as our mask for a bitwise-AND between the thresholded result and the light regions of the image to reveal the license plate candidates (Line 68). We follow with a couple of dilations and an erosion to fill holes and clean up the image (Lines 69 and 70).
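To see what Line 68's masking does, here is a tiny NumPy equivalent of cv2.bitwise_and with a mask; the two arrays are toy stand-ins for the real thresh and light images:

```python
import numpy as np

# toy binary images (255 = foreground pixel)
thresh = np.array([[255, 255, 0],
                   [0, 255, 255]], dtype=np.uint8)
light = np.array([[255, 0, 0],
                  [0, 255, 0]], dtype=np.uint8)

# cv2.bitwise_and(thresh, thresh, mask=light) keeps thresh pixels only
# where the mask is non-zero; in NumPy terms:
masked = np.where(light > 0, thresh, 0)

print(masked)
```

Foreground pixels in thresh survive only where light is also "on", which is how the pipeline discards bright-edge noise that does not sit on a light plate background.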

Our "Final" debugging image is shown in Figure 7. Notice that the final call to debug_imshow overrides waitKey to True, ensuring that as a user, we can inspect all debugging images up until this point and press a key when we are ready.

Figure 7: After a series of image processing pipeline steps for ANPR/ALPR performed with OpenCV and Python, we can clearly see the region with the license plate characters is one of the larger contours.

You should notice that our license plate contour is not the largest, but it's far from being the smallest. At a glance, I'd say it is the second or third largest contour in the image, and I also notice the plate contour is not touching the edge of the image.

Speaking of contours, let's find and sort them:

		# find contours in the thresholded image and sort them by
		# their size in descending order, keeping only the largest
		# ones
		cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
			cv2.CHAIN_APPROX_SIMPLE)
		cnts = imutils.grab_contours(cnts)
		cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:keep]

		# return the list of contours
		return cnts

To close out our locate_license_plate_candidates method, we:

  • Find all contours (Lines 76-78)
  • Reverse-sort them according to their pixel area while only keeping at most keep contours
  • Return the resulting sorted and pruned list of cnts (Line 82).
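The sort-and-prune step can be illustrated without OpenCV; in this sketch, cv2.contourArea is replaced by a plain area function over hypothetical region sizes:

```python
# hypothetical candidate regions as (w, h) sizes; in the real code,
# cv2.contourArea computes the area of each contour directly
regions = [(200, 45), (30, 30), (640, 40), (15, 10), (100, 25), (80, 20)]

def area(region):
    (w, h) = region
    return w * h

keep = 5

# reverse-sort by area and prune to at most `keep` candidates,
# mirroring sorted(cnts, key=cv2.contourArea, reverse=True)[:keep]
cnts = sorted(regions, key=area, reverse=True)[:keep]

print(cnts)  # the five largest regions, biggest first
```

Sorting descending and slicing gives the largest keep candidates in one expression, with the smallest regions silently discarded.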

Take a step back to think about what we've accomplished in this method. We've accepted a grayscale image and used traditional image processing techniques with an emphasis on morphological operations to find a selection of candidate contours that might contain a license plate.

I know what you are thinking: "Why haven't we applied deep learning object detection to find the license plate? Wouldn't that be easier?"

While that is perfectly acceptable (and don't get me wrong, I love deep learning!), it is a lot of work to train such an object detector on your own. We're talking countless hours to annotate thousands of images in your dataset.

But remember, we didn't have the luxury of a dataset in the first place, so the method we've developed so far relies on so-called "traditional" image processing techniques.

If you're hungry to learn the ins and outs of morphological operations (and want to be a more well-rounded computer vision engineer), I suggest you enroll in the PyImageSearch Gurus course.

Pruning license plate candidates

In this next method, our goal is to find the most likely contour containing a license plate from our set of candidates. Let's see how it works:

	def locate_license_plate(self, gray, candidates,
		clearBorder=False):
		# initialize the license plate contour and ROI
		lpCnt = None
		roi = None

		# loop over the license plate candidate contours
		for c in candidates:
			# compute the bounding box of the contour and then use
			# the bounding box to derive the aspect ratio
			(x, y, w, h) = cv2.boundingRect(c)
			ar = w / float(h)

Our locate_license_plate function accepts three parameters:

  • gray: Our input grayscale image
  • candidates: The license plate contour candidates returned by the previous method in this class
  • clearBorder : A boolean indicating whether our pipeline should eliminate any contours that touch the border of the image

Before we begin looping over the license plate contour candidates, first we initialize variables that will soon hold our license plate contour (lpCnt) and license plate region of interest (roi) on Lines 87 and 88.

Starting on Line 91, our loop begins. This loop aims to isolate the contour that contains the license plate and extract the region of interest of the license plate itself. We proceed by determining the bounding box rectangle of the contour, c (Line 94).

Calculating the aspect ratio of the contour's bounding box (Line 95) will help us ensure our contour is the proper rectangular shape of a license plate.

As you can see in the equation, the aspect ratio is a relationship between the width and height of the rectangle.

			# check to see if the aspect ratio is rectangular
			if ar >= self.minAR and ar <= self.maxAR:
				# store the license plate contour and extract the
				# license plate from the grayscale image and then
				# threshold it
				lpCnt = c
				licensePlate = gray[y:y + h, x:x + w]
				roi = cv2.threshold(licensePlate, 0, 255,
					cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

If the contour's bounding box ar does not meet our license plate expectations, then there's no more work to do. The roi and lpCnt will remain as None, and it is up to the driver script to handle this scenario.

Hopefully, the aspect ratio is acceptable and falls within the bounds of a typical license plate's minAR and maxAR. In this case, we assume that we have our winning license plate contour! Let's go ahead and populate lpCnt and our roi:

  • lpCnt is set from the current contour, c (Line 102).
  • roi is extracted via NumPy slicing (Line 103) and subsequently binary-inverse thresholded using Otsu's method (Lines 104 and 105).

Let's wrap up the locate_license_plate method so we can move on to the next stage:

				# check to see if we should clear any foreground
				# pixels touching the border of the image
				# (which typically, but not always, indicates noise)
				if clearBorder:
					roi = clear_border(roi)

				# display any debugging information and then break
				# from the loop early since we have found the license
				# plate region
				self.debug_imshow("License Plate", licensePlate)
				self.debug_imshow("ROI", roi, waitKey=True)
				break

		# return a 2-tuple of the license plate ROI and the contour
		# associated with it
		return (roi, lpCnt)

If our clearBorder flag is set, we can clear any foreground pixels that are touching the border of our license plate ROI (Lines 110 and 111). This helps to eliminate noise that could affect our Tesseract OCR results.

Lines 116 and 117 display our:

  • licensePlate : The ROI pre-thresholding and border cleanup (Figure 8, top)
  • roi: Our final license plate ROI (Figure 8, bottom)

Once again, notice that the final call to debug_imshow of this function overrides waitKey to True, ensuring that as a user we have the opportunity to inspect all debugging images for this function and can press a key when we are ready.

After that key is pressed, we break out of our loop, ignoring other candidates. Finally, we return the 2-tuple consisting of our ROI and license plate contour to the caller.

Figure 8: The results of our Python and OpenCV-based ANPR localization pipeline. This sample is very suitable to pass on to be OCR'd with Tesseract.

The bottom result is encouraging because Tesseract OCR should be able to decipher the characters.

Defining Tesseract ANPR options including an OCR Character Whitelist and Page Segmentation Mode (PSM)

Leading up to this point, we've used our knowledge of OpenCV's morphological operations and contour processing to both find the plate and ensure we have a clean image to send through the Tesseract OCR engine.

It is now time to do just that. Shifting our focus to OCR, let's define the build_tesseract_options method:

	def build_tesseract_options(self, psm=7):
		# tell Tesseract to only OCR alphanumeric characters
		alphanumeric = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
		options = "-c tessedit_char_whitelist={}".format(alphanumeric)

		# set the PSM mode
		options += " --psm {}".format(psm)

		# return the built options string
		return options

Tesseract and its Python bindings brother, PyTesseract, accept a range of configuration options. For this tutorial we're only concerned with two:

  • Page Segmentation Mode (PSM): Tesseract's setting indicating layout analysis of the document/image. There are 13 modes of operation, but we will default to 7 ("treat the image as a single text line") per the psm parameter default.
  • Whitelist: A list of characters (letters, digits, symbols) that Tesseract will consider (i.e., report in the OCR'd results). Each of our whitelist characters is listed in the alphanumeric variable (Line 126).

Lines 127-130 concatenate both into a formatted string with these option parameters. If you're familiar with Tesseract's command line arguments, you'll notice that our PyTesseract options string has a direct relationship.

Our options are returned to the caller via Line 133.
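Running the same logic outside the class makes the resulting configuration string easy to inspect; this standalone snippet reproduces build_tesseract_options:

```python
def build_tesseract_options(psm=7):
    # tell Tesseract to only OCR alphanumeric characters
    alphanumeric = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    options = "-c tessedit_char_whitelist={}".format(alphanumeric)

    # set the PSM mode
    options += " --psm {}".format(psm)
    return options

print(build_tesseract_options())
```

The printed string is exactly what gets handed to PyTesseract via the config parameter later in the pipeline.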

The central method of the PyImageSearchANPR class

Our final method brings all the components together in one centralized place so our driver script can instantiate a PyImageSearchANPR object, and then make a single function call. Let's implement find_and_ocr:

	def find_and_ocr(self, image, psm=7, clearBorder=False):
		# initialize the license plate text
		lpText = None

		# convert the input image to grayscale, locate all candidate
		# license plate regions in the image, and then process the
		# candidates, leaving us with the *actual* license plate
		gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
		candidates = self.locate_license_plate_candidates(gray)
		(lp, lpCnt) = self.locate_license_plate(gray, candidates,
			clearBorder=clearBorder)

		# only OCR the license plate if the license plate ROI is not
		# empty
		if lp is not None:
			# OCR the license plate
			options = self.build_tesseract_options(psm=psm)
			lpText = pytesseract.image_to_string(lp, config=options)
			self.debug_imshow("License Plate", lp)

		# return a 2-tuple of the OCR'd license plate text along with
		# the contour associated with the license plate region
		return (lpText, lpCnt)

This method accepts three parameters:

  • image: The three-channel color image of the rear (or front) of a car with a license plate tag
  • psm : The Tesseract Page Segmentation Mode
  • clearBorder : The flag indicating whether we'd like to clean up contours touching the border of the license plate ROI

Given our function parameters, we now:

  • Convert the input image to grayscale (Line 142)
  • Determine our set of license plate candidates from our gray image via the method we previously defined (Line 143)
  • Locate the license plate from the candidates, resulting in our lp ROI (Lines 144 and 145)

Assuming we've found a suitable plate (i.e., lp is not None), we set our PyTesseract options and perform OCR via the image_to_string method (Lines 149-152).

Finally, Line 157 returns a 2-tuple consisting of the OCR'd lpText and lpCnt contour.
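Because find_and_ocr returns a 2-tuple with lpText left as None on failure, a caller can retry with different settings when the first pass reads nothing. Here is a small, hypothetical wrapper sketch; the retry policy is my own illustration and is not part of the class itself:

```python
def ocr_with_retry(anpr, image, psm=7):
	# first attempt: leave the plate ROI untouched
	(lpText, lpCnt) = anpr.find_and_ocr(image, psm=psm, clearBorder=False)

	# if nothing was read, retry with border pixels cleared, which can
	# help when characters touch the edge of the plate ROI
	if not lpText:
		(lpText, lpCnt) = anpr.find_and_ocr(image, psm=psm,
			clearBorder=True)

	return (lpText, lpCnt)
```

Whether the second pass actually helps depends on the image, so a real system might also log which pass succeeded.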

Phew! You did it! Nice job implementing the PyImageSearchANPR class.

If you found implementing this class challenging to understand, then I would recommend you study Module 1 of the PyImageSearch Gurus course, where you'll learn the basics of computer vision and image processing.

In our next section, we'll create a Python script that utilizes the PyImageSearchANPR class to perform Automatic License/Number Plate Recognition on input images.

Creating our license/number plate recognition driver script with OpenCV and Python

Now that our PyImageSearchANPR class is implemented, we can move on to creating a Python driver script that will:

  1. Load an input image from disk
  2. Detect the license plate in the input image
  3. OCR the license plate
  4. Display the ANPR result to our screen

Let's take a look in the project directory and find our driver file ocr_license_plate.py:

# import the necessary packages
from pyimagesearch.anpr import PyImageSearchANPR
from imutils import paths
import argparse
import imutils
import cv2

Here we have our imports, namely our custom PyImageSearchANPR class that we implemented in the "Implementing ANPR/ALPR with OpenCV and Python" section and subsections.

Before we go further, we need to write a little string-cleanup utility:

def cleanup_text(text):
	# strip out non-ASCII text so we can draw the text on the image
	# using OpenCV
	return "".join([c if ord(c) < 128 else "" for c in text]).strip()

Our cleanup_text function accepts a text string and strips out all non-ASCII characters. This serves as a safety mechanism for OpenCV's cv2.putText function, which isn't always able to render special characters during image annotation (OpenCV renders them as "?", question marks).

As you can see, we're ensuring that only ASCII characters with ordinals [0, 127] pass through. If you are unfamiliar with ASCII and alphanumeric characters, check out my post OCR with Keras, TensorFlow, and Deep Learning or grab a copy of my upcoming OCR book, which covers this extensively.
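To make the ordinal filter concrete, here is the same one-liner exercised on a couple of illustrative strings (the inputs are made up for demonstration purposes):

```python
def cleanup_text(text):
	# keep only characters with ASCII ordinals in [0, 127] so that
	# cv2.putText can render the result without "?" placeholders
	return "".join([c if ord(c) < 128 else "" for c in text]).strip()

# a non-ASCII en dash (U+2013) is dropped entirely
print(cleanup_text("MH15\u2013TC584"))  # MH15TC584

# surrounding whitespace (e.g. a trailing newline from Tesseract)
# is removed by strip()
print(cleanup_text("  KL55R2473\n"))  # KL55R2473
```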

Let's familiarize ourselves with this script's command line arguments:

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True,
	help="path to input directory of images")
ap.add_argument("-c", "--clear-border", type=int, default=-1,
	help="whether or not to clear border pixels before OCR'ing")
ap.add_argument("-p", "--psm", type=int, default=7,
	help="default PSM mode for OCR'ing license plates")
ap.add_argument("-d", "--debug", type=int, default=-1,
	help="whether or not to show additional visualizations")
args = vars(ap.parse_args())

Our command line arguments include:

  • --input : The required path to the input directory of vehicle images.
  • --clear-border : A flag indicating if we'll clean up the borders of our license plate ROI prior to passing it to Tesseract (further details are presented in the "Pruning license plate candidates" section above).
  • --psm : Tesseract's Page Segmentation Mode; a 7 indicates that Tesseract should only look for one line of text.
  • --debug : A boolean indicating whether we wish to display intermediate image processing pipeline debugging images.
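As a reminder of what the --psm value feeds into, here is a minimal sketch of how a Tesseract config string for license plates can be assembled. It mirrors the idea behind the class's build_tesseract_options method, though the exact implementation inside the class may differ:

```python
import string

def build_tesseract_options(psm=7):
	# restrict OCR to uppercase letters and digits, since license
	# plates contain only alphanumeric characters
	alphanumeric = string.ascii_uppercase + string.digits
	options = "-c tessedit_char_whitelist={}".format(alphanumeric)

	# tell Tesseract which Page Segmentation Mode to use; PSM 7
	# treats the ROI as a single line of text
	options += " --psm {}".format(psm)
	return options

print(build_tesseract_options(psm=7))
# -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 --psm 7
```

The resulting string is what gets passed to pytesseract.image_to_string via its config parameter.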

With our imports in place, text cleanup utility defined, and an understanding of our command line arguments, it is now time to automatically recognize license plates!

# initialize our ANPR class
anpr = PyImageSearchANPR(debug=args["debug"] > 0)

# grab all image paths in the input directory
imagePaths = sorted(list(paths.list_images(args["input"])))

First, we instantiate our PyImageSearchANPR object while passing our --debug flag (Line 26). We also go ahead and bring in all the --input image paths with imutils' paths module (Line 29).

We'll process each of our imagePaths in hopes of finding and OCR'ing each license plate successfully:

# loop over all image paths in the input directory
for imagePath in imagePaths:
	# load the input image from disk and resize it
	image = cv2.imread(imagePath)
	image = imutils.resize(image, width=600)

	# apply automatic license plate recognition
	(lpText, lpCnt) = anpr.find_and_ocr(image, psm=args["psm"],
		clearBorder=args["clear_border"] > 0)

	# only continue if the license plate was successfully OCR'd
	if lpText is not None and lpCnt is not None:
		# fit a rotated bounding box to the license plate contour and
		# draw the bounding box on the license plate
		box = cv2.boxPoints(cv2.minAreaRect(lpCnt))
		box = box.astype("int")
		cv2.drawContours(image, [box], -1, (0, 255, 0), 2)

		# compute a normal (unrotated) bounding box for the license
		# plate and then draw the OCR'd license plate text on the
		# image
		(x, y, w, h) = cv2.boundingRect(lpCnt)
		cv2.putText(image, cleanup_text(lpText), (x, y - 15),
			cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2)

		# show the output ANPR image
		print("[INFO] {}".format(lpText))
		cv2.imshow("Output ANPR", image)
		cv2.waitKey(0)

Looping over our imagePaths, we load and resize the image (Lines 32-35).

A call to our find_and_ocr method, passing the image, --psm mode, and --clear-border flag, primes our ANPR pipeline pump to spit out the resulting OCR'd text and license plate contour on the other end.

You've just performed ANPR/ALPR in the driver script! If you need to revisit this method, refer to the walkthrough in "The central method of the PyImageSearchANPR class" section, bearing in mind that the bulk of the work is done in the class methods leading up to the find_and_ocr method.

Assuming that both lpText and lpCnt did not return as None (Line 42), let's annotate the original input image with the OCR result. Inside the conditional, we:

  • Compute and draw the bounding box of the license plate contour (Lines 45-47)
  • Annotate the cleaned-up lpText string (Lines 52-54)
  • Display the license plate string in the terminal and the annotated image in a GUI window (Lines 57 and 58)
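The unrotated bounding box used for placing the text is simply the min/max extent of the contour's points. A plain-Python equivalent of what cv2.boundingRect computes makes the geometry clear (illustrative only, with made-up points):

```python
def bounding_rect(points):
	# axis-aligned bounding box of a set of (x, y) points, mirroring
	# what cv2.boundingRect computes for a contour: the top-left
	# corner plus an inclusive width and height
	xs = [p[0] for p in points]
	ys = [p[1] for p in points]
	x, y = min(xs), min(ys)
	return (x, y, max(xs) - x + 1, max(ys) - y + 1)

print(bounding_rect([(10, 30), (60, 25), (58, 50), (12, 55)]))
# (10, 25, 51, 31)
```

The rotated box from cv2.minAreaRect hugs a tilted plate more tightly, which is why the driver uses it for drawing the outline but the axis-aligned box for anchoring the text.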

You can now cycle through all of your --input directory images by pressing any key (Line 59).

You did it! Give yourself a pat on the back before proceeding to the results section; you deserve it.

ANPR results with OpenCV and Python

We are now ready to apply Automatic License/Number Plate Recognition using OpenCV and Python.

Start by using the "Downloads" section of this tutorial to download the source code and example images.

From there, open a terminal and execute the following command for our first group of test images:

$ python ocr_license_plate.py --input license_plates/group1
[INFO] MH15TC584
[INFO] KL55R2473
[INFO] MH20EE7601
[INFO] KLO7BF5000
[INFO] HR26DA2330
Figure 9: Our Automatic License/Number Plate Recognition algorithm developed with Python, OpenCV, and Tesseract is successful on all five of the test images in the first group!

As you can see, we've successfully applied ANPR to all of these images, including license/number plate examples on the front or back of the vehicle.

Let's try another set of images, this time where our ANPR solution doesn't work as well:

$ python ocr_license_plate.py --input license_plates/group2
[INFO] MHOZDW8351
[INFO] SICAL
[INFO] WMTA
Figure 10: Unfortunately, "group 2" vehicle images lead to mixed results. In this case, we are not invoking the option to clear foreground pixels around the border of the license plate, which is detrimental to Tesseract's ability to decipher the number plate.

While the first result image has the correct ANPR result, the other two are wildly incorrect.

The solution here is to apply our clear_border function to strip the foreground pixels that touch the border of the image and confuse Tesseract's OCR:

$ python ocr_license_plate.py --input license_plates/group2 --clear-border 1
[INFO] MHOZDW8351
[INFO] KA297999
[INFO] KE53E964
Figure 11: By applying the clear_border option to "group 2" vehicle images, we see an improvement in the results. However, we still have OCR mistakes present in the top-right and bottom examples.

We're able to improve the ANPR OCR results for these images by applying the clear_border function.

However, there is still one error in each example. In the top-right case, the letter "Z" is mistaken for the digit "7". In the bottom case, the letter "L" is mistaken for the letter "E".
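One common, lightweight mitigation for look-alike confusions of this kind is to post-process the OCR'd string against the expected plate format. The sketch below is my own illustration and not part of this tutorial's pipeline; the look-alike table and the pattern strings are assumptions you would tailor to your region's plates:

```python
# map look-alike letters to digits, plus the reverse mapping
ALPHA_TO_DIGIT = {"O": "0", "I": "1", "Z": "2", "S": "5", "B": "8"}
DIGIT_TO_ALPHA = {v: k for k, v in ALPHA_TO_DIGIT.items()}

def coerce_to_pattern(text, pattern):
	# pattern uses "A" where a letter is expected and "D" where a
	# digit is expected; mismatched look-alikes are swapped back
	out = []
	for ch, kind in zip(text, pattern):
		if kind == "D" and ch in ALPHA_TO_DIGIT:
			ch = ALPHA_TO_DIGIT[ch]
		elif kind == "A" and ch in DIGIT_TO_ALPHA:
			ch = DIGIT_TO_ALPHA[ch]
		out.append(ch)
	return "".join(out)

# "KLO7BF5000" from group 1 likely contains an O/0 confusion;
# coercing it against the expected format repairs the digit
print(coerce_to_pattern("KLO7BF5000", "AADDAADDDD"))  # KL07BF5000
```

A table like this cannot fix every mistake (the L/E confusion above has no digit counterpart), but it is cheap insurance when the plate format is known.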

Although these are understandable mistakes, we would hope to do better.

While our system is a great start (and is sure to impress our friends and family!), there are some obvious limitations and drawbacks associated with today's proof of concept. Let's discuss them, along with a few ideas for improvement.

Limitations and drawbacks

Figure 12: Our Automatic License/Number Plate Recognition solution was very sensitive to some conditions. In this case, allowing characters to touch the edges of the image resulted in noisy input to the Tesseract OCR engine, resulting in lower accuracy.

As the previous section's ANPR results showed, sometimes our ANPR system worked well and other times it did not. Furthermore, something as simple as clearing any foreground pixels that touch the borders of the input license plate improved license plate OCR accuracy.

Why is that?

The simple answer here is that Tesseract's OCR engine can be a bit sensitive. Tesseract works best when you provide it with neatly cleaned and pre-processed images.

However, in real-world implementations, you may not be able to guarantee clear images. Instead, your images may be grainy or low quality, or the driver of a given vehicle may have a special cover on their license plate to obfuscate the view of it, making ANPR even more challenging.

As I mentioned in the introduction to this tutorial (and I'll reiterate in the summary), this blog post serves as a starting point for building your own Automatic License/Number Plate Recognition systems.

This method will work well in controlled conditions, but if you want to build a system that works in uncontrolled environments, you'll need to start replacing components (namely license plate localization, character segmentation, and character OCR) with more advanced machine learning and deep learning models.

If you're interested in more advanced ANPR methods, please let me know what challenges you're facing so I can develop future content for you!

Credits

The collection of images we used for this ANPR example was sampled from the dataset put together by Devika Mishra of DataTurks. Thank you for putting together this dataset, Devika!

What's next? I recommend PyImageSearch University.

Course information:
35+ total classes • 39h 44m video • Last updated: Feb 2022
★★★★★ 4.84 (128 Ratings) • 3,000+ Students Enrolled

I strongly believe that if you had the right teacher you could master computer vision and deep learning.

Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?

That's not the case.

All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that's exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.

If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you'll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.

Inside PyImageSearch University you'll find:

  • ✓ 35+ courses on essential computer vision, deep learning, and OpenCV topics
  • ✓ 35+ Certificates of Completion
  • ✓ 39h 44m on-demand video
  • ✓ Brand new courses released every month, ensuring you can keep up with state-of-the-art techniques
  • ✓ Pre-configured Jupyter Notebooks in Google Colab
  • ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
  • ✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
  • ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
  • ✓ Access on mobile, laptop, desktop, etc.

Click here to join PyImageSearch University

Summary

In this tutorial, you learned how to build a basic Automatic License/Number Plate Recognition system using OpenCV and Python.

Our ANPR method relied on basic computer vision and image processing techniques to localize a license plate in an image, including morphological operations, image gradients, thresholding, bitwise operations, and contours.

This method will work well in controlled, predictable environments, such as when lighting conditions are uniform across input images and license plates are standardized (such as dark characters on a light license plate background).

However, if you are developing an ANPR system that does not have a controlled environment, you'll need to begin inserting machine learning and/or deep learning to replace parts of our plate localization pipeline.

HOG + Linear SVM is a good starting point for plate localization if your input license plates have a viewing angle that doesn't change more than a few degrees. If you're working in an unconstrained environment where viewing angles can vary dramatically, then deep learning-based models such as Faster R-CNN, SSDs, and YOLO will likely obtain better accuracy.

Additionally, you may need to train your own custom license plate character OCR model. We were able to get away with Tesseract in this blog post, but a dedicated character segmentation and OCR model (like the ones I cover inside the PyImageSearch Gurus course) may be required to improve your accuracy.

I hope you enjoyed this tutorial!

To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!

Download the Source Code and FREE 17-page Resource Guide

Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!


Source: https://pyimagesearch.com/2020/09/21/opencv-automatic-license-number-plate-recognition-anpr-with-python/
