Extract Text From Image Using pytesseract in Python

In this tutorial, let’s write a Python script to extract text from an image using pytesseract.

Python Code to Extract Text From Image Using `pytesseract`

Pytesseract, also known as Python-tesseract, is a Python Optical Character Recognition (OCR) tool. It is a wrapper around the Tesseract OCR engine and can read and recognise text in photos, licence plates, and so on. To read the words from a given image, we will use the Tesseract software.

Three basic steps are involved:

1. Loading an image from the computer, or downloading it with a browser and then loading it. (Any image that contains text will do.)
2. Image binarization (converting the image to black and white). The script in this tutorial passes the image to Tesseract directly and does not binarize it itself.
3. Processing the image with the OCR system.
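The binarization step can be sketched with Pillow alone. This is a minimal illustration, not part of the script in this tutorial: the helper names `threshold_pixel` and `binarize` and the default threshold of 128 are assumptions chosen for the example, and a fixed threshold will not suit every image.

```python
def threshold_pixel(value, threshold=128):
    """Map one grayscale value (0-255) to pure black (0) or pure white (255)."""
    return 255 if value >= threshold else 0


def binarize(img, threshold=128):
    """Return a black-and-white copy of a Pillow image.

    `img` is expected to be a PIL.Image.Image, e.g. from Image.open().
    """
    # Convert to 8-bit grayscale first so every pixel is a single value
    gray = img.convert("L")
    # point() applies the function to every possible pixel value
    return gray.point(lambda v: threshold_pixel(v, threshold))
```

For example, `binarize(Image.open(image_path))` would give an image that can be passed to the OCR step in place of the original. Whether this improves recognition depends on the picture.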

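Install pytesseract and pillow using the commands below. Note that pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed on the computer separately.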

pip install pytesseract

pip install pillow
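Because the Tesseract executable is installed separately from the Python packages, it can help to check where it is before running any OCR. The following is a rough sketch using only the standard library. The function name `find_tesseract` is hypothetical, and the default candidate path is simply the Windows location used later in this tutorial; other systems will differ.

```python
import os
import shutil


def find_tesseract(candidates=(r"C:\Program Files\Tesseract-OCR\tesseract.exe",)):
    """Return a path to the Tesseract executable, or None if it is not found."""
    # First look on the PATH, which works when the installer registered it
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise fall back to a list of known install locations
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if find_tesseract() is None:
    print("Tesseract executable not found; install it or adjust the path.")
```

The returned path can then be assigned to `pytesseract.tesseract_cmd` instead of a hard-coded string.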

app.py: Save the code below as app.py. It opens an image and prints the text that pytesseract extracts from it.

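from PIL import Image
from pytesseract import pytesseract

# Location of the Tesseract executable (adjust for your installation)
path_to_tesseract = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image_path = r"csv\sample_text.png"

# Tell pytesseract where to find the Tesseract executable
pytesseract.tesseract_cmd = path_to_tesseract

# Open the image and store it in an image object
img = Image.open(image_path)

# Run OCR on the image; image_to_string() returns the recognised text
text = pytesseract.image_to_string(img)

# Print the extracted text without leading or trailing whitespace
print(text.strip())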
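The string returned by image_to_string() can carry extra whitespace, such as blank lines or a trailing form-feed page separator, depending on the Tesseract version and the image. A small clean-up helper is one way to tidy it before further use. The function name `clean_ocr_text` is an assumption for this sketch, and it works on any string, so it does not need Tesseract to try out.

```python
def clean_ocr_text(raw):
    """Strip each line of OCR output and drop lines that are empty."""
    # splitlines() also splits on form feeds, so a page separator
    # simply becomes an empty line that is filtered out below
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


sample = "Hello  \n\n World\n\x0c"
print(clean_ocr_text(sample))
```

In the script, this could be called as `clean_ocr_text(text)` on the value returned by image_to_string().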

To execute the code from the command line, run python3 app.py (or python app.py on Windows). The extracted text is printed to the terminal.
