Extract string from image using pytesseract
I am new to OCR and to extracting data from images. After searching for a solution I found some code, but it did not work for my use case: it did not extract all the characters correctly, only 2 of them at most.
I want to get the characters on this image:
I tried this solution:
import cv2
import pytesseract

image = cv2.imread('./images/screenshot_2023_11_16_15_41_24.png')
# Assuming 4 characters in a 36x9 image
char_width = image.shape[1] // 4
char_height = image.shape[0]
characters = []
characters_slices = [(0, 9), (9, 18), (18, 27), (27, 36)]  # Adjust based on your image
for start, end in characters_slices:
    # Crop one character: all rows, columns start..end
    char = image[0:char_height, start:end]
    characters.append(char)
# Perform OCR on each character
extracted_text = ""
# --psm 10 treats each crop as a single character;
# the Tesseract whitelist variable is tessedit_char_whitelist
config = '--psm 10 --oem 3 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
for char in characters:
    char_text = pytesseract.image_to_string(char, config=config)
    extracted_text += char_text.strip() + " "
print("Extracted Text:", extracted_text)
The expected output is: 'H9FA'
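A side note on the slicing step: the column boundaries in my code are hard-coded, but they could probably be derived from the image width instead. Here is a minimal sketch, assuming the characters are all equally wide. The helper name equal_slices is hypothetical and not part of cv2 or pytesseract.

```python
def equal_slices(total_width, n_chars):
    """Split total_width columns into n_chars (start, end) ranges of equal width."""
    if n_chars <= 0:
        raise ValueError("n_chars must be positive")
    char_width = total_width // n_chars
    slices = [(i * char_width, (i + 1) * char_width) for i in range(n_chars)]
    # Let the last slice absorb any leftover columns when the width
    # is not evenly divisible by the number of characters.
    slices[-1] = (slices[-1][0], total_width)
    return slices


# For a 36-pixel-wide image with 4 characters this reproduces the
# hard-coded boundaries used in the code above.
print(equal_slices(36, 4))
```

With the image loaded, this would be called as equal_slices(image.shape[1], 4). It only helps if the glyphs really are evenly spaced; otherwise the boundaries still need manual adjustment.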
Thanks.