You are welcome to read our older blog posts at I4INFO.
Sometimes in a CTF you get problems where you have to solve a Captcha. Today I thought I would write about Captcha bypass, a simple trick using pytesser and Tesseract OCR. The trick I use is really simple, and it lets you read the string from some simple Captchas.
If you would like to download pytesser, go to this link.
Tesseract
Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been sponsored by Google since 2006. Tesseract is considered one of the most accurate open source OCR engines currently available.
To install tesseract-ocr on a Debian-based system such as Ubuntu:
sudo apt-get install tesseract-ocr
To extract the text from an image on the command line (tesseract writes the result to output_text.txt):
tesseract input_image_file output_text
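The same command can also be driven from a script. Here is a minimal sketch, assuming the tesseract binary is installed and on the PATH. The helper names build_command, output_path and ocr_with_cli are made up for this example and are not part of tesseract. It relies on tesseract appending .txt to the output name you give it.

```python
import subprocess


def build_command(image_path, output_base):
    """Return the argument list for: tesseract <image> <output_base>."""
    return ["tesseract", image_path, output_base]


def output_path(output_base):
    """tesseract appends .txt to the output base name it is given."""
    return output_base + ".txt"


def ocr_with_cli(image_path, output_base="output_text"):
    """Run tesseract on image_path and return the recognised text."""
    # check_call raises CalledProcessError if tesseract exits with an error.
    subprocess.check_call(build_command(image_path, output_base))
    with open(output_path(output_base)) as handle:
        return handle.read().strip()
```

With a Captcha saved as captcha.png (a placeholder name), calling ocr_with_cli("captcha.png") should give back the recognised text as a string.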
Now we will try OCR on a single image using Python.
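from PIL import Image
from pytesser import image_file_to_string

image_file = "captcha.png"  # placeholder: path to the Captcha image
im = Image.open(image_file)  # raises if the file cannot be opened as an image
# graceful_errors=True lets pytesser convert an incompatible image and retry
text = image_file_to_string(image_file, graceful_errors=True)
print("=====output=======\n")
print(text)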
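The string that comes back from OCR usually has trailing newlines, and sometimes stray spaces or punctuation, so it helps to clean it before submitting it as the Captcha answer. Here is a small sketch of that step, assuming the Captcha only contains ASCII letters and digits. The function name clean_captcha_text is made up for this example and is not part of pytesser.

```python
import string

# Assumption: the Captcha contains only ASCII letters and digits.
ALLOWED = set(string.ascii_letters + string.digits)


def clean_captcha_text(raw):
    """Drop whitespace and any character outside the allowed set."""
    return "".join(ch for ch in raw if ch in ALLOWED)
```

If a Captcha uses a different alphabet (for example digits only), ALLOWED can be narrowed to match, which also filters out some misread characters.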
Well, passing the image to pytesser is the simplest way to do this. I will try to update this post.
Thank you for reading the blog!