OCR challenge: Catch me if you can! [Affinity CTF lite 2020] in Web Hacking Category

0x0elliot (Elliot)
Nov 17, 2020


A fun challenge which required luck. Was more of a misc challenge than a web exploitation challenge but I had fun anyway.

Recently my little team, Gateway (we are just a group of try-hards and actually really smart players who love fucking around with tech), and I played Affinity CTF Lite 2020. We weren’t supposed to play it, but one thing led to another, I had enough time, and so we did it anyway.

Back then the challenge was worth 750 points. The points decreased as more people solved it. By the time I finished, we got 650 or so points out of it, which took us from rank 71 to rank 50-something.

This is what the challenge looked like:

The text always came in the form of an image and it seemed to be random.

The objective was to type the text, which changed every 15 seconds or so (I had to count that lmao), into the “insert text here” bar. Easy, right? Yes, but the catch (pun intended) was that the text loaded as an image. You couldn’t just copy-paste it.

After some inspection, I felt like there was no direct hack there. I intercepted the request and all that but got nothing useful out of it. So to solve this problem I used OCR (optical character recognition).

I haven’t messed with OCR much before but now it was necessary. So I put my programming socks on, sat the fuck down and programmed while my teammates solved other problems with the same focused attitude.

How it was solved:

I wrote a script (don’t worry, you will find the link to it in here) using the Selenium Python WebDriver (the Chrome one, which allowed me to automate my browser) to open the website. Then I made it grab the source of the image. On printing the source, I realised it was nothing but the base64 representation of the image. I then added a step to my script to decode the base64 data and save it as an image file.
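To make that decode-and-save step concrete, here is a minimal sketch. It is not the code from my script; `save_data_uri` is a hypothetical helper name, and it assumes the image source looks like a standard `data:image/png;base64,...` URI (or a bare base64 string).

```python
import base64
from pathlib import Path


def save_data_uri(src: str, out_path: str) -> bytes:
    """Decode a base64 data URI (or a bare base64 string) and write it to out_path."""
    # A data URI looks like "data:image/png;base64,<payload>".
    header, sep, payload = src.partition(",")
    if not sep:
        # No comma found: assume the whole string is the base64 payload.
        payload = src
    elif "base64" not in header:
        raise ValueError("src is not a base64-encoded data URI")
    raw = base64.b64decode(payload)
    Path(out_path).write_bytes(raw)
    return raw
```

With Selenium 4, the `src` string would come from something like `driver.find_element(By.TAG_NAME, "img").get_attribute("src")`. Which element actually held the image on the challenge page is an assumption here.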

After that, it was easy to use Google’s Tesseract OCR engine from my Python script to read the text from the image. The problem, though, was that it wasn’t accurate enough.

This screen haunted me for hours

And every time I got the steps wrong, this came up. And trust me, I saw this a lot. So I read up on how to make OCR more accurate. Turns out binarisation, noise filtering, sharpening and all that jazz help OCR be more accurate. I threw all that together in the end and came up with this script: https://github.com/GatewayFolks/WriteUps/blob/main/Affinity%20CTF%20Lite%202020/Web/Catch%20Me%20If%20You%20Can/catch.py

I have commented it as much as I could to make the process easier.
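If you just want a feel for what two of those preprocessing steps do, here is a tiny pure-Python sketch on a grayscale grid (a list of rows, values 0 to 255). This is not what the linked script does line for line; `median_filter` and `binarise` are hypothetical helper names, the threshold of 128 is an arbitrary assumption, and sharpening is left out to keep it short.

```python
from statistics import median


def median_filter(img):
    """3x3 median filter: removes isolated specks. Border pixels are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def binarise(img, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


# A light background with one dark speck of noise in the middle.
noisy = [[200] * 5 for _ in range(5)]
noisy[2][2] = 0
clean = binarise(median_filter(noisy))
```

In a real script you would let an imaging library do this. In Pillow, for example, `ImageFilter.MedianFilter`, `ImageFilter.SHARPEN` and `Image.point` cover noise filtering, sharpening and thresholding.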

My script didn’t work for a long time. I experimented with a lot of things and that piece of code is just a child of that. A monstrous child. It’s like a child with three arms and six legs. Still, I love it.

But then, out of nowhere, after hours of fixing my code and adding little tweaks to it, Python FINALLY read it and I saw this wonderful screen:

This was the flag :)

Honestly felt amazing. I remember texting everyone in our discord server: “I DID IT MOTHERFUCKERS.”

In the end, our team ended up in 23rd place out of 690 (Nice) teams. It was a fun experience and I loved it.

Speaking of our Discord, here is the link to where I hang out on Discord :) https://discord.gg/X5QQrGB73v

This is where our team hangs out!

If you liked this write-up, then consider checking me out on my socials and saying hi :)

Oh also, this was my reaction when I finally did it: