I think it wouldn't be too much trouble to build a robot that could push a mouse around to find the square and click it. Some of them then move on to an image analysis step where you need to pick out the images with cars or stop lights or whatever in them. That would be harder.
