You need to divide your image into a grid of rectangles that have the same width-to-height proportions as one character of a fixed-width font. Let's call each of these rectangles a "cell".
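As a rough illustration of the cell split, here is a minimal sketch in plain Python. It assumes the image is already available as a 2D list of grayscale values. The function name `split_into_cells` and the tiny 2x4 cell size are made up for the example; a real cell would match your font, for instance 8x16 pixels.

```python
def split_into_cells(pixels, cell_w, cell_h):
    """Split a 2D list of grayscale values into a grid of cells.

    Each cell is cell_h rows by cell_w columns. Rows and columns at the
    right and bottom that do not fill a whole cell are dropped.
    """
    rows = len(pixels) // cell_h
    cols = len(pixels[0]) // cell_w if pixels else 0
    grid = []
    for r in range(rows):
        grid_row = []
        for c in range(cols):
            # Slice out the rows of this cell, then the columns of each row.
            cell = [
                row[c * cell_w:(c + 1) * cell_w]
                for row in pixels[r * cell_h:(r + 1) * cell_h]
            ]
            grid_row.append(cell)
        grid.append(grid_row)
    return grid


# A 5-wide, 4-tall image split into cells that are taller than wide,
# like a character in a fixed-width font. The fifth column is dropped.
image = [
    [0, 0, 9, 9, 7],
    [0, 0, 9, 9, 7],
    [5, 5, 1, 1, 7],
    [5, 5, 1, 1, 7],
]
cells = split_into_cells(image, cell_w=2, cell_h=4)
print(len(cells), len(cells[0]))  # prints: 1 2
```

Each entry of `cells` is then one candidate position for a single output character.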
You will also need a map of every candidate character, with each character rendered as an image of the same "cell" size.
Then you will need to do a fuzzy match of each image cell against the character cells. The character that matches most closely is selected for the output. There are many ways of measuring how well two cells match, and this is where the problem overlaps with computer vision research.
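To make the character map and the fuzzy match concrete, here is a small sketch. Everything in it is an assumption for illustration: the 3x3 glyph bitmaps are drawn by hand rather than rendered from a real font, and the match uses the sum of squared differences, which is only one of many possible similarity measures.

```python
# Hypothetical hand-drawn 3x3 glyph bitmaps (1 = ink, 0 = background).
# A real converter would render each character of a fixed-width font
# into the cell instead of using these.
GLYPHS = {
    " ": ["000", "000", "000"],
    "-": ["000", "111", "000"],
    "|": ["010", "010", "010"],
    "/": ["001", "010", "100"],
    "\\": ["100", "010", "001"],
    "#": ["111", "111", "111"],
}

# Flatten every glyph into a list of floats once, up front.
GLYPH_VECTORS = {
    ch: [float(bit) for row in rows for bit in row]
    for ch, rows in GLYPHS.items()
}


def best_match(cell):
    """Return the character whose glyph is closest to the cell.

    cell is a 2D list of values in [0, 1] with the same size as the
    glyphs. Closeness is the sum of squared differences (smaller is
    better).
    """
    flat = [v for row in cell for v in row]

    def distance(ch):
        return sum((a - b) ** 2 for a, b in zip(flat, GLYPH_VECTORS[ch]))

    return min(GLYPH_VECTORS, key=distance)


# A noisy diagonal stroke running from bottom-left to top-right.
noisy_diagonal = [
    [0.1, 0.0, 0.9],
    [0.0, 0.8, 0.2],
    [0.7, 0.1, 0.0],
]
print(best_match(noisy_diagonal))  # prints: /
```

The match is "fuzzy" because the cell never has to equal a glyph exactly; the nearest glyph wins even when the pixels are noisy.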
The most powerful techniques detect edges and reduce the complexity of the image, so that it is closer to a line drawing before a character match is attempted. This is because the best ASCII art is closer to line art than to photographic art.
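As one possible way to do the edge detection step, here is a sketch using the Sobel operator in plain Python. The choice of Sobel, the function name `sobel_edges`, and the threshold value are assumptions for this example; other edge detectors would serve the same purpose.

```python
def sobel_edges(pixels, threshold):
    """Return a binary edge map (1 = edge) from a 2D grayscale list.

    Uses the Sobel gradient magnitude. Border pixels are left as 0
    because the 3x3 kernel does not fit there.
    """
    h, w = len(pixels), len(pixels[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (pixels[y - 1][x + 1] + 2 * pixels[y][x + 1] + pixels[y + 1][x + 1]
                  - pixels[y - 1][x - 1] - 2 * pixels[y][x - 1] - pixels[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (pixels[y + 1][x - 1] + 2 * pixels[y + 1][x] + pixels[y + 1][x + 1]
                  - pixels[y - 1][x - 1] - 2 * pixels[y - 1][x] - pixels[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# A 6-wide, 5-tall image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_edges(image, threshold=128)
for row in edges:
    print("".join("#" if v else "." for v in row))
```

The resulting edge map can then be cut into cells and matched against line-like characters such as `|`, `-`, `/` and `\`, instead of matching raw brightness.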