Python regex doesn't find matches when using line anchors

Question

I have the following function which I'm using to find SHA256 hashes in a body of text:

    import re

    sample_data = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 sometext 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 sometext aec070645fe53ee3b3763059376134faaaa5ddff058cc337247c978add178b6ccdfb0019f"

    def parse_hash(input_str):
        # Anchored pattern: this is the version that returns no matches
        hashes = re.findall(r"^[0-9a-fA-F]{64}$", input_str)
        return hashes

    parse_hash(sample_data)

Without the ^ and $ anchors the pattern does find matches, but it can also match the first 64 characters of a much longer string. On the sample data, using the anchors gives no matches, while omitting them gives three matches, which is incorrect: the third "hash" in the sample data is longer than 64 characters and shouldn't be included.

I can't understand how to do this. Maybe I've misunderstood the purpose of ^$ in regex?

`$` is end of the LINE, your line has more data after your 64 hex — Nullman, Apr 06 '20 at 09:29
Use `\b`, word boundary. `re.findall(r"\b[0-9a-fA-F]{64}\b", input_str)` — Wiktor Stribiżew, Apr 06 '20 at 09:29
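
To make the two comments above concrete, here is a rough sketch run against the sample data from the question. The names `HEX64`, `anchored`, `unanchored`, `bounded`, `strict` and `per_line` are only illustrative. Without `re.MULTILINE`, `^` and `$` anchor to the start and end of the whole string, so the anchored pattern can only match if the entire input is exactly one 64-character hash. The lookaround variant is an alternative I am assuming may be useful, since `\b` treats letters, digits and `_` as word characters and so also rejects a hash that sits directly next to something like `_` or `g`.

```python
import re

sample_data = (
    "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 sometext "
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 sometext "
    "aec070645fe53ee3b3763059376134faaaa5ddff058cc337247c978add178b6ccdfb0019f"
)

HEX64 = "[0-9a-fA-F]{64}"

# ^...$ must span the entire string here, so nothing matches.
anchored = re.findall(rf"^{HEX64}$", sample_data)

# No anchors: also picks up the first 64 characters of the over-long third token.
unanchored = re.findall(HEX64, sample_data)

# Word boundaries: the over-long token is rejected because its 65th
# character is still a word character.
bounded = re.findall(rf"\b{HEX64}\b", sample_data)

# Lookarounds: only require that the neighbours are not hex digits.
strict = re.findall(rf"(?<![0-9a-fA-F]){HEX64}(?![0-9a-fA-F])", sample_data)

# If the input had one hash per line, re.MULTILINE would make ^ and $
# match at each line start and end instead.
one_per_line = "\n".join(bounded)
per_line = re.findall(rf"^{HEX64}$", one_per_line, flags=re.MULTILINE)

print(len(anchored), len(unanchored), len(bounded), len(strict), len(per_line))
```

Which of `bounded` or `strict` fits better depends on what can surround a hash in the real input: `\b` is shorter, while the lookarounds only care about adjacent hex digits.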

0 Answers