I see some confusion in the way you've written your code up to this point, and I'd like to make some suggestions that might help.
Collect your matches in one place, first
To avoid such deeply nested code/logic I recommend iterating the lines of the input file and finding and saving matches as a first step:
import re
date_pattern = r"\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}"
matches: list[re.Match[str]] = []
with open("input.txt") as f:
    for line in f:
        m = re.search(date_pattern, line)
        if m is not None:
            matches.append(m)
Also, you don't need to escape the hyphens in the regex. Outside a character class (square brackets), a hyphen is an ordinary literal, so the following two regexes are equivalent:
r"\d{4}-\d{2}"
r"\d{4}\-\d{2}"
For a good look at hyphens in regexes, I recommend checking out this short answer.
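To make the hyphen point concrete, here is a small sketch. The `sample` string is just the mock timestamp line reused for illustration, and the character-class strings are made up for the demo:

```python
import re

# Hypothetical sample line, reusing the mock input's timestamp format.
sample = "Current date & time : 2023-07-22_23-02-09"

unescaped = re.search(r"\d{4}-\d{2}", sample)
escaped = re.search(r"\d{4}\-\d{2}", sample)

# Outside a character class, both patterns match the same text.
same_match = unescaped.group() == escaped.group()
print(unescaped.group(), escaped.group(), same_match)

# Inside a character class, an unescaped hyphen between two characters
# forms a range, so there the escape does change the meaning.
range_class = re.findall(r"[a-c]", "a-c b")     # range 'a' through 'c'
literal_class = re.findall(r"[a\-c]", "a-c b")  # literal 'a', '-', 'c'
print(range_class, literal_class)
```

So the escape only matters inside square brackets, which your date pattern doesn't use.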
I also like type hints, so I've type-hinted my matches list.
Review your matches (just to debug, understand what the code does)
I mocked up some input text:
Foo
Bar
Current date & time : 2023-07-22_23-02-09
Baz
2012-12-12_13-13-13
To see my matches after running the previous code:
for m in matches:
    print(f"m={m}; m.group()={m.group()}")
# m=<re.Match object; span=(22, 41), match='2023-07-22_23-02-09'>; m.group()=2023-07-22_23-02-09
# m=<re.Match object; span=(0, 19), match='2012-12-12_13-13-13'>; m.group()=2012-12-12_13-13-13
You tried using Match.groupdict(), which only returns named subgroups. As is, your regex has no subgroups at all (so no named subgroups), and groupdict() just gives you an empty dict. Instead, you can use Match.group() to get the matched text.
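If you're curious what groupdict() would need in order to be useful, here is a rough sketch. The group names `date` and `time` are my own invention for illustration; nothing in your original pattern defines them:

```python
import re

# The original pattern, with no groups at all.
plain_pattern = r"\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}"
# Hypothetical variant with two named groups, "date" and "time".
named_pattern = r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})"

line = "Current date & time : 2023-07-22_23-02-09"

plain = re.search(plain_pattern, line)
named = re.search(named_pattern, line)

print(plain.groupdict())  # empty dict: there are no named groups to report
print(plain.group())      # the whole matched text
print(named.groupdict())  # one entry per named group
```

Unless you actually want the date and time as separate fields, the plain pattern with group() is simpler.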
Create your CSV based on matches
Since you don't need (cannot use) groupdict, I don't see the need for csv.DictWriter.
I think the list-based writer will easily give you your desired result (which I actually don't know, but think I have a good-enough idea of):
import csv
with open("output1.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["currenttime", "output"])
    for m in matches:
        writer.writerow([m.group(), "???"])
I think you can just write the header row directly, then use m.group() to create the following rows. That output looks like:
| currenttime         | output |
|---------------------|--------|
| 2023-07-22_23-02-09 | ???    |
| 2012-12-12_13-13-13 | ???    |
I also saw you explicitly making counting variables, count and a. For either the read or write steps, I recommend using Python's enumerate() built-in, maybe something like:
# store the line number (int) and the match found on that line
matched_lines: list[tuple[int, re.Match[str]]] = []
with open("input.txt") as f:
    for i, line in enumerate(f, start=1):
        # keep the match in its own name rather than overwriting `line`
        m = re.search(date_pattern, line)
        if m is not None:
            matched_lines.append((i, m))
with open("output2.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Original line", "Current time"])
    for line_no, m in matched_lines:
        writer.writerow([line_no, m.group()])
| Original line | Current time |
|---------------|---------------------|
| 3 | 2023-07-22_23-02-09 |
| 5 | 2012-12-12_13-13-13 |
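If you want to try the whole read-then-write flow without touching any files, something like this sketch might help. The function names `find_timestamps` and `write_rows` are hypothetical, and I'm using io.StringIO as a stand-in for input.txt and the output CSV:

```python
import csv
import io
import re
from collections.abc import Iterable

DATE_PATTERN = r"\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}"


def find_timestamps(lines: Iterable[str]) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) for each line containing a timestamp."""
    found: list[tuple[int, str]] = []
    for line_no, line in enumerate(lines, start=1):
        m = re.search(DATE_PATTERN, line)
        if m is not None:
            found.append((line_no, m.group()))
    return found


def write_rows(rows: list[tuple[int, str]], out) -> None:
    """Write a header row, then one CSV row per (line_number, timestamp) pair."""
    writer = csv.writer(out)
    writer.writerow(["Original line", "Current time"])
    writer.writerows(rows)


# In-memory stand-in for input.txt, using the mock text from above.
mock_input = io.StringIO(
    "Foo\nBar\nCurrent date & time : 2023-07-22_23-02-09\nBaz\n2012-12-12_13-13-13\n"
)
rows = find_timestamps(mock_input)

# In-memory stand-in for the output file; newline="" mirrors the open() calls above.
buffer = io.StringIO(newline="")
write_rows(rows, buffer)
print(buffer.getvalue())
```

Splitting the search and the write into two small functions keeps each step flat, and you can swap the StringIO objects for real open() calls once it does what you want.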