- I can print 2 columns of a pandas data frame like this
- How do I format a row-by-row print?
- Here is my "ugly" solution followed by what I had expected to work
import pandas
def date_normalization(data: pandas.core.frame.DataFrame) -> None:
    # EDIT: add completed code
    # convert to desired date format
    data[normalized] = pandas.to_datetime(
        data[original],
        errors="coerce",
    ).dt.strftime('%d/%m/%Y')
original = "start"
normalized = "normalized"
data = pandas.DataFrame({
    original: {
        0: "AUG 26 2016",
        1: "JAN-FEB 2021",
        2: "2017-06-01 00:00:00",
    }
})
date_normalization(data)
# remove rows with invalid date
data = data[data[normalized].notnull()]
# arrggghh ... this is working, but ugly ...
for i, before in enumerate(data[original]):
    for j, after in enumerate(data[normalized]):
        if i == j:
            print(f"row {i}: {before} -> {after}")
print("\n")
# surprisingly (?) this doesn't work
for row in data:
    print(f"{row[original]} -> {row[normalized]}")
Here is the error I get for the second try:
row 0: AUG 26 2016 -> 26/08/2016
row 1: 2017-06-01 00:00:00 -> 01/06/2017
Traceback (most recent call last):
  File "/home/oren/Downloads/GGG/main.py", line 36, in <module>
    print(f"{row[original]} -> {row[normalized]}")
TypeError: string indices must be integers
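A note on the error: iterating over a DataFrame directly yields its column labels (plain strings here), not its rows, which is why `row[original]` ends up indexing a string. Below is a minimal sketch of one way to print row by row, pairing the two columns with `zip`. The sample frame is a hypothetical stand-in for the already-filtered data, and `format_rows` is an illustrative helper name, not part of pandas.

```python
import pandas

# Hypothetical stand-in for the filtered frame (the two rows that survive)
data = pandas.DataFrame({
    "start": ["AUG 26 2016", "2017-06-01 00:00:00"],
    "normalized": ["26/08/2016", "01/06/2017"],
})

# Iterating a DataFrame directly yields column labels, not rows
assert list(data) == ["start", "normalized"]


def format_rows(frame, before_col, after_col):
    """Return one formatted line per row, pairing the two columns."""
    paired = zip(frame[before_col], frame[after_col])
    return [
        f"row {i}: {before} -> {after}"
        for i, (before, after) in enumerate(paired)
    ]


lines = format_rows(data, "start", "normalized")
print("\n".join(lines))
```

If the original index labels are needed rather than a fresh counter, `data.iterrows()` yields `(index, row)` pairs where `row[original]` works as expected, and `data.itertuples()` is usually the faster option for larger frames.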