I would start by reading the main list of books into a dict, keyed to a book's ID:
import csv
books: dict[str, list[str]] = {}
with open("books.csv", newline="", encoding="utf-8") as f:
reader = csv.reader(f)
header = next(reader)
assert header == ["ID", "Title", "Author"], f"{header} != {['ID', 'Title', 'Author']}"
for row in reader:
books[row[0]] = row
After reading this CSV:
ID,Title,Author
1,A Closed and Common Orbit,Becky Chambers
2,The Clan of the Cave Bear,Jean M. Auel
3,Sum,David Eagleman
4,the 13 1/2 lives of Captain Bluebear,Walter Moers
the dict books looks like:
{
"1": ["1", "A Closed and Common Orbit", "Becky Chambers"],
"2": ["2", "The Clan of the Cave Bear", "Jean M. Auel"],
"3": ["3", "Sum", "David Eagleman"],
"4": ["4", "the 13 1/2 lives of Captain Bluebear", "Walter Moers"],
}
Then I'd read the loans CSV into a separate dict, and parse the checked-out and returned timestamps into real date values (it will make the math and stats, later on, easier and more accurate):
from datetime import date, datetime, timedelta
DATE_FMT = r"%Y/%m/%d"
loans: dict[str, list[tuple[date, date]]] = {}
with open("book_loans.csv", newline="", encoding="utf-8") as f:
reader = csv.reader(f)
for row in reader:
book_id = row[0]
checked_out = datetime.strptime(row[1], DATE_FMT).date()
returned = datetime.strptime(row[2], DATE_FMT).date()
if book_id not in loans:
loans[book_id] = []
loans[book_id].append((checked_out, returned))
After reading this loans CSV:
1,2000/01/01,2000/01/13
2,2000/01/01,2000/01/10
3,2000/01/04,2000/01/16
2,2000/01/11,2000/02/01
3,2000/01/12,2000/01/13
1,2000/01/12,2000/01/26
The loans dict looks like:
{
"1": [(date(2000, 1, 1), date(2000, 1, 13)), (date(2000, 1, 12), date(2000, 1, 26))],
"2": [(date(2000, 1, 1), date(2000, 1, 10)), (date(2000, 1, 11), date(2000, 2, 1))],
"3": [(date(2000, 1, 4), date(2000, 1, 16)), (date(2000, 1, 12), date(2000, 1, 13))],
}
a book ID that keys to a list of pairs of dates (checked-out and returned for each pair).
Note: book 4, Capt. Bluebear, was never loaned.
Now, with the complete list of all available books and a list of loan dates for each book (that was actually loaned), you can apply whatever logic/math you need.
I went for:
for book_id, row in books.items():
title = row[1]
author = row[2]
if book_id not in loans:
print(f"'{title}' by {author} was never loaned out. ")
continue
loan_dates = loans[book_id]
first_checked_out = date.max
last_returned = date.min
loan_durations: list[timedelta] = []
for checked_out, returned in loan_dates:
if checked_out < first_checked_out:
first_checked_out = checked_out
if returned > last_returned:
last_returned = returned
loan_durations.append(returned - checked_out)
# https://stackoverflow.com/a/3617540/246801, give sum() a starting value of timedelta(0)
avg_loan = sum(loan_durations, timedelta(0)) / len(loan_durations)
print(
f"'{title}', by {author}, was loaned {len(loan_durations)} times. It was first checked out on {first_checked_out} and last returned on {last_returned}. The average loan was {avg_loan.days} days."
)
which prints:
'A Closed and Common Orbit', by Becky Chambers, was loaned 2 times. It was first checked out on 2000-01-01 and last returned on 2000-01-26. The average loan was 13 days.
'The Clan of the Cave Bear', by Jean M. Auel, was loaned 2 times. It was first checked out on 2000-01-01 and last returned on 2000-02-01. The average loan was 15 days.
'Sum', by David Eagleman, was loaned 2 times. It was first checked out on 2000-01-04 and last returned on 2000-01-16. The average loan was 6 days.
'the 13 1/2 lives of Captain Bluebear' by Walter Moers was never loaned out.