I'm really struggling to read some messy string data from a CSV file into a DataFrame. I'm working with Python 3.8.5 and pandas 1.1.3.
My CSV file contains integers, dates, and strings enclosed in double quote (") characters, and some of those strings contain commas. The header and first few rows are shown in the edit at the end of this post.
Importing pandas as pd and reading the file like this:
import pandas as pd
import csv
myfilename = 'Input.csv'
myexpenses = pd.read_csv(myfilename, skipinitialspace=True, quotechar='"', quoting=csv.QUOTE_ALL)
consistently gives me:
ParserError: Error tokenizing data. C error: Expected 37 fields in line 3, saw 42
This appears to be because each data row contains 41 commas: 36 separating the 37 fields, plus 5 inside the quoted strings, so a parser that ignores the quotes sees 42 fields. I'm not sure why the quotechar argument isn't handling these.
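To sanity-check that arithmetic, here is a minimal sketch that uses only the standard csv module, not pandas. It assumes the header and first data row are exactly as shown in the edit at the end of this post; the names SAMPLE and field_counts are just illustrative. It compares a naive split on commas with a quote-aware parse:

```python
import csv
import io

# Header and first data row, copied from the sample at the end of this post.
SAMPLE = """Date,Employee,Project,Expense,Description,Units,Cost Rate,Cost Amount,Markup %,Charge Amount,Billable,Billed Status,Submit Status,Approved By,Reimbursable,Paid,Paid Date,Income Account,Expense Account,Class,Country,Purchase Tax Rate,Extra,Tax1 %,Tax2 %,Tax3 %,Credit Card,Check Number,Vendor Bill Number,Invoice Number,Client,Attachments,Memo,Created By,Created On,Last Updated By,Last Updated
8/27/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Meals:Project Meals,Meals,1,43.64,43.64,0,43.64,True,Un Billed,Un-Submitted,,False,False,8/27/2021,,,Meals-Project,U.S. Dollar,0,False,0,0,0,1234-56 - Lastname - 1234,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021
"""


def field_counts(text):
    """Return (naive, quote_aware) lists of per-line field counts."""
    # Naive count: split every line on commas, ignoring quotes entirely.
    naive = [len(line.split(",")) for line in text.splitlines()]
    # Quote-aware count: csv.reader keeps commas inside "..." in one field.
    reader = csv.reader(io.StringIO(text), quotechar='"')
    aware = [len(record) for record in reader]
    return naive, aware


naive_counts, aware_counts = field_counts(SAMPLE)
print("naive split :", naive_counts)
print("quote-aware :", aware_counts)
```

On this sample the naive split reports 42 fields for the data row, which matches the error message, while the quote-aware parse reports 37. If the real file behaves differently from the pasted sample, that would suggest the raw bytes differ from what is shown here (for example, different quote characters or extra quoting around whole lines), but that is only a guess.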
How can I handle these strings as a single field entry?
I've probably tried a dozen different tweaks of this function based on other answers, including changing arguments such as quoting=csv.QUOTE_ALL, without result.
Do I need to involve regular expressions?
Any help would be greatly appreciated - thank you in advance.
EDIT: Added full code. For those asking, these are the columns and first five entry lines of the CSV file:
Date,Employee,Project,Expense,Description,Units,Cost Rate,Cost Amount,Markup %,Charge Amount,Billable,Billed Status,Submit Status,Approved By,Reimbursable,Paid,Paid Date,Income Account,Expense Account,Class,Country,Purchase Tax Rate,Extra,Tax1 %,Tax2 %,Tax3 %,Credit Card,Check Number,Vendor Bill Number,Invoice Number,Client,Attachments,Memo,Created By,Created On,Last Updated By,Last Updated
8/27/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Meals:Project Meals,Meals,1,43.64,43.64,0,43.64,True,Un Billed,Un-Submitted,,False,False,8/27/2021,,,Meals-Project,U.S. Dollar,0,False,0,0,0,1234-56 - Lastname - 1234,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021
8/27/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Auto Expense:Fuel,Fuel,1,29.41,29.41,0,29.41,True,Un Billed,Un-Submitted,,False,False,8/27/2021,,,Fuel,U.S. Dollar,0,False,0,0,0,1234-56 - Lastname - 1234,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021
8/27/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Airfare:Flight,Plane Ticket,1,658.4,658.4,0,658.4,True,Un Billed,Un-Submitted,,True,False,8/27/2021,,,Flight,U.S. Dollar,0,False,0,0,0,,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021
8/26/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Meals:Project Meals,Meals,1,32.28,32.28,0,32.28,True,Un Billed,Un-Submitted,,False,False,8/26/2021,,,Meals-Project,U.S. Dollar,0,False,0,0,0,1234-56 - Lastname - 1234,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021
8/26/2021,"Lastname, Firstname","2021-123 - Bob & Bob (Bob's - New York, NY) Project - Support",Meals:Project Meals,Meals,1,6.58,6.58,0,6.58,True,Un Billed,Un-Submitted,,False,False,8/26/2021,,,Meals-Project,U.S. Dollar,0,False,0,0,0,1234-56 - Lastname - 1234,,,,"Bob & Bob, Inc.",1,,"Lastname, Firstname",8/27/2021,"Lastname, Firstname",8/27/2021