I have a list of tuples, each pairing an OrderedDict with a label string. It looks like this, but the actual list contains ~22,000 elements:
from collections import OrderedDict
o_dict_list = [(OrderedDict([('StreetNamePreType', 'ROAD'), ('StreetName', 'Coffee')]), 'Ambiguous'),
               (OrderedDict([('StreetNamePreType', 'AVENUE'), ('StreetName', 'Washington')]), 'Ambiguous'),
               (OrderedDict([('StreetNamePreType', 'ROAD'), ('StreetName', 'Quartz')]), 'Ambiguous')]
When I try to convert the entire list to a Pandas DataFrame, using the question and solution noted here, I get the following error:
IndexError: string index out of range
For reference, the line of code that is causing the error is here:
pd.DataFrame([o_dict_list[i][0] for i, j in enumerate(o_dict_list)])
When I trim the list down to 1,000 elements, the DataFrame populates with no issue. The error only occurs when I use the entire list of ~22K elements.
I am using:
Python 3.6.5 :: Anaconda, Inc.
pandas==0.23.0
numpy 1.15.2
on a Windows 10 machine.
Does anyone know why I get the IndexError when I use the list of ~22K elements?
Update: As noted below, I was able to resolve this issue by breaking the list into smaller pieces and testing each one. Doing so let me find the part of the list that was causing the code to fail. Thanks for the help.
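For anyone hitting the same error, here is a minimal sketch of one way to locate a bad element directly instead of splitting the list by hand. It assumes the failure comes from an element that is not an (OrderedDict, label) tuple. For instance, indexing an empty string with [0] raises exactly this "string index out of range" error. The helper name find_bad_entries and the sample data are hypothetical and not taken from my real list:

```python
from collections import OrderedDict
from collections.abc import Mapping


def find_bad_entries(records):
    """Return (index, entry) pairs that are not (mapping, label) tuples."""
    bad = []
    for i, entry in enumerate(records):
        ok = (
            isinstance(entry, tuple)
            and len(entry) == 2
            and isinstance(entry[0], Mapping)
        )
        if not ok:
            bad.append((i, entry))
    return bad


# Hypothetical sample: the empty string stands in for a malformed element.
sample = [
    (OrderedDict([('StreetNamePreType', 'ROAD'), ('StreetName', 'Coffee')]), 'Ambiguous'),
    '',
    (OrderedDict([('StreetNamePreType', 'AVENUE'), ('StreetName', 'Washington')]), 'Ambiguous'),
]

bad_entries = find_bad_entries(sample)
print(bad_entries)
```

Running find_bad_entries(o_dict_list) on the full list would report the positions of any elements that do not match the expected shape, which can then be inspected or filtered out before building the DataFrame.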