I have a CSV file whose rows have a variable number of columns (and no column headers). For example, the file could begin with some rows of 23 columns, followed by rows of 83 columns, and so on. When `read_csv()` starts reading the file, it guesses the number of columns from the first few rows (I think), so if the rows at the beginning are shorter than those at the end, I get the exception below. Is there a way to pass a parameter to the function to set the number of columns to a certain maximum? Or is there a better way to do this?

Thanks.

CParserError: Error tokenizing data. C error: Expected 23 fields in line 150, saw 83

  • http://stackoverflow.com/questions/15242746/handling-variable-number-of-columns-with-pandas-python – Nicholas Flees Jun 25 '15 at 14:21
  • I would like to flag this question as a possible duplicate of the question mentioned in @NicholasFlees's comment – wadkar Jun 14 '17 at 18:08
  • Possible duplicate of [Handling Variable Number of Columns with Pandas - Python](https://stackoverflow.com/questions/15242746/handling-variable-number-of-columns-with-pandas-python) – wadkar Jun 14 '17 at 18:09
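
For readers who land here, one common workaround for the error in the question is to tell `read_csv()` the column names up front so that it does not infer the width from the first rows. The following is a minimal sketch, not a definitive solution: the sample data in `raw` is hypothetical and stands in for the real file, and the extra pass with the standard `csv` module to find the widest row is an assumption about how the maximum would be obtained.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for the real file:
# rows with 2, 4 and 3 fields, and no header line.
raw = "a,b\nc,d,e,f\ng,h,i\n"

# First pass: find the widest row using the standard csv module.
max_cols = max(len(row) for row in csv.reader(io.StringIO(raw)))

# Second pass: pass explicit column names so the parser allocates
# max_cols columns instead of guessing from the first rows.
df = pd.read_csv(io.StringIO(raw), header=None, names=list(range(max_cols)))

# Rows shorter than max_cols are padded with NaN.
print(df)
```

If the maximum width is already known (83 in the question), the first pass can be skipped and `names=list(range(83))` passed directly.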

1 Answer

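# coding: utf-8

import pandas as pd


def params(text):
    # Split a string such as "asd=2|qwe=5" into its key=value pairs
    pairs = text.split("|")
    print(pairs)
    # Map each key to its value, then return the mapping as a Series
    out = {i.split("=")[0]: i.split("=")[1] for i in pairs}
    return pd.Series(out)


params("asd=2|qwe=5")

# Example frame whose 'text' column holds a variable number of pairs per row
aa = pd.DataFrame({'id': [1, 2], 'text': ["asd=2|qwe=5", "asd=20|qwe=5|qzxc=5"]})
aa

# Expand each row's text into one column per key; missing keys become NaN
aa['text'].apply(params)

# Attach the expanded columns to the original frame
pd.concat([aa, aa['text'].apply(params)], axis=1)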
Neerav
  • The answer doesn't provide any explanation of the `params` function, and I can't find how it's related to the question being asked. – wadkar Jun 05 '17 at 13:52
  • @Sudhi The code is readable. The `params` function just creates a series from a string. – Neerav Jun 10 '17 at 11:04
  • The code being readable, unfortunately, does not imply relevance. I don't see how your answer is relevant to the question. The question asks about reading a CSV file in pandas. You are talking about applying a function to a pandas column that looks like splitting up an http API call with parameters. Please read the question again and show how your code actually answers it. – wadkar Jun 14 '17 at 18:07