
I'm trying to parse a file that contains a bunch of entries, each of which, among other fields, contains a date in its last column.

Walmart,Retail,482,-0.7,2200000,Arkansas,31-10-1969

I've tried doing this:

from datetime import datetime

def readdata (fname):

    print ('*'*5,'Reading Records From File',fname,'*'*5)

    data = []

    readf = open(fname,'r')
    for line in readf:       
        name1, name2, No_1, No_2, No_3, name3, date1 = line.split(',')
        date = datetime.strptime(date1,'%d-%m-%Y')
        Number1 = float(No_1)
        Number2 = float(No_2)
        Number3 = int(No_3)

        rec = [name1,name2,Number1,Number2,Number3,name3,date]
        data.append(rec)
    readf.close()
    print('\nDone.\n\n')
    return data

But when I try to convert the last field of the line (the date) to an actual datetime.datetime instance, I get the following error:

data_string[found.end():])
    ValueError: unconverted data remains: 

The full traceback is:

Traceback (most recent call last):
  File "C:\Users\Keitha Pokiha\Desktop\New folder\Program 2.py", line 42, in <module>
    main()
  File "C:\Users\Keitha Pokiha\Desktop\New folder\Program 2.py", line 39, in main
    data = readdata('fname.txt')
  File "C:\Users\Keitha Pokiha\Desktop\New folder\Program 2.py", line 12, in readdata
    date = datetime.strptime(date1,'%d-%m-%Y')
  File "C:\Users\Keitha Pokiha\AppData\Local\Programs\Python\Python35-32\lib\_strptime.py", line 510, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\Keitha Pokiha\AppData\Local\Programs\Python\Python35-32\lib\_strptime.py", line 346, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains: 
Savir
  • i know datestr should be date –  Oct 27 '16 at 23:03
  • It'd really help if you made sure that all the code in your question is properly formatted. Also, I can't see any reference to `found` or `data_string` in your snippet, so it's difficult to tell why the error happens. And you seem to be reading a file with more information than just a date (it looks like a comma-separated-value file to me). Could you add a couple of lines showing what the file looks like? – Savir Oct 27 '16 at 23:14
  • Walmart,Retail,482,-0.7,2200000,Arkansas,31-10-1969 –  Oct 27 '16 at 23:17
  • I agree with @BorrajaX. Something else might be the issue. If you run your function on just the one line you've pasted as an example, it runs fine. – letsc Oct 27 '16 at 23:38
  • I think I got it, though. I copy/pasted the sample provided in the question, and it seems to be failing because of a newline at the end. – Savir Oct 27 '16 at 23:43
  • oh ok ill try that. I think the problem is i copy and paste this data maybe i might have to manually write it out? –  Oct 27 '16 at 23:49

1 Answer


The problem you seem to be having is that when you do for line in readf:, line ends with the newline character (\n, which signals the end of a line). So instead of trying to convert 31-10-1969 to a datetime, Python is trying to convert 31-10-1969\n using the format %d-%m-%Y. When it finishes parsing the year (%Y), it finds an unexpected \n, and that's why you're seeing the error: it doesn't know what to do with the leftover character.
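As a minimal sketch of the failure (the date string is the one from the question, and the trailing \n is assumed to be what the file iteration leaves on the last field), this reproduces the error and shows that the same text parses once the newline is gone:

```python
from datetime import datetime

# Last field of a line as it comes out of "for line in readf" + split(',')
date1 = "31-10-1969\n"

try:
    datetime.strptime(date1, "%d-%m-%Y")
    failed = False
except ValueError as exc:
    # The leftover newline is reported as unconverted data
    failed = True
    print(repr(str(exc)))

# Without the trailing newline the same format string works
parsed = datetime.strptime(date1.rstrip("\n"), "%d-%m-%Y")
print(parsed)
```

Printing the message with repr makes the otherwise invisible newline show up in the output.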

You have several options to fix this. Below you'll find two that "fix" the read line, and a third that "fixes" the format expected by datetime:

  1. You can remove that \n using rstrip after you've read the line (note that rstrip() with no arguments removes any other trailing whitespace as well):

    name1, name2, No_1, No_2, No_3, name3, date1 = line.rstrip().split(',')
    date = datetime.strptime(date1, '%d-%m-%Y')
    
  2. Or you could remove the last character in the line with a slice, like this:

    name1, name2, No_1, No_2, No_3, name3, date1 = line[:-1].split(',')
    
  3. Or you could tell datetime.strptime to expect a trailing newline in the string as well:

    name1, name2, No_1, No_2, No_3, name3, date1 = line.split(',')
    date = datetime.strptime(date1, '%d-%m-%Y\n')
    

I'd use option 1, because if your line doesn't end with a newline character (for example, the last line of the file), everything will still work. Options 2 and 3 both fail in that case.
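To compare the three options side by side, here is a rough sketch that runs each one against the sample line both with and without a trailing newline. The helper names (parse_rstrip, parse_slice, parse_format_newline) are made up for this illustration and are not part of the original code:

```python
from datetime import datetime

FMT = "%d-%m-%Y"
with_newline = "Walmart,Retail,482,-0.7,2200000,Arkansas,31-10-1969\n"
without_newline = with_newline.rstrip("\n")  # e.g. the last line of a file

def parse_rstrip(line):
    """Option 1: strip trailing whitespace, then split."""
    return datetime.strptime(line.rstrip().split(",")[-1], FMT)

def parse_slice(line):
    """Option 2: drop the last character unconditionally, then split."""
    return datetime.strptime(line[:-1].split(",")[-1], FMT)

def parse_format_newline(line):
    """Option 3: keep the line as-is and put the newline in the format."""
    return datetime.strptime(line.split(",")[-1], FMT + "\n")

expected = datetime(1969, 10, 31)
results = {}
for name, fn in [("rstrip", parse_rstrip),
                 ("slice", parse_slice),
                 ("format", parse_format_newline)]:
    for label, line in [("with", with_newline), ("without", without_newline)]:
        try:
            results[(name, label)] = fn(line) == expected
        except ValueError:
            # strptime could not match the text against the format
            results[(name, label)] = False

for key in sorted(results):
    print(key, results[key])
```

Only the rstrip variant succeeds in both cases. The slice cuts a digit off the year when there is no newline to remove, and the format with \n insists on trailing whitespace being present.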

PS (as a side note): If you're reading a comma-separated-value file, I'd strongly suggest you make use of the csv module (csv.reader).
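A minimal sketch of what that could look like, assuming the seven-column layout from the question. The function name read_records is made up here, and an in-memory io.StringIO stands in for the real file so the example runs on its own:

```python
import csv
import io
from datetime import datetime

def read_records(fileobj):
    """Parse rows shaped like the sample line in the question."""
    records = []
    for row in csv.reader(fileobj):
        if not row:
            continue  # csv.reader yields [] for blank lines; skip them
        name1, name2, no_1, no_2, no_3, name3, date1 = row
        records.append([
            name1,
            name2,
            float(no_1),
            float(no_2),
            int(no_3),
            name3,
            # csv.reader has already consumed the line ending,
            # so date1 carries no trailing newline here
            datetime.strptime(date1, "%d-%m-%Y"),
        ])
    return records

# Stand-in for the real file, using the sample line from the question
sample = io.StringIO("Walmart,Retail,482,-0.7,2200000,Arkansas,31-10-1969\n")
records = read_records(sample)
print(records)
```

With a real file you would pass in something like open(fname, newline='') inside a with block, which is how the csv documentation recommends opening files for the reader.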

Savir
  • Yes worked very well thank you this will enable me to finish one of my assessments. –  Oct 27 '16 at 23:51
  • Good luck! And, since you haven't been a member of S.O. for too long, here it goes: remember that if any of the answers solved your question, it's good to mark it as accepted (big checkbox to the left of the answer). It'll give you reputation points, it'll give points to the person who spent time answering and, most importantly, it'll help future readers see that the answer was helpful. See http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235 (not only for this post, but for your future questions as well). Cheers :-) – Savir Oct 27 '16 at 23:56