2

I wrote some code and trying to refactor it to cut out a few steps and i cant seem to find an answer for this. Im reading an excel file and doing a bunch of column renaming and dropping columns i dont need. My end goal is to write the Excel file as an text tab delimited file and accomplished all this but in a very hacky way. I have a function convertToText() that reads the excel file and turns it into a txt file. However every single integer in the file gets a .0 appended to the end.

Ex.

Excel value 1234321

Txt File = 1234321.0

Im just doing a simple read and write using pandas, openpyxl and xlrd.

def convertToText():
        with open(os.path.join(outFile, 'target2.txt'), 'wb') as myTxtfile:
            wr = csv.writer(myTxtfile, delimiter="\t") 
            myfile = xlrd.open_workbook(outFile + fileName)
            mysheet = myfile.sheet_by_index(0)
            for rownum in xrange(mysheet.nrows):
               wr.writerow(mysheet.row_values(rownum))

I had to write a second function just to do a find and replace on the .0 and trying to cut that step out of the process. If anyone has any ideas how to do this in the above function would be greatly appreciated!!

Abe
  • 1,357
  • 13
  • 31
Andrew Singer
  • 21
  • 1
  • 3
  • 1
    can you print `mysheet.row_values(rownum)` and show it to us? – Jean-François Fabre Mar 05 '18 at 14:28
  • 1
    Are you sure when you view the file in Excel, it isn't hiding the original decimal places? – OneCricketeer Mar 05 '18 at 14:29
  • You just need to make sure the numbers are integers: in Excel they're all floats. – Charlie Clark Mar 05 '18 at 14:34
  • @Jean-FrançoisFabre, this isn't a duplicate of the one that you marked against. The other question is about simply reading the numbers as integers, while this question is about not just reading but also saving in a CSV file. There could be answers that can perform tasks way differently compared to those given for the other question. So I suggest this question be restored and reopened. – murphy1310 Mar 05 '18 at 15:11
  • ok but if you _read_ as integers, then writing isn't an issue anymore. The link explains how to convert to integer when reading. I'm waiting till the OP edits his post to explain why it isn't a duplicate. I'm not going to reopen without more arguments (or I won't be able to close again). Some other python gold can do that. – Jean-François Fabre Mar 05 '18 at 15:33
  • the thing is, the other question is answered as soon as conversion is established. But here, there could be a possibility of a different approach. The other question could be about a small file and a simple cell-level conversion might have worked, but here it could be a huge file and a different logic could be better, as in avoiding going through the loops to convert the cells. – murphy1310 Mar 05 '18 at 15:38
  • Hi all just saw that this question had an answer. im going to try the solution and will let you know. Also i did confirm if i print straight from the excel the code is fine. i had the code output my new excel file and the convert to text and checked both. The excel was fine and the txt file had the.0's – Andrew Singer Mar 19 '18 at 19:39

1 Answers1

0

Is this the same thing as what you mean?

So I think your code should become, although I have no test data so I cannot try it:

def convertToText():
    with open(os.path.join(outFile, 'target2.txt'), 'wb') as myTxtfile:
        wr = csv.writer(myTxtfile, delimiter="\t") 
        myfile = xlrd.open_workbook(outFile + fileName)
        mysheet = myfile.sheet_by_index(0)
        for rownum in xrange(mysheet.nrows):
           wr.writerow([int(i) for i in mysheet.row_values(rownum)])
ArnJac
  • 352
  • 2
  • 5
  • 15
  • @AmJac i just tried the above code and it errors out with this. wr.writerow([int(i) for i in mysheet.row_values(rownum)]) ValueError: invalid literal for int() with base 10: 'zip' any ideas? – Andrew Singer Mar 19 '18 at 19:43
  • Sounds like there are not only numeric values in the file, is that correct? – ArnJac Mar 20 '18 at 09:28