I want to convert file.tsv to CSV. The conversion runs, but the fields are not separated correctly. This is file.tsv:

protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score
9606.ENSP00000003084 9606.ENSP00000301645 0 0 0 0 0 0 0 0 0 0 0 163 129 239

Here are the first lines of the resulting file.csv:

"protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score"
"9606.ENSP00000003084 9606.ENSP00000301645 0 0 0 0 0 0 0 0 0 0 0 163 129 239"

Here is the code:

import csv


print(csv.list_dialects())


with open('File.tsv', 'r', encoding='utf-8', newline='') as fin, \
     open('file2.csv', 'w', encoding='utf-8', newline='') as fout: 

     reader = csv.reader(fin, dialect='excel-tab')
     writer = csv.writer(fout, delimiter=' ')    

     for row in reader:
         writer.writerow(row)

The problem is that the code doesn't split the fields at the spaces; it takes the whole header line as a single field.
The desired result is that the fields are separated where I put the commas:

protein1,protein2,neighborhood,neighborhood_transferred,fusion,cooccurence,homology,coexpression,coexpression_transferred,experiments,experiments_transferred,database,database_transferred,textmining,textmining_transferred,combined_score
9606.ENSP00000003084,9606.ENSP00000301645,0,0,0,0,0,0,0,0,0,0,0,163,129,239

  • show the desired result – RomanPerekhrest May 29 '17 at 12:32
  • Really, this is a futile exercise. CSV readers in most languages can read the data regardless of whether you use a space or a tab as the delimiter – e4c5 May 29 '17 at 12:34
  • Possible duplicate of https://stackoverflow.com/questions/29759305/how-do-i-convert-a-tsv-to-csv – Bidisha Pyne May 29 '17 at 12:35
  • @RomanPerekhrest I edited the question –  May 29 '17 at 12:38
  • @e4c5 I am opening it with Excel, and because the file is large (about 700M) it does not open fully. I need the file converted for other work –  May 29 '17 at 12:39
  • Whether it's space or tab, Excel will still face the same problem. Your code wouldn't mind either delimiter. – e4c5 May 29 '17 at 12:41
  • @e4c5 What could be the solution, then? –  May 29 '17 at 12:45
  • You haven't mentioned what the problem is. https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem – e4c5 May 29 '17 at 12:45
  • @e4c5 The problem is that the code does not separate the attributes; it takes the whole header line as a single field, yet I need it split wherever there is a space –  May 29 '17 at 12:55

1 Answer

Edit: answer rewritten after exchange of comments with OP.

The reader is configured to expect tabs as the input delimiter:

reader = csv.reader(fin, dialect='excel-tab')

but the file contains spaces, not tabs, so use:

reader = csv.reader(fin, delimiter=' ')

Note that a space delimiter treats two consecutive spaces as two delimiters with an empty field between them. csv.reader has no direct equivalent of Excel's option to treat consecutive delimiters as one.
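
Putting the pieces together, a minimal sketch of the whole conversion might look like the following. It assumes the input really is space-delimited throughout, as the comments below confirm. The function names and the small in-memory sample are made up for illustration; with real files you would pass the objects returned by open(..., newline='') instead of io.StringIO. The writer is created without a delimiter argument, so it falls back to the default comma, which is what the desired output calls for. The second function is an alternative that uses str.split() rather than the csv module for reading, which is one way to collapse runs of whitespace, provided the fields themselves never contain spaces or quotes.

```python
import csv
import io


def space_delimited_to_csv(fin, fout):
    """Read space-delimited rows from fin and write comma-delimited rows to fout."""
    reader = csv.reader(fin, delimiter=' ')
    writer = csv.writer(fout)  # default 'excel' dialect: comma delimiter
    for row in reader:
        writer.writerow(row)


def whitespace_delimited_to_csv(fin, fout):
    """Alternative: split each line on any run of whitespace (spaces or tabs).

    Assumes no field contains whitespace or quoting.
    """
    writer = csv.writer(fout)
    for line in fin:
        fields = line.split()  # no argument: runs of whitespace count as one separator
        if fields:             # skip blank lines
            writer.writerow(fields)


# Hypothetical sample: a shortened version of the header and first data row.
sample = (
    "protein1 protein2 combined_score\n"
    "9606.ENSP00000003084 9606.ENSP00000301645 239\n"
)

out = io.StringIO()
space_delimited_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

Both functions stream the file row by row, so the size of the input (about 700M here) should not be a problem for memory.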

BoarGules
  • No, that was an example of where I want the separation to be. The problem is with the input file: the code does not split it at the spaces –  May 29 '17 at 12:52
  • OK, but if there are spaces in the header line, and not tabs, then `csv.reader` won't split them up because the input dialect is specified as `'excel-tab'`. Are you saying that the header line is space-delimited but the data lines are tab-delimited? – BoarGules May 29 '17 at 13:03
  • No, the whole data is space-delimited, so I guess we need to change the dialect to something else, right? –  May 29 '17 at 13:17