I'm trying to write a script that will take an input file with an unknown number of columns separated by commas and create a new file (name specified by user) where columns are separated by tabs.

The test input file I'm working with looks like this:

Data 1,35,42,7.34,yellow,male
Data 2,41,46,8.45,red,female

Here is the code I have so far:

# Read input file
infile = open("input_file.txt", "r")

line_count = 0

# Read as a collection, removing end line character
for line in infile:
    print(line, end = "")
print("The input file contains", line_count, "lines.")

# Request user input for output file name
filename = input("Enter a name for the output file: ")

# Prompt for file name if entry is blank or only a space    
while filename.isspace() or len(filename) == 0:
    
    filename = input("Whoops, try again. Enter a name for the output file: ")

# Complete filename creation
filename = filename + ".txt"
filename = filename.strip()

# Write output as tab-delim file
for line in infile:
    outfile = open(filename, "w")
    outfile.write(line,"\t")
    outfile.close()

print("Success, the file", filename, "has been written.")
    
# Close input file
infile.close()

The part that writes the output isn't working - it doesn't produce an error, but the output is blank.

happymappy
  • Does this answer your question? [How to prevent iterator getting exhausted in Python(3.x)?](https://stackoverflow.com/questions/10866134/how-to-prevent-iterator-getting-exhausted-in-python3-x) – Pranav Hosangadi Sep 22 '20 at 22:21
  • Also you would `open` the `filename` for each `line`, meaning before writing it is cleared. So only fixing the generator exhaustion, you would get just one line in the output file – Jan Stránský Sep 22 '20 at 22:22
  • Related: https://unix.stackexchange.com/questions/359832/converting-csv-to-tsv/359838#359838 – OneCricketeer Sep 22 '20 at 22:32
  • @PranavHosangadi Files aren't generators. Generators are one kind of iterator; files are another (specifically, instances of a class that implements the iterator protocol). – Karl Knechtel Jan 07 '23 at 07:00
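
A minimal sketch of the two points raised in the comments above, assuming a small sample file written to a temporary directory. The `make_sample` helper, the paths, and the sample contents are hypothetical and only there so the snippet runs on its own. It shows that a second loop over the same open file yields nothing unless the file is rewound, and that opening the output with `"w"` inside the loop keeps only the last line.

```python
import os
import tempfile


def make_sample(path):
    """Hypothetical helper: write a two-line sample input file."""
    with open(path, "w") as f:
        f.write("Data 1,35,42\nData 2,41,46\n")


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "input_file.txt")
make_sample(path)

with open(path, "r") as infile:
    first_pass = [line for line in infile]   # consumes the file iterator
    second_pass = [line for line in infile]  # nothing left, so this is empty
    infile.seek(0)                           # rewind to the start of the file
    third_pass = [line for line in infile]   # the lines are available again

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 2 0 2

# Opening with "w" inside the loop truncates the file on every iteration,
# so only the last line survives.
out_path = os.path.join(tmpdir, "out.txt")
for line in first_pass:
    with open(out_path, "w") as outfile:
        outfile.write(line)

with open(out_path) as f:
    overwritten = f.read()

print(repr(overwritten))  # prints: 'Data 2,41,46\n'
```

Rewinding with `seek(0)` is one way around the exhausted iterator; doing all the work in a single loop, with the output file opened once before the loop, avoids both problems.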

2 Answers

You may use pandas:

import pandas as pd
df = pd.read_csv("input_file.txt", sep=',',header=None)
print("The input file contains", df.shape[0], "lines.")
filename = input("Enter a name for the output file: ").strip()

# Prompt again if the entry is blank or only whitespace
while not filename:
    filename = input("Whoops, try again. Enter a name for the output file: ").strip()

# Save as a tab-separated file, without a header row or index column
df.to_csv(f'{filename}.txt', sep="\t", header=False, index=False)

print("Success, the file", f"{filename}.txt", "has been written.")
Sebastien D

You can split each line on commas and write the fields back out, adding a tab (\t) character after each one:

with open('input_file.txt','r') as f_in, open('output_file.txt', 'w') as f_out:
    for line in f_in:
        s = line.strip().split(',')
        for i in s:
            f_out.write(i+'\t')
        f_out.write('\n')

Or, more concisely, as @martineau suggested (this also avoids the trailing tab that the loop above writes at the end of each line):

with open('input.txt','r') as f_in, open('output.txt', 'w') as f_out:
    for line in f_in:
        s = line.strip().split(',')
        f_out.write('\t'.join(s) + '\n')
Barbaros Özhan
  • You shouldn't use the name of a built-in like `str` as variable name. Ignoring that issue, you could just `f_out.write('\t'.join(str) + '\n')`. – martineau Sep 22 '20 at 22:39
  • To clarify since I'm still learning, I need to re-read the input file despite already reading it earlier in the program? This works, I just need to change `open('output_file.txt', 'w')` to `open(filename, 'w')` to accept the user-inputted filename. – happymappy Sep 22 '20 at 22:40
  • @happymappy: This is just reading the input file once…and writing the output file at the same time. – martineau Sep 22 '20 at 23:03
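
As a rough illustration of the exchange above, here is one way the pieces might fit together: a single pass that counts the lines and writes the tab-separated output to a user-chosen name. The `csv_to_tsv` function, the temporary directory, and the hard-coded `filename` (standing in for the value returned by `input()`) are assumptions made for this sketch, not part of the original answers. Like the answer's code, it splits naively on commas, so quoted fields containing commas are not handled.

```python
import os
import tempfile


def csv_to_tsv(in_path, out_path):
    """Hypothetical helper: copy in_path to out_path with tabs instead of
    commas, and return the number of lines processed."""
    line_count = 0
    # Both files are opened once; the input is read only one time.
    with open(in_path, "r") as f_in, open(out_path, "w") as f_out:
        for line in f_in:
            fields = line.rstrip("\n").split(",")
            f_out.write("\t".join(fields) + "\n")
            line_count += 1
    return line_count


# Build the sample input from the question in a temporary directory.
tmpdir = tempfile.mkdtemp()
in_path = os.path.join(tmpdir, "input_file.txt")
with open(in_path, "w") as f:
    f.write("Data 1,35,42,7.34,yellow,male\nData 2,41,46,8.45,red,female\n")

filename = "my_output"  # assumed stand-in for input("Enter a name ...")
out_path = os.path.join(tmpdir, filename.strip() + ".txt")

count = csv_to_tsv(in_path, out_path)
with open(out_path) as f:
    result = f.read()

print("The input file contains", count, "lines.")
print("Success, the file", os.path.basename(out_path), "has been written.")
```

Counting inside the same loop means the line total is only known after the output has been written; if the count must be printed before asking for the file name, reading the lines into a list first is a reasonable alternative for small files.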