How to convert multiple text files to csv format in Python3?

Question

I have just over 2000 .txt files that I need to convert to .csv files. They are labeled sequentially (nstar0001.txt, nstar0002.txt, and so on). I have searched multiple places for answers, but the solutions are often for Python 2.x or use outdated libraries. Each star file has 7 columns of data that I want to label when converting to csv format.

Here is my most recent attempt:

import csv
import os
import itertools


##Convert all nstar####.txt files to csv
stars = int(input("Enter the TOTAL number of stars (including 'bad' stars):"))
k = 1
while k < stars + 1:
    if k < 10:
        q = 'nstar' + '0' + '0' + '0' + str(k) + '.txt'
        r = 'nstar' + '0' + '0' + '0' + str(k) + '.csv'
        with open(q, 'rb') as in_file:
            stripped = (line.strip() for line in in_file)
            lines = (line for line in stripped if line)
            grouped = itertools.izip(*[lines] * 7)
            with open(r, 'wb') as out_file:
                writer = csv.write(out_file)
                writer.writerow(('jd', 'mag', 'merr', 'id', 'cerr', 'serr', 'perr'))
                writer.writerows(grouped)

This was borrowed from another StackOverflow question and slightly modified to suit my needs. However, when I run it I get:

AttributeError: module 'itertools' has no attribute 'izip'

I know this loop only works for the first few files, but just wanted to get it working before running it for all files.

`izip` exists only in Python 2.x. In Python 3.x, use the built-in `zip` instead. This SO post may help you: http://stackoverflow.com/questions/32659552/izip-not-working-in-python-3-x or you can try this from GitHub: https://github.com/nschloe/matplotlib2tikz/issues/20 – alvits, Jul 14 '16 at 21:46
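For reference, a minimal Python 3 sketch of the csv-module approach from the question. Beyond the `izip` error, the attempt would also need `csv.writer` (there is no `csv.write`) and text-mode files opened with `newline=""` instead of `'rb'`/`'wb'`. The function name `txt_to_csv` is made up for illustration, and the sketch assumes each line of a star file already holds its seven whitespace-separated values, so no `zip` grouping is needed:

```python
import csv

# Column labels taken from the question.
HEADER = ["jd", "mag", "merr", "id", "cerr", "serr", "perr"]


def txt_to_csv(txt_path, csv_path, header=HEADER):
    """Copy whitespace-separated rows from txt_path into a labelled CSV.

    Hypothetical helper: assumes one row of values per input line.
    """
    # Python 3: open in text mode; newline="" lets the csv module
    # control line endings in the output file.
    with open(txt_path, "r") as in_file, \
            open(csv_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(header)
        for line in in_file:
            fields = line.split()  # split on any run of whitespace
            if fields:  # skip blank lines
                writer.writerow(fields)
```

Calling `txt_to_csv("nstar0001.txt", "nstar0001.csv")` would convert a single file. If the files instead hold one value per line, the grouping idiom from the question still works in Python 3 as `zip(*[lines] * 7)`.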

Sjaak Dalens · Accepted Answer · 2016-07-14T22:02:22.450

You can use pandas. Something like this should work:

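import pandas as pd

# Column labels for the seven columns in each star file
hdr = ['jd', 'mag', 'merr', 'id', 'cerr', 'serr', 'perr']

for i in range(5):  # i runs 0..4; range(1, N + 1) would cover nstar0001..nstar000N
    fln = "nstar%04d" % i  # zero-padded base name, e.g. nstar0003
    # Read whitespace-separated columns; the .txt files have no header row.
    # sep=r"\s+" is equivalent to delim_whitespace=True, which newer pandas deprecates.
    df = pd.read_csv(fln + ".txt", sep=r"\s+", header=None)
    df.to_csv(fln + ".csv", header=hdr, index=False)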

edited Jul 14 '16 at 22:02

answered Jul 14 '16 at 21:19

Sjaak Dalens


Using that for loop starts the script searching for nstar0000.txt, but my data start at nstar0001.txt. How can I change that to start one higher? [Edit] Got it with a while loop. Thanks for the help! Works like a charm. – Justin Jul 15 '16 at 01:59
Range will also take a start value: range(1,N) will do what you want. – Sjaak Dalens Jul 15 '16 at 05:47
What is the benefit of using that over a while loop? – Justin Jul 15 '16 at 21:35

Initialization, testing and incrementing of the loop counter are all done in one clear statement. That makes it easier to read and easier to maintain. – Sjaak Dalens Jul 16 '16 at 06:29
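To make the comparison in these comments concrete, here is a small sketch that builds the same file names with a `for` loop over `range` and with a `while` loop. The helper name `star_basenames` is hypothetical. Note that `range` excludes its stop value, so `range(1, N)` ends at N - 1; covering files 1 through N takes `range(1, N + 1)`:

```python
def star_basenames(total, first=1):
    """Return zero-padded base names for stars first..total inclusive."""
    # range() excludes its stop value, so stop at total + 1
    # to include the last star.
    return ["nstar%04d" % k for k in range(first, total + 1)]


# Equivalent while loop: the counter is set up, tested and incremented
# in three separate places.
k = 1
while_names = []
while k < 3 + 1:
    while_names.append("nstar%04d" % k)
    k += 1  # leaving this line out makes the loop run forever

print(star_basenames(3))  # ['nstar0001', 'nstar0002', 'nstar0003']
print(while_names == star_basenames(3))  # True
```

The `"%04d"` format also replaces the manual `'0' + '0' + '0' + str(k)` padding from the question, and it keeps working once `k` reaches two, three or four digits.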
