1

Basically the original data has no headers but only value (but i have header list). The delimiter is '|'. Now what i try to do is to convert txt file to csv file by using. The csv file contains headers i have and corresponding values.

For example:

txt file looks like:

sadasd|dsdads|adsasd

value 1|value 2|value 3|value 4| value 5| value 100|value 101|value 102|value 103|value 104|value 105 value 200|value 201|value 202|value 203|value 204|value 205

sdasd|dsa|dsdad

and after converting .csv file will look like :

header 1,header 2, header 3, header 4, header 5,

value 1,value 2,value 3,value 4,value 5,

value 100,value 101,value 102,value 103,value 104,value 105

value 200,value 201,value 202,value 203,value 204,value 205

I just start to learn python and what my idea is:

  • delete first and last line.

  • use dictionary list: every column is a list with key (header i have). to dataframe

  • convert to .csv

so it looks like {'header 1': [value 1, value 100, value 200],'header 2': [value 2, value 101, value 201]. and then convert to .csv.

That's just my thought, or you have the easiest way but only using python.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
Bruce
  • 11
  • 3
  • Do you know whether there are always exactly 5 values? Can this vary across files? Have you tried looking into other SO posts that delete the [first](https://stackoverflow.com/questions/4796764/read-file-from-line-2-or-skip-header-row) and [last](https://stackoverflow.com/questions/26696393/simple-way-of-not-reading-last-n-lines-of-a-file-in-python) line? – dennlinger Jul 10 '18 at 09:04

2 Answers2

0

Using csv module

Ex:

import csv
with open(filename, "r") as infile:
    data = []
    for i in infile.readlines()[1:-1]:                   #Strip first and last line. 
        if i.strip():
            data.extend(i.strip().split("|"))
data = [data[i:i+5] for i in range(0, len(data), 5)]     #Split list to sub-list of 5 elements
print(data)


header = ["header 1","header 2", "header 3", "header 4", "header 5"]
with open(outfile, "w") as outfile:                     #Output CSV file
    writer = csv.writer(outfile, delimiter=",")
    writer.writerow(header)                             #Write Header
    writer.writerows(data)                              #Write content.
Rakesh
  • 81,458
  • 17
  • 76
  • 113
  • Hi Rakesh, thank you for your quick response. what if in the transmission, i want to delete some column of values (let's say column 3 and 5), how to do it. and how can i output the file is .csv file . – Bruce Jul 10 '18 at 10:14
0

Stitching up from parts in stackoverflow yields the following solution

import pandas as pd

mycolnames = ['col1','col2','col3','col4','col5']

# Use the sep argument to change your delimiter accordingly
df = pd.read_csv("foo.txt", sep="|")

# Set your column names to the data frame
df.columns = mycolnames

# Write your desired columns to csv
df['col1'].to_csv("bar.csv", sep=",")

Credits

@atomh33ls - How to read csv into record array in numpy?

@LangeHaare - set column names in pandas data frame from_dict with orient = 'index'

kerwei
  • 1,822
  • 1
  • 13
  • 22