I have a datafile like this:
# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
14.2000 0.531000 0.0618000 0.938200
14.2000 0.532000 0.0790500 0.920950
14.2000 0.533000 0.0998900 0.900110
# it has lots of other lines
# datafile can be obtained from pastebin
The link to the input datafile is: http://pastebin.com/NaNbEm3E
I would like to create 20 files from this input such that each file contains the comment lines.
That is:
#out1.txt
#comments
first part of one-twentieth data
# out2.txt
# given comments
second part of one-twentieth data
# and so on up to out20.txt
How can I do this in Python?
My initial attempt is like this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author : Bhishan Poudel
# Date : May 23, 2016
# Imports
from __future__ import print_function
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# read in comments from the file
infile = 'filecopy_multiple.txt'
outfile = 'comments.txt'
comments = []
with open(infile, 'r') as fi, open(outfile, 'a') as fo:
    for line in fi.readlines():
        if line.startswith('#'):
            comments.append(line)
            print(line)
            fo.write(line)
#==============================================================================
# read in a file
#
infile = infile
colnames = ['angle', 'wave','trans','refl']
print('{} {} {} {}'.format('\nreading file : ', infile, '','' ))
df = pd.read_csv(infile, sep=r'\s+', header=None, skiprows=0,
                 comment='#', names=colnames, usecols=(0, 1, 2, 3))
print('{} {} {} {}'.format('length of df : ', len(df),'',''))
# write 20 files
nfiles = 20
nrows = len(df) // nfiles
# use integer division (//) so that every block of nrows rows gets one group label
groups = df.groupby(np.arange(len(df.index)) // nrows)
for (frameno, frame) in groups:
    frame.to_csv("output_%s.csv" % frameno, index=None, header=None, sep='\t')
So far I have twenty split files. I just want to copy the comment lines into each of the files. But the question is: how do I do that?
There should be an easier method than creating another 20 output files containing only the comments and appending the twenty split files to them.
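One possible direction, sketched here only as an illustration: open each output file once, write the comment lines first, and then write that file's share of the data rows into the same handle. The sketch below uses plain Python rather than pandas. The function name split_with_comments and its parameters (nfiles, prefix) are made up for this example, and it assumes that every line starting with '#' belongs to the header and that blank lines can be dropped.

```python
def split_with_comments(infile, nfiles=20, prefix='out'):
    """Split the data rows of infile into nfiles files.

    Every output file starts with all comment lines ('#...') found in
    infile. Hypothetical helper, not part of any library.
    """
    with open(infile) as fi:
        # normalise line endings so the last line also ends with '\n'
        lines = [ln.rstrip('\n') + '\n' for ln in fi]

    comments = [ln for ln in lines if ln.startswith('#')]
    data = [ln for ln in lines if ln.strip() and not ln.startswith('#')]

    # spread the rows as evenly as possible:
    # the first `extra` files receive one additional row
    base, extra = divmod(len(data), nfiles)

    outnames = []
    start = 0
    for i in range(nfiles):
        stop = start + base + (1 if i < extra else 0)
        outname = '{}{}.txt'.format(prefix, i + 1)
        with open(outname, 'w') as fo:
            fo.writelines(comments)          # header first
            fo.writelines(data[start:stop])  # then this file's rows
        outnames.append(outname)
        start = stop
    return outnames
```

Calling split_with_comments('filecopy_multiple.txt') would then write out1.txt to out20.txt in the current directory. If the pandas route is preferred, the same idea should carry over: to_csv accepts an open file object in place of a filename, so the comments can be written to the handle before the frame is.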
Some useful links are as follows:
How to split a dataframe column into multiple columns
How to split a DataFrame column in python
Split a large pandas dataframe