
Using information from the internet and from this platform, I managed to write a small Python script that reads 200 CSV files and extracts the values of a column by its name. Now I would like to write a csv/txt file with two columns, one for the variable "Time" and one for the variable "alpha.water". With the following Python script I am only able to write the single variable "Time":

# importing different modules
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math 
import importlib

totalt = 200 # defining the total time   
for i in range (0,totalt):
    df = pd.read_csv('interpolate_lwc_t_%d.csv'%i, skiprows=0)

    # writing the "Time" values from this file
    x1 = df['Time'].values
    tfile = open('test.txt', 'a')
    tfile.write(str (x1))
    tfile.close()

Also, the "Time" values are written with square brackets [], as follows:

[0.008][0.009][0.01][0.011][0.012][0.013][0.014][0.015][0.016][0.017][0.018][0.019][0.02][0.021][0.022][0.023][0.024][0.025][0.026][0.027][0.028][0.029][0.03][0.031][0.032][0.033][0.034][0.035][0.036][0.037][0.038][0.039][0.04][0.041][0.042][0.043][0.044][0.045][0.046][0.047][0.048][0.049][0.05][0.051][0.052][0.053][0.054][0.055][0.056][0.057][0.058][0.059][0.06][0.061][0.062][0.063][0.064][0.065][0.066][0.067][0.068][0.069][0.07][0.071]

Is there a cleaner way to write a csv/txt file with two columns, "Time" and "alpha.water", containing the corresponding values? I am expecting the following output:

Time        alpha.water
0.008       0.01147
0.009       0.011472
0.010       0.011473

Any suggestion/comment will be a great help. Thanks in advance.

1 Answer


If you already have a dataframe created from a CSV and want to have another CSV, you can do this in two steps.

First, after loading the CSV into your dataframe, drop the columns you don't need, i.e. select the ones you do want. See this post for reference: https://stackoverflow.com/a/11287278/6826556
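As a minimal sketch of that first step, the snippet below selects two columns by name. The in-memory sample stands in for one of the real files, and the extra column "p" is a made-up placeholder, not something from the question's data.

```python
import io

import pandas as pd

# Hypothetical sample standing in for one interpolate_lwc_t_<i>.csv file.
# The column "p" is invented here just to have something to drop.
sample_csv = io.StringIO(
    "Time,alpha.water,p\n"
    "0.008,0.01147,1.0\n"
)

df = pd.read_csv(sample_csv)

# Indexing with a list of names (note the double brackets) returns a new
# DataFrame that contains only those columns, in the order given.
newdf = df[["Time", "alpha.water"]]
print(newdf)
```

If a requested column name is missing from the file, this indexing raises a KeyError, which is usually a helpful early signal that the header differs from what was expected.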

To combine different dataframes into one, use the pandas concat function (pd.concat). Assuming you define one dataframe named df_all before the loop, you can concatenate the current one onto it on each iteration.

# outside of loop:
df_all = pd.DataFrame()

# inside the loop, after df has been reduced to the wanted columns:
df_all = pd.concat([df_all, df], ignore_index=True)

Then you can export the new dataframe (which contains only the desired columns and combines all the others) directly to a new CSV file with its to_csv method.
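Putting the steps together, one possible sketch is shown below. It is a variation on the approach above rather than the exact code from this answer: it collects the per-file frames in a list and calls pd.concat once at the end, which avoids re-copying a growing dataframe on every iteration. The function name combine_csv_columns and the output file name combined.csv are assumptions chosen for illustration.

```python
import pandas as pd


def combine_csv_columns(paths, columns, out_path):
    """Read each CSV in `paths`, keep only `columns`, write one combined CSV.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = []
    for path in paths:
        df = pd.read_csv(path)
        frames.append(df[columns])  # keep only the wanted columns

    # A single concat at the end stacks all frames row-wise and
    # renumbers the index from 0.
    df_all = pd.concat(frames, ignore_index=True)

    # index=False leaves the row numbers out of the output file.
    df_all.to_csv(out_path, index=False)
    return df_all


# Example call, assuming the 200 files from the question exist:
# paths = ["interpolate_lwc_t_%d.csv" % i for i in range(200)]
# combine_csv_columns(paths, ["Time", "alpha.water"], "combined.csv")
```

For a tab-separated text file instead of a comma-separated one, to_csv also accepts a sep argument, e.g. sep="\t". Note that pd.concat raises a ValueError if the list of frames is empty.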

codeguy
  • Thanks a lot for your reply, but as you can see, I am reading each file inside the loop, so I don't have one complete dataframe; my dataframe changes on every iteration. So maybe I should store all the dataframes in another variable. Could you kindly comment on how to do that? – user13680518 Apr 07 '21 at 10:55
  • Thanks a lot for your kind help. As per your suggestion I used the [link](https://stackoverflow.com/questions/11285613/selecting-multiple-columns-in-a-pandas-dataframe/11287278#11287278) and modified the code as `for i in range (0,totalt): df = pd.read_csv('interpolate_lwc_t_%d.csv'%i, skiprows=0) newdf = df[["Time", "alpha.water"]] print(newdf)` Now I am able to print the desired result correctly, but I am unable to save it as a csv/text file, since it is inside the loop. Any further help? – user13680518 Apr 07 '21 at 11:10
  • Take a look at my updated answer, you can concatenate the frames. – codeguy Apr 07 '21 at 12:46
  • Thanks a lot for the updated answer. When I run this code it shows an error that "df_all" is not defined. How do I need to define it? I see that in the post you mention it has to be defined before the loop. Sorry for being so new to Python. – user13680518 Apr 07 '21 at 13:31
  • Thanks a lot for your help, worked perfectly, so finally I wrote: `import os import numpy as np import pandas as pd totalt = 200 # defining the total time df_all = pd.DataFrame() # stack flow for i in range (1,totalt): df = pd.read_csv('interpolate_lwc_t_%d.csv'%i, skiprows=0) newdf = df[["Time", "alpha.water"]] df_all = pd.concat([df_all, newdf], ignore_index=True) # stackflow df_all.to_csv (r'combined_all_nofc1.csv', index = False, header=True) # stackflow` – user13680518 Apr 08 '21 at 05:53