0

I want to save my output as a .csv file. When I run my while loop and save the output, only the last iteration is saved; the values from the earlier iterations are lost.

Also, I want to skip the rows that contain zero values when printing my output.

This is my code:

import pandas as pd  # pandas library
sample = pd.DataFrame(pd.read_csv("Sample.csv"))  # import the .csv as a pandas DataFrame

i = 0
while (i <= 23):
    print('Value for', i)  # current value of i
    sample2 = (sample[sample['Hour'] == i])  # data for this hour
    sample3 = (sample2[(sample2['GHI']) == (sample2['GHI'].max(0))])  # rows with the max GHI value in sample2
    sample3 = sample3.loc[sample3.ne(0).all(axis=1)]  # ignore all rows having zero values
    print(sample3)  # print sample3
    sample3.to_csv('Output.csv')  # trying to save the output after every iteration
    i = i + 1
Vishwas
  • If you want to save after each iteration, you should rename your output after each iteration, e.g. `sample3.to_csv(f'Output{i}.csv')`. – Quang Hoang Oct 17 '19 at 13:03
  • Or you can use `mode='a'` inside to_csv() to append. – vb_rises Oct 17 '19 at 13:03
  • Thank you for your input. Yes, I am now able to get all my data without overwriting, but the column header is repeated after every iteration, e.g. Year Month Day Hour Minute DHI DNI GHI Dew Point Surface Albedo Wind Direction Wind Speed Temperature Pressure 3917 2005 6 13 5 30 34 363 78 10 0.129 116.5 4 13 1010 Year Month Day Hour Minute DHI DNI GHI Dew Point Surface Albedo Wind Direction Wind Speed Temperature Pressure 3918 2005 6 13 6 30 62 656 265 11 0.129 134.1 4.8 16 1010 Year Month Day Hour Minute DHI DNI GHI Dew Point Surface Albedo Wind Direction Wind Speed Temperature Pressure – Shriganesh Patil Oct 17 '19 at 13:12
  • It repeatedly prints the header row after every iteration, which I don't want. – Shriganesh Patil Oct 17 '19 at 13:14
  • @ShriganeshPatil What you can do is pass `header=True` to the to_csv() method if `i == 0`, and `header=False` otherwise. – vb_rises Oct 17 '19 at 13:27
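
The two suggestions in the comments above (`mode='a'` and writing the header only once) can be combined roughly as follows. This is a minimal sketch, not the asker's actual data: the small `sample` frame, the temporary `out_path`, and the `header_written` flag are assumptions made for illustration. The flag is used instead of `i == 0` so that the header is still written if the first hour produces no rows.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for Sample.csv; the real file has more columns.
sample = pd.DataFrame({
    "Hour": [5, 5, 6, 6, 7],
    "GHI": [78, 40, 265, 100, 0],
    "DNI": [363, 200, 656, 300, 0],
})

# Hypothetical output location, so the sketch does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), "Output.csv")

header_written = False
for hour in range(24):
    hourly = sample[sample["Hour"] == hour]              # rows for this hour
    peak = hourly[hourly["GHI"] == hourly["GHI"].max()]  # rows with the max GHI
    peak = peak.loc[peak.ne(0).all(axis=1)]              # drop rows containing a zero
    if peak.empty:
        continue                                         # nothing to write for this hour
    # The first write creates the file with a header; later writes append without one.
    peak.to_csv(out_path,
                mode="a" if header_written else "w",
                header=not header_written)
    header_written = True

result = pd.read_csv(out_path, index_col=0)
print(result)
```

Reading the file back with `pd.read_csv` is only there to check that the header appears once.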

2 Answers

0

Another way to do what you want is to get rid of the loop, like this:

sample_with_max_ghi = sample.assign(max_ghi=sample.groupby('Hour')['GHI'].transform('max'))
sample_filtered = sample_with_max_ghi[sample_with_max_ghi['GHI'] == sample_with_max_ghi['max_ghi']]
output_sample = sample_filtered.loc[sample_filtered.ne(0).all(axis=1)].drop('max_ghi', axis=1)
output_sample.to_csv('Output.csv')

Some explanations:

1.

sample_with_max_ghi = sample.assign(max_ghi=sample.groupby('Hour')['GHI'].transform('max'))

This line adds a new column to your DataFrame containing the maximum of the GHI column within each Hour group.

2.

sample_filtered = sample_with_max_ghi[sample_with_max_ghi['GHI'] == sample_with_max_ghi['max_ghi']]

This line keeps only the rows where the GHI value is actually the max of its Hour group.

3.

output_sample = sample_filtered.loc[sample_filtered.ne(0).all(axis=1)].drop('max_ghi', axis=1)

The last line applies the final filter to drop the rows that contain a 0 value, then removes the helper max_ghi column.
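
To see what `transform('max')` returns, here is a small worked example on made-up data. The `sample` frame is hypothetical, and the sketch uses boolean masks instead of the `max_ghi` helper column, which should select the same rows under these assumptions.

```python
import pandas as pd

# Hypothetical stand-in for Sample.csv; the real file has more columns.
sample = pd.DataFrame({
    "Hour": [5, 5, 6, 6, 7],
    "GHI": [78, 40, 265, 100, 0],
    "DNI": [363, 200, 656, 300, 0],
})

# transform('max') returns one value per original row: the max GHI of that row's Hour group.
max_ghi = sample.groupby("Hour")["GHI"].transform("max")

is_hourly_max = sample["GHI"] == max_ghi   # True where the row holds its group's max
has_no_zero = sample.ne(0).all(axis=1)     # True where no column in the row is 0

output_sample = sample[is_hourly_max & has_no_zero]
print(output_sample)
```

The row for hour 7 is its group's maximum but is still dropped, because its GHI and DNI values are 0.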

0

Adding the loop counter to the CSV file name on every iteration makes each file name unique, which solves your problem, e.g.:

import pandas as pd  # pandas library
sample = pd.DataFrame(pd.read_csv("Sample.csv"))  # import the .csv as a pandas DataFrame

i = 0
while (i <= 23):
    print('Value for', i)  # current value of i
    sample2 = (sample[sample['Hour'] == i])  # data for this hour
    sample3 = (sample2[(sample2['GHI']) == (sample2['GHI'].max(0))])  # rows with the max GHI value in sample2
    sample3 = sample3.loc[sample3.ne(0).all(axis=1)]  # ignore all rows having zero values
    print(sample3)  # print sample3
    sample3.to_csv(str(i)+'Output.csv')  # save each iteration to its own file, e.g. 0Output.csv
    i = i + 1
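
A self-contained sketch of the same one-file-per-hour idea, run on made-up data so the resulting file names can be checked. The `sample` frame and the temporary `out_dir` folder are hypothetical, and skipping hours with no rows is an assumption rather than something the loop above does.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for Sample.csv; the real file has more columns.
sample = pd.DataFrame({
    "Hour": [5, 5, 6, 6, 7],
    "GHI": [78, 40, 265, 100, 0],
    "DNI": [363, 200, 656, 300, 0],
})

out_dir = Path(tempfile.mkdtemp())  # hypothetical output folder

for hour in range(24):
    hourly = sample[sample["Hour"] == hour]              # rows for this hour
    peak = hourly[hourly["GHI"] == hourly["GHI"].max()]  # rows with the max GHI
    peak = peak.loc[peak.ne(0).all(axis=1)]              # drop rows containing a zero
    if peak.empty:
        continue                                         # assumption: skip empty hours
    peak.to_csv(out_dir / f"{hour}Output.csv")           # unique file name per hour

written = sorted(p.name for p in out_dir.glob("*Output.csv"))
print(written)
```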
Suraj Rao