
I am trying to study the probability of a zero value occurring in my data. I have written code that outputs the values of one column of data where another column is zero, which is what I need. Doing that by hand for each column against the other 28 columns of my 577-by-29 dataframe is tedious, so I wrote a for loop to do it for me:

import numpy as np
import pandas as pd
allchan = pd.read_csv('allchan.csv',delimiter = ' ')
allchanarray = np.array(allchan)
dfallchan = pd.DataFrame(allchanarray,range(1,578),dtype=float)
y = pd.DataFrame()
x = pd.DataFrame()
for n in range(0,29):
    x[n] = dfallchan[(dfallchan[0]>0) & (dfallchan[n]==0)][0]
    y[n] = x[n].count()
x.to_excel('n.xlsx', index=False, sheet_name='ValForOtherZero')
y.to_excel('v.xlsx', index=False, sheet_name='CountOfZeroVlas')

The problem is that the loop goes through these lines properly:

 x[n] = dfallchan[(dfallchan[0]>0) & (dfallchan[n]==0)][0]
 y[n] = x[n].count()

but it repeats the value of n=6 for the second condition:

(dfallchan[n]==0)

The code should return different values of the first channel for each column, since the zeros are randomly distributed in my input file. My output is correct up to the 6th column (columns 0-5 should be empty, and they are), but from there on it repeats the same output for all other columns. Output: (image "output 1")

You can see that the loop itself runs correctly, since the output dataframe has 29 columns, but the condition shown above is not applied per column.

Please help, Thanks!

Faisal
  • That isn't an error, it's a warning. See more about it here: https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – Alex S Aug 08 '17 at 16:41
  • I have read the warning and it seems that the type of the variable x was inappropriate.... – Faisal Aug 08 '17 at 22:06
  • I am now running into another issue and have edited the question! – Faisal Aug 08 '17 at 22:07
  • Check this: https://stackoverflow.com/questions/31674557/how-to-append-rows-in-a-pandas-dataframe-in-a-for-loop – Anton vBR Aug 08 '17 at 22:08

2 Answers


Finally Got it!

This code does exactly what I want!

# In[9]:

import numpy as np
import pandas as pd


# In[10]:

allchan = pd.read_csv('allchan.csv',delimiter = ' ')


# In[11]:

allchanarray = np.array(allchan)


# In[12]:

dfallchan = pd.DataFrame(allchanarray,range(1,578),dtype=float)


# In[13]:

v = pd.DataFrame(columns=range(0,29))
y = pd.DataFrame()
k = pd.DataFrame(columns=range(0,29))


# In[14]:

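for n in range(0,29):
    # Channel-0 values on rows where channel 0 is positive and channel n is zero
    x = dfallchan[(dfallchan[0]>0) & (dfallchan[n]==0)][0]
    y = y.append(x)

# Transpose and count once, after the loop, instead of on every iteration
v = y.transpose()
k = v.count()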


# In[15]:

v.columns=range(0,29)
k = k.values.reshape(1,29)


# In[16]:

v.to_excel("Chan1-OthersZeroVals.xlsx", index=False)
pd.DataFrame(k).to_excel("Chan1-OtherZeroCount.xlsx", index=False)
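
A note for newer pandas versions: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the loop above will not run there. Below is a minimal sketch of the same idea using `pd.concat`, run on a small made-up frame. The `demo` data and the names `columns`, `values` and `counts` are illustrative assumptions, not part of the real `allchan.csv` data.

```python
import pandas as pd

# Made-up stand-in for dfallchan (3 channels instead of 29).
demo = pd.DataFrame({0: [5.0, 0.0, 7.0, 2.0],
                     1: [0.0, 1.0, 0.0, 3.0],
                     2: [4.0, 0.0, 0.0, 0.0]})

columns = {}
for n in demo.columns:
    # Channel-0 values on rows where channel 0 is positive and channel n is zero.
    columns[n] = demo.loc[(demo[0] > 0) & (demo[n] == 0), 0]

# Concatenate once at the end; rows are aligned on the original row labels,
# so positions where a channel has no match are filled with NaN.
values = pd.concat(columns, axis=1)

# Number of non-NaN entries per channel.
counts = values.count()
```

Collecting the per-channel Series in a dict and concatenating once avoids growing a dataframe inside the loop, and the dict keys become the column labels, so no renaming step is needed afterwards.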
Faisal

This will be more efficient.

all_values = []
for n in range(0,29):
    condition = (dfallchan[0]>0) & (dfallchan[n]==0)
    count = condition.sum()
    vals = dfallchan[condition][0].values
    all_values.append(vals)

all_values_df = pd.DataFrame(all_values).transpose()

Here, I first build a list of arrays, appending the matching values for each column to it. Only at the end do I create the dataframe, once, and transpose it so that each column n holds the values for channel n.
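
To make the steps above concrete, here is a small self-contained sketch on made-up data. It also keeps the per-column `count`, which the loop above computes but never stores. The `demo` frame and the `counts` list are assumptions for illustration, and plain lists (`.tolist()`) are used instead of `.values` so that rows of unequal length are padded with NaN in a well-defined way.

```python
import pandas as pd

# Made-up stand-in for dfallchan (3 channels instead of 29).
demo = pd.DataFrame({0: [5.0, 0.0, 7.0, 2.0],
                     1: [0.0, 1.0, 0.0, 3.0],
                     2: [4.0, 0.0, 0.0, 0.0]})

all_values = []
counts = []
for n in range(demo.shape[1]):
    # True on rows where channel 0 is positive and channel n is zero.
    condition = (demo[0] > 0) & (demo[n] == 0)
    # Summing a boolean Series counts the True entries.
    counts.append(int(condition.sum()))
    # Channel-0 values on the matching rows, as a plain list.
    all_values.append(demo.loc[condition, 0].tolist())

# One row per channel, padded with NaN; transpose to get one column per channel.
all_values_df = pd.DataFrame(all_values).transpose()
```

Unlike an approach that aligns on the original row labels, this one packs each channel's values at the top of its column, so the row positions in `all_values_df` no longer correspond to rows of the input.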

TrigonaMinima