I'm working with UserZoom, which exports its results in a strange format: each score sits in its own column, and every other column in that row is NaN.

         How dissatisfied or satisfied are you with your experience today
         Col1      Col2      Col3      Col4     Col5    Col6    Col7    Col8    Col9    Col10
user1    nan       nan       nan       nan      nan     nan       7     nan     nan     nan
user2    nan         2       nan       nan      nan     nan     nan     nan     nan     nan
user3    nan       nan         3       nan      nan     nan     nan     nan     nan     nan
user4    nan       nan       nan       nan      nan     nan     nan     nan     nan     10

...and so on.

I'm searching for a way to collapse them all into one column so the output is a series like this:

         Satisfaction
user1    7
user2    2
user3    3
user4    10

I found a way to pull out the whole column, but after that I got stuck:

import pandas as pd

# Skip the first two rows and use integer column labels (header=None)
xl_file = pd.read_excel(
    "/Users/Folder/file.xlsx", skiprows=(0, 1), header=None)

comment = xl_file[17]
user_id = xl_file[0]

I also looked at "How to iterate over rows in a DataFrame in Pandas", where someone said that iterating over rows is not a great idea. It linked to this article, which I'm finding a bit confusing: https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#essential-basic-functionality

Can you point me in the right direction?

Thanks!

Edit: I should probably edit it to make it a bit more realistic.

  • `df.sum(axis=1)`? – sushanth Aug 17 '20 at 12:50
  • Are we doing the sum on an excel file? Or am I writing individual scores as `score_1 = xl_file[18], score_2 = xl_file[19], score_3 = xl_file[21]` and so on, and then adding? Seems a bit fiddly, the scores are actually out of 10 in the real file (this is just an example). – Belthazubel Aug 17 '20 at 13:19
  • @sushanth's answer is good. Alternatively, you could use [pandas.melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html) and then [drop NaNs](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html) to avoid relying on the fact that pandas' sum ignores NaNs by default. – Big Bro Aug 17 '20 at 13:31
  • I'm a potato. Sorry, I didn't realise it already reads it in as a DataFrame. Thanks! – Belthazubel Aug 17 '20 at 14:00
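
For anyone landing here, the two suggestions from the comments can be sketched roughly as follows. This is only an illustration: the `scores` frame is a made-up stand-in for the score columns of the UserZoom export (the real file would come from `pd.read_excel`), and the column names and `Satisfaction` label are assumptions taken from the example in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the export: one score per row, NaN everywhere else.
scores = pd.DataFrame(
    {
        "Col2": [np.nan, 2, np.nan, np.nan],
        "Col3": [np.nan, np.nan, 3, np.nan],
        "Col7": [7, np.nan, np.nan, np.nan],
        "Col10": [np.nan, np.nan, np.nan, 10],
    },
    index=["user1", "user2", "user3", "user4"],
)

# Option 1: row-wise sum. NaNs are skipped by default; min_count=1 keeps a
# row that is entirely NaN as NaN instead of turning it into 0.
# astype(int) assumes every user gave exactly one score.
satisfaction_sum = (
    scores.sum(axis=1, min_count=1).astype(int).rename("Satisfaction")
)

# Option 2: melt to long format and drop the NaNs, so nothing depends on
# how sum() treats missing values.
long_form = scores.reset_index().melt(id_vars="index", value_name="Satisfaction")
satisfaction_melt = (
    long_form.dropna(subset=["Satisfaction"])
    .set_index("index")["Satisfaction"]
    .astype(int)
    .reindex(scores.index)  # restore the original user order
)

print(satisfaction_sum)
print(satisfaction_melt)
```

Both options should give one value per user. The sum version is shorter, but it would silently add scores together if a row ever contained more than one, whereas the melt version would then produce duplicate index entries, and the final `reindex` would raise an error on them.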