0

I have several files in excel with multiple columns and I want to concat each colmun in a single one. Is there a performative code to do it?

df = pd.read_excel('exemple.xlsx', header=0, na_values='NA')
df.head()

joincolumn = pd.concat((df['dia1'] + df['dia1a'] + df['dia1b'] + df['dia1c'] + df['dia1d'] + df['A2'] + df['A2A'] + df['B2'] + df['B2B'] + df['C2'] + df['C2C'] +df['A3'] + df['A3A'] + df['B3'] + df['B3B'] + df['C3'] + df['C3C']))

enter image description here

this should look like:

0.423
0.469
0.428
0.495
0.448
0.51
0.494
0.425
0.489
0.465

it's not working!

msanford
  • 11,803
  • 11
  • 66
  • 93
MARCOS SANTOS
  • 87
  • 1
  • 7
  • 1
    Hi Marcos, are you trying to sum the columns or you really want to concatenate them as strings? It is not clear by your question was is the input and the expected output. – renatomt Jul 31 '20 at 03:08

1 Answers1

1

This can achieved using ravel

Since no real data provided, for the sack of clear presentation, assume dial and dia1a as below:

df = pd.DataFrame({'dia1':[1, 2, 3, 4],'dia1a':['A', 'B', 'C', 'D']})

Then ravel these two column gives:

print(pd.Series(df.values.ravel('F')))

1    2
2    3
3    4
4    A
5    B
6    C
7    D

If you would like to maintain the hierarchy, you can use stack.

df = pd.DataFrame({'dia1':[1, 2, 3, 4],'dia1a':['A', 'B', 'C', 'D']})

print(df.stack().reset_index())

which will output

   level_0 level_1  0
0        0    dia1  1
1        0   dia1a  A
2        1    dia1  2
3        1   dia1a  B
4        2    dia1  3
5        2   dia1a  C
6        3    dia1  4
7        3   dia1a  D
mpx
  • 3,081
  • 2
  • 26
  • 56
  • Hi, thanks very much! it´s working now! – MARCOS SANTOS Jul 31 '20 at 03:39
  • 1
    That great @MARCOSSANTOS, since the proposed method is working, maybe you can kindly mark an [answer as accepted](https://stackoverflow.com/help/someone-answers), click on the check mark beside the answer to toggle it from greyed out to filled in. – mpx Jul 31 '20 at 04:01