Combine two numeric dataframe columns into one column of tuples

Question

I would like to create a new column that combines two existing columns. I searched online but found nothing. How could I do this?

Ex:

A B
50.631456 5.57871    

C
(50.631456, 5.57871)

So you want to put two columns into one, like in C? See this: https://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas and put your data in a Series. — Lukasz, Mar 04 '18 at 22:34
You want to combine two (numeric?) columns into one column containing a Python tuple, yes? (Not strings) — smci, Mar 04 '18 at 23:26
By the way, if you want to do this to have (lat, long) tuple objects inside your dataframe, it's not a great idea: any function that processes them will have to give them special treatment. Better to convert to tuples only when you write to CSV, export, or pickle the dataframe. — smci, Mar 04 '18 at 23:56
@ThibaultMambour, does one of the solutions below solve your problem? If so, consider accepting an answer (green tick on the left), so other users know. — jpp, Mar 09 '18 at 01:49

jpp · Answer 1 · 2018-03-04T23:00:29.150

list + zip is one efficient way:

df['C'] = list(zip(df.A, df.B))

#            A        B                     C
# 0  50.631456  5.57871  (50.631456, 5.57871)
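
For a self-contained version of the above, here is a minimal sketch. The second row of sample data is made up for illustration and is not from the question; the rest follows the list + zip approach as described.

```python
import pandas as pd

# Hypothetical sample data: the first row is from the question,
# the second row is invented for illustration.
df = pd.DataFrame({"A": [50.631456, 48.856614], "B": [5.57871, 2.352222]})

# zip pairs the values row by row; list materializes the pairs so that
# pandas stores them as a single object-dtype column of tuples.
df["C"] = list(zip(df["A"], df["B"]))

# Each cell of the new column is a plain Python tuple.
first = df.loc[0, "C"]
print(type(first).__name__, first)
```

Note that the resulting column has object dtype, so vectorized numeric operations no longer apply to it directly.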

Performance

As expected, df.apply approaches loop over rows in Python and are inefficient for large dataframes, especially when combined with a lambda.

df = pd.concat([df]*10000)

%timeit list(zip(df.A, df.B))                  # 3.14ms
%timeit df.apply(tuple, axis=1)                # 378ms
%timeit df.apply(lambda x: (x.A,x.B), axis=1)  # 577ms
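
The %timeit lines above need IPython. Outside IPython, a rough comparison could be sketched with the standard timeit module. The frame size, repetition count, and helper names (with_zip, with_apply) are assumptions chosen for illustration; absolute timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical frame: the question's single row repeated 1,000 times.
df = pd.DataFrame({"A": [50.631456] * 1000, "B": [5.57871] * 1000})


def with_zip():
    # Pair the two columns directly, without going through apply.
    return list(zip(df["A"], df["B"]))


def with_apply():
    # Row-wise apply: calls tuple() on every row.
    return df.apply(tuple, axis=1)


# Sanity check: both approaches should build the same tuples.
same = with_zip() == list(with_apply())

# Time a few repetitions of each approach.
zip_seconds = timeit.timeit(with_zip, number=5)
apply_seconds = timeit.timeit(with_apply, number=5)
print(f"same result: {same}")
print(f"zip: {zip_seconds:.4f}s  apply: {apply_seconds:.4f}s")
```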

score 4 · Answer 2 · answered Mar 04 '18 at 22:34

Check out DataFrame.apply.

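import numpy as np
import pandas as pd

# Six rows of random integers in [0, 10) across two columns
df = pd.DataFrame(np.random.randint(0, 10, (6, 2)), columns=['a', 'b'])

# Build a tuple from each row (axis=1 applies the function row-wise)
df['c'] = df.apply(tuple, axis=1)
df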

returns

   a  b       c
0  8  1  (8, 1)
1  3  3  (3, 3)
2  2  8  (2, 8)
3  6  2  (6, 2)
4  2  2  (2, 2)
5  8  5  (8, 5)

score 1 · Answer 3 · answered Mar 04 '18 at 22:38

You can use apply.

df = pd.DataFrame({'A': {0: 50.631456}, 'B': {0: 5.57871}})

df
Out[162]: 
           A        B
0  50.631456  5.57871

df['C'] = df.apply(lambda x: (x.A,x.B), axis=1)

df
Out[155]: 
           A        B                     C
0  50.631456  5.57871  (50.631456, 5.57871)
