
The R package tidyr has a nice separate function to "Separate one column into multiple columns."

What is the pandas version?

For example, here is a dataset:

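import pandas
from io import StringIO  # standard library; replaces the Python 2 era six.StringIO

raw = """  i  | j | A
         AR  | 5 | Paris,Green
         For | 3 | Moscow,Yellow
         For | 4 | New York,Black"""
# Removing every space also turns "New York" into "NewYork".
df = pandas.read_csv(StringIO(raw.replace(' ', '')), sep="|", header=0)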

I'd like to separate the A column into two columns, one for each of the comma-separated values it contains.

This question is related: Accessing every 1st element of Pandas DataFrame column containing lists

Paul Rougieux

1 Answer

The equivalent of tidyr::separate is str.split with expand=True, which returns one column per piece; assign the result to the new columns:

df[['Town', 'Color']] = df['A'].str.split(',', n=1, expand=True)
print(df)

#      i  j              A     Town   Color
# 0   AR  5    Paris,Green    Paris   Green
# 1  For  3  Moscow,Yellow   Moscow  Yellow
# 2  For  4  NewYork,Black  NewYork   Black
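
Unlike the assignment above, tidyr::separate drops the input column by default. Here is a minimal sketch of a wrapper that does the same in pandas. The `separate` helper and its `into`, `sep` and `remove` parameters are hypothetical names borrowed from tidyr, not part of pandas, and the data frame is rebuilt by hand so the example runs on its own:

```python
import pandas as pd

# Same data as in the question, built directly (spaces already stripped).
df = pd.DataFrame({
    "i": ["AR", "For", "For"],
    "j": [5, 3, 4],
    "A": ["Paris,Green", "Moscow,Yellow", "NewYork,Black"],
})


def separate(frame, col, into, sep=",", remove=True):
    """Hypothetical helper mimicking tidyr::separate; not a pandas function."""
    # Split at most len(into) - 1 times, so any extra separators
    # stay inside the last piece instead of creating more columns.
    parts = frame[col].str.split(sep, n=len(into) - 1, expand=True)
    parts.columns = into
    out = frame.join(parts)
    # Drop the original column unless the caller asks to keep it.
    return out.drop(columns=col) if remove else out


result = separate(df, "A", into=["Town", "Color"])
print(result)
```

Returning a new frame instead of modifying `df` in place keeps the helper usable in a method chain, for example `df.pipe(separate, "A", into=["Town", "Color"])`.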

The equivalent of tidyr::unite is a simple concatenation of the character vectors:

df["B"] = df["i"] + df["A"]
df
#      i  j              A                 B
# 0   AR  5    Paris,Green     ARParis,Green
# 1  For  3  Moscow,Yellow  ForMoscow,Yellow
# 2  For  4  NewYork,Black  ForNewYork,Black
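
tidyr::unite puts a separator ("_" by default) between the pieces, which plain `+` does not. One way to sketch that in pandas is Series.str.cat with its `sep` argument. The small data frame below is rebuilt by hand as an assumption, so the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "i": ["AR", "For", "For"],
    "A": ["Paris,Green", "Moscow,Yellow", "NewYork,Black"],
})

# Series.str.cat joins two string columns element-wise,
# inserting the separator between the values of each row.
df["B"] = df["i"].str.cat(df["A"], sep="_")
print(df)
```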
Paul Rougieux