I have this dataframe:

import pandas as pd

df = pd.DataFrame({"X": ["2017-12-17", "2017-12-18", "2017-12-19"],
                   "Y": ["F", "W", "Q"]})

I want to add a key column that joins X and Y with an underscore:

            X  Y           key
0  2017-12-17  F  2017-12-17_F
1  2017-12-18  W  2017-12-18_W
2  2017-12-19  Q  2017-12-19_Q

I have tried approaches 1, 2 and 3, and the best one for speed (there are nearly 1 million rows) is:

df.assign(key=[str(x) + "_" + y for x, y in zip(df["X"], df["Y"])])

And it gives me this error:

TypeError: unsupported operand type(s) for +: 'Timestamp' and 'str'

Why?

Chris
  • It would seem that one of the values is not of type `str`. Using the sample code you provided, all are strings, but perhaps your actual data is not. – David Zemens Jun 07 '19 at 16:58
  • Figure out which one, and convert it to a string: https://stackoverflow.com/questions/10624937/convert-datetime-object-to-a-string-of-date-only-in-python/35780962 – David Zemens Jun 07 '19 at 16:59

1 Answer

It looks like your X column does not hold strings as posted, but Timestamp values. In any case, you can try:

df['key'] = df.X.astype(str) + '_' + df.Y.astype(str)
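
To see where the error likely comes from, here is a minimal sketch. It assumes the real X column was parsed as datetime64 (for example by `parse_dates` when reading a file), which the posted sample does not show. It also assumes the failing code concatenated `x` directly, without the `str(x)` shown in the question. The `error` variable and the explicit `"%Y-%m-%d"` format are illustrative choices, not part of the original post.

```python
import pandas as pd

# Assumption: X holds datetime64 values in the real data, unlike the
# string sample posted in the question.
df = pd.DataFrame({"X": pd.to_datetime(["2017-12-17", "2017-12-18", "2017-12-19"]),
                   "Y": ["F", "W", "Q"]})

# Check the column types first; X should show up as datetime64[ns].
print(df.dtypes)

# Iterating a datetime64 column yields Timestamp objects, so adding a
# str to each element without converting it first raises TypeError.
try:
    [x + "_" + y for x, y in zip(df["X"], df["Y"])]
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Format the dates explicitly, then concatenate the columns element-wise.
df["key"] = df["X"].dt.strftime("%Y-%m-%d") + "_" + df["Y"].astype(str)
print(df)
```

Using `dt.strftime` pins the date format explicitly rather than relying on how `astype(str)` renders timestamps, which matters if some rows carry a time component.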
Quang Hoang