I have a dataset, for example:

df = pd.DataFrame({'x':[1,5,7,8], 'y':['12,4', '1,6,7', '8,3,2', '1']})

I want to split each value in column y, sort the numbers, and replace the comma (',') with a dash ('-'), like this:

   x      y
0  1   4-12
1  5  1-6-7
2  7  2-3-8
3  8      1

I've tried using apply, but it raises TypeError: can only join an iterable:

df['y'] = df['y'].str.split(',')
df['y'] = df['y'].apply(lambda x : '-'.join(x.sort()))

What is wrong with my code?

yasin
  • [John Zwinck's answer](https://stackoverflow.com/a/67747614/12975140) below is the way to go, as it will run [more efficiently than `apply`](https://stackoverflow.com/questions/54432583/when-should-i-not-want-to-use-pandas-apply-in-my-code). But the specific reason you're getting that error is because `x.sort()` sorts *in place* and returns `None`, as opposed to `sorted(x)`. – CrazyChucky May 29 '21 at 04:11
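
The difference between `list.sort()` and `sorted()` described in the comment above can be reproduced without pandas. This is a minimal sketch; the variable names `items`, `result`, and `joined` are illustrative and do not come from the question.

```python
items = ['12', '4']

# list.sort() sorts the list in place and returns None,
# so '-'.join(...) is handed None instead of a list.
result = items.sort()
print(result)  # None

try:
    '-'.join(result)
except TypeError as exc:
    print(exc)  # can only join an iterable

# sorted() returns a new list, which join accepts.
# key=int makes the comparison numeric rather than lexicographic.
joined = '-'.join(sorted(['12', '4'], key=int))
print(joined)  # 4-12
```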

1 Answer

I'd do it this way:

df.y.str.split(',').map(lambda items : sorted(items, key=int)).str.join('-')

The first step is to split, as you did. Then use map, which applies an arbitrary function to each element of the Series (that is, to each list). Here you want to sort numerically rather than as strings, hence key=int. Finally, join with '-'.
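
To see why `key=int` matters, here is a small sketch that runs the same chain on the question's DataFrame with and without the key. The intermediate names `parts`, `as_text`, and `as_numbers` are assumptions introduced for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 5, 7, 8], 'y': ['12,4', '1,6,7', '8,3,2', '1']})

# Split each string into a list of digit strings.
parts = df['y'].str.split(',')

# Plain sorted() compares strings, so '12' sorts before '4'.
as_text = parts.map(sorted).str.join('-')

# key=int compares the numeric values instead.
as_numbers = parts.map(lambda items: sorted(items, key=int)).str.join('-')

print(as_text.tolist())     # ['12-4', '1-6-7', '2-3-8', '1']
print(as_numbers.tolist())  # ['4-12', '1-6-7', '2-3-8', '1']

# The chain returns a new Series; assign it back to update the column.
df['y'] = as_numbers
```

Note that the one-liner above does not modify `df` by itself; the final assignment is what updates column `y`.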

John Zwinck