Given the following DataFrame:

df.show()

+--------------+
|          name|
+--------------+
| Some_Cool_Guy|
|Some_Other_Guy|
+--------------+

How do I pick out the 'middle' part of the string correctly? Am I missing a library?

I've tried:

df.withColumn("newCol", df["name"]).show()

+--------------+--------------+
|          name|        newCol|
+--------------+--------------+
| Some_Cool_Guy| Some_Cool_Guy|
|Some_Other_Guy|Some_Other_Guy|
+--------------+--------------+

and then a bit of string manipulation:

df.withColumn("newCol", df["name"].split('_')[1]).show()

but this just blows up with:

'Column' object is not callable.

The expected outcome is:

+--------------+------+
|          name|newCol|
+--------------+------+
| Some_Cool_Guy|  Cool|
|Some_Other_Guy| Other|
+--------------+------+
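For reference, here is a minimal sketch of one way this might be done, not necessarily the only or best one. As far as I can tell, `df["name"]` is a Spark `Column`, not a Python string, so it has no `str.split` method; attribute access on a `Column` is treated as a field lookup and returns another `Column`, which is why calling it fails with "'Column' object is not callable". The usual route is the `split` function from `pyspark.sql.functions`, followed by `getItem(1)`. The helper names `middle_part` and `add_middle_column` are made up for this illustration.

```python
def middle_part(name: str, sep: str = "_", index: int = 1) -> str:
    """Plain-Python reference for the per-row result we are after."""
    return name.split(sep)[index]


def add_middle_column(df, src="name", dst="newCol"):
    """Add `dst` to a Spark DataFrame using column functions, not str methods.

    Assumes `df` is a pyspark.sql.DataFrame with a string column `src`.
    """
    # Imported inside the function so the plain-Python reference above
    # can still be run on a machine without Spark installed.
    from pyspark.sql import functions as F

    # F.split takes a regex pattern and returns an array column;
    # getItem(1) selects the second element of that array for each row.
    return df.withColumn(dst, F.split(F.col(src), "_").getItem(1))
```

With the DataFrame from the question, `add_middle_column(df).show()` should produce the table shown above, assuming every name contains at least two underscore-separated parts.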

This is driving me a bit round the bend.

Cheers!

m1nkeh