Given the following DataFrame:
df.show()
+--------------+
|          name|
+--------------+
| Some_Cool_Guy|
|Some_Other_Guy|
+--------------+
How do I pick out the 'middle' part of the string correctly? Am I missing a library?
I've tried:
df.withColumn("newCol", df["name"]).show()
+--------------+--------------+
|          name|        newCol|
+--------------+--------------+
| Some_Cool_Guy| Some_Cool_Guy|
|Some_Other_Guy|Some_Other_Guy|
+--------------+--------------+
and then a bit of string manipulation:
df.withColumn("newCol", df["name"].split('_')[1]).show()
but this just blows up with:
'Column' object is not callable.
The expected outcome is:
+--------------+------+
|          name|newCol|
+--------------+------+
| Some_Cool_Guy|  Cool|
|Some_Other_Guy| Other|
+--------------+------+
This is driving me a bit round the bend.
Cheers!