I have tried the code below in pandas and it works. How would I do the same in PySpark?
The input is a column of domain names, one per row, for example:

news.bbc.co.uk

Each value should be split at the '.', so that index equals:

[['news', 'bbc', 'co', 'uk'], ['next', 'domain', 'name']]
index = df2.domain.str.split('.').tolist()
Does anyone know how I'd do this in Spark rather than pandas?
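For context, here is a minimal sketch of what I imagine the PySpark equivalent might look like, based on pyspark.sql.functions.split (the sample DataFrame and the column name domain are just assumptions mirroring my pandas setup; note that split takes a regex pattern, so the dot has to be escaped):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import split

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical sample data mirroring the pandas df2
    df2 = spark.createDataFrame(
        [("news.bbc.co.uk",), ("next.domain.name",)],
        ["domain"],
    )

    # split() takes a regex pattern, so escape the literal dot
    df2 = df2.withColumn("domain_parts", split(df2.domain, r"\."))

    # Bring the result back as a plain list of lists, like .tolist() in pandas
    index = [row.domain_parts for row in df2.select("domain_parts").collect()]
    # index == [['news', 'bbc', 'co', 'uk'], ['next', 'domain', 'name']]

I'm not sure whether collecting everything to the driver like this is idiomatic in Spark, or whether the split column should stay distributed, so any guidance on that would be welcome too.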
Thanks