I am trying to split the strings in the column tweet_text, but only in rows where the column lang is 'en'.
Here is how to do it on a string:
s = 'I am always sad'
s_split = s.split(" ")
This returns:
['I', 'am', 'always', 'sad']
My current code, which does not work (the strings are left unsplit):
df['tweet_text'] = df.apply(lambda x: x['tweet_text'].split(" ") if x['lang'] is 'en' else x['tweet_text'], axis = 1)
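One possible culprit in the line above is the comparison x['lang'] is 'en'. The is operator tests whether two names refer to the same object, not whether two strings hold the same characters, so it can be False even when the text matches. The sketch below shows the difference in plain Python; the names target, built, same_value and same_object are made up for illustration, and whether two equal strings share one object is an implementation detail of the interpreter.

```python
# Build the string 'en' at runtime so it is (typically) a separate
# object from the string bound to `target`.
target = 'en'
built = ''.join(['e', 'n'])

same_value = built == target    # compares the characters
same_object = built is target   # compares object identity

print(same_value)    # True: both strings contain 'en'
print(same_object)   # implementation-dependent; typically False in CPython
```

Values read out of a DataFrame are usually created at runtime in the same way, which is why an identity check against a literal is unreliable there.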
Dictionary of data:
{'lang': {1404: 'en',
1943: 'en',
2169: 'en',
2502: 'de',
3981: 'nl',
4226: 'en',
7223: 'en',
8557: 'de',
11339: 'pt',
11854: 'en'},
'tweet_text': {1404: 'I am always sad when a colleague loses his job and Frank is not just a colleague he is an impoant person in my',
1943: 'It remains goalless at FNB Stadium between Kaizer Chiefs and Baroka at halftimeRead more',
2169: 'Which one gets your vote 05',
2502: 'Was sagt ihr zu den ersten Minuten',
3981: 'En we gaan door speelronde begint vandaagTegen wie speelt jouw favoriete club',
4226: 'Quote tweet or replyYour favourite Mesut Ozil moment as a Gunner was',
7223: 'How to follow the game live The opponent Current form Did you know The squad Koeman said It must b',
8557: 'BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN BAYERN',
11339: '9o golo para',
11854: 'have loads of boss stuff available on their store products available including the m'}}
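For reference, here is a minimal sketch of the same row-wise apply with == in place of is, run on three rows taken from the dictionary above. It assumes pandas is installed and that df is built with pd.DataFrame from that dictionary; the helper name split_if_english is hypothetical.

```python
import pandas as pd

# Three rows copied from the sample data in the question.
data = {
    'lang': {2169: 'en', 2502: 'de', 11339: 'pt'},
    'tweet_text': {
        2169: 'Which one gets your vote 05',
        2502: 'Was sagt ihr zu den ersten Minuten',
        11339: '9o golo para',
    },
}
df = pd.DataFrame(data)

def split_if_english(row):
    # Hypothetical helper: split only when the language code equals 'en'.
    # Note the value comparison (==) rather than the identity check (is).
    if row['lang'] == 'en':
        return row['tweet_text'].split(" ")
    return row['tweet_text']

df['tweet_text'] = df.apply(split_if_english, axis=1)

print(df.loc[2169, 'tweet_text'])   # ['Which', 'one', 'gets', 'your', 'vote', '05']
print(df.loc[2502, 'tweet_text'])   # Was sagt ihr zu den ersten Minuten
```

After this, the column holds lists for the 'en' rows and plain strings for the others, so any later code has to handle both types.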