Adding all strings in a column into one

Question

I have a relatively simple question, but I am a novice in Python, so I need help.

I want to iterate over a column in Python where all values are sentences, like 'Friends+CCas+good result', 'just want everything to go smooth. serious', 'a mixture of both academic and non-academic', etc.

	Series I want to loop over
First	'Friends+CCas+good result'
Second	'just want everything to go smooth. serious'

My goal is to combine all the strings in the column into a single one, in order to count the total number of occurrences of each word across the entire column. I found this method for two strings:

string = 'Hello ' 
string += 'World'

print(string)  # => 'Hello World'

After that I can call string.split(). However, I tried a list comprehension and a loop without getting the result I wanted for my entire column, which is something like this:

'Friends+CCas+good result just want everything to go smooth. serious a mixture of both academic and non-academic'

That is, a space between all the strings, so that I can then split the whole thing and get the total frequency of each word.

I hope I am clear enough.

Thank you in advance

It is not quite clear what end result you expect. Can you please update the question and add the expected result for the sample data you have presented in the question? — ThePyGuy, Aug 28 '21 at 08:09

score 0 · Answer 1 · answered Aug 28 '21 at 08:23


Assuming that by "column" you mean a Python list, you can iterate over the list and add each string with a space before it, like this:

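# list_name is the list of strings; sample values taken from the question
list_name = ['Friends+CCas+good result', 'just want everything to go smooth. serious']
full_str = ""
for sentence in list_name:
    full_str += " " + sentence
full_str = full_str.strip()  # remove the leading space added on the first pass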
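
The same concatenation can probably be written more compactly with str.join, which puts the separator only between items, and the joined string can then be split and counted with collections.Counter. This is only a minimal sketch, assuming the column is a plain list of strings; the names sentences and word_counts are made up for illustration.

```python
from collections import Counter

# Sample data taken from the question; `sentences` is a hypothetical name
sentences = [
    'Friends+CCas+good result',
    'just want everything to go smooth. serious',
    'a mixture of both academic and non-academic',
]

# Join all strings with a single space between them
full_str = " ".join(sentences)

# Split on whitespace and count how often each token occurs
word_counts = Counter(full_str.split())

print(full_str)
print(word_counts)
```

Note that splitting on whitespace keeps punctuation attached, so 'Friends+CCas+good' and 'smooth.' are each counted as one token here.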


Lecdi


Thank you very much, have a nice day – Thomas ZILLIOX Aug 28 '21 at 08:38

score 0 · Answer 2 · answered Aug 28 '21 at 08:40

I'd advise using a regex to extract the words:

import re

data = ['Friends+CCas+good result', 'just want everything to go smooth. serious']
words = re.findall(r'\b\w+\b', ' '.join(data))  # list of all words across the joined strings
print(words)
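
If pandas is not needed, the regex result can also be counted directly with collections.Counter from the standard library. The following is a rough sketch under that assumption; lowercasing before matching is my own choice here so that 'Friends' and 'friends' would count as the same word, and the names words and counts are illustrative.

```python
import re
from collections import Counter

data = ['Friends+CCas+good result', 'just want everything to go smooth. serious']

# Join the strings, lowercase them, and extract runs of word characters
words = re.findall(r'\b\w+\b', ' '.join(data).lower())

# Count the occurrences of each extracted word
counts = Counter(words)
print(counts)
```

Because \w does not match '+' or '.', this splits 'Friends+CCas+good' into three words and drops the trailing period from 'smooth.'.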

Or use pandas:

import pandas as pd

data = ['Friends+CCas+good result', 'just want everything to go smooth. serious']
df = pd.DataFrame(data, columns=['strings'])
# Lowercase each string, then extract its words as a list
df['strings'] = df['strings'].str.lower().str.findall(r'\b\w+\b')
# Expand to one row per word, then count how often each word occurs
df.explode('strings').stack().value_counts()

	0
result	1
good	1
serious	1
friends	1
go	1
want	1
to	1
smooth	1
ccas	1
just	1
everything	1

Thank you very much, I will try that too. Have a nice day – Thomas ZILLIOX Aug 28 '21 at 11:15
