I am gathering insights on a platform that sends automated messages, and I'd like to group the messages by type and count how often each type occurs. Some of these messages are syntactically similar, others are not.
For example, my pandas dataframe currently looks something like this:
message                                    | count
-------------------------------------------|-------
"Happy Birthday!"                          | 50
"Good luck on your first day of school!"   | 44
"Sent comms on 04042020"                   | 3
"Sent comms on 05031996"                   | 1
...
"Sent comms on 06052021"                   | 1
"Sent comms on 11042020"                   | 1
"Sent comms on 07202014"                   | 1
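To make this reproducible, here is a minimal sketch that builds the visible part of that dataframe. I'm assuming the quotes in the table are only display formatting and not part of the strings, and the rows hidden behind "..." are left out, so the totals here are smaller than in my real data.

```python
import pandas as pd

# Only the rows shown in the table above; the rows elided by "..." are omitted.
# Assumption: the surrounding quotes are display-only, not part of the values.
df = pd.DataFrame(
    {
        "message": [
            "Happy Birthday!",
            "Good luck on your first day of school!",
            "Sent comms on 04042020",
            "Sent comms on 05031996",
            "Sent comms on 06052021",
            "Sent comms on 11042020",
            "Sent comms on 07202014",
        ],
        "count": [50, 44, 3, 1, 1, 1, 1],
    }
)
```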
What I would like is to condense some rows together, so that all the rows I know are essentially the same comms message with different metadata (here, the date) are counted as a single item, i.e.:
message                                    | count
-------------------------------------------|-------
"Sent comms on XXXXXXXX"                   | 94
"Happy Birthday!"                          | 50
"Good luck on your first day of school!"   | 44
Is there any functionality in pandas to do this? I'm slightly familiar with the various aggregation functions, but I'm not sure how to aggregate rows in a dataframe based on substrings or other conditions on the text.
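For what it's worth, this is a rough sketch of the kind of thing I have in mind, not something I know to be the idiomatic way. It assumes the variable part is always exactly eight digits at the end of a "Sent comms on " message, and it uses only the sample rows shown above (so the condensed count comes out as 7 here rather than 94). The names `normalized` and `result` are just placeholders.

```python
import pandas as pd

# Sample rows only; the real frame has more "Sent comms on ..." rows.
df = pd.DataFrame(
    {
        "message": [
            "Happy Birthday!",
            "Good luck on your first day of school!",
            "Sent comms on 04042020",
            "Sent comms on 05031996",
            "Sent comms on 06052021",
            "Sent comms on 11042020",
            "Sent comms on 07202014",
        ],
        "count": [50, 44, 3, 1, 1, 1, 1],
    }
)

# Assumption: the date is always exactly eight digits at the end of the message.
# Rewrite every matching message to one shared label; other messages are untouched.
normalized = df["message"].str.replace(
    r"^Sent comms on \d{8}$", "Sent comms on XXXXXXXX", regex=True
)

# Group on the normalized text, sum the counts, and order by count descending.
result = (
    df.assign(message=normalized)
    .groupby("message", as_index=False)["count"]
    .sum()
    .sort_values("count", ascending=False)
    .reset_index(drop=True)
)

print(result)
```

This only handles one known pattern at a time; I'd presumably need a separate regex (or a mapping of patterns to labels) for every other family of messages I want to merge.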
Thanks for any pointers.