I am trying to merge two datasets together where one lists schools like Alabama, Alabama State, Arizona State, etc. The other lists the schools as University of Alabama, Alabama State University, Arizona State University, etc.

My strategy is to remove "University", "of", and other such words from one dataset so that I can merge on the School column.

I am EXTREMELY new to Python and do not know how to edit the column to achieve this.

If there is a way to merge the two with "close" matches, that would also be beneficial.

Thank you.

bschaible
  • Take a look at [str.replace(old, new)](https://www.tutorialspoint.com/python/string_replace.htm) for starters. That will allow you to replace University with empty string, for example. – jarmod Oct 17 '20 at 22:13
  • Please repeat [on topic](https://stackoverflow.com/help/on-topic) and [how to ask](https://stackoverflow.com/help/how-to-ask) from the [intro tour](https://stackoverflow.com/tour). "Show me how to solve this coding problem?" is off-topic for Stack Overflow. You have to make an honest attempt at the solution, and then ask a *specific* question about your implementation. Stack Overflow is not intended to replace existing tutorials and documentation. The Python string functions are documented quite well; give it a try, and post code when you have a problem. – Prune Oct 17 '20 at 22:23
  • Does this answer your question? [How to replace multiple substrings of a string?](https://stackoverflow.com/questions/6116978/how-to-replace-multiple-substrings-of-a-string) – kiran_raj_r Oct 17 '20 at 22:26
  • What is the rule that tells you which words to remove? – Karl Knechtel Oct 17 '20 at 22:34

1 Answer

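You can call replace on every string, e.g. (assuming list_universities contains all the strings):

list_universities = [university.replace('University of ', '')
                     for university in list_universities]

This builds a new list in which "University of " has been removed from every string. Python strings are immutable, so replace returns a new string instead of modifying the original; the result has to be stored, which the list comprehension does here.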
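Removing one fixed substring only covers names of the form "University of X"; it leaves "Alabama State University" untouched. A minimal sketch of a more general approach, using only the standard library, might look like the following. The filler-word list, the helper names normalize and closest, and the 0.6 cutoff are assumptions made for illustration, not something taken from the question's data.

```python
import re
from difflib import get_close_matches

# Hypothetical list of filler words to strip; extend it for your own data.
FILLER = re.compile(r"\b(?:university|of|the)\b", flags=re.IGNORECASE)


def normalize(name):
    """Remove filler words, then collapse the leftover whitespace."""
    stripped = FILLER.sub(" ", name)
    return " ".join(stripped.split())


def closest(name, candidates, cutoff=0.6):
    """Return the candidate most similar to name, or None if none clears the cutoff."""
    matches = get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Sample names taken from the question.
short_names = ["Alabama", "Alabama State", "Arizona State"]
full_names = [
    "University of Alabama",
    "Alabama State University",
    "Arizona State University",
]

# Map each full name to the best-matching short name via its normalized form.
mapping = {full: closest(normalize(full), short_names) for full in full_names}
print(mapping)
```

If the data lives in pandas DataFrames, the same function could be applied to the column before merging, for example df["School"] = df["School"].map(normalize), where df stands for whichever DataFrame holds the long names. The difflib fallback is only a rough "close match" and can pair up the wrong schools when names are very similar, so it is worth checking the resulting mapping by eye.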

MD98