I have a list that looks like this:

['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']

I want to remove the university name, "Drexel University", along with the \r\n and the whitespace that follows it in front of the other phrases. I guess a regex would be a good idea, but I don't know how to exclude certain words with a regex.

I already have a solution, but if anyone could provide a regex version, I'd appreciate it.

user8314628
  • So it's a single element list with one string? – cs95 Sep 02 '17 at 22:13
  • Possible duplicate of [How to delete a character from a string using python?](https://stackoverflow.com/questions/3559559/how-to-delete-a-character-from-a-string-using-python) – Pokestar Fan Sep 02 '17 at 22:15
  • @COLDSPEED Yes. I don't think whether it is a string or a list is the main problem. I kept the list because I thought there might be a more convenient way to split it. – user8314628 Sep 02 '17 at 22:26

3 Answers

You can use .split() to split on whitespace and then slice the resulting list as follows:

>>> l = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']
>>> l = l[0].split()[2:]
>>> l
['Antoinette', 'Westphal', 'COMAD,', 'Animation', '&', 'Visual', 'Effects,', 'Undergraduate', 'Program']

If you want it as a string with a space between each word you can use l = ' '.join(l)
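Put together, the split, slice, and join steps can be sketched as below. The names data, words, and joined are chosen here for illustration, and the hard-coded slice index 2 assumes the name to drop is exactly two words long.

```python
data = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']

# split() with no arguments splits on any run of whitespace, including \r\n
words = data[0].split()

# Drop the first two words ("Drexel" and "University,"), then rejoin with single spaces
joined = ' '.join(words[2:])
print(joined)
```

Because the commas stay attached to the words, the joined string keeps the phrases separated by ", ".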

Mohd
  • That doesn't seem to be what I want. If you split it word by word, the phrase structure is broken. I want a result that looks like: Antoinette Westphal COMAD, Animation & Visual Effects, ... – user8314628 Sep 02 '17 at 22:27
  • Oh, I get it. Let discipline be the list. Then l=[d.strip() for d in discipline[0].split(',')] works. – user8314628 Sep 02 '17 at 22:32

To turn your one-element list into a list of strings, you can do:

l = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']

text = l[0]
lines = [line.strip().strip(',') for line in text.splitlines()]

Here, I extract the first item of the list, split it into lines, and use strip on each line to remove the surrounding whitespace and the trailing ",".

The result is:

['Drexel University', 'Antoinette Westphal COMAD',
 'Animation & Visual Effects', 'Undergraduate Program']

To remove the first element of the list, you can do:

lines.pop(0)

EDIT: RegEx

Using RegEx, you can split your text as below:

import re

text = l[0]
lines = re.split(r',\s+', text)
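Since the question asks for a regex version that also leaves out the university name, here is a minimal sketch of two ways that could be done. The variable names parts and cleaned are introduced here for illustration, and the pattern assumes the name to drop is literally "Drexel University" at the start of the text.

```python
import re

l = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']
text = l[0]

# Option 1: split on a comma followed by whitespace, then skip the first field
parts = re.split(r',\s+', text)[1:]
print(parts)

# Option 2: remove the leading name, then collapse each ",<whitespace>" into ", "
cleaned = re.sub(r'^Drexel University,\s*', '', text)
cleaned = re.sub(r',\s+', ', ', cleaned)
print(cleaned)
```

Option 1 yields a list of phrases; option 2 keeps a single string with the phrases separated by ", ".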
Laurent LAPORTE

In case you plan on doing this regularly for other words as well, I would generalize it a bit.

From your data:

l = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']

Assign the string to a variable:

l = l[0]

Define the list of keys you want to ignore:

ignore_keys = ["Drexel University,","\n","\r","  "]

Loop over the keys to ignore and replace each one with an empty string:

for ignore in ignore_keys:
    l = l.replace(ignore,"")

Then depending on how you want the result represented:

As a list: l.split(",")
As a string: l

Result:

print(l.split(","))
['Antoinette Westphal COMAD', 'Animation & Visual Effects', 'Undergraduate Program']

print(l)
Antoinette Westphal COMAD,Animation & Visual Effects,Undergraduate Program
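The same idea can be wrapped in a small helper, sketched below. The function name clean_fields and its ignore parameter are hypothetical, not part of any library. This variant splits on commas and strips each field, so it does not depend on the exact amount of indentation in the input.

```python
def clean_fields(text, ignore=()):
    """Split comma-separated text into stripped fields, skipping ignored or empty ones."""
    fields = [field.strip() for field in text.split(',')]
    return [field for field in fields if field and field not in ignore]


data = ['Drexel University,\r\n                  Antoinette Westphal COMAD,\r\n                  Animation & Visual Effects,\r\n                  Undergraduate Program']

fields = clean_fields(data[0], ignore={'Drexel University'})
print(fields)             # as a list
print(', '.join(fields))  # as a single string
```

Passing a different set to ignore lets the same helper drop other names without touching the whitespace handling.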
user3166881