How do I write a function that turns a string into a list of words? I am trying to remove whitespace and punctuation, and to lowercase every word.

For example:

input = string: "I have a big,red house."
output = list: "i","have","a","big","red","house"

def stir_to_list (file):
    result = [x.strip() for x in file.split(',')]
    return result
furas
jack
  • https://stackoverflow.com/questions/12683201/python-re-split-to-split-by-spaces-commas-and-periods-but-not-in-cases-like – Buckeye14Guy Dec 06 '19 at 20:03
  • The answer on that duplicate is more complicated than this needs, since the duplicate involves some commas that are field delimiters, and some commas that aren't. – chepner Dec 06 '19 at 20:12
  • That said, there are three separate questions here: how to split the string, how to discard punctuation (which could be folded into the splitting process), and how to lowercase the resulting strings. – chepner Dec 06 '19 at 20:14
  • As @chepner said, there are three problems: the first you have already solved; the last can be solved with `file = file.lower()`, which you can do as the first step. Removing punctuation would need `.replace('.', '').replace(',', '').replace(...)` or a list comprehension with, e.g., `if char not in ".,?!"` – furas Dec 06 '19 at 20:23
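
The regex-free route suggested in the comments could look roughly like the sketch below. It rests on one assumption that is not in the comments: punctuation is replaced with a space rather than deleted, because deleting the comma in "big,red" would glue the two words into "bigred". The function name `str_to_list_no_regex` is made up for illustration.

```python
import string

def str_to_list_no_regex(text):
    # Map every ASCII punctuation character to a space (assumption: replacing,
    # not deleting, so that "big,red" becomes "big red" instead of "bigred").
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    # Lowercase first, swap punctuation for spaces, then split on any whitespace.
    return text.lower().translate(table).split()

print(str_to_list_no_regex("I have a big,red house."))
# ['i', 'have', 'a', 'big', 'red', 'house']
```

Note that this variant also splits on apostrophes, so "don't" would come back as two pieces.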

2 Answers

Just change your code to:

import re

def stir_to_list (file):
    result = [x.lower() for x in re.findall(r"[\w']+",file)]
    return result

Note that I have added lower() and removed the strip() function.

I have used a regular expression here. re.findall() returns a list of every substring that matches the pattern. The pattern [\w']+ matches runs of word characters (letters, digits, underscore) and apostrophes, so the spaces and punctuation between the words are left out.

After that, I convert each word to lower case. Characters that are already lower case are unaffected, and upper-case letters are converted to lower case.

Output:

['i', 'have', 'a', 'big', 'red', 'house']
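
To see what the pattern keeps and what it drops before any lowercasing happens, here is a small sketch. The second input string is an invented example, not from the question.

```python
import re

pattern = r"[\w']+"  # one or more word characters or apostrophes

# The sentence from the question: spaces, the comma and the period are dropped.
first = re.findall(pattern, "I have a big,red house.")
print(first)
# ['I', 'have', 'a', 'big', 'red', 'house']

# Invented input: apostrophes, underscores and digits all count as part of a word.
second = re.findall(pattern, "Don't stop_now, room 42!")
print(second)
# ["Don't", 'stop_now', 'room', '42']
```

If digits or underscores should not count as part of a word, the character class would need to be narrowed.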

If you don't know about Regular Expressions, you can look it up here.

halfer
Debdut Goswami

You can do this with regular expressions as well:

import re
def stir_to_list (file):
    return re.findall(r"[\w']+", file.lower())

lower() is a string method that returns a copy of the string with all characters converted to lower case.
[\w']+ is a pattern that matches each run of word characters and apostrophes, i.e. each word.
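
As a quick check, lowercasing the whole string before matching and lowercasing each match afterwards give the same list for the sentence in the question. The sketch below uses hypothetical helper names and a precompiled pattern; it only claims equivalence for plain ASCII input like this example.

```python
import re

WORD_PATTERN = re.compile(r"[\w']+")  # hypothetical module-level name

def words_lower_first(text):
    # Lowercase the whole string once, then extract the words.
    return WORD_PATTERN.findall(text.lower())

def words_lower_after(text):
    # Extract the words first, then lowercase each one.
    return [word.lower() for word in WORD_PATTERN.findall(text)]

sample = "I have a big,red house."
print(words_lower_first(sample) == words_lower_after(sample))
# True
```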

Arya11