
I'm trying to write code that removes dashes from a string unless they sit between two letters inside a word. If a removed dash comes directly before or after a line break, the line break should be removed as well. According to the rules of this assignment, a dash that needs to be removed will always border a line break, a space, or another dash. If the characters on both sides of the dash are letters, it should not be removed.

def remove_dashes(string):
    lst = list(string)
    for i in range(len(lst)-1):
        if lst[i] == '-' and lst[i+1] == (' ' or '-' or '\n') or lst[i-1] == (' ' or '-' or '\n'):
            lst[i] = ''
            if lst[i+1] == '\n':
                lst[i+1] = ''
            elif lst[i-1] == '\n':
                lst[i-1]
        elif lst[i] == '-' and i == len(lst)-1 and lst[i-1] == (' ' or '-' or '\n'):
            lst[i] = ''
            if lst[i-1] == '\n':
                lst[i-1] = ''
    return "".join(lst)

So in theory, "rem-\nove the-\nse da\n-shes--" should come back as "remove these dashes", with no line breaks, while "not-these-dashes" should come back unchanged as "not-these-dashes". However, my code is not working. Can anyone help?

Prune
mrcmc888
  • Can you show us an example input string and output string? – yash Oct 30 '17 at 23:32
  • Also, please fix your indentation – yash Oct 30 '17 at 23:34
  • Unless my Python is more rusty than I realised, expressions like `thing == (' ' or '-' or '\n')` don't mean what you think they mean. `or` has a wider meaning than just combining boolean results, but doesn't build a set of alternatives for `==` to search through. The English language can do things with `or` and `and` that most programming languages (and mathematics) can't. A quick check tells me that `'a' or 'b'` gives the result `'a'` - as far as I recall because that's the first alternative that isn't null - so that `==` probably just sees the ' ' on the RHS and never checks for '-' or '\n'. –  Oct 30 '17 at 23:42
  • So basically `lst[i+1] == (' ' or '-' or '\n')` only runs without triggering an error message by accident, and should be rewritten as the more long-winded `(lst[i+1] == ' ' or lst[i+1] == '-' or lst[i+1] == '\n')`. Similar for other cases. –  Oct 30 '17 at 23:54
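The point made in the comments can be checked directly, and the same fix carries over to the whole function. What follows is only a sketch of one possible correction, not the asker's final code: the names `BOUNDARY` and `drop` are introduced here for illustration, and it assumes a dash at the very start or end of the string is judged by its single neighbour.

```python
# `or` returns its first truthy operand, so the parenthesised
# expression collapses to a single space before `==` is evaluated.
assert (' ' or '-' or '\n') == ' '

# Hypothetical name: the characters that make a neighbouring dash removable.
BOUNDARY = (' ', '-', '\n')


def remove_dashes(text):
    chars = list(text)
    drop = set()  # indices to delete; collected first so neighbours stay readable
    for i, ch in enumerate(chars):
        if ch != '-':
            continue
        # Assumption: a missing neighbour (string start/end) is not a boundary.
        left = chars[i - 1] if i > 0 else None
        right = chars[i + 1] if i + 1 < len(chars) else None
        # Membership test replaces the `== (a or b or c)` comparison.
        if left in BOUNDARY or right in BOUNDARY:
            drop.add(i)
            if left == '\n':
                drop.add(i - 1)
            if right == '\n':
                drop.add(i + 1)
    return ''.join(ch for i, ch in enumerate(chars) if i not in drop)


print(repr(remove_dashes("rem-\nove the-\nse da\n-shes--")))  # 'remove these dashes'
print(repr(remove_dashes("not-these-dashes")))                # 'not-these-dashes'
```

Collecting indices in `drop` instead of overwriting `lst[i]` with `''` during the scan means each dash is compared against the original neighbours, and `i - 1` never wraps around to the end of the list when `i` is 0.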

1 Answer


You can use `re.sub` to delete every `-\n`, `--`, or `\n-` pair:

import re

s = ["rem-\nove the-\nse da\n-shes--", "not-these-dashes"]
# Delete each dash+newline, double dash, or newline+dash pair.
new_data = [re.sub("-\n|--|\n-", '', i) for i in s]
print(new_data)

Output:

['remove these dashes', 'not-these-dashes']
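That three-alternative pattern handles both sample strings, but it would leave a dash that borders a space (as in "a - b") untouched, and it would strip only two dashes from a run of three. One possible extension is sketched below. The names `REMOVABLE_DASH` and `strip_dashes` are made up for illustration, and the pattern assumes the assignment's rule as stated in the question: a dash goes if either neighbour is a space, a dash, or a line break, and an adjacent line break goes with it.

```python
import re

# Hypothetical pattern: an optional leading newline, then a dash that is either
# preceded or followed by a space, newline, or dash, then an optional newline.
REMOVABLE_DASH = re.compile(r"\n?(?:(?<=[ \n-])-|-(?=[ \n-]))\n?")


def strip_dashes(text):
    # Dashes with letters on both sides match neither alternative and are kept.
    return REMOVABLE_DASH.sub("", text)


print(repr(strip_dashes("rem-\nove the-\nse da\n-shes--")))  # 'remove these dashes'
print(repr(strip_dashes("not-these-dashes")))                # 'not-these-dashes'
print(repr(strip_dashes("a---b")))                           # 'ab'
```

The lookbehind and lookahead inspect the neighbouring characters without consuming them, so every dash in a run is still judged against the original text.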
Ajax1234