I wrote two functions that count punctuation marks in a text file: commas, apostrophes, hyphens and semicolons. Counting commas and semicolons is straightforward, but apostrophes and hyphens are more complicated because my assignment imposes rules on them, e.g. an apostrophe only counts if it sits between two letters, as in shouldn't or won't. So I split the task into two functions: countpunc1() and countpunc2().

Each function returns a dictionary holding the counts for its punctuation marks.

Then, in main(), I want to return a dictionary in which the results of countpunc1 and countpunc2 are combined under a single key: punctuations.

An example is shown at the bottom.

Here is my code:


def countpunc1(text):

    # Commas and semicolons have no special rules, so every occurrence counts.
    punctuations = [',', ';']

    # Start each mark at 0 so it appears in the result even if it never occurs.
    dictToReturn = {}
    for p in punctuations:
        dictToReturn[p] = 0

    # Walk the text character by character so that adjacent marks such as
    # ",;" are counted individually instead of being read as one token.
    for ch in text:
        if ch in punctuations:
            dictToReturn[ch] += 1

    return dictToReturn


def countpunc2(text):
    # Turn other punctuation and double hyphens into spaces so that they
    # can never count as the letter next to an apostrophe or hyphen.
    for ch in '!"#$%&()*+./:<=>?@[\\]^_`{|}~':
        text = text.replace(ch, ' ')
    text = text.replace('--', ' ')

    punctuations = "'-"
    letters = "abcdefghijklmnopqrstuvwxyz"

    # Start both marks at 0 so they appear in the result even if never counted.
    dictToReturn = {"'": 0, "-": 0}

    # Skip the first and last characters: they have no neighbour on one side.
    for i in range(1, len(text) - 1):
        char = text[i]
        # Count the mark only when it sits directly between two letters.
        if char in punctuations and text[i-1] in letters and text[i+1] in letters:
            dictToReturn[char] += 1

    return dictToReturn

def main(text):
    # Read the whole file and lower-case it so the letter checks match.
    with open(text, 'r') as f:
        text = f.read().lower()
    profileDict = {}
    # profileDict[punctuations] = ??
    return profileDict

In the commented-out second-to-last line above, I tried things like:

profileDict[punctuations] = countpunc1(text) + countpunc2(text)

and

profileDict[punctuations] = countpunc1(text).items() + countpunc2(text).items()

Both attempts are clearly wrong: each one raises a TypeError: unsupported operand type(s).

The expected result is something like this:

E.g: dict[punctuations] = {",": 9, "'" : 0, ";" : 4, "-" : 11}
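For illustration, here is a minimal sketch of one way such a combined dictionary could be built. The helper name merge_counts is hypothetical and not part of my code above, and counts1 and counts2 are made-up stand-ins for what countpunc1(text) and countpunc2(text) might return. It relies on dict.update, and it assumes the key is meant to be the string "punctuations", since no variable called punctuations exists inside main().

```python
def merge_counts(*count_dicts):
    """Combine several {punctuation: count} dicts into one new dict."""
    merged = {}
    for counts in count_dicts:
        # dict.update copies every key/value pair; on a duplicate key
        # the later dictionary's value replaces the earlier one.
        merged.update(counts)
    return merged


# Stand-in values (assumed) for the two functions' return values.
counts1 = {",": 9, ";": 4}
counts2 = {"'": 0, "-": 11}

profileDict = {}
profileDict["punctuations"] = merge_counts(counts1, counts2)
print(profileDict)
```

Because the two functions count different punctuation marks, their keys should not collide, so nothing is overwritten in this sketch. If the keys could overlap, the counts would need to be added together instead.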

P.S. The functions themselves work fine; I tested them on multiple text files.

NicoNing