-1

I am working through some basic Python challenges, and this is one of them. The task is to read through a file and print the frequency of each letter in decreasing order. I can do that, but I want to enhance the program by also printing the frequency percentage next to each letter, in the form letter - frequency - freq%. Something like this: o - 46 - 10.15%

This is what I did so far:

def exercise11():
    import string
    while True:
        try:
            fname = input('Enter the file name -> ')
            fop = open(fname)
            break
        except:
            print('This file does not exists. Please try again!')
            continue

    counts = {}
    for line in fop:
        line = line.translate(str.maketrans('', '', string.punctuation))
        line = line.translate(str.maketrans('', '', string.whitespace))
        line = line.translate(str.maketrans('', '', string.digits))
        line = line.lower()
        for ltr in line:
            if ltr in counts:
                counts[ltr] += 1
            else:
                counts[ltr] = 1
    lst = []
    countlst = []
    freqlst = []
    for ltrs, c in counts.items():
        lst.append((c, ltrs))
        countlst.append(c)
    totalcount = sum(countlst)
    for ec in countlst:
        efreq = (ec/totalcount) * 100
        freqlst.append(efreq)
    freqlst.sort(reverse=True)
    lst.sort(reverse=True)
    for ltrs, c, in lst:
        print(c, '-', ltrs)

exercise11()

As you can see, I am able to calculate and sort the freq% in a separate list, but I am not able to include it in the tuples of lst alongside the letter and frequency. Is there any way to solve this problem?

Also, if you have any other suggestions for my code, please mention them.


Modification

Applying a simple modification suggested by @wwii, I got the desired output. All I had to do was add one more argument to the print call while iterating over lst. Previously I had built a separate list for the freq%, sorted it, and then tried to insert it into the (count, letter) tuples in lst, which didn't work.

    # lst holds (count, letter) tuples, so here ltrs is the count and c is the letter
    for ltrs, c, in lst:
        print(c, '-', ltrs, '-', round(ltrs/totalcount*100, 2), '%')

Srijan Singh

5 Answers

1

The items in freqlst, countlst, and lst are related to each other by their position. If any one of them is sorted on its own, that relationship is lost.

Zipping the lists together before sorting will maintain the relationship.
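
To see why the relationship is lost, here is a minimal sketch with made-up sample data (the letters and counts below are hypothetical, not taken from the asker's file). Sorting one list on its own leaves the other in its old order, while sorting the zipped pairs keeps each letter with its count.

```python
# Hypothetical sample data: position i in each list describes the same letter.
letters = ['a', 'b', 'c']
counts = [2, 5, 3]

# Sorting counts alone breaks the pairing: 'a' now lines up with 5.
broken = list(zip(letters, sorted(counts, reverse=True)))

# Zipping first, then sorting by count, keeps each pair intact.
paired = sorted(zip(letters, counts), key=lambda pair: pair[1], reverse=True)

print(broken)  # [('a', 5), ('b', 3), ('c', 2)]
print(paired)  # [('b', 5), ('c', 3), ('a', 2)]
```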

I'll pick up from your list initialization lines.

lst = []
countlst = []
freqlst = []
for ltr, c in counts.items():
    #change here, lst now only contains letters
    lst.append(ltr)
    countlst.append(c)
totalcount = sum(countlst)
for ec in countlst:
    efreq = (ec/totalcount) * 100
    freqlst.append(efreq)

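# New stuff here (written for Python 3): zip keeps each letter with its count and frequency
zipped = zip(lst, countlst, freqlst)
# sort by count (index 1), largest first, since the question asks for decreasing order
zipped = sorted(zipped, key=lambda x: x[1], reverse=True)

for ltr, c, freq in zipped:
    print("{} - {} - {:.2f}%".format(ltr, c, freq))  # love me the format method :)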

Basically, zip combines the lists element by element into tuples (in Python 3 it returns an iterator, which sorted turns into a list). Then you can use a lambda function as the sort key for those tuples (a very common Stack Overflow question).

wwii
matisetorm
  • The only flaw with this is that it prints freq(counts) two times in the output screen, once while within the tuple inside the `lst[] list` and second with the `countlst` list. – Srijan Singh Feb 22 '18 at 03:10
  • yeah sorry. That is super easy to fix tho by removing the tuple creation in the first for loop. I was trying to illustrate a solution. I will fix. If this helped, consider accepting it – matisetorm Feb 22 '18 at 03:52
1

Tuples are immutable, which is probably the issue you are running into. The other issue is that you are using the simplest form of the sort function; sorting with a key function would serve you well. See below:

You built lst as a list of tuples, but because tuples are immutable whereas lists are mutable, changing lst to a list of lists is a valid approach. Then, since each element of lst is a list of the form [letter, count, frequency%], sort with a lambda key can sort by whichever index you'd like. The following is to be inserted after your for line in fop: loop.

lst = []
for ltrs, c in counts.items():
    lst.append([ltrs,c])
totalcount = sum([x[1] for x in lst])       # sum all 'count' values in a list comprehension

for elem in lst:
    elem.append((elem[1]/totalcount)*100)   # now that each element in 'lst' is a mutable list, you can append the calculated frequency to the respective element in lst

lst.sort(reverse=True, key=lambda row: row[2])    # sort in place, descending, by frequency (index 2)
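
If you would rather keep tuples, immutability is not really an obstacle: instead of appending to an existing tuple, you can build new three-element tuples once the total is known. A minimal sketch, assuming a small hypothetical counts dictionary in place of the one built from the file:

```python
# Hypothetical counts; in the real program this dict comes from the file loop.
counts = {'o': 46, 'e': 30, 't': 24}
totalcount = sum(counts.values())

# Build new (letter, count, percent) tuples rather than mutating old ones.
lst = [(ltr, c, c / totalcount * 100) for ltr, c in counts.items()]
lst.sort(key=lambda row: row[2], reverse=True)  # highest frequency first

for ltr, c, pct in lst:
    print('{} - {} - {:.2f}%'.format(ltr, c, pct))
```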
matisetorm
jshrimp29
  • Not sure why we need to get hung up on the immutability of tuples. The len of the lists will all be the same. Just use zip() to get them together. That said, advanced sorting is the way to go here for sure :) We are in agreement there. Cheers – matisetorm Feb 22 '18 at 01:52
1

Your count data is in a dictionary of {letter:count} pairs.

You can use the dictionary to calculate the total count like this:

total_count = sum(counts.values())

Then don't calculate the percentage till you are iterating over the counts...

for letter, count in counts.items():
    print(f'{letter} - {count} - {100*count/total_count}')    # Python 3.6+
    # print('{} - {} - {}'.format(letter, count, 100*count/total_count))    # Python < 3.6

Or if you want to put it all in a list so you can sort it:

data = []
for letter, count in counts.items():
    data.append((letter, count, 100*count/total_count))

Using operator.itemgetter for the sort key function can help code readability.

import operator
letter = operator.itemgetter(0)
count = operator.itemgetter(1)
frequency = operator.itemgetter(2)

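# Each call sorts by a different field; use whichever one you need.
data.sort(key=letter)                    # alphabetical
data.sort(key=count)                     # ascending count
data.sort(key=frequency, reverse=True)   # descending frequency, as the question asks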
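
Because Python's sort is stable, two of these sorts can also be chained: sort by letter first, then by count in reverse, and letters with equal counts stay in alphabetical order. A minimal sketch, assuming hypothetical sample counts; the key names by_letter and by_count are only illustrative:

```python
import operator

# Hypothetical counts; the real dict comes from reading the file.
counts = {'b': 2, 'c': 5, 'a': 2}
total_count = sum(counts.values())

# One (letter, count, percent) tuple per letter.
data = [(letter, count, 100 * count / total_count)
        for letter, count in counts.items()]

by_letter = operator.itemgetter(0)
by_count = operator.itemgetter(1)

data.sort(key=by_letter)               # secondary key first: alphabetical
data.sort(key=by_count, reverse=True)  # primary key last: decreasing count

for letter, count, percent in data:
    print(f'{letter} - {count} - {percent:.2f}%')  # Python 3.6+
```

Here 'a' and 'b' both occur twice, and the second sort leaves them in the alphabetical order set by the first.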
wwii
0

I think I was able to achieve what you wanted by using lists instead of tuples. Tuples cannot be modified in place; the usual workaround is to build a new tuple.

(I also added the possibility to quit the program)

Important: Never forget to comment your code

The code:

def exercise11():
    import string
    while True:
        # give the user the option to quit the program easily
        fname = input('Enter the file name (or 0 to quit) -> ')
        if fname == '0':
            return  # leave the function; a break here would leave fop undefined below
        try:
            fop = open(fname)
            break
        except OSError:
            print('This file does not exist. Please try again!')

    counts = {}
    for line in fop:
        line = line.translate(str.maketrans('', '', string.punctuation))
        line = line.translate(str.maketrans('', '', string.whitespace))
        line = line.translate(str.maketrans('', '', string.digits))
        line = line.lower()
        for ltr in line:
            if ltr in counts:
                counts[ltr] += 1
            else:
                counts[ltr] = 1
    lst = []
    countlst = []
    freqlst = []

    for ltrs, c in counts.items():
        # add a zero as a placeholder for the percentage and
        # use square brackets so each entry is a list that you can modify
        lst.append([c, ltrs, 0])
        countlst.append(c)
    totalcount = sum(countlst)

    for ec in countlst:
        efreq = (ec/totalcount) * 100
        freqlst.append(efreq)
    freqlst.sort(reverse=True)
    lst.sort(reverse=True)

    # count the total of the letters 
    counter = 0
    for ltrs in lst:
        counter += ltrs[0]

    # calculate the percentage for each letter 
    for letter in lst:
        percentage = (letter[0] / counter) * 100
        letter[2] = round(percentage, 2)  # replace the placeholder with the rounded percentage

    for i in lst:
        print('The letter {} is repeated {} times, which is {}% '.format(i[1], i[0], i[2]))
exercise11()
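
As a further tidy-up, the three translate calls in the counting loop can be collapsed into one table, and dict.get can replace the if/else. This is a minimal sketch rather than a drop-in replacement; the helper name count_letters is hypothetical, and it takes a string instead of an open file so it can be tried without a file on disk.

```python
import string

# One translation table that deletes punctuation, whitespace and digits.
STRIP_TABLE = str.maketrans(
    '', '', string.punctuation + string.whitespace + string.digits)


def count_letters(text):
    """Return a {letter: count} dict for the characters left after stripping."""
    counts = {}
    for ltr in text.translate(STRIP_TABLE).lower():
        counts[ltr] = counts.get(ltr, 0) + 1  # start at 0 for unseen letters
    return counts


print(count_letters('Hi, hi 2 you!'))
```

Inside exercise11, the same function could be called once per line of the file, with the per-line results added into one running dictionary.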
Nello
0
<?php

$fh = fopen("text.txt", 'r') or die("File does not exist");
$line = fgets($fh);

$words = count_chars($line, 1);

foreach ($words as $key => $value) {
    echo "The character  <b>' ".chr($key)." '</b>  was found   <b>$value</b>   times. <br>";
}

?>