This is supposed to be a compression function. We're supposed to read in a text file with just words, and sort them by frequencies and by the amount of words. Upper and lower case. I don't an answer to solve it, I just want help.
for each word in the input list
if the word is new to the list of unique words
insert it into the list and set its frequency to 1
otherwise
increase its frequency by 1
sort unique word list by frequencies (function)
open input file again
open output file with appropriate name based on input filename (.cmp)
write out to the output file all the words in the unique words list
for each line in the file (delimited by newlines only!)
split the line into words
for each word in the line
look up each word in the unique words list
and write its location out to the output file
don't output a newline character
output a newline character after the line is finished
close both files
tell user compression is done
This is my next step:
def compression():
for line in infile:
words = line.split()
def get_file():
opened = False
fn = input("enter a filename ")
while not opened:
try:
infile = open(fn, "r")
opended = True
except:
print("Won't open")
fn = input("enter a filename ")
return infile
def compression():
get_file()
data = infile.readlines()
infile.close()
for line in infile:
words = line.split()
words = []
word_frequencies = []
def main():
input("Do you want to compress or decompress? Enter 'C' or 'D' ")
main()