I'm working on a simple compression algorithm for binary files. I scan the file and build a list of tuples, each holding a character and the number of times it occurs in a row. However, when the list is printed, its formatting (all the brackets and commas) makes the compressed result larger than it needs to be, and I need to get rid of these. I have tried several ways of removing them, but nothing is working. Here is the encode algorithm:
def encode(inputString):
    characterCount = 1
    previousCharacter = ''
    List = []
    for character in inputString:
        if character != previousCharacter:
            if previousCharacter:
                listEntry = (previousCharacter, characterCount)
                List.append(listEntry)
                #print lst
            characterCount = 1
            previousCharacter = character
        else:
            characterCount += 1
    else:
        try:
            listEntry = (character, characterCount)
            List.append(listEntry)
            return (List, 0)
        except Exception as e:
            print("Exception encountered {e}".format(e=e))
            return (e, 1)
And here is where I print the list. The commented-out lines are the methods I have already tried, with no luck.
value = encode(binaryfile)
if value[1] == 0:
    print(value[0])
    #flattened = [val for sublist in value for val in sublist]
    #print(flattened)
    #values = value[0]
    #print(*value[0], sep='')
    #print (''.join(map(str, value)))
    #print(int("".join(str(x) for x in value[0])))
And here is the output.
[('1', 2), ('0', 1), ('1', 1), ('0', 4), ('1', 2), ('0', 2), ('1', 4), ('0', 3), ('1', 1), ('0', 3), ('1', 4), ('0', 5), ('1', 1), ('0', 1), ('1', 1), ('0', 4), ('1', 2), ('0', 1), ('1', 2), ('0', 3), ('1', 1), ('0', 3), ('1', 2), ('0', 1), ('1', 1), ('0', 1), ('1', 3), ('0', 4), ('1', 1), ('0', 130), ('1', 5), ('0', 15), ('1', 2), ('0', 8), ('1', 7), ('0', 1), ('1', 8), ('0', 4), ('1', 1), ('0', 2), ('1', 1), ('0', 13), ('1', 2), ('0', 96), ('1', 1), ('0', 26), ('1', 3), ('0', 70), ('1', 1), ('0', 22), ('1', 3), ('0', 1), ('1', 1), ('0', 32), ('1', 1), ('0', 24), ('1', 7), ('0', 1), ('1', 24), ('0', 34), ('1', 2), ('0', 1), ('1', 3), ('0', 24), ('1', 3459), ('0', 1), ('1', 2), ('0', 2), ('1', 1), ('0', 1), ('1', 1), ('0', 2), ('1', 1), ('0', 1), ('1', 3), ('0', 5), ('1', 1), ('0', 10), ('1', 1), ('0', 2), ('1', 3), ('0', 1), ('1', 2), ('0', 9), ('1', 1), ('0', 2), ('1', 1), ('0', 5), ('1', 1), ('0', 18), ('1', 4), ('0', 7), ('1', 1), ('0', 2), ('1', 1), ('0', 1), ('1', 1),
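To illustrate the kind of flattened output being asked for, here is a minimal sketch. The helper name `format_runs` and the `:` and space separators are assumptions made for this example and are not part of the code above. Some separator seems necessary, because counts can have several digits (130, for example), so gluing characters and counts together with nothing in between would be ambiguous to decode.

```python
def format_runs(runs, pair_sep=":", run_sep=" "):
    """Join (character, count) tuples into one compact string.

    Hypothetical helper: the name and both separators are assumptions
    chosen for illustration only.
    """
    # Unpack each tuple so that only its contents are formatted,
    # rather than the tuple's own repr with brackets, quotes and commas.
    return run_sep.join(
        "{0}{1}{2}".format(char, pair_sep, count) for char, count in runs
    )


# The first four runs from the output shown above.
sample = [('1', 2), ('0', 1), ('1', 1), ('0', 4)]
print(format_runs(sample))  # prints: 1:2 0:1 1:1 0:4
```

With the real data, this would presumably be called as `print(format_runs(value[0]))` in place of `print(value[0])`.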
Any help is greatly appreciated.