0

I've been trying to read two images as ASCII hex values and compare them for exact matches. However, the output is very large and my old hardware cannot handle it. I tried saving the output to a file, but the file was huge. Here is what I'm trying to do:

import binascii
import sys

orig = sys.stdout
f = open('output.txt', 'w')
sys.stdout = f

# Load the two images

with open("image1.jpg", "rb") as b:
    with open("image2.jpg", "rb") as a:

        # Convert the two images from binary to ASCII hex

        chunk1 = binascii.b2a_hex(b.read())
        chunk2 = binascii.b2a_hex(a.read())

# Split the two hex strings into pieces of 24 hex characters (12 bytes)

chunkSize = 24
for i in range(0, len(chunk1), chunkSize):
    for j in range(0, len(chunk2), chunkSize):

        # Print them

        list1 = chunk1[i:i+chunkSize]
        print "List1: " + list1
        list2 = chunk2[j:j+chunkSize]
        print "List2: " + list2

        # Compare the two pieces for equality

        list = list1 == list2

        # Print whether it's a match (True) or not (False)

        print list

sys.stdout = orig
f.close()

# Saved to a file

How it works:

img1 has the following hex: FFD8 FFE0 0010 4A46 4946 0001 0200 0064 0064 0000 FFEC 0011

img2 has the following hex: FFD8 FFE0 0010 4A46 4946 0001 0210 0064 0064 0000 FFEC 0012

It takes the first 24 characters of img1's hex and tests them against all of img2's hex, 24 characters at a time. It then takes the next 24 characters of img1 and tests them against all of img2's hex again. Example:

List1: FFD8 FFE0 0010 4A46 4946 0001 
List2: FFD8 FFE0 0010 4A46 4946 0001 
True 

List1: FFD8 FFE0 0010 4A46 4946 0001 
List2: 0210 0064 0064 0000 FFEC 0012 
False

List1: 0200 0064 0064 0000 FFEC 0011 
List2: FFD8 FFE0 0010 4A46 4946 0001 
False 

List1: 0200 0064 0064 0000 FFEC 0011 
List2: 0210 0064 0064 0000 FFEC 0012 
False

However, the output is large for big images (around 40k and 20k hex values), and I can't read it either in the terminal or after saving it to a file.

How do I print only the matched (True) 24-character ASCII hex values, without printing True, False, or the non-matching hex values?

FFD8 FFE0 0010 4A46 4946 0001
AbyxDev
  • Could you just compare the hex bytes without printing them? Do you need the actual output for some other reason? – John Gordon May 18 '18 at 15:06
  • Can you just compare `chunk1 == chunk2` directly? If there is a difference, do you care _where_ it is? – John Gordon May 18 '18 at 15:18
  • Yes, I do need to see the output. If you're suggesting comparing chunk1 and chunk2 directly without going through a loop, that wouldn't output the whole result, would it? If not, isn't that the same as comparing list1 == list2? I don't care about the differences; all I care about is outputting the matching ASCII hex bytes. – PrivateIntID May 19 '18 at 01:42
  • chunk1 and chunk2 are the entire file contents. You can compare them to see if the files are equal, but that would only give you a yes/no answer; it wouldn't tell you what the specific byte differences were. – John Gordon May 19 '18 at 01:51
  • Exactly. Thank you John for the time spent on this. – PrivateIntID May 19 '18 at 02:15

2 Answers

1

You can simply read 24 bytes from each image at a time instead of reading the entire file at once. file.read() accepts a size argument that limits how many bytes a single call returns. You can run this in a loop until read() returns an empty string, which means that the end of the file has been reached. See the doc.
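A minimal sketch of that loop, written for Python 3 (the question's code uses Python 2 print statements). The function name `matching_chunks` and its `size` parameter are made up for illustration. It also assumes that comparing chunks at the same offset in both files is acceptable, which is not the same as the all-pairs comparison in the question. The default of 12 raw bytes corresponds to 24 hex characters.

```python
import binascii


def matching_chunks(path1, path2, size=12):
    """Yield the hex text of each same-offset chunk that is identical in both files.

    Hypothetical helper: `size` is the number of raw bytes read per call,
    so 12 bytes become 24 hex characters.
    """
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        while True:
            c1 = f1.read(size)
            c2 = f2.read(size)
            # read() returns an empty bytes object at end of file
            if not c1 or not c2:
                break
            if c1 == c2:
                yield binascii.b2a_hex(c1).decode("ascii").upper()
```

Calling `for chunk in matching_chunks("image1.jpg", "image2.jpg"): print(chunk)` would then print only the matching chunks, and only one small chunk per file is held in memory at a time.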

EDIT:

If what you want is to simply check if two files are the same, why not look into checksums? Identical files will always have the same checksum. See this answer for a bit more detail.
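As a rough illustration of the checksum idea, here is a sketch for Python 3 using the standard `hashlib` module. The helper names `file_digest` and `same_content` and the choice of SHA-256 are assumptions made for this example. Identical files always produce equal digests, and different files are extremely unlikely to.

```python
import hashlib


def file_digest(path, block_size=65536):
    """Return the SHA-256 hex digest of a file, reading it block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(path1, path2):
    """Return True when both files have the same SHA-256 digest."""
    return file_digest(path1) == file_digest(path2)
```

This only gives a yes/no answer for the whole file. It does not show which chunks match.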

Rolando Cruz
  • Indeed I can. I was actually aware of it, but I thought I would read the images as a whole, split them later into evenly sized chunks of characters, compare them, and output the matched and false results. The results got larger and became unreadable. Your approach is much shorter. Definitely going to try it out. Thank you – PrivateIntID May 19 '18 at 01:51
0

If I understand the question, how about:

import binascii
import sys

orig = sys.stdout
f = open('output.txt', 'w')
sys.stdout = f

# Load the two images

with open("image1.jpg", "rb") as b:
    with open("image2.jpg", "rb") as a:

        # Convert the two images from binary to ASCII hex

        chunk1 = binascii.b2a_hex(b.read())
        chunk2 = binascii.b2a_hex(a.read())

# Split the two hex strings into pieces of 24 hex characters (12 bytes)

chunkSize = 24
for i in range(0, len(chunk1), chunkSize):
    for j in range(0, len(chunk2), chunkSize):

        list1 = chunk1[i:i+chunkSize]
        list2 = chunk2[j:j+chunkSize]

        # Compare the two pieces for equality

        list = list1 == list2

        # Print the piece once, only if it was the same in both list1 and list2

        if list:
            print list1

sys.stdout = orig
f.close()

That will omit any output that is False in your original example; the only output will be the bytes that matched. If this isn't what you meant, can you clarify exactly what you want to achieve please?
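For large images the nested loop still performs one comparison per pair of chunks. A possible variation, sketched here for Python 3, collects the chunks of the second image in a set and looks up each chunk of the first image in it. The names `split_hex` and `matched_chunks` are hypothetical, and the behaviour differs slightly from the loop above: a chunk of the first image is reported once even if it occurs several times in the second.

```python
import binascii


def split_hex(data, chunk_size=24):
    """Hex-encode raw bytes and cut the text into chunk_size-character pieces."""
    text = binascii.b2a_hex(data).decode("ascii").upper()
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def matched_chunks(data1, data2, chunk_size=24):
    """Return the chunks of data1 that also occur as a chunk of data2."""
    seen = set(split_hex(data2, chunk_size))  # set lookup replaces the inner loop
    return [chunk for chunk in split_hex(data1, chunk_size) if chunk in seen]
```

With the two files read via `open(..., "rb").read()`, `matched_chunks(bytes1, bytes2)` returns only the matching pieces, which can then be printed or written to a file one per line.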

Rob Bricheno