Suppose I have two variables whose contents are very large, so I don't want to inspect them manually.

As a working example, let's run:

from sklearn.datasets import fetch_openml
mnist=fetch_openml('mnist_784',version=1)
mnist2=mnist

I have no idea what's in mnist. If I run type(mnist), I get sklearn.utils.Bunch, which means nothing to me.

I'm looking for a method or function that tells me whether two variables (mnist and mnist2) are equal to each other, i.e. function(mnist,mnist2) returns True. I don't want one that works only when the variables are strings, or only when they are lists. I want one that works even when I have no idea what the variables contain, as in the example above.

For example, I tried comparing the two variables with ==, but I got an error:

ValueError                                Traceback (most recent call last)
<ipython-input-9-582032dfa1c5> in <module>
----> 1 mnist==mnist2

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So, I've tried

import numpy
numpy.all(mnist==mnist2)

which returns

ValueError                                Traceback (most recent call last)
<ipython-input-21-c2fa07d03d32> in <module>
      1 import numpy
----> 2 numpy.all(mnist==mnist2)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
An old man in the sea.
  • What do you mean by "equal to each other, without being dependent on their contents"? – ccl Mar 03 '20 at 16:55
  • What do you mean by equality here? You mean that they are the same object in memory? You ask if the content is the same, then ask how to check equality without checking content. Doesn't make sense. – roganjosh Mar 03 '20 at 16:55
  • I'd say it's impossible. You've already seen an example that needs special treatment, and I could easily write a data type that needs a different special treatment, one that nobody has ever seen before. – Kelly Bundy Mar 03 '20 at 16:57
  • The immediate issue here is that `==` doesn't compare two NumPy arrays for equality; it creates a *new* array representing the pointwise equality of the corresponding elements. E.g., `[1,2] == [3, 2]` does not evaluate to `False`, but to `[False, True]`. – chepner Mar 03 '20 at 17:00
  • @roganjosh I want a method that works for any content... I don't want a method that works only when the variables are strings, or only when they are lists. – An old man in the sea. Mar 03 '20 at 17:03
  • For down-clicking-happy users, please think of giving some feedback. – An old man in the sea. Mar 03 '20 at 17:48
  • Related: https://stackoverflow.com/questions/10062954/valueerror-the-truth-value-of-an-array-with-more-than-one-element-is-ambiguous – AMC Mar 03 '20 at 19:31
  • Why do you need to do this, what is it for? Can you provide some more context? – AMC Mar 03 '20 at 19:33
  • @AMC I've edited the question. I hope now it's clearer. thanks – An old man in the sea. Mar 04 '20 at 16:11
  • @Anoldmaninthesea. I understand the first part of the question, just not the leap to needing some sort of universal content checker thingy. _If I do `type(mnist)`, I get `sklearn.utils.Bunch` which for me means nothing..._ That shouldn’t be much of an obstacle though, right? http://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_openml.html – AMC Mar 04 '20 at 16:32

1 Answer

The following sample program reads two files and compares them using their hash values:

import hashlib

with open("datei1.txt", "rb") as f1, open("datei2.txt", "rb") as f2:
    if hashlib.md5(f1.read()).digest() == hashlib.md5(f2.read()).digest():
        print("same data")
    else:
        print("data are not the same")
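
For objects that are already in memory, such as mnist and mnist2, a different technique from file hashing may fit better: walking the structure recursively and comparing leaf by leaf. The following is only a minimal sketch under stated assumptions, not a universal solution. `deep_equal` is a hypothetical helper name, not a library function. It assumes containers are dicts (sklearn.utils.Bunch subclasses dict), lists, or tuples, and that array-like leaves expose `shape` and return an elementwise result with an `.all()` method from `==`, as NumPy arrays do. Other types, for instance pandas objects or arrays containing NaN, would likely need their own handling.

```python
def deep_equal(a, b):
    """Hypothetical helper: best-effort structural equality for nested data."""
    # The very same object is trivially equal to itself (covers mnist2 = mnist).
    if a is b:
        return True
    if type(a) is not type(b):
        return False
    # Mappings (including dict subclasses): same keys, recursively equal values.
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(deep_equal(a[k], b[k]) for k in a)
    # Sequences: same length, pairwise-equal items.
    if isinstance(a, (list, tuple)):
        return len(a) == len(b) and all(deep_equal(x, y) for x, y in zip(a, b))
    # Array-like leaves (assumption: they expose .shape): shapes must match.
    if getattr(a, "shape", None) != getattr(b, "shape", None):
        return False
    result = a == b
    # NumPy-style == is elementwise, so collapse the result with .all().
    if hasattr(result, "all"):
        return bool(result.all())
    return bool(result)


first = {"data": [1, 2, (3, 4)], "target": "digits"}
second = {"data": [1, 2, (3, 4)], "target": "digits"}
third = {"data": [1, 2, (3, 5)], "target": "digits"}

print(deep_equal(first, second))  # True
print(deep_equal(first, third))   # False
```

Checking identity first means that two names bound to the same object compare equal without touching the contents at all.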
Albert.O