0

I have two lists: one containing image pixel values and one containing the corresponding "measurements". I'm trying to delete items from both lists when two conditions are satisfied. I tried looping over them with a for loop before realizing my dumb mistake. My code is below. What would be a good way to do this?

for i in range(100):
    delete_chance = np.random.random()
    if abs(measurements[i]) == 0.15 and delete_chance < 0.70:
        del images[i]
        del measurements[i]
prat_pad
  • 1
    What's the dumb mistake? Why doesn't this code work and what should it do instead? – Nils Werner May 12 '17 at 13:34
  • 1
    If you just iterate over the `reversed(range(100))` instead it can actually be safe. – wim May 12 '17 at 13:36
  • 1
    Whatever solution you end up using, you should eliminate the float equality comparison. There are special functions for float comparison; look them up. – Liran Funaro May 12 '17 at 13:46

4 Answers

2

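A list comprehension can help. Build new lists that keep only the pairs that should not be deleted:

filtered_images, filtered_measurements = zip(*[(i, m) for i, m in zip(images, measurements) if not (math.isclose(abs(m), 0.15) and random.random() < 0.7)])

Note that this uses math.isclose(), which was only added in Python 3.5. If you're on an older version you'll have to write your own isclose(). The line also needs import math and import random, and the unpacking raises a ValueError if every pair ends up being filtered out.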
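For Python versions older than 3.5, a fallback could look roughly like the sketch below. It follows the formula documented for math.isclose() (a relative tolerance scaled by the larger magnitude, plus an optional absolute floor). The name isclose_compat is made up for this illustration, and special cases such as infinities and NaN are not handled.

```python
def isclose_compat(a, b, rel_tol=1e-09, abs_tol=0.0):
    """Approximate float equality, mirroring the documented math.isclose formula.

    Hypothetical helper for illustration; infinities and NaN are not handled.
    """
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
```

When comparing against values near zero, pass an abs_tol, because a purely relative tolerance shrinks to nothing there.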

If speed is an issue (thousands or millions of images) you can use NumPy, too:

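import numpy

images = numpy.array(images)
measurements = numpy.array(measurements)

# True for every item that should be deleted
mask = numpy.logical_and(
    numpy.random.rand(images.shape[0]) < 0.7,
    numpy.isclose(numpy.abs(measurements), 0.15)
)

# Invert the mask to keep everything that was not selected for deletion
filtered_measurements = measurements[~mask]
filtered_images = images[~mask]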
Nils Werner
0

Try filter:

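# Keep a pair unless its |measurement| is 0.15 AND the random draw says "delete".
# x[0] is the measurement and x[1] is the image, matching the zip order below.
z = list(zip(*filter(lambda x: abs(x[0]) != 0.15 or np.random.random() >= 0.7, zip(measurements, images))))
measurements = list(z[0])
images = list(z[1])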
M. Shaw
  • You're using values from `images` as an index for `measurements`. I don't think this will work. Instead try zipping `images` and `measurements`. – Nils Werner May 12 '17 at 13:36
0

First, you should never compare two floats directly for equality. Compare them with a function that accepts a reasonable error tolerance (epsilon).
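As a small illustration of why direct comparison is fragile, the sketch below adds two values whose exact sum is 0.15. It uses only the standard library's math.isclose() (Python 3.5+); the variable names are just for this example.

```python
import math

total = 0.1 + 0.05   # mathematically 0.15, but rounded in binary floating point

exact_match = (total == 0.15)             # False: the results differ in the last bit
close_match = math.isclose(total, 0.15)   # True: within the default relative tolerance

print(exact_match, close_match)
```

So a measurement that was computed rather than typed in as a literal 0.15 can silently fail an == test.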

Second, first build a list of booleans that is True for each element you want to remove:

t = [np.random.random() < 0.7 and np.isclose(abs(measurements[i]), 0.15) for i in range(100)]

Then build a new list accordingly:

measurements = [m for m, b in zip(measurements, t) if not b]
images = [im for im, b in zip(images, t) if not b]
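The two steps above can be wrapped into one function, sketched here with the standard library only. The function name, its parameters, and the rng argument (anything with a random() method) are assumptions made for this illustration and are not part of the original answer.

```python
import math
import random


def drop_matching(images, measurements, target=0.15, p_delete=0.7, rng=random):
    """Return new lists without the pairs selected for deletion.

    A pair is dropped when abs(measurement) is close to `target` AND a random
    draw falls below `p_delete`. Hypothetical helper; names are illustrative.
    """
    # Step 1: one boolean per pair, True means "remove this pair".
    remove = [
        math.isclose(abs(m), target) and rng.random() < p_delete
        for m in measurements
    ]
    # Step 2: rebuild both lists from the same flags so they stay aligned.
    kept_images = [img for img, r in zip(images, remove) if not r]
    kept_measurements = [m for m, r in zip(measurements, remove) if not r]
    return kept_images, kept_measurements
```

Computing the flags once and reusing them for both lists matters: drawing fresh random numbers per list would drop different positions from each and break the pairing.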
Liran Funaro
-1

You could iterate in reversed order:

for i in range(99, -1, -1):
    delete_chance = np.random.random()
    if abs(measurements[i]) == 0.15 and delete_chance < 0.70:
        del images[i]
        del measurements[i]
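To see why the direction matters, here is a small sketch with the random draw left out and integers instead of floats, so the behaviour is deterministic. Both function names are invented for this illustration.

```python
def delete_forward(values, target):
    """Buggy on purpose: deleting while walking forward over a fixed index range."""
    values = list(values)
    try:
        for i in range(len(values)):   # the range is computed once, before any deletion
            if values[i] == target:
                del values[i]          # later items shift left, so the next one is skipped
    except IndexError:
        return None                    # the list shrank, so a later index ran past the end
    return values


def delete_reversed(values, target):
    """Deleting from the end only shifts items that were already visited."""
    values = list(values)
    for i in reversed(range(len(values))):
        if values[i] == target:
            del values[i]
    return values


data = [1, 7, 7, 3]
print(delete_forward(data, 7))    # None (an IndexError was raised)
print(delete_reversed(data, 7))   # [1, 3]
```

The forward version both skips the second 7 and then runs off the end of the shortened list, which is the failure mode of the loop in the question.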
Mike Scotty