
I'm working on a piece of software which needs to measure the wiggliness of a set of data. Here's a sample of the input I would receive, overlaid with the lightness plot of each vertical pixel strip:

[image: sample scan with the lightness of each vertical pixel strip plotted in red]

It is easy to see that the left margin is really wiggly (i.e. has a ton of minima/maxima), and I want to generate a set of critical points of the image. I've applied a Gaussian smoothing function to the data ~10 times, but the data seems to be pretty wiggly to begin with.
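For reference, my smoothing step looks roughly like the following sketch (the lightness array here is stand-in data, not my real input):

import numpy as np
from scipy.ndimage import gaussian_filter1d

# Stand-in per-column lightness values (one sample per pixel column).
lightness = np.random.rand(800) * 255

smoothed = lightness
for _ in range(10):
    smoothed = gaussian_filter1d(smoothed, sigma=2)
# Note: n passes at a fixed sigma are equivalent to a single pass at
# sigma * sqrt(n), so repeated smoothing only broadens the kernel.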

Any ideas?

Here's my original code, but it doesn't produce very good results for the wiggliness:

def local_maximum(values, center, delta):
    # Track [index, value] of the largest sample within delta of center.
    maximum = [center, values[center]]

    for i in range(delta):
        if values[center + i] > maximum[1]:
            maximum = [center + i, values[center + i]]
        if values[center - i] > maximum[1]:
            maximum = [center - i, values[center - i]]

    return maximum

def count_maxima(values, start, end, delta, threshold=10):
    # Count samples that lie within threshold of their local maximum.
    count = 0

    for i in range(start + delta, end - delta):
        if abs(values[i] - local_maximum(values, i, delta)[1]) < threshold:
            count += 1

    return count

def wiggliness(values, start, end, delta, threshold=10):
    return float(abs(start - end) * delta) / float(count_maxima(values, start, end, delta, threshold))
  • Could you post a link to an accurate definition of wiggliness? – Adam Matan Nov 16 '10 at 06:29
  • Is the statistic you are looking to characterize a frequency feature or amplitude feature? – SingleNegationElimination Nov 16 '10 at 07:14
  • 1
  • If you are asking about a way to characterise wiggliness instead of a way to implement that characterisation programmatically, you may have more luck on http://stats.stackexchange.com/. – Katriel Nov 16 '10 at 07:38
  • 1
  • Instead of just hitting it with a smoothing function over and over, I'd run it through a well-known low-pass filter of some sort, like a [Butterworth filter](http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html#scipy-signal-butter). That'll make things MUCH easier to tune later on. – detly Nov 16 '10 at 07:50
  • Wow, thanks. I was thinking about it for a while, and I will try to see if I can do anything with the absolute value of the derivative of the function. As for a smoothing filter, I'm more into theoretical mathematics, so I don't know much about statistics, but I'll surely look into that filter, as I will need to do a lot of other stuff like this. Thanks! – Blender Nov 16 '10 at 16:07
  • What are you trying to do with this data? Both out of curiosity and because it might help people when offering suggestions. – Redwood Nov 17 '10 at 17:21
  • I'm creating an algorithm to crop pictures of scanned books. The current method employed in virtually all software looks for vertical lines of similar colors, but that fails in many cases. I'm trying to get it by lightness. I've done a blind test where I scanned a few pages completely randomly, plotted the data, and I could still tell where the text, seam, and boundaries of the book are just from the data. – Blender Nov 17 '10 at 18:27

2 Answers


Take a look at lowpass/highpass/notch/bandpass filters, Fourier transforms, or wavelets. The basic idea is that there are lots of different ways to figure out the frequency content of a signal over different time periods.

If we can figure out what wiggliness is, that would help. I would say the leftmost margin is wiggly because it has more high-frequency content, which you could visualize using a Fourier transform.

If you take a highpass filter of that red signal, you'll get just the high-frequency content, and then you can measure the amplitudes and apply thresholds to determine wiggliness. But I guess wiggliness just needs more formalism behind it.
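As a rough sketch of that idea (the cutoff, order, and threshold below are illustrative guesses, not tuned values), a Butterworth high-pass filter from SciPy could isolate the high-frequency content:

import numpy as np
from scipy.signal import butter, filtfilt

def highpass_wiggliness(signal, cutoff=0.1, order=4, threshold=5.0):
    # Design a Butterworth high-pass filter; cutoff is expressed as a
    # fraction of the Nyquist frequency.
    b, a = butter(order, cutoff, btype='highpass')
    # filtfilt runs the filter forward and backward, so the output stays
    # aligned with the input (zero phase lag).
    high = filtfilt(b, a, signal)
    # Score wiggliness as the fraction of samples whose high-frequency
    # amplitude exceeds the threshold.
    return np.mean(np.abs(high) > threshold)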

gtrak
  • Thanks! I'll take a look at them. I'm still trying to figure out what all of this terminology means ;) – Blender Nov 17 '10 at 15:05

For tasks like these, numpy makes things much easier, as it provides useful functions for manipulating vector data, e.g. adding a scalar to each element, calculating the average value, etc.

For example, you might try the zero-crossing rate of either the original data (wiggliness1 below) or of its first difference (wiggliness2), depending on what exactly wiggliness is supposed to mean; if global trends are to be ignored, you should probably use the difference data. For x you would take the slice or window of interest from the original data, getting a sort of measure of local wiggliness. If you use the original data, then after removing the bias you might also want to set all values smaller than some threshold to 0, to ignore low-amplitude wiggles.

import numpy as np

def wiggliness1(x):
    # Remove the bias so that zero crossings are meaningful:
    x = x - np.average(x)
    # Count the sign changes of the data, i.e. the zero-crossing rate:
    return np.sum(np.abs(np.sign(np.diff(np.sign(x)))))


def wiggliness2(x):
    # Count the sign changes of the first difference, i.e. local extrema:
    return np.sum(np.abs(np.sign(np.diff(np.sign(np.diff(x))))))
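For instance, applied to fixed-size windows of the lightness data (the array and window size below are made up for illustration), the wigglier slices get the higher scores:

lightness = np.random.rand(800) * 255  # stand-in per-column lightness
window = 50

scores = [wiggliness2(lightness[i:i + window])
          for i in range(0, len(lightness) - window + 1, window)]
# Each score counts the local extrema in its 50-sample slice, so the
# left margin of the scan should stand out with noticeably higher scores.
print(scores)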
miro
  • Thanks, I never thought of that. I think I will use that, since my smoothing algorithm removes some critical points... – Blender Nov 17 '10 at 18:54