Create an error statistic that is sensitive to each case's sample size and scaled between 0 and 1

Question

I want to describe the error of a method to detect a single feature in an image.

The features are microscopic pores. Sometimes the program counts more pores than are present, and sometimes fewer.

An absolute error of 1 pore in an image with 10 pores is more severe than the same error in an image with 30 pores, so I want to incorporate the number of pores in each image into a scaled error statistic between 0 and 1, where 0 means no error and 1 means complete error.

This is what I've created so far (thanks to @Vic for his/her answer):

scaled_error <- (absolute_error - min(absolute_error)) / (max(absolute_error) - min(absolute_error))

That code rescales the values between 0 and 1, but I don't think it's doing what I want.

I think the statistic is too sensitive to the max(absolute_error).

For example, one sample has 16 pores, but only 8 were detected. That's a relative error of 0.5 (8 / 16), but the scaled_error statistic is 0.031 = (8 - 0) / (258 - 0).
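The dependence on the largest error can be reproduced with a minimal sketch. The vector below is hypothetical: only the values 8 and 258 come from the example above, the rest are made up, and `min_max` is an illustrative helper name.

```r
# Hypothetical absolute errors; only 8 and 258 come from the question.
absolute_error <- c(0, 8, 3, 258)

# Min-max rescaling, as in the scaled_error formula above.
min_max <- function(x) (x - min(x)) / (max(x) - min(x))

scaled_error <- min_max(absolute_error)
round(scaled_error[2], 3)    # 8 / 258, i.e. 0.031

# Dropping the one extreme image changes the score of every other image.
scaled_without_outlier <- min_max(absolute_error[absolute_error < 258])
scaled_without_outlier[2]    # 8 / 8, i.e. 1
```

The same image moves from 0.031 to 1 depending only on which other images are in the dataset, which is why this rescaling says little about how severe an individual miscount is.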

My question is, how can I rescale and incorporate the true count of pores to create a statistic that is sensitive to how severe the over/undercounting is?

EDIT: I forgot to add that I already tried scaling the absolute error by the true count, but if the denominator is zero, infinity is returned, and if the numerator is larger than the denominator, a value greater than 1 is returned. I used this code to create that version of the error statistic:

scaled_error = (dat$Automated_count - dat$Manual_count) / (dat$Manual_count)
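A small sketch of how this version behaves, assuming a made-up data frame with the same column names (the counts are illustrative, not from the real data). Note that the formula as written is signed, so undercounts also come out negative.

```r
# Hypothetical counts, made up for illustration.
dat <- data.frame(
  Manual_count    = c(16, 10, 0, 0),
  Automated_count = c( 8, 25, 3, 0)
)

relative_error <- (dat$Automated_count - dat$Manual_count) / dat$Manual_count
relative_error
# Image 1: -0.5 (undercount gives a negative value)
# Image 2:  1.5 (overcount larger than the true count exceeds 1)
# Image 3:  Inf (3 / 0)
# Image 4:  NaN (0 / 0)
```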

What would *complete error* mean for a method that could output a potentially unbounded count? A nice (unbounded) measure might be `abs(log(recognizedPoreCount / correctPoreCount))` — Nico Schertler, Jan 03 '18 at 17:50
I suppose complete error is when zero pores are present but it finds more than zero? — GigaZaur, Jan 03 '18 at 17:53
What does the '1' in max(manual, 1) signify? I'm working in R, fyi. — GigaZaur, Jan 03 '18 at 18:04
@AlexisOlson, the formula you suggested returns the same values as the first scaled_error definition above. It is between 0 and 1, but it's still sensitive to the largest error in the dataset. — GigaZaur, Jan 03 '18 at 18:09
The function `max` returns the greater of its arguments. If `manual` is `0`, then it returns `1`. (In R it operates on vectors instead of multiple arguments.) — Alexis Olson, Jan 03 '18 at 18:16
If the manual count is 0, do you want your statistic to differentiate between automatic counts of 1, 2, and 10? If so, how? — Alexis Olson, Jan 03 '18 at 18:17
No, I don't think I need to differentiate between an incorrect automatic count of 1, 2, or 10. In that case, any non-zero number is completely wrong. — GigaZaur, Jan 03 '18 at 18:31
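The ideas raised in the comments can be sketched in R as follows. This is only one possible reading, not an answer from the thread: `bounded_error` is a hypothetical helper, the counts are made up, and the code assumes non-negative counts with no missing values. It uses `pmax`, the element-wise counterpart of `max`, because `max(manual, 1)` in R would return a single number for the whole vector.

```r
# Hypothetical counts, made up for illustration.
manual    <- c(16, 10, 30, 0, 0)
automated <- c( 8,  9, 29, 3, 0)

# Guarding the denominator with pmax(manual, 1) avoids division by zero,
# but the result can still exceed 1 (here 3 for the fourth image).
guarded_error <- abs(automated - manual) / pmax(manual, 1)

# One possible bounded variant (an assumption, not from the thread):
# divide by the larger of the two counts, so the result stays in [0, 1].
bounded_error <- function(automated, manual) {
  denom <- pmax(automated, manual)         # element-wise larger count
  err <- abs(automated - manual) / denom   # NaN where both counts are 0
  err[denom == 0] <- 0                     # both counts zero: treated as no error
  err
}

bounded_error(automated, manual)
```

With these counts the bounded variant gives 0.5 for 8 of 16, 0.1 for 9 of 10, about 0.033 for 29 of 30, and 1 whenever the true count is zero but something was detected, which matches the "any non-zero number is completely wrong" requirement. Whether overcounts should be penalized on the same scale as undercounts is a design choice this sketch does not settle.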
