Finding sequential patterns in data (efficiency question)

Question

I am writing some JavaScript code to do statistical anomaly detection, and I am programmatically analyzing data from a Control Chart. (Shout out to my Lean Six Sigma friends out there.) There are a number of rules that, if broken, identify a potential anomaly. My scenario is this:

I have an array of data that consists of "A," "B," or "C." If 2 out of 3 consecutive points are A's, I want to flag these points for further analysis. For example, ["A", "C", "A"] and ["B", "A", "A"] would be flagged, but ["A", "C", "B"] would not. In addition, I need to identify them by their location in the array so I can highlight them accordingly.

I have code that works, but there has to be a more efficient way of doing this.

Here is my code.

const ee = require( "events" ).EventEmitter;

// Flagged index triples, keyed by test name.
let results = {
    "test1": [],
    "test2": []
};

let emitter = new ee();
emitter.on( "points", ( _pts, _me ) => {
    results[ _me ].push( _pts );
} );

let zones = [ "C", "C", "A", "C", "C", "A", "B", "A", "C", "C", "C", "A", "A", "A", "A", "C" ];

for ( let i = 0; i < zones.length; i++ ) {
    let me = "test1";
    let pts = [];
    let str = "";
    // Collect up to three consecutive points starting at i.
    // (j is declared with let so it does not leak as an implicit global.)
    for ( let j = i; j < i + 3; j++ ) {
        if ( zones[ j ] !== undefined ) {
            pts.push( j );
            str += zones[ j ];
        }
    }
    // Only full windows of three points are tested.
    if ( str && str.length === 3 ) {
        let _temp = str.match( /A/g );
        if ( _temp && _temp.length >= 2 ) {
            emitter.emit( "points", pts, me );
        }
    }
}
console.log( JSON.stringify( results, null, 2 ) );

The results look like this (whitespace condensed):

{"test1":[[5,6,7],[10,11,12],[11,12,13],[12,13,14],[13,14,15]],"test2":[]}

(There will be many tests, so "test2" in the results is just a placeholder for now.)

As I said, it works and gives me precisely the results I am looking for, but I refuse to believe there is not a better way to do this.

Any help is greatly appreciated.

A simpler way to get those `str`s would be `zones.slice(i, i+3).join('')`. But really you shouldn't need strings at all and then use a regular expression match against them - better create an integer `let aCount = 0` and increment that in your loop: `if (zones[j] === 'A') aCount += 1;`, then just check that! — Bergi, May 01 '23 at 19:26
Should the code be easily extendable to slices larger than 3? — Bergi, May 01 '23 at 19:27
@Bergi re: extendable to slices larger than 3, short answer is yes, but the rules start to change. For example, the next rule would be 4 out of 5 in zones A or B. Similar, but different zones and different ratio. — kurt.kincaid, May 01 '23 at 19:47
In that case, you should use a [sliding window](https://stackoverflow.com/q/8269916/1048572) with counts by element — Bergi, May 01 '23 at 19:52
If the example was "CCACCABACCCAAAAC" instead, one could find matches for the regular expression "AA|A.A". (Hey, you're appending *zone*s to a string already, and use RE.) — greybeard, May 02 '23 at 11:09
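The sliding-window suggestion from the comments could be sketched roughly as follows. This is only one possible reading of it, not code from any commenter: `findRuns` and its parameters `targets`, `size`, and `minHits` are hypothetical names introduced for illustration. Instead of rebuilding a string and running a regular expression for every position, it keeps a running count of matching points, adding the point that enters the window and subtracting the one that leaves, so each element is examined a constant number of times.

```javascript
// Hypothetical helper (not from the original post): return the index lists of
// every window of `size` consecutive points in which at least `minHits`
// points belong to one of the `targets` zones.
function findRuns( zones, targets, size, minHits ) {
    const wanted = new Set( targets );
    const flagged = [];
    let hits = 0;
    for ( let i = 0; i < zones.length; i++ ) {
        // Count the point entering the window on the right.
        if ( wanted.has( zones[ i ] ) ) hits += 1;
        // Discount the point that just left the window on the left.
        if ( i >= size && wanted.has( zones[ i - size ] ) ) hits -= 1;
        // Only full windows are tested, as in the original length check.
        if ( i >= size - 1 && hits >= minHits ) {
            const start = i - size + 1;
            flagged.push( Array.from( { length: size }, ( _, k ) => start + k ) );
        }
    }
    return flagged;
}

const zones = [ "C", "C", "A", "C", "C", "A", "B", "A", "C", "C", "C", "A", "A", "A", "A", "C" ];

// Rule from the question: 2 out of 3 consecutive points in zone A.
const test1 = findRuns( zones, [ "A" ], 3, 2 );
console.log( JSON.stringify( test1 ) );
// → [[5,6,7],[10,11,12],[11,12,13],[12,13,14],[13,14,15]]

// Follow-up rule mentioned in the comments: 4 out of 5 in zones A or B.
const test2 = findRuns( zones, [ "A", "B" ], 5, 4 );
```

Because the zones, window size, and threshold are parameters, each additional rule becomes another call rather than another loop. Whether this is measurably faster than the original for arrays of this size is untested here; the main gain is likely readability and reuse.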
