7

I am looking to create a program that would identify certain patterns in numbers. I am not sure if this needs an algorithm or just carefully thought out programming. I am not looking for someone to provide source code, just some thought provoking ideas to start me in the right direction.

Numbers will be fixed length of 6 digits from 000000 to 999999. I guess each number would be stored as part of an array. I would then like to test the number against a pattern.

For example lets say the 3 patterns I am using are

A A A A A A - would match such examples as 111111 , 222222, 333333 etc where 
A B A B A B - would match such examples as 121212 , 454545, 919191 etc
A (A+1) (A+2) B (B+1) (B+2) - would match such examples as 123345, 789123, 456234

I guess the part that I am stuck on is how to allocate each part of the integer array to a value such as A or B

My initial thoughts would be just to allocate each part as an individual letter. So if the array consisted of 1 3 5 4 6 8, then I would create a map like

A=1
B=3
C=5
D=4
E=6
F=8

Then some how take the first pattern,

AAAAAA

and test with something like if (AAAAAA = ABCDEF) then we have matched AAAAAAA

if not then try (ABABAB = ABCDEF) etc through all my patterns

In this scenario there is no reason why the value assigned to C cant be the same as the value assigned to F like in the number 234874.

I am not sure if this will even make sense to anyone but I guess I can refine my question based on feedback.

In summary, I am looking for ideas on how to have a program accept a 6 digit number and return back to us which pattern it matched.

SOLUTION

After the comments given put me on a good track below is what i created as a final solution.

package com.doyleisgod.number.pattern.finder;

import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;


public class FindPattern {
private final int[] numberArray; //Array that we will match patterns against.
private final Document patternTree = buildPatternTree(); //patternTree containing all the patterns
private final Map<String, Integer> patternisedNumberMap; //Map used to allocate ints in the array to a letter for pattern analysis
private int depth = 0; //current depth of the pattern tree

// take the int array passed to the constructor and store it in out numberArray variable then build the patternised map
public FindPattern (int[] numberArray){
    this.numberArray = numberArray;
    this.patternisedNumberMap = createPatternisedNumberMap();
}

//builds a map allocating numbers to letters. map is built from left to right of array and only if the number does not exist in the map does it get added
//with the next available letter. This enforces that the number assigned to A can never be the same as the number assigned to B etc
private Map<String, Integer> createPatternisedNumberMap() {
    Map<String, Integer> numberPatternMap = new HashMap<String, Integer>();

    ArrayList<String> patternisedListAllocations = new ArrayList<String>();
    patternisedListAllocations.add("A");
    patternisedListAllocations.add("B");
    patternisedListAllocations.add("C");
    patternisedListAllocations.add("D");
    Iterator<String> patternisedKeyIterator = patternisedListAllocations.iterator();

    for (int i = 0; i<numberArray.length; i++){
        if (!numberPatternMap.containsValue(numberArray[i])) {
            numberPatternMap.put(patternisedKeyIterator.next(), numberArray[i]);
        } 
    }
    return numberPatternMap;
}

//Loads an xml file containing all the patterns.
private Document buildPatternTree(){
    Document document = null;
    try {
    File patternsXML = new File("c:\\Users\\echrdoy\\Desktop\\ALGO.xml");
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    document = db.parse(patternsXML);

    } catch (Exception e){
        e.printStackTrace();
        System.out.println("Error building tree pattern");
    }
    return document;
}

//gets the rootnode of the xml pattern list then called the dfsnodesearch method to analyse the pattern against the int array. If a pattern is found a
//patternfound exception is thorwn. if the dfsNodeSearch method returns without the exception thrown then the int array didn't match any pattern
public void patternFinder() {
    Node rootnode= patternTree.getFirstChild();
    try {
        dfsNodeSearch(rootnode);
        System.out.println("Pattern not found");
    } catch (PatternFoundException p) {
        System.out.println(p.getPattern());
    }
}

//takes a node of the xml. the node is checked to see if it matches a pattern (this would only be true if we reached the lowest depth so must have 
//matched a pattern. if no pattern then analyse the node for an expression. if expression is found then test for a match. the int from the array to be tested
//will be based on the current depth of the pattern tree. as each depth represent an int such as depth 0 (i.e root) represent position 0 in the int array
//depth 1 represents position 1 in the int array etc. 
private void dfsNodeSearch (Node node) throws PatternFoundException {
    if (node instanceof Element){
        Element nodeElement = (Element) node;
        String nodeName = nodeElement.getNodeName();

        //As this method calls its self for each child node in the pattern tree we need a mechanism to break out when we finally reach the bottom
        // of the tree and identify a pattern. For this reason we throw pattern found exception allowing the process to stop and no further patterns.
        // to be checked.
        if (nodeName.equalsIgnoreCase("pattern")){
            throw new PatternFoundException(nodeElement.getTextContent());
        } else {
            String logic = nodeElement.getAttribute("LOGIC");
            String difference = nodeElement.getAttribute("DIFFERENCE");

            if (!logic.equalsIgnoreCase("")&&!difference.equalsIgnoreCase("")){
                if (matchPattern(nodeName, logic, difference)){
                    if (node.hasChildNodes()){
                        depth++;
                        NodeList childnodes = node.getChildNodes();
                        for (int i = 0; i<childnodes.getLength(); i++){
                            dfsNodeSearch(childnodes.item(i));
                        }
                        depth--;
                    }
                } 
            }
        }


    }
}

//for each node at a current depth a test will be performed against the pattern, logic and difference to identify if we have a match. 
private boolean matchPattern(String pattern, String logic, String difference) {
    boolean matched = false;
    int patternValue = patternisedNumberMap.get(pattern);

    if (logic.equalsIgnoreCase("+")){
        patternValue += Integer.parseInt(difference);
    } else if (logic.equalsIgnoreCase("-")){
        patternValue -= Integer.parseInt(difference);
    }

    if(patternValue == numberArray[depth]){
        matched=true;
    }

    return matched;
}


}

the xml list of patterns looks like this

<?xml version="1.0"?>
<A LOGIC="=" DIFFERENCE="0">
    <A LOGIC="=" DIFFERENCE="0">
        <A LOGIC="=" DIFFERENCE="0">
            <A LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(A)(A)(A)</pattern>
            </A>
            <B LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(A)(B)(A)</pattern>
            </B>
        </A>
        <B LOGIC="=" DIFFERENCE="0">
            <A LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(A)(B)(A)</pattern>
            </A>
            <B LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(A)(B)(B)</pattern>
            </B>
        </B>
    </A>
    <A LOGIC="+" DIFFERENCE="2">
        <A LOGIC="+" DIFFERENCE="4">
            <A LOGIC="+" DIFFERENCE="6">
                <pattern>(A)(A+2)(A+4)(A+6)</pattern>
            </A>
       </A>
    </A>
    <B LOGIC="=" DIFFERENCE="0">
        <A LOGIC="+" DIFFERENCE="1">
            <B LOGIC="+" DIFFERENCE="1">
                <pattern>(A)(B)(A+1)(B+1)</pattern>
            </B>
        </A>
        <A LOGIC="=" DIFFERENCE="0">
            <A LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(B)(A)(A)</pattern>
            </A>
            <B LOGIC="+" DIFFERENCE="1">
                <pattern>(A)(B)(A)(B+1)</pattern>
            </B>
            <B LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(B)(A)(B)</pattern>
            </B>
        </A>
        <B LOGIC="=" DIFFERENCE="0">
            <A LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(B)(B)(A)</pattern>
            </A>
            <B LOGIC="=" DIFFERENCE="0">
                <pattern>(A)(B)(B)(B)</pattern>
            </B>
        </B>
        <A LOGIC="-" DIFFERENCE="1">
            <B LOGIC="-" DIFFERENCE="1">
                <pattern>(A)(B)(A-1)(B-1)</pattern>
            </B>
        </A>
    </B>
    <A LOGIC="+" DIFFERENCE="1">
        <A LOGIC="+" DIFFERENCE="2">
            <A LOGIC="+" DIFFERENCE="3">
                <pattern>(A)(A+1)(A+2)(A+3)</pattern>
            </A>
        </A>
    </A>
    <A LOGIC="-" DIFFERENCE="1">
        <A LOGIC="-" DIFFERENCE="2">
            <A LOGIC="-" DIFFERENCE="3">
                <pattern>(A)(A-1)(A-2)(A-3)</pattern>
            </A>
        </A>
    </A>
</A>

and my pattern found exception class looks like this

package com.doyleisgod.number.pattern.finder;

public class PatternFoundException extends Exception {
private static final long serialVersionUID = 1L;
private final String pattern;

public PatternFoundException(String pattern) {
    this.pattern = pattern;
}

public String getPattern() {
    return pattern;
}

}

Not sure if any of this will help anyone with a similar problem or if anyone has any comments on how this is working would be great to hear them.

suspectus
  • 16,548
  • 8
  • 49
  • 57
Chris Doyle
  • 10,703
  • 2
  • 23
  • 42
  • Well, I wouldn't hardcode an exact match (ie, A = 1 type situations). You'd probably want to consider each different character encountered to be a different digit. Space separations should probably be different numbers (not digits). – Clockwork-Muse Jul 10 '12 at 21:53
  • This is AI stuff. Douglas Hofstadter made his carrier on just such pattern discovery. – Marko Topolnik Jul 10 '12 at 22:09
  • hey, do you want to have several predefined patterns and just check them ? And is there already a semantics how your patterns will be stored or is there no fix format ? – DrLudwig3 Jul 10 '12 at 22:09
  • I have built a small model is it worth while for me to post it here so others who might want to solve a similar issue can look at it. If so where to di post it? in a new thread? or edit my original post in this thread? – Chris Doyle Jul 14 '12 at 09:01

5 Answers5

2

I would suggest to build state machine:

A. Initialization with patterns:

  1. Normalize all patterns
  2. Build a tree with depth = 6 that represents all patterns starting from root and all possible choices on every depth.

B. Run state machine


A.1. Pattern normalization.

A A A A A A => A0 A0 A0 A0 A0 A0

C A C A C A => A0 B0 A0 B0 A0 B0 (always start with A, and then B, C, D, etc.)

B B+1 B+2 A A+1 A+2 => A0 A1 A2 B0 B1 B2

Thus, you always have normalized pattern start with A0.

A.2. Build a tree

1.       A0
       /  | \
2.   A0  B0  A1
      |   |   |
3.   A0  A0  A2
      |   |   |
4.   A0  B0  B0
      |   |   |
5.   A0  A0  B1
      |   |   |
6.   A0  B0  B2
      |   |   |
     p1  p2  p3

B. Run state machine

Use Depth-first search algorithm using recursion to find matched pattern.

Does it make sense?

Yuri
  • 81
  • 5
  • Thanks Yuri, This seems to have started me in the right direction. the method I have taken is to create a tree structure of patterns. such that all patterns would be contained in the the tree and with the DFS technique to elimate branches at the highest level thus eliminating multiple patterns. – Chris Doyle Jul 11 '12 at 20:59
0

You could break your numbers into an array of bytes (or ints if you prefer), one for each character. Each pattern could be defined in terms of direct comparison between the appropriate elements of the array.

ababab=a[0]==a[2] && a[2]==a[4] && a[1]==a[3] && a[3]==a[5] && a[0]!=a[1]
aaabbb=a[0]==a[1] && a[1]==a[2] && a[3]==a[4] && a[4]==a[5] && a[0]!=a[3]
fedcba=(a[0]-a[1])==1 && (a[1]-a[2])==1 && (a[2]-a[3])==1 && (a[3]-a[4])==1 && (a[4]-a[5]==1)
phatfingers
  • 9,770
  • 3
  • 30
  • 44
0

This is easy with the following, very fast match function.

Given the patterns are Arrays of PatternElement, where PatternElement consists of a Letter and an integer.

Given the Numbers are Arrays of Digits.

Now call a match() function with each combination of Number and Digit (nested for-loop).

The match function iterates over pattern and number from left to right, and does a replacing of the numbers to match the letters in the pattern. If a digit is replaced, replace all same digits to he right also. If you reach an index which is not a digit, because it is already replaced, then check if this element matches the pattern.

Example 1 iterations. Pattern: A B A C    Number 3 2 3 5
                               ^                 ^ 
                   1.                            A 2 A 5    replace 3 by A. 
                                 ^                 ^
                   2.                            A B A 5    replace 2 by B
                                   ^                 ^    
                   3.                            A B A 5    Existing A matches pattern. Good
                                     ^                 ^
                   4.                            A B A C    replace 5 by C. SUCCESS

For the (n+2) you could do the same, by doing some extra math during the replacement or matching.

NOTE: The number does not need to consist just of digits. If you like to replace any char, you can even match similar patterns! So A B C A B C would also match D F G D F G in the generic form.

Daniel
  • 27,718
  • 20
  • 89
  • 133
0

You may infer constraints and then use backtracking algorithms to find out whether particular number matches the pattern (it is sometimes referred to as constraint satisfaction problem).

For example, let's denote each of 6 digits d1, d2, d3, d4, d5 and d6 respectively. Then, the pattern A (A+1) (A+2) B (B+1) (B+2) may be rewritten to the following set of rules/constraints:

dx = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
d2 = d1 + 1
d2 = d1 + 2
d4 = d3 + 1
d5 = d3 + 2

In your examples I can see only 2 kinds of inferred rules - one for equality of digits (d1 = d2 if their positions in pattern have same character) and another for arithmetic operations (I don't know whether A and B should be different numbers, and if so you will need additional kind of rules for inequality). All kinds may be trivially translated into constraints.

Tree of possible solutions may be then compiled from these constraints. It may start with something like:

[A = ?]
|-- 1
|-- 2
|-- 3
|   |-- [B = ?]
|       |-- 1
...     ...

and then contain constraints on other nodes.

It may be hard to implement backtracking algorithm correctly by yourself, so it makes sense to find Prolog implementation for your platform (see this question for Java) and then just translate constraints to Prolog rules.

Community
  • 1
  • 1
ffriend
  • 27,562
  • 13
  • 91
  • 132
0

For the specific patterns you mentioned as examples, simple programs can be written to check if input strings match.

If the patterns get more complicated than the ones you mention, you can use Context-free grammar and Regular expressions to specify your rules. And there are tools like lex and yacc which will process input strings according to the rules specified and return if they matched or no.

user1168577
  • 1,863
  • 11
  • 14