
I have a program that should read a few Strings from the console. If the String "end" appears, it should start to calculate and write a String in the console.

The String which I am reading is a chemical equation. The two sides of the equation are separated by the two characters ->. I need to check whether the number of atoms on both sides is the same. I found this post and tried to implement it, but I have a problem with the regex.

For example:

My regex can read and calculate a chemical equation if there is a digit before the formula:

2 HCl + 2 Na -> 2 NaCl + H2

but if there is no digit, then it doesn't calculate it correctly:

HCl + Na -> NaCl + H2

My Code:

public static void main(String[] args) {
    Scanner s = new Scanner(System.in);
    List<String> list = new ArrayList<String>();
    String input = "";
    while (!(input.equals("end"))) {
        input = s.nextLine();
        list.add(input);
    }
    int before = 0;
    int after = 0;
    list.remove(list.size() - 1);

    for (int i = 0; i < list.size(); i++) {
        String string = list.get(i);
        string = string.replace("-", "");
        String[] splitted = string.split(">");
        Pattern firstPattern = Pattern.compile("(\\d+) (\\w+)");
        Matcher firstMatcher = firstPattern.matcher(splitted[0]);
        while (firstMatcher.find()) {
            int element = Integer.parseInt(firstMatcher.group(1));
            String count = firstMatcher.group(2);
            final Pattern pattern = Pattern.compile("\\d+"); // the regex
            final Matcher matcher = pattern.matcher(count); // your string
            final ArrayList<Integer> ints = new ArrayList<Integer>(); // results
            while (matcher.find()) { // for each match
                ints.add(Integer.parseInt(matcher.group())); // convert to
                                                                // int
            }
            for (int j = 0; j < ints.size(); j++) {
                before = before + element * ints.get(j);
            }

        }
        Pattern secondPattern = Pattern.compile("(\\d+) (\\w+)");
        Matcher secondMatcher = secondPattern.matcher(splitted[1]);
        while (secondMatcher.find()) {
            int element = Integer.parseInt(secondMatcher.group(1));
            String count = secondMatcher.group(2);
            final Pattern pattern = Pattern.compile("\\d+"); // the regex
            final Matcher matcher = pattern.matcher(count); // your string
            final ArrayList<Integer> ints = new ArrayList<Integer>(); // results
            while (matcher.find()) { // for each match
                ints.add(Integer.parseInt(matcher.group())); // convert to
                                                                // int
            }
            for (int j = 0; j < ints.size(); j++) {
                after = after + element * ints.get(j);
            }
        }
        if (before == after) {
            System.out.println("formally correct");
        } else {
            System.out.println("incorrect");
        }
    }
}

Here are some example chemical equations for trying out:

Input:

HCl + Na -> NaCl + H2

2 HCl + 2 Na -> 2 NaCl + H2

12 CO2 + 6 H2O -> 2 C6H12O6 + 12 O2

end

Output:

incorrect

formally correct

incorrect

  • You use `\\d+`, which means that a digit is repeated one or more times. Can you please try `\\d*` instead and see what happens? – Anton Balaniuc Dec 14 '16 at 16:39
  • @Anton It doesn't work; it throws a NumberFormatException. – universitystudent12345 Dec 14 '16 at 16:44
  • I am trying to make sure I understand the problem. So "2 HCl + 2 Na -> 2 NaCl + H2" is the input and it comes out correct? is the correct item to info after the "->" Forgive me, I forget high school chemistry. The elements can be both 2 letters (NA) and 1 letter (O)? At first I thought maybe of splitting up ones like (CO2) with a space (C O2) and including that in the regex. The + is good to split it up and should be confused with the difference. Oh, is there ever a case where the count is more than 2 digits? Also this may be helpful: [link](http://www.regexplanet.com/advanced/java/index.html) – SparkleGoat Dec 14 '16 at 17:05
  • I am not sure exactly what is required. Just matching with a number in front and behind and without? on the portion before the "->"? – SparkleGoat Dec 14 '16 at 17:27
  • @SparkleGoat Sorry that I did not explain it well. "2 HCl + 2 Na -> 2 NaCl + H2" is the input string and it is correct. The elements can have 1 or 2 letters. If an element has 2 letters, the second letter is lowercase. The count can be arbitrarily large (theoretically that's not possible, but you never know what the input is). – universitystudent12345 Dec 14 '16 at 17:40
  • @universitystudent12345 For `2 HCl + 2 Na -> 2 NaCl + H2`, you are supposed to count the atoms as 4 for `2 HCl` (2 Hydrogen and 2 Chlorine), 2 for `Na` etc. right, making the total for left and right 6 each? Or should it be just 2 each for `2 HCl` and `2 Na`, making the total 4? – Naveed S Dec 14 '16 at 18:04
  • @NaveedS No, that's right. There are 6 atoms on each side. If you look at the comment I posted below, I have split the String there; now I only need to calculate it, but I don't know how. – universitystudent12345 Dec 14 '16 at 18:07
  • @universitystudent12345 pls check my answer and see if that's your requirement. – Naveed S Dec 14 '16 at 18:43

2 Answers


I am not really sure if that is what you need. But to get individual parts of the equation, the following regex can be used:

\w+ // matches one or more word characters (\w is equal to [a-zA-Z0-9_])

Please follow the link for the details. To get the left and right parts of the equation, we can just split the string, e.g. "HCl + Na -> NaCl + H2".split("->"). After that we can do our calculations:

// requires java.util.regex.Matcher, java.util.regex.Pattern and java.util.stream.Stream
final Pattern pattern = Pattern.compile("\\w+");
Stream.of(
        "HCl + Na -> NaCl + H2",
        "2 HCl + 2 Na -> 2 NaCl + H2",
        "12 CO2 + 6 H2O -> 2 C6H12O6 + 12 O2")
    .flatMap(s -> Stream.of(s.split("->"))) // left and right side of each equation
    .peek(s -> System.out.println("part of equation: " + s))
    .forEach(s -> {
        Matcher match = pattern.matcher(s);
        while (match.find()) {
            System.out.println(match.group()); // one coefficient or one formula
        }
    });

I hope this helps.

Anton Balaniuc

So, here are the issues I could find in your logic:

  1. You are using Pattern.compile("(\\d+) (\\w+)") to match each component on either side. This pattern matches one or more digits, followed by a space, followed by one or more word characters. But the leading digits are optional, so the first capture group needs \\d* instead of \\d+. The space is optional too (there is none when no digits are written), so the pattern needs " ?" in its place. And to keep the digits from being matched by the second capture group (now that the first group may be empty), use ([A-Z]\\w*) for it. This ensures that any leading digits are matched by the first group. So your pattern for matching each component on either side should be Pattern.compile("(\\d*) ?([A-Z]\\w*)").
  2. You are using Pattern.compile("\\d+") to match the atom count (such as the 2 in H2). This misses every element that is written without a digit: in CaCl2, for example, you have to count 1 atom of Ca and 2 atoms of Cl. To do that, match each element separately, which can be done with a pattern like Pattern.compile("[A-Z][a-z]*(\\d*)").

  3. You are not calculating the total in the right way. Default the molecule count and the atom count to 1 when they are not written, multiply the two for each element, and add up all the products to get the total count.
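To illustrate the first point, here is a minimal sketch of how the component pattern splits one side of an equation into coefficient and formula. The class name ComponentPatternDemo and the helper components are made up for this illustration. It also shows why simply switching to \\d* leads to a NumberFormatException: the first group can then be an empty string, which Integer.parseInt rejects unless it is defaulted to 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
class ComponentPatternDemo {
    // Optional coefficient, optional space, then a formula starting with an uppercase letter.
    private static final Pattern COMPONENT = Pattern.compile("(\\d*) ?([A-Z]\\w*)");

    // Returns one "coefficient x formula" entry per component of one side.
    static List<String> components(String side) {
        List<String> result = new ArrayList<>();
        Matcher m = COMPONENT.matcher(side);
        while (m.find()) {
            // group(1) is "" when no coefficient is written;
            // Integer.parseInt("") would throw, so default to 1.
            int coefficient = m.group(1).isEmpty() ? 1 : Integer.parseInt(m.group(1));
            result.add(coefficient + " x " + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(components("HCl + 2 Na"));        // [1 x HCl, 2 x Na]
        System.out.println(components("2 C6H12O6 + 12 O2")); // [2 x C6H12O6, 12 x O2]
    }
}
```

The digits inside a formula (the 6 in C6H12O6) stay in the second group because \\w* also matches digits; they are only interpreted in the second step, when the formula is split into elements.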

And two suggestions:

  1. Since you have the same logic for calculation on each side, define a function and call it twice.
  2. Split by -> itself. I don't think you need to remove the hyphen first and then split by >.

Try to modify the logic yourself before going down to the code.

This is the way I defined the function for calculating the total for a side:

private static int calculateCount(String eqPart) {
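    // The space is optional (" ?"), so a leading component without a coefficient, like "HCl", is matched too.
    Matcher matcher = Pattern.compile("(\\d*) ?([A-Z]\\w*)").matcher(eqPart);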
    int totalCount = 0;
    while (matcher.find()) {
        String moleculeCountStr = matcher.group(1);
        int moleculeCount = moleculeCountStr.isEmpty() ? 1 : Integer.parseInt(moleculeCountStr);
        String molecule = matcher.group(2);
        Matcher moleculeMatcher = Pattern.compile("[A-Z][a-z]*(\\d*)").matcher(molecule);
        while (moleculeMatcher.find()) {
            String atomCountStr = moleculeMatcher.group(1);
            int atomCount = atomCountStr.isEmpty() ? 1 : Integer.parseInt(atomCountStr);
            totalCount += moleculeCount * atomCount;
        }
    }
    return totalCount;
}

Call the function with each of the two parts you get from splitting by -> and compare the totals to see whether the equation is correct.
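Putting the pieces together, the following is a minimal, self-contained sketch of that last step. The names EquationCheck, countAtoms and hasEqualAtomTotals are chosen for this illustration only, and the pattern with the optional space from point 1 is assumed. Like the question, it compares only the total number of atoms on each side, not the count per element.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class for illustration.
class EquationCheck {
    private static final Pattern COMPONENT = Pattern.compile("(\\d*) ?([A-Z]\\w*)");
    private static final Pattern ELEMENT = Pattern.compile("[A-Z][a-z]*(\\d*)");

    // Total number of atoms on one side, e.g. "2 HCl + 2 Na" gives 6.
    static int countAtoms(String side) {
        int total = 0;
        Matcher component = COMPONENT.matcher(side);
        while (component.find()) {
            String coeff = component.group(1);
            int moleculeCount = coeff.isEmpty() ? 1 : Integer.parseInt(coeff);
            Matcher element = ELEMENT.matcher(component.group(2));
            while (element.find()) {
                String index = element.group(1);
                int atomCount = index.isEmpty() ? 1 : Integer.parseInt(index);
                total += moleculeCount * atomCount;
            }
        }
        return total;
    }

    // Splits at "->" and compares the totals of both sides.
    static boolean hasEqualAtomTotals(String equation) {
        String[] sides = equation.split("->");
        if (sides.length != 2) {
            throw new IllegalArgumentException("expected exactly one \"->\": " + equation);
        }
        return countAtoms(sides[0]) == countAtoms(sides[1]);
    }

    public static void main(String[] args) {
        String[] equations = {
            "HCl + Na -> NaCl + H2",
            "2 HCl + 2 Na -> 2 NaCl + H2",
            "12 CO2 + 6 H2O -> 2 C6H12O6 + 12 O2"
        };
        for (String equation : equations) {
            System.out.println(hasEqualAtomTotals(equation) ? "formally correct" : "incorrect");
        }
    }
}
```

For the three sample inputs from the question this prints incorrect, formally correct, incorrect. Because the totals are computed per call, nothing carries over from one equation to the next, unlike the before and after variables in the question, which are declared outside the loop and never reset.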

Naveed S