3

How to unit generated strings where the end order is fairly flexible. Lets say I'm trying to test some code that prints out out generated SQL that comes from key-value pairs. However, the exact order of many of the fragments does not matter.

For example

SELECT
    *
FROM
    Cats
WHERE
    fur = 'fluffy'
OR
    colour = 'white'

is functionally identical to

SELECT
    *
FROM
    Cats
WHERE
    colour = 'white'
OR
    fur = 'fluffy'

It doesn't matter in which order the condition clauses get generated, but it does matter that they follow the where clause. Also, it is hard to predict since the ordering of pairs when looping through the entrySet() of a HashMap is not predictable. Sorting the keys would solve this, but introduces a runtime penalty for no (or negative) business value.

How do I unit test the generation of such strings without over-specifying the order?

I thought about using a regexp but* I could not think of how to write one that said:

A regex is what I was thinking of but I can think of a regex that says something like "SELECT * FROM Cats WHERE" followed by one of {"fur = 'fluffy', colour = 'white'} followed by "OR"followed by one of one of {"fur = 'fluffy',colour = 'white'} ... and not the one used last time.

NB: I'm not actually doing this with SQL, it just made for an easier way to frame the problem.

Sled
  • 18,541
  • 27
  • 119
  • 168
  • 1
    And I don't really see any question out there. – Rohit Jain Jan 03 '13 at 15:41
  • @RohitJain I thought the question was implict and obvious, but I had stated it more literally for those who are having trouble following along. – Sled Jan 03 '13 at 16:09

4 Answers4

4

I see a few different options:

If you can live with a modest runtime penalty, LinkedHashMap keeps insertion order.

If you want to solve this completely without changing your implementation, in your example I don't see why you should have to do something more complicated than checking that every fragment appears in the code, and that they appear after the WHERE. Pseudo-code:

Map<String, String> parametersAndValues = { "fur": "fluffy", "colour", "white" };
String generatedSql = generateSql(parametersToValues);
int whereIndex = generatedSql.indexOf("WHERE");
for (String key, value : parametersAndValues) {
    String fragment = String.format("%s = '%s'", key, value);
    assertThat(generatedSql, containsString(fragment));
    assertThat(whereIndex, is(lessThan(generatedSql.indexOf(fragment))));
}

But we can do it even simpler than that. Since you don't actually have to test this with a large set of parameters - for most implementations there are only three important quantities, "none, one, or many" - it's actually feasible to test it against all possible values:

String variation1 = "SELECT ... WHERE fur = 'fluffy' OR colour = 'white'";
String variation2 = "SELECT ... WHERE colour = 'white' OR fur = 'fluffy'";
assertThat(generatedSql, is(anyOf(variation1, variation2)));

Edit: To avoid writing all possible variations by hand (which gets rather tedious if you have more than two or three items as there are n! ways to combine n items), you could have a look at the algorithm for generating all possible permutations of a sequence and do something like this:

List<List<String>> permutations = allPermutationsOf("fur = 'fluffy'", 
    "colour = 'white'", "scars = 'numerous'", "disposition = 'malignant'");
List<String> allSqlVariations = new ArrayList<>(permutations.size());
for (List<String> permutation : permutations) {
    allSqlVariations.add("SELECT ... WHERE " + join(permutation, " OR "));
}
assertThat(generatedSql, is(anyOf(allSqlVariations)));
Community
  • 1
  • 1
gustafc
  • 28,465
  • 7
  • 73
  • 99
  • "Since you don't actually have to test this with a large set of parameters, it's actually feasible to test it against all possible values" Good point. It isn't elegant, but it's true and it is a unit test so it should be fairly minimal. – Sled Jan 03 '13 at 16:15
  • My point exactly. As long as the strings are short enough for a human reader to easily understand the way in which they differ, I think it's elegant enough (as elegant as it gets with hard-coding, anyway). – gustafc Jan 03 '13 at 16:24
  • 1
    +1 Creative idea. It still looks very brittle to me, as it will break on many variations to the SQL which do not change its meaning - but that's unavoidable if you want to check the SQL yourself. – sleske Jan 03 '13 at 16:28
  • Well, I think you rely on the integration tests somewhat more and then live with some brittleness. Though, I still smart the build from hen I broke the build be standardising the case in SQL statements and broke some unit tests that were hardcoded to lower case. – Sled Jan 04 '13 at 14:04
1

Well, one option would be to somehow parse the SQL, extract the list of fields and check that everything is ok, disregarding order of the fields. However, this is going to be rather ugly: If done right, you have to implement a complete SQL parser (obviously overkill), if you do it quick-and-dirty using regex or similar, you risk that the test will break for minor changes to the generated SQL.

Instead, I'd propose to use a combination of unit and integration testing:

  • Have a unit test that tests the code which supplies the list of fields for building the SQL. I.e., have a method Map getRestrictions() which you can easily unit-test.
  • Have an integration test for the SQL generation as a whole, which runs against a real database (maybe some embedded DB like the H2 database, which you can a start just for the test).

That way, you unit-test the actual values supplied to the SQL, and you integration-test that you are really creating the right SQL.

Note: I my opinion this is an example of "integration code", which cannot be usefully unit-tested. The problem is that the code does not produce a real, testable result by itself. Rather, its purpose is to interface with a database (by sending it SQL), which produces the result. In other words, the code does the right thing not if it produces some specific SQL string, but if it drives the database to do the right thing. Therefore, this code can be meaningfully tested only with the database, i.e. in an integration test.

sleske
  • 81,358
  • 34
  • 189
  • 227
  • SQL was just used for the example, my actual use case is with a proprietary query-language algebra to is fed into a large rules engine that one cannot instantiate during a test via a SOAP message. That is why I focussed on the string in my question. – Sled Jan 03 '13 at 16:24
  • @ArtB: Thanks for the clarification. Still, I believe my point stands. If your code does not produce a real result (i.e. one that is directly useful to the user), but is meant to drive some other code (the rules engine), then it can only be meaningfully tested with that rules engine. – sleske Jan 03 '13 at 16:25
  • Of course, you can always build some parser to stand in for the rules engine, but that will be awkward (like I mentioned in the first paragraph). – sleske Jan 03 '13 at 16:26
0

First, use a LinkedHashMap instead of a regular HashMap. It shouldn't introduce any noticeable performance degradation. Rather than sorting, it retains insertion ordering.

Second, insert the pairs into the map in a well understood manner. Perhaps you are getting the data from a table, and adding an ordering index is unacceptable. But perhaps the database can be ordered by primary key or something.

Combined, those two changes should give you predictable results.

Alternatively, compare actual vs. expected using something smarter than string equals. Perhaps a regex to scrape out all the pairs that have been injected into the actual SQL query?

Dilum Ranatunga
  • 13,254
  • 3
  • 41
  • 52
  • A regex is what I was thinking of but I can think of a regex that says something like `"SELECT * FROM Cats WHERE"` followed by one of {`"fur = 'fluffy'`, `colour = 'white'} followed by `"OR"` followed by one of one of {`"fur = 'fluffy'`, `colour = 'white'} ... and not the one used last time – Sled Jan 03 '13 at 16:06
  • The idea would be to have a regex that finds each pair. So the assert would basically use the regex to extract the map of pairs, and compare it to the expected map. – Dilum Ranatunga Jan 03 '13 at 17:49
-2

The best I have come-up so far is to use some library during testing (suck as PowerMockito) to replace the HashMap with a SortedMap like TreeMap. That way for the tests the order will be fixed. However, this only works if the map isn't built in the same code that generates the string.

Sled
  • 18,541
  • 27
  • 119
  • 168
  • Can anyone suggest why this got downvoted? It is similar to another answer _and_ it came before it. – Sled Jan 03 '13 at 16:13
  • I guess because that changes the tested code before running the test. Strictly speaking that is not, what testing is about... ;) – brimborium Jan 03 '13 at 16:27
  • @brimborium I suppose that from a purest point of view that is true, still a down vote seems harsh – Sled Jan 03 '13 at 16:29
  • are you posting answer to your own question at the same time with question? – Nikolay Kuznetsov Jan 04 '13 at 04:58
  • 1
    Yeah, because it is _an_ answer and it may (sadly) be the best one so why not post it as an answer? It's more like "this is the best I could come up with, can anyone do better?" – Sled Jan 04 '13 at 13:59
  • I agree with ArtB (the downvote did not come from me), although I would mark this answer as your "best answer so far that you don't like and want to find a better one". Or you could simply tell this in the question, that would be valid too. – brimborium Jan 04 '13 at 14:49