In this article, we will examine some practical tips to keep in mind while implementing grammar-based test case generation. These guidelines are based on the experience of implementing Gramtest - a Java tool that allows you to generate test cases based on arbitrary user defined grammars. Let's jump right in on how we implemented Gramtest.
#Implementation The key aspect of the grammar-based test case generation algorithm in Gramtest is to follow all the production rules of the given BNF grammar and then generate strings that conform to the grammar. The production rules themselves form a tree, the root of the tree is the starting rule for generating all the strings in the grammar. For example, consider the following BNF grammar describing all the course codes at a university:
<coursecode> ::= <acadunit> <coursenumber>
<acadunit> ::= <letter> <letter> <letter>
<coursenumber> ::= <year> <semesters> <digit> <digit>
<year> ::= <ugrad> | <grad>
<ugrad> ::= 0 | 1 | 2 | 3 | 4
<grad> ::= 5 | 6 | 7 | 9
<semesters> ::= <onesemester> | <twosemesters>
<onesemester> ::= <frenchone> | <englishone> | <bilingual>
<frenchone> ::= 5 | 7
<englishone> ::= 1 | 3
<bilingual> ::= 9
<twosemesters> ::= <frenchtwo> | <englishtwo>
<frenchtwo> ::= 6 | 8
<englishtwo> ::= 2 | 4
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<letter> ::= A | B | C | D | E | F | G | H | I | J | K | L | M | N |
O | P | Q | R | S | T | U | V | W | X | Y | Z
In this grammar the rule <coursecode> ::= <acadunit> <coursenumber>
is at the root. In order to generate the strings in this grammar, we follow all the rules starting from the root (going from top to bottom) to a terminal. When we reach a terminal, we generate a string corresponding to that terminal. For rules that contain alternatives we need to follow all the alternate branches generating strings in an exhaustive manner. Thus, when we run Gramtest on this input it generates the following strings:
Asankhayas-MacBook-Pro:target asankhaya$
java -jar gramtest-0.1-SNAPSHOT-jar-with-dependencies.jar
-file ../src/test/resources/coursecodes.bnf
Generating tests ...
ZZX0989
ZZW0989
ZZW0988
ZZV0988
ZZV0987
ZZU0987
ZZU0986
ZZT0986
ZZT0985
ZZS0985
ZZS0984
ZZR0984
ZZR0983
ZZQ0983
ZZQ0982
ZZP0982
ZZP0981
...
This simple algorithm based on exhaustive search over the production rules guarantees that we will generate all possible strings in the grammar. However, it may not be feasible to do so all the time. Let us look at some of the challenges with this approach that make it difficult to use it for practical test case generation.
#Challenges
In general, a given BNF grammar can contain infinitely many strings due to the recursive nature of the production rules. Recall the following grammar for arithmetic expressions from our previous article:
<expression> ::= <term> <addOps> <expression> | <term>
<term> ::= <factor> <multOps> <term> | <factor>
<addOps> ::= + | -
<multOps> ::= * | /
<factor> ::= "(" <expression> ")" | <constant>
<constant> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
This grammar captures all possible arithmetic expressions and thus if we blindly follow the rules and generate strings, the test case generation will never finish. It is also possible for a BNF grammar without recursive rules to have an unbounded number of strings if the grammar uses the repetition operator. Due to all these cases we need to find a way to terminate the test-case generation algorithm early, otherwise Gramtest would not be very useful for automated fuzzing and testing.
#Practical Tips
We look at three useful ideas that improve on the simple naive exhaustive test case generation and provide a mechanism to address the challenges described in the previous section. All the following three tips are implemented in Gramtest, and if you are curious you can also have a look at the source code.
###Tip 1: Restrict the number of tests to be generated
The easiest way to fix the problem is to just restrict the maximum number of test cases that can be generated. In Gramtest, this can be done by using the -num
switch. This will ensure that the test case generation algorithm stops after generating the specified number of tests. For example we can generate 10 test cases from the BNF grammar of arithmetic expressions by setting -num 10
as shown below:
Asankhayas-MacBook-Pro:target asankhaya$
java -jar gramtest-0.1-SNAPSHOT-jar-with-dependencies.jar
-file ../src/test/resources/arithexp.bnf -num 10
Generating tests ...
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+0
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+1
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+2
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+3
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+4
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+5
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+6
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+7
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+8
(0*0+0)*(0)*0*0*0+(0)*0*0*0+0*0*0+0*0+9
###Tip 2: Bound the depth of recursive rules
The first tip, though useful, will unfortunately not work for a grammar with recursive rules. While generating the test cases for a recursive rule, we can end up applying the rule again and again (due to recursion) and thus it is possible that the algorithm will not terminate even while generating a single string. To handle such cases we propose bounding the depth of the recursive rule. In Gramtest it can be done by setting the -dep
parameter as shown below:
Asankhayas-MacBook-Pro:target asankhaya$
java -jar gramtest-0.1-SNAPSHOT-jar-with-dependencies.jar
-file ../src/test/resources/arithexp.bnf -num 10 -dep 1
Generating tests ...
(0)*0*0*0+0*0*0+0*0+0
(0)*0*0*0+0*0*0+0*0+1
(0)*0*0*0+0*0*0+0*0+2
(0)*0*0*0+0*0*0+0*0+3
(0)*0*0*0+0*0*0+0*0+4
(0)*0*0*0+0*0*0+0*0+5
(0)*0*0*0+0*0*0+0*0+6
(0)*0*0*0+0*0*0+0*0+7
(0)*0*0*0+0*0*0+0*0+8
(0)*0*0*0+0*0*0+0*0+9
By setting -dep 1
above, we ensure that when Gramtest sees a recursive rule it will apply the rule only once (follow the rule only once). Typically, we use this parameter in conjunction with restriction on the maximum number of test cases to ensure that the algorithm terminates. The -dep
parameter also implicitly controls the length of the generated strings. If we compare the output above with the one under the previous tip where the default value of -dep
(2) was used, it is clear that the length of the strings generated in this case are smaller.
###Tip 3: Use a minimal sentence generator
If you have a careful look at the strings that are generated above, you will notice that they all exercise only one part of the grammar and they are very similar to each other. For good test case generation we want the generated tests to be more diverse so that they can exercise different paths in the program that is being tested. The quality of the test cases is usually measured using coverage criteria like statement coverage (percentage of statements in the program that are executed by the tests), branch coverage (percentage of conditional branches that are executed by the tests) etc. For grammar-based test case generation, a useful metric is the production coverage. Production coverage refers to the percentage of production rules in the grammar that are exercised by the test cases.
For achieving production coverage, we can also use a minimal sentence generator. A minimal sentence generator creates a string with the minimum length that is required for the given production rule. Paul Purdom presented a minimal sentence generator in his classical paper on testing parsers. Although the paper presents the parsers for simple LR(1) grammars, the same ideas can be extended and applied to other grammars. In my paper on Building Extensible Parsers using Camlp4 I describe one such variation of Purdom's algorithm that can be used to test the extensible grammars supported by Camlp4. Gramtest uses a similar variation for generating minimal sentences for BNF grammars.
The minimal sentence generator can be set using the -mingen
flag as follows:
Asankhayas-MacBook-Pro:target asankhaya$
java -jar gramtest-0.1-SNAPSHOT-jar-with-dependencies.jar
-file ../src/test/resources/arithexp.bnf
-num 10 -dep 2 -mingen true
Generating tests ...
(2*0+9)*(1)+(0)/9
(2*0+9)*(1)+(0)*0
(3+0)-(9)*0
(3+0)-4*1
(3+0)+4*1
(3+0)+4/2
4*(3)+4/2
(3-2)*(2)+(2)*3
(3-1)*(3)-(2)*3
(3+1)/(2)-(1)
Looking at the output we see that the generated tests are much more diverse and cover different alternatives in the grammar using smaller sentences.
By using all the three tips we get a tool that is more useful and has practical applications. The default value of the options used in Gramtest are -num 100 -dep 2 -mingen true
, but please go ahead and have a look at the source code or play around with the other options. For a given BNF grammar you may get better results with a different set of options. If you have any further tips based on your experience or have any other suggestions on improving Gramtest, do let us know in the comments.