TL;DR: I wrote a random test data generator, not a slogan generator
On the Importance of Test Data
We all know that test data is really really important.
If you don’t have the right data, you can’t test the functionality. You can’t check that an email is sent out on the customer’s 65th Birthday unless you have a customer who has a date of birth that will trigger that functionality.
Some data is path invariant: it has to be specific in order to control the path taken through the system.
We know this.
But we don’t always randomise enough data, and our test data becomes stale, etc. etc.
One of my hobbies - randomly generating test data
Periodically I write code to randomly generate data. It’s easier than writing a full compiler and interpreter but still keeps my hand in at parsing text. You can find old notes and tools on test data.
My most recent public test data utility attempts to randomly recreate some of the cartoon ‘slogans’ from my book “Dear Evil Tester”:
- “Of course I’m not Evil… do I look Evil?”
- “Are you a good little tester? I’m better than that, I’m Eeevil!”
- “I’m not evil, I’m just doing WHATEVER it takes”
My Sloganizer is a Test Data Generator
I have an array of strings which are sentence templates e.g.
- “#start I’m not #im_not”
- “#start I’m #im_not”
Everything starting with a “#” is a ‘macro’, everything else is a string literal.
The ‘macros’ are a hash of:
- ‘key’ - which matches the macro name e.g. “start” and “im_not”
- ‘value’ - which is an array of strings, where the string might be another macro or a literal
"start" : ["", "Of course", "I honestly believe", "I really do think"],
"im_not" : ["evil", "good", "nasty", "unpleasant"],
And I have a recursive function which, given a string will:
- work through the string
- if it finds a ‘macro’ name then it randomly chooses a string from the macro array and expands it
- if it finds a literal then it adds it to the output string
So “#start I’m not #im_not” might generate:
- I’m not good
- Of course I’m not nasty
- I really do think I’m not evil
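The expansion described above can be sketched in a few lines of Python. This is an illustration of the idea, not the Sloganizer's actual code: the macro table is copied from the post, but the names `MACROS`, `MACRO_PATTERN` and `expand` are hypothetical, and I am assuming a macro name is a run of letters, digits and underscores after the “#”.

```python
import random
import re

# Macro table copied from the post; all names in this sketch are hypothetical.
MACROS = {
    "start": ["", "Of course", "I honestly believe", "I really do think"],
    "im_not": ["evil", "good", "nasty", "unpleasant"],
}

# Assumption: a macro is "#" followed by letters, digits or underscores.
MACRO_PATTERN = re.compile(r"#(\w+)")


def expand(template, macros, rng=random):
    """Recursively replace every #macro in template with a random choice."""

    def replace(match):
        # Raises KeyError if the macro has no entry in the table.
        options = macros[match.group(1)]
        # The chosen string may itself contain macros, so expand it too.
        return expand(rng.choice(options), macros, rng)

    expanded = MACRO_PATTERN.sub(replace, template)
    # Tidy the spacing left behind when a macro expands to "".
    return " ".join(expanded.split())


if __name__ == "__main__":
    for _ in range(5):
        print(expand("#start I’m not #im_not", MACROS))
```

Passing a seeded `random.Random` as `rng` makes a run repeatable, which helps when a generated sentence triggers a bug and you want to see it again.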
The code isn’t particularly forgiving when given bad data:
- it could get into an infinite loop if a macro string references itself
- if a macro doesn’t have an entry in the hash then the code will throw an exception
The code doesn’t ‘compile’ the sentences or phrases to find these problems in advance (although it could, if I wrote code to do that).
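A check like that might look something like the sketch below. It is hypothetical and not part of the Sloganizer: `find_problems` and its messages are made up for illustration, and it assumes the same “#name” macro syntax. It reports macros that are used but never defined, and macros that can expand back into themselves. The second check is conservative, because a self-referencing macro only loops forever if the random choices keep picking the recursive option.

```python
import re

# Assumption: a macro is "#" followed by letters, digits or underscores.
MACRO_PATTERN = re.compile(r"#(\w+)")


def find_problems(templates, macros):
    """Return a list of human-readable problems found before generating."""
    problems = []

    # Which macros does each macro's option list refer to?
    refs = {
        name: {m for option in options for m in MACRO_PATTERN.findall(option)}
        for name, options in macros.items()
    }

    # Undefined macros used directly in the sentence templates.
    for template in templates:
        for name in MACRO_PATTERN.findall(template):
            if name not in macros:
                problems.append(f"template {template!r} uses undefined macro #{name}")

    # Undefined macros used inside other macros.
    for name, used in refs.items():
        for ref in sorted(used):
            if ref not in macros:
                problems.append(f"macro #{name} uses undefined macro #{ref}")

    def reaches_itself(start):
        # Walk the reference graph looking for a path back to start.
        stack, seen = list(refs[start]), set()
        while stack:
            current = stack.pop()
            if current == start:
                return True
            if current in seen or current not in refs:
                continue
            seen.add(current)
            stack.extend(refs[current])
        return False

    for name in macros:
        if reaches_itself(name):
            problems.append(f"macro #{name} can expand back into itself")

    return problems
```

Running this once over the templates and the macro hash before generating anything would turn both failure modes above into a readable list of messages.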
But it does work, and it will generate thousands, if not millions, of random sentences.
What’s the point?
The point is that:
- It doesn’t take much to create random data.
- It doesn’t take a long time to write utility functions to generate random data.
- Even if you can’t find a library that you like for the language you use, you could write your own, or probably re-purpose a template engine to create data.
And, more dangerously… it’s fun to write random data generation code.