This blog is not updated. Please see our current Hexawise blog on software testing to keep up with our new posts.
I can be too verbose with some of my posts. This will be quick.
I recommend that you read this. It is the Exploratory Testing Dynamics document, a tightly condensed list of useful testing heuristics authored by three of the most thoughtful and experienced software testers alive today: James Bach, Michael Bolton, and Jon Bach. Using it will help you improve your software testing capabilities. http://www.satisfice.com/blog/wp-content/uploads/2009/10/et-dynamics22.pdf
Told you it’d be brief. Go. Now. Read.
A friend passed me this set of recent tweets from Wil Shipley, a Mac developer with 11,743 followers on Twitter as of today. Wil recently encountered the familiar problem of what to do when you’ve got more software tests to run than you can realistically execute.
I love that. Who can’t relate?
Now if only there were a good, quick way to reduce the number of tests from over a billion to a smaller, much more manageable set of tests that were “Altoid-like” in their curious strength. 🙂 I rarely use this blog for shameless plugs of our test case generating tool, but I can’t help myself here. The opening is just too inviting. So here goes:
There’s an app for that… See www.hexawise.com for Hexawise, a “pairwise software test case generating tool on steroids.” It eats problems like the one you encountered for breakfast. Hexawise winnows bazillions of possible test cases down in the blink of an eye to small, manageable sets of test cases that are carefully constructed to maximize coverage in the smallest number of tests, with flexibility to adjust the solutions based upon the execution time you have available. In addition to generating pairwise testing solutions, Hexawise also generates more thorough applied statistics-based “combinatorial software testing” solutions that include tests for, say, all possible 6-way combinations of test inputs.
Where your Mac cops an attitude and tells you “Bitch, I ain’t even allocating 1 billion integers to hold your results” and showers you with taunting, derisive sneers, head-waggling and snaps all carefully choreographed to let you know where you stand, Hexawise, in contrast, would helpfully tell you: “Only 1 billion total possibilities to select tests from? Pfft! Child’s play. Want to start testing the 100 or so most powerful tests? Want to execute an extremely thorough set of 10,000 tests? Want to select a thoroughness setting in the middle? Your wish is my command, sir. You tell me approximately how many tests you want to run and the test inputs you want to include, and I’ll calculate the most powerful set of tests you can execute (based on proven applied statistics-based Design of Experiments methods) before you can say ‘I’m Wil Shipley and I like my TED Conference swag.’”
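For readers who want to see the basic idea in miniature, here is a rough sketch of pairwise winnowing. It is a naive greedy approach and not a description of how Hexawise works internally; the parameter names and values are made up for illustration, and the brute-force candidate list only works for small examples like this one.

```python
from itertools import combinations, product

# Hypothetical test inputs: 3 * 3 * 3 * 3 = 81 exhaustive combinations.
PARAMETERS = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "account": ["guest", "member", "admin"],
    "payment": ["card", "invoice", "voucher"],
}


def pairs_in_test(test, names):
    """Return the set of (parameter, value) pairs that one test exercises."""
    items = [(name, test[name]) for name in names]
    return set(combinations(items, 2))


def all_pairs_to_cover(parameters):
    """Return every 2-way value combination that some test must include."""
    names = list(parameters)
    required = set()
    for a, b in combinations(names, 2):
        for va, vb in product(parameters[a], parameters[b]):
            required.add(((a, va), (b, vb)))
    return required


def greedy_pairwise(parameters):
    """Repeatedly pick the candidate test that covers the most uncovered pairs."""
    names = list(parameters)
    candidates = [dict(zip(names, values)) for values in product(*parameters.values())]
    uncovered = all_pairs_to_cover(parameters)
    suite = []
    while uncovered:
        best = max(candidates, key=lambda t: len(pairs_in_test(t, names) & uncovered))
        suite.append(best)
        uncovered -= pairs_in_test(best, names)
    return suite


if __name__ == "__main__":
    suite = greedy_pairwise(PARAMETERS)
    print(f"{len(suite)} pairwise tests instead of 81 exhaustive ones")
    for test in suite:
        print(test)
```

Every pair of values still appears together in at least one test, which is the property that lets a small set stand in for the full cross-product.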
– Justin Hunter
There are good reasons James Bach is so well known among the testing community and constantly invited to give keynote presentations around the globe at software testing conferences. He’s passionate about testing and educating testers; he’s a gifted, energetic, and entertaining speaker with a great sense of humor; and he takes joy in rattling his saber and attacking well-established institutions and schools of thought that he disagrees with. He doesn’t take kindly to people who make inflated claims of benefits that would materialize “if only you’d perform testing in XYZ way or with ABC tool” given that (a) he can always seem to find exceptions to such claims, (b) he doesn’t shy away from confrontation, and (c) he (rightly, in my view) thinks that such benefits statements tend to discount the importance of critical thinking skills being used by testers and other important context-specific considerations.
Leave it up to James to create a list of 13 questions that would be great to ask the next software testing tool vendor who shows up to pitch his problem-solving product. In his blog post titled “The Essence of Heuristics,” he posed this exact set of questions in a slightly different context, but as a software testing tool vendor myself, they really hit home. They are:
1. Do they teach you how to tell if it’s working?
2. Do they teach you how to tell if it’s going wrong?
3. Do they teach you heuristics for stopping?
4. Do they teach you heuristics for knowing when to apply it?
5. Do they compare it to alternative heuristics?
6. Do they show you why it works?
7. Do they help you understand when it probably works best?
8. Do they help you know how to re-design it, if needed?
9. Do they let you own it?
10. Do they ask you to practice it?
11. Do they tell stories about how it has failed?
12. Do they listen to you when you question or challenge it?
13. Do they praise you for questioning and challenging it?
[Side note: Apparently I wasn’t the only one who thought of Hexawise and pairwise / combinatorial test design approaches when they saw these 13 questions. I was amused that after I drafted this post, I saw Jared Quinert’s / @xflibble’s tweet just now:]
Where do I come down on each of James’ 13 questions with respect to people I talk to about our test design tool, Hexawise, and the types of benefits and the size of benefits it typically delivers? Quite simply, “Yes” to all 13. I enjoy talking about exactly the kinds of questions that James raised in his list. In fact, when I sought out James to ask him questions at a conference in Boston earlier this year, it was because I wanted his perspective on many of the points above, particularly #11 (hearing stories about how James has seen pairwise and combinatorial approaches to test design fail) and #7 (hearing his views on where it works best and where it would be difficult to apply it). I’ll save my specific answers for another post, but I am serious about wanting to share my thoughts on them; time constraints are holding me back today. I did, though, give a speech at the ASQ World Conference on Quality Improvement in St. Louis last week that addressed many, but not all, of James’ questions.
I’m not your typical software tool vendor. Basically, my natural instincts are all wrong for sales. I agree with the premise that “a fool with a tool is still a fool”; when talking to target clients and/or potential partners, I’m inclined to point out deficiencies, limitations, and various things that could go wrong; I’m more of an introvert than an extrovert, etc. Not exactly the typical characteristics of a successful salesman… Having said that, I believe that we’ve built a very good tool that helps enable dramatic efficiency and thoroughness benefits in many testing situations, but our tool and the pairwise and combinatorial test design approaches that it enables both have their limitations. It is primarily by talking to software testers about their positive and negative experiences that our company is able to improve our tool, enhance our training, and provide honest, pragmatic guidance to users about where and how to use our tool (and where and how not to).
Tool vendors who defend their tools (and/or the approaches by which their tools help users solve problems) as magical, silver bullet solutions are being both foolish and dishonest. Tool vendors who choose not to engage in serious, honest and open discussions with users about the challenges that users have when applying their tools in different situations are being short-sighted. From my own experiences, I can say that talking about the 13 topics raised by James has been invaluable.
Luis Fernández, an associate professor at Universidad de Alcala, is conducting a survey of software testers to gather data relating to, e.g., “Why isn’t software testing conducted as efficiently and effectively as it should be?” and “What factors lead to software testing being ‘under-appreciated’ as a potential career path?”
His survey (as of March, 2010) is listed here: http://www.cc.uah.es/encuestas/index.php?sid=28392&lang=en
Personally, I agree that the following two issues (identified in his survey) are significant causes of inefficiency in software testing:
1) “People tend to execute testing in an uncontrolled manner until the total expenditure of resources in the belief that if we test a lot, in the end, we will cover or control all the system.”
(Or, at least, given the relatively undisciplined test case selection methods prevalent in the industry, my experience in analyzing manually selected test scenarios is that testers generally (a) believe they are covering a higher proportion of an application’s possible combinations than they actually are and (b) underestimate the amount of time that is spent during test execution unproductively repeating steps that they have previously tested.)
2) “Many managers did not receive appropriate training on software testing so they do not appreciate its interest or potential for efficiency and quality.”
It is unfortunate, but true, that many testing managers do not have any background whatsoever in combinatorial testing methods that (a) dramatically reduce the amount of time it takes to select and document test cases, and (b) will simultaneously improve test execution efficiency when applied correctly. See, for example, https://www.hexawise.com/Combinatorial-Softwar-Testing-Case-Studies-IEEE-Computer-Kuhn-Kacker-Lei-Hunter.pdf
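One quick way to check the first point for yourself is to measure how many 2-way value combinations a hand-picked set of tests actually covers. The sketch below does that; the parameters and the three manual tests are invented for illustration, so substitute your own.

```python
from itertools import combinations, product

# Hypothetical inputs and hand-picked tests, for illustration only.
PARAMETERS = {
    "browser": ["Chrome", "Safari"],
    "os": ["Windows", "macOS"],
    "payment": ["card", "invoice"],
}
MANUAL_TESTS = [
    {"browser": "Chrome", "os": "Windows", "payment": "card"},
    {"browser": "Chrome", "os": "Windows", "payment": "invoice"},
    {"browser": "Safari", "os": "macOS", "payment": "card"},
]


def pairwise_coverage(parameters, tests):
    """Return the fraction of all 2-way value combinations hit by the tests."""
    names = list(parameters)
    required = set()
    for a, b in combinations(names, 2):
        for va, vb in product(parameters[a], parameters[b]):
            required.add(((a, va), (b, vb)))
    covered = set()
    for test in tests:
        items = [(name, test[name]) for name in names]
        covered |= set(combinations(items, 2))
    return len(covered & required) / len(required)


if __name__ == "__main__":
    coverage = pairwise_coverage(PARAMETERS, MANUAL_TESTS)
    print(f"Pairwise coverage of the manual tests: {coverage:.0%}")
```

In this toy case the three tests cover 8 of the 12 possible pairs, so a third of the 2-way interactions are never exercised.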
Please consider taking Fernández’s short survey. It takes only 5-10 minutes to complete.
All the quotes below are from the inside cover of Statistics for Experimenters written by George Box, Stuart Hunter, and William G. Hunter (my late father). The Design of Experiments methods expressed in the book (namely, the science of finding out as much information as possible in as few experiments as possible), were the inspiration behind our software test case generating tool. In paging through the book again today, I found it striking (but not surprising) how many of these quotes are directly relevant to efficient and effective software testing (and efficient and effective test case design strategies in particular):
- “Discovering the unexpected is more important than confirming the known.”
- “All models are wrong; some models are useful.”
- “Don’t fall in love with a model.”
- How, with a minimum of effort, can you discover what does what to what? Which factors do what to which responses?
- “Anyone who has never made a mistake has never tried anything new.” – Albert Einstein
- “Seek computer programs that allow you to do the thinking.”
- “A computer should make both calculations and graphs. Both sorts of output should be studied; each will contribute to understanding.” – F. J. Anscombe
- “The best time to plan an experiment is after you’ve done it.” – R. A. Fisher
- “Sometimes the only thing you can do with a poorly designed experiment is to try to find out what it died of.” – R. A. Fisher
- The experimenter who believes that only one factor at a time should be varied, is amply provided for by using a factorial experiment.
- Only in exceptional circumstances do you need or should you attempt to answer all the questions with one experiment.
- “The business of life is to endeavor to find out what you don’t know from what you do; that’s what I called ‘guessing what was on the other side of the hill.'” – Duke of Wellington
- “To find out what happens when you change something, it is necessary to change it.”
- “An engineer who does not know experimental design is not an engineer.” – Comment made to one of the authors by an executive of the Toyota Motor Company
- “Among those factors to be considered there will usually be the vital few and the trivial many.” – J. M. Juran
- “The most exciting phrase to hear in science, the one that heralds discoveries, is not ‘Eureka!’ but ‘Now that’s funny…'” – Isaac Asimov
- “Not everything that can be counted counts and not everything that counts can be counted.” – Albert Einstein
- “You can see a lot by just looking.” – Yogi Berra
- “Few things are less common than common sense.”
- “Criteria must be reconsidered at every stage of an investigation.”
- “With sequential assembly, designs can be built up so that the complexity of the design matches that of the problem.”
- “A factorial design makes every observation do double (multiple) duty.” – Jack Youden
Where the quotes are not attributed, I’m assuming the quote is from one of the authors. The best known of the unattributed quotes above, “All models are wrong; some models are useful,” is widely (and accurately) attributed to George Box in particular. I suspect most of the others are from George as well (as opposed to from Stu or my dad), although I forgot to confirm that suspicion with him when I saw him over Christmas break. George is 90 now and still off-the-charts smart and funny, and he is probably the best storyteller I’ve met in my life. If he were younger and on Twitter, he’d be one of those guys who churned out highly retweetable chestnuts again and again.
As you know if you’ve read my blog before, I am a strong proponent of using the Design of Experiments principles laid out in this book and applying them in the field of software testing to improve the efficiency and effectiveness of software test case design (e.g., by using pairwise software testing, orthogonal array software testing, and/or combinatorial software testing techniques). In fact, I decided to create my company’s test case generating tool, called Hexawise, after using Design of Experiments-based test design methods during my time at Accenture in a couple dozen projects and measuring dramatic improvements in tester productivity (as well as dramatic reductions in the amount of time it took to identify and document test cases). We saw these improvements in every single pilot project when we used these methods to identify tests.
My goal, in continuing to improve our Hexawise test case generating tool, is to help make the efficiency-enhancing Design of Experiments methods embodied in the book accessible to “regular” software testers and more broadly adopted throughout the software testing field. Some days, it feels like a shame that the approaches from the Design of Experiments field (extremely well-known and broadly used in manufacturing industries across the globe, in research and development labs of all kinds, in product development projects in chemicals, pharmaceuticals, and a wide variety of other fields) have not made much of an inroad into software testing. The irony is, it is hard to think of a field in which it is easier or quicker to prove that dramatic benefits result from adopting Design of Experiments methods than software testing. All it takes is for a testing team to decide to do a simple proof of concept pilot. It could be for as little as a half-day’s testing activity for one tester. Create a set of pairwise tests with Hexawise or another tool like James Bach’s AllPairs tool. Have one tester execute the tests suggested by the test case generating tool. Have the other tester(s) test the same application in parallel. Measure four things:
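The two factorial-design quotes in the list above are easier to appreciate with a tiny worked example. The sketch below uses an invented response function (factors A and B matter, C does not, and A interacts with B) to compare a 2-level full factorial design against varying one factor at a time from a baseline. The function and its coefficients are assumptions chosen purely for illustration.

```python
from itertools import product
from statistics import mean

FACTORS = ["A", "B", "C"]


def response(a, b, c):
    """Hypothetical system: A and B matter, C does not, and A interacts with B."""
    return 10 + 2 * a + 3 * b + 0 * c + 1.5 * a * b


def full_factorial(n_factors):
    """All combinations of low (-1) and high (+1) settings for each factor."""
    return list(product([-1, 1], repeat=n_factors))


def main_effect(runs, results, index):
    """Average response at the high setting minus average at the low setting."""
    high = [y for run, y in zip(runs, results) if run[index] == 1]
    low = [y for run, y in zip(runs, results) if run[index] == -1]
    return mean(high) - mean(low)


runs = full_factorial(len(FACTORS))          # 8 runs
results = [response(*run) for run in runs]
effects = {name: main_effect(runs, results, i) for i, name in enumerate(FACTORS)}

# One-factor-at-a-time: change only A from the all-low baseline.
ofat_effect_of_a = response(1, -1, -1) - response(-1, -1, -1)

if __name__ == "__main__":
    print("Factorial main effects:", effects)
    print("One-factor-at-a-time estimate for A:", ofat_effect_of_a)
```

Each of the eight observations is used in the estimate for every factor, which is the “double (multiple) duty” in action. The one-factor-at-a-time estimate for A, by contrast, is only valid at the single setting of B where it was measured, so the interaction distorts it.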
- How long did it take to create the pairwise / DoE-based test cases?
- How many defects were found per hour by the tester(s) who executed the “business as usual” test cases?
- How many defects were found per hour by the tester who executed the pairwise / DoE-based tests?
- How many defects were identified overall by each plan’s tests?
These four simple measurements will typically demonstrate dramatic improvements in:
- Speed of test case identification and documentation
- Efficiency in defects found per hour
As well as consistent improvements to:
- Overall thoroughness of testing.
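If it helps to keep the pilot honest, the four measurements can be recorded and compared with a few lines of code. This is just one possible way to tabulate them; the field names are my own invention and the numbers are placeholders, not results from any real pilot, so replace them with what you actually measure.

```python
from dataclasses import dataclass


@dataclass
class PilotArm:
    """Measurements for one side of the proof of concept pilot."""
    name: str
    design_hours: float      # time spent selecting and documenting tests
    execution_hours: float   # time spent executing them
    defects_found: int

    @property
    def defects_per_hour(self) -> float:
        return self.defects_found / self.execution_hours


def compare(baseline: PilotArm, pairwise: PilotArm) -> dict:
    """Summarize the pilot: design time, defect-finding rate, total defects."""
    return {
        "design_hours_saved": baseline.design_hours - pairwise.design_hours,
        "defects_per_hour_ratio": pairwise.defects_per_hour / baseline.defects_per_hour,
        "extra_defects_found": pairwise.defects_found - baseline.defects_found,
    }


if __name__ == "__main__":
    # Placeholder values only; substitute your own measurements.
    manual = PilotArm("business as usual", design_hours=3.0, execution_hours=4.0, defects_found=4)
    doe = PilotArm("pairwise / DoE-based", design_hours=1.0, execution_hours=4.0, defects_found=6)
    for key, value in compare(manual, doe).items():
        print(f"{key}: {value}")
```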
A Suggestion: Experiment / Learn / Get the Data / Let the Efficiency and Effectiveness Findings Guide You
I would be thrilled if this blog post gave you the motivation to explore this testing approach and measure the results. Whether you’ve used similar-sounding techniques before or never heard of DoE-based software testing methods before, whether you’re a software testing newbie or a grizzled veteran, I suspect the experience of running a structured proof of concept pilot (and seeing the dramatic benefits I’m confident you’ll see) could be a watershed moment in your testing career. Try it! If you’re interested in conducting a pilot, I’d be happy to help get you started and if you’d be willing to share the results of your pilot publicly, I’d be able to provide ongoing advice and test plan review. Send me an email or leave a comment.
To the grizzled and skeptical veterans, (and yes, Mr. Shrini Kulkarni / @shrinik who tweeted “@Hexawise With all due respect. I can’t credit any technique the superpower of 2X defect finding capability. sumthng else must be goingon” before you actually conducted a proof of concept using Design of Experiments-based testing methods and analyzed your findings, I’m lookin’ at you), I would (re)quote Sophocles: “One must try by doing the thing; for though you think you know it, you have no certainty until you try.” For newer testers, eager to expand your testing knowledge (and perhaps gain an enormous amount of credibility by taking the initiative, while you’re at it), I’d (re)quote Cole Porter: “Experiment and you’ll see!”
I’d welcome your comments and questions. If you’re feeling, “Sounds too good to be true, but heck, I can secure a tester for half a day to run some of these DoE-based / pairwise tests and gather some data to see whether or not it leads to a step-change improvement in efficiency and effectiveness of our testing” and you’re wondering how you’d get started, I’d be happy to help you out and do so at no cost to you. All I’d ask is that you share your findings with the world (e.g., in your blog or let me use your data as the firms did with their findings in the “Combinatorial Software Testing” article below).
Related: (Article explaining the ideas behind Design of Experiments-based software testing techniques such as pairwise, OA, and n-wise testing) Combinatorial Software Testing by Kuhn, Kacker, Lei, and Hunter (pdf download)
Related: (Prior blog post) “In Praise of Data-Driven Management (AKA “Why You Should be Skeptical of HiPPO’s”)”
Related: (My brother’s blog: he’s in IT too and is also a strong proponent of using Design of Experiments-based software test design methods to improve software testing efficiency and effectiveness).
I responded to a recent blog post written by Gareth Bowles today and was struck – again – that a defect that must have been seen >10 million times by now has still not been corrected. When anyone responds to a blog post on Blogger.com, the stat counter says “1 comments” instead of correctly stating “1 comment.” What’s up with that?
The clothing company Lands’ End (with the apostrophe erroneously after the s instead of before it) has a bizarre but somewhat logical explanation for why they have printed their grammatical-mistake-laden brand name on millions of pieces of clothing. According to one version of the story I have heard, they printed their first brochures with the typo and couldn’t afford to get it changed. I also remember reading a more detailed explanation in a catalog in the late 80’s to the effect that by the time the company management realized their mistake and tried to get trademark protection on “Land’s End” they discovered that another firm already had trademarked rights to that name. Quick internet searches can’t verify that so perhaps my memory is just playing tricks on me. But I digress. Here’s the defect I wanted to highlight with this post:
For Blogger.com to leave the extra “s” in has me stumped for several reasons. First, this defect has been seen by a ton of people as Blogger.com is, according to Alexa’s site tracking, the world’s 7th most popular site. Second, Blogger.com is owned by Google (among the most competent, quality-oriented IT wizards on the planet) and no trademark protection is preventing the correction. Third, it would seem to be such an easy thing to fix. Fourth, other sites (like WordPress) don’t make the same mistake. Fifth, it doesn’t seem like a “style preference” issue (like spelling traveled with one “L” or two); it seems to me like a pretty clear case of a mistake. It would be a mistake to say “one cars,” “one computers,” or “one pedantic grammarians”; similarly, it is a mistake to say “one comments”. What gives? Anyone have any ideas?
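To show how small the fix could be, here is a sketch of the kind of check involved. I have no knowledge of Blogger.com’s actual code; the function name is hypothetical and this handles English only.

```python
def comment_label(count: int) -> str:
    """Return a counter label with the noun matched to the count (English only)."""
    noun = "comment" if count == 1 else "comments"
    return f"{count} {noun}"


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(comment_label(n))
```

A site that supports many languages would need proper pluralization rules per language rather than a single if/else, which may be part of why the shortcut survives.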
For anyone wondering where the “>10 million times” figure came from, it is pure conjecture on my part. If anyone has a reasoned way to refute or confirm it or propose a better estimate, I’m all ears.