25 Great Quotes for Software Testers

All the quotes below are from the inside cover of Statistics for Experimenters written by George Box, Stuart Hunter, and William G. Hunter (my late father).  The Design of Experiments methods expressed in the book (namely, the science of finding out as much information as possible in as few experiments as possible), were the inspiration behind our software test case generating tool.  In paging through the book again today, I found it striking (but not surprising) how many of these quotes are directly relevant to efficient and effective software testing (and efficient and effective test case design strategies in particular):

  • “Discovering the unexpected is more important than confirming the known.”
  • “All models are wrong; some models are useful.”
  • “Don’t fall in love with a model.”
  • How, with a minimum of effort, can you discover what does what to what?  Which factors do what to which responses?
  • “Anyone who has never made a mistake has never tried anything new.” – Albert Einstein
  • “Seek computer programs that allow you to do the thinking.”
  • “A computer should make both calculations and graphs.  Both sorts of output should be studied; each will contribute to understanding.”  – F. J. Anscombe
  • “The best time to plan an experiment is after you’ve done it.” – R. A. Fisher
  • “Sometimes the only thing you can do with a poorly designed experiment is to try to find out what it died of.”  – R. A. Fisher
  • The experimenter who believes that only one factor at a time should be varied, is amply provided for by using a factorial experiment.
  • Only in exceptional circumstances do you need or should you attempt to answer all the questions with one experiment.
  • “The business of life is to endeavor to find out what you don’t know from what you do; that’s what I called ‘guessing what was on the other side of the hill.'”  – Duke of Wellington
  • “To find out what happens when you change something, it is necessary to change it.”
  • “An engineer who does not know experimental design is not an engineer.”  – Comment made by to one of the authors by an executive of the Toyota Motor Company
  • “Among those factors to be considered there will usually be the vital few and the trivial many.”  – J. M. Juran
  • “The most exciting phrase to hear in science, the one that heralds discoveries, is not ‘Eureka!’ but ‘Now that’s funny…'” – Isaac Asimov
  • “Not everything that can be counted counts and not everything that counts can be counted.” – Albert Einstein
  • “You can see a lot by just looking.”  – Yogi Berra
  • “Few things are less common than common sense.”
  • “Criteria must be reconsidered at every stage of an investigation.”
  • “With sequential assembly, designs can be built up so that the complexity of the design matches that of the problem.”
  • “A factorial design makes every observation do double (multiple) duty.”  –  Jack Couden

Where the quotes are not attributed, I’m assuming the quote is from one of the authors.  The most well known of the quotes not attributed, above, “All models are wrong; some models are useful.” is widely attributed to George Box in particular, which is accurate.  Although I forgot to confirm that suspicion with him when I saw him over Christmas break, I suspect most of them are from George (as opposed to from Stu or my dad); George is 90 now and still off-the-charts smart, funny, and is probably the best story teller I’ve met in my life.  If he were younger and on Twitter, he’d be one of those guys who churned out highly retweetable chestnuts again and again.

Related thoughts

As you know if you’ve read my blog before, I am a strong proponent of using the Design of Experiments principles laid out in this book and applying them in field of software testing to improve the efficiency and effectiveness of software test case design (e.g., by using pairwise software testing, orthogonal array software testing, and/or combinatorial software testing techniques).  In fact, I decided to create my company’s test case generating tool, called Hexawise, after using Design of Experiments-based test design methods during my time at Accenture in a couple dozen projects and measuring dramatic improvements in tester productivity (as well as dramatic reductions in the amount of time it took to identify and document test cases).  We saw these improvements in every single pilot project when we  used these methods to identify tests.

My goal, in continuing to improve our Hexawise test case generating tool, is to help make the efficiency-enhancing Design of Experiments methods embodied in the book, accessible to “regular” software testers, and more more broadly adopted throughout the software testing field.  Some days, it feels like a shame that the approaches from the Design of Experiments field (extremely well-known and broadly used in manufacturing industries across the globe, in research and development labs of all kinds, in product development projects in chemicals, pharmaceuticals, and a wide variety of other fields), have not made much of an inroad into software testing.  The irony is, it is hard to think of a field in which it is easier, quicker, or immediately obvious to prove that dramatic benefits result from adopting Design of Experiments methods than software testing.  All it takes is for a testing team to decide to do a simple proof of concept pilot.  It could be for as little as a half-day’s testing activity for one tester.  Create a set of pairwise tests with Hexawise or another t00l like James Bach’s AllPairs tool.  Have one tester execute the tests suggested by the test case generating tool. Have the other tester(s) test the same application in parallel.  Measure four things:

  1. How long did it take to create the pairwise / DoE-based test cases?
  2. How many defects were found per hour by the tester(s) who executed the “business as usual” test cases?
  3. How many defects were found per hour by the tester who executed the pairwise / DoE-based tests?
  4. How many defects were identified overall by each plan’s tests?

These four simple measurements will typically demonstrate dramatic improvements in:

  • Speed of test case identification and documentation
  • Efficiency in defects found per hour

As well as consistent improvements to:

  • Overall thoroughness of testing.

A Suggestion: Experiment / Learn / Get the Data / Let the Efficiency and Effectiveness Findings Guide You

I would be thrilled if this blog post gave you the motivation to explore this testing approach and measure the results.  Whether you’ve used similar-sounding techniques before or never heard of DoE-based software testing methods before,  whether you’re a software testing newbie or a grizzled veteran, I suspect the experience of running a structured proof of concept pilot (and seeing the dramatic benefits I’m confident you’ll see) could be a watershed moment in your testing career.  Try it!  If you’re interested in conducting a pilot, I’d be happy to help get you started and if you’d be willing to share the results of your pilot publicly, I’d be able to provide ongoing advice and test plan review.  Send me an email or leave a comment.

To the grizzled and skeptical veterans, (and yes, Mr, Shrini Kulkarni / @shrinik who tweeted “@Hexawise With all due respect. I can’t credit any technique the superpower of 2X defect finding capability. sumthng else must be goingon” before you actually conducted a proof of concept using Design of Experiments-based testing methods and analyzed your findings, I’m lookin’ at you),  I would (re)quote Sophocles: “One must try by doing the thing; for though you think you know it, you have no certainty until you try.” For newer testers, eager to expand your testing knowledge (and perhaps gain an enormous amount of credibility by taking the initiative, while you’re at it), I’d (re)quote Cole Porter: “Experiment and you’ll see!

I’d welcome your comments and questions.  If you’re feeling, “Sounds too good to be true, but heck, I can secure a tester for half a day to run some of these DoE-based / pairwise tests and gather some data to see whether or not it leads to a step-change improvement in efficiency and effectiveness of our testing” and you’re wondering how you’d get started, I’d be happy to help you out and do so at no cost to you.  All I’d ask is that you share your findings with the world (e.g., in your blog or let me use your data as the firms did with their findings in the “Combinatorial Software Testing” article below).

– Justin

Related: (Introductory Hexawise video overview showing 6.5 trillion possible tests reduced, using Design of Experiments techniques to the 37 tests most likely to find defects)

Related: (Article explaining behind Design of Experiments-based software testing techniques such as pairwise, OA, and n-wise testing: Combinatorial Software Testing by Kuhn, Kacker, Lei, and Hunter (pdf download)

Related: (Prior blog post) “In Praise of Data-Driven Management (AKA “Why You Should be Skeptical of HiPPO’s”)”

Related: (My brother’s blog: he’s in IT too and is also a strong proponent of using Design of Experiments-based software test design methods to improve software testing efficiency and effectiveness).

Advertisement

What Else Can Software Development and Testing Learn from Manufacturing? Don’t Forget Design of Experiments (DoE).

Lessons_from_Car_Manufacturing-20090826-171852

Tony Baer from Ovum recently wrote a blog post titled: Software Development is like Manufacturing which included the following quotes:

“More recently, debate has emerged over yet another refinement of agile – Lean Development, which borrows many of the total quality improvement and continuous waste reduction principles of lean manufacturing. Lean is dedicated to elimination of waste, but not at all costs (like Six Sigma). Instead, it is about continuous improvement in quality, which will lead to waste reduction….

In essence, developing software is like making a durable good like a car, appliance, military transport, machine tool, or consumer electronics product…. you are building complex products that are expected to have a long service life, and which may require updates or repairs.”

Here are my views: I see valid points on both sides of the debate.  Rather than weigh general high-level pro’s and cons, though, I would like to zero in on what I see as an important topic that is all-too-often missing from the debate.  Specifically,  Design of Experiments has been central to Six Sigma, Lean Manufacturing, the Toyota Production System, and Deming’s quality improvement approaches, and is equally applicable to software development and testing, yet adoption of Design of Experiments methods in software design and testing remains low.  This is unfortunate because significant benefits consistently result in both software development and software testing when Design of Experiments methods are properly implemented.

What are Design of Experiments Methods and Why are they Relevant?

In short, Design of Experiments methods are a proven approach to creating and managing experiments that alter variables intelligently between each test run in a structured way that allows the experimenter to learn as much as possible in as few experiments as possible.  From wikipedia: “Design of experiments, or experimental design, (DoE) is the design of all information-gathering exercises where variation is present, whether under the full control of the experimenter or not. Often the experimenter is interested in the effect of some process or intervention (the “treatment”) on some objects (the “experimental units”).”

Design of Experiments methods are an important aspect of Lean Manufacturing, Six Sigma, the Toyota Production System, and other manufacturing-related quality improvement approaches/philosophies.  Not only have Design of Experiments methods been very important to all of the above in manufacturing settings, they are also directly relevant to software development. By way of example, W. Edwards Deming, who was extremely influential in quality initiatives in manufacturing in Japan and the U.S. was an applied statistician. He and thousands of other highly respected quality executives in manufacturing, including Box, Juran and Taguchi (and even my dad), have regularly used Design of Experiments methods as a fundamental anchor of quality improvement and QA initiatives and yet relatively few people who write about software development seem to be aware of the existence of Design of Experiments methods.

What Benefits are Delivered in Software Development by Design of Experiments-based Tools?

Application Optimization applications, like Google’s Website Optimizer are a good example of Design of Experiments methods can deliver powerful benefits in the software development process.  It allows users to easily vary multiple aspects of web pages (images, descriptions, colors, fonts, colors, logos, etc.) and capture the results of user actions to identify which combinations work the best. A recent YouTube multi-variate experiment (e.g., and experiment created using Design of Experiment methods) shows how they used the simple tool and increased sign-up rates by 15.7%.  The experiment involved 1,024 variations.

What Benefits are Delivered in Software Testing by Design of Experiments-based Tools?

In addition, software test design tools, like the Hexawise test design tool my company created, enable dramatically more efficient software testing by automatically varying different elements of use cases that are tested in order to achieve an optimal coverage. Users input the things in the application they want to test, push a button and, as in the Google Web Optimizer example, the tool uses DoE algorithms to identify how the tests should be run to maximize efficiency and thoroughness.  A recent IEEE Computer article I contributed to, titled “Combinatorial Testing” shows, on average, over the course of 10 separate real-world projects, tester productivity (measured in defects found per tester hour) more than doubled, as compared to the control groups which continued to use their standard manual methods of test case selection: http://tinyurl.com/nhzgaf

Unfortunately, Design of Experiments methods – one of the most powerful methods in Lean Manufacturing, Six Sigma, and the Toyota Production System – are not yet widely adopted in the software development industry. This is unfortunate for two reasons, namely:

  1. Design of Experiments methods will consistently deliver measurable benefits when implemented correctly, and
  2. Sophisticated new tools designed with very straightforward user interfaces make it easier than ever for software developers and testers to begin using these helpful methods.

– Justin

In Praise of Data-Driven Management (AKA “Why You Should be Skeptical of HiPPO’s”)

Hipp

Jeff Fry recently linked to a fantastic webcast in Controlled Experiments To Test For Bugs In Our Mental Models. I would highly recommend it to anyone without any reservations.  Ron Kohavi, of Microsoft Research does a superb job of using interesting real-world examples to explain the benefits of conducting small experiments with web site content and the advantages of making data-driven decisions.  The link to the 22-minute video is here.

I firmly believe that the power of applied statistics-based experiments to improve products is dramatically under-appreciated by businesses (and, for that matter, business schools), as well as the software development and software testing communities.  Google, Toyota, and Amazon.com come to mind as notable exceptions to this generalization; they “get it”.  Most firms though still operate, to their detriment, with their heads in the sand and place too much reliance on untested guesswork, even for fundamentally important decisions that would be relatively easy to double-check, refine, and optimize through small applied statistics-based experiments that Kohavi advocates.  Few people who understand how to properly conduct such experiments are as articulate and concise as Kohavi.  Admittedly, I could be accused of being biased as: (a) I am the son of a prominent applied statistician who passionately promoted broader adoption of such methods by industry and (b) I am the founder of a software testing tools company that uses applied statistics-based methods and algorithms to make our tool work.

Here is a short summary of Kohavi’s presentation: Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO

1:00 Amazon: in 2000, Greg Linden wanted to add recommendations in shopping carts during the check out process. The “HiPPO” (meaning the Highest Paid Person’s Opinion) was against it thinking that such recommendations would confuse and/or distract people. Amazon, a company with a good culture of experimentation, decided to run a small experiment anyway, “just to get the data” – It was wildly successful and is in widespread use today at Amazon and other firms.

3:00 Dr. Footcare example: Including a coupon code above the total price to be paid had a dramatic impact on abandonment rates.

4:00 “Was this answer useful?” Dramatic differences in user response rates occur when Y/N is replaced with 5 Stars and whether an empty text box is initially shown with either (or whether it is triggered only after a user clicks to give their initial response)

6:00 Sewing machines: experimenting with a sales promotion strategy led to extremely counter-intuitive pricing choice

7:00 “We are really, really bad at understanding what is going to work with customers…”

7:30DATA TRUMPS INTUITION” {especially on novel ideas}. Get valuable data through quick, cheap experimentation. “The less the data, the stronger the opinions.”

8:00 Overall Evaluation Criteria: “OEC” What will you measure? What are you trying to optimize? (Optimizing for the “customer lifetime value”)

9:00 Analyzing data / looking under the hood is often useful to get meaningful answers as to what really happened and why

10:30 A/B tests are good; more sophisticated multi-variate testing methods are often better

12:00 Some problems: Agreeing upon Overall Evaluation Criteria is hard culturally. People will rarely agree. If there are 10 changes per page, you will need to break things down into smaller experiments.

14:00 Many people are afraid of multiple experiments [e.g., multi-variate experiments or MVE] much more than they should be.

(A/B testing can be as simple as changing a single variable and comparing what happens when it is changed, e.g., A = “web page background = Blue” / B = “web page background = Orange.” Multi-variate experiments involve changing multiple variables in each test run which means that people running the tests should be able to efficiently and effectively change the variables in order to ensure not only that each of the variables is tested but also that the each of the variables is tested in conjunction with each of the others because they might interact with one another). <My views on this: before software tools made conducting multi-variate experiments (and understanding the results of the experiments) a piece of cake, this fear had some merit; you would need to be able to understand books like this to be able to competently run and analyze such experiments. Today however, many tools, such as Google’s Website Optimizer (used for making web sites better at achieving their click through goals, etc.) and Hexawise (used to find defects with fewer test cases) build the complex Design of Experiments-based optimization algorithms into the tool’s computation engine and provide the user of the tool with a simple user interface and user experience. In short, in 2009, you don’t need a PhD in applied statistics to conduct powerful multi-variate experiments. Everyone can quickly learn how to, and almost all companies should, use these methods to improve the effectiveness of applications, products and/or production methods. Similarly, everyone can quickly learn how to, and almost all companies should, use these methods to dramatically improve the effectiveness of their software testing processes.>

16:00 People do a very bad job at understanding natural variation and are often too quick to jump to conclusions.

17:00 eBay does A/B testing and makes the control group ~1%. Ron Kohavi, the presenter, suggests starting small then quickly ramping up to 50/50 (e.g., 50% of viewers will see version A, 50% will see version B).

19:00 Beware of launching experiments than “do not hurt,” there are feature maintenance costs.

20: 00 Drive to a data-driven culture. “It makes a huge difference. People who have worked in a data-driven culture really, really love it… At Amazon… we built an optimization system that replaced all the debates that used to happen on Fridays about what gets on the home page with something that is automated.”

21:00 Microsoft will be releasing its controlled experiments on the web platform at some point in the future, but probably not in the next year.

21:00 Summary

  1. Listen to your customers because our intuition at assessing new ideas is poor.
  2. Don’t let the HiPPO drive decisions; they are likely to be wrong.  Instead, let the customer data drive decisions.
  3. Experiment often create a trustworthy system to accelerate innovation.

Justin Hunter
Founder and CEO
Hexawise
“More coverage. Fewer tests.”

Related: Statistics for Experimentersarticles on design of experiments