I responded to a recent blog post written by Gareth Bowles today and was struck – again – that a defect that must have been seen >10 million times by now has still not been corrected. When anyone responds to a blog post on Blogger.com, the stat counter says “1 comments” instead of correctly stating “1 comment.” What’s up with that?

The clothing company Lands’ End (with the apostrophe erroneously after the s instead of before it) has a bizarre but somewhat logical explanation for why they have printed their grammatical-mistake-laden brand name on millions of pieces of clothing. According to one version of the story I have heard, they printed their first brochures with the typo and couldn’t afford to get it changed. I also remember reading a more detailed explanation in a catalog in the late 80’s to the effect that by the time the company management realized their mistake and tried to get trademark protection on “Land’s End” they discovered that another firm already had trademarked rights to that name. Quick internet searches can’t verify that so perhaps my memory is just playing tricks on me. But I digress. Here’s the defect I wanted to highlight with this post:

For Blogger.com to leave the extra “s” in has me stumped for several reasons. First, this defect has been seen by a ton of people as Blogger.com is, according to Alexa’s site tracking, the world’s 7th most popular site. Second, Blogger.com is owned by Google (among the most competent, quality-oriented IT wizards on the planet) and no trademark protection is preventing the correction. Third, it would seem to be such an easy thing to fix. Fourth, other sites (like WordPress) don’t make the same mistake. Fifth, it doesn’t seem like a “style preference” issue (like spelling traveled with one “L” or two); it seems to me like a pretty clear case of a mistake. It would be a mistake to say “one cars,” “one computers,” or “one pedantic grammarians”; similarly, it is a mistake to say “one comments”. What gives? Anyone have any ideas?
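
To illustrate why this looks like an easy fix, here is a minimal sketch of a counter label that picks the singular or plural noun. It is purely hypothetical: the function name `comment_label` is mine, I have no idea how Blogger.com actually builds its counter, and the sketch only handles English.

```python
def comment_label(count: int) -> str:
    """Return a counter label such as '1 comment' or '2 comments'."""
    # English uses the singular form only when the count is exactly one.
    noun = "comment" if count == 1 else "comments"
    return f"{count} {noun}"


for n in (0, 1, 2):
    print(comment_label(n))
```

A real site would also need to handle other languages, whose plural rules differ, which may be part of why the fix is less trivial than it looks.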

For anyone wondering where the “>10 million times” figure came from, it is pure conjecture on my part. If anyone has a reasoned way to refute or confirm it or propose a better estimate, I’m all ears.

I’ve recently tried out Tails as a bug tracking tool. I like it and I’d recommend you check it out if you’re looking for a straightforward bug-tracking tool without a lot of extra bells and whistles. This is a quick review of what I have found to be the best defect tracking tool for my purposes.

When someone recommends something to you (whether a movie, a car, or a software application), it is useful to have an understanding of where they’re coming from; when ordering from Netflix will they be drawn to the gritty genius of “the Usual Suspects” or an animated Disney classic like Fantasia? Is their idea of the perfect car a 36 HP 1959 Karmann Ghia convertible or a 2009 Humvee?

With that said, here’s where I’m coming from with respect to software applications. I’ve always appreciated nice, simple, cleanly-designed software applications that work as you’d like them to without requiring you to invest time searching help files or in training. My appreciation for clean, straightforward applications has increased in the last year as I’ve had more hands-on Product Management responsibilities at Hexawise and I’ve seen first hand how hard it can sometimes be to strike the right balance between (1) the goals of elegance and simplicity on the one hand and (2) a Product Manager’s natural desire to equip the application with additional features and functionality on the other hand.

The screen capture tool Skitch has done a superb job of achieving this balancing act, as described well in Sean Johnson’s article, in which he writes: “These days it takes more than being an adequate solution to a real problem that people have and are willing to pay to solve. That’s certainly required, but it’s just not enough. You have to create happiness and joy in your users and they must love your product.” Unfortunately, Skitch is only available to Mac users for now. Similarly, Seth Godin and the gang at 37 Signals have done an excellent job at putting together simple, clean, powerful applications like Basecamp and Highrise. I strongly recommend their blog, Signal vs. Noise, about “design, business, experience, simplicity, culture and more” and their book “Getting Real“. I’ve been heavily influenced by the designers of those tools when making Product Management and Design decisions about our test design tool Hexawise.

Enough preamble. My point is, if you appreciate the similar design philosophies behind Skitch, Highrise, Basecamp, and Hexawise, which place a premium on nice, clean, intuitive design (and explicitly try to avoid “feature bloat”), I suspect you’ll like Tails as a bug-tracking tool and enjoy using it. By design, Tails doesn’t have a lot of bells and whistles. It does such a good job at the features the vast majority of users need that it is a joy to use. I’ve attached a couple of screenshots below (with a few redactions to protect client confidentiality, etc.).

- Justin

Dave Whalen posted a good piece here asserting that software testing, done well, requires a blend of “Science” and “Art”. I recommend it. (He also has a good post about testing databases here).

He includes the statement below, which I agree with. If you are a software tester and have any doubts about whether all of these methods work (pairwise software testing in particular), I would encourage you to conduct a pilot project on your own and measure the results achieved with and without the technique applied.

From the scientific side, testing can include a number of proven techniques such as equivalency class testing, boundary value analysis, pair-wise testing, etc. These techniques, if used properly, can reduce test times and focus on finding the bugs where they tend to hang out – much like a porch light on a summer night.

My response to Dave’s post, included below, is not especially profound or even well-written, but, hey, I’m in a hurry in the pre-Thanksgiving rush and the topic hit close to home so I couldn’t resist jotting a little something. Enjoy. Please let me know your thoughts / reactions if you have any.

Dave,

Very well said!

I wholeheartedly, enthusiastically agree with your premise. I also wish that more people saw things the same way.

My father co-wrote Statistics for Experimenters which describes the “art and science” within the Design of Experiments (“DoE”) field of applied statistics. Well-run manufacturing companies use DoE techniques in their manufacturing processes. Many companies, such as Toyota see them as an absolutely fundamental part of their processes (yet unfortunately, software testers, who could use DoE techniques such as pairwise and other forms of combinatorial testing, are often ignorant about how to use them properly and the software testing industry as a whole dramatically under-utilizes such techniques…. but I digress).

I brought the book up because it opens up with a good example relevant to the points you made. To win at the game of 20 questions, it is useful to know “the science” of game theory and DoE; choose questions so that there is a 50/50 chance that the answer will be Yes. Someone who knows this technique, all else being equal, will win more because of their “scientific” approach than someone who doesn’t know the technique. And yet… other stuff (whether the subject matter expertise in this example, or subject matter expertise and “artistic” Exploratory Testing in your example) is indispensable as well.

You can’t truly excel at either 20 Questions or software testing unless you have a good mix of “science” (governed by mathematical principles, proven methods of DoE, etc.) and “art” (governed by experience, instincts, and subject matter expertise).

- Justin
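
The 20 Questions point in the comment above can be checked with a few lines of arithmetic. This is a small sketch of my own (the function name `questions_needed` is made up for illustration): if every question splits the remaining candidates roughly 50/50, each answer halves the field, so 20 questions are enough to single out one item from about a million.

```python
def questions_needed(candidates: int) -> int:
    """Count the yes/no questions needed when each question halves the field."""
    questions = 0
    while candidates > 1:
        # Worst case: the larger half of the candidates remains after the answer.
        candidates = (candidates + 1) // 2
        questions += 1
    return questions


print(2 ** 20)                      # how many items 20 perfect 50/50 questions can separate
print(questions_needed(1_000_000))  # questions needed for a million candidates
```

A question with a 90/10 split usually eliminates only a tenth of the candidates, which is why the 50/50 rule wins more often, all else being equal.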

I highly recommend this presentation by Cem Kaner (available here as a pdf download of slides). It is provocative, funny, and insightful. In it, Cem Kaner makes a strong case for using checklists (and mercilessly derides many aspects of using completely scripted tests). Cem Kaner, as I suspect most people reading this already know, is one of the leading lights of software testing education. He is a professor of computer sciences at Florida Institute of Technology and has contributed enormously to software testing education by writing Testing Computer Software (“the best selling software testing book of all time”), founding the Center for Software Testing Education & Research, and making an excellent free course available online for Black Box Software Testing. <Trivia: Cem Kaner is one of two people I know about who work in software testing today and have a law degree; the other person is me. After graduating from the University of Virginia Law School, I worked as a lawyer in London and Hong Kong for a large global firm before coming to my senses and realizing my interests, happiness and competence lay elsewhere>.

Here are a couple of my favorite slides from the presentation.

My own belief is that the presentation is very good and makes the points it wants to quite well. If I have a minor quibble with it, it is that in doing such a good job at laying out the case for checklists and against scripted testing, the presentation – almost by definition/design – does not go into as much detail as I would personally like to see about a topic that I think is extremely important and not written about enough; namely, how practitioners should use an approach that blends the advantages of scripted tests (that can generate some of the huge efficiency benefits of combinatorial testing methods, for example) and checklist-based Exploratory Testing (which has the advantages pointed out so well in the presentation). A “both / and” option is not only possible; it is desirable.

- – -

Credit for bringing this presentation to my attention: Michael Bolton (the testing expert, of course, not the singer; see the “Office Space” video snippet) posted a link to this presentation. Thanks again, Michael. Your enthusiastic recommendation to pick up boxed sets of the BBC show Connections was excellent as well; the presenter of Connections is like a slightly tipsy genius with ADHD who possesses an incredible grasp of history, an encyclopedic knowledge of quirky scientific developments and a gift for story-telling. I like how your mind works.

On October 6th, I informally launched testing.stackexchange.com as “the stackoverflow.com for Software Testing” without much hoopla. So far, less than a month later, with no advertising other than word of mouth, the initial results are very promising. We’ve had approximately:

* 70 new users join as members and contributors
* 50 software testing questions
* 160 answers to those questions
* 2,200 views of the questions and answers

The most important development is not reflected in the numbers above. More important, by far, than the number of participants who have joined is the quality of the people who are contributing. Members of the forum include some prominent experts, including Jason Huggins (creator of Selenium and cofounder of Sauce Labs), Alan Page and Bj Rollison (of “How we Test Software at Microsoft” fame), Michael Bolton (the testing expert, not the singer), Fred Beringer, Elisabeth Hendrickson, Joe Strazzere, Adam Goucher, Simon Morley, Rob Lambert, Scott Sehlhorst, and others. Given the high-quality people the site has attracted, the quality of the answers delivered has been quite high. Perhaps the quality is also above average because people answering know that their answers will be analyzed by thoughtful testers and voted up (or down) based on how good they are. In short, testers are asking good questions and getting them answered, which is why I created the site in the first place. I’m cautiously optimistic about the future: if the site

Members so far include:

The most viewed questions so far include:

The most recent questions being asked and answered are:

I’d like to extend special thanks to Alan Page (who likes the idea so much that he has volunteered to join me as a co-manager/Moderator of the site), to Shmuel Gershon, Jason from NC, and Joe Strazzere for being particularly active, and to Alan Page, Corey Goldberg, Shmuel Gershon, and Konstantin for helping to get the word out about the forum through their blog posts telling the world about testing.stackexchange.com. Without their combined help, we’d be nowhere. With their help and support, we’re building a place where software testers can seek and receive high-quality, peer-reviewed answers to their testing questions.

Please help us succeed by spreading the word, asking a few questions, answering a few, and voting on the best answers.

Thanks everyone!

- Justin

There are some phrases in English that, as often as not, come off sounding obligatory and/or insincere. The phrase “I’m honored…” comes to mind (particularly if someone is accepting an award in front of a room full of people).

Be that as it may, I genuinely felt honored last night and again today by a couple of comments James Bach has made about me, including these:

Here’s the quick background: (1) James knows much more about software testing than I do and I respect his views a lot. (2) He has a reputation for not suffering fools gladly and pretty bluntly telling people he doesn’t respect them if he doesn’t respect the content of their views. (3) In addition to his extremely broad expertise on “testing in general,” James, like Michael Bolton, knows a lot about pairwise and combinatorial testing methods and how to use them. (4) I firmly (and passionately) believe that pairwise and combinatorial testing methods are (a) dramatically under-appreciated, and (b) dramatically under-utilized. (5) James has published a very good and well-reasoned article about some of the limitations of pairwise testing methods that I wanted to talk to him about. (6) I co-wrote an article that IEEE Computer recently published about Combinatorial Testing that I wanted to discuss with him. (7) James and I have been at the STP Conference in Boston over the past few days. (8) I reached out to him and asked to meet at the conference to talk about pairwise and combinatorial testing methods and share with him my findings that – in the dozens of projects I’ve been involved with that have compared testers’ efficiency and effectiveness – I’ve routinely seen defects found per tester hour more than double. (9) I was interested in getting his insights into questions such as: Where are these methods most applicable? Least applicable? What have his experiences been in teaching combinatorial testing methods to students?

In short, frankly, my goals in meeting with him were to: (a) meet someone new, interesting and knowledgeable and learn as much as I could from his experiences, his impressive critical thinking and his questioning nature, and (b) avoid tripping up with sloppy reasoning (when unapologetically expressing the reasons I feel combinatorial testing methods are dramatically under-appreciated by the software testing community) in front of someone who (i) can smell BS a mile away, and (ii) doesn’t suffer fools gladly.

I learned a lot, heard some fantastic war stories and heard his excellent counter-examples that disproved a couple of the generalizations I was making (but didn’t dampen my unshaken assertions that combinatorial testing methods are wildly under-utilized by the software testing community). I thoroughly enjoyed the experience. Moving forward, as a result of our meeting, I will go through an exercise which will make me more effective: namely, carefully thinking through and enumerating all of the assumptions behind statements of mine like “I’ve measured the effectiveness of testers dozens of times – trying to control external variables as much as reasonably possible – and I’m consistently seeing more than twice as many defects per tester hour when testers adopt pairwise/combinatorial testing methods.”

His compliment last night was private so I won’t share it, but it ranks up there among my all-time favorite compliments. I’m honored. Thanks James.

I have just created the first video overview of the Hexawise test case generator. Please take a look and let me know your thoughts (either with an email or a comment below).

Introduction to Hexawise Pairwise Testing Tool / Combinatorial Testing Tool

I’ll refine and hopefully improve it over time, but wanted to share it at this point for feedback.  Is the pace of the video too slow?  Does it have too much detail about pairwise coverage?  Does the fact that I’ve got a dull Midwestern, nasal monotone mean I should have someone with a more animated and melodious “voice made for radio” do the voice-over?

Thanks in advance for your feedback!

- Justin

Matt Heusser

Matthew Heusser, an accomplished tester, frequent blogger, and insightful contributor in the Context Driven Testing mailing list, and a testing expert whose opinion I respect a lot, has just published a very thought-provoking blog post that highlights an important issue surrounding “PowerPointy” consultants in the testing industry who have relatively weak real world testing chops.  It’s called “The Fishing Maturity Model.”

Matthew argues that testers are well-advised to be skeptical of self-described testing experts who claim to “have the answer”  – particularly when such “experts” haven’t actually rolled their sleeves up and done software testing themselves.  In reading his article, I found it quite thought-provoking, particularly because it hit close to home: while I’m by no means a testing expert in the broader sense of the term, I do consider myself to know enough about combinatorial test design strategies applicable to software testing to be able to help most testing teams become demonstrably more efficient and effective… and yet, my actual hands-on testing experiences are admittedly quite limited.  If I’m not one of the guys he’s (justifiably) skewering with his funny and well-reasoned post (and he assures me I’m not; see below), a tester could certainly be forgiven for mistaking me for one based on my past experiences.

Matthew’s Five Levels of the Fishing Maturity Model (based, not so loosely, of course on the Testing Maturity Model, not to mention CMM, and CMMi)…

The five levels of the fishing maturity model:
1 – Ad-hoc. Fishing is an improvised process.
2 – Planned. The location and timing of our ships is planned. With a knowledge of how we did for the past two weeks, knowing we will go to the same places, we can predict our shrimp intake.
3 – Managed. If we can take the shrimp fishing process and create standard processes – how fast to drive the boat, and how deep to let out the nets, how quickly, etc, we can improve our estimates over time, more importantly.
4 – Measured. We track our results over time – to know exactly how many pounds of shrimp are delivered at what time with what processes.
5 – Optimizing. At level 5, we experiment with different techniques; to see what gathers more shrimp and what does not. This leads us to continual improvement.

Sounds good, right? Why, with a little work, this would make a decent 1-hour conference presentation. We could write a little book, create a certification, start running conferences …

And the rub…

The problem:
I’ve never fished with nets in my entire life. In fact, the last time I fished with a pole, I was ten years old at Webelo’s camp.

I posted the following response, based on my personal experiences:  Words in [brackets] are  Matthew’s response to me.

Matthew,

Excellent post, as usual.  [I'm glad you like it. Thank you.]

You raise very good points. Testers (and other IT executives) should be leery of snake oil salesmen and use their judgment about “experts” who lack practical hands-on experience. While I completely agree with this point, I offer up my own experiences as a “counter-example” to the problem you pointed out here.

3-4 years ago, while I was working at a management consulting and IT company (with a personal background as an entrepreneur, lawyer, and management consultant – and not in software testing), I began to recommend to any software testers who would listen that they start using a different approach to how they designed their test cases. Specifically, I was recommending that testers should begin using applied statistics-based methods* designed to maximize coverage in a small number of tests rather than continuing to manually select test cases and rely on SMEs to identify combinations of values that should be tested together. You could say I was recommending that they adopt what I consider to be (in many contexts) a “more mature” test design process.

The reaction I got from many teams was, as you say “this whole thing smells fishy to me” (or some more polite version of the rebuttal “Why in the world should I, with my years of experience in software testing, listen to you – a non-software tester?”) Here’s the thing: when teams did use the applied statistics-based testing methods I recommended, they consistently saw large time reductions in how long it took them to identify and document tests (often 30-40%) and they often saw huge leaps in productivity (e.g., often finding more than twice as many defects per tester hour). In each proof of concept pilot, we measured these carefully by having two separate teams – one using “business as usual” methods, the other using pairwise or orthogonal array-based test design strategies – test the same application. Those dramatic results led to my decision to create Hexawise, a software test design tool.  [Point Taken ...]

My closing thoughts related to your post boil down to:

1) I agree with your comment – “There are a lot of bogus ideas in software development.”

2) I agree that testers shouldn’t accept fancy PowerPointed ideas like “this new, improved method/model/tool will solve all your problems.”

3) I agree that testers should be especially skeptical when the person presenting those PowerPointed slides hasn’t rolled up their sleeves for years as a software testing practitioner.

Even so…

4) Some consultants who lack software testing experience actually are capable of making valuable recommendations to software testers about how they can improve their efficiency and effectiveness. It would be a mistake to write them off as charlatans because of their lack of software testing experience.  [I agree with the sentiment that sometimes, people out of the field can provide insight. I even hinted at that with the comment that at least, Forrest should listen, then use his discernment on what to use. I'm not entirely ready to, as the expression goes, throw the baby out with the bathwater.]

5) Following the “bogus ideas” link above takes readers to your quote that: “When someone tells you that your organization has to do something ‘to stay competitive,’ but he or she can’t provide any direct link to sales, revenue, reduced expenses, or some other kind of money, be leery.” I enthusiastically agree. In the software testing community, in my view, we do not focus enough on gathering real data** about which approaches work (or -ideally- in what contexts they work). A more data-driven management approach would help everyone understand what methods and approaches deliver real, tangible benefits in a wide variety of contexts vs. those methods and approaches that look good on paper but fall short in real-world implementations.  [Hey man, you can back up your statements with evidence, and you're not afraid to roll up your sleeves and enter an argument. I may not always agree with you, but you're exactly the kind of person I want to surround myself with, to keep each other sharp. Thank you for the thoughtful and well reasoned comment.]

- Justin

Company – http://www.hexawise.com
Blog – http://hexawise.wordpress.com
Forum – http://testing.stackexchange.com

*I use the term “applied statistics-based testing” to incorporate pairwise, orthogonal array-based, and more comprehensive combinatorial test design methods such as n-wise testing (that can capture, for example, all possible valid combinations involving 6-values).

**Here is an article I co-wrote which provides some solid data that applied statistics-based testing methods can more than double the number of defects found per tester hour (and simultaneously result in finding more defects) as compared to testing that relies on “business as usual” methods during the test case identification phase.

Today I’ve released a beta version of testing.stackexchange.com which is a “stackoverflow.com for software testers.” I would appreciate your help in contributing content, and/or getting the word out.  Stackoverflow has become an extraordinarily useful forum for software developers to ask difficult, practical questions, and get quick, actionable, peer-reviewed responses from software developers around the globe.  While there are some software testing questions on stackoverflow itself, the questions are mostly software developer-centric.  There’s no reason why we can’t create a very similar forum geared primarily towards the software testing community.  So who’s with me?  Please show your support by posting a question, sharing an answer or voting on existing answers at  testing.stackexchange.com

If you share my belief in the significant potential benefit to the software testing community that would result from a mature, well-trafficked site with a rich collection of peer-reviewed questions on software testing and you would be interested in helping out beyond posting periodic questions and/or answers to the site, please post a reply here or contact me through Linkedin.  I’d love to brainstorm ideas and work with like-minded people to get this forum created for the software testing community.  As of now, the odds are against testing.stackexchange growing to the critical mass it needs (particularly since I’m busy day-to-day building my software testing tool company); a small number of active collaborators would improve the odds dramatically.

I first found out about stackoverflow.com through my brother’s blog at http://management.curiouscatblog.net/2009/05/04/joel-spolsky-webcast-on-creating-social-web-resources/

Joel Spolsky’s video is fantastic.  He set out to crack the code on:

  • How can you get a useful exchange of information between experts that results in very good questions and answers being actively shared by participants?
  • How can the community encourage visitors to the site to actively participate and share their expertise?
  • How can the site generate a critical mass and utilize Google to drive traffic to the site to make it self-sustaining?
  • How can users (who might not otherwise be able to tell which are the best answers from among multiple answers) tell which answers are in fact the best?

In my view, he has succeeded on all of the above counts, which is truly impressive.  We’re using the identical strategies (and Spolsky’s technology) at testing.stackexchange.com.  The way Spolsky lays out his vision is impressive.  He logically progresses through a graveyard of multiple Q & A sites that have devolved into largely useless forums where inane questions are asked and dubious answers are shared.  He then shares how he and his collaborators adjusted the model for Stackoverflow to maximize the value to participants.  Their self-described strategy amounts to taking the best ideas they could from multiple different sites and putting them together in stackoverflow (and “using Google as our landing page” as a way to build traffic).

Thank you in advance for helping to get the word out.

- Justin Hunter

I enjoyed talking about efficient and effective combination testing strategies (and highlights of a recent empirical study) at yesterday’s TISQA meeting together with Lester Bostic of Blue Cross Blue Shield North Carolina, who shared his team’s experiences of adopting a combinatorial testing approach.  The presentation addresses how tools like Hexawise can help software testers quickly identify the test cases they should execute to find as many defects as possible with as few tests as possible.  I wanted to share it now; once I have more time, I will comment on it and highlight some of the good questions, comments, discussion points, and tester experiences that were raised by the attendees.

The presentation focused on combinatorial testing techniques, such as pairwise testing, orthogonal array-based testing methods, and more thorough combination testing strategies (capable of identifying all defects that could be captured by, say, any possible combination of three or four “things” that you’ve decided to test for, regardless of whether those “things” are features, configurations, equivalence classes of data, types of user, or a mix of each).
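
For readers who have not seen pairwise testing in action, here is a minimal sketch of the idea in Python. It is a naive greedy generator I wrote for illustration only; it is not the algorithm Hexawise uses, it enumerates every possible test and so only suits small examples, and the parameter names and values are hypothetical.

```python
from itertools import combinations, product
from math import prod


def pairs_in(test):
    """Return every pair of (parameter index, value) settings that one test covers."""
    return {((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)}


def pairwise_tests(parameters):
    """Greedily choose tests until every pair of parameter values appears at least once."""
    names = list(parameters)
    candidates = list(product(*(parameters[name] for name in names)))
    uncovered = set().union(*(pairs_in(t) for t in candidates))
    chosen = []
    while uncovered:
        # Pick the candidate test that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_in(t) & uncovered))
        chosen.append(dict(zip(names, best)))
        uncovered -= pairs_in(best)
    return chosen


# Hypothetical test parameters, for illustration only.
parameters = {
    "browser": ["Firefox", "Chrome", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "user_type": ["guest", "member", "admin"],
    "payment": ["card", "invoice"],
}

tests = pairwise_tests(parameters)
exhaustive = prod(len(values) for values in parameters.values())
print(f"{len(tests)} pairwise tests instead of {exhaustive} exhaustive tests")
for test in tests:
    print(test)
```

Every pair of values (say, Safari with Linux, or guest with invoice) shows up in at least one of the chosen tests, which is the coverage guarantee pairwise testing is after. Covering all triples or quadruples works the same way with larger combinations.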

The middle of the presentation also highlights empirical evidence that shows this method of identifying test cases often has an enormous impact on how quickly software testers are able to identify defects; citing the IEEE Computer article I co-wrote last month on Combinatorial Testing, this approach – on average – led to more than twice as many defects found per tester hour.

The final section of the presentation was delivered by Lester Bostic of Blue Cross Blue Shield and addresses his lessons learned.  Lester used Hexawise to reduce 1,356,136,048,589,996,428,428,909,447,344,392,489,476,985,674,792,960 possible tests (that would have been necessary to achieve comprehensive testing of the application he was testing) to only 220 tests that proved to be extremely effective at identifying defects.  <Side note: No, that absurdly large number with 51 digits after the “1” is not a typo;  it makes me smile every time I see it>.
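
Where do numbers like that come from? The count of exhaustive tests is simply the product of the number of values for each parameter, so it explodes as parameters are added. The sketch below uses made-up value counts (they are not the parameters from Lester’s test plan) to show the arithmetic, along with a simple lower bound on the size of a pairwise set: at least the product of the two largest value counts.

```python
from math import prod

# Hypothetical plan: 20 parameters with 4 values each and 10 with 3 values each.
value_counts = [4] * 20 + [3] * 10

# Exhaustive testing needs one test per combination of all parameter values.
exhaustive = prod(value_counts)

# Any pairwise set needs at least as many tests as there are value pairs
# between the two parameters with the most values.
pairwise_lower_bound = prod(sorted(value_counts)[-2:])

print(f"exhaustive tests: {exhaustive:,}")
print(f"pairwise tests needed: at least {pairwise_lower_bound}")
```

The exhaustive count grows multiplicatively with every added parameter, while the number of pairwise tests grows far more slowly, which is how a plan with an astronomically large number of possible tests can shrink to a few hundred.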

Comments and questions are welcome.