
Step 6: Check Results - What Good Looks Like

By Art Smalley

What Good Looks Like (and Common Pitfalls)

Introduction – Proof Over Perception

Checking results is deceptively simple: look at the data and decide if the goal was met. In practice, it’s far more nuanced. Step 6 is not about confirming that we did something; it’s about confirming that what we did worked.

When this step is handled well, it strengthens confidence in the logic of the entire problem-solving sequence. When it’s handled poorly, it breeds illusion—activity mistaken for achievement, appearance mistaken for reality.

True checking closes the scientific loop. It tests whether our understanding of the problem, the verified causes, and the chosen countermeasures were correct. The evidence must stand up to scrutiny—quantitative, repeatable, and logically connected to the target condition defined back in Step 3.

Unfortunately, many teams stumble here. The following are the most common pitfalls—and what “good” checking actually looks like when done with discipline.


1 – Checking Implementation Instead of Impact

This is the most frequent error. Teams proudly report that every countermeasure was implemented: new standardized work posted, operators trained, 5S audits completed. Yet none of these show whether productivity, quality, or safety truly improved.

Having a goal to raise productivity in pieces per hour and then “checking” audit scores is a logical mismatch. Implementation is necessary—but it is not verification.

Good practice:
Re-state the original target metric before beginning Step 6. Measure the same parameter, in the same units, over a comparable period. A valid check asks, “Did output per hour actually rise, and is it stable?”
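To make the idea concrete, here is a minimal sketch of a like-for-like check in Python. The production figures, labor hours, and target value are hypothetical; the point is simply that the check uses the exact metric defined in Step 3, in the same units, over comparable periods.

```python
# Minimal sketch of a like-for-like result check (hypothetical numbers).
# The metric is the same one defined in Step 3: pieces per labor hour.

before = {"pieces": 9_450, "labor_hours": 210}   # comparable baseline period
after  = {"pieces": 10_080, "labor_hours": 208}  # same length of period after the change
target_pieces_per_hour = 48.0                    # target condition from Step 3

def pieces_per_hour(period):
    return period["pieces"] / period["labor_hours"]

rate_before = pieces_per_hour(before)
rate_after = pieces_per_hour(after)

print(f"Before: {rate_before:.1f} pcs/hr")
print(f"After:  {rate_after:.1f} pcs/hr")
print(f"Target met: {rate_after >= target_pieces_per_hour}")
```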


2 – Changing or Diluting the Metric

A close cousin of the first error is subtly altering the measurement definition. The team produces a Pareto chart of defect causes “before” and another “after,” and declares victory. But if the mix, volume, or time base changed, those charts may tell a false story.

If 5,000 parts were inspected before and only 2,000 after, a lower defect count doesn't automatically mean a lower defect rate. Without normalization (per unit, per hour, per patient-day), the data can mislead.

Good practice:
Use identical measurement methods and normalize for workload differences. When variables like volume or mix shift, calculate rates or ratios rather than counts. Consistency in definition is the foundation of credibility.
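A short sketch of this normalization, using the inspection example above with hypothetical defect counts, shows how raw counts and rates can point in opposite directions.

```python
# Minimal sketch of normalizing defect data before comparing periods
# (hypothetical counts; the principle is to compare rates, not raw counts).

before = {"defects": 60, "units_inspected": 5_000}
after  = {"defects": 30, "units_inspected": 2_000}

def defects_per_thousand(period):
    return 1_000 * period["defects"] / period["units_inspected"]

print(f"Before: {defects_per_thousand(before):.1f} defects per 1,000 units")  # 12.0
print(f"After:  {defects_per_thousand(after):.1f} defects per 1,000 units")   # 15.0
# Raw counts fell from 60 to 30, yet the normalized rate actually rose.
```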


3 – Substituting Pictures or Anecdotes for Proof

Modern teams love visuals. Photos of clean work areas, new signage, or smiling employees appear in many “after” slides. These are valuable for storytelling, but not for measurement. A photograph is a snapshot, not statistical evidence.

Good practice:
Use visuals as context, not proof. Pair them with hard data or verified observations. A photo of a new document template, for example, is helpful—but it must be accompanied by an audit or sample showing that documentation errors actually dropped.


4 – Declaring Victory Too Soon

Another common trap is rushing to closure after the first signs of success. A single good day or week of data is celebrated as permanent improvement. Schedule pressure or a desire to “finish the project” drives premature conclusions.

Good practice:
Require proof of stability over time. At Toyota, improvements were typically confirmed over multiple production shifts, product variants, and even personnel rotations. If the new condition withstands normal variation, only then is it verified. Sustained data—run or control charts showing at least one full cycle—is far more persuasive than short-term snapshots.
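One simple way to test stability is an individuals control chart, whose limits are the mean plus or minus 2.66 times the average moving range. The sketch below applies that standard formula to hypothetical daily output figures; a real check would use your own data and charting conventions.

```python
# Minimal sketch of a stability check on post-change data using an
# individuals (I) chart: limits = mean +/- 2.66 * average moving range.
# The daily output figures below are hypothetical.

daily_output = [51.2, 49.8, 50.5, 52.1, 50.9, 49.5, 51.7, 50.3, 51.0, 50.6]

mean = sum(daily_output) / len(daily_output)
moving_ranges = [abs(b - a) for a, b in zip(daily_output, daily_output[1:])]
avg_mr = sum(moving_ranges) / len(moving_ranges)

ucl = mean + 2.66 * avg_mr   # upper control limit
lcl = mean - 2.66 * avg_mr   # lower control limit

stable = all(lcl <= x <= ucl for x in daily_output)
print(f"Mean {mean:.1f}, limits [{lcl:.1f}, {ucl:.1f}], stable: {stable}")
```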


5 – Ignoring Secondary Effects or Side Losses

A final error is tunnel vision—confirming the primary metric improved while overlooking collateral damage elsewhere. Productivity goes up, but quality drops. Lead time shortens, but overtime costs spike.

Good practice:
Conduct a balance check. Review neighboring processes and related KPIs to ensure the improvement didn’t create new problems. Step 6 is about total-system confirmation, not single-point optimization.
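A balance check can be as simple as lining up the related KPIs before and after the change and flagging anything that moved the wrong way. The KPI names and figures in this sketch are hypothetical.

```python
# Minimal sketch of a balance check: confirm the primary metric improved
# without degrading neighboring KPIs. Names and figures are hypothetical.
# "higher_is_better" tells the check which direction counts as improvement.

kpis = {
    #  name                 (before, after, higher_is_better)
    "output_per_hour":      (45.0, 51.0, True),
    "first_pass_yield_pct": (97.5, 97.6, True),
    "overtime_hours_week":  (12.0, 18.0, False),
    "lead_time_days":       (5.0, 4.2, False),
}

for name, (before, after, higher_is_better) in kpis.items():
    improved = after > before if higher_is_better else after < before
    status = "improved" if improved else "REGRESSED - investigate"
    print(f"{name:22s} {before:6.1f} -> {after:6.1f}  {status}")
```

In this example the overtime line would be flagged, prompting exactly the kind of follow-up question Step 6 is meant to raise.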


Evidence for What “Good Checking” Looks Like

Across the hundreds of problem-solving reports I have studied, the strongest pattern is consistency: solid checks rely on the same logic chain and the same types of evidence.

| Evidence Type | Description | Why It Matters |
| --- | --- | --- |
| Quantitative Before/After Data | Same metric, same scale, normalized for volume or mix | Direct proof of improvement |
| Process Observation | Verified that the new routine or condition operates consistently | Ensures behavioral sustainment |
| Stability Over Time | Run chart or control chart showing a sustained level | Guards against short-term spikes |
| Balanced Impact Review | Confirmed no negative side effects | Protects system integrity |
| Visual Context (Optional) | Photos, videos, or dashboards illustrating the change | Aids communication, not verification |

This evidence hierarchy keeps the focus on facts rather than impressions. The first four categories form the proof; the fifth simply helps others see it.


A Lesson from Experience

In the earlier surface-grinder project, our team didn’t declare success when the first samples looked good. We measured process capability (Cpk) for several weeks and tracked coolant concentration daily. Only after stability was confirmed under normal conditions did we record the improvement as verified.
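For readers unfamiliar with the capability index mentioned here, the sketch below shows the standard Cpk calculation. The specification limits and measurements are hypothetical, not data from that project.

```python
# Minimal sketch of a Cpk calculation (standard formula;
# spec limits and measurements are hypothetical).
import statistics

measurements = [10.02, 10.01, 9.99, 10.03, 10.00, 9.98, 10.01, 10.02, 10.00, 9.99]
usl, lsl = 10.10, 9.90   # upper / lower specification limits

mean = statistics.mean(measurements)
sigma = statistics.stdev(measurements)   # sample standard deviation

cpk = min(usl - mean, mean - lsl) / (3 * sigma)
print(f"Cpk = {cpk:.2f}")   # a common rule of thumb treats >= 1.33 as capable
```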

Contrast that with another case I once observed, where a facility posted impressive “before/after” photos of a reorganized workspace. They reported success, yet productivity data (which was the goal) showed no measurable change. The visual improvement had become a substitute for real verification.

The difference between these two examples is not data volume—it’s discipline. One team followed the logic of Step 3 all the way through Step 6; the other stopped at appearances.


Why These Errors Happen

Most of these mistakes come from good intentions. Teams want closure, leaders want momentum, and facilitators want to showcase results. But haste turns reflection into ritual. Checking results demands patience, honesty, and respect for data—even when the outcome disappoints.

The deeper reason is psychological: confirmation feels better than contradiction. Yet contradiction is where learning lives. A result that challenges our hypothesis sends us back to strengthen the logic of earlier steps. That feedback loop is the essence of scientific problem solving.


What Good Looks Like

A robust Check Results phase has four visible qualities:

  1. Logical Alignment – The measure perfectly mirrors the original goal.
  2. Empirical Evidence – Before/after data, normalized and traceable.
  3. Sustained Stability – Results hold under normal variation.
  4. Balanced Impact – No negative side effects elsewhere.

When these conditions are met, results speak for themselves. The data become a teacher, confirming whether our countermeasures attacked the right cause or whether deeper learning is required.


Conclusion – Learning Through Verification

Step 6 is the integrity test of problem solving. It separates true learning from comforting stories. Weak checking may satisfy a presentation; strong checking strengthens a culture.

As Toyota and Six Sigma practitioners alike remind us:

No confirmation, no learning. No verification, no problem solving.

The message is not punitive—it’s liberating. When we measure correctly, we gain confidence in what works and clarity about what doesn’t. Only then can we carry forward real knowledge to Step 7, Standardize and Share, built on evidence rather than assumption.

© 2025 Art Smalley | a3thinking.com