Testing Infrastructure Series — Part 1
Executive Summary
ISTQB draws a hard line between Quality Assurance and Quality Control. Most teams blur it. That blurring creates a process vacuum where nobody owns the standards, and test execution runs without governance. This post maps four ISTQB fundamentals to real engineering problems in hardware-dependent QA, and shows where computer-use agents fit into the structure that ISTQB defines.
QA vs QC: Why Most Teams Have the Wrong Structure
Quality Assurance is proactive. It defines the process: what standards to enforce, what coverage targets to set, what gates to require before release. ISTQB positions QA as the discipline that prevents defects by building the right process.
Quality Control is reactive. It executes the process: writing test cases, running them, logging defects, tracking results. QC detects defects in the actual product through testing and validation.
The problem in most enterprise QA departments is that the people called "QA engineers" are actually doing QC. They write and execute tests. Nobody owns the process layer. Nobody defines the standards, gates, and policies that prevent defects from being created in the first place.
This matters more in 2026 than it ever has. In the vibe-coding era, developers describe features and coding agents build them. Cursor, Claude Code, Copilot, and Windsurf generate implementation at a pace no human QA team can match through manual test execution. The missing piece is a QA Agent that evaluates the code these tools produce and provides structured feedback, the same way a human tester would. That agent handles the QC layer: executing tests, validating outputs, catching regressions at scale. This frees the human QA team to focus on what agents can't do: defining what quality means, setting the policies agents enforce, and deciding which trade-offs are acceptable.
Error, Defect, Failure: Why Problems Travel Further Than They Should
ISTQB defines a chain that every tester should understand. An error is a human mistake made during development. That error produces a defect in the code or documentation. When the defect is triggered during test execution, it causes a failure that reveals the problem.
The practical consequence is that defects found late cost dramatically more to fix than defects found early. A requirement ambiguity caught during a review costs one conversation to resolve. The same ambiguity discovered during system integration testing on a physical test bench can require redesigning the HMI, recoding the control logic, reflashing the firmware, and retesting the entire system.
Root cause analysis matters because a single root cause can propagate across multiple modules. Fixing one defect can introduce side effects in another. This is why regression testing exists, and why it becomes the largest line item in the QA budget over time.
For teams testing hardware-dependent systems, failure detection needs to go deeper than the UI layer. An agent that can observe failures across three levels (checking UI response on the screen, verifying log signals on the CAN Bus, and confirming physical behavior through camera or sensor feeds) can trace defects back to their root cause far faster than a human working through each layer manually.
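This three-level localization can be sketched as a short routine. The check functions below are hypothetical stand-ins for real UI, CAN Bus, and sensor probes, and the layer names are illustrative, not part of any specific tool:

```python
# Hypothetical sketch of 3-level failure localization. Each check_* function
# stands in for a real probe (screen capture, CAN Bus trace, camera/sensor).

def check_ui(obs: dict) -> bool:
    """UI level: did the screen show the expected response?"""
    return obs["screen_state"] == obs["expected_screen"]

def check_logs(obs: dict) -> bool:
    """Log level: did the expected signal appear on the CAN Bus?"""
    return obs["can_signal"] == obs["expected_signal"]

def check_hardware(obs: dict) -> bool:
    """Hardware level: did the camera or sensor confirm physical behavior?"""
    return obs["sensor_reading"] == obs["expected_reading"]

def localize_failure(obs: dict) -> str:
    """Walk the layers from deepest to shallowest so the verdict points
    toward the root cause rather than the surface symptom."""
    if not check_logs(obs):
        return "control-logic"   # defect before the signal reaches the bus
    if not check_hardware(obs):
        return "actuation"       # signal correct, physical behavior wrong
    if not check_ui(obs):
        return "hmi"             # signal and behavior correct, display wrong
    return "pass"
```

The ordering is the point: a wrong CAN signal explains both a wrong display and wrong physical behavior, so checking the bus first avoids misattributing a control-logic defect to the HMI.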
The Inner Loop, the Outer Loop, and the Wall Between Them
ISTQB describes multiple test levels that should form a continuous chain: component testing, component integration testing, system testing, system integration testing, and acceptance testing.
In practice, most organizations experience this as two disconnected loops.
The inner loop is developer-owned. Static analysis, unit tests, integration tests. It runs on every commit, provides fast feedback, and catches logic errors before they propagate. This is where shift-left has its greatest wins.
The outer loop is QA-owned. End-to-end tests, system testing, acceptance testing. It validates the full user journey against real or near-real environments. It's inherently slower because it needs the complete system to exist.
Between them is a wall. In SaaS companies, DevOps and containerization have eroded this wall significantly. Developers can spin up full-stack environments locally and run system-level tests before committing code.
In automotive, embedded systems, medical devices, and industrial automation, the wall is concrete. Developers can't run system tests on their own machines because the system includes physical hardware they don't have access to.
Computer-use agents operate at the OS level, which means they can work on both sides of the wall. They execute unit-level validations in the inner loop and full end-to-end validations in the outer loop, on the same physical or virtual environment where the real testing needs to happen. Raw LLM APIs, by contrast, are typically limited to browser-based interactions; an infrastructure layer that provides OS-level perception and execution covers web, desktop, mobile, terminals, Citrix, VDI, and physical HMI displays.
Why Shift Left Fails for Hardware
ISTQB Principle 3 says early testing saves time and money. The shift-left approach implements this by moving testing activities earlier in the development lifecycle. It works when infrastructure can be virtualized.
For web and SaaS, shift left is highly feasible. Docker and Kubernetes let developers run the full stack locally in seconds.
For mobile, feasibility drops. Device farms and OS fragmentation create friction. Emulators help, but real device testing stays remote.
For desktop, feasibility drops further. OS registries, heavy VMs, and environments that accumulate configuration drift make testing slow and unreliable.
For desktop plus hardware, shift left is rarely feasible. Physical labs, test benches, and prototypes worth tens of thousands of euros cannot be shipped to a developer's desk. The physicality barrier stops the shift-left movement entirely.
This is fundamentally a Hardware-in-the-Loop problem. Traditional HiL approaches use specialized simulators to model the environment while testing real embedded controllers. But when the test target is the UI layer itself (the HMI display, the rendered user interaction), simulation alone is not enough. You need agents that can perceive the actual screen and interact with it the way a human operator would.
The answer for hardware-dependent teams is not to keep forcing shift left. It's to bring intelligent agents to where the testing actually happens: the lab, the bench, the physical device.
The Lab SRE Problem
The shift to remote work created a role that most organizations didn't plan for. When teams went remote, QA departments in hardware-dependent companies shifted from testing to infrastructure operations.
The hardware bottleneck is straightforward. You can't ship a prototype to every developer's home office. QA must now manage the physical lab as a service. That means access management with booking systems and priority queues for device time. It means remote control infrastructure with smart plugs, webcams, and relay boards for physical resets when prototypes freeze. It means firmware synchronization to ensure the lab hardware matches whatever branch the remote developer is testing.
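The access-management piece of this "lab as a service" model can be sketched as a priority queue for bench time. This is a minimal illustration under stated assumptions (one bench, a numeric priority per request); the class and field names are hypothetical, not a real lab tool's API:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Booking:
    priority: int                          # lower = more urgent (e.g. release-blocking)
    requester: str = field(compare=False)
    branch: str = field(compare=False)     # branch the bench firmware must match

class BenchQueue:
    """Serializes access to one physical test bench via a priority queue."""

    def __init__(self, bench_id: str):
        self.bench_id = bench_id
        self._queue: list[Booking] = []

    def request(self, booking: Booking) -> None:
        heapq.heappush(self._queue, booking)

    def grant_next(self):
        """Hand the bench to the highest-priority requester. In a real lab
        service, this is where tooling would flash the firmware matching
        booking.branch before granting access."""
        return heapq.heappop(self._queue) if self._queue else None
```

The `branch` field carries the firmware-synchronization requirement from the paragraph above: granting device time and flashing the matching build are one operation, not two tickets.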
QA engineers in these environments have become Lab SREs: Site Reliability Engineers for physical test environments. They spend their time maintaining infrastructure instead of writing and executing tests.
Agents that can interact with real hardware, real screens, and real interfaces autonomously reduce the Lab SRE burden. Instead of a human manually navigating to a test scenario on a physical device, the agent executes the full agentic loop: observe the display, reason about the next step, act through OS-level input, verify the outcome, and recover if the device enters an unexpected state.
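The agentic loop described above can be sketched as follows. The observe, act, and recover calls are placeholders for real OS-level perception, input, and relay-board power control; here they are exercised against a simulated device, so this is a structural sketch rather than a working agent:

```python
# Minimal sketch of the agentic loop: observe -> reason -> act -> verify,
# with recovery when the device enters an unexpected state.

def run_agentic_loop(device, goal, max_steps=10):
    """Drive `device` toward `goal`; returns True once the goal state is
    observed and verified, False if the step budget is exhausted."""
    for _ in range(max_steps):
        state = device.observe()               # e.g. screenshot + perception
        if state == goal:
            return True                        # scenario reached and verified
        if state == "frozen":
            device.recover()                   # e.g. relay-board power cycle
            continue
        device.act(next_action(state, goal))   # OS-level input
    return False

def next_action(state, goal):
    # Stand-in for the reasoning step: a real agent would plan the next
    # interaction from the observed screen; here we just name the transition.
    return f"navigate:{state}->{goal}"
```

The `max_steps` budget matters in practice: a physical device that never converges should surface as a failed run, not an agent spinning on a frozen bench.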
FAQ
What is the difference between QA and QC according to ISTQB?
Quality Assurance is proactive and process-focused. It defines standards, policies, and gates to prevent defects. Quality Control is reactive and product-focused. It executes tests and detects defects. QA owns the process. QC validates the product.
Why does shift-left testing fail for hardware-dependent systems?
Shift left assumes infrastructure can be virtualized and run locally. For systems involving physical devices, test benches, and prototypes, the test environment cannot be replicated on a developer's machine. The physicality barrier prevents testing from moving earlier in the lifecycle.
What is a Lab SRE?
A Lab SRE is a QA engineer whose role has shifted from testing to managing physical test lab infrastructure. This includes device access scheduling, remote power management, and firmware synchronization, delivered as a service for distributed development teams.
What is 3-level testing for hardware systems?
Validating system behavior across three layers: UI level to check whether the screen shows the expected response, log level to verify internal signals such as CAN Bus data, and hardware level to confirm physical behavior through cameras or sensors. Most test automation today covers only the first layer.
