The three approaches to functional black-box Testing

Recent scientific accomplishments have led to numerous new approaches in UI testing. We present an overview of current visual black-box techniques.


Recent advances in computer science have led to numerous new approaches in the digital world. One field that has profited especially from new technologies and faster GPU-based processing is Computer Vision (CV). These CV advances are shaping new approaches to software testing, especially UI testing. In this post we highlight the three most impactful functional black-box testing approaches in 2021. These approaches do not depend on the application’s code and need no access to rendering code such as HTML. Code-dependent tools are excluded from this comparison, so we do not cover popular tools like Appium, Cypress or Selenium.

Golden Master

The first approach we are going to introduce is called “Golden Master” or “Characterization Test”. The term is well established in programming, and the approach is anything but new. It has remained relevant, though, and several tools still rely on it.

In UI testing, the golden master compares a given state with a previously recorded state. The state has to be recorded and synchronized between the testing systems.


In general terms, Golden Master testing is a means to describe (or characterize) the actual behavior of an application and thereby protect the existing behavior of legacy code against unintended changes via automated testing. It is essentially a safety net for extending or refactoring your code: with it in place, developers can make modifications and verify afterwards that everything still works as intended. Golden Master testing is often connected to legacy code, but it is not made exclusively for it. Two very general definitions of legacy code give an idea of why the Golden Master is so strongly linked to it. Michael Feathers once referred to legacy code as “code without unit tests”. J.B. Rainsberger kept it even more general: “By legacy code I mean profitable code that we are afraid to change”. This suitability for legacy code is the first advantage of the Golden Master we want to highlight.
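To make the safety-net idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `legacy_price` and `refactored_price` are hypothetical functions, and the JSON file is just one possible way to store the recorded state. The first run records the current behavior, later runs compare against it.

```python
import json
import tempfile
from pathlib import Path


def legacy_price(quantity, unit_price):
    """Hypothetical legacy function whose current behavior we want to pin down."""
    total = quantity * unit_price
    if quantity >= 10:
        total *= 0.9  # undocumented bulk discount
    return round(total, 2)


def refactored_price(quantity, unit_price):
    """Hypothetical refactoring that accidentally drops the bulk discount."""
    return round(quantity * unit_price, 2)


def check_against_golden_master(func, inputs, master_file):
    """Record outputs on the first run; on later runs return the inputs whose output changed."""
    actual = {json.dumps(args): func(*args) for args in inputs}
    if not master_file.exists():
        master_file.write_text(json.dumps(actual, indent=2))
        return []
    expected = json.loads(master_file.read_text())
    return [key for key in actual if actual[key] != expected.get(key)]


inputs = [(1, 9.99), (10, 9.99), (25, 1.50)]
master = Path(tempfile.mkdtemp()) / "golden_master.json"

first_run = check_against_golden_master(legacy_price, inputs, master)       # records the master
second_run = check_against_golden_master(legacy_price, inputs, master)      # nothing changed
after_refactor = check_against_golden_master(refactored_price, inputs, master)

print(first_run, second_run, after_refactor)
```

Note that the check only reports that behavior changed for the two bulk orders. Whether the recorded behavior was correct in the first place is not something the Golden Master can tell you.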

The second advantage goes hand in hand with the arguments we just used: the Golden Master is also very refactoring friendly, as it gives you a nice idea of the before-after situation.

Golden Master testing has a huge disadvantage in UI testing, though: the technique is extremely impractical for UIs that change frequently. Since UI changes are now released far more often than they used to be, this matters a lot. When the position of an element changes, the report will state an error. But errors come in many forms, and the element may simply sit at another position and work just fine. Imagine a button that moved to another location on the screen or changed color because of the display size or some minor source code change. The button will be marked as missing or broken only because it differs from the screenshot used as the golden master.
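The moved-button problem can be reproduced with a toy example. The sketch below is an illustration under simplifying assumptions, not how any particular tool works: a “screenshot” is a small grid of characters, and `render_screen` and `diff_pixels` are hypothetical helpers written for this post.

```python
def render_screen(button_x, width=12, height=4):
    """Draw a toy screenshot as rows of characters: '#' is a button pixel, '.' is background."""
    screen = [["."] * width for _ in range(height)]
    for y in (1, 2):
        for x in range(button_x, button_x + 3):
            screen[y][x] = "#"
    return ["".join(row) for row in screen]


def diff_pixels(golden, current):
    """Return the (x, y) coordinates where the current screen differs from the golden master."""
    return [
        (x, y)
        for y, (golden_row, current_row) in enumerate(zip(golden, current))
        for x, (g, c) in enumerate(zip(golden_row, current_row))
        if g != c
    ]


golden_master = render_screen(button_x=2)
unchanged = render_screen(button_x=2)
button_moved = render_screen(button_x=6)  # same button, four pixels to the right

unchanged_diff = diff_pixels(golden_master, unchanged)
moved_diff = diff_pixels(golden_master, button_moved)

print(len(unchanged_diff))  # 0 differing pixels: test passes
print(len(moved_diff))      # 12 differing pixels: test fails although the button still works
```

The comparison has no notion of “the same button somewhere else”. It only sees pixels that no longer match the recording.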

This technique is used, for example, by snapshot testing with Jest.


Advantages:

  • Easy to create
  • Many free tools available


Disadvantages:

  • Error prone when visual changes are made
  • Problems with dynamically loaded content
  • Changes in display size require regenerating the Golden Master
  • Correctness of the results is not inferred


Picture-in-Picture

The second approach we want to highlight is what we call the picture-in-picture approach. The name covers a range of tools and techniques. What they all share is that they store mouse and keyboard events in order to replay them later. A cropped picture serves as a visual selector to (re)find an element within a larger picture, which is where the name picture-in-picture comes from. In this respect the technique works similarly to the CSS or XPath selectors used with HTML.


This technique looks for pixel-perfect matches of elements. While this is highly accurate, it does not adapt when a change in the source code alters how an element looks: when the shape or the color of the same element changes, the match fails. Another disadvantage is that different resolutions are problematic because of potential pixel differences.
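The following sketch shows pixel-perfect matching on toy data. It is a simplified illustration, not the algorithm of any specific tool: screens are lists of strings, and `find_snippet` is a hypothetical helper that slides the cropped picture over the screen and accepts only exact matches.

```python
def find_snippet(screen, snippet):
    """Return (x, y) of the first exact occurrence of snippet in screen, or None."""
    snippet_h, snippet_w = len(snippet), len(snippet[0])
    for y in range(len(screen) - snippet_h + 1):
        for x in range(len(screen[0]) - snippet_w + 1):
            if all(screen[y + dy][x:x + snippet_w] == snippet[dy] for dy in range(snippet_h)):
                return (x, y)
    return None


# The cropped picture that acts as the visual selector
button = ["###", "###"]

original = [
    "............",
    "..###.......",
    "..###.......",
    "............",
]
moved = [
    "............",
    "......###...",
    "......###...",
    "............",
]
recolored = [  # same button, different "color"
    "............",
    "..@@@.......",
    "..@@@.......",
    "............",
]

print(find_snippet(original, button))   # (2, 1)
print(find_snippet(moved, button))      # (6, 1): a moved element is still found
print(find_snippet(recolored, button))  # None: a changed appearance breaks the selector
```

Compared with a full-screen comparison, the snippet survives a change of position, but any change to the element’s own pixels means the cropped picture has to be captured again.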

Overall, this approach can be seen as a support pillar for testers, but it still requires the tester (or the test engineer) to look into the source code when an element changes. Well-known tools that use this approach are Eggplant and Sikuli. Note that this approach is not image independent: it always relies on stored pictures.

The picture-in-picture approach evolved from the golden master technique.


Advantages:

  • Independent of global changes because only snippets are matched
  • Correctness of the results


Disadvantages:

  • Cropped pictures (visual selectors) have to be created, saved and synchronized
  • Visual changes to an element require regenerating its cropped picture
  • Few tools available

Natural language based visual description

This method has profited the most from scientific accomplishments in Computer Vision and is a concrete result of these improvements.

Many robots and automations are supposed to work in as human-like a way as possible. In UI testing, a large number of companies still prefer their good old manual testers over any automation. Manual testers obviously approach UIs in a human way: they recognize elements, click them, drag them and type into them. Existing test automation has not been able to simulate this human behaviour, and AI-trained UI detectors change that!


First of all, how do you train an AI to detect UI elements? The answer is pretty simple: you teach it. If you feed your AI as much information about UI elements as possible, it eventually does that job for you. Have you ever noticed that buttons, text fields, links and menus look much the same all over the internet? An AI does notice.

Why is this such a huge leap for test automation? Common UI testing tools rely too heavily on code. This means UI tests have to be written by developers and cannot be handed over to “pure” testers. These tools will let you know if something doesn’t work, but they do not recognize whether an element simply changed position or the screen switched from landscape to portrait. An AI that can detect an element is independent of test environment and context, just like a manual tester, and independent of the visual appearance of an application. For example, the login pages of Facebook and Google can be tested with the exact same script and the same test step descriptions.

Another factor comes in handy at this point: natural language processing (NLP). An AI that looks at screenshots the same way humans do can be taught to understand and execute the same instructions humans would use, for example “click the login button” or “type <hello> in the text field”.
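How such an instruction could be turned into an action can be sketched as follows. This is a deliberately tiny stand-in under stated assumptions: `detected_elements` is hypothetical output of a trained UI detector, and the two regular expressions replace real natural language processing with a toy grammar that, unlike the free-form examples above, requires each element to be named by a label.

```python
import re

# Hypothetical detector output; in a real tool this would come from a trained model
detected_elements = [
    {"type": "text field", "label": "email", "center": (200, 120)},
    {"type": "text field", "label": "password", "center": (200, 170)},
    {"type": "button", "label": "login", "center": (200, 230)},
]

# Toy grammar standing in for real natural language processing
STEP_PATTERNS = [
    (re.compile(r"click the (?P<label>\w+) (?P<type>button|link)", re.I), "click"),
    (re.compile(r"type <(?P<text>[^>]*)> in the (?P<label>\w+) (?P<type>text field)", re.I), "type"),
]


def plan_step(instruction, elements):
    """Map an instruction to an action on a detected element, or None if that is not possible."""
    for pattern, action in STEP_PATTERNS:
        match = pattern.fullmatch(instruction.strip())
        if match is None:
            continue
        for element in elements:
            same_type = element["type"] == match["type"].lower()
            same_label = element["label"] == match["label"].lower()
            if same_type and same_label:
                return {
                    "action": action,
                    "at": element["center"],
                    "text": match.groupdict().get("text"),
                }
        return None  # instruction understood, but no matching element was detected
    return None  # instruction not understood


print(plan_step("click the login button", detected_elements))
print(plan_step("type <hello> in the email text field", detected_elements))
print(plan_step("click the signup button", detected_elements))
```

The test step never mentions coordinates, pixels or selectors. As long as the detector finds a login button, the same step works on any screen that has one.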

This approach is without a doubt one of the most promising in test automation. Are there any downsides? In machine learning, and especially in deep learning, data is a valuable resource. To reliably teach an AI what a certain element looks like, it needs as much data as possible, and for user interfaces there are not many established data sets yet.

Teaching an AI semantics that it uses to find UI elements comes closest to how a human manual tester works. The technique is still young but highly promising.


Disadvantages:

  • Very young field of research
  • Tests cannot be defined for elements that are misdetected
  • AI lacks an understanding of unusual meta concepts


Advantages:

  • Image independent
  • 100% application independent
  • Humanlike understanding of user interfaces
  • Writing tests in natural language

Key learnings:

  1. The golden master compares a given state to a previously recorded state
  2. Picture-in-picture: a cropped picture is used as a selector to (re)find elements in the UI; it is an evolution of the golden master
  3. AI trained UI detectors use a semantically taught AI to address UI elements
