Back to Blog
    Tutorial4 min readNovember 11, 2024

    Getting Started With AskUI for UI Automation

    AskUI is a platform-independent, visual selector-based UI automation framework. Learn how to get started!

    Jonas Menesklou
    Getting Started With AskUI for UI Automation

    TLDR

    AskUI is a cross-platform UI automation framework that uses AI-powered visual recognition to interact with UI elements, enabling automation across different operating systems and applications without relying on fragile DOM selectors.


    Introduction

    Traditional test automation struggles when dealing with diverse operating systems and applications. Tasks like automating two-factor authentication across web and mobile become particularly challenging.

    AskUI solves this by providing a platform-independent automation approach using visual recognition instead of code selectors, making UI automation more intuitive and adaptable.


    What Makes AskUI Stand Out?

    AskUI operates at the OS level, using AI vision models to identify and interact with UI elements. Unlike Selenium or Playwright that rely on XPath/CSS selectors, AskUI recognizes UI elements visually from screenshots.

    Key advantages:

    • Visual recognition - Works with any visible UI element
    • Platform independent - Windows, macOS, Linux, mobile
    • No DOM dependency - Works with native apps, games, legacy systems
    • Resilient to changes - Survives UI updates and redesigns

    How AskUI Works

    AskUI matches visual elements with Python instructions. For example:

    from askui import VisionAgent from askui import locators as loc with VisionAgent() as agent: # Click button below "Password" text agent.click(loc.Button().below_of(loc.Text("Password")))

    This approach makes automation intuitive and less dependent on code structure.


    Advantages Over Traditional Tools

    While Selenium and Playwright excel at web testing, they struggle with:

    • Shadow DOMs and iFrames
    • Native desktop applications
    • Cross-platform workflows
    • Third-party systems without source access

    AskUI benefits:

    • Human-readable code - Easy to understand and maintain
    • Code independence - No brittle selectors
    • Visual resilience - Adapts to resolution and design changes

    Getting Started: Step-by-Step Guide

    1. Create an AskUI Account

    Sign up at hub.askui.com to get:

    • Workspace ID
    • Access Token

    2. Install AskUI Agent OS

    Download the installer for your OS:

    Windows:

    macOS/Linux: Visit installation docs


    3. Install Python Package

    pip install askui

    Requirements: Python 3.10+


    4. Configure Authentication

    Set your credentials as environment variables:

    # macOS/Linux export ASKUI_WORKSPACE_ID="your-workspace-id" export ASKUI_TOKEN="your-access-token" # Windows PowerShell $env:ASKUI_WORKSPACE_ID="your-workspace-id" $env:ASKUI_TOKEN="your-access-token" # Windows CMD set ASKUI_WORKSPACE_ID=your-workspace-id set ASKUI_TOKEN=your-access-token

    5. Your First Test

    Create first_test.py:

    from askui import VisionAgent from askui import locators as loc def test_click_text(): with VisionAgent() as agent: # Click on any text element agent.click(loc.Text("Login")) print("Clicked successfully!") if __name__ == "__main__": test_click_text()

    Run:

    python first_test.py

    Practical Examples

    Login Automation

    from askui import VisionAgent from askui import locators as loc with VisionAgent() as agent: # Fill login form agent.click(loc.Text("Username")) agent.type("user@example.com") agent.click(loc.Text("Password")) agent.type("secure_password") agent.click(loc.Button().contains_text("Sign In"))

    Using Visual Relationships

    with VisionAgent() as agent: # Click element based on position agent.click(loc.Button().below_of(loc.Text("Terms"))) agent.click(loc.Icon().right_of(loc.Text("Settings"))) agent.click(loc.Text("Delete").nearest_to(loc.Text("Item 1")))

    Conclusion

    AskUI provides a fresh perspective on UI automation by using AI vision and platform-agnostic architecture. By avoiding dependence on code selectors and using visual recognition, AskUI enables creation of more stable, maintainable tests that work across any application or platform.


    FAQ

    What can AskUI automate? Any application with a GUI: web apps, desktop apps, mobile apps (via emulators), virtual machines, games, legacy systems.

    How does it handle dynamic elements? Visual recognition identifies elements by appearance and relative position, not unstable selectors.

    Do I need coding expertise? Basic Python knowledge is sufficient. AskUI's syntax is designed to be intuitive.

    How does it compare to Selenium? Selenium excels at web testing but relies on DOM selectors. AskUI uses visual AI, making it more resilient and suitable for cross-platform automation.

    Can I use it in CI/CD? Yes, AskUI integrates seamlessly with CI/CD pipelines via command-line execution.


    Ready to automate your testing?

    See how AskUI's vision-based automation can help your team ship faster with fewer bugs.

    We value your privacy

    We use cookies to enhance your experience, analyze traffic, and for marketing purposes.