AI-powered e2e testing: Getting started with Shortest


End-to-end (e2e) testing is essential for ensuring software applications function correctly. However, traditional testing tools like Selenium and Cypress can be difficult to use: they have steep learning curves, produce fragile tests, and require a lot of maintenance.

AI-powered testing tools like Shortest, Testim, Mabl, and Functionize directly address these problems. They use natural language processing (NLP) and self-healing tests, making it easier to create and maintain tests, which means you don’t need to be a coding expert to use them.

This article looks at how AI-powered testing tools compare to traditional ones and their main benefits. We’ll take a close look at Shortest, an open source AI-powered testing library, its features, and how it simplifies the testing process.

Challenges with traditional testing frameworks

Traditional end-to-end testing frameworks are important for automated testing, but they have several drawbacks:

Steep learning curve: To write and maintain test scripts in tools like Selenium or Cypress, team members need coding skills. This makes it hard for those without a technical background (e.g., business analysts or product managers) to take part in the testing process
High maintenance overhead: When applications change, tests often fail and need manual updates, which leads to high maintenance costs
Slow test creation: Creating tests takes a lot of time when the scenarios are complex. This slows down development cycles

How AI-powered tools address these challenges

AI-driven testing solutions help solve common problems by introducing several key features:

Natural Language Processing (NLP): Users can write test cases in plain language. This makes it easier for those without coding skills to participate and improves teamwork among developers, QA engineers, and product teams
Self-healing tests: These tools can adjust to changes in the user interface, which means less manual work is needed to keep tests up to date
Smart test generation: AI creates test cases based on how users behave and use the application

Practical benefits of AI-powered testing tools

AI-powered end-to-end (e2e) testing tools offer several benefits compared to traditional frameworks:

Time saving

AI testing tools help reduce the time needed to create and maintain tests. What once took hours or days can now often be done in minutes. You don’t need to write custom code for every test case. Less time is spent debugging tests that break easily, test maintenance becomes automatic when the application changes, and you get immediate feedback on whether tests are valid while you create them.

Studies show that switching to AI-powered tools can cut test creation time by up to 80%. This gives developers more time to focus on building features instead of maintaining tests.

Reduced maintenance overhead

AI testing tools have self-healing features that lower the maintenance load for teams, especially those dealing with fragile test suites that often break during development. When user interface elements change, these tools can automatically spot the changes, use machine learning to find replacement elements, continue running tests without needing manual fixes, and learn from successful changes to improve future performance.
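
To make the idea of self-healing concrete, here is a minimal sketch of a fallback strategy: try a primary selector and, if it no longer matches, fall back to alternatives. This is only an illustration. The names resolveSelector, Lookup, and HealResult are hypothetical, and the tools discussed here may use very different techniques (such as machine learning or computer vision) internally.

```typescript
// Hypothetical sketch: these names do not come from Shortest, Testim, Mabl, or Functionize.
type Lookup = (selector: string) => boolean; // returns true if the selector matches an element

interface HealResult {
  selector: string; // the selector that finally matched
  healed: boolean; // true if a fallback was used instead of the primary selector
}

// Try each candidate in priority order and report whether a fallback was needed.
function resolveSelector(candidates: string[], exists: Lookup): HealResult {
  for (let i = 0; i < candidates.length; i++) {
    if (exists(candidates[i])) {
      return { selector: candidates[i], healed: i > 0 };
    }
  }
  throw new Error("No candidate selector matched: " + candidates.join(", "));
}

// Simulated page in which the button's id changed but its test id and label did not.
const currentPage = ["[data-testid=add-task]", "text=Add"];

const result = resolveSelector(
  ["#add-btn", "[data-testid=add-task]", "text=Add"],
  (selector) => currentPage.indexOf(selector) !== -1,
);

console.log(result.selector, result.healed);
```

A real tool would also have to decide when a fallback match is trustworthy enough to keep the test running, which is where the learning component described above comes in.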

Improved collaboration

AI testing tools help team members, both technical and non-technical, work better together. They help product managers check that tests accurately reflect user experiences. QA specialists and business stakeholders can create and maintain tests without coding skills. They also allow developers to concentrate on complex issues rather than basic tests. This teamwork ensures that testing aligns with business needs and that everyone shares responsibility for maintaining quality.

Scalability and reliability

AI tools make it easier to scale complex applications. They help teams create tests faster, run them on different devices in the cloud at the same time, choose the right tests intelligently, and reduce test failures. This leads to more reliable results. With this scalability, teams can keep their tests thorough even as applications grow and change. This ensures a smoother development and testing process.

Overview of AI-powered testing tools

Here’s a simple overview of four popular tools: Shortest, Testim, Mabl, and Functionize, each offering AI-driven end-to-end testing.

Shortest

Shortest is an open source testing framework that uses NLP to understand test descriptions. This makes it easy for anyone, even those with limited technical skills, to create tests. Built on Playwright, Shortest can automate browser tasks with little coding. Shortest is great for teams looking for quick and easy test creation, though using an external API might slow it down.

Key features of Shortest include:

Natural language testing: Write tests in plain English (e.g., “Log in to the app using email and password.”), and the AI will take care of the interaction
Advanced features: Chain tests for workflows (e.g., login followed by updates) and conduct API testing using natural language
Integrations: Supports GitHub 2FA, CI/CD pipelines, and email validation through Mailosaur for secure testing
Ease of use: The shortest init command sets up a project quickly, and tests can run in headless or visible modes

Testim

Testim by Tricentis is a testing platform that speeds up the creation and maintenance of tests for web and mobile apps. It uses machine learning to make tests stable and less flaky.

Testim is ideal for agile teams needing strong regression testing, but its pricing can be a hurdle for smaller projects. Some of its key features include:

AI-powered stabilizers: Its smart locators analyze UI elements, adapting tests to changes in layout
Low-code authoring: Its tests can be recorded visually or coded, so both non-technical users and developers can use it
Scalability: It runs thousands of tests quickly across different browsers, with detailed reports on failures
CI/CD integration: It easily fits into DevOps pipelines for continuous testing

Mabl

Mabl is an AI-based test automation platform for web, mobile, and API testing. It focuses on accessibility and collaboration. Mabl is great for teams that want speed and minimal coding, but some of its advanced features may take some time to learn.

Key features of Mabl include:

Intuitive AI: Quickly creates tests, auto-fixes tests for UI changes, and uses computer vision to find visual issues
Comprehensive testing: Supports functional, performance, and accessibility testing, along with API tests through Postman
Performance insights: Monitors page load times and test runs to catch problems early
Team collaboration: Works with CI/CD tools and communication platforms like Slack for smoother teamwork

Functionize

Functionize is a high-end testing platform that uses machine learning and computer vision for functional, performance, and visual testing. It features self-healing tests and scalability. Functionize is ideal for large projects that change often, but its costs and Windows-only design might make it less accessible for smaller teams.

Key features of Functionize include:

Self-healing tests: Automatically updates tests when the UI or functions change, cutting down on maintenance
Visual AI: Uses computer vision for accurate recognition of elements, so tests adapt to changing interfaces
Parallel testing: Runs tests on multiple browsers and devices at the same time for faster execution
Root cause analysis: Helps find the reasons behind test failures, making debugging easier for complex systems

Comparing AI-powered testing tools

Feature	Shortest	Testim	Mabl	Functionize
Core technology	AI-powered (Anthropic Claude API), built on Playwright	Machine Learning (Smart Locators), Cloud-based	AI-native, low-code, uses ML and computer vision	AI and ML with NLP and computer vision, cloud-based
Test creation	Natural language descriptions (e.g., “Login with email”)	Record-and-replay, low-code visual editor, supports coded enhancements	Low-code, AI-powered action words, visual recorder	NLP for scriptless tests, visual test editor
Ease of use	High: Plain English tests, minimal setup with shortest init	High: Codeless for non-technical users, intuitive UI	High: Codeless focus, accessible for beginners	Moderate: Scriptless but may require learning for advanced features
Self-healing tests	Limited: Relies on AI to adapt to minor changes, no explicit self-healing	Yes: Smart Locators auto-update element references	Yes: Auto-heals tests for UI/data changes	Yes: Strong self-healing with ML-driven updates
Supported test types	Functional, API, UI, GitHub 2FA authentication	Functional, UI, mobile (web/native), visual testing	Functional, performance, accessibility, API, visual regression	Functional, performance, load, visual, API
Integration	GitHub, Mailosaur, basic CI/CD support	CI/CD (Jenkins, Azure DevOps), Jira, Slack, Tricentis Device Cloud	CI/CD (GitHub, Azure, Bitbucket), Postman, Slack	CI/CD (Jenkins, GitLab), third-party apps via API Explorer
Cross-browser/Device support	Yes: Playwright-based, supports multiple browsers	Yes: Real browsers, iOS/Android native apps	Yes: Web, mobile, cross-browser/devices	Yes: Extensive browser/device coverage, parallel testing
Pricing model	Open source and depends on Anthropic API usage	Free tier, Essentials/Pro plans, custom pricing	Pay-as-you-go, subscription plans, custom pricing	Custom pricing, potentially high for small teams
Learning curve	Low: Natural language reduces technical barriers	Low: Codeless options, moderate for coded enhancements	Low: Intuitive GUI, low-code approach	Moderate: Advanced features require familiarity
Scalability	Moderate: Suitable for small to medium projects, API dependency	High: Scales for agile teams, parallel testing	High: Cloud-based, scales for continuous testing	High: Enterprise-grade, supports large-scale parallel testing
Unique strength	Natural language simplicity, GitHub 2FA support	Smart Locators for flaky test reduction, mobile native app support	AI-driven test generation, performance insights	Visual AI, comprehensive test coverage for complex apps
Best for	Teams wanting simple, scriptless E2E testing with minimal coding	Agile teams needing fast test creation and maintenance	DevOps teams prioritizing codeless, continuous testing	Enterprises with complex apps needing robust, scalable testing
Limitations	External API reliance, limited performance/accessibility testing	Less focus on performance, pricing complexity	Limited customization for advanced users, higher cost	High cost, Windows-centric design, less flexible for small teams

Testing with Shortest: A case study

In this section, we’ll look at how to test a demo application using Shortest. We’ll cover setup, writing a natural language test, and demonstrate advanced features like test chaining and API testing. Our demo app will be a simple React-based to-do list application using Next.js, which allows users to add, view, and delete tasks. The application will have a frontend UI and a basic API endpoint to fetch tasks.

To follow along, clone the GitHub repo, cd into the project directory, and run npm install && npm run dev. The app provides a simple UI where users can add and delete tasks, which are stored in the component’s state, and an API that returns a static list of tasks, simulating a backend response.

Adding Shortest to our application

To set up Shortest in a new or existing project, run the command below:

npx @antiwork/shortest init

This command will:

Install @antiwork/shortest as a dev dependency
Create a shortest.config.ts file
Generate a .env.local file with placeholders
Update .gitignore to include .env.local and .shortest/

Now edit shortest.config.ts to match the application setup:

import type { ShortestConfig } from "@antiwork/shortest";
export default {
  headless: false,
  baseUrl: "http://localhost:3000",
  browser: {
    contextOptions: {
      ignoreHTTPSErrors: true
    },
  },
  testPattern: "**/*.test.ts",
  ai: {
    provider: "anthropic",
    apiKey: process.env.ANTHROPIC_API_KEY
  },
} satisfies ShortestConfig;

Edit .env.local and add your Anthropic API key (you’ll need to sign up with Anthropic to get one). You can also configure browser behavior using the browser.contextOptions property in your config file, which lets you pass custom Playwright browser context options.

Ensure .env.local is in .gitignore to avoid committing sensitive data.
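
Because a missing key only surfaces once the AI provider is called, it can help to fail early with a clear message. The sketch below shows one way to do that. requireEnv is a hypothetical helper, not part of Shortest, and the environment is passed in as a plain object so the example stays self-contained (in a real project you would likely pass process.env).

```typescript
// Hypothetical helper; requireEnv is not part of Shortest.
type Env = { [name: string]: string | undefined };

// Return the variable's value, or throw a descriptive error if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error("Missing " + name + ". Add it to .env.local before running tests.");
  }
  return value;
}

// Placeholder value for illustration only; never hard-code a real key.
const sampleEnv: Env = { ANTHROPIC_API_KEY: "sk-placeholder" };
const apiKey = requireEnv(sampleEnv, "ANTHROPIC_API_KEY");

console.log("API key loaded:", apiKey.length > 0);
```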

Writing and executing a natural language test

In this section, we’ll explore how to write and execute tests using Shortest. We’ll write a test to verify adding a task to the to-do list.

Create a test file that matches the testPattern in the config file, such as app/todo.test.ts:

import { shortest } from '@antiwork/shortest';

shortest('Add a new task to the to-do list', {
  task: 'Buy groceries',
});

This test instructs Shortest to add a task with the text “Buy groceries” to the list. Now run the test using this command:

npx shortest app/todo.test.ts

Here’s what happens:

Shortest launches a browser (Playwright-based) in non-headless mode (headless: false)
It navigates to http://localhost:3000
The Anthropic Claude API interprets the natural language description, identifies the input field and “Add” button, enters “Buy groceries,” and clicks the button
A screenshot is saved in .shortest/ for verification

The test passes if the task appears in the list. You’ll see the browser perform the actions live, and the console will report success:

Found 1 test file(s)
❯ app/todo.test.ts (1)
  ● Add a new task to the to-do list
    ✓ passed
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       10.84s
   Started at     3:47:47 PM
   Tokens         0 tokens (≈ $0.00)

 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯

Demonstrating advanced features like test chaining and API testing

Let’s demonstrate test chaining and API testing to showcase Shortest’s advanced capabilities. We’ll chain tests to add a task and then delete it. Edit the app/todo.test.ts file and execute the test:

import { shortest } from '@antiwork/shortest';

shortest([
  'Add a new task to the to-do list with text Buy groceries',
  'Delete the task with text Buy groceries from the to-do list',
]);

Here’s what happens:

The first test adds “Buy groceries” to the list
The second test locates the task and clicks its “Delete” button
Shortest maintains browser state between the two tests, so the second test sees the task that the first one added

Now, let’s test the /api/tasks endpoint to ensure it returns the expected tasks. Add the code below to the app/todo.test.ts file and execute the test:

import { shortest } from '@antiwork/shortest';

const API_BASE_URI = 'http://localhost:3000/api';

// UI Test Chain
shortest([
  'Add a new task to the to-do list with text Buy groceries',
  'Delete the task with text Buy groceries from the to-do list',
]);
// API Test
shortest(`
  Test the API GET endpoint ${API_BASE_URI}/tasks
  Expect the response to contain a list of tasks including Sample Task 1
`);

Here’s what happens:

The UI tests run as before
The API test sends a GET request to /api/tasks
Shortest’s AI verifies that the response includes “Sample Task 1” (from our static tasks array)
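
For comparison, the check that the AI performs on the response can also be written as a deterministic assertion. The sketch below assumes each task is an object with an id and a text field. That shape, the sample body, and the helper names parseTasks and hasTask are assumptions for illustration, so adjust them to match the demo repo’s actual response.

```typescript
// Assumed response shape; the demo repo may use different field names.
interface Task {
  id: number;
  text: string;
}

// Parse a JSON response body and validate that it is an array of tasks.
function parseTasks(body: string): Task[] {
  const data: unknown = JSON.parse(body);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of tasks");
  }
  return data.map((item: any) => {
    if (!item || typeof item.text !== "string") {
      throw new Error("Each task needs a string text field");
    }
    return { id: Number(item.id), text: item.text };
  });
}

// Return true if any task has exactly the given text.
function hasTask(tasks: Task[], text: string): boolean {
  return tasks.some((task) => task.text === text);
}

// Hypothetical sample body standing in for the /api/tasks response.
const sampleBody = '[{"id":1,"text":"Sample Task 1"},{"id":2,"text":"Sample Task 2"}]';
const parsedTasks = parseTasks(sampleBody);

console.log(hasTask(parsedTasks, "Sample Task 1"));
```

A deterministic check like this costs no tokens, while the natural language version is quicker to write and tolerates small changes in the response format.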

The API test passes if the response contains the expected task. Shortest then logs the API response details, and the test suite completes successfully:

  ● Test the API GET endpoint http://localhost:3000/api/tasks Expect the response to contain a list of tasks including Sample Task 1
    ✓ passed
    ↳ 6,414 tokens (≈ $0.02)
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       19.65s
   Started at     4:32:52 PM
   Tokens         6,414 tokens (≈ $0.02)

 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
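
Since every AI-driven run consumes tokens, it is worth estimating what a suite might cost before wiring it into CI. The sketch below is a rough calculator under stated assumptions: it uses a single blended price per million tokens, whereas providers typically price input and output tokens differently. The rate of 3 USD and the 50 runs are placeholders, and estimateCostUsd is a hypothetical helper, not a Shortest API.

```typescript
// Hypothetical helper: a single blended rate is a simplification of real pricing.
function estimateCostUsd(tokens: number, usdPerMillionTokens: number): number {
  if (tokens < 0 || usdPerMillionTokens < 0) {
    throw new Error("Tokens and rate must be non-negative");
  }
  return (tokens / 1000000) * usdPerMillionTokens;
}

const tokensPerRun = 6414; // token count reported for the API test run above
const runsPerMonth = 50; // placeholder: how often the test runs
const assumedRate = 3; // placeholder: USD per million tokens, check your provider's pricing

const monthlyCost = estimateCostUsd(tokensPerRun * runsPerMonth, assumedRate);
console.log("Estimated monthly cost (USD):", monthlyCost.toFixed(2));
```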

Using callbacks for custom assertions

Shortest allows you to use callback functions for custom checks and actions after your browser tests run. This feature lets you create more complex test scenarios, like checking your database or making API calls, to verify your application’s state after user interactions.

To demonstrate callbacks, let’s add a test with a custom assertion to verify the task count after adding a task. Add this to the app/todo.test.ts file and run the test:

shortest('Add a task and verify task count', {
  task: 'Learn TypeScript',
}).after(async ({ page }) => {
  const taskCount = await page.locator('li').count();
  if (taskCount < 1) {
    throw new Error('No tasks found in the list');
  }
});

The callback adds custom logic on top of the AI-driven steps to confirm the task was added. Here’s what happens:

The test adds “Learn TypeScript” to the list
The .after callback uses Playwright’s API to count <li> elements (tasks)
If at least one task exists, the assertion passes

  ● Add a task and verify task count
    ✓ passed
    ↳ 40,100 tokens (≈ $0.13)
  ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ 

   Tests          1 passed (1)
   Duration       55.93s
   Started at     4:35:45 PM
   Tokens         40,100 tokens (≈ $0.13)
  
 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
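
Counting list items confirms that some task exists, but not that the right one does. A stricter check could look for the expected text, as in the sketch below. assertTaskListed is a hypothetical helper, and the sample array stands in for text you might collect inside the .after callback, for example with Playwright’s locator.allTextContents().

```typescript
// Hypothetical helper for use inside an .after callback; not part of Shortest.
function assertTaskListed(taskTexts: string[], expected: string): void {
  // Use a substring match because an item's text may include button labels.
  const matches = taskTexts.filter((text) => text.indexOf(expected) !== -1);
  if (matches.length === 0) {
    throw new Error(
      'Expected a task containing "' + expected + '", found: [' + taskTexts.join(", ") + "]",
    );
  }
}

// Simulated list item text: the task label followed by its "Delete" button label.
const renderedItems = ["Learn TypeScriptDelete"];

assertTaskListed(renderedItems, "Learn TypeScript");
console.log("Task found in the list");
```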

Using lifecycle hooks

Lifecycle hooks let you run code before and after tests. This helps with tasks like setting up the task list, navigating to the app, cleaning the UI state, and more. In your app/todo.test.ts file, add the code below and run the test:

shortest.beforeAll(async ({ page }) => {
  await page.goto('http://localhost:3000');
  // Clear any existing tasks by deleting all visible tasks
  while (await page.locator('button:text("Delete")').count() > 0) {
    await page.locator('button:text("Delete")').first().click();
  }
});

shortest.beforeEach(async ({ page }) => {
  await page.reload();
});

shortest.afterEach(async ({ page }) => {
  // Clear the input field to prevent carryover
  await page.locator('input[placeholder="Enter a new task"]').fill('');
});

shortest.afterAll(async ({ page }) => {
  await page.close();
});

Here are the lifecycle hooks Shortest provides:

beforeAll: Executes once before all tests. Ideal for initial setup, such as navigating to the app and clearing any pre-existing tasks by clicking all “Delete” buttons
beforeEach: Executes before each test. Useful for resetting the UI state, like reloading the page to clear tasks stored in the component’s state
afterEach: Executes after each test. Handy for cleanup, such as clearing the input field to ensure no text persists between tests
afterAll: Executes once after all tests. Suitable for final cleanup, like closing the browser to free system resources

The hooks ensure a consistent and isolated testing environment. In the code above, each test starts with an empty task list, the input field is cleared post-test, and the browser is closed at the end, preventing state leakage and ensuring reliable test execution.
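
The order in which these hooks fire can be illustrated with a toy runner. This is not Shortest’s implementation, and the Suite type and runSuite function are hypothetical. The sketch only records the sequence described above for a suite with two tests.

```typescript
// Toy runner, NOT Shortest's implementation: it only illustrates hook ordering.
type Hook = () => void;

interface Suite {
  beforeAll: Hook;
  beforeEach: Hook;
  afterEach: Hook;
  afterAll: Hook;
  tests: { name: string; run: Hook }[];
}

function runSuite(suite: Suite): void {
  suite.beforeAll(); // once, before any test
  for (const test of suite.tests) {
    suite.beforeEach(); // before every test
    test.run();
    suite.afterEach(); // after every test
  }
  suite.afterAll(); // once, after the last test
}

const callOrder: string[] = [];

runSuite({
  beforeAll: () => { callOrder.push("beforeAll"); },
  beforeEach: () => { callOrder.push("beforeEach"); },
  afterEach: () => { callOrder.push("afterEach"); },
  afterAll: () => { callOrder.push("afterAll"); },
  tests: [
    { name: "add", run: () => { callOrder.push("test: add"); } },
    { name: "delete", run: () => { callOrder.push("test: delete"); } },
  ],
});

console.log(callOrder.join(" > "));
// prints "beforeAll > beforeEach > test: add > afterEach > beforeEach > test: delete > afterEach > afterAll"
```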

Comparing Shortest with traditional testing frameworks

Shortest has many advantages over traditional frameworks like Selenium and Cypress.

Traditional testing tools require long and complicated code for browser tasks, and they don’t have built-in AI support, which can make writing tests slow and error-prone. For example, Cypress, while modern, still requires writing a fair amount of JavaScript. Even though it has started to implement some AI features like automatic test creation for missing UI elements, it is not primarily AI-driven.

Shortest’s AI features offer a different approach, allowing testers to write shorter, human-friendly tests and reducing the time and technical skills required to set up tests. For example, a login test in Selenium can take dozens of lines of code to navigate the website and manage waits, while Shortest can achieve this with just one simple sentence. Similarly, Cypress simplifies some tasks, but still needs specific commands like cy.get() and .click() to do so.

Shortest uses Playwright to provide performance similar to Cypress, but it also integrates with the Claude API to handle complex tasks automatically, such as managing dynamic forms or validating API responses. These are tasks for which traditional frameworks require manual coding.

It is important to note, however, that Shortest relies on Anthropic’s API, which means it depends on an external service. This is different from Selenium and Cypress, which are self-contained. Another thing to consider is that Shortest’s natural language method might feel less precise for developers who want detailed control over their tests.

Conclusion

AI-driven testing tools like Shortest, Testim, Mabl, and Functionize are changing how we do end-to-end testing. These tools use automation to help teams spend less time on maintenance and allow non-coders to take part in testing, resulting in higher quality software. While traditional tools like Selenium and Cypress are still effective, AI-powered tools offer a strong option for teams that want to improve their testing processes.

As AI technology advances, we will likely see even more improvements that simplify testing and strengthen software reliability.
