Florian Rappl Technology enthusiast and solution architect in the IoT space.

Headless Recorder: A new tool to speed up browser automation

5 min read 1599

The old adage states that only a small fraction of development time goes into actual development. While writing the software is certainly an important thing, areas like documentation or testing should not be ignored — the latter, especially, can be a big burden and consume a lot of developer time.

Sure, that’s where automated tests come in handy. While unit tests form a solid basis, they are neither fully conclusive nor robust on internal changes. Quite often, we find ourselves either over-engineering for the purpose of having more robust and conclusive tests, or having to mock way too many implementation details.

A nice way around many of the pitfalls of unit tests is to mix in component or integration testing. Here, the idea of performing full end-to-end tests is certainly attractive.

Finally, we may have a setup that gives us the 80/20 (coverage/effort) distribution that we are after. Unfortunately, end-to-end tests may be even more fragile than unit tests and require a much more complex setup. To make matters worse, writing them may be harder, too.

This is where a Headless Recorder comes to the rescue.

What is Headless Recorder?

The term headless usually indicates that a proper frontend or UI is missing; instead, a more technical interface is used. For the Headless Recorder project, that’s not really the case. Here, the authors wanted to find a common feature of all the potential targets of their application. We’ll come back to that in a second.

First, let’s tackle what Headless Recorder actually is: it’s a small tool that can be used in the browser to record all the actions taken on a website. These actions are then transformed into JavaScript code that can be used with either Playwright or Puppeteer (indeed, Headless Recorder was previously known as Puppeteer Recorder).

If you haven’t heard of these tools, they’re browser automation applications. They are used to program what a browser should do, like clicking a link or filling out some form.

That already sheds some light on why the authors chose to use the name “Headless” Recorder. It’s a recorder for so-called browser automation frameworks, which are usually used with browsers in headless modes, i.e., when running without a UI. The recorder actually has a (very small) UI. It comes with buttons to pause or modify the recording.

The tool comes in the form of a browser extension, which can be installed via the Chrome Webstore. Once installed, it’s rather easy to get rolling.

We made a custom demo for .
No really. Click here to check it out.

In the following sections, we’ll use Playwright as our framework of choice.

Using Playwright

Playwright is a modern alternative to Puppeteer. The API methods are nearly identical in most cases, while others have been designed with a different goal in mind.

To get Playwright running, we take an existing Node.js project and install it as a dependency.

npm i playwright

This looks like:

Headless Recorder Installation

As we can see, the installation also provides all necessary browser drivers; included are Webkit, Chrome, and Firefox.

A simple script using Playwright might look like this:

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('http://whatsmyuseragent.org/');
  await page.screenshot({ path: `example.png` });
  await browser.close();
})();

While this script looks simple and fast to write, real-world scripts are much longer and difficult to get right. The major issue here is stability — to instruct the tooling to work reliably even when little things change or timings vary.

There are multiple use cases for a tool like Playwright. As stated initially, writing end-to-end tests is one of them. Here, Playwright may be combined with a test framework like Jest to represent a powerful approach ready to tackle any end-to-end testing challenge.

Common workflows

Now that we know a bit about the Headless Recorder and one of its potential targets in Playwright, we can have a look at how to use it exactly.

Let’s start with a simple task: searching for an item on Amazon and getting the number of results. First, we need to understand (and be able to perform) this task manually. Otherwise, any attempt to automate it will ultimately fail.

The identified steps may read like:

  1. Open the Amazon page
  2. Enter the search text
  3. Click on the search
  4. Select the number of results

These steps may sound straightforward, but the trick lies in teaching Playwright how to do it. Luckily, we can use the Headless Recorder to help us here.

Let’s start again and record the steps. After we performed the four steps, we stop the recording and extract the code using the Playwright version.

We get:

const { chromium } = require('playwright');
(async () => {
  const browser = await chromium.launch()
  const context = await browser.newContext()
  const page = await context.newPage()

  const navigationPromise = page.waitForNavigation()

  await page.goto('https://www.amazon.com/')

  await page.setViewportSize({ width: 1920, height: 969 })

  await navigationPromise

  await page.waitForSelector('#nav-search #twotabsearchtextbox')
  await page.click('#nav-search #twotabsearchtextbox')

  await page.waitForSelector('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')
  await page.click('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')

  await navigationPromise

  await page.waitForSelector('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')
  await page.click('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')

  await browser.close()
})()

While the code above is not very pretty, it also has some problems:

  • The input is missing
  • Some of the selectors seem to be very fragile
  • The navigationPromise is used multiple times but may only once have an effect
  • The result is only clicked and not selected

The good news is that the generated code already provides a great basis to enhance.

We’ll start by removing the navigationPromise. This part could have been done with a setting in the Headless Recorder as well. Next, instead of using just a click on the search text box, we’ll use the fill function for searching for the word “headless.” Finally, we replace the click on the results text by getting the textContent.

The modified variant looks like:

const { chromium } = require('playwright');
(async () => {
  const browser = await chromium.launch()
  const context = await browser.newContext()
  const page = await context.newPage()

  await page.goto('https://www.amazon.com/')

  await page.setViewportSize({ width: 1920, height: 969 })

  await page.waitForSelector('#nav-search #twotabsearchtextbox')
  await page.fill('#nav-search #twotabsearchtextbox', 'headless')

  await page.waitForSelector('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')
  await page.click('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')

  await page.waitForSelector('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')
  const content = await page.textContent('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')

  console.log(content)

  await browser.close()
})();

Again, the majority of the calls remain the same; thus, most of the work has been spared here.

As a result, this will print 1-48 of over 2,000 results for in the console.

The other popular use case is to automate the browser for testing. Here, we may use a test runner such as Jest to actually collect the results.

One way to add Jest is by installing two packages:

npm i jest jest-cli --save-dev

Adding a file suffixed with .test.js will be recognized as a test using the default configuration. Using the code from before, we’ll end up with:

const { chromium } = require('playwright');

test('has the right search results', async () => {
    const browser = await chromium.launch()
    const context = await browser.newContext()
    const page = await context.newPage()

    await page.goto('https://www.amazon.com/')

    await page.setViewportSize({ width: 1920, height: 969 })

    await page.waitForSelector('#nav-search #twotabsearchtextbox')
    await page.fill('#nav-search #twotabsearchtextbox', 'headless')

    await page.waitForSelector('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')
    await page.click('.nav-searchbar > .nav-right > .nav-search-submit > #nav-search-submit-text > .nav-input')

    await page.waitForSelector('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')
    const content = await page.textContent('.s-desktop-width-max > .sg-col-14-of-20 > .sg-col-inner > .a-section > span:nth-child(1)')

    await browser.close()

    expect(content).toBe('1-48 of over 2,000 results for');
});

More assertions are possible, too. In general, we can simplify some of these tests by putting shared functionality in methods that are reused by all tests.

Before optimizing, let’s check the output in the command line:

$ npx jest
 PASS  ./my.test.js (14.371 s)
  ✓ has the right search results (2160 ms)

Test Suites: 1 passed, 1 total
Tests:       1 passed, 1 total
Snapshots:   0 total
Time:        25.854 s
Ran all test suites.

Now let’s refine the code above. We obtain:

const { chromium } = require('playwright');

let browser, context, page;

beforeEach(async () => {
    browser = await chromium.launch()
    context = await browser.newContext()
    page = await context.newPage()
});

afterEach(async () => {
    await browser.close()
});

test('has the right search results', async () => {
    await page.goto('https://www.amazon.com/')

    // ... (as above)

    expect(content).toBe('1-48 of over 2,000 results for');
});

Wonderful — without using any E2E test framework, we are able to leverage E2E tests for preventing regressions and using reliable smoke tests.

Another great use of browser automation is with enhanced logging. In here, we essentially want to allow using replays of user actions to simplify bug reporting and ease development.

Conclusion

Browser automation has come a long way, but still faces the issue that lots of work is necessary to make sure the right actions are performed automatically. Using Headless Recorder we can improve the situation by a fair margin letting us focus on the functionality instead of technicalities. While we’ll still need to pay some attention to the technical details, the tooling boosts our productivity greatly.

What will you do with the Headless Recorder?

Florian Rappl Technology enthusiast and solution architect in the IoT space.

Leave a Reply