Chidume Nnamdi I'm a software engineer with 6+ years of experience. I've worked with different stacks ranging from WAMP, to MERN, to MEAN. My language of choice is JavaScript; frameworks are Angular and Nodejs.

Creating visuals of your webpage with Puppeteer



Puppeteer is a Node.js module built by Google that is used to control the Chrome or Chromium browser from a Node environment.

From the Puppeteer API docs: Puppeteer is a Node library which provides a high-level API to control Chromium or Chrome over the DevTools Protocol.

So basically, Puppeteer lets you drive a browser from Node.js. It contains APIs that expose what the browser can do. These APIs enable you to carry out different operations, like:

  • Generating a PDF from a webpage
  • Generating screenshots from a webpage
  • Testing Chrome extensions
  • And many more

In this post, we will learn how we can use Puppeteer to generate screenshots from a website URL.

Creating visuals

Creating visuals of your webpage is quite easy using the Puppeteer Node module.

First, we install the puppeteer Node module:

npm i puppeteer

Then, we’ll create our .js file and require the “puppeteer” library.

const puppeteer = require("puppeteer")

Now, launch a browser and create a new page:

const puppeteer = require("puppeteer")
const browser = await puppeteer.launch({ headless: true })
const page = await browser.newPage()

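Note: The Puppeteer lib is promise-based. That means its APIs mostly return promise objects, so the await calls here need to run inside an async function, as in the complete example later in this post.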

With _const browser = await puppeteer.launch({ headless: true })_, we created a new browser instance using the launch method of the puppeteer object. This actually launches a Chromium instance. The browser is an instance of the Browser class.

The setting { headless: true } means that the browser will be a headless instance of Chromium.


Notice that launch returns a promise, which resolves to a browser instance. Here, that would be browser.

_const page = await browser.newPage()_

A browser can hold many pages. The newPage() method of Browser creates a new page in the default browser context. The page is an object of the Page class.
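Because launch and newPage both return promises, the two steps fit naturally inside one async function. The following is only a minimal sketch: withPage is a hypothetical helper name, not a Puppeteer API, and it takes the puppeteer module as a parameter so that the browser is always closed, even when the work done on the page throws.

```javascript
// withPage is a hypothetical helper, not part of Puppeteer.
// It launches a browser, opens one page, runs `work(page)`,
// and always closes the browser afterwards.
async function withPage(puppeteer, work) {
    const browser = await puppeteer.launch({ headless: true })
    try {
        const page = await browser.newPage()
        return await work(page)
    } finally {
        // Runs on success and on error, so no Chromium process is left behind.
        await browser.close()
    }
}

// Usage (assumes `npm i puppeteer` has been run):
// const puppeteer = require("puppeteer")
// withPage(puppeteer, page => page.goto("https://medium.com"))
//     .then(() => console.log("done"))
```

Passing the module in, rather than requiring it inside the helper, is a design choice that keeps the helper easy to try out with a stand-in object.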

Now, using the page object, we will load or navigate to the webpage that we want to take a screenshot of:

await page.goto('https://medium.com')

Here, we are loading the Medium home page. The method goto will resolve when the load event is fired by the browser, indicating that the page has successfully loaded. Now, with the page medium.com loaded, we can then take the screenshot.

const screenShot = await page.screenshot({
    path: "./medium.png",
    type: "png",
    fullPage: true
})

The screenshot method of the page object does it all. It takes the screenshot of the current page.

The screenshot method takes in some configurations:

_path_: This indicates the file path, including the file name and extension, where we want to save the image. A relative path is resolved against the current working directory. If there is no path, the image will not be saved to disk.

_type_: Indicates the type of image encoding to use, either png or jpeg.

_fullPage_: This will make the screenshot the full scrollable size of the page.

There are other settings, including:

_quality_: The quality of the image, between 0-100. Not applicable to png images.

_omitBackground_: Hides default white background and allows capturing screenshots with transparency.

_encoding_: The encoding of the image can be either base64 or binary. Defaults to binary.

The screenshot method returns a promise that will resolve to either a buffer or a base64 string, based on the encoding property value in the settings. So in our own example here, the screenshot method will return a promise that will resolve to a buffer. The screenShot variable will hold the binary image of the Medium front page.
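To make the two return shapes concrete, here is a small sketch that accepts whatever page.screenshot() resolved to and produces a data URI. toDataUri is a hypothetical helper, not part of Puppeteer, and it assumes the value is either binary data (a Buffer or a Uint8Array) or a base64 string.

```javascript
// Hypothetical helper: turn a screenshot result into a data URI.
// `image` is assumed to be binary data (Buffer / Uint8Array) or a base64
// string (the latter when the screenshot used encoding: "base64").
function toDataUri(image, type = "png") {
    const base64 = typeof image === "string"
        ? image
        : Buffer.from(image).toString("base64")
    return `data:image/${type};base64,${base64}`
}

// Example with a stand-in for real image bytes:
console.log(toDataUri(Buffer.from("hi")))
// prints "data:image/png;base64,aGk="
```

A data URI like this can be dropped straight into the src attribute of an img tag, which is handy when the image is never written to disk.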

With this, our code is complete:

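// screenshot.js
// the semicolon matters: without it, the parenthesis below would be
// parsed as a call on the result of require(...)
const puppeteer = require("puppeteer");

(async function() {
    // launch headless Chromium and open a new page
    const browser = await puppeteer.launch({ headless: true })
    const page = await browser.newPage()

    await page.goto('https://medium.com')

    // path must name a file, not a directory
    const screenShot = await page.screenshot({
        path: "./medium.png",
        type: "png",
        fullPage: true
    })

    // close the browser so the Node process can exit
    await browser.close()
})()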

We can run the file in our Node.js environment:

node screenshot.js

This will generate a screenshot of Medium and save it in our current directory.

You can substitute https://medium.com with the URL of any webpage you want to capture.

Setting viewport size

Without specifying the fullPage option while taking screenshots with Puppeteer, Puppeteer will simulate a browser window with a default size set to 800×600.

We can change the screenshot size using the setViewport API in the page class.

await page.setViewport({
    width: 1200,
    height: 1500
})

With this viewport set before the screenshot is taken, the screenshot of the webpage will be 1200px wide and 1500px high.
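As an illustration of how setViewport and screenshot work together, the sketch below captures the same page at several sizes. captureAtViewports and the file naming scheme are assumptions made for this example; page is assumed to be a Puppeteer page that has already navigated to the target URL.

```javascript
// Hypothetical helper: screenshot the current page at several viewport sizes.
// `page` is assumed to be a Puppeteer Page that has already loaded a URL.
async function captureAtViewports(page, viewports) {
    const files = []
    for (const { name, width, height } of viewports) {
        // The viewport has to be set before the screenshot is taken.
        await page.setViewport({ width, height })
        const path = `./${name}-${width}x${height}.png`
        await page.screenshot({ path, type: "png" })
        files.push(path)
    }
    return files
}

// Usage inside an async function:
// await captureAtViewports(page, [
//     { name: "mobile", width: 375, height: 667 },
//     { name: "desktop", width: 1200, height: 1500 }
// ])
```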

Setting specific area for screenshot

We can clip a region of the page and take a screenshot of it. In other words, we can take a screenshot of a specific area on a webpage.

This is done by passing a clip object to the page.screenshot(...) method.

const screenShot = await page.screenshot({
        path: "./medium.png",
        type: "png",
        clip: {
            ...            
        }
    })

The clip object has the following fields:

  • _x_: The x-coordinate of the top-left corner of the clip area
  • _y_: The y-coordinate of the top-left corner of the clip area
  • _width_: The width of the clip area
  • _height_: The height of the clip area

const screenShot = await page.screenshot({
        path: "./medium.png",
        type: "png",
        clip: {
            x: 0,
            y: 0,
            width: 360,
            height: 400
        }
    })

The above code would take a screenshot of the webpage starting from the point 0,0 (x,y) and going 360px right and 400px down.

const screenShot = await page.screenshot({
        path: "./medium.png",
        type: "png",
        clip: {
            x: 50,
            y: 60,
            width: 360,
            height: 400
        }
    })

This will start at the point 50,60 (x,y) on the page, move 360px right and 400px down, and then take a screenshot of the clipped area.
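Hard-coded coordinates are fine for a demo, but a clip region can also be derived from an element on the page. The sketch below is one possible approach, not something from the original walkthrough: clipForSelector is a hypothetical helper that uses page.$ and the element's boundingBox() to build the clip object. It assumes the page has not been scrolled, so the bounding box coordinates line up with page coordinates.

```javascript
// Hypothetical helper: build a clip object around the first element
// matching `selector`, with optional padding in pixels.
// Assumes the page is not scrolled when the bounding box is read.
async function clipForSelector(page, selector, padding = 0) {
    const element = await page.$(selector)
    if (!element) throw new Error(`No element matches ${selector}`)

    // boundingBox() resolves to { x, y, width, height } or null.
    const box = await element.boundingBox()
    if (!box) throw new Error(`${selector} has no bounding box`)

    return {
        // Clamp at 0 so padding never moves the corner off the page.
        x: Math.max(0, box.x - padding),
        y: Math.max(0, box.y - padding),
        width: box.width + padding * 2,
        height: box.height + padding * 2
    }
}

// Usage inside an async function ("header" is an example selector):
// const clip = await clipForSelector(page, "header", 10)
// await page.screenshot({ path: "./header.png", type: "png", clip })
```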

Background

We can omit the default white background Puppeteer gives us in the screenshots by passing the omitBackground boolean option.

const screenShot = await page.screenshot({
        ...,
        omitBackground: true
    })

This will take the screenshot of the webpage with a transparent background.

Waiting for page load

By default, a page is deemed to be fully loaded when the load event is fired, that is, once the page and its dependent resources have finished loading.

Puppeteer gives us more options apart from the load event to indicate to us when navigation is finished.

  • _domcontentloaded_: Navigation is considered finished when the DOMContentLoaded event is fired.

  • _networkidle0_: Navigation is considered finished when there have been no network connections for at least 500ms.

  • _networkidle2_: Navigation is considered finished when there have been no more than 2 network connections for at least 500ms.

The default is the load event. We can set any of the above options in the page.goto(...) call.

To wait for the DOMContentLoaded event instead, we do this:

await page.goto( "https://medium.com",{
    waitUntil: 'domcontentloaded'
});

The page load event is set in the waitUntil field in the options passed to page.goto(...). Here, the page load will be deemed complete when the DOMContentLoaded event is fired.

await page.goto( "https://medium.com",{
    waitUntil: 'networkidle0'
});

Here, the page load will be deemed complete when there are no network connections for at least 500ms.

await page.goto( "https://medium.com",{
    waitUntil: 'networkidle2'
});

Here, the page load will be deemed complete when there are no more than 2 network connections for at least 500ms.
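The four event names can be kept in one place so that a misspelled value is caught early. The sketch below is an illustration under assumptions: navigate is a hypothetical wrapper, and besides waitUntil it passes the timeout option (in milliseconds) that page.goto accepts.

```javascript
// The waitUntil values described above.
const WAIT_EVENTS = ["load", "domcontentloaded", "networkidle0", "networkidle2"]

// Hypothetical wrapper around page.goto that rejects unknown event names
// before any navigation starts.
async function navigate(page, url, waitUntil = "load", timeout = 30000) {
    if (!WAIT_EVENTS.includes(waitUntil)) {
        throw new Error(`Unknown waitUntil value: ${waitUntil}`)
    }
    return page.goto(url, { waitUntil, timeout })
}

// Usage inside an async function:
// await navigate(page, "https://medium.com", "networkidle2")
```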

Landscape mode

To take the screenshot of our webpage in landscape mode, we will pass the isLandscape option to the page.setViewport(...) call.

await page.setViewport({
    ...,
    isLandscape: true
})

This will take the screenshot of the webpage in landscape mode.

Making it a Service

We can extend this implementation to turn it into a service, like an online webpage to image service.

This service will let users take screenshots of their webpage or any selected webpage.

We will implement this service in Node.js.

Here is the backend code:

/** require dependencies */
const express = require("express")
const cors = require('cors')
const bodyParser = require('body-parser')
const helmet = require('helmet')
const path = require('path')
const puppeteer = require("puppeteer")

const app = express()

let port = process.env.PORT || 5000

/** set up middlewares */
app.use(cors())
app.use(bodyParser.json({limit: '50mb'}))
app.use(helmet())

app.use('/static',express.static(path.join(__dirname,'static')))
app.use('/assets',express.static(path.join(__dirname,'assets')))

app.get("/", (request, response) => {
    response.sendFile(path.join(__dirname, 'index.html'));
});

app.get("/style.css", (request, response) => {
    response.sendFile(path.join(__dirname, 'style.css'));
});

app.post('/api/screenshot', async (req, res) => {
    const { url } = req.body

    try {
        // takeScreenshot is async, so wait for it before responding
        const screenshot = await takeScreenshot(url)
        res.send({result: screenshot })
    } catch (err) {
        res.status(500).send({ error: err.message })
    }
})

// the catch-all goes last so it does not shadow the routes above
app.get('*', function(req, res) {
  res.sendFile(path.resolve(__dirname, 'index.html'));
});

async function takeScreenshot(url) {
    const browser = await puppeteer.launch({ 
       headless: true,
       args: ['--no-sandbox'] 
     });
    try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: 'networkidle0' });
        // base64 so the image survives being sent inside a JSON response
        const screenShot = await page.screenshot({
            type: "png",
            fullPage: true,
            encoding: "base64"
        })
        return screenShot;
    } finally {
        // close the browser even if navigation or the screenshot fails
        await browser.close();
    }
}

/** start server */
app.listen(port, () => {
    console.log(`Server started at port: ${port}`);
});

The takeScreenshot function is the main code. We get the URL of the webpage we want to screenshot from the request body and pass it to the takeScreenshot function. A Puppeteer Browser instance is created, and we navigate to the URL. The page is then screenshotted and the screenshot is returned, which is then sent to the user.
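Since the URL comes straight from the request body, a service like this would probably want to check it before handing it to Puppeteer. The following is a minimal sketch of one way to do that: normalizeUrl is a hypothetical helper, and the rule of accepting only absolute http and https URLs is an assumption rather than something the code above does.

```javascript
// Hypothetical helper for the service: accept only absolute http(s) URLs.
// Returns the normalized URL string, or null when the input is unusable.
function normalizeUrl(input) {
    if (typeof input !== "string") return null
    try {
        const parsed = new URL(input.trim())
        if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null
        return parsed.href
    } catch (err) {
        // new URL() throws a TypeError on strings it cannot parse.
        return null
    }
}

console.log(normalizeUrl("https://medium.com"))   // prints "https://medium.com/"
console.log(normalizeUrl("file:///etc/passwd"))   // prints null
console.log(normalizeUrl("medium.com"))           // prints null
```

In the route handler, a null result could be answered with a 400 status before any browser is launched.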

Let’s see the frontend code:

<title>Webpage to Image</title>

<style>
</style>

<body>
    <header>
        <div class="title-bar"><h2>Webpage to Image</h2></div>
    </header>
    <div class="container">
        <div class="info close" id="info">
            <h3>Info</h3>
        </div>
        <div class="wrapper">
            <div class="div-input" style="">
                <input id="webpageUrl" type="text" placeholder="Type your webpage URL" />
            </div>
            <div class="div-button">
                <button id="webpageButton" onclick="return generateImage(event)">
                    Generate Image
                </button>
            </div>                
        </div>
    </div>
</body>
<script src="./axios/axios.min.js"></script>
<script>

    webpageUrl.addEventListener("keydown", (evt) => {
        if(evt.key == "Enter")
            generateImage(evt)
    })

    function generateImage(evt) {
        evt.preventDefault()

        // validate first, so the button is only disabled for real requests
        if(webpageUrl.value.length === 0) {
            info.innerHTML = `
                <h3>Error</h3>
                Please type in the webpage URL.
            `
            info.classList.add("info-danger")
            info.classList.remove("close")

            setTimeout(() => {
                info.classList.add("close")
                info.classList.remove("info-danger")
            }, 3000)
            return
        }

        showLoading(true)
        disableButton(true)

        axios.post("api/screenshot", { url: webpageUrl.value } ).then( res => {
            // result is assumed to be the PNG as a base64 string
            const { result } = res.data
            const bytes = Uint8Array.from(atob(result), c => c.charCodeAt(0))
            const blob = new Blob([bytes], {type: 'image/png'})
            const link = document.createElement('a')
            link.href = window.URL.createObjectURL(blob)
            link.download = `your-file-name.png`
            link.click()
        }).catch(err => {
            info.innerHTML = `
                <h3>Error</h3>
                ${err}
            `
            info.classList.add("info-danger")
            info.classList.remove("close")
            setTimeout(() => {
                info.classList.add("close")
                info.classList.remove("info-danger")
            }, 3000)
        }).finally(() => {
            // restore the button only after the request has settled
            showLoading(false)
            disableButton(false)
        })
    }

    function showLoading(show) {
        if(show == true) {
            webpageButton.innerHTML = "wait..."
        } else {
            webpageButton.innerHTML = "Generate Image"
        }
    }

    function disableButton(disable) {
        // the HTML attribute is "disabled", not "disable"
        if(disable == true) {
            webpageButton.setAttribute("disabled", true)
        } else {
            webpageButton.removeAttribute("disabled")
        }
    }
</script>
...

We have an input and a button. We type the webpage we want to screenshot in the input. When clicked, the button will call the backend API at “api/screenshot” and send the webpage URL in the input box along with it. The backend code at “api/screenshot” will generate the screenshot and send it back to the user. The then callback of the Axios post call will run, and we will programmatically download the generated image to our storage.

It’s very simple.

There are more features you can add to this:

  • Image type to be generated
  • Batch image processing (here, the user can add an array of webpages and the service will generate their images in batch processing)
  • Screenshot size
  • Area of the webpage to screenshot
  • Background transparency
  • etc

You can build each of these features on the corresponding options in the Puppeteer APIs.
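As a rough illustration of how several of these features could map onto page.screenshot() options, here is a sketch of a function that turns a request body into an options object. buildScreenshotOptions and the field names it reads from the body (type, quality, transparent, clip) are all assumptions made for this example.

```javascript
// Hypothetical mapping from a request body to page.screenshot() options.
// The field names on `body` (type, quality, transparent, clip) are assumptions.
function buildScreenshotOptions(body = {}) {
    const type = body.type === "jpeg" ? "jpeg" : "png"
    const options = { type, encoding: "base64" }

    // quality only applies to jpeg; keep it within 0-100.
    if (type === "jpeg" && Number.isFinite(body.quality)) {
        options.quality = Math.min(100, Math.max(0, Math.round(body.quality)))
    }
    // Transparency is only requested for png in this sketch.
    if (type === "png" && body.transparent === true) {
        options.omitBackground = true
    }
    // Either a clipped area or the full page, not both.
    if (body.clip) {
        const { x, y, width, height } = body.clip
        options.clip = { x, y, width, height }
    } else {
        options.fullPage = true
    }
    return options
}

// Usage inside the route handler:
// const image = await page.screenshot(buildScreenshotOptions(req.body))
```

Copying only the known fields out of the body, instead of passing it through as is, keeps unexpected keys away from Puppeteer.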

The sky is the limit.

Conclusion

Puppeteer is freaking awesome, no doubt.

It has so many APIs you can leverage to cook up some cool stuff. Find them in the puppeteer/puppeteer repository on GitHub.

If you have any questions regarding this, feel free to comment, email, or DM me.
