CodeWithMMAK

Test Automation using Puppeteer, NodeJS and JavaScript

A beginner-friendly guide to setting up a web automation and scraping framework using Puppeteer, NodeJS, and JavaScript.

CodeWithMMAK
January 7, 2019

Introduction

🎯 Quick Answer

Puppeteer and Node.js provide a fast, native-feeling automation experience for Chrome and Chromium. Unlike Selenium, which traditionally drives browsers through the WebDriver protocol, Puppeteer communicates with the browser directly over the DevTools Protocol. In practice this usually means less setup, faster execution, and access to features such as network interception and PDF generation. It is a popular choice for web scraping, performance analysis, and UI testing in the JavaScript ecosystem.

For a long time, I wanted to look into Puppeteer, a tool developed by the Chrome DevTools team. Unlike Selenium, which traditionally sends commands through a separate driver, Puppeteer controls the browser directly over the DevTools Protocol.

📖 Key Definitions

Puppeteer

A Node.js library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol.

DevTools Protocol

A protocol that allows for tools to instrument, inspect, debug, and profile Chromium, Chrome, and other Blink-based browsers.

Headless Mode

Running a browser without a visible UI, which is faster and uses fewer resources, ideal for CI/CD environments.

Web Scraping

The process of using bots to extract content and data from a website.

What is Puppeteer?

Puppeteer is a Node.js library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium with a visible window.
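To illustrate the headless default, here is a minimal sketch that builds launch options from a flag. The helper name buildLaunchOptions and its debug flag are hypothetical and used only for illustration; headless and slowMo are standard puppeteer.launch() options.

```javascript
// Hypothetical helper: choose launch options for debugging vs. unattended runs.
function buildLaunchOptions({ debug = false } = {}) {
    return {
        // headless: false opens a visible browser window.
        headless: !debug,
        // slowMo delays each Puppeteer operation by the given milliseconds,
        // which makes the run easier to follow by eye. 100 is an arbitrary value.
        slowMo: debug ? 100 : 0
    };
}

// Usage, assuming puppeteer is installed:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(buildLaunchOptions({ debug: true }));
```

Keeping the options in one function makes it easy to switch between a visible browser on your machine and a headless one in CI.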

What can be done using Puppeteer?

Most of the operations that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an up-to-date, automated testing environment.
  • Capture a timeline trace of your site to help diagnose performance issues.
  • Test Chrome Extensions.
  • Scrape content and data from websites, one of its most common uses.
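As a rough sketch of two items from this list, the helper below records a timeline trace while a page loads and then saves the page as a PDF. The function name capturePdfAndTrace and the output file names are hypothetical; page.tracing.start(), page.tracing.stop(), and page.pdf() are Puppeteer Page APIs. The output directory is assumed to exist already.

```javascript
const path = require('path');

// Hypothetical helper: `page` is a Puppeteer Page, `outDir` an existing folder.
async function capturePdfAndTrace(page, url, outDir) {
    const tracePath = path.join(outDir, 'trace.json');
    const pdfPath = path.join(outDir, 'page.pdf');

    // Record a timeline trace that covers the navigation.
    await page.tracing.start({ path: tracePath });
    await page.goto(url);
    await page.tracing.stop();

    // Render the loaded page to a PDF file.
    await page.pdf({ path: pdfPath, format: 'A4' });

    return { tracePath, pdfPath };
}
```

The resulting trace.json can be loaded into the Performance panel of Chrome DevTools for inspection.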

🚀 Step-by-Step Implementation

1. Prerequisites

Install Node.js and a code editor such as Visual Studio Code.

2. Initialize Project

Create a folder named puppeteer-nodejs-javascript and run npm init --yes inside it to generate a package.json.

3. Install Puppeteer

Run npm install --save puppeteer to download the library and a compatible version of Chromium.

4. Create Spec Folder

Create a specs folder to house your automation scripts.

5. Write Your Script

Create a .js file that loads the library with require('puppeteer') and implements an async function to launch the browser and navigate to a URL.

6. Execute Script

Run your script from the project root using node specs/yourfilename.js in the terminal.

Building the Framework

Let's go through the step-by-step process of creating an automation framework using Puppeteer, NodeJS, and JavaScript.

  1. Install Node.js
  2. Install Visual Studio Code
  3. Create a folder named puppeteer-nodejs-javascript.
  4. Create a .gitignore file:
Code Snippet
node_modules/
temp/
test-results/
downloads/*
log/*
  5. Create a default package.json file:
Code Snippet
npm init --yes
  6. Install Puppeteer:
Code Snippet
npm install --save puppeteer
  7. Install dev dependencies:
Code Snippet
npm install --save-dev @types/node

Creating Your First Test

Create a folder named specs. Inside it, create getPageScreenshot.js:

Code Snippet
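const fs = require('fs');
const puppeteer = require('puppeteer');

// Navigates to the sample site and saves a screenshot of the page.
async function getPageScreenshot() {
    // Make sure the output folder exists before saving into it.
    fs.mkdirSync('./screenshots', { recursive: true });

    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();

        await page.goto('https://scrapethissite.com/pages/forms/');
        console.log('User navigated to site');

        await page.screenshot({
            path: './screenshots/HockeyTeams.png'
        });
        console.log('Page screenshot taken');
    } finally {
        // Close the browser even if a step above throws.
        await browser.close();
        console.log('Browser closed');
    }
}

// Report failures instead of leaving an unhandled promise rejection.
getPageScreenshot().catch((error) => {
    console.error(error);
    process.exitCode = 1;
});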
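Since this guide also targets web scraping, the same page object can be used to read data instead of capturing an image. The sketch below is an illustration only: the helper name extractCellText is hypothetical, and the selector in the usage comment is an assumption about the sample page's markup that you should verify in DevTools. page.$$eval() is a Puppeteer Page API.

```javascript
// Hypothetical helper: returns the trimmed text of every element matching
// `selector`. `page` is a Puppeteer Page that has already navigated.
async function extractCellText(page, selector) {
    // $$eval runs the callback inside the browser against all matches
    // and sends the returned array back to Node.js.
    return page.$$eval(selector, (cells) =>
        cells.map((cell) => cell.textContent.trim())
    );
}

// Usage inside getPageScreenshot(), after page.goto(...):
//   const teams = await extractCellText(page, 'tr.team td.name'); // assumed selector
//   console.log(teams);
```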

⚠️ Common Errors & Pitfalls

  • Chromium Download Failure

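    Sometimes npm install fails to download Chromium due to network restrictions. Set PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true (PUPPETEER_SKIP_DOWNLOAD in newer Puppeteer releases) to skip the download, then point Puppeteer at a local Chrome installation with the executablePath launch option.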

  • Zombie Processes

    If your script crashes before browser.close(), Chromium processes might stay alive. Wrap the work in try...finally (adding catch if you want to handle the error yourself) so the browser always closes.

  • Selector Not Found

    Puppeteer is fast. If you try to click an element before it's rendered, it will fail. Use page.waitForSelector() before interacting.
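The last two pitfalls can be sketched as a pair of small helpers. The names withBrowser and clickWhenReady are hypothetical, as is the selector in the usage comment; puppeteer.launch(), browser.close(), page.waitForSelector(), and page.click() are Puppeteer APIs.

```javascript
// Hypothetical helper: launches a browser, runs `task`, and always closes
// the browser, even when `task` throws. `puppeteer` is the required module.
async function withBrowser(puppeteer, task, launchOptions = {}) {
    const browser = await puppeteer.launch(launchOptions);
    try {
        return await task(browser);
    } finally {
        await browser.close();
    }
}

// Hypothetical helper: waits for `selector` to be visible before clicking it.
async function clickWhenReady(page, selector, timeout = 30000) {
    // waitForSelector rejects if the element does not appear within `timeout` ms.
    await page.waitForSelector(selector, { visible: true, timeout });
    await page.click(selector);
}

// Usage, assuming puppeteer is installed:
//   const puppeteer = require('puppeteer');
//   withBrowser(puppeteer, async (browser) => {
//       const page = await browser.newPage();
//       await page.goto('https://scrapethissite.com/pages/forms/');
//       await clickWhenReady(page, 'input[type="submit"]'); // assumed selector
//   }).catch(console.error);
```

Passing the puppeteer module in as an argument is a design choice for this sketch; it keeps the helper easy to exercise with a stub in unit tests.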

Best Practices

  • Use headless: false during development to see what's happening in the browser.
  • Always await every Puppeteer action so that steps run in order. Puppeteer calls are asynchronous, and a missing await lets the script race ahead of the browser.
  • Create a dedicated screenshots directory and make sure it exists before running scripts that save images.
  • Use page.setViewport() to ensure consistent rendering across different environments.
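A minimal sketch of the directory and viewport practices follows. The helper names ensureParentDir and takeScreenshot are hypothetical, and the 1280x800 viewport is an arbitrary example value; fs.mkdirSync(), page.setViewport(), and page.screenshot() are real Node.js and Puppeteer APIs.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: creates the parent directory of `filePath` if missing.
function ensureParentDir(filePath) {
    const dir = path.dirname(filePath);
    // recursive: true makes this a no-op when the directory already exists.
    fs.mkdirSync(dir, { recursive: true });
    return dir;
}

// Hypothetical helper: fixes the viewport size, then saves a screenshot.
async function takeScreenshot(page, filePath) {
    ensureParentDir(filePath);
    await page.setViewport({ width: 1280, height: 800 });
    await page.screenshot({ path: filePath });
}

// Usage, with `page` being a Puppeteer Page:
//   await takeScreenshot(page, './screenshots/HockeyTeams.png');
```

In a real script you may prefer to call page.setViewport() before page.goto(), so the page is laid out at the target size from the start.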

Frequently Asked Questions

Does Puppeteer support Firefox?

Yes, Puppeteer has experimental support for Firefox, but it is primarily optimized for Chrome/Chromium.

How do I handle multiple tabs?

Use await browser.pages() to get an array of all open pages, or browser.waitForTarget() to detect a new tab opening.
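Both calls can be sketched as follows. The helper names listOpenUrls and openInNewTab are hypothetical; browser.pages(), browser.waitForTarget(), page.target(), target.opener(), and target.page() are Puppeteer APIs, though it is worth checking them against the Puppeteer version you have installed.

```javascript
// Hypothetical helper: lists the URL of every open tab.
// `browser` is a Puppeteer Browser.
async function listOpenUrls(browser) {
    const pages = await browser.pages();
    return pages.map((p) => p.url());
}

// Hypothetical helper: clicks an element that opens a new tab and
// returns the Page object for that tab.
async function openInNewTab(browser, page, selector) {
    const opener = page.target();
    // Start waiting before the click so the new target is not missed.
    const newTarget = browser.waitForTarget((t) => t.opener() === opener);
    await page.click(selector);
    return (await newTarget).page();
}
```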

Can I use Puppeteer with Jest?

Absolutely! Puppeteer is often paired with Jest for a complete testing solution, for example through the jest-puppeteer package.

Running the Test

To run the test, open the terminal and type:

Code Snippet
node specs/getPageScreenshot.js

Code Repository

The sample framework is hosted on GitHub: puppeteer-nodejs-javascript

Have a suggestion or found a bug? Fork this project to help make this even better.

Conclusion

Puppeteer is a game-changer for web automation. Its direct connection to the browser engine provides unparalleled speed and control. By following this guide, you've taken the first step toward mastering a tool that is essential for modern web engineering and data extraction.

📝 Summary & Key Takeaways

This guide introduced Puppeteer as a high-level API for controlling Chrome/Chromium via Node.js. We covered the fundamental definitions, provided a step-by-step setup guide for a basic automation framework, and demonstrated a practical example of capturing a page screenshot. By highlighting best practices like using waitForSelector and addressing common errors like zombie processes, this tutorial provides a solid foundation for beginners to explore the vast capabilities of Puppeteer in web testing and scraping.
