In the past, Node.js was often not an option for applications that require CPU-intensive computation, because its single-threaded, event-driven architecture is designed around non-blocking I/O. With the advent of worker threads in Node.js, it can now be used for CPU-intensive applications. In this article, we will take a look at certain use cases of worker threads in a Node.js application.
Before continuing with the use cases of worker threads in Node.js, let’s do a quick comparison of I/O-bound vs. CPU-bound programs in Node.
A program is said to be bound by a resource if an increase in that resource leads to improved performance of the program. An increase in the speed of the I/O subsystem (such as memory, hard disk, or network connection) improves the performance of an I/O-bound program.
This is typical of Node.js applications, as the event loop often spends time waiting for network, filesystem and perhaps database I/O to complete before continuing with code execution or returning a response. Increasing hard disk speed and/or network speed would usually improve the overall performance of the application or program.
A program is CPU-bound if its processing time decreases as CPU speed increases. For instance, a program that calculates the hash of a file will run faster on a 2.2GHz processor than on a 1.2GHz one.
For CPU-bound applications, the majority of the time is spent using the CPU to do calculations. In Node.js, CPU-bound applications block the event loop and cause other requests to be held up.
Don’t block the event loop: keep it running and avoid anything that could block the thread, like synchronous network calls or infinite loops.
Node runs a single-threaded event loop and uses non-blocking I/O calls, which allows it to support tens of thousands of concurrent connections, for example when serving many incoming HTTP requests. This works well and is fast as long as the work associated with each client at any given time is small. But if you perform CPU-intensive calculations, your concurrent Node.js server will come to a screeching halt. Other incoming requests will wait, as only one request is being served at a time.
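To make the stall concrete, here is a minimal sketch. The helper names `blockFor` and `measureTimerDelay` are made up for this illustration, and the busy loop simply stands in for any heavy calculation. A timer that should fire after 10 ms cannot run until the blocking work has finished.

```javascript
const { performance } = require("perf_hooks");

// Busy-wait on the CPU for roughly `ms` milliseconds
// (a stand-in for a CPU-intensive calculation).
function blockFor(ms) {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // spin
  }
}

// Schedule a timer, then block the thread. Resolves with the time that
// actually passed before the timer callback got a chance to run.
function measureTimerDelay(timerMs, blockMs) {
  return new Promise(resolve => {
    const start = performance.now();
    setTimeout(() => resolve(performance.now() - start), timerMs);
    blockFor(blockMs);
  });
}

measureTimerDelay(10, 200).then(elapsed => {
  console.log(`10 ms timer fired after ${Math.round(elapsed)} ms`);
});
```

In a server, the delayed timer would be every other request waiting for the event loop to become free again.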
Certain strategies have been used to cope with CPU-intensive tasks in Node.js: multiple processes (such as the cluster API), which make sure that the available CPU cores are used, and child processes, which spawn a new process to handle blocking tasks.
These strategies are advantageous because the event loop is not blocked. They also isolate processes from one another, so if something goes wrong in one process, it does not affect the others. However, since child processes run in isolation, they cannot share memory with each other, and data must be passed between them as serialized messages (JSON by default), which requires serialization and deserialization of the data.
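As a rough sketch of what that serialization step means, assume a JSON-based channel between two processes. The `message` object and the `sendThroughJson` helper below are hypothetical and only simulate the round trip in a single process. The receiver gets a copy, and values that JSON cannot represent lose their type along the way.

```javascript
// A hypothetical message a parent process might send to a child.
const message = {
  imagePath: "uploads/avatar.png", // made-up path
  requestedAt: new Date(),
  pixels: new Uint8Array([255, 128, 0])
};

// What a JSON-based channel does: stringify on one side, parse on the other.
function sendThroughJson(value) {
  const wire = JSON.stringify(value);
  return JSON.parse(wire);
}

const received = sendThroughJson(message);

console.log(received === message);                  // false: the data was copied
console.log(typeof received.requestedAt);           // "string": the Date became text
console.log(received.pixels instanceof Uint8Array); // false: now a plain object
```

For large payloads the copy itself costs CPU time on both sides, which is exactly the resource a CPU-bound application is short of.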
The best solution for CPU-intensive computation in Node.js is to run multiple Node.js instances inside the same process, where memory can be shared (for example, through a SharedArrayBuffer) and there is no need to pass data via JSON. This is exactly what worker threads do in Node.js.
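The memory sharing described here can be sketched with a `SharedArrayBuffer`. This is a minimal illustration rather than a recommended design: the worker source is kept inline and started with the `eval: true` option so everything fits in one file, and the names `incrementInWorker` and `sharedCounterDemo` are made up. Two workers increment the same counter, and the main thread reads the result without any data being copied back.

```javascript
const { Worker } = require("worker_threads");

// Inline worker source (run with eval: true) so the sketch fits in one file.
const counterWorkerSource = `
  const { parentPort, workerData } = require("worker_threads");
  // View onto the SAME memory the main thread allocated.
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic, so concurrent workers do not lose updates
  }
  parentPort.postMessage("done");
`;

// Start one worker and resolve once it reports that it has finished.
function incrementInWorker(shared, increments) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(counterWorkerSource, {
      eval: true,
      workerData: { shared, increments }
    });
    worker.once("message", () => resolve());
    worker.once("error", reject);
  });
}

async function sharedCounterDemo() {
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const counter = new Int32Array(shared);
  await Promise.all([
    incrementInWorker(shared, 1000),
    incrementInWorker(shared, 1000)
  ]);
  return Atomics.load(counter, 0);
}

sharedCounterDemo().then(total => console.log("counter:", total)); // counter: 2000
```

Only the small "done" message crosses the thread boundary; the counter itself lives in memory that both threads can see.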
We will look at a few use cases of worker threads in a Node.js application. We will not cover the worker thread APIs themselves, only how they apply to these use cases. If you are not familiar with worker threads, you can visit this post to get started with the worker thread APIs.
Let’s say you are building an application that allows users to upload a profile image, and you then generate multiple sizes (e.g., 100 x 100 and 64 x 64) of the image for the various use cases within the application. Resizing an image is CPU intensive, and producing two different sizes increases the time the CPU spends on it. The task of resizing the image can be outsourced to a separate thread while the main thread handles other lightweight tasks.
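    // worker.js
    const { parentPort, workerData } = require("worker_threads");
    const sharp = require("sharp");

    // Resize the image passed in via workerData and report the output path.
    async function resize() {
      const { image, size } = workerData;
      // Include the size in the file name so that two workers started in the
      // same millisecond do not write to the same path.
      const outputPath = "public/images/" + Date.now() + "-" + size + ".png";
      await sharp(image)
        .resize(size, size, { fit: "cover" })
        .toFile(outputPath);
      parentPort.postMessage(outputPath);
    }

    resize();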
    // mainThread.js
    const { Worker } = require("worker_threads");

    // Start a worker for one resize job and resolve with the path it posts back.
    module.exports = function imageResizer(image, size) {
      return new Promise((resolve, reject) => {
        const worker = new Worker(__dirname + "/worker.js", {
          workerData: { image, size }
        });
        worker.on("message", resolve);
        worker.on("error", reject);
        worker.on("exit", code => {
          if (code !== 0)
            reject(new Error(`Worker stopped with exit code ${code}`));
        });
      });
    };
The main thread has a function that creates a worker thread for each resize job. It passes the image and the size to the worker through the workerData property. The worker resizes the image with sharp and sends the path of the resized file back to the main thread.
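Since each call starts its own worker, both sizes can be requested at once. The sketch below is one possible way to do that, not part of the original example. `resizeToAllSizes` is a made-up helper, and `resizer` is assumed to behave like `imageResizer` from mainThread.js: it takes an image and a size and returns a promise for the output path. A stand-in resizer is used so the sketch runs without sharp or a real image.

```javascript
// `resizer` is assumed to have the same shape as imageResizer(image, size).
async function resizeToAllSizes(image, sizes, resizer) {
  // allSettled lets one failed size be reported without discarding the others.
  const settled = await Promise.allSettled(
    sizes.map(size => resizer(image, size))
  );
  const paths = {};
  const failures = {};
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") paths[sizes[i]] = result.value;
    else failures[sizes[i]] = result.reason;
  });
  return { paths, failures };
}

// Stand-in for the real worker-backed resizer (hypothetical behaviour).
const fakeResizer = async (image, size) => {
  if (size <= 0) throw new Error("invalid size");
  return `public/images/${size}.png`;
};

resizeToAllSizes("uploads/avatar.png", [100, 64], fakeResizer).then(result => {
  console.log(result.paths);
});
```

In the application, the call would pass the real function instead, for example `resizeToAllSizes(image, [100, 64], require("./mainThread"))`.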
Video compression is another CPU-intensive task that can be outsourced to a worker thread. Most video streaming applications keep multiple variations of a single video and show users the one that suits their network connection. Worker threads can do the job of compressing the video to various sizes.
fluent-ffmpeg is a commonly used module for video processing in Node.js applications. It depends on ffmpeg, which is a complete, cross-platform solution to record, convert and stream audio and video.
Because of the overhead of creating a worker each time you need a new thread, it is recommended that you create a pool of workers and reuse them, as opposed to creating workers on the fly. To create a worker pool we use the npm module node-worker-threads-pool, which creates a pool of worker threads using Node’s worker_threads module.
    // worker.js
    const { parentPort } = require("worker_threads");
    const ffmpeg = require("fluent-ffmpeg");

    // Transcode the input video to the requested size and report the output path.
    function resizeVideo({ inputPath, size }) {
      const outputPath = "public/videos/" + Date.now() + size + ".mp4";
      ffmpeg(inputPath)
        .audioCodec("libmp3lame")
        .videoCodec("libx264")
        .size(size)
        .on("error", function(err) {
          console.error("An error occurred: " + err.message);
          // Rethrow so the failure surfaces as the worker's "error" event
          // instead of being silently swallowed.
          throw err;
        })
        .on("end", function() {
          parentPort.postMessage(outputPath);
        })
        .save(outputPath);
    }

    // The pool delivers the parameters of each task as a message.
    parentPort.on("message", param => {
      resizeVideo(param);
    });
    // mainThread.js
    const { StaticPool } = require("node-worker-threads-pool");

    const filePath = __dirname + "/worker.js";

    // A fixed pool of four workers, one per target size.
    const pool = new StaticPool({
      size: 4,
      task: filePath
    });

    const videoSizes = ["1920x1080", "1280x720", "854x480", "640x360"];

    // Compress the video to every size in parallel and resolve with the
    // output paths. (An async callback inside forEach is never awaited,
    // so map plus Promise.all is used instead.)
    module.exports = function compressVideo(inputPath) {
      return Promise.all(
        videoSizes.map(size => pool.exec({ inputPath, size }))
      );
    };
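To see what a pool does under the hood, here is a hand-rolled sketch. It is not how node-worker-threads-pool is implemented. `TinyPool`, `poolDemo` and the inline squaring worker are made up for illustration, and a production pool would also replace workers that crash. The idea is simply that a fixed set of workers is created once, and tasks wait in a queue until one of them is idle.

```javascript
const { Worker } = require("worker_threads");

// Inline worker (eval: true): squares each number it receives.
// Stands in for a heavier task such as video compression.
const squareWorkerSource = `
  const { parentPort } = require("worker_threads");
  parentPort.on("message", n => parentPort.postMessage(n * n));
`;

class TinyPool {
  constructor(size) {
    this.queue = [];   // tasks waiting for a free worker
    this.workers = []; // every worker, for cleanup
    this.idle = [];    // workers that can take a task right now
    for (let i = 0; i < size; i++) {
      const worker = new Worker(squareWorkerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task and resolve with the worker's reply.
  exec(param) {
    return new Promise((resolve, reject) => {
      this.queue.push({ param, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { param, resolve, reject } = this.queue.shift();
      const onMessage = result => {
        worker.off("error", onError);
        this.idle.push(worker); // the worker is reused, not recreated
        resolve(result);
        this.dispatch();
      };
      const onError = err => {
        worker.off("message", onMessage);
        reject(err);
      };
      worker.once("message", onMessage);
      worker.once("error", onError);
      worker.postMessage(param);
    }
  }

  destroy() {
    return Promise.all(this.workers.map(worker => worker.terminate()));
  }
}

async function poolDemo() {
  const pool = new TinyPool(2); // two workers serve five tasks
  const results = await Promise.all([1, 2, 3, 4, 5].map(n => pool.exec(n)));
  await pool.destroy();
  return results;
}

poolDemo().then(results => console.log(results)); // [ 1, 4, 9, 16, 25 ]
```

With two workers and five tasks, three tasks wait in the queue and are picked up as workers become free, so the cost of starting a thread is paid only twice.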
Suppose you have to store your files on cloud storage. You want to be sure that the files you store are not tampered with by any third party. You can do this by computing the hash of each file using a cryptographic hash algorithm.
You save these hashes and their storage locations in your database. When you download the files, you compute the hash again to see if they match. The process of computing the hash is CPU intensive and can be done in a worker thread:
    // hashing.js
    const { Worker, isMainThread, parentPort, workerData } = require("worker_threads");
    const crypto = require("crypto");
    const fs = require("fs");

    if (isMainThread) {
      // Main thread: start a worker for the file and resolve with its hash.
      module.exports = function hashFile(filePath) {
        return new Promise((resolve, reject) => {
          // Pass the path as workerData, which is what the worker branch reads.
          const worker = new Worker(__filename, { workerData: filePath });
          worker.on("message", resolve);
          worker.on("error", reject);
          worker.on("exit", code => {
            if (code !== 0)
              reject(new Error(`Worker stopped with exit code ${code}`));
          });
        });
      };
    } else {
      // Worker thread: stream the file through the hash and post the digest.
      const algorithm = "sha1";
      const shasum = crypto.createHash(algorithm);
      const stream = fs.createReadStream(workerData);
      stream.on("data", function(data) {
        shasum.update(data);
      });
      stream.on("end", function() {
        const hash = shasum.digest("hex");
        parentPort.postMessage(hash);
      });
    }
Notice that we have both the worker thread code and the main thread code in the same file. The isMainThread property of the worker_threads module tells us which thread the code is running in, so each thread runs only the code meant for it.
The main thread creates a new worker and listens to events from the worker. The worker thread calculates the hash of a stream of data using the Node.js crypto method called createHash.
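The verification step described earlier, recomputing the hash after download and comparing it with the stored one, could look roughly like the sketch below. The helper names `hashFileStream`, `verifyFile` and `verifyDemo` are made up, the hashing runs on the current thread for brevity, and a temporary file stands in for a downloaded one. In the application the digest would come from the worker-backed function instead.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Stream a file through a hash and resolve with the hex digest.
function hashFileStream(filePath, algorithm = "sha1") {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash(algorithm);
    fs.createReadStream(filePath)
      .on("data", chunk => hash.update(chunk))
      .on("error", reject)
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Compare the recomputed hash with the one saved at upload time.
async function verifyFile(filePath, storedHex, algorithm = "sha1") {
  const actualHex = await hashFileStream(filePath, algorithm);
  return actualHex === storedHex.toLowerCase();
}

async function verifyDemo() {
  // A temporary file stands in for a file fetched from cloud storage.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "verify-demo-"));
  const file = path.join(dir, "download.txt");
  fs.writeFileSync(file, "hello worker threads");

  const stored = await hashFileStream(file); // what the database would hold
  const untouched = await verifyFile(file, stored);

  fs.appendFileSync(file, "!"); // simulate tampering
  const tampered = await verifyFile(file, stored);

  fs.unlinkSync(file);
  fs.rmdirSync(dir);
  return { untouched, tampered };
}

verifyDemo().then(result => console.log(result)); // { untouched: true, tampered: false }
```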
A Node.js worker thread is a great option when we want to improve performance by freeing up the event loop. One thing to note is that workers are useful for performing CPU-intensive JavaScript operations. Do not use them for I/O, since Node.js’s built-in mechanisms for performing operations asynchronously already handle it more efficiently than worker threads can.
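For comparison, here is a small sketch of I/O handled with the built-in asynchronous APIs and no worker threads. The helper name `readManyWithoutWorkers` and the temporary files are made up for the illustration. All reads are started at once and Node.js carries them out in the background, so there is nothing for a worker to speed up.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

async function readManyWithoutWorkers(fileCount) {
  // Create a few temporary files of increasing size to read back.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "io-demo-"));
  const files = [];
  for (let i = 0; i < fileCount; i++) {
    const file = path.join(dir, `file-${i}.txt`);
    fs.writeFileSync(file, "x".repeat(1024 * (i + 1)));
    files.push(file);
  }

  // Start every read at once; the event loop stays free while they run.
  const contents = await Promise.all(
    files.map(file => fs.promises.readFile(file))
  );

  files.forEach(file => fs.unlinkSync(file));
  fs.rmdirSync(dir);
  return contents.map(buffer => buffer.length);
}

readManyWithoutWorkers(3).then(sizes => console.log(sizes)); // [ 1024, 2048, 3072 ]
```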
One Reply to "Use cases for Node workers"
I believe the following line in the last example:
const stream = fs.ReadStream(filePath);
should be:
const stream = fs.ReadStream(workerData);