You’ve written and deployed your application and gathered users – congrats! But what’s next?
Next in line are improvements: removing bottlenecks, increasing execution speed, and other enhancements.
In order to make these improvements, you first have to be aware of your app’s existing performance characteristics. Only when you’ve identified the slow parts and the bottlenecks of the logic can you effectively improve performance.
However, nobody likes a trial-and-error process of guessing which parts might be slower.
Luckily for you, Node.js provides various built-in performance hooks to measure execution speed, find out what parts of code are worth optimizing, and collect a granular view of your app’s code execution.
In this article, you’ll learn how to use Node.js performance hooks and measurement APIs to identify bottlenecks and enhance your application’s performance for faster response times and improved resource efficiency.
Before reading this piece, you should have some basic knowledge of Node and JavaScript as well as some experience building applications with both.
We will start by taking a look at why and when we should use the Performance API provided by Node and the various options it provides.
Consider a case where you want to measure the execution time for a specific block of code. For this, you might have used the Date object like this:
    let start = Date.now();
    for (let i = 0; i < 10000; i++) {} // stand-in for some complex calculation
    let end = Date.now();
    console.log(end - start);
However, if you run the code above, you’ll notice that the result is not precise enough.
For example, an empty loop like the one above will log either 0 or 1 as the difference, which does not give us enough granularity. The Date object only offers millisecond granularity, so if the code runs on the order of 100 nanoseconds, it cannot give us a useful measurement.
For that, we can use the Performance API instead to get a better measurement:
    const { performance } = require('node:perf_hooks');
    let start = performance.now();
    for (let i = 0; i < 10000; i++) {}
    let end = performance.now();
    console.log(end - start);
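With this, we get a more granular value, which on my system is in the range of 0.18 to 0.21 milliseconds. The value is a floating-point number of milliseconds printed with up to 15-16 decimal places, so it can express sub-millisecond durations, although the trailing digits go beyond the actual resolution of the clock. This is a simple way we can use the Node Performance API for a better measurement of execution time.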
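To avoid repeating the start/end bookkeeping, the pattern above can be wrapped in a small helper. The following is a minimal sketch: timeSync is a hypothetical name (not part of the Node API), and it assumes the measured function is synchronous.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: runs fn once and returns its result
// together with the elapsed time in milliseconds.
function timeSync(fn) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;
  return { result, elapsedMs };
}

// Stand-in workload: sum the integers from 0 to 9999.
const timed = timeSync(() => {
  let sum = 0;
  for (let i = 0; i < 10000; i++) sum += i;
  return sum;
});

console.log(`sum=${timed.result}, took ${timed.elapsedMs.toFixed(3)} ms`);
```

A single run like this is noisy, so the number is best read as a rough indication rather than a benchmark result.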
The API also provides a method to precisely mark a point in time during the run of the program. We can use the performance.mark method to get a timestamp of an event with high precision, such as the start of loop iterations.
Let’s run the following code:
    // the loop bound matches the iteration count named in the mark details
    let start_mark = performance.mark("loop_start", { detail: "starting loop of 1000 iterations" });
    for (let i = 0; i < 1000; i++) {}
    let end_mark = performance.mark("loop_end", { detail: "ending loop of 1000 iterations" });
    console.log(start_mark, end_mark);
When we do, we’ll get this output:
    PerformanceMark {
      name: 'loop_start',
      entryType: 'mark',
      startTime: 27.891528000007384,
      duration: 0,
      detail: 'starting loop of 1000 iterations'
    }
    PerformanceMark {
      name: 'loop_end',
      entryType: 'mark',
      startTime: 28.996093000052497,
      duration: 0,
      detail: 'ending loop of 1000 iterations'
    }
The mark function takes the name of the mark as the first parameter. The detail field in the second parameter object allows for extra details regarding that mark, such as the number of iterations run, database query parameters, and so on.
The object returned by the mark function can then be used to export the timing data to something like Prometheus using a Prometheus exporter SDK. This allows us to query and visualize the timing info outside the application. As a mark is an instantaneous point in time, the duration field in the returned object is always zero.
Instead of manually calling performance.now and calculating the difference between two events, we can do the same using marks and the measure function. We can use the names given to the marks above to measure the duration between two marks:
    // the loop bound matches the iteration count named in the mark details
    performance.mark("loop_start", { detail: "starting loop of 1000 iterations" });
    for (let i = 0; i < 1000; i++) {}
    performance.mark("loop_end", { detail: "ending loop of 1000 iterations" });
    console.log(performance.measure("loop_time", "loop_start", "loop_end"));
The first argument to measure is the name we want to give to the measurement. The next two arguments specify the names of the marks to start and end the measurement on, respectively.
Both mark arguments are optional. If neither is given, performance.measure will return the time elapsed between the application start and the measure call. If we provide only the start mark, the function will return the time elapsed between the performance.mark call with that name and the measure call.
If both are provided, the function will return a high-precision time difference between them. For the above example, we will get an output like this:
    PerformanceMeasure {
      name: 'loop_time',
      entryType: 'measure',
      startTime: 27.991639000130817,
      duration: 1.019368999870494
    }
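The three ways of calling measure described above can be compared side by side. This is a small sketch with illustrative mark and measure names (task_start, task_end, and so on are made up for this example); it assumes a Node version in which performance.measure returns the PerformanceMeasure entry, as in the output shown above.

```javascript
const { performance } = require('node:perf_hooks');

performance.mark('task_start');
for (let i = 0; i < 10000; i++) {} // stand-in workload
performance.mark('task_end');

// No marks given: from application start up to this call.
const sinceStart = performance.measure('since_app_start');

// Start mark only: from 'task_start' up to this call.
const sinceMark = performance.measure('since_task_start', 'task_start');

// Both marks: exactly the span between 'task_start' and 'task_end'.
const between = performance.measure('task_time', 'task_start', 'task_end');

console.log({
  sinceAppStartMs: sinceStart.duration,
  sinceTaskStartMs: sinceMark.duration,
  taskMs: between.duration,
});

// Clean up the entries created by this sketch.
performance.clearMarks();
performance.clearMeasures();
```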
This can again be used with Prometheus exporter in order to export custom measurement metrics. If you have a setup which does blue-green or canary deployments, you can compare the performance of the old and new versions to see if your optimization works as expected or not.
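Before handing measurements to an exporter, it can help to aggregate them by name. The sketch below is one possible approach, not an exporter integration: summarizeMeasures is a hypothetical helper, and it assumes a recent Node release in which performance.getEntriesByType is available and measures are kept in the performance timeline.

```javascript
const { performance } = require('node:perf_hooks');

// Record the same measurement several times, as repeated requests would.
for (let run = 0; run < 3; run++) {
  performance.mark('loop_start');
  for (let i = 0; i < 10000; i++) {} // stand-in workload
  performance.mark('loop_end');
  performance.measure('loop_time', 'loop_start', 'loop_end');
}

// Hypothetical helper: groups buffered measures by name and
// computes count, total, and maximum duration for each group.
function summarizeMeasures() {
  const summary = {};
  for (const entry of performance.getEntriesByType('measure')) {
    const stats = (summary[entry.name] ??= { count: 0, totalMs: 0, maxMs: 0 });
    stats.count += 1;
    stats.totalMs += entry.duration;
    stats.maxMs = Math.max(stats.maxMs, entry.duration);
  }
  return summary;
}

const summary = summarizeMeasures();
console.log(summary);

// Free the buffered entries once they have been summarized.
performance.clearMarks();
performance.clearMeasures();
```

Running the same summary against the old and the new version of a deployment gives two sets of numbers that can be compared directly.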
Finally, one thing to note is that the Performance API internally uses a fixed-size buffer to store the marks and measures, so we need to clear them once we are done using them. Marks can be cleared like this:
    performance.clearMarks("mark_name");
And measures like this:
    performance.clearMeasures("measure_name");
These functions will remove the mark/measure with the given name from the respective buffer. If you call these functions without providing any argument, they will clear all the marks/measures that are present in the buffers, so be careful when calling these functions without any argument.
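The difference between clearing by name and clearing everything can be checked by inspecting the buffered entries. This is a sketch with made-up mark names; it assumes a recent Node release in which performance.getEntriesByType is available.

```javascript
const { performance } = require('node:perf_hooks');

performance.mark('first');
performance.mark('second');
performance.measure('first_to_second', 'first', 'second');

// Clearing by name removes only the matching mark.
performance.clearMarks('first');
const remainingAfterNamedClear = performance
  .getEntriesByType('mark')
  .map((entry) => entry.name);
console.log(remainingAfterNamedClear);

// Clearing without an argument removes every mark or measure.
performance.clearMarks();
performance.clearMeasures();
const marksLeft = performance.getEntriesByType('mark').length;
const measuresLeft = performance.getEntriesByType('measure').length;
console.log(marksLeft, measuresLeft);
```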
Let us now see how we can use this API to optimize our application. For our example, we will consider a case where we are fetching some data from the database, then manually sorting it and returning it to the user.
We want to see how much time each operation takes, and what would be the best place to optimize first. For this, we will first measure various events that take place:
    async function main() {
      const querySize = 10; // ideally this will come from user's request
      performance.mark("db_query_start", { detail: `query size ${querySize}` });
      // await the fetch so the end mark is taken after the data has arrived
      const data = await fetchData(querySize);
      performance.mark("db_query_end", { detail: `query size ${querySize}` });
      performance.mark("sort_start", { detail: `sort size ${querySize}` });
      const sorted = sortData(data);
      performance.mark("sort_end", { detail: `sort size ${querySize}` });
      console.log(performance.measure("db_time", "db_query_start", "db_query_end"));
      console.log(performance.measure("sort_time", "sort_start", "sort_end"));
      // clear the marks...
    }
We start by declaring the query size, which in a real app would probably come from the user’s request.
Then we use the performance.mark function to mark the starts and ends of the database fetch and sorting operations. Finally, we output the duration between these events using the performance.measure function. We get an output like this:
    PerformanceMeasure {
      name: 'db_time',
      entryType: 'measure',
      startTime: 27.811830999795347,
      duration: 1.482880000025034
    }
    PerformanceMeasure {
      name: 'sort_time',
      entryType: 'measure',
      startTime: 29.31366699980572,
      duration: 0.09800400026142597
    }
To see how both of these operations perform with increasing query size, we will change the query size value and note the measurements. On my system, I get the following:
Query size | Database fetch time (ms) | Sort time (ms) |
---|---|---|
10 | 1.48 | 0.098 |
100 | 1.65 | 1.235 |
1000 | 2.11 | 7.214 |
10000 | 3.8 | 21036.7 |
As we can see here, the sorting time grows much faster than the database fetch time as the query size increases, so optimizing the sort first is likely to be more beneficial. By switching to a different sorting algorithm, we get the following:
Query size | Database fetch time (ms) | Sort time (ms) |
---|---|---|
10 | 1.5 | 0.28 |
100 | 1.97 | 0.4 |
1000 | 2.35 | 5.78 |
10000 | 3.5 | 17.53 |
While the sorting time is slightly worse for very small query sizes, the time grows slowly compared to the original measurements. Hence, changing the sorting algorithm here would be beneficial if we expect to frequently deal with large query sizes.
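A comparison like the one in the tables can be scripted. The article does not say which sorting algorithms were used, so the sketch below makes an assumption: a bubble sort stands in for the slow original, and the built-in Array.prototype.sort stands in for the replacement. The function names are illustrative.

```javascript
const { performance } = require('node:perf_hooks');

// Assumed "before": a quadratic bubble sort on a copy of the input.
function bubbleSort(input) {
  const arr = input.slice();
  for (let i = 0; i < arr.length; i++) {
    for (let j = 0; j < arr.length - 1 - i; j++) {
      if (arr[j] > arr[j + 1]) [arr[j], arr[j + 1]] = [arr[j + 1], arr[j]];
    }
  }
  return arr;
}

// Assumed "after": the built-in sort with a numeric comparator.
function builtinSort(input) {
  return input.slice().sort((a, b) => a - b);
}

// Times a single call of sortFn on data, in milliseconds.
function timeSort(sortFn, data) {
  const start = performance.now();
  sortFn(data);
  return performance.now() - start;
}

const rows = [];
for (const size of [10, 100, 1000]) {
  const data = Array.from({ length: size }, () => Math.random());
  rows.push({
    size,
    bubbleMs: timeSort(bubbleSort, data),
    builtinMs: timeSort(builtinSort, data),
  });
}
console.table(rows);
```

The absolute numbers depend on the machine and on JIT warm-up, so the growth trend across sizes is the part worth reading.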
Similarly, we can measure the difference in database fetch times before and after creating an index on the queried fields. Then we can decide if the index creation is useful or which fields provide more benefits when used for indices.
When creating UI-based apps, we need the UI to be responsive even when some heavy processing task is in progress. If the UI freezes when processing large data, it would be a bad user experience to deal with. On websites, this can be done using web workers.
For apps running directly using Node, we can use Node’s worker_threads module to offload the computationally intensive tasks to background threads.
Note that this is useful only when the task is CPU-intensive, such as sorting or parsing data. If the task depends on I/O, such as reading a file or fetching a network resource, using Node’s async/await is more efficient than using workers.
We can create and use workers as follows:
    const {
      Worker,
      isMainThread,
      parentPort,
      workerData,
    } = require("node:worker_threads");

    async function main() {
      const data = await fetchData(10);
      let sorted = await new Promise((resolve, reject) => {
        const worker = new Worker(__filename, {
          workerData: data,
        });
        worker.on("message", resolve);
        worker.on("error", reject);
        worker.on("exit", (code) => {
          if (code !== 0)
            reject(new Error(`Worker stopped with exit code ${code}`));
        });
      });
    }

    function worker() {
      const data = workerData;
      sortData(data);
      parentPort.postMessage(data);
    }

    if (isMainThread) {
      // we are in the main thread of our application
      main().then(() => {
        console.log("done");
      });
    } else {
      // we are in the background thread spawned by the main thread
      worker();
    }
We start by importing the required functions and variables from the worker_threads module. We then define two functions: main, which will run in the main thread, and worker, which will run in the worker thread.
We then check if the script is being executed as the main thread or as a worker thread and call the main/worker functions accordingly. To keep this example simple, we define all these in a single file, but we can also separate out the main and worker functions in their own files.
In the main function, we fetch the data as before. We then create a promise, and in it we create a new worker. The Worker constructor requires a file path, which will be run as the worker thread.
Here we give it the same file using the __filename builtin. In the second parameter, we pass the data to be sorted as workerData. This workerData will be provided to the worker thread by the Node runtime.
Finally, we listen to the events from the worker — on receiving a message, we resolve the promise, and in the case of errors or non-zero exit code, we reject the promise.
In the worker thread, we get the data passed from the main thread in the variable workerData, which is imported from the worker_threads module. Here we sort it and post a message to the main thread containing the sorted data.
In the main thread, instead of immediately awaiting the promise, we can keep it in a queue or check on it periodically. This way we can keep our main thread responsive when the worker thread is doing the sorting calculations. We can also send intermediate messages from the worker thread indicating the sorting progress.
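The idea of intermediate progress messages can be sketched as follows. To keep the example in a single snippet, the worker source is passed as a string with the eval option of the Worker constructor instead of using __filename; the message shape ({ type, percent } and { type, sorted }) and the sortInWorker helper are assumptions made for this illustration.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code as a string; `eval: true` tells Node to run it as source.
// It sorts the input in two chunks and reports progress after each chunk.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const chunkSize = Math.ceil(workerData.length / 2) || 1;
  const sortedChunks = [];
  for (let start = 0; start < workerData.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, workerData.length);
    sortedChunks.push(workerData.slice(start, end).sort((a, b) => a - b));
    const percent = Math.round((end / workerData.length) * 100);
    parentPort.postMessage({ type: 'progress', percent });
  }
  // Final pass over the pre-sorted chunks; a plain sort keeps the sketch short.
  const sorted = sortedChunks.flat().sort((a, b) => a - b);
  parentPort.postMessage({ type: 'result', sorted });
`;

// Hypothetical helper: resolves with the sorted data and forwards
// progress percentages to the onProgress callback.
function sortInWorker(data, onProgress) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: data });
    worker.on('message', (message) => {
      if (message.type === 'progress') onProgress(message.percent);
      else resolve(message.sorted);
    });
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

const progress = [];
const pending = sortInWorker([5, 3, 8, 1], (percent) => progress.push(percent));

// The main thread is free to do other work here while the worker sorts.
pending.then((sorted) => console.log(sorted, progress));
```

Keeping the promise in a variable, as pending does here, is what lets the main thread continue with other work and pick up the result later.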
While each app will have its own way to optimize for performance, there are some common starting points for Node.js apps.
You must instrument and measure the performance of your app before you start optimizing it so you can know exactly which functions or API/DB calls need to be optimized.
Optimizing blindly can worsen performance, which is why measuring with the performance hooks and APIs provided by Node is a good starting point.
To decide if your optimization works or not, you should have a handy way to measure the performance before and after.
This can be done by having two builds — one with and one without the changes, having a script that runs tests and measurements, and something that can give you a comparison. Having clear before-and-after performance values for changes can help you decide if the changes are worth it.
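A before-and-after comparison does not have to be elaborate. The sketch below times two implementations of the same task several times and compares the medians; medianMs, dedupeBefore, and dedupeAfter are hypothetical names, and the deduplication task is only a stand-in for whatever change is being evaluated.

```javascript
const { performance } = require('node:perf_hooks');

// Runs fn several times and returns the median duration in milliseconds.
// The median is less sensitive to one-off spikes than the mean.
function medianMs(fn, runs = 15) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in "before": quadratic deduplication using indexOf.
const dedupeBefore = (arr) => arr.filter((value, i) => arr.indexOf(value) === i);
// Stand-in "after": deduplication using a Set.
const dedupeAfter = (arr) => [...new Set(arr)];

// 2000 values with 500 distinct numbers.
const input = Array.from({ length: 2000 }, (_, i) => i % 500);

const beforeMs = medianMs(() => dedupeBefore(input));
const afterMs = medianMs(() => dedupeAfter(input));
console.log({ beforeMs, afterMs });
```

Checking that both versions return the same result belongs in the same script, since a faster but incorrect change is not an optimization.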
If your application uses a database and queries it frequently, you should consider creating an index on the queried parameters for improving the retrieval performance.
This will come at a cost of potentially increased storage size and possibly higher insert/update query times, so you should carefully measure the before/after in your use cases and decide if the trade-offs are good or not.
Another way to improve performance is by using some caching scheme in order to quickly respond to database or API queries. This can be used effectively if you can cache the API responses with query parameters and then use this cache to respond to later requests.
Note that caching is a double-edged sword. You need to carefully evaluate how long to keep a cache entry, on what basis to evict the entries, and when to invalidate the cache. Incorrectly doing this can not only worsen your performance but also risk sending incorrect data or leaked data across users.
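As an illustration of these trade-offs, here is a minimal in-memory cache with a time-to-live per entry. TtlCache and cacheKey are hypothetical names, not a library API, and the sketch leaves out size limits and explicit invalidation. The key includes the user ID so that one user's cached response is never served to another.

```javascript
// Hypothetical in-memory cache: each entry expires ttlMs after it was stored.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // `now` can be injected to make expiry easy to test.
  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get(key, now = Date.now()) {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (now >= hit.expiresAt) {
      this.entries.delete(key); // evict the stale entry
      return undefined;
    }
    return hit.value;
  }
}

// Builds a stable key from the user and the query parameters;
// parameters are sorted so their order does not matter.
function cacheKey(userId, params) {
  const pairs = Object.keys(params).sort().map((name) => `${name}=${params[name]}`);
  return `${userId}?${pairs.join('&')}`;
}

const cache = new TtlCache(1000);
const key = cacheKey('user-1', { size: 10, page: 2 });
cache.set(key, ['cached row'], 0);
console.log(cache.get(key, 500));  // within the TTL: returns the cached value
console.log(cache.get(key, 1000)); // TTL elapsed: returns undefined
```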
If you have ever taken a look at node_modules or checked the disk space taken by node_modules, you know how heavy dependencies can be in a Node project.
You need to be careful while adding new dependencies, as they can add a lot more transitive dependencies, and parsing all these can impact the startup performance of your app. You can try mitigating this through the following:
Check for dependencies listed in package.json that are not used in the app anymore and can be removed. This can be useful to shrink the number of dependencies and the build size of your package.
The Performance API provided by Node can help in determining not only which parts are slower but also how much time they take. You can further explore this data by exporting it as traces or metrics to something like Jaeger or Prometheus.
Remember that having a ton of data only makes it harder to explore, so a good strategy is to first measure only the timings of coarse events, such as function calls or even the end-to-end processing of requests, and then add progressively finer-grained measurements to the functions that take the most time.
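One way to move from coarse to fine measurements is a small wrapper around mark and measure. In the sketch below, measured is a hypothetical helper, the request handler and its steps are stand-ins, and the wrapper assumes that measurements with the same name do not overlap (concurrent requests would need unique mark names).

```javascript
const { performance } = require('node:perf_hooks');

const timings = [];

// Hypothetical wrapper: measures an async function with marks,
// records the duration, and clears the entries it created.
async function measured(name, fn) {
  const startMark = `${name}_start`;
  const endMark = `${name}_end`;
  performance.mark(startMark);
  try {
    return await fn();
  } finally {
    performance.mark(endMark);
    const entry = performance.measure(name, startMark, endMark);
    timings.push({ name, ms: entry.duration });
    performance.clearMarks(startMark);
    performance.clearMarks(endMark);
    performance.clearMeasures(name);
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Coarse measurement around the whole request, with finer
// measurements added to the steps inside it.
async function handleRequest() {
  return measured('request', async () => {
    const data = await measured('db', async () => {
      await sleep(5); // stand-in for a database call
      return [3, 1, 2];
    });
    return measured('sort', async () => data.slice().sort((a, b) => a - b));
  });
}

const finished = handleRequest().then((sorted) => {
  console.table(timings);
  return sorted;
});
```

Starting with only the request measurement and adding the inner ones once the request turns out to be slow keeps the amount of collected data manageable.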