Streams are one of the major features that most Node.js applications rely on, especially when handling HTTP requests, reading and writing files, and communicating over sockets. Streams are also predictable: we can always expect data, error, and end events when working with them.
This article will teach Node developers how to use streams to efficiently handle large amounts of data. This is a typical real-world challenge for Node developers: dealing with a data source that is too large to feasibly process all at once.
This article will cover the following topics:

- What Node.js streams are and when to use them
- Batching vs. streaming when handling large amounts of data
- Composing writable, readable, and transform streams
- Piping streams and the Pipeline API
- Handling stream errors
The following are the four main types of streams in Node.js:

- Readable streams: streams from which data can be read (for example, fs.createReadStream())
- Writable streams: streams to which data can be written (for example, fs.createWriteStream())
- Duplex streams: streams that are both readable and writable (for example, a TCP socket)
- Transform streams: duplex streams that can modify or transform data as it is written and read (for example, zlib.createGzip())
Streams come in handy when we are working with files that are too large to read into memory and process as a whole.
For example, Node.js streams are a good choice if you are working on a video conferencing or streaming application, where data must be transferred in smaller chunks to enable high-volume streaming while avoiding network latency.
Batching is a common pattern for data optimization in which data is collected in chunks, buffered in memory, and written to disk only once all of it has been received.
Let’s take a look at a typical batching process:
const fs = require("fs"); const https = require("https"); const url = "some file url"; https.get(url, (res) => { const chunks = []; res .on("data", (data) => chunks.push(data)) .on("end", () => fs.writeFile("file.txt", Buffer.concat(chunks), (err) => { err ? console.error(err) : console.log("saved successfully!"); }) ); });
Here, a chunk is pushed into an array each time the data event is triggered. Once the end event is triggered, indicating that we are done receiving data, we write everything to a file using the Buffer.concat and fs.writeFile methods.
The major downside of batching is its memory footprint: all of the data is held in memory before it is written to disk, which may not even be possible for very large files.
Writing data as we receive it is a more efficient approach to handling large files. This is where streams come in handy.
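For example, the download in the batching snippet above could instead be streamed straight to disk as each chunk arrives. Here is a minimal sketch using res.pipe() and fs.createWriteStream() (covered in more detail below), with the same placeholder URL:

const fs = require("fs");
const https = require("https");

const url = "some file url"; // placeholder, as in the batching example

https.get(url, (res) => {
  // Each chunk of the response is written to disk as soon as it arrives,
  // instead of being buffered in memory first
  res
    .pipe(fs.createWriteStream("file.txt"))
    .on("finish", () => console.log("saved successfully!"))
    .on("error", (err) => console.error(err));
});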
The Node.js fs module exposes some of the native Node Stream API, which can be used to compose streams.
We’ll be covering readable, writable, and transform streams. You can read our blog post about duplex streams in Node.js if you want to learn more about them.
const fs = require("fs"); const fileStream = fs.createWriteStream('./file.txt') for (let i = 0; i <= 20000; i++) { fileStream.write("Hello world welcome to Node.js\n" ); }
A writable stream is created using the createWriteStream() method, which requires the path of the file to write to as a parameter.
Running the above snippet will create a file named file.txt in your current directory containing 20,000 lines of Hello world welcome to Node.js.
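Note that the snippet above never signals that it is done writing. In practice, you would typically call end() on the writable stream once you are finished, which flushes any buffered data and emits a finish event; a minimal variation of the same example:

const fs = require("fs");
const fileStream = fs.createWriteStream("./file.txt");

// "finish" fires once end() has been called and all buffered data is flushed
fileStream.on("finish", () => console.log("Finished writing to file.txt"));

for (let i = 0; i < 20000; i++) {
  fileStream.write("Hello world welcome to Node.js\n");
}

fileStream.end(); // signal that no more data will be written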
const fs = require("fs"); const fileStream = fs.createReadStream("./file.txt"); fileStream .on("data", (data) => { console.log("Read data:", data.toString()); }) .on("end", () => { console.log("No more data."); });
Here, the data event handler will execute each time a chunk of data has been read, while the end event handler will execute once there is no more data.
Running the above snippet will log the contents of ./file.txt (20,000 lines of the Hello world welcome to Node.js string) to the console, one chunk at a time.
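As an alternative to event handlers, readable streams in modern Node.js (v10 and later) are also async iterables, so the same file can be consumed with for await...of; a minimal sketch (the readFile wrapper name is arbitrary):

const fs = require("fs");

async function readFile() {
  const fileStream = fs.createReadStream("./file.txt");

  // Each iteration yields one chunk (a Buffer by default)
  for await (const chunk of fileStream) {
    console.log("Read data:", chunk.toString());
  }
  console.log("No more data.");
}

readFile().catch(console.error);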
Transform streams have both readable and writable features. They allow input data to be processed and then output in the transformed format.
To create a transform stream, we need to import the Transform class from the Node.js stream module. The transform stream constructor accepts a function containing the data processing/transformation logic:
const fs = require("fs"); const { Transform } = require("stream"); const fileStream= fs.createReadStream("./file.txt"); const transformedData= fs.createWriteStream("./transformedData.txt"); const uppercase = new Transform({ transform(chunk, encoding, callback) { callback(null, chunk.toString().toUpperCase()); }, }); fileStream.pipe(uppercase).pipe(transformedData);
Here, we create a new transform stream containing a function that expects three arguments: the first is the chunk of data; the second is the encoding (which comes in handy if the chunk is a string); and the third is a callback, which gets called with the transformed result.
Running the above snippet will transform all the text in ./file.txt to uppercase and write it to transformedData.txt. If we run this script and open the resulting file, we'll see that all of the text has been transformed to uppercase.
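The same transform could also be written as a class that extends Transform and implements the _transform() method; a minimal sketch (the UppercaseTransform name is arbitrary):

const { Transform } = require("stream");

// Equivalent to the uppercase transform above, written as a subclass
class UppercaseTransform extends Transform {
  _transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
}

// Usage: fileStream.pipe(new UppercaseTransform()).pipe(transformedData);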
Piping streams is a vital technique used to connect multiple streams together. It comes in handy when we need to break down complex processing into smaller tasks and execute them sequentially. Node.js provides a native pipe method for this purpose:
fileStream.pipe(uppercase).pipe(transformedData);
Refer to the code snippet under Composing transform streams for more detail on the above snippet.
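To illustrate a longer chain, here is a sketch that pipes file.txt through the uppercase transform from the previous section and then through gzip compression before writing it to disk (the output file name is arbitrary):

const fs = require("fs");
const zlib = require("zlib");
const { Transform } = require("stream");

const uppercase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

// read -> uppercase -> gzip -> write, each stage handled by a separate stream
fs.createReadStream("./file.txt")
  .pipe(uppercase)
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream("./file.txt.gz"));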
Node 10 introduced the Pipeline API to enhance error handling with Node.js streams. The pipeline method accepts any number of streams followed by a callback function that handles any errors in our pipeline and will be executed once the pipeline has been completed:
pipeline(...streams, callback)

const fs = require("fs");
const { pipeline, Transform } = require("stream");

pipeline(
  streamA,
  streamB,
  streamC,
  (err) => {
    if (err) {
      console.error("An error occurred in the pipeline.", err);
    } else {
      console.log("Pipeline execution successful");
    }
  }
);
When using pipeline, the series of streams should be passed sequentially in the order in which they need to be executed.
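For example, the earlier uppercase example could be rewritten with pipeline so that errors from all three streams are handled in a single callback; a minimal sketch:

const fs = require("fs");
const { pipeline, Transform } = require("stream");

const uppercase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

pipeline(
  fs.createReadStream("./file.txt"),
  uppercase,
  fs.createWriteStream("./transformedData.txt"),
  (err) => {
    if (err) {
      console.error("An error occurred in the pipeline.", err);
    } else {
      console.log("Pipeline execution successful");
    }
  }
);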
We can also handle stream errors using pipes as follows:
const fs = require("fs"); const fileStream= fs.createReadStream("./file.txt"); let b = otherStreamType() let c = createWriteStream() fileStream.on('error', function(e){handleError(e)}) .pipe(b) .on('error', function(e){handleError(e)}) .pipe(c) .on('error', function(e){handleError(e)});
As seen in the above snippet, we have to create an error event handler for each pipe created. With this, we can keep track of the context for errors, which becomes useful when debugging. The drawback of this technique is its verbosity.
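As an alternative to attaching an error handler to every pipe, newer Node.js versions (v15 and later) also expose a promise-based pipeline in the stream/promises module, which lets the same error handling be expressed with async/await; a minimal sketch reusing the gzip example (the run wrapper name is arbitrary):

const fs = require("fs");
const zlib = require("zlib");
const { pipeline } = require("stream/promises");

async function run() {
  try {
    // An error from any stage rejects the promise
    await pipeline(
      fs.createReadStream("./file.txt"),
      zlib.createGzip(),
      fs.createWriteStream("./file.txt.gz")
    );
    console.log("Pipeline execution successful");
  } catch (err) {
    console.error("An error occurred in the pipeline.", err);
  }
}

run();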
In this article, we’ve explored Node.js streams, when to use them, and how to implement them.
Knowledge of Node.js streams is essential because they are a great tool to rely on when handling large sets of data. Check out the Node.js API docs for more information about streams.