Node.js Readable Streams Explained – Simple Explanation For Beginners
Node.js streams can be a challenge for even the most experienced developers. While working with them in the development of TailFile, I found that the key to understanding streams is recognizing that there are many different components at play. The Node.js documentation on streams is extensive, but it can be difficult to locate all of the important details in one place.
Additionally, streams are very stateful, so how they function can at times depend on the mode they're in. Hopefully, in this article, I can help clarify some of the confusion surrounding streams, focusing specifically on implementing read streams. It's important to note that writable streams and filesystem streams may have different implementations for similar concepts.
What’s a Stream Implementation?
A readable implementation is a piece of code that extends `Readable`, which is the Node.js base class for reading streams. It can also be a simple call to the `new Readable()` constructor, if you want a custom stream without defining your own class. I'm sure plenty of you have used streams from the likes of HTTP `res` handlers to `fs.createReadStream` file streams. An implementation, however, needs to respect the rules for streams, namely that certain functions are overridden when the system calls them for stream flow situations. Let's talk about what some of this looks like.
```js
const {Readable} = require('stream')

// This data can also come from other streams :]
let dataToStream = [
  'This is line 1\n'
  , 'This is line 2\n'
  , 'This is line 3\n'
]

class MyReadable extends Readable {
  constructor(opts) {
    super(opts)
  }

  _read() {
    // The consumer is ready for more data
    this.push(dataToStream.shift())
    if (!dataToStream.length) {
      this.push(null) // End the stream
    }
  }

  _destroy() {
    // Not necessary, but illustrates things to do on end
    dataToStream = null
  }
}

new MyReadable().pipe(process.stdout)
```
The takeaways:
- Of course, call `super(opts)` or nothing will work. `_read` is required and is called automatically when new data is wanted.
- Calling `push(<some data>)` will cause the data to go into an internal buffer, and it will be consumed when something, like a piped writable stream, wants it.
- `push(null)` is required to end the read stream properly.
  - An `'end'` event will be emitted after this.
  - A `'close'` event will also be emitted unless `emitClose: false` was set in the constructor.
- `_destroy` is optional for cleanup tasks. Never override `destroy`; always use the underscored method, both for it and for `_read`.
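To see those events in action, here's a minimal consumer for the `MyReadable` class above. The handler bodies are just for illustration; in modern Node versions, where streams auto-destroy after ending, the `'close'` handler fires last.

```js
const readable = new MyReadable()

readable.on('data', (chunk) => {
  process.stdout.write(chunk) // Flowing mode: chunks arrive as they're pushed
})

readable.on('end', () => {
  console.log('end: all data consumed') // Fired after push(null) drains
})

readable.on('close', () => {
  console.log('close: stream destroyed') // Skipped if emitClose: false was set
})
```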
For such a simple implementation, there's no need for the class. A class is more appropriate for things that are more complicated in terms of their underlying data resources, such as TailFile. This particular example can also be accomplished by constructing a `Readable` inline:
```js
const {Readable} = require('stream')

// This data can also come from other streams :]
let dataToStream = [
  'This is line 1\n'
  , 'This is line 2\n'
  , 'This is line 3\n'
]

const myReadable = new Readable({
  read() {
    this.push(dataToStream.shift())
    if (!dataToStream.length) {
      this.push(null) // End the stream
    }
  }
  , destroy() {
    dataToStream = null
  }
})

myReadable.pipe(process.stdout)
```
However, there's one major problem with this code. If the data set were larger, coming from a file stream, for example, then this code would be repeating a widespread mistake with Node streams:
This doesn’t respect backpressure.
What’s Backpressure?
Remember the internal buffer that I mentioned above? This is an in-memory data structure that holds the streaming chunks of data (objects, strings, or buffers). Its size is controlled by the `highWaterMark` property, and the default is 16KB of byte data, or 16 objects if the stream is in object mode. When data is pushed through the readable stream, the `push` method may return `false`. If so, that means that the `highWaterMark` is close to, or has been, exceeded, and that is called backpressure.
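A minimal sketch to make that visible; the tiny `highWaterMark` and the 1-byte chunks are arbitrary choices so that backpressure triggers quickly:

```js
const {Readable} = require('stream')

// No consumer is attached, so pushed data just accumulates
// in the internal buffer.
const readable = new Readable({
  highWaterMark: 4 // 4 bytes, instead of the 16KB default
  , read() {}
})

let count = 0
while (readable.push('x')) { // Returns false once the buffer reaches highWaterMark
  count++
}
console.log(`Backpressure after ${count + 1} pushes`) // 4 with these numbers
```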
When `push` returns `false`, it's up to the implementation to stop pushing data and wait for the next `_read` call, which signifies that the consumer is ready for more data and that `push` calls can resume. This is where a lot of folks fail to implement streams properly. Here are a couple of tips about pushing data through read streams:
- It's not necessary to wait for `_read` to be called to push data as long as backpressure is respected. Data can continually be pushed until backpressure is reached. If the data size isn't very large, it's possible that backpressure will never be reached.
- The data from the buffer will not be consumed until the stream is in reading mode. If data is being pushed but there are no `'data'` events and no `pipe`, then backpressure will certainly be reached if the data size exceeds the default buffer size (see the sketch after this list).
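That second tip is easy to demonstrate. In this sketch, a pushed chunk sits in the internal buffer until a `'data'` listener switches the stream into flowing mode; the one-second delay is just to make the ordering obvious:

```js
const {Readable} = require('stream')

const readable = new Readable({read() {}})

readable.push('buffered for now\n') // Sits in the internal buffer

// Nothing is consumed yet. Attaching a 'data' listener switches the
// stream into flowing mode, which drains the buffer.
setTimeout(() => {
  readable.on('data', (chunk) => process.stdout.write(chunk))
  readable.push(null) // End the stream
}, 1000)
```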
This is an excerpt from TailFile, which reads chunks from the underlying resource until backpressure is reached or all the data is read. Upon backpressure, the stream is stored, and reading resumes when `_read` is called:
```js
async _readChunks(stream) {
  for await (const chunk of stream) {
    this[kStartPos] += chunk.length
    if (!this.push(chunk)) {
      this[kStream] = stream
      this[kPollTimer] = null
      return
    }
  }
  // Chunks read successfully (no backpressure)
  return
}

_read() {
  if (this[kStream]) {
    this._readChunks(this[kStream])
  }
  return
}
```
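Since that excerpt depends on TailFile internals like `kStream` and `kStartPos`, here's a self-contained sketch of the same pattern. The async generator is a hypothetical stand-in for the real underlying resource; any async iterable of chunks would do:

```js
const {Readable} = require('stream')

// Hypothetical data source standing in for a real resource like a file
async function* makeSource() {
  for (let i = 1; i <= 100000; i++) {
    yield `This is line ${i}\n`
  }
}

class ChunkReader extends Readable {
  constructor(opts) {
    super(opts)
    this.source = makeSource()
    this.reading = false
  }

  async _readChunks() {
    for await (const chunk of this.source) {
      if (!this.push(chunk)) {
        this.reading = false // Backpressure: wait for the next _read
        return
      }
    }
    this.push(null) // Source exhausted; end the stream
  }

  _read() {
    if (this.reading) return // A read loop is already in flight
    this.reading = true
    this._readChunks()
  }
}

new ChunkReader().pipe(process.stdout)
```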
Streams-powered Node APIs
Due to their advantages, many Node.js core modules provide native stream handling capabilities, most notably:
- `net.Socket` is the main Node API that streams are based on, and it underlies most of the following APIs
- `process.stdin` returns a stream connected to stdin
- `process.stdout` returns a stream connected to stdout
- `process.stderr` returns a stream connected to stderr
- `fs.createReadStream()` creates a readable stream to a file
- `fs.createWriteStream()` creates a writable stream to a file
- `net.connect()` initiates a stream-based connection
- `http.request()` returns an instance of the `http.ClientRequest` class, which is a writable stream
- `zlib.createGzip()` compresses data into a stream using gzip (a compression algorithm)
- `zlib.createGunzip()` decompresses a gzip stream
- `zlib.createDeflate()` compresses data into a stream using deflate (a compression algorithm)
- `zlib.createInflate()` decompresses a deflate stream
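Several of these compose naturally. As a quick sketch, here's a file being gzipped with `stream.pipeline`, which wires the streams together and handles backpressure and error propagation for you; the file paths are placeholders:

```js
const fs = require('fs')
const zlib = require('zlib')
const {pipeline} = require('stream')

pipeline(
  fs.createReadStream('input.txt') // Readable: the source file
  , zlib.createGzip() // Transform: compresses as data flows through
  , fs.createWriteStream('input.txt.gz') // Writable: the destination
  , (err) => {
    if (err) {
      console.error('Pipeline failed:', err)
    } else {
      console.log('Pipeline succeeded')
    }
  }
)
```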
Closing thoughts
There's much more to understand about Node.js streams, especially when it comes to writing streams, but the core concepts remain the same. Since the information about streams is scattered, it can be challenging to consolidate it all in one place.
As I write this, I cannot find the place where I learned that `push` can be called continuously, but trust me, it's a thing, even though the backpressure doc always recommends waiting for `_read`. The fact is, depending on what you're trying to implement, the code becomes less clear-cut, but as long as backpressure rules are followed and methods are overridden as required, then you're on the right track!
About InApps
InApps is an outsourcing and development company located in Vietnam. At InApps, we currently provide services in DevOps, Offshore Software Development, MVP App Development, and Custom App Development.
We've established partnerships with more than 60 businesses, from the US and the UK to Europe and Hong Kong.
But more than being a partner, we aim to become the business's companion. That's why we continue to publish helpful articles for those in need.
Let’s create the next big thing together!
Coming together is a beginning. Keeping together is progress. Working together is success.