Node.js Streams: Everything You Need to Know
by Samer Buna
Update: This article is now part of my book “Node.js Beyond The Basics”.
Read the updated version of this content and more about Node at jscomplete.com/node-beyond-basics.
Node.js streams have a reputation for being hard to work with, and even harder to understand. Well, I’ve got good news for you — that’s no longer the case.
Over the years, developers created lots of packages out there with the sole purpose of making working with streams easier. But in this article, I’m going to focus on the native Node.js stream API.
“Streams are Node’s best and most misunderstood idea.”
Streams are collections of data — just like arrays or strings. The difference is that streams might not be available all at once, and they don’t have to fit in memory. This makes streams really powerful when working with large amounts of data, or data that’s coming from an external source one chunk at a time.
However, streams are not only about working with big data. They also give us the power of composability in our code. Just like we can compose powerful Linux commands by piping other smaller Linux commands, we can do exactly the same in Node with streams.
const grep = ... // A stream for the grep output
const wc = ... // A stream for the wc input
grep.pipe(wc)
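To make that composability concrete, here’s a minimal sketch (the pattern and file name are hypothetical) that spawns actual grep and wc child processes and pipes their stdio streams together, just like a shell pipeline would:

const { spawn } = require('child_process');

// The Node equivalent of the shell pipeline: grep pattern file.txt | wc -l
const grep = spawn('grep', ['pattern', 'file.txt']);
const wc = spawn('wc', ['-l']);

grep.stdout.pipe(wc.stdin); // grep's readable output feeds wc's writable input
wc.stdout.pipe(process.stdout); // print the final line count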
Many of the built-in modules in Node implement the streaming interface:
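Readable streams: HTTP responses (on the client), HTTP requests (on the server), fs read streams, zlib streams, crypto streams, TCP sockets, child process stdout and stderr, process.stdin
Writable streams: HTTP requests (on the client), HTTP responses (on the server), fs write streams, zlib streams, crypto streams, TCP sockets, child process stdin, process.stdout and process.stderr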
The list above has some examples of native Node.js objects that are also readable or writable streams. Some of these objects are both readable and writable streams, like TCP sockets, zlib streams, and crypto streams.
Notice that the objects are also closely related. While an HTTP response is a readable stream on the client, it’s a writable stream on the server. This is because in the HTTP case, we basically read from one object (http.IncomingMessage) and write to the other (http.ServerResponse).
Also note how the stdio streams (stdin, stdout, stderr) have the inverse stream types when it comes to child processes. This allows for a really easy way to pipe to and from these streams from the main process stdio streams.
Theory is great, but often not 100% convincing. Let’s see an example demonstrating the difference streams can make in code when it comes to memory consumption.
Let’s create a big file first:
const fs = require('fs');
const file = fs.createWriteStream('./big.file');
for(let i=0; i<= 1e6; i++) {
file.write('Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n');
}
file.end();
Look what I used to create that big file. A writable stream!
The fs module can be used to read from and write to files using a stream interface. In the example above, we’re writing 1 million lines to that big.file through a writable stream in a loop.
Running the script above generates a file that’s about 400 MB.
Here’s a simple Node web server designed to exclusively serve the big.file:
const fs = require('fs');
const server = require('http').createServer();
server.on('request', (req, res) => {
fs.readFile('./big.file', (err, data) => {
if (err) throw err;
res.end(data);
});
});
server.listen(8000);
When the server gets a request, it’ll serve the big file using the asynchronous method, fs.readFile. But hey, it’s not like we’re blocking the event loop or anything. Everything is great, right? Right?
Well, let’s see what happens when we run the server, connect to it, and monitor the memory while doing so.
When I ran the server, it started out with a normal amount of memory, 8.7 MB:
Then I connected to the server. Note what happened to the memory consumed:
Wow — the memory consumption jumped to 434.8 MB.
We basically put the whole big.file content in memory before we wrote it out to the response object. This is very inefficient.
The HTTP response object (res in the code above) is also a writable stream. This means if we have a readable stream that represents the content of big.file, we can just pipe those two on each other and achieve mostly the same result without consuming ~400 MB of memory.
Node’s fs module can give us a readable stream for any file using the createReadStream method. We can pipe that to the response object:
const fs = require('fs');
const server = require('http').createServer();
server.on('request', (req, res) => {
const src = fs.createReadStream('./big.file');
src.pipe(res);
});
server.listen(8000);
Now when you connect to this server, a magical thing happens (look at the memory consumption):
What’s happening?
When a client asks for that big file, we stream it one chunk at a time, which means we don’t buffer it in memory at all. The memory usage grew by about 25 MB and that’s it.
You can push this example to its limits. Regenerate the big.file with five million lines instead of just one million, which would take the file to well over 2 GB, and that’s actually bigger than the default buffer limit in Node.
If you try to serve that file using fs.readFile, you simply can’t, by default (you can change the limits). But with fs.createReadStream, there is no problem at all streaming 2 GB of data to the requester, and best of all, the process memory usage will roughly be the same.
Ready to learn streams now?
This article is a write-up of part of my Pluralsight course about Node.js. I cover similar content in video format there.
There are four fundamental stream types in Node.js: Readable, Writable, Duplex, and Transform streams.
A readable stream is an abstraction for a source from which data can be consumed. An example of that is the fs.createReadStream method.
A writable stream is an abstraction for a destination to which data can be written. An example of that is the fs.createWriteStream method.
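A duplex stream is both readable and writable. An example of that is a TCP socket.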
A transform stream is basically a duplex stream that can be used to modify or transform the data as it is written and read. An example of that is the zlib.createGzip stream to compress the data using gzip. You can think of a transform stream as a function where the input is the writable stream part and the output is the readable stream part. You might also hear transform streams referred to as “through streams.”
All streams are instances of EventEmitter. They emit events that can be used to read and write data. However, we can consume stream data in a simpler way using the pipe method.
Here’s the magic line that you need to remember:
readableSrc.pipe(writableDest)
In this simple line, we’re piping the output of a readable stream (the source of data) as the input of a writable stream (the destination). The source has to be a readable stream and the destination has to be a writable one. Of course, they can both be duplex/transform streams as well. In fact, if we’re piping into a duplex stream, we can chain pipe calls just like we do in Linux:
readableSrc
.pipe(transformStream1)
.pipe(transformStream2)
.pipe(finalWritableDest)
The pipe method returns the destination stream, which enabled us to do the chaining above. For streams a (readable), b and c (duplex), and d (writable), we can:
a.pipe(b).pipe(c).pipe(d)
# Which is equivalent to:
a.pipe(b)
b.pipe(c)
c.pipe(d)
# Which, in Linux, is equivalent to:
$ a | b | c | d
The pipe method is the easiest way to consume streams. It’s generally recommended to either use the pipe method or consume streams with events, but avoid mixing these two. Usually when you’re using the pipe method you don’t need to use events, but if you need to consume the streams in more custom ways, events would be the way to go.
Besides reading from a readable stream source and writing to a writable destination, the pipe method automatically manages a few things along the way. For example, it handles errors, end-of-files, and the cases when one stream is slower or faster than the other.
However, streams can also be consumed with events directly. Here’s the simplified event-equivalent code of what the pipe method mainly does to read and write data:
# readable.pipe(writable)
readable.on('data', (chunk) => {
writable.write(chunk);
});
readable.on('end', () => {
writable.end();
});
Here’s a list of the important events and functions that can be used with readable and writable streams:
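For readable streams, the important events include data, end, error, and close, and the important functions include pipe()/unpipe(), read(), unshift(), resume(), and pause(). For writable streams, the important events include drain, finish, error, and close, and the important functions include write() and end().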
The events and functions are related because they are usually used together.
The most important events on a readable stream are:
The data event, which is emitted whenever the stream passes a chunk of data to the consumer.
The end event, which is emitted when there is no more data to be consumed from the stream.
The most important events on a writable stream are:
The drain event, which is a signal that the writable stream can receive more data.
The finish event, which is emitted when all data has been flushed to the underlying system.
Events and functions can be combined to make for a custom and optimized use of streams. To consume a readable stream, we can use the pipe/unpipe methods, or the read/unshift/resume methods. To consume a writable stream, we can make it the destination of pipe/unpipe, or just write to it with the write method and call the end method when we’re done.
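For example, here’s a minimal sketch of the write/drain/end approach (the destination file name and line count are arbitrary):

const fs = require('fs');

const dest = fs.createWriteStream('./output.txt');

function writeLines(stream, count) {
  let i = 0;
  function writeSome() {
    let ok = true;
    while (i < count && ok) {
      // write() returns false when the stream's internal buffer is full
      ok = stream.write(`line ${i++}\n`);
    }
    if (i < count) {
      // Wait for the drain event before writing more
      stream.once('drain', writeSome);
    } else {
      stream.end(); // Signal that we're done writing
    }
  }
  writeSome();
}

writeLines(dest, 1e6);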
Readable streams have two main modes that affect the way we can consume them:
They can be either in the paused mode
Or in the flowing mode
Those modes are sometimes referred to as pull and push modes.
All readable streams start in the paused mode by default but they can be easily switched to flowing and back to paused when needed. Sometimes, the switching happens automatically.
When a readable stream is in the paused mode, we can use the read() method to read from the stream on demand. However, for a readable stream in the flowing mode, the data is continuously flowing and we have to listen to events to consume it.
In the flowing mode, data can actually be lost if no consumers are available to handle it. This is why, when we have a readable stream in flowing mode, we need a data event handler. In fact, just adding a data event handler switches a paused stream into flowing mode and removing the data event handler switches the stream back to paused mode. Some of this is done for backward compatibility with the older Node streams interface.
To manually switch between these two stream modes, you can use the resume() and pause() methods.
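As a quick sketch (using process.stdin as an arbitrary readable source):

// Attaching a 'data' handler switches the stream into flowing mode
process.stdin.on('data', (chunk) => {
  console.log(`Received ${chunk.length} bytes`);
  process.stdin.pause(); // switch back to paused mode

  // Resume flowing after one second
  setTimeout(() => process.stdin.resume(), 1000);
});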
When consuming readable streams using the pipe method, we don’t have to worry about these modes as pipe manages them automatically.
When we talk about streams in Node.js, there are two main different tasks:
The task of implementing the streams.
The task of consuming them.
So far we’ve been talking about only consuming streams. Let’s implement some!
Stream implementers are usually the ones who require the stream module.
To implement a writable stream, we need to use the Writable constructor from the stream module.
const { Writable } = require('stream');
We can implement a writable stream in many ways. We can, for example, extend the Writable constructor if we want:
class myWritableStream extends Writable {
}
However, I prefer the simpler constructor approach. We just create an object from the Writable constructor and pass it a number of options. The only required option is a write function which exposes the chunk of data to be written.
const { Writable } = require('stream');
const outStream = new Writable({
write(chunk, encoding, callback) {
console.log(chunk.toString());
callback();
}
});
process.stdin.pipe(outStream);
This write method takes three arguments.
The chunk is usually a buffer unless we configure the stream differently.
The encoding argument is needed in that case, but usually we can ignore it.
The callback is a function that we need to call after we’re done processing the data chunk. It’s what signals whether the write was successful or not. To signal a failure, call the callback with an error object.
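For example, here’s a sketch of a writable stream that signals a failure under a hypothetical validation rule (rejecting any chunk that contains the letter x):

const { Writable } = require('stream');

const pickyStream = new Writable({
  write(chunk, encoding, callback) {
    if (chunk.toString().includes('x')) {
      // Signal a failed write by passing an error object to the callback
      return callback(new Error('Chunks containing "x" are not allowed'));
    }
    console.log(chunk.toString());
    callback(); // Calling back with no error signals a successful write
  }
});

process.stdin.pipe(pickyStream);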
In outStream, we simply console.log the chunk as a string and call the callback after that without an error to indicate success. This is a very simple and probably not so useful echo stream. It will echo back anything it receives.
To consume this stream, we can simply use it with process.stdin, which is a readable stream, so we can just pipe process.stdin into our outStream.
When we run the code above, anything we type into process.stdin will be echoed back using the outStream console.log line.
This is not a very useful stream to implement because it’s actually already implemented and built-in. This is very much equivalent to process.stdout. We can just pipe stdin into stdout and we’ll get the exact same echo feature with this single line:
process.stdin.pipe(process.stdout);
To implement a readable stream, we require the Readable interface, construct an object from it, and implement a read() method in the stream’s configuration parameter:
const { Readable } = require('stream');
const inStream = new Readable({
read() {}
});
There is a simple way to implement readable streams. We can just directly push the data that we want the consumers to consume.
const { Readable } = require('stream');
const inStream = new Readable({
read() {}
});
inStream.push('ABCDEFGHIJKLM');
inStream.push('NOPQRSTUVWXYZ');
inStream.push(null); // No more data
inStream.pipe(process.stdout);
When we push a null object, that means we want to signal that the stream does not have any more data.
To consume this simple readable stream, we can simply pipe it into the writable stream process.stdout.
When we run the code above, we’ll be reading all the data from inStream and echoing it to the standard out. Very simple, but also not very efficient.
We’re basically pushing all the data in the stream before piping it to process.stdout. The much better way is to push data on demand, when a consumer asks for it. We can do that by implementing the read() method in the configuration object:
const inStream = new Readable({
read(size) {
// there is a demand on the data... Someone wants to read it.
}
});
When the read method is called on a readable stream, the implementation can push partial data to the queue. For example, we can push one letter at a time, starting with character code 65 (which represents A), and incrementing that on every push:
const inStream = new Readable({
read(size) {
this.push(String.fromCharCode(this.currentCharCode++));
if (this.currentCharCode > 90) {
this.push(null);
}
}
});
inStream.currentCharCode = 65;
inStream.pipe(process.stdout);
While the consumer is reading a readable stream, the read method will continue to fire, and we’ll push more letters. We need to stop this cycle somewhere, and that’s why we use an if statement to push null when currentCharCode is greater than 90 (which represents Z).
This code is equivalent to the simpler one we started with, but now we’re pushing data on demand when the consumer asks for it. You should always do that.
With Duplex streams, we can implement both readable and writable streams with the same object. It’s as if we inherit from both interfaces.
Here’s an example duplex stream that combines the two writable and readable examples implemented above:
const { Duplex } = require('stream');
const inoutStream = new Duplex({
write(chunk, encoding, callback) {
console.log(chunk.toString());
callback();
},
read(size) {
this.push(String.fromCharCode(this.currentCharCode++));
if (this.currentCharCode > 90) {
this.push(null);
}
}
});
inoutStream.currentCharCode = 65;
process.stdin.pipe(inoutStream).pipe(process.stdout);
By combining the methods, we can use this duplex stream to read the letters from A to Z and we can also use it for its echo feature. We pipe the readable stdin stream into this duplex stream to use the echo feature and we pipe the duplex stream itself into the writable stdout stream to see the letters A through Z.
It’s important to understand that the readable and writable sides of a duplex stream operate completely independently from one another. This is merely a grouping of two features into an object.
A transform stream is the more interesting duplex stream because its output is computed from its input.
For a transform stream, we don’t have to implement the read or write methods; we only need to implement a transform method, which combines both of them. It has the signature of the write method and we can use it to push data as well.
Here’s a simple transform stream which echoes back anything you type into it after transforming it to upper case format:
const { Transform } = require('stream');
const upperCaseTr = new Transform({
transform(chunk, encoding, callback) {
this.push(chunk.toString().toUpperCase());
callback();
}
});
process.stdin.pipe(upperCaseTr).pipe(process.stdout);
In this transform stream, which we’re consuming exactly like the previous duplex stream example, we only implemented a transform() method. In that method, we convert the chunk into its upper case version and then push that version as the readable part.
By default, streams expect Buffer/String values. There is an objectMode flag that we can set to have the stream accept any JavaScript object.
Here’s a simple example to demonstrate that. The following combination of transform streams makes for a feature to map a string of comma-separated values into a JavaScript object. So “a,b,c,d” becomes {a: b, c: d}.
const { Transform } = require('stream');
const commaSplitter = new Transform({
readableObjectMode: true,
transform(chunk, encoding, callback) {
this.push(chunk.toString().trim().split(','));
callback();
}
});
const arrayToObject = new Transform({
readableObjectMode: true,
writableObjectMode: true,
transform(chunk, encoding, callback) {
const obj = {};
for(let i=0; i < chunk.length; i+=2) {
obj[chunk[i]] = chunk[i+1];
}
this.push(obj);
callback();
}
});
const objectToString = new Transform({
writableObjectMode: true,
transform(chunk, encoding, callback) {
this.push(JSON.stringify(chunk) + '\n');
callback();
}
});
process.stdin
.pipe(commaSplitter)
.pipe(arrayToObject)
.pipe(objectToString)
.pipe(process.stdout)
We pass the input string (for example, “a,b,c,d”) through commaSplitter which pushes an array as its readable data ([“a”, “b”, “c”, “d”]). Adding the readableObjectMode flag on that stream is necessary because we’re pushing an object there, not a string.
We then take the array and pipe it into the arrayToObject stream. We need a writableObjectMode flag to make that stream accept an object. It’ll also push an object (the input array mapped into an object) and that’s why we also needed the readableObjectMode flag there as well. The last objectToString stream accepts an object but pushes out a string, and that’s why we only needed a writableObjectMode flag there. The readable part is a normal string (the stringified object).
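For example, if you run this combination and type a,b,c,d into the terminal, the program prints {"a":"b","c":"d"}.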
Node has a few very useful built-in transform streams. Namely, the zlib and crypto streams.
Here’s an example that uses the zlib.createGzip() stream combined with the fs readable/writable streams to create a file-compression script:
const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];
fs.createReadStream(file)
.pipe(zlib.createGzip())
.pipe(fs.createWriteStream(file + '.gz'));
You can use this script to gzip any file you pass as the argument. We’re piping a readable stream for that file into the zlib built-in transform stream and then into a writable stream for the new gzipped file. Simple.
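For example, assuming the script above is saved as zip.js (a hypothetical name), running node zip.js big.file would produce big.file.gz next to the original.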
The cool thing about using pipes is that we can actually combine them with events if we need to. Say, for example, I want the user to see a progress indicator while the script is working and a “Done” message when the script is done. Since the pipe method returns the destination stream, we can chain the registration of event handlers as well:
const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];
fs.createReadStream(file)
.pipe(zlib.createGzip())
.on('data', () => process.stdout.write('.'))
.pipe(fs.createWriteStream(file + '.zz'))
.on('finish', () => console.log('Done'));
So with the pipe method, we get to easily consume streams, but we can still further customize our interaction with those streams using events where needed.
What’s great about the pipe method though is that we can use it to compose our program piece by piece, in a very readable way. For example, instead of listening to the data event above, we can simply create a transform stream to report progress, and replace the .on() call with another .pipe() call:
const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];
const { Transform } = require('stream');
const reportProgress = new Transform({
transform(chunk, encoding, callback) {
process.stdout.write('.');
callback(null, chunk);
}
});
fs.createReadStream(file)
.pipe(zlib.createGzip())
.pipe(reportProgress)
.pipe(fs.createWriteStream(file + '.zz'))
.on('finish', () => console.log('Done'));
This reportProgress stream is a simple pass-through stream, but it reports the progress to standard out as well. Note how I used the second argument in the callback() function to push the data inside the transform() method. This is equivalent to pushing the data first.
The applications of combining streams are endless. For example, if we need to encrypt the file before or after we gzip it, all we need to do is pipe another transform stream in the exact order we need. We can use Node’s crypto module for that:
const crypto = require('crypto');
// ...
fs.createReadStream(file)
.pipe(zlib.createGzip())
.pipe(crypto.createCipher('aes192', 'a_secret'))
.pipe(reportProgress)
.pipe(fs.createWriteStream(file + '.zz'))
.on('finish', () => console.log('Done'));
The script above compresses and then encrypts the passed file and only those who have the secret can use the outputted file. We can’t unzip this file with the normal unzip utilities because it’s encrypted.
To actually be able to unzip anything zipped with the script above, we need to use the opposite streams for crypto and zlib in a reverse order, which is simple:
fs.createReadStream(file)
.pipe(crypto.createDecipher('aes192', 'a_secret'))
.pipe(zlib.createGunzip())
.pipe(reportProgress)
.pipe(fs.createWriteStream(file.slice(0, -3)))
.on('finish', () => console.log('Done'));
Assuming the passed file is the compressed version, the code above will create a read stream from that, pipe it into the crypto createDecipher() stream (using the same secret), pipe the output of that into the zlib createGunzip() stream, and then write things out back to a file without the extension part.
That’s all I have for this topic. Thanks for reading! Until next time!
Learning React or Node? Check out my books:
Source: https://www.freecodecamp.org/news/node-js-streams-everything-you-need-to-know-c9141306be93/