How to choose the best event source for pub/sub messaging with AWS Lambda

廉博赡
2023-12-01

by Yan Cui

AWS offers a wealth of options for implementing messaging patterns such as Publish/Subscribe (often shortened to pub/sub) with AWS Lambda. In this article, we’ll compare and contrast some of these options.

The pub/sub pattern

Pub/Sub is a messaging pattern where publishers and subscribers are decoupled through an intermediary message broker (ZeroMQ, RabbitMQ, SNS, and so on).

In the AWS ecosystem, the obvious candidate for the broker role is Simple Notification Service (SNS).

SNS will make three attempts to have your Lambda function process a message before sending it to a Dead Letter Queue (DLQ), provided a DLQ is specified for the function. However, according to an analysis by the folks at OpsGenie, the number of retries can be as many as six.

Another thing to consider is the degree of parallelism this setup offers. For each message, SNS will create a new invocation of your function. So if you publish 100 messages to SNS, you can have up to 100 concurrent executions of the subscribed Lambda function.
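
To make the one-invocation-per-message behaviour concrete, here is a minimal sketch of a Python handler subscribed to an SNS topic. The `Records[].Sns.Message` shape is what SNS delivers to Lambda; the JSON payload and the `order_id` field are assumptions made for illustration only.

```python
import json


def handler(event, context):
    """Process the SNS message(s) delivered to one Lambda invocation."""
    processed = []
    for record in event["Records"]:
        sns = record["Sns"]
        # "Message" is always a string; this sketch assumes publishers send JSON.
        payload = json.loads(sns["Message"])
        processed.append({"message_id": sns["MessageId"], "payload": payload})
    return processed


# Local usage with a hand-built event in the shape SNS delivers to Lambda.
sample_event = {
    "Records": [
        {
            "EventSource": "aws:sns",
            "Sns": {
                "MessageId": "example-id-1",
                "Message": json.dumps({"order_id": 42}),
            },
        }
    ]
}
result = handler(sample_event, None)
```

Publishing 100 messages would mean 100 separate calls to `handler`, which may run at the same time.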

This is great if you’re optimising for throughput.

However, we’re often constrained by the maximum throughput our downstream dependencies can handle: databases, S3, internal/external services, and so on.

If the burst in throughput is short, then there’s a good chance the retries will be sufficient (there’s a randomized, exponential backoff between retries too) and you won’t miss any messages.
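
As an illustration of what a randomized, exponential backoff looks like, here is a small sketch of the "full jitter" variant. This is not the schedule Lambda actually uses; `base_seconds`, `cap_seconds`, and the function name are hypothetical values chosen for the example.

```python
import random


def backoff_delays(max_retries, base_seconds=1.0, cap_seconds=60.0, rng=None):
    """Return one randomized delay per retry ("full jitter" exponential backoff)."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        # The upper bound doubles with every attempt, up to the cap.
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        # Picking a random point below the bound spreads retries out in time.
        delays.append(rng.uniform(0, ceiling))
    return delays


delays = backoff_delays(max_retries=2, rng=random.Random(7))
```

The randomization matters because it stops a burst of failed invocations from all retrying at the same instant and hitting the downstream system with a second burst.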

If the burst in throughput is sustained over a long period of time, then you can exhaust the maximum number of retries. At this point you’ll have to rely on the DLQ, and possibly human intervention, to recover the messages that couldn’t be processed the first time around.

Similarly, if the downstream dependency experiences an outage, then all messages received and retried during the outage are bound to fail.

You can also run into the Lambda limit on the number of concurrent executions in a region. Since this is an account-wide limit, it will also impact the other systems in the account that rely on AWS Lambda: APIs, event processing, cron jobs, and so on.

SNS is also prone to temporal issues, such as bursts in traffic and downstream outages. Kinesis, on the other hand, deals with these issues much better, as described below:

  • The degree of parallelism is constrained by the number of shards, which can be used to amortize bursts in the message rate
  • Records are retried until they succeed, unless the outage lasts longer than the retention policy you have on the stream (the default is 24 hours), so you will eventually be able to process the records
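
For comparison with SNS, here is a minimal sketch of a handler for a Kinesis event source. Lambda passes records in batches with the payload base64-encoded under `kinesis.data`; the JSON payload and its `user_id` field are assumptions for the example.

```python
import base64
import json


def handler(event, context):
    """Process one batch of Kinesis records."""
    results = []
    for record in event["Records"]:
        # Kinesis delivers the payload base64-encoded under kinesis.data.
        raw = base64.b64decode(record["kinesis"]["data"])
        payload = json.loads(raw)  # assumption: producers write JSON
        results.append((record["kinesis"]["sequenceNumber"], payload))
    # If an exception escapes the handler, the batch counts as failed
    # and is retried, as described in the list above.
    return results


# Local usage with a hand-built event in the shape Kinesis delivers to Lambda.
sample_event = {
    "Records": [
        {
            "kinesis": {
                "sequenceNumber": "1",
                "partitionKey": "user-1",
                "data": base64.b64encode(json.dumps({"user_id": "user-1"}).encode()).decode(),
            }
        }
    ]
}
result = handler(sample_event, None)
```

Because batches are read per shard, the number of shards caps how many of these handlers run at once, which is what lets the stream absorb a burst rather than pass it straight on.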

But Kinesis Streams is not without its own problems. In fact, from my experience using Kinesis Streams with Lambda, I have found a number of caveats that need to be understood in order to make effective use of the service.

You can read about these caveats in another article I wrote here.

Interestingly, Kinesis Streams is not the only streaming option available on AWS. There is also DynamoDB Streams.

By and large, DynamoDB Streams + Lambda works the same way as Kinesis Streams + Lambda. Operationally, it does have some interesting twists:

  • DynamoDB Streams auto scales the number of shards
  • If you’re processing DynamoDB Streams with AWS Lambda, then you don’t pay for the reads from DynamoDB Streams (but you still pay for the read and write capacity units for the DynamoDB table itself)
  • Kinesis Streams offers the option to extend data retention to 7 days (at the time of writing), but DynamoDB Streams doesn’t offer such an option

The fact that DynamoDB Streams auto scales the number of shards can be a double-edged sword. On one hand, it eliminates the need for you to manage and scale the stream (or come up with home-baked auto scaling solutions). But on the other hand, it can also diminish the ability to amortize spikes in the load you pass on to downstream systems.

As far as I know, there is no way to limit the number of shards a DynamoDB stream can scale up to, which is something you’d surely consider when implementing your own auto scaling solution.

I think the most pertinent question is, “what is your source of truth?”

Does a row being written in DynamoDB make it canon to the state of your system? This is certainly the case in most N-tier systems that are built around a database, regardless of whether it’s an RDBMS or a NoSQL database.

In an event-sourced system where state is modeled as a sequence of events (as opposed to a snapshot), the source of truth might well be the Kinesis stream. For example, as soon as an event is written to the stream, it’s considered canon to the state of the system.
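
To illustrate "state as a sequence of events", here is a tiny sketch where the current state is derived by folding over the events rather than read from a stored snapshot. The account-balance domain and the event names are hypothetical.

```python
from functools import reduce


def apply(balance, event):
    """Apply one event to the current state (here, an account balance)."""
    if event["type"] == "deposited":
        return balance + event["amount"]
    if event["type"] == "withdrawn":
        return balance - event["amount"]
    return balance  # ignore event types this sketch does not model


# The events are the source of truth; the balance is only ever derived from them.
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrawn", "amount": 30},
    {"type": "deposited", "amount": 5},
]
current_balance = reduce(apply, events, 0)
```

In a design like this, writing the event to the stream is the moment the change becomes real, and any database row is just a projection of it.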

Then there are other considerations around cost, auto scaling, and so on.

From a development point of view, DynamoDB streams also have some limitations and shortcomings:

  • Each stream is limited to events from one table
  • The records describe DynamoDB events and not events from your domain, which I’ve always felt creates a sense of dissonance when working with these events
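
The dissonance in the second point is easier to see in code. The sketch below translates DynamoDB Streams records (`INSERT`, `MODIFY`, `REMOVE`) into domain events. The `order_id` key, the `status` attribute, and the domain event names are hypothetical, and it assumes the stream is configured to include both old and new images.

```python
def to_domain_event(record):
    """Translate one DynamoDB Streams record into a hypothetical domain event."""
    change = record["dynamodb"]
    order_id = change["Keys"]["order_id"]["S"]  # attribute values are type-tagged
    if record["eventName"] == "INSERT":
        return {"type": "order_placed", "order_id": order_id}
    if record["eventName"] == "MODIFY":
        old = change["OldImage"]["status"]["S"]
        new = change["NewImage"]["status"]["S"]
        if old != new:
            return {"type": "order_status_changed", "order_id": order_id,
                    "from": old, "to": new}
        return None  # a change this sketch does not care about
    if record["eventName"] == "REMOVE":
        return {"type": "order_deleted", "order_id": order_id}
    return None


sample_record = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"order_id": {"S": "o-1"}},
        "OldImage": {"order_id": {"S": "o-1"}, "status": {"S": "placed"}},
        "NewImage": {"order_id": {"S": "o-1"}, "status": {"S": "shipped"}},
    },
}
domain_event = to_domain_event(sample_record)
```

Every consumer of the stream ends up needing a translation layer like this, because the record only says which row changed, not what happened in the business sense.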

Excluding the cost of Lambda invocations for processing the messages, here are some cost projections for using SNS vs Kinesis Streams vs DynamoDB Streams as the broker. I’m making the assumption that throughput is consistent, and that each message is 1KB in size.

[Figure: monthly cost at 1 msg/s]

[Figure: monthly cost at 1,000 msg/s]

These projections should not be taken at face value. For starters, the assumption about a perfectly consistent throughput and message size is unrealistic, and you’ll need some headroom with Kinesis and DynamoDB streams even if you’re not hitting the throttling limits.
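
To get a feel for the volumes behind these projections, and for why headroom matters, here is a rough sketch of the arithmetic. It assumes a 30-day month, 1KB meaning 1,024 bytes, and a per-shard write limit of 1,000 records per second and 1 MiB per second; `target_utilisation` is a hypothetical knob for headroom. No prices are included, so plug in the current ones for your region.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def messages_per_month(msgs_per_second):
    """Total messages in a month at a perfectly consistent rate."""
    return msgs_per_second * SECONDS_PER_MONTH


def shards_needed(msgs_per_second, msg_bytes=1024, target_utilisation=1.0,
                  shard_records_per_s=1000, shard_bytes_per_s=1024 * 1024):
    """Minimum shard count so neither per-shard write limit is exceeded."""
    by_count = msgs_per_second / (shard_records_per_s * target_utilisation)
    by_bytes = msgs_per_second * msg_bytes / (shard_bytes_per_s * target_utilisation)
    return max(1, math.ceil(max(by_count, by_bytes)))


low_volume = messages_per_month(1)        # 2,592,000 messages
high_volume = messages_per_month(1000)    # 2,592,000,000 messages
no_headroom = shards_needed(1000)         # one shard, running right at its limit
with_headroom = shards_needed(1000, target_utilisation=0.8)  # two shards
```

At 1,000 msg/s a single shard is running exactly at its record limit, so any realistic variation in throughput means provisioning a second one.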

That said, what these projections do tell me is that:

  1. You get an awful lot with each shard in Kinesis streams
  2. While there’s a baseline cost for using Kinesis streams, the cost goes down when usage scales up as compared to SNS and DynamoDB streams, thanks to the significantly lower cost per million requests

Whilst SNS, Kinesis, and DynamoDB streams are your basic choices for the broker, Lambda functions can also act as brokers in their own right and propagate events to other services.

This is the approach used by the aws-lambda-fanout project from awslabs. It allows you to propagate events from Kinesis and DynamoDB streams to other services that cannot directly subscribe to the three basic choices of broker (either because of account/region limitations or because they’re just not supported).

While it’s a nice idea and definitely meets some specific needs, it’s worth bearing in mind the extra complexities it introduces, such as handling partial failures, dealing with downstream outages, misconfigurations, and so on.
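
To show where the partial-failure complexity comes from, here is a deliberately simplified sketch of a fan-out step. It is not how aws-lambda-fanout is implemented; `fan_out`, the target names, and the stub targets are all hypothetical.

```python
def fan_out(records, targets):
    """Send every record to every target; return (target_name, record, error) failures."""
    failures = []
    for name, send in targets.items():
        for record in records:
            try:
                send(record)
            except Exception as error:
                # Keep going so that one broken target cannot block the others.
                failures.append((name, record, repr(error)))
    return failures


delivered = []


def flaky_target(record):
    """Stand-in for a downstream service that is currently unavailable."""
    raise ConnectionError("downstream unavailable")


failures = fan_out(
    ["event-1", "event-2"],
    {"ok": delivered.append, "flaky": flaky_target},
)
```

Once `failures` is non-empty you have to decide what to do with it: failing the whole batch would redeliver to the healthy target as well, while swallowing the errors would drop events for the flaky one.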

Conclusion

So what is the best event source for doing pub/sub with AWS Lambda? Like most tech decisions, it depends on the problem you’re trying to solve and the constraints you’re working with.

In this post, we looked at SNS, Kinesis Streams, and DynamoDB Streams as candidates for the broker role. We walked through a number of scenarios to see how the choice of event source affects scalability, parallelism, resilience against temporal issues, and cost.

You should now have a much better understanding of the tradeoffs between various event sources when working with Lambda.

Until next time!

Translated from: https://www.freecodecamp.org/news/how-to-choose-the-best-event-source-for-pub-sub-messaging-with-aws-lambda-31ca4db9be69/
