Question:

How to prevent connection timeouts when deleting SQS messages with Boto3

松亦
2023-03-14

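I have a series of AWS Lambda functions triggered by SQS queue events. Sometimes, however, when I try to delete a message from the queue, the attempt times out again and again until the Lambda itself times out.

I enabled debug logging and confirmed that this is a socket timeout, but I get no further detail beyond that. The failures also appear to be irregular. At first I thought it was a Lambda warm-up issue, but I have seen the problem both after the Lambda had run successfully many times and on the first run after a deployment.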

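What I have tried so far:

  • I thought that using the Boto client versus the Boto resource might be the problem, but I see the same result with both approaches.
  • I raised the connect and read timeouts above their defaults, but the connection is simply retried under the hood by Boto's retry logic.
  • I tried lowering the connect timeout, but that only means more retries before the Lambda times out.
  • I have tried both standard and FIFO queue types; both show the same problem.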

A few other details:

  • Python v3.8.5

The code snippet I am using:

import boto3
from botocore.config import Config

# Timeouts are in seconds; total_max_attempts=1 means one attempt and no retries.
config = Config(connect_timeout=30, read_timeout=30, retries={'total_max_attempts': 1}, region_name='us-east-1')
sqs_client = boto3.client(service_name='sqs', config=config)
record = event['Records'][0]
receiptHandle = record['receiptHandle']
# Assumption: the ARN comes from the same event record; the queue name is its last segment.
fromQueueName = record['eventSourceARN'].split(':')[-1]
fromQueue = sqs_client.get_queue_url(QueueName=fromQueueName)
fromQueueUrl = fromQueue['QueueUrl']
messageDelete = sqs_client.delete_message(QueueUrl=fromQueueUrl, ReceiptHandle=receiptHandle)

An example of the debug exception I see:

[DEBUG] 2020-10-29T21:27:28.32Z 3c60cac9-6d99-58c6-84c9-92dc581919fd retry needed, retryable exception caught: Connect timeout on endpoint URL: "https://queue.amazonaws.com/"
Traceback (most recent call last):
  File "/var/task/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/var/task/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/var/task/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/botocore/httpsession.py", line 254, in send
    urllib_response = conn.urlopen(
  File "/var/task/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/var/task/urllib3/util/retry.py", line 386, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/var/task/urllib3/packages/six.py", line 735, in reraise
    raise value
  File "/var/task/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/var/task/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/var/task/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/var/task/urllib3/connection.py", line 309, in connect
    conn = self._new_conn()
  File "/var/task/urllib3/connection.py", line 164, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPSConnection object at 0x7f27b56b7460>, 'Connection to queue.amazonaws.com timed out. (connect timeout=15)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/utils.py", line 79, in preflight_check
    fromQueue = sqs_client.get_queue_url(QueueName=fromQueueName)
  File "/opt/python/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/python/botocore/client.py", line 662, in _make_api_call
    http, parsed_response = self._make_request(
  File "/opt/python/botocore/client.py", line 682, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/opt/python/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/opt/python/botocore/endpoint.py", line 136, in _send_request
    while self._needs_retry(attempts, operation_model, request_dict,
  File "/opt/python/botocore/endpoint.py", line 253, in _needs_retry
    responses = self._event_emitter.emit(
  File "/opt/python/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/opt/python/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/opt/python/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/opt/python/botocore/retryhandler.py", line 183, in __call__
    if self._checker(attempts, response, caught_exception):
  File "/opt/python/botocore/retryhandler.py", line 250, in __call__
    should_retry = self._should_retry(attempt_number, response,
  File "/opt/python/botocore/retryhandler.py", line 277, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/opt/python/botocore/retryhandler.py", line 316, in __call__
    checker_response = checker(attempt_number, response,
  File "/opt/python/botocore/retryhandler.py", line 222, in __call__
    return self._check_caught_exception(
  File "/opt/python/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/opt/python/botocore/endpoint.py", line 200, in _do_get_response
    http_response = self._send(request)
  File "/opt/python/botocore/endpoint.py", line 269, in _send
    return self.http_session.send(request)
  File "/opt/python/botocore/httpsession.py", line 287, in send
    raise ConnectTimeoutError(endpoint_url=request.url, error=e)
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://queue.amazonaws.com/"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/botocore/retryhandler.py", line 269, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/opt/python/botocore/retryhandler.py", line 316, in __call__
    checker_response = checker(attempt_number, response,
  File "/opt/python/botocore/retryhandler.py", line 222, in __call__
    return self._check_caught_exception(
  File "/opt/python/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/opt/python/botocore/endpoint.py", line 200, in _do_get_response
    http_response = self._send(request)
  File "/opt/python/botocore/endpoint.py", line 269, in _send
    return self.http_session.send(request)
  File "/opt/python/botocore/httpsession.py", line 287, in send
    raise ConnectTimeoutError(endpoint_url=request.url, error=e)
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://queue.amazonaws.com/"

1 answer

曹浩波
2023-03-14

Based on the comments.

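The SQS timeouts occur because the Lambda function is attached to a VPC, and that VPC has no VPC interface endpoint for SQS. Without an endpoint or a NAT gateway, the function cannot connect to SQS.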
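One way to check whether this is the cause is to test plain TCP reachability from inside the function before involving Boto3 at all. The following is only a minimal sketch: `can_connect` and the diagnostic `lambda_handler` are hypothetical names, and the hostname `sqs.us-east-1.amazonaws.com` is an assumption based on the `us-east-1` region in the question.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.timeout, connection refused, and DNS failures are all OSError subclasses.
        return False


def lambda_handler(event, context):
    # Hypothetical diagnostic handler; the endpoint host is an assumption for us-east-1.
    reachable = can_connect("sqs.us-east-1.amazonaws.com")
    print(f"SQS endpoint reachable: {reachable}")
    return {"sqs_reachable": reachable}
```

If this reports `False` from a VPC-attached function, the problem is most likely network routing rather than the Boto3 timeout or retry settings.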

The solution is to add a VPC interface endpoint for the SQS service.
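The endpoint can be created in the console, with infrastructure-as-code, or through the EC2 API. Below is a rough sketch using the Boto3 EC2 client's `create_vpc_endpoint` call. The helper names `build_sqs_endpoint_request` and `create_sqs_endpoint` are hypothetical, and the VPC, subnet, and security group IDs in the usage comment are placeholders, not values from the question.

```python
def build_sqs_endpoint_request(vpc_id, subnet_ids, security_group_ids, region="us-east-1"):
    """Build keyword arguments for EC2 create_vpc_endpoint (interface endpoint for SQS)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sqs",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the regional SQS hostname resolve to the endpoint inside the VPC.
        "PrivateDnsEnabled": True,
    }


def create_sqs_endpoint(vpc_id, subnet_ids, security_group_ids, region="us-east-1"):
    """Create the endpoint and return its ID. Requires AWS credentials and EC2 permissions."""
    import boto3  # imported here so the request builder works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    params = build_sqs_endpoint_request(vpc_id, subnet_ids, security_group_ids, region)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


# Example call with placeholder IDs (assumptions, replace with your own):
# create_sqs_endpoint("vpc-0123456789abcdef0", ["subnet-0123456789abcdef0"], ["sg-0123456789abcdef0"])
```

The endpoint's security group needs to allow inbound HTTPS (port 443) from the Lambda function, and private DNS requires DNS support and DNS hostnames to be enabled on the VPC. Note also that the log in the question shows the legacy hostname `queue.amazonaws.com`; as far as I know, private DNS does not cover legacy SQS hostnames, so it may be necessary to pass `endpoint_url="https://sqs.us-east-1.amazonaws.com"` when creating the SQS client.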
