[Bug]: SQS receive message failed: connect ETIMEDOUT <ip>:443 every few hours #480

Closed
kazazor opened this issue Mar 29, 2024 · 4 comments

kazazor commented Mar 29, 2024

Describe the bug

My setup is pretty simple:

  • ECS Fargate (spot/regular) that listens to SQS messages within my AWS account. Really nothing special.
  • This is my setup:
this.sqsConsumer = Consumer.create({
  queueUrl: eventBusSqsUrl,
  handleMessageBatch: async (messages: Message[]) => this.handleSqsMessages(messages),
  batchSize: 10,
  visibilityTimeout: 30, // In seconds
  waitTimeSeconds: 15, // Long polling interval (in seconds)
  terminateVisibilityTimeout: false, // Do not terminate visibility timeout on processing error
});

// General consumer errors
this.sqsConsumer.on("error", error => {
  this.logger.error(error, "Error in the SQS consumer");
});

// Errors thrown while handling messages
this.sqsConsumer.on("processing_error", error => {
  this.logger.error(error, "Processing error");
});

// Timeout errors
this.sqsConsumer.on("timeout_error", error => {
  this.logger.error(error, "Timeout error");
});

I also listen for the server's shutdown signals and stop the consumer when one arrives. With NestJS I do this:

onApplicationShutdown(_signal?: string): void {
  this.stopEventSubscribers();
}

stopEventSubscribers(): void {
  this.sqsConsumer?.stop();
}

I'm getting:

{
  "msg": "Error in the SQS consumer",
  "err": {
    "type": "SQSError",
    "message": "SQS receive message failed: connect ETIMEDOUT <IP>:443",
    "name": "SQSError",
    "code": "TimeoutError"
  }
}

  1. This happens every few hours. Why is that?
  2. The message logged is "Error in the SQS consumer", which comes from the general error event and not from the timeout_error event, where I would expect it to arrive. Why is that? Could it be a bug?

Your minimal, reproducible example

https://gist.github.com/kazazor/b4ecc2b22edcc2b697b4eb0e0c852fc5

Steps to reproduce

There are no actual steps here. I am using the setup I mentioned and letting the application run for days. Every few hours, the timeout occurs.

Expected behavior

  1. Not getting timeouts
  2. If I'm getting timeouts, it's supposed to be from the timeout_error event

How often does this bug happen?

Often

Screenshots or Videos

No response

Platform

  • ECS Docker Alpine

Package version

8.2.0

AWS SDK version

No response

Additional context

I'm also uncertain whether the consumer continues to function after that. Maybe this behavior is OK and I could ignore this error.

Member

nicholasgriffintn commented Mar 29, 2024

Hey, so this looks like an issue on AWS' side and we can't support their services.

As for which event fires: all errors from SQS come through the error listener. The timeout_error listener is specifically for when handleMessage times out, which relates to the handle message timeout functionality that sqs-consumer has.
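
For anyone who wants to tell this kind of connection timeout apart from other errors inside the error listener, here is a minimal sketch. It assumes the error has the fields shown in the log output above (name, code, message); the interface and the isConnectionTimeout helper are hypothetical names for illustration and are not part of sqs-consumer.

```typescript
// Shape copied from the log output in this issue.
// Assumption: the real SQSError object may carry additional fields.
interface LoggedSqsError {
  name: string;
  code?: string;
  message: string;
}

// Hypothetical helper: returns true when the error looks like a socket-level
// connect timeout (e.g. "connect ETIMEDOUT <IP>:443") reported by the consumer.
function isConnectionTimeout(err: LoggedSqsError): boolean {
  return err.name === "SQSError" && /\bETIMEDOUT\b/.test(err.message);
}

const sample: LoggedSqsError = {
  name: "SQSError",
  code: "TimeoutError",
  message: "SQS receive message failed: connect ETIMEDOUT <IP>:443",
};

console.log(isConnectionTimeout(sample)); // true
```

A check like this could be called from the "error" listener to log connection timeouts at a lower severity. Whether that is appropriate depends on how often they occur and whether messages are still being processed afterwards.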

Author

kazazor commented Mar 29, 2024

Thanks, I'll give it a go.

Regarding the internal timeout a consumer has, can you elaborate on that a bit more?
I'm assuming you're not talking about the visibility timeout, so I'm not sure what internal logic could time out for a consumer.

@nicholasgriffintn
Member

Sure, that event is documented here: https://bbc.github.io/sqs-consumer/interfaces/Events.html#timeout_error

It relates specifically to this option: https://bbc.github.io/sqs-consumer/interfaces/ConsumerOptions.html#handleMessageTimeout. The event fires if the supplied handleMessage function takes longer than the time set in the option.

The logic for the error emitter can be found here as well: https://github.com/bbc/sqs-consumer/blob/main/src/consumer.ts#L205
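
To make the distinction concrete, here is a rough sketch of what a handler-level timeout generally looks like: the handler is raced against a timer, and a timeout error is raised only when the timer wins. This is an illustration of the general technique under that assumption, not the library's actual code; runWithTimeout and HandlerTimeoutError are hypothetical names. A failed network connection to SQS never goes through a path like this, which is why it shows up on the error event instead.

```typescript
// Hypothetical error type for this sketch; not the class sqs-consumer uses.
class HandlerTimeoutError extends Error {
  constructor(ms: number) {
    super(`handler did not finish within ${ms}ms`);
    this.name = "HandlerTimeoutError";
  }
}

// Races a handler against a timer.
// Resolves with the handler's result, or rejects with HandlerTimeoutError
// if the timer fires first.
async function runWithTimeout<T>(handler: () => Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new HandlerTimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([handler(), timeout]);
  } finally {
    // Always clear the timer so it cannot fire after the race is settled.
    clearTimeout(timer);
  }
}

// Usage: a handler that takes 200ms, raced against a 50ms limit.
const slowHandler = () =>
  new Promise<string>((resolve) => setTimeout(() => resolve("done"), 200));

runWithTimeout(slowHandler, 50).catch((err) => {
  console.log(err.name); // HandlerTimeoutError
});
```

Note that racing does not cancel the handler itself; it only stops waiting for it. The slow handler above still runs to completion in the background.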

kazazor closed this as completed Mar 29, 2024

This issue has been closed for more than 30 days. If this issue is still occurring, please open a new issue with more recent context.

github-actions bot locked as resolved and limited conversation to collaborators Apr 28, 2024