8.1 SQS - Simple Queue Service
Amazon SQS is a web service that gives you access to a message queue, which stores messages while they wait for a computer to process them. It lets you decouple the different parts of your system.
Amazon SQS is a distributed queue system that enables web service applications to quickly and reliably queue messages that one component in the application generates to be consumed by another component. A queue is a temporary repository for messages that are awaiting processing.
SQS is a pull-based service: other services (for example, EC2 instances) have to pull messages from it. In contrast, SNS (Simple Notification Service) is a push-based service. Both the producers and the consumers (for example, EC2 instances) can be made highly available or placed in Auto Scaling groups. If a consumer goes down, its messages are not discarded; they stay in SQS and wait for another (backup) consumer to consume them.
Using Amazon SQS, you can decouple the components of an application so they run independently, with Amazon SQS easing message management between components. Any component of a distributed application can store messages in a fail-safe queue. Messages can contain up to 256 KB of text in any format (JSON, XML, etc.). Any component can later retrieve the messages programmatically using the Amazon SQS API.
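A size limit on text is easy to misread as a character count. The sketch below assumes the limit is measured in bytes of the UTF-8 encoded body, and uses the 256 KB figure quoted above as a constant. The function name `fits_in_one_message` is hypothetical and this is only a local pre-check, not a call to the SQS API.

```python
# The 256 KB figure comes from the text above; adjust it if your queue's limit differs.
MAX_MESSAGE_BYTES = 256 * 1024


def fits_in_one_message(body: str, limit: int = MAX_MESSAGE_BYTES) -> bool:
    """Return True if the UTF-8 encoding of body is within the size limit.

    Assumption: only the body is counted. Anything else sent along with
    the message is ignored by this toy check.
    """
    return len(body.encode("utf-8")) <= limit


ascii_body = "a" * MAX_MESSAGE_BYTES      # 1 byte per character in UTF-8
accented_body = "é" * MAX_MESSAGE_BYTES   # 2 bytes per character in UTF-8

print(fits_in_one_message(ascii_body))     # True
print(fits_in_one_message(accented_body))  # False
```

Both strings have the same number of characters, but the second one is twice as large once encoded, so it would not fit.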
The queue acts as a buffer between the component producing and saving data and the component receiving the data for processing. This means the queue resolves issues that arise if the producer is producing work faster than the consumer can process it, or if the producer or consumer is only intermittently connected to the network. You can configure Auto Scaling groups to monitor the SQS queue: if the number of messages in the queue goes over a certain threshold, additional EC2 instances are provisioned to process them. This brings elasticity to your application.
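The scaling rule described above can be sketched as a small pure function. This is only an illustration of the decision logic: the function name, the 100-messages-per-worker target, and the 1 to 10 worker bounds are all assumptions. In a real setup the decision would normally come from a scaling policy driven by a queue-depth metric rather than from hand-written code.

```python
import math


def desired_workers(visible_messages: int,
                    messages_per_worker: int = 100,
                    min_workers: int = 1,
                    max_workers: int = 10) -> int:
    """Pick a worker count so each worker has at most messages_per_worker queued.

    All thresholds here are hypothetical values chosen for the example.
    """
    needed = math.ceil(visible_messages / messages_per_worker)
    # Clamp to the allowed fleet size.
    return max(min_workers, min(max_workers, needed))


print(desired_workers(0))     # 1  (never scale below the minimum)
print(desired_workers(250))   # 3  (250 messages need 3 workers at 100 each)
print(desired_workers(5000))  # 10 (capped at the maximum)
```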
There are two types of queue:
Standard Queues (default)
FIFO Queues
Amazon SQS offers standard as the default queue type. A standard queue lets you have a nearly unlimited number of transactions per second. Standard queues guarantee that a message is delivered at least once. However, because of the highly distributed architecture that allows this high throughput, occasionally more than one copy of a message might be delivered, and messages might arrive out of order. Standard queues provide best-effort ordering: messages are generally delivered in the same order as they are sent, but this is not guaranteed.
FIFO queues complement standard queues. The most important features of this queue type are FIFO (first-in, first-out) delivery and exactly-once processing: the order in which messages are sent and received is strictly preserved, and a message is delivered once and remains available until a consumer processes and deletes it; duplicates are not introduced into the queue. FIFO queues also support message groups, which allow multiple ordered message groups within a single queue. By default, FIFO queues are limited to 300 transactions per second (TPS) without batching (up to 3,000 messages per second with batching), but otherwise have all the capabilities of standard queues.
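To make the FIFO guarantees concrete, here is a toy in-memory model of ordering within message groups plus deduplication. It is a sketch, not the SQS API: the class `ToyFifoQueue` and its methods are hypothetical. It also simplifies deduplication by remembering every ID forever, whereas the real service only deduplicates within a limited time window.

```python
from collections import deque
from typing import List


class ToyFifoQueue:
    """In-memory model of FIFO ordering plus deduplication (not the SQS API)."""

    def __init__(self) -> None:
        self._messages = deque()       # (group_id, body) pairs in arrival order
        self._seen_dedup_ids = set()   # simplification: IDs are never forgotten

    def send(self, body: str, group_id: str, dedup_id: str) -> bool:
        """Enqueue a message; return False if dedup_id was already seen."""
        if dedup_id in self._seen_dedup_ids:
            return False               # duplicate is dropped, not queued
        self._seen_dedup_ids.add(dedup_id)
        self._messages.append((group_id, body))
        return True

    def receive_group(self, group_id: str) -> List[str]:
        """Return the bodies for one message group, in the order they were sent."""
        return [body for gid, body in self._messages if gid == group_id]


queue = ToyFifoQueue()
queue.send("create order 1", group_id="order-1", dedup_id="a1")
queue.send("create order 2", group_id="order-2", dedup_id="b1")
queue.send("pay order 1", group_id="order-1", dedup_id="a2")
queue.send("pay order 1", group_id="order-1", dedup_id="a2")  # retry, dropped

print(queue.receive_group("order-1"))  # ['create order 1', 'pay order 1']
```

The retried send does not introduce a second copy, and messages from the two groups are interleaved in the queue while each group keeps its own order.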
SQS is pull-based, not push-based.
Messages can contain text in any format and can be up to 256 KB in size.
Messages can be kept in the queue from 1 minute to 14 days. The default is 4 days.
The visibility timeout is the amount of time a message stays invisible in the SQS queue after a consumer picks it up (the default is 30 seconds). During this time the message is still in the queue, but other consumers cannot see it. If the job is processed before the visibility timeout expires, the consumer should then delete the message from the queue (the consumer has to issue a DeleteMessage call to SQS). If the job is not processed within that time, the message becomes visible again and another reader will process it, which can result in the same message being delivered twice. For example, if the visibility timeout is 1 minute but the job is a big data analytics task that takes 5 minutes, the message becomes visible in the queue again after 1 minute (it never left the queue; it was only hidden). Another EC2 instance will then pick it up, so the message is delivered multiple times because the visibility timeout is too low.
The maximum visibility timeout is 12 hours. If a job is going to take longer than 12 hours, SQS is probably not a good choice, because the message will become visible again and be delivered multiple times.
SQS guarantees that your messages will be delivered at least once. A message can be processed multiple times, especially if the visibility timeout is shorter than the time the EC2 instance needs to do the job.
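The 1-minute timeout versus 5-minute job scenario can be replayed with a toy in-memory queue driven by an explicit clock. This is a minimal sketch of the visibility behaviour only: the class `ToyQueue` and its methods are hypothetical, and it deletes by a simple message ID rather than modelling the real API's delete call.

```python
class ToyQueue:
    """In-memory model of a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._messages = {}   # message id -> [body, time at which it is visible again]
        self._next_id = 0

    def send(self, body: str, now: float) -> None:
        self._messages[self._next_id] = [body, now]
        self._next_id += 1

    def receive(self, now: float):
        """Return (message_id, body) for the first visible message, or None."""
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                # Hide the message, but keep it stored in the queue.
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=60)      # 1 minute
queue.send("analytics job", now=0)

first = queue.receive(now=0)     # worker A starts a 5-minute job
hidden = queue.receive(now=30)   # None: the message is still invisible
second = queue.receive(now=61)   # worker B receives the same message
queue.delete(first[0])           # worker A finishes at t=300 and deletes it
after_delete = queue.receive(now=400)

print(first, hidden, second, after_delete)
# (0, 'analytics job') None (0, 'analytics job') None
```

Raising the timeout above the job duration (for example `visibility_timeout=360`) makes the receive at `now=61` return `None`, so only one worker ever sees the message.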
SQS long polling is a way to retrieve messages from your SQS queues. By default, SQS uses short polling, which returns immediately even if the queue being polled is empty. Long polling doesn't return a response until a message arrives in the queue or the long poll times out. Long polling can save you money because you don't have to keep requesting messages from an empty queue. To enable long polling, set the ReceiveMessageWaitTimeSeconds attribute of the queue to a value greater than 0 (the maximum is 20 seconds).
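The cost argument comes down to how many receive requests a consumer makes while the queue is empty. The following is a rough simulation, not a call to SQS: the function name is hypothetical, and it assumes a single consumer that starts at time 0 and, when short polling, sleeps 1 second after each empty response.

```python
def count_receive_calls(arrival_time: float,
                        wait_time_seconds: float,
                        idle_sleep: float = 1.0) -> int:
    """Count receive calls made until a message arriving at arrival_time is returned.

    wait_time_seconds == 0 models short polling; a positive value models
    long polling with that wait time. idle_sleep is an assumed pause
    between short polls.
    """
    calls = 0
    now = 0.0
    while True:
        calls += 1
        # The message is already there, or it arrives while this call is waiting.
        if arrival_time <= now + wait_time_seconds:
            return calls
        # The call came back empty: a long poll has already waited, a short poll sleeps.
        now += wait_time_seconds if wait_time_seconds > 0 else idle_sleep


# A single message arrives 50 seconds after the consumer starts polling.
print(count_receive_calls(arrival_time=50, wait_time_seconds=0))   # 51
print(count_receive_calls(arrival_time=50, wait_time_seconds=20))  # 3
```

Under these assumptions the short-polling consumer issues 51 requests to get one message, while the long-polling consumer issues 3.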