Q: We are designing an application where we need to accept a steady stream of large binary objects up to 1GB each. We want our architecture to allow for scaling out. What would you select as the best option for intake of the BLOBs?
Simple Workflow Service
Simple Notification Service
Simple Queue Service
Kinesis Data Streams
Correct answer was SQS
I read that AWS launched a new feature of AWS Application Auto Scaling that lets you define scaling policies to automatically add and remove shards from an Amazon Kinesis data stream.
“As you scale horizontally, you must ensure that the Amazon SQS queue that you use has enough connections or threads to support the number of concurrent message producers and consumers that send requests and receive responses. For example, by default, instances of the AWS SDK for Java AmazonSQSClient class maintain at most 50 connections to Amazon SQS. To create additional concurrent producers and consumers, you must adjust the maximum number of allowable producer and consumer threads on an AmazonSQSClientBuilder object…”
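To see why the consumer thread count bounds throughput, here is a toy, stdlib-only sketch that uses `queue.Queue` as a stand-in for SQS (no real AWS calls; all names are illustrative). The number of consumer threads plays the same role as the SDK's connection/thread pool size in the quoted docs:

```python
import queue
import threading

# Stand-in for an SQS queue; in real code these would be boto3
# receive_message / delete_message calls.
work = queue.Queue()
results = []
results_lock = threading.Lock()

NUM_CONSUMERS = 4  # analogous to sizing the SDK's connection/thread pool


def consumer():
    while True:
        msg = work.get()
        if msg is None:          # sentinel: shut this worker down
            work.task_done()
            return
        with results_lock:
            results.append(msg.upper())  # pretend "processing"
        work.task_done()


threads = [threading.Thread(target=consumer) for _ in range(NUM_CONSUMERS)]
for t in threads:
    t.start()

for i in range(10):              # producer side
    work.put(f"msg-{i}")
for _ in threads:
    work.put(None)               # one sentinel per consumer

work.join()
for t in threads:
    t.join()

print(sorted(results))
```

With more consumer threads (up to the pool limit), more messages are in flight at once; past that limit, producers and consumers queue up waiting for a free connection, which is exactly the tuning knob the quoted passage describes.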
So now you can automatically add and remove shards from an Amazon Kinesis data stream, while for SQS you may need to make (manual) adjustments. Is SQS still the preferred option?
The tough part of this question for me is the 1GB blob size. Neither Kinesis nor SQS can take in messages of that size: SQS caps a message at 256 KB, and a Kinesis record payload tops out at 1 MB. So it suggests we need some intermediate processing step, using EC2 instances for example.
This is a classic AWS reference architecture pattern: use SQS as a queue and scale worker instances in and out to keep up with the desired throughput. If I were writing the question, I'd probably include something about the processing step, because as written you could use either SQS or Kinesis if you really wanted to. The one catch with Kinesis is retention: it keeps records for 24 hours by default, which is shorter than the four-day default (14-day maximum) for SQS.
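Since neither service can carry the blob itself, the usual way to make SQS work here is a claim-check pattern: store the blob in S3 and send only a small pointer message through the queue (the SQS Extended Client Library for Java automates this). Below is a minimal, stdlib-only sketch of the message format; the bucket and key names are hypothetical, and the boto3 calls are described in comments rather than executed:

```python
import json

# Claim-check sketch: the blob goes to S3; only a small pointer message
# goes through SQS. Bucket/key names here are hypothetical.

def make_claim_check(bucket: str, key: str, size_bytes: int) -> str:
    """Build the small SQS message body that points at the real payload."""
    return json.dumps({"bucket": bucket, "key": key, "size_bytes": size_bytes})


def parse_claim_check(body: str) -> dict:
    """Worker side: recover the S3 location to fetch and process."""
    return json.loads(body)


# In real code the producer would call s3.upload_file(...) and then
# sqs.send_message(MessageBody=make_claim_check(...)); the worker would
# parse_claim_check(...) and s3.download_file(...) before processing.
msg = make_claim_check("ingest-bucket", "blobs/2024/obj-001.bin", 1_073_741_824)
print(parse_claim_check(msg)["key"])
```

The pointer message is a few hundred bytes no matter how large the blob is, so the queue stays within its limits while S3 handles the 1GB objects, and the worker fleet behind the queue can still scale in and out as usual.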