AWS Certified Solutions Architect - Professional 2020

Which is better: Kinesis Streams to Lambda to S3, or Kinesis Firehose to S3 with an S3 event triggering Lambda?


I’m studying for the AWS Solutions Architect Professional exam and I have a question about Kinesis in combination with Lambda and S3. I took an AWS practice exam, and one of the questions about reducing the overall cost of an architecture had an option that read, roughly, ‘Replace Kinesis Streams and the Lambda consumers that write the processed data they create to S3 with Kinesis Firehose to write directly to S3. Trigger a Lambda using S3 events’. By eliminating the other answers, I think this is the right one, but I don’t see what is gained by processing the data with Lambda after it has been written to S3 rather than before. In fact, done this way, the data has to be written to S3 twice: once by Kinesis Firehose and once more by Lambda.

I could be wrong about the other answers, but does anyone know of a good reason to do this?

1 Answer

Without knowing the other choices, it’s hard to say if this was the best one. On the surface, I would tend to agree that Firehose with Lambda sounds like the same amount of work, and thus cost, as Streams with Lambda. However, there might be another angle here. Think of Streams as always-on: you’re billed based on how many open shards you have, whether or not data is flowing. Firehose is billed by the GB, so you only incur cost as data flows through. It’s hard to estimate the size of the cost difference if you don’t know the payload frequency or size, but generally on an AWS exam, anything that runs all the time is considered more expensive than something that is more on-demand.
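To make the always-on versus pay-per-GB point concrete, here is a rough sketch of the comparison. The function names and the prices in the usage note are made-up placeholders, not actual AWS rates, and the model deliberately ignores other charges (for example per-request charges on Streams, plus the Lambda and S3 costs that both designs share), so treat it as an illustration of the reasoning rather than a pricing calculator.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def streams_monthly_cost(open_shards, shard_hour_price, hours=HOURS_PER_MONTH):
    """Always-on model: pay for every open shard for every hour,
    regardless of how much data actually flows."""
    return open_shards * shard_hour_price * hours


def firehose_monthly_cost(gb_ingested, price_per_gb):
    """On-demand model: pay only for the volume that flows through."""
    return gb_ingested * price_per_gb


def break_even_gb(open_shards, shard_hour_price, price_per_gb,
                  hours=HOURS_PER_MONTH):
    """Monthly volume (GB) at which the two simplified models cost the same.
    Below this volume the per-GB model is cheaper."""
    return streams_monthly_cost(open_shards, shard_hour_price, hours) / price_per_gb


if __name__ == "__main__":
    # Placeholder prices for illustration only; look up current rates.
    shard_hour_price = 0.02  # hypothetical $/shard-hour
    price_per_gb = 0.03      # hypothetical $/GB ingested
    print(streams_monthly_cost(2, shard_hour_price))
    print(firehose_monthly_cost(100, price_per_gb))
    print(break_even_gb(2, shard_hour_price, price_per_gb))
```

With low or bursty traffic the per-GB figure stays small while the shard-hour figure does not change, which is the intuition the exam option seems to be testing.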

Juan Manuel

The question is about rearchitecting to reduce cost in a setup where data is sent to a Kinesis Data Stream, processed and stored in S3 by some Lambdas, and then analysed with a transient Redshift cluster. The other options were: replacing the transient Redshift cluster with a transient Redshift Spectrum cluster, compressing the data for better utilisation by Redshift, or reducing the number of shards while sending more than the 1 MiB per second per-shard limit. As far as I understand, a combination of Kinesis + Athena must be cheaper than any transient Redshift cluster, and that’s why I think the other options are not valid.
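As a side note on the shard option: a stream needs enough shards to stay under the per-shard write limits, so you cannot cut shards below what the peak traffic requires. A minimal sketch of that check follows, using the 1 MiB per second limit mentioned above plus the 1,000 records per second per-shard write limit; the function name `min_shards` is just an illustrative helper, not an AWS API.

```python
import math

MIB = 1024 * 1024
SHARD_WRITE_BYTES_PER_SEC = 1 * MIB   # per-shard ingest limit (bytes/second)
SHARD_WRITE_RECORDS_PER_SEC = 1000    # per-shard ingest limit (records/second)


def min_shards(peak_bytes_per_sec, peak_records_per_sec):
    """Smallest shard count that keeps peak writes within both limits."""
    by_bytes = math.ceil(peak_bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    by_records = math.ceil(peak_records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # Hypothetical peak load: 3.5 MiB/s spread over 500 records/s.
    print(min_shards(3.5 * MIB, 500))
```

Going below the number this returns means writes get throttled at peak, so that option saves money only by breaking the ingest path.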
