1 Answers
Without knowing the other choices, it’s hard to say whether this was the best one. On the surface, I would tend to agree that Firehose with Lambda sounds like the same amount of work, and thus cost, as Streams with Lambda. However, there might be another angle here… Think of Streams as always-on: you’re billed based on how many open shards you have, whether or not data is flowing. Firehose is billed by the GB, so you only incur cost as data flows through. It’s hard to say how big the cost difference is if you don’t know the payload frequency or size, but generally on an AWS exam, stuff that is running all the time is considered more expensive than stuff that’s more on-demand.
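To make the always-on versus pay-per-GB point concrete, here is a rough sketch of the arithmetic. The prices are illustrative assumptions, not quoted rates (check the current AWS pricing pages for your region), and the function names are made up for this example. It also ignores Lambda, S3, and any per-record rounding Firehose applies.

```python
import math

HOURS_PER_MONTH = 730  # common approximation for a month of always-on usage


def streams_monthly_cost(shards, records_per_month, record_kb=5,
                         shard_hour_price=0.015,          # ASSUMED price per shard-hour
                         put_price_per_million=0.014):    # ASSUMED price per 1M PUT payload units
    """Provisioned Kinesis Data Streams: pay per open shard-hour plus PUT payload units."""
    # A PUT payload unit covers up to 25 KB of a record.
    units_per_record = max(1, math.ceil(record_kb / 25))
    put_units = records_per_month * units_per_record
    shard_cost = shards * HOURS_PER_MONTH * shard_hour_price
    put_cost = put_units / 1_000_000 * put_price_per_million
    return shard_cost + put_cost


def firehose_monthly_cost(gb_per_month, price_per_gb=0.029):  # ASSUMED price per GB ingested
    """Firehose: pay only for the volume of data ingested."""
    return gb_per_month * price_per_gb


if __name__ == "__main__":
    # An idle month: the stream still costs money, Firehose does not.
    print(f"Streams, 2 shards, no traffic: ${streams_monthly_cost(2, 0):.2f}")
    print(f"Firehose, no traffic:          ${firehose_monthly_cost(0):.2f}")
    # A light month: 1M records of 5 KB each is roughly 5 GB.
    print(f"Streams, 2 shards, 1M records: ${streams_monthly_cost(2, 1_000_000):.2f}")
    print(f"Firehose, 5 GB:                ${firehose_monthly_cost(5):.2f}")
```

With these assumed numbers the shard-hours dominate the Streams bill at low volume, which is the exam intuition. At high, steady volume the comparison can flip, so the payload size and frequency really do matter.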
The question is about rearchitecting to reduce cost in a setup where data is sent to a Kinesis Data Stream, processed and stored in S3 by some Lambdas, and then analysed with a transient Redshift cluster. The other options were: replacing the transient Redshift cluster with a transient Redshift Spectrum cluster, compressing the data for better utilisation by Redshift, or reducing the number of shards even though that would mean sending more than the 1 MiB per second limit per shard. As far as I understand, a combination of Kinesis + Athena must be cheaper than any transient Redshift cluster; that’s why I think the other options are not valid.
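On the shard-reduction option, a quick way to see why it fails is to work out the minimum shard count for a given write load. This is just a sketch: `min_shards` is a made-up helper, and it assumes the standard provisioned-mode write limits of 1 MiB/s and 1,000 records/s per shard.

```python
import math

# Per-shard write limits for a provisioned Kinesis Data Stream (assumed here).
SHARD_WRITE_MIB_PER_S = 1.0
SHARD_WRITE_RECORDS_PER_S = 1000


def min_shards(mib_per_s, records_per_s):
    """Smallest shard count that stays within both per-shard write limits."""
    by_bytes = math.ceil(mib_per_s / SHARD_WRITE_MIB_PER_S)
    by_records = math.ceil(records_per_s / SHARD_WRITE_RECORDS_PER_S)
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # 2.5 MiB/s of writes cannot fit in fewer than 3 shards.
    print(min_shards(2.5, 500))
```

Going below that number means writes get throttled, so it is not really a cost optimisation.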