2 Answers
Perhaps this document on Write Sharding will help to explain.
"…in the case of a partition key that represents today’s date, you might choose a random number between 1 and 200 and concatenate it as a suffix to the date. This yields partition key values like 2014-07-09.1, 2014-07-09.2, and so on, through 2014-07-09.200. Because you are randomizing the partition key, the writes to the table on each day are spread evenly across multiple partitions. This results in better parallelism and higher overall throughput."
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-sharding.html
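To make the quoted suggestion concrete, here is a minimal sketch of how such keys could be built. The function names (`sharded_partition_key`, `all_shard_keys`) are my own, not part of any AWS SDK; only the date format and the 1 to 200 suffix range come from the docs quoted above.

```python
import random
from datetime import date

# Suffix range taken from the quoted documentation (1 through 200).
SHARD_COUNT = 200


def sharded_partition_key(day: date, shard_count: int = SHARD_COUNT) -> str:
    """Build a write key such as '2014-07-09.137' (hypothetical helper)."""
    suffix = random.randint(1, shard_count)  # inclusive on both ends
    return f"{day.isoformat()}.{suffix}"


def all_shard_keys(day: date, shard_count: int = SHARD_COUNT) -> list[str]:
    """List every key a reader has to query to see all items for one day."""
    return [f"{day.isoformat()}.{n}" for n in range(1, shard_count + 1)]
```

The trade-off is on the read side: because the suffix is random, fetching everything for one day means querying each of the keys returned by `all_shard_keys` and merging the results.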
Jia, I think I see your confusion. They are not saying "more variable input creates more variable hashes"; they are saying "static input creates the same hash". Every record created on the same day has the same date (e.g. 2019-06-27), and that date always produces the same hash, so it always maps to the same partition. As you point out, when the date changes the hashing algorithm will properly and evenly distribute the records across the partitions, but within a single day all the records land in the same partition because the date is static. This is why Ben mentioned adding a random component to the date to cause the hash to vary.
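A small sketch of that point, under loud assumptions: DynamoDB's internal hash function is not public, so MD5 stands in for it here, and the partition count of 4 and the `partition_for` helper are made up for illustration. The suffixes are assigned round-robin instead of randomly so the run is repeatable.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; real partition counts are managed by DynamoDB


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using MD5 as a stand-in hash (assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


# 1000 writes on the same day with the bare date as the partition key.
static_keys = ["2019-06-27"] * 1000

# The same 1000 writes with a suffix between 1 and 200 appended.
sharded_keys = [f"2019-06-27.{n % 200 + 1}" for n in range(1000)]

static_spread = Counter(partition_for(k) for k in static_keys)
sharded_spread = Counter(partition_for(k) for k in sharded_keys)

print("bare date:", dict(static_spread))
print("with suffix:", dict(sharded_spread))
```

However good the hash is, identical input gives identical output, so the bare date puts all 1000 writes on one partition. The suffixed keys are distinct inputs, so the same hash can spread them out.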
Hi Ben! Thanks for your reply!! I still can't seem to wrap my head around it! ^ This would make a lot more sense if the keys weren't hashed: then introducing variability in the keys would lead to more spread-out data. But the fact that the key is being hashed makes it super counterintuitive to me.
Unless they do something like Base64-encode the key and take the last n bits, in which case it's a case study in a bad hashing algorithm.