Thanks for the data, @metachris, and for the useful download tool, @vitaliy.
I have some questions about the data accuracy for the month of September:
I took all of the sourcelog hashes that appear more than 10 times and searched for them in the transaction-data. There are about 222k unique hashes that appear more than 10 times. The per-hash counts range from roughly 10 to 80 and are fairly spread out across that range.
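A minimal sketch of that filtering step, using only the Python standard library. The `hash` column name, the `frequent_hashes` helper, and the sample rows are assumptions made up for illustration; they are not taken from the actual sourcelog files.

```python
from collections import Counter

def frequent_hashes(sourcelog_rows, min_count=10):
    """Return {hash: count} for hashes seen more than `min_count` times."""
    # Assumes each row is a dict with a "hash" key (hypothetical column name).
    counts = Counter(row["hash"] for row in sourcelog_rows)
    return {h: n for h, n in counts.items() if n > min_count}

# Tiny made-up sample: one hash seen 12 times, another seen 3 times.
sample = [{"hash": "0xaa", "source": "s1"}] * 12 + [{"hash": "0xbb", "source": "s2"}] * 3
frequent = frequent_hashes(sample)
```

The resulting set of hashes is what would then be looked up in the transaction-data files.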

Within the transaction-data files, however, the count for any of these duplicate hashes tops out at 10.
Some of these transactions carry quite a lot of ETH. I removed the timestamp column to do this groupby; the entire row seems to be identical apart from the timestamp.
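The groupby described above could look roughly like this. It is a stdlib-only sketch under assumptions: the `timestamp`, `hash`, and `value` column names, the `count_duplicates` helper, and the sample rows are all hypothetical. It groups on every column except the timestamp, so rows that differ only in timestamp fall into the same group.

```python
from collections import Counter

def count_duplicates(tx_rows, ignore=("timestamp",)):
    """Group rows on every column except those in `ignore` and count each group."""
    counts = Counter()
    for row in tx_rows:
        # Build a hashable key from the remaining (column, value) pairs.
        key = tuple(sorted((k, v) for k, v in row.items() if k not in ignore))
        counts[key] += 1
    return counts

# Made-up rows: two copies of one transaction that differ only in timestamp.
tx_sample = [
    {"timestamp": 1, "hash": "0xaa", "value": 96.0},
    {"timestamp": 2, "hash": "0xaa", "value": 96.0},
    {"timestamp": 3, "hash": "0xbb", "value": 32.0},
]
dup_counts = count_duplicates(tx_sample)
```

If a hash ever showed up under more than one key, that would mean its rows differ in something other than the timestamp.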
Here is a subset of these questionable transactions:
shape: (222_716, 3)
| hash | value | count |
|---|---|---|
| str | f64 | u32 |
| "0xd5fe940e36f0f1751c942d0a607a5eece6a8c703e4bbdff4643b289aa2130c85" | 242.5 | 4 |
| "0xb629b6b6c56a04e6b691d7f280e68352651338f2682ab54249edfd820d3efb63" | 140.83454 | 10 |
| "0xab8686aec7609924a4292bea3a2c77de017fc461c57fb00fc0dd96a86d35481c" | 100.0019 | 10 |
| "0x47ecdc4d7d21d26a4c689e03b34080f0aa5530e1cc661f25a177c5f8996be8e9" | 96.0 | 5 |
| "0xed39307ab815f96b2e6bded72276e2e42305df500b595aa0dc23ab83f5a392b0" | 96.0 | 5 |
| "0xc0d026c482810220444e2231dcdb9be03dd08fe6533f83e9d774c2240b827ba7" | 84.71442 | 9 |
| "0x229a59832b9dd174d6c6326adb9754aad9e7ec8055e32ef74d643c693589a5ef" | 45.318917 | 9 |
| "0xbc7b0dca0f8697b67ee33a7dd859fa293061d454449f4baa8cf716a44940e4da" | 32.0 | 9 |
| "0x9381f46465a175e69da23ecb87821f8574c317eaee64b55efa7e666b8afe1c3f" | 27.992319 | 9 |
| "0x2437a267466257bc2f15af6e9e5915ae916996d3f0bc9bac90f71e2dad4f697b" | 26.398904 | 10 |
Are you able to verify whether this is accurate transaction data or if there is some sort of replication bug in the ingestion process?
