Mathimagics wrote:
tdillon wrote:
The key numbers are:
- sequential grid enumeration: 2m / sec / core
- random grid sampling: 130 / sec / core
For the LCT project, I invested in SSD drives. One 500GB drive can hold the entire ED grid catalog, uncompressed.
On JILL (4-core i7-7700K) I have the catalog on a Samsung 860 EVO (1TB), and I can sample random grids at a rate of ~5400 grids/sec.
On JACK (16-core AMD Ryzen), it's a Samsung 970 EVO (1TB), but I only get 3300 random grids/sec, for some reason. Slower bus?
That's interesting. Thanks for the stats!
I think my approach can be sped up for sure, though at the expense of larger tables. Its main cost is that the solver must enumerate on average 1362 solutions per sample before finding the one it's looking for (a little more than the 1346 you'd get from the simple ratio of grid count to pattern count, because of the variation in the number of grids per pattern). Adding a scheme that makes 4 additional binary choices would naively increase the table size by a factor of 16 while cutting the enumeration cost by a factor of a little less than 16 (again because the splits won't be balanced).

But the table size can also be shrunk. Right now there are only 1059 distinct values for the pattern solution counts, so each entry really only needs 11 bits instead of 16, and that would likely drop to 10 or even 9 bits after the 16-way split just mentioned.
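To make the cost concrete, here's a rough sketch of how I think of the sampling step and the codebook idea (Python pseudocode; build_sampler, enumerate_solutions and compress_counts are just illustrative names, not my actual code, and the real enumeration order and table layout differ):

```python
import bisect
import itertools
import random

def build_sampler(pattern_counts, enumerate_solutions, rng=random):
    """pattern_counts[i]     = stored solution count for pattern i (mean ~1346 in the real table).
    enumerate_solutions(i) = generator yielding pattern i's grids in a fixed order."""
    cumulative = list(itertools.accumulate(pattern_counts))
    total_grids = cumulative[-1]

    def sample_random_grid():
        # Uniform index over all grids -> (pattern, residual index within that pattern).
        r = rng.randrange(total_grids)
        p = bisect.bisect_right(cumulative, r)
        residual = r - (cumulative[p - 1] if p else 0)
        # This enumeration is the dominant cost: solutions are generated until
        # the residual-th one is reached.  Pre-committing 4 extra binary choices
        # would split each count 16 ways and cut this work accordingly.
        return next(itertools.islice(enumerate_solutions(p), residual, None))

    return sample_random_grid

def compress_counts(pattern_counts):
    # Only 1059 distinct count values occur, so each table entry can be an
    # 11-bit index into a small codebook instead of a full 16-bit count.
    codebook = sorted(set(pattern_counts))
    index_of = {c: i for i, c in enumerate(codebook)}
    return codebook, [index_of[c] for c in pattern_counts]
```

The table lookup itself is negligible; essentially all of the per-sample time goes into the enumeration step.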
So optimistically I guess a 2.5GB * 16 * 9/16 = 22.5GB table (the current 2.5GB table, 16x the entries, 9 bits per entry instead of 16) might yield a ~12-15x speedup. But 22.5GB is getting pretty big for most people to hold in memory.
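Spelled out (assuming the 2.5GB figure is the current table size):

```python
# Back-of-envelope for the figures above.
current_gb = 2.5                    # current table, 16 bits per entry
split      = 16                     # 4 extra binary choices => 16x more entries
new_bits, old_bits = 9, 16          # optimistic entry width after the split
print(current_gb * split * new_bits / old_bits)  # 22.5 (GB)
print(1362 / split)                 # ~85 solutions enumerated per sample if the
                                    # splits were balanced; a bit more in practice
```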
Bottom line: for random sampling I don't think my approach can be made competitive with the SSD+catalog unless the SSD approach is IO-bound enough that it doesn't scale well with lots of cores.