The New Sudoku Players' Forum

by **tdillon** » Mon May 25, 2020 6:46 am

Hi All,

A while back I was exploring some things that called for enumerating and sampling Sudoku grids, so I added some tools in the tdoku project that may be of general interest. I'm aware from an old thread here that there exist tools to work with a 54GB (5.7GB compressed) catalog of all the solution grids, and that the sequential access rate (albeit on 2010 hardware) was in the range of 100-200k grids/second. The approach I've taken is a little different, and has some advantages in terms of table size and streaming access rate, though I'm not sure how it compares on random access, or on the utility of its order of streaming access.

Briefly, I compute a table that stores, for each of the 36288^2 = 1316818944 essentially different patterns of x's and y's below, the number of solutions completing the grid. Each entry is two bytes for a total of 2.5GB, and the table sums to 3546146300288, which is the full grid count factoring out 9! * 72^2 for digit/row/col/block permutations.

Code:
1 2 3|x x x|x x x
4 5 6|x x x|x x x
7 8 9|x x x|x x x
-----+-----+-----
y y y|. . .|. . .
y y y|. . .|. . .
y y y|. . .|. . .
-----+-----+-----
y y y|. . .|. . .
y y y|. . .|. . .
y y y|. . .|. . .
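The figures quoted for the table can be checked with a few lines of arithmetic. The sketch below only re-derives numbers stated in the post; the one addition, labeled in the comments, is my own reading that 36288 is the 2612736 completions of a band with its first box fixed, divided by the 72 remaining column symmetries.

```python
import math

PATTERNS_PER_BAND = 36288          # from the post
TABLE_SUM = 3546146300288          # from the post

# Assumption (not stated in the post): with box 1 fixed there are 2612736
# ways to fill the rest of a band, and 72 = 2 * 6 * 6 symmetries remain
# (swap the other two boxes, permute the columns inside each of them).
assert 2612736 // 72 == PATTERNS_PER_BAND

ENTRIES = PATTERNS_PER_BAND ** 2   # one x-pattern paired with one y-pattern
TABLE_BYTES = 2 * ENTRIES          # two bytes per entry

# Undo the factoring by digit relabelings (9!) and the 72^2 symmetries.
total_grids = TABLE_SUM * math.factorial(9) * 72 ** 2

print(ENTRIES)                     # 1316818944
print(TABLE_BYTES / 1e9)           # about 2.63 (decimal GB), i.e. roughly 2.45 GiB
print(total_grids)                 # 6670903752021072936960
```

The last value is the well-known total number of Sudoku solution grids, so the table sum is consistent with it.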

Additionally, I compute a much smaller index that maps every millionth grid to the index of the pattern for which it's a solution and the offset of that solution given tdoku's order of enumeration (which, while stable for a given version of tdoku, is admittedly not in any way canonical). To find a particular grid we jump into the index, scan forward a short way to determine the correct pattern and offset, and then ask tdoku to find the nth solution for that pattern. For sequential access we just return every solution the solver finds, which is really fast.
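The jump-then-scan lookup can be sketched on a toy table. This is only an illustration under stated assumptions: `build_index` and `locate` are hypothetical names rather than tdoku functions, the counts are made up, and the stride is 4 instead of one million. The final step, asking the solver for the nth solution of the pattern, is left out.

```python
def build_index(counts, stride):
    """Record (pattern, offset) for grids 0, stride, 2*stride, ..."""
    index = []
    target = 0      # global number of the next grid to record
    start = 0       # global number of the first grid of the current pattern
    for pattern, count in enumerate(counts):
        while target < start + count:
            index.append((pattern, target - start))
            target += stride
        start += count
    return index


def locate(counts, index, stride, n):
    """Return (pattern, offset) of the n-th grid using the sparse index."""
    pattern, offset = index[n // stride]      # jump into the index
    remaining = offset + n % stride           # distance from that pattern's start
    while remaining >= counts[pattern]:       # short forward scan
        remaining -= counts[pattern]
        pattern += 1
    return pattern, remaining


# Toy data: solution counts for five hypothetical patterns.
counts = [3, 0, 5, 2, 7]
index = build_index(counts, stride=4)
print(locate(counts, index, 4, 7))   # (2, 4): grid 7 is the 5th solution of pattern 2
```

The forward scan never crosses more than `stride` grids, which is why a small index is enough for random access into a table with about 1.3 billion entries.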

The key numbers are:
* uncompressed table size: 2.5 GB
* compressed table size: 450 MB
* sequential grid enumeration: 2m / sec / core
* random grid sampling: 130 / sec / core

If this sounds useful to you, you can find the code here and the tables here.

Code:
# (1) uncompress tables.tar.xz in the tdoku project directory
# (2) build:
./BUILD.sh
# (3) list or sample grids
build/grid_tools list_grids 0 10   # list the first 10 grids
build/grid_tools sample_grids 10   # sample 10 random grids
# (4) sample some minimal puzzles
build/grid_tools sample_puzzles

Other notes: (1) you can use grid_tools to compute the tables yourself, but it took me around 500 core hours. (2) also included, as illustrated above, is a rejection sampler for slowly generating minimal puzzles with uniform probability conditioned on clue count.

by **denis_berthier** » Tue May 26, 2020 7:06 am

Hi tdillon

This seems to be very interesting. I've downloaded everything and will try parts of it as soon as my Mac has free time.
I've extensively used gsf's compressed file and decompressor program in the past, in combination with a program to generate minimal puzzles (with any number of clues) with a known bias (easily related to the number of clues).

tdillon wrote:The key numbers are:
* uncompressed table size: 2.5 GB
* compressed table size: 450 MB

Maybe we're not speaking of the same thing, but on my Mac, uncompressed tdoku-tables is 390.5 MB; inside it, data.zip is 73.2 MB and uncompressed data is 315.4 MB.
Could you clarify?

tdillon wrote:If this sounds useful to you, you can find the code here

I think a better url to download the code is https://github.com/t-dillon/tdoku

tdillon wrote: (2) also included, as illustrated above, is a rejection sampler for slowly generating minimal puzzles with uniform probability conditioned on clue count.

Do you have stats about the generation rate of minimal puzzles for each fixed clue count?

by **tdillon** » Tue May 26, 2020 7:52 am

Hi Denis,

denis_berthier wrote:Maybe we're not speaking of the same thing, but on my Mac, uncompressed tdoku-tables is 390.5 MB; inside it, data.zip is 73.2 MB and uncompressed data is 315.4 MB. Could you clarify?

The compressed file tables.tar.xz found on the releases tab of the tdoku project should be 450 MB. It contains two files, grid.counts and grid.index, which are 2.5 GB and 20 MB respectively when uncompressed. The file data.zip is different: it contains a bunch of benchmarking datasets unrelated to grid enumeration (although I've added to data.zip a new set of puzzles arising from this sampler).

denis_berthier wrote:I think a better url to download the code is https://github.com/t-dillon/tdoku

Just so. I had in mind dropping a link to read the relevant code, but if you want to try using it then you certainly want the whole project.

denis_berthier wrote:Do you have stats about the generation rate of minimal puzzles for each fixed clue count?

No, I wasn't really keeping track. But roughly, to generate the 1m samples found in puzzles8_unbiased in data.zip I had a 32core/64 thread box churning for I think around a month, as well as help from a few laptops and a fast cloud VM for part of that time. I think the vast majority of the time is spent in grid sampling rather than testing whether a clue ordering gives a minimal puzzle, and the grid sampling is definitely something that could be sped up with a little work at the expense of a larger table.

Note the size distribution below from the 1m puzzles I generated agrees pretty well with the results in your paper:

Code:
$ cat puzzles8_unbiased | tr -d . | awk '{ print length($1) }' | sort -n | uniq -c | awk -F' ' '{ n[$2]=$1; m+=$1 } END { for (i=21;i<32;i++) printf("%4d%8d%8.3f%%\n", i,n[i],100*n[i]/m) }'
  21      26   0.003%
  22    1161   0.116%
  23   18692   1.869%
  24  118787  11.879%
  25  305485  30.549%
  26  337857  33.786%
  27  170620  17.062%
  28   41826   4.183%
  29    5157   0.516%
  30     375   0.037%
  31      14   0.001%
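For anyone who prefers not to chain `tr`, `awk`, `sort` and `uniq`, the same tally can be sketched in Python. This assumes, as the pipeline above does, one puzzle per line with `.` marking empty cells and the puzzle in the first whitespace-separated field; `clue_histogram` and `format_histogram` are illustrative names, and unlike the awk loop this version prints whatever clue counts occur rather than a fixed 21 to 31 range.

```python
from collections import Counter


def clue_histogram(lines):
    """Count puzzles by number of clues (non-'.' characters in field 1)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        puzzle = fields[0]
        counts[sum(ch != "." for ch in puzzle)] += 1
    return counts


def format_histogram(counts):
    """Render rows in the same layout as the awk printf above."""
    total = sum(counts.values())
    return [f"{k:4d}{counts[k]:8d}{100 * counts[k] / total:8.3f}%"
            for k in sorted(counts)]


# Usage with a file in the working directory (path is an assumption):
# with open("puzzles8_unbiased") as f:
#     print("\n".join(format_histogram(clue_histogram(f))))
```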

by **denis_berthier** » Tue May 26, 2020 8:50 am

tdillon wrote:The compressed file tables.tar.xz found on the releases tab of the tdoku project should be 450 MB. It contains two files, grid.counts and grid.index, which are 2.5 GB and 20 MB respectively when uncompressed. The file data.zip is different: it contains a bunch of benchmarking datasets unrelated to grid enumeration (although I've added to data.zip a new set of puzzles arising from this sampler).

OK, thanks. I confused "tables" with "tdoku-tables".
I now have 3 folders, tdoku-master, tables and tdoku-tables, inside a common super-folder. Should I move the other two inside "tdoku-master" for everything to work correctly?

tdillon wrote:
denis_berthier wrote:Do you have stats about the generation rate of minimal puzzles for each fixed clue count?

No, I wasn't really keeping track. But roughly, to generate the 1m samples found in puzzles8_unbiased in data.zip I had a 32core/64 thread box churning for I think around a month, as well as help from a few laptops and a fast cloud VM for part of that time. I think the vast majority of the time is spent in grid sampling rather than testing whether a clue ordering gives a minimal puzzle, and the grid sampling is definitely something that could be sped up with a little work at the expense of a larger table.

When I generated my roughly 6,000,000 controlled-bias puzzles, I also used several computers from my lab in parallel, when they were not used, mainly at night, on weekends and Easter holidays, and I didn't keep track either of the generation time. But it spanned two or three months and it seemed to be an eternity.

tdillon wrote:Note the size distribution below from the 1m puzzles I generated agrees pretty well with the results in your paper:
Code:
$ cat puzzles8_unbiased | tr -d . | awk '{ print length($1) }' | sort -n | uniq -c | awk -F' ' '{ n[$2]=$1; m+=$1 } END { for (i=21;i<32;i++) printf("%4d%8d%8.3f%%\n", i,n[i],100*n[i]/m) }'
  21      26   0.003%
  22    1161   0.116%
  23   18692   1.869%
  24  118787  11.879%
  25  305485  30.549%
  26  337857  33.786%
  27  170620  17.062%
  28   41826   4.183%
  29    5157   0.516%
  30     375   0.037%
  31      14   0.001%

Great! It's always good to see a substantially different approach leading to close results.

I have one new question about generation of various types of puzzles (easy, hard, ...) by proximity search. You say you use a rating based on a sat-solver. Have you checked how it correlates with the usual SER?

by **tdillon** » Tue May 26, 2020 3:07 pm

I see the confusion ... you can ignore tdoku-tables.zip entirely. That's just a snapshot of the entire project that was taken by github when I created a release in order to host the large tables.tar.xz file. All you should need to set up is the following:

Code:
git clone https://www.github.com/t-dillon/tdoku
cd tdoku
unzip data.zip
tar xf tables.tar.xz
# tdoku is generally fastest with clang-8
CC=clang-8 CXX=clang++-8 ./BUILD.sh
# run tests and benchmarks, sample some puzzles
build/run_tests
build/run_benchmark -t5 -w1 data/*
build/grid_tools sample_puzzles

If you want to use the proximity searching puzzle generator to make hard puzzles you'll also want to install minisat. I haven't tried this on a Mac. On Ubuntu at least, minisat seems to require libz-dev, which may not get installed automatically.

Minisat difficulty has a decent correlation with SER, though I haven't tried to quantify it precisely or characterize when they disagree. Certainly there are some not-so-hard techniques based on uniqueness that are unavailable to minisat. I would also guess that a puzzle requiring many steps of moderate difficulty might look harder to minisat than one that becomes easy after a single really difficult step. It would be interesting to understand this better, and to understand how minisat difficulty relates to human experience. But I've chiefly used it because it's fast to evaluate and it's based on a powerful and general algorithm that probably comes closer to a notion of hard-for-computers.

by **denis_berthier** » Wed May 27, 2020 4:19 am

tdillon wrote:Minisat difficulty has a decent correlation with SER, though I haven't tried to quantify it precisely or characterize when they disagree. Certainly there are some not-so-hard techniques based on uniqueness that are unavailable to minisat. I would also guess that a puzzle requiring many steps of moderate difficulty might look harder to minisat than one that becomes easy after a single really difficult step. It would be interesting to understand this better, and to understand how minisat difficulty relates to human experience. But I've chiefly used it because it's fast to evaluate and it's based on a powerful and general algorithm that probably comes closer to a notion of hard-for-computers.

For low ratings, uniqueness may make a difference, but in my experience, it rarely does, so that it's statistically meaningless.
Many moderate steps vs a single hard one is something no one currently knows how to deal with. All the current ratings are based on the hardest step. It may make the minisat one interesting on its own.
When you generate a puzzle, you get its minisat rating at the same time, don't you? Could you output it on the same line as the puzzle, as is usually done with SER? That'd allow easy correlation computations.
As for human rating, there are so many different views of how to classify the rules that any choice will be criticised by the majority. Some people even consider hidden singles as simpler than naked ones.

by **Mathimagics** » Wed May 27, 2020 5:08 am

Hi Tom,

tdillon wrote:Minisat difficulty has a decent correlation with SER ...

This is somewhat of a surprise to me.

What is your MiniSAT rating based on, exactly? Execution time?

by **tdillon** » Wed May 27, 2020 6:11 am

Yes, the proximity-searching generator prints out a solver backtrack rating among other things. But the backtracking encountered by the generator isn't the most useful number. Unlike SER, solver backtracking varies between different permutations of the same puzzle, so the generator rates puzzles by averaging over a configurable number of them. To get the best measure of difficulty you want to average over many permutations, but when generating puzzles you only want to average over a few, both to keep evaluation fast and to keep enough noise in the process to escape local minima.
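To make the averaging idea concrete, here is a minimal sketch of rating a puzzle over random equivalent permutations. It is not tdoku's code: `permute_puzzle` and `average_rating` are hypothetical names, and `rate` stands in for whatever backtrack counter you have (for example a wrapper around a solver). The transformations used are the standard validity-preserving ones: digit relabeling, band/stack and row/column shuffles, and transposition.

```python
import random


def permute_puzzle(puzzle, rng):
    """Return a random equivalent form of an 81-character puzzle string."""
    # Relabel digits 1-9; '.' stays empty.
    digits = list("123456789")
    shuffled = digits[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(digits, shuffled))
    relabel["."] = "."

    def line_order():
        # Shuffle the three bands, then the three lines inside each band.
        bands = [0, 1, 2]
        rng.shuffle(bands)
        order = []
        for band in bands:
            inner = [0, 1, 2]
            rng.shuffle(inner)
            order.extend(3 * band + i for i in inner)
        return order

    rows, cols = line_order(), line_order()
    cells = [[puzzle[9 * r + c] for c in cols] for r in rows]
    if rng.random() < 0.5:                      # transpose half the time
        cells = [list(col) for col in zip(*cells)]
    return "".join(relabel[ch] for row in cells for ch in row)


def average_rating(puzzle, rate, n, seed=0):
    """Average rate(...) over n random permutations of the puzzle."""
    rng = random.Random(seed)
    return sum(rate(permute_puzzle(puzzle, rng)) for _ in range(n)) / n
```

A small `n` keeps evaluation cheap and noisy, which is what the generator wants; a large `n` gives the steadier number you would use for a final rating.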

So to generate puzzles I might run something like this:

Code:
build/generate -p0 -c0 -g1 -r0 -n500 -d1 -e5 -s1 | tee puzzles.out

Which says generate vanilla (non-pencilmark) puzzles, with no preference based on clue count (other than the generator's intrinsic bias), with a positive preference for hard puzzles (that cause more guesses), keeping a pool of 500 puzzles, mutating puzzles by dropping 1 clue and re-completing, evaluating each puzzle over 5 permutations using minisat.

After a while you might find the puzzles rated hardest by the generator look like this:

Code:
sort -t' ' -k3 -n puzzles.out | tail -n5 | tee hardest
..83...5...4..69..3...8.4.......8....7..4......356.8.........9...9..53..2.......1 22 137.2 -137.2
.......7....85.3.....93...8..7..1.2...........5.39....6..7...4..8..2.9..7.2...... 21 153.0 -153.0
.......7....85.3.....93...5..4..6.2.....4.....5.39....1......62.8....9..4.6...... 21 157.6 -157.6
..1.56.7....8..3......3...56.7..1.2...........5.39....1......4..8......94.6...... 21 166.0 -166.0
.......7..4.85.3.....93...8..4..6.2...........5.39....1..7...4..8..4.9....6...... 21 189.6 -189.6

A minisat hardness of >130 would be impressive, but there is a lot of noise in the generator's evaluation, so I usually re-evaluate over, say, 200 permutations:

Code:
paste <(cut -d' ' -f1 hardest) <(build/run_benchmark -a -b -sminisat_augmented -n200 hardest)
..83...5...4..69..3...8.4.......8....7..4......356.8.........9...9..53..2.......1 75.6
.......7....85.3.....93...8..7..1.2...........5.39....6..7...4..8..2.9..7.2...... 84.3
.......7....85.3.....93...5..4..6.2.....4.....5.39....1......62.8....9..4.6...... 86.1
..1.56.7....8..3......3...56.7..1.2...........5.39....1......4..8......94.6...... 92.1
.......7..4.85.3.....93...8..4..6.2...........5.39....1..7...4..8..4.9....6...... 77.0

These particular puzzles have SER in the range 9.5 to 10.1. But in general you can use the benchmark command above to evaluate minisat hardness for any existing set of puzzles with SER ratings that you want to compare.

by **tdillon** » Wed May 27, 2020 6:35 am

Mathimagics wrote:Hi Tom,
tdillon wrote:Minisat difficulty has a decent correlation with SER ...

This is somewhat of a surprise to me.

I shouldn't overstate the correlation because I haven't really studied it in detail, but as rough indicators, easy puzzles tend to have a minisat rating of zero, while SER > 11 puzzles tend to have minisat ratings around 60:

Code:
build/run_benchmark -a -b -sminisat_augmented -n200 <(head -n50 data/puzzles4_forum_hardest_1905_11+) | awk '{ x+=$1 } END { print x/50 }'
61.29

And here's the distribution of floor(SER) for a bunch of puzzles I've generated with minisat rating > 60:

Code:
    637 2
    696 3
  15982 4
   2062 5
    380 6
   2523 7
   1362 8
  27734 9
   6203 10
    152 11

So some not so hard, but the majority in the 9+ range.
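To put a number on "the majority", the counts above can be summed directly. The snippet uses only the figures from the listing, read as count followed by floor(SER).

```python
# floor(SER) -> number of puzzles, copied from the listing above
ser_counts = {2: 637, 3: 696, 4: 15982, 5: 2062, 6: 380,
              7: 2523, 8: 1362, 9: 27734, 10: 6203, 11: 152}

total = sum(ser_counts.values())
hard = sum(n for ser, n in ser_counts.items() if ser >= 9)

print(total, hard, round(100 * hard / total, 1))   # 57731 34089 59.0
```

So about 59% of these puzzles rate SER 9 or above, with a second cluster of roughly 28% at SER 4.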

Mathimagics wrote:What is your MiniSAT rating based on, exactly? Execution time?

You can base it either on time or on backtracks. That's what the -b option of run_benchmark governs, but I usually use backtracks because it makes the metrics a little more comparable when you use different solvers that may be much faster or slower, but that make similar decisions.

by **yzfwsf** » Wed May 27, 2020 6:51 am

Hi Tom
Is your puzzle generator based on seed and neighborhood search? If you don't use seeds and randomly generate puzzles, what is the probability of SE9.0 or higher?

by **tdillon** » Wed May 27, 2020 7:32 am

Hi yzfwsf,

The process is as follows:

1. Create a set of N randomly generated minimal puzzles.
2. Give each puzzle a score based on some linear combination of its clue count and its backtracking rating via minisat.
3. Insert the scored puzzles into a priority queue.
4. Draw a random puzzle from the queue. Drop some clues, re-constrain, and re-minimize.
5. Score the new puzzle by evaluating with minisat under random permutations.
6. If the new puzzle is better than the worst puzzle in the queue, insert it into the queue and eject the worst puzzle.
7. Goto 4.

So over time the set of puzzles in the queue gets enriched with hard puzzles or low clue puzzles or high clue puzzles depending on what we're driving at. How fast this happens, and whether it's likely to get stuck in a local minimum with a collapse of diversity, depends on the scoring weights, the number of clues you're dropping, and the pool size.
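The loop in steps 1 to 7 can be sketched generically. This is a minimal illustration and not the tdoku generator: `proximity_search`, `score` and `mutate` are hypothetical names, and the toy problem at the bottom (bit tuples scored by their number of ones) merely stands in for puzzles. For Sudoku, `mutate` would drop clues, re-constrain and re-minimize, and `score` would combine clue count with an averaged minisat backtrack rating.

```python
import heapq
import random


def proximity_search(seeds, score, mutate, steps, rng):
    """Keep a fixed-size pool; replace the worst member when a child beats it."""
    # Min-heap of (score, tiebreak, item): the worst item is always pool[0].
    pool = [(score(x), i, x) for i, x in enumerate(seeds)]
    heapq.heapify(pool)
    counter = len(pool)
    for _ in range(steps):
        _, _, parent = rng.choice(pool)          # step 4: draw a random member
        child = mutate(parent, rng)
        s = score(child)                         # step 5: score the new item
        if s > pool[0][0]:                       # step 6: beats the worst?
            heapq.heapreplace(pool, (s, counter, child))
            counter += 1
    return sorted(pool, reverse=True)            # best first


# Toy stand-in for puzzles: 16-bit tuples, "harder" means more ones.
def flip_one(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


rng = random.Random(0)
seeds = [tuple(rng.randint(0, 1) for _ in range(16)) for _ in range(8)]
pool = proximity_search(seeds, score=sum, mutate=flip_one, steps=500, rng=rng)
print([s for s, _, _ in pool])
```

Because only the worst member is ever ejected, the pool's minimum score never decreases; a small pool or a deterministic score makes it easier for the pool to collapse onto near-copies of one item, which is the loss of diversity mentioned above.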

You can have it output every puzzle it generates along the way or just those that score well enough to enter the queue. But either way the generator doesn't compute SE scores. That's something you need to do after the fact. So it's hard to say how long it takes on average to find a SE9.0. I haven't focused much on SE scores so it's not a question I've tried to answer. But also it's just going to depend on parameters.

by **m_b_metcalf** » Wed May 27, 2020 7:50 am

yzfwsf wrote:Hi Tom
Is your puzzle generator based on seed and neighborhood search? If you don't use seeds and randomly generate puzzles, what is the probability of SE9.0 or higher?

I'm not Tom, but I thought it was an interesting question. I randomly generated 1296 puzzles (each by deleting clues at random from a fresh random grid until minimal). The spread of ratings was:

Hidden Text: Show

Not a single one rated 9.0 or higher. I then ran the program again, asking it to stop if it found a puzzle at 9.0 or above (using a proxy test), and this time it stopped after generating only 67 with

Code:
. . 1 . 2 . . 3 .
. . . 4 . . . . .
3 2 5 1 . . . . .
. 1 6 . 5 . 7 . .
8 . . . . 1 . . .
. . . 6 . . 4 . .
6 . . . . 9 . . .
. . 7 . . 6 9 . 5
. 9 . . . . . . 7
ED=9.1/9.1/2.6

I repeated that test, and it stopped at 247 with a 9.0/1.2/1.2. So, they're rare but they exist.

Regards,

Mike

by **yzfwsf** » Wed May 27, 2020 8:09 am

Hi Tom
Interestingly, most of the puzzles you generate have similar MSLS.

Code:
.......7....85.3.....93...8..7..1.2...........5.39....6..7...4..8..2.9..7.2...... 21 153.0 -153.0
MSLS:17 Cells r2368c1368+r1c6,17 Links 9r2,5r3,8r6,35r8,124c1,146c3,2467c6,16c8
25 Eliminations:r5c138,r7c3,r9c8<>1,r5c16,r1c1<>2,r1c13,r5c136,r4c1,r9c6<>4,r3c7,r8c4<>5,r5c368,r9c68,r1c3<>6,r2c29<>9
.......7....85.3.....93...5..4..6.2.....4.....5.39....1......62.8....9..4.6...... 21 157.6 -157.6
MSLS:17 Cells r1479c2457+r1c6,17 Links 1246r1,17r4,47r7,127r9,39c2,5c4,8c5,58c7
24 Eliminations:r1c39,r9c689,r4c9<>1,r1c13,r9c6<>2,r5c2<>3,r1c9,r7c6<>4,r5c47,r8c4<>5,r4c19,r7c3,r9c9<>7,r356c7<>8,r25c2<>9
..1.56.7....8..3......3...56.7..1.2...........5.39....1......4..8......94.6...... 21 166.0 -166.0
MSLS:16 Cells r2368c1368, 16 Links 59r2,89r3,8r6,35r8,27c1,24c3,247c6,16c8
22 Eliminations:r9c8<>1,r5c136,r7c36,r1c1,r9c6<>2,r5c36<>4,r8c47<>5,r579c6<>7,r6c79,r3c7<>8,r3c247,r2c2<>9
.......7..4.85.3.....93...8..4..6.2...........5.39....1..7...4..8..4.9....6...... 21 189.6 -189.6
MSLS:17 Cells r2368c1368+r1c6,17 Links 9r2,5r3,8r6,35r8,267c1,127c3,1247c6,16c8
26 Eliminations:r5c368,r9c68,r1c3<>1,r1c13,r5c136,r7c36,r9c6<>2,r8c9<>3,r5c6<>4,r8c49,r3c7<>5,r5c18<>6,r5c13,r4c1<>7,r6c7<>8,r2c9<>9
.......7....85.3.....93...8..7..1.2...........5.39....6..7...4..8..2.9..7.2...... 84.3
MSLS:17 Cells r2368c1368+r1c6,17 Links 9r2,5r3,8r6,35r8,124c1,146c3,2467c6,16c8
25 Eliminations:r5c138,r7c3,r9c8<>1,r5c16,r1c1<>2,r1c13,r5c136,r4c1,r9c6<>4,r3c7,r8c4<>5,r5c368,r9c68,r1c3<>6,r2c29<>9
.......7....85.3.....93...5..4..6.2.....4.....5.39....1......62.8....9..4.6...... 86.1
MSLS:17 Cells r1479c2457+r1c6,17 Links 1246r1,17r4,47r7,127r9,39c2,5c4,8c5,58c7
24 Eliminations:r1c39,r9c689,r4c9<>1,r1c13,r9c6<>2,r5c2<>3,r1c9,r7c6<>4,r5c47,r8c4<>5,r4c19,r7c3,r9c9<>7,r356c7<>8,r25c2<>9
..1.56.7....8..3......3...56.7..1.2...........5.39....1......4..8......94.6...... 92.1
MSLS:16 Cells r2368c1368, 16 Links 59r2,89r3,8r6,35r8,27c1,24c3,247c6,16c8
22 Eliminations:r9c8<>1,r5c136,r7c36,r1c1,r9c6<>2,r5c36<>4,r8c47<>5,r579c6<>7,r6c79,r3c7<>8,r3c247,r2c2<>9
.......7..4.85.3.....93...8..4..6.2...........5.39....1..7...4..8..4.9....6...... 77.0
MSLS:17 Cells r2368c1368+r1c6,17 Links 9r2,5r3,8r6,35r8,267c1,127c3,1247c6,16c8
26 Eliminations:r5c368,r9c68,r1c3<>1,r1c13,r5c136,r7c36,r9c6<>2,r8c9<>3,r5c6<>4,r8c49,r3c7<>5,r5c18<>6,r5c13,r4c1<>7,r6c7<>8,r2c9<>9

Hi Mike:
I use a layered top-down approach to get puzzles with difficulty above SE9.0. This method tries to keep the number of clues remaining in each row / column / box no more than three.
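The "no more than three clues per row / column / box" condition is easy to test on an 81-character puzzle string. The sketch below only checks the condition; it says nothing about how the layered top-down search itself works, and `clues_per_unit` and `within_limit` are illustrative names.

```python
def clues_per_unit(puzzle):
    """Return clue counts for the 9 rows, 9 columns and 9 boxes."""
    rows, cols, boxes = [0] * 9, [0] * 9, [0] * 9
    for i, ch in enumerate(puzzle):
        if ch != ".":
            r, c = divmod(i, 9)
            rows[r] += 1
            cols[c] += 1
            boxes[3 * (r // 3) + c // 3] += 1
    return rows, cols, boxes


def within_limit(puzzle, limit=3):
    """True if no row, column or box holds more than `limit` clues."""
    return all(n <= limit for unit in clues_per_unit(puzzle) for n in unit)


print(within_limit("123" + "." * 78))    # True: row 1 and box 1 hold 3 clues
print(within_limit("1234" + "." * 77))   # False: row 1 holds 4 clues
```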

by **m_b_metcalf** » Wed May 27, 2020 8:18 am

yzfwsf wrote:I use a layered top-down approach to get puzzles with difficulty above SE9.0. This method tries to keep the number of clues remaining in each row / column / box no more than three.

Yes, if you're after hard puzzles you need biases of some sort. See here, for example.

Mike

by **Mathimagics** » Fri Jun 05, 2020 6:04 am

tdillon wrote:The key numbers are:
sequential grid enumeration: 2m / sec / core
random grid sampling: 130 / sec / core

For the LCT project, I invested in SSD drives. One 500 GB drive can hold the entire ED grid catalog, uncompressed.
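As a rough sanity check on that size, assuming one byte per cell (81 bytes per grid, an assumption about the storage format rather than something stated here) and the known count of 5,472,730,538 essentially different grids:

```python
ED_GRIDS = 5_472_730_538       # essentially different Sudoku solution grids
BYTES_PER_GRID = 81            # assumption: one ASCII digit per cell, no separator

catalog_bytes = ED_GRIDS * BYTES_PER_GRID
print(catalog_bytes)                   # 443291173578
print(round(catalog_bytes / 1e9, 1))   # 443.3 (decimal GB)
```

That lands just under the 500 GB mark; adding a newline per grid would bring it to about 449 GB, which still fits.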

On JILL (4-core i7-7700K) I have the catalog on a Samsung 860 EVO (1TB), and I can sample random grids at a rate of ~5400 grids / sec.

On JACK (16-core AMD Ryzen), it's a Samsung 970 EVO (1TB), but I only get 3300 random grids/sec, for some reason. Slower bus?

Tdoku grid_tools