Minisat hardness

Everything about Sudoku that doesn't fit in one of the other sections

Minisat hardness

Postby tdillon » Wed Jul 22, 2020 10:13 am

As discussed a bit in another thread I've experimented with using minisat for rating difficulty when generating puzzles. Specifically, I've been computing the minisat rating as the average number of decisions that minisat makes given an augmented SAT encoding that represents hidden singles and locked candidates and taking the average over many random row/col/band/stack/digit permutations of the puzzle. Given this, any puzzle that can be solved only with singles and locked candidates has a minisat rating of zero. Really hard sudoku can have a rating over 100. Really hard pencilmark sudoku can have minisat ratings of tens or hundreds of thousands.

This has some differences from usual rating schemes, some of which are pros, some of which are cons, some of which could go either way depending on what you're looking for, and some of which are just not understood, at least not by me:

  • ratings are fast
  • ratings are largely independent of any body of prior knowledge or practice of sudoku
  • ratings are not based on the hardest single step, but may instead be roughly proportional to the number of non-basic inferences or learned conflict clauses produced along the way.
  • ratings need to be averaged over many random permutations, so they are estimations that are not entirely reproducible.
  • ratings are influenced by minisat's heuristics, but probably to a lesser degree than they would be for backtrack ratings of solvers that don't use CDCL.
  • pigeonhole inferences are hard for minisat and uniqueness inferences are impossible, so if these techniques really simplify a puzzle's solution then the puzzle may be over-rated by minisat.
For a rough comparison to traditional ratings, here are distributions of minisat-hardness for a set of 1 million puzzles sampled using a controlled-bias generator, grouped by floor(SE rating):

Code: Select all
Distribution of Minisat-Hardness of Unbiased Puzzles by SE Range
                  SE=1   SE=2   SE=3   SE=4   SE=5   SE=6   SE=7   SE=8   SE=9
-------------------------------------------------------------------------------
Lowest               0    0.0    1.0    0.9    0.9    0.9    1.0    1.9    4.2
25th Percentile      0    0.0    1.9    1.4    1.0    1.8    2.5    4.9    7.8
Median               0    0.0    2.5    1.9    1.2    2.5    3.4    6.0    9.2
75th Percentile      0    0.0    3.4    2.7    1.9    3.5    4.5    7.3   10.8
Highest              0   12.7   13.4   13.6   12.3   16.0   20.2   20.1   20.5

It's interesting to note that puzzles in the SE4/5 range tend to be easier than SE3 puzzles as far as minisat is concerned. But after that the relationship trends as expected. Here is an extension of this relationship using puzzles from the hardest DB instead of the controlled-bias generator since this samples few very hard puzzles (all of these were evaluated over 1000 puzzle permutations):

Code: Select all
Distribution of Minisat-Hardness of Hardest DB Puzzles by SE Range
                          SE10.5-11.0     SE11.0-11.5       SE11.5+
-------------------------------------------------------------------
Lowest                        19.2            20.8            52.6
25th Percentile               45.5            47.9            73.8
Median                        52.4            56.3            78.7
75th Percentile               60.1            65.4            86.1
Highest                      117.5           125.9           123.9

So there is at the same time significant correlation between SE rating and minisat rating, but also significant variation. At the moment I have the hunches in the bullets above to explain the differences, but no great ideas for how to validate them.

Note that the highest minisat-rated puzzle in the hardest DB has a rating of 125.9 as shown above, but it's not terribly hard to find puzzles with still-higher minisat ratings. For example, here are some exemplars from minisat-hard clusters:

Code: Select all
                                                                                    Minisat       SER
38.1.....16...8.....2..........4..9....5.13.........458..6.35......2..79.5.......     126.1   11.1/1.2/1.2
..........839......69..3.......5..41...2......96..85......7...4.5.8..3......2..1.     127.0   9.8/9.8/2.6
98.76....7...9.6....6..5...5..9..........4.3...865.9...2......4..754.8.........1.     130.2   11.3/11.3/10.5
.....8..7....1..9....6...28.19....4.6.........54...93....2.6....95.3........5..1.     150.0   11.2/11.2/2.6

It's possible that at these extremes I'm just selecting puzzles that interact really badly with minisat's heuristics. But it would be interesting to hear what expert puzzle analysts have to say about them!
tdillon
 
Posts: 52
Joined: 14 June 2019

Re: Minisat hardness

Postby tarek » Wed Jul 22, 2020 1:27 pm

You are correct that with Uniqueness techniques inclusion in most resolution rules, that there will be puzzles that appear hard for mini-sat which fall to what appears to be much simpler technique.

You got got correctly as well when you said that there is (despite the above) there is a correlation between you mini-sat hardness and SER.

If our goal is (Which it has been for the past 14 years) is to defeat SER as we consider it the current "yardstick" to rate puzzle difficulty, then mini-sat will be a fresh & new angle take on this.

suxrat9, suxratt, q1,q2 were all different ways of attacking the problem and would usually screen a large group of candidates prior to rating them by SE with a great degree of success.

The observations that made us get closer were certainly over the years allowing us to find a different way to find these monsters. From a high Peral rating to a high diamond rating, passing through to diagonal clue layout & finally Mladen's observation which I mentioned in the other thread which eleven found success with. All of these puzzles will need a thorough neighborhood search as well as we have encountered several related puzzles with higher ratings in the vicinity.

Not very methodical but gives you an idea of what has been done to this date. With your success with pencilmark sudokus, I'm hoping that by combining some of the previous ideas/observations with your mini-sat approach that we would have a fresh batch of hardest.

tarek
User avatar
tarek
 
Posts: 3748
Joined: 05 January 2006

Re: Minisat hardness

Postby dobrichev » Wed Jul 22, 2020 1:32 pm

Being not an expert puzzle analyst I see a contradiction in
  • ratings are fast
    and
  • ratings need to be averaged over many random permutations.

Surely Minisat can be used as a fast filter in hard puzzles searching but it isn't capable to be used as reference due to the reasons you mentioned.

Imagine e new improved version of Minisat appears. All accumulated ratings become useless.

The learned conflict clauses should be covered by the sudoku solving techniques but I lost interest in this area due to the existing aggressive and self-reproductive mess in the terminology.
dobrichev
2016 Supporter
 
Posts: 1807
Joined: 24 May 2010

Re: Minisat hardness

Postby tdillon » Wed Jul 22, 2020 5:15 pm

dobrichev wrote:I see a contradiction between ratings being fast and needing lots of averaging

Agree there's a trade-off between precision and performance. I've generally used a small number of rating iterations during search where some noise is OK, and I've turned it up to 1000 iterations for re-rating interesting candidates. That said, even with 1000 iterations it's still a lot faster than SE for hard puzzles. For example, it took a few days on a laptop to rate all of the hardest DB puzzles at 1000 iterations, while this would have taken months using SE.

dobrichev wrote:Isn't capable of being used as a reference ... Imagine a new version of Minisat appears ...

Agree that non-determinism stands in the way of using the minisat rating as a reference, though I suppose you could first canonicalize the puzzle and then generate a sequence of permutations from a fixed seed. Maybe that's not so bad. Any rating system will have some arbitrary element; any rating system will encourage trophy hunting; and any rating system will have inertia once it's been widely used to rate and compare puzzles.

dobrichev wrote:Learned conflict clauses should be covered by the sudoku solving techniques

I'd put this the other way around: the backtracking of any DPLL or CDCL-based solver can be rewritten from the bottom up as a resolution-refutation proof, so in this sense any set of sudoku resolution rules is subsumed by the one resolution rule of propositional logic. This is one reason why I share your lack of interest in wading into nomenclature.

Overall I guess my aim with this post was not really to recommend minisat rating as a reference scheme, but rather (1) to point out that it's been useful to me in fast puzzle search, (2) to give some stats for context, and (3) to raise a subjective question that may be difficult to answer: is a rating scheme that's independent of our priors and not based on the hardest step likely to find satisfying puzzles in a fresh way? (e.g., puzzles that present new challenges, diverse challenges, are not one-trick ponies, etc.)
tdillon
 
Posts: 52
Joined: 14 June 2019

Re: Minisat hardness

Postby coloin » Thu Jul 23, 2020 11:10 pm

I could be completely wrong but this puzzle has the highest suexratt rating .....
Code: Select all
 
 . . 3 | . . . | . . . 
 4 . . | . 8 . | . 3 6 
 . . 8 | . . . | 1 . . 
-------+-------+------
 . 4 . | . 6 . | . 7 3 
 . . . | 9 . . | . . . 
 . . . | . . 2 | . . 5 
-------+-------+------
 . . 4 | . 7 . | . 6 8 
 6 . . | . . . | . . . 
 7 . . | 6 . . | 5 . .

I Wonder what you make of it !!!!
coloin
 
Posts: 1983
Joined: 05 May 2005

Re: Minisat hardness

Postby tdillon » Fri Jul 24, 2020 12:06 am

coloin wrote:I could be completely wrong but this puzzle has the highest suexratt rating .....
Code: Select all
 
 . . 3 | . . . | . . . 
 4 . . | . 8 . | . 3 6 
 . . 8 | . . . | 1 . . 
-------+-------+------
 . 4 . | . 6 . | . 7 3 
 . . . | 9 . . | . . . 
 . . . | . . 2 | . . 5 
-------+-------+------
 . . 4 | . 7 . | . 6 8 
 6 . . | . . . | . . . 
 7 . . | 6 . . | 5 . .

I Wonder what you make of it !!!!

It's not the most extreme by minisat, but certainly in rarefied air:
Code: Select all
tdoku$ build/run_benchmark -a -b -n1000 -sminisat_augmented \
  <(echo ..3......4...8..36..8...1...4..6..73...9..........2..5..4.7..686........7..6..5..)
       105.4   
tdillon
 
Posts: 52
Joined: 14 June 2019

Re: Minisat hardness

Postby RSW » Fri Jul 24, 2020 6:35 am

coloin wrote:I could be completely wrong but this puzzle has the highest suexratt rating .....
Code: Select all
 
 . . 3 | . . . | . . . 
 4 . . | . 8 . | . 3 6 
 . . 8 | . . . | 1 . . 
-------+-------+------
 . 4 . | . 6 . | . 7 3 
 . . . | 9 . . | . . . 
 . . . | . . 2 | . . 5 
-------+-------+------
 . . 4 | . 7 . | . 6 8 
 6 . . | . . . | . . . 
 7 . . | 6 . . | 5 . .

I Wonder what you make of it !!!!


Nice puzzle. Solvable with 3 (or perhaps fewer?) steps:
JE2: (1259)R13C1, r4c3 r7c2 => -3r789c2 -7r56c3
JE2: (1259)R89C3, r2c2 r4c1 => -3r56c1 -7r123c2 -8r4c1
Sashimi X-Wing: (9):r38c125 => -9r1c2
stte

<Edit>
Explanation of JE2 eliminations (ref rule no.s from David P. Bird's compendium):
Code: Select all
 +--------------------+----------------------+-------------------+
 | 1259  125679 3     | 12457  12459  145679 | 24789 24589 2479  |
 | 4     12579  12579 | 1257   8      1579   | 279   3     6     |
 | 259   25679  8     | 23457  23459  345679 | 1     2459  2479  |
 +--------------------+----------------------+-------------------+
 | 12589 4      1259  | 158    6      158    | 289   7     3     |
 | 12358 12357  12567 | 9      1345   134578 | 2468  1248  124   |
 | 1389  1379   1679  | 13478  134    2      | 4689  1489  5     |
 +--------------------+----------------------+-------------------+
 | 12359 12359  4     | 1235   7      1359   | 239   6     8     |
 | 6     123589 1259  | 123458 123459 134589 | 23479 1249  12479 |
 | 7     12389  129   | 6      12349  13489  | 5     1249  1249  |
 +--------------------+----------------------+-------------------+

JE2: (1259)R13C1, r4c3 r7c2 => -3r789c2 -7r56c3
Rule 3: Nonbase candidate 3 is invalid in target cells => -3r7c2.
Rule 9: Mirror node 1 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r89c2 -7r56c3

JE2: (1259)R89C3, r2c2 r4c1 => -3r56c1 -7r123c2 -8r4c1
Rule 3: Nonbase candidates 7, 8 are invalid in target cells => -7r2c2 -8r4c1
Rule 9: Mirror node 1 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r56c1 -7r13c2
Last edited by RSW on Sat Sep 05, 2020 8:49 am, edited 3 times in total.
RSW
 
Posts: 89
Joined: 01 December 2018
Location: Western Canada

Re: Minisat hardness

Postby SpAce » Fri Jul 24, 2020 4:33 pm

coloin wrote:I could be completely wrong but this puzzle has the highest suexratt rating .....
I Wonder what you make of it !!!!

Well... Don't ask me to justify this too hard. Let's just pretend it makes sense.

Code: Select all
.------------------------------.------------------------------.------------------------.
|  a1259      125679    3      |   47-125   12459     467-159 |   478-29  24589  2479  |
|   4       bg1259-7  be7-1259 | cf1257*    8       cf1579*   | cf279*    3      6     |
|  a259       25679     8      |   347-25   23459     3467-59 |   1       2459   2479  |
:------------------------------+------------------------------+------------------------:
| bg1259-8    4       ce1259   | cf158*     6       cf158*    | cf289*    7      3     |
|   12358     37-125    12567  |   9        1345      3478-15 |   468-2   1248   124   |
|   1389      37-19     1679   |   3478-1   134       2       |   468-9   1489   5     |
:------------------------------+------------------------------+------------------------:
| be3-1259  ce1259-3    4      | cf1235*    7       cf1359*   | cf239*    6      8     |
|   6         123589   d1259   |   348-125  123459    348-159 |   347-29  1249   12479 |
|   7         12389    d129    |   6        12349     348-19  |   5       1249   1249  |
'------------------------------'------------------------------'------------------------'

Step 1. A fishy AALS-loop (same caveats as with an SK-loop):

1259 = wxyz

(w|x=yz)r13c1 - (y|z)r2c23|r47c1 = {(yz)R247 \ (yz)c467#(2+2) 4n3 7n2} - (y|z=wx)r89c3 - (w|x)r7c12|r24c3 = {(wx)R247 \ (wx)c467#(2+2) 2n2 4n1} - loop
=> -wxyz(1259) c467(~r247), r2c3, r56c2, r7c1, -3 r7c2, -7 r2c2, -8 r4c1 (43 eliminations, basics)

Step 2: RP(29)r1c9,r9c3 => -29 r9c9; stte

--
Edit. Corrected typo: r13c13 -> r13c1
Last edited by SpAce on Fri Jul 31, 2020 7:55 pm, edited 1 time in total.
-SpAce-: Show
Code: Select all
   *             |    |               |    |    *
        *        |=()=|    /  _  \    |=()=|               *
            *    |    |   |-=( )=-|   |    |      *
     *                     \  ¯  /                   *   

"If one is to understand the great mystery, one must study all its aspects, not just the dogmatic narrow view of the Jedi."
User avatar
SpAce
 
Posts: 2673
Joined: 22 May 2017

Re: Minisat hardness

Postby yzfwsf » Sun Jul 26, 2020 10:53 am

My solver found a double JE.
Double JE.png
Double JE.png (43.42 KiB) Viewed 362 times
yzfwsf
 
Posts: 328
Joined: 16 April 2019

Re: Minisat hardness

Postby SpAce » Fri Jul 31, 2020 9:25 pm

Hi yzfwsf,

yzfwsf wrote:My solver found a double JE.

I think it also found some non-standard JE4 eliminations: -7r2c46, -8r4c46, -3rc46. Nice! They looked a bit out of place as direct eliminations at first, but I think they're actually quite clever. What was the exact justification it used for them? Does it see that (125)r247 and (159)r247 get locked as hidden triples because there has to be 2x1259 in r247c467 (and r247c7 can only hold 1x29, locking the other 6 candidates in r247c46)? Or something else?

In any case, that's a whopping 57 eliminations in a single step. Do you have examples of even more?

PS. Of course this is purely academic, as the 5-6 basic target+mirror eliminations of either JE2 trivialize the puzzle on their own. (Thus RSW's solution only needs two steps, too. One of the JE2s is redundant.)
User avatar
SpAce
 
Posts: 2673
Joined: 22 May 2017

Re: Minisat hardness

Postby yzfwsf » Mon Sep 07, 2020 9:03 am

Do you have examples of even more?

Code: Select all
........1..1..2..3.4..1..5...3.6.5.7.6....3..7....4.6..3..4..7..5..76.3.8..9.....
........1.....2.3..45.3.6.....7.......8.9.3.64.9.8.5....6.......5.9..8..8...5..63
........1.....2.....3.4.25...26..74..3......89...7......4..756..7.46....3...5....
yzfwsf
 
Posts: 328
Joined: 16 April 2019

Re: Minisat hardness

Postby Leren » Tue Sep 08, 2020 7:12 am

RSW wrote : Rule 9: Mirror node 1 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r89c2 -7r56c3

As written that looks a bit confusing. I think what you meant to say was something like :

Rule 9: Mirror node 1 has locked nonbase digit 6. All other nonbase digits can be eliminated from the mirror node => -7r56c3
Rule 9: Mirror node 2 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r89c2

Hope this helps. Leren
Leren
 
Posts: 4040
Joined: 03 June 2012

Re: Minisat hardness

Postby SpAce » Wed Sep 09, 2020 2:46 am

Leren wrote:
RSW wrote : Rule 9: Mirror node 1 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r89c2 -7r56c3

As written that looks a bit confusing. I think what you meant to say was something like :

Rule 9: Mirror node 1 has locked nonbase digit 6. All other nonbase digits can be eliminated from the mirror node => -7r56c3
Rule 9: Mirror node 2 has locked nonbase digit 8. All other nonbase digits can be eliminated from the mirror node => -3r89c2

I concur. The same is true for the other pair of mirror cells as well. Glad you caught that. When I said "perfect" (in another thread), I was just happy to see the correct rule references but didn't check they made complete sense.

Personally I think a picture is always better than words anyway. Thus I'd be happiest to see a grid with relevant markings. Then the simplest kinds of eliminations (like here) don't necessarily need any verbal explanations, as long as the different types are separated:

Code: Select all
.-------------------------------.--------------------------.----------------------.
| b1259     M1259.6-7   3       |  12457   12459    145679 |  24789  24589  2479  |
|  4        T1259-7     12579   | s1257    8       s1579   | s279    3      6     |
| b259      M259.6-7    8       |  23457   23459    345679 |  1      2459   2479  |
:-------------------------------+--------------------------+----------------------:
| T1259-8    4         t1259    | s158     6       s158    | s289    7      3     |
| M125.8-3   12357     m125.6-7 |  9       1345     134578 |  2468   1248   124   |
| M19.8-3    1379      m19.6-7  |  13478   134      2      |  4689   1489   5     |
:-------------------------------+--------------------------+----------------------:
|  12359    t1259-3     4       | s1235    7       s1359   | s239    6      8     |
|  6        m1259.8-3  B1259    |  123458  123459   134589 |  23479  1249   12479 |
|  7        m129.8-3   B129     |  6       12349    13489  |  5      1249   1249  |
'-------------------------------'--------------------------'----------------------'

Code: Select all
JE2:(1259)r13c1,r4c3,r7c2 => -3r7c2; -7r56c3, -3r89c2

JE2:(1259)r79c3,r2c2,r4c1 => -7r2c2, -8r4c1; -7r13c2, -3r56c1

Or a bit more verbose:

Code: Select all
JE2:(1259)r13c1,r4c3,r7c2 => -3r7c2 (target); -7r56c3, -3r89c2 (mirrors with locked 6, 8)

JE2:(1259)r79c3,r2c2,r4c1 => -7r2c2, -8r4c1 (targets); -7r13c2, -3r56c1 (mirrors with locked 6, 8)

(The marked grid should also make it obvious that the two could be combined into a JE4 with many more eliminations.)
-SpAce-: Show
Code: Select all
   *             |    |               |    |    *
        *        |=()=|    /  _  \    |=()=|               *
            *    |    |   |-=( )=-|   |    |      *
     *                     \  ¯  /                   *   

"If one is to understand the great mystery, one must study all its aspects, not just the dogmatic narrow view of the Jedi."
User avatar
SpAce
 
Posts: 2673
Joined: 22 May 2017

Re: Minisat hardness

Postby Leren » Wed Sep 09, 2020 9:36 am

RSW wrote : Nice puzzle. Solvable with 3 (or perhaps fewer?) steps:

JE2: (1259)R13C1, r4c3 r7c2 => -3r789c2 -7r56c3
JE2: (1259)R89C3, r2c2 r4c1 => -3r56c1 -7r123c2 -8r4c1
Sashimi X-Wing: (9):r38c125 => -9r1c2

Hopefully I'm not getting too off topic here but either of the JE2's can be replaced by some simple basics, 4 HP's and 3 intersections. This often happens when you have 2 locked non-base digit mirror node eliminations. Leren
Leren
 
Posts: 4040
Joined: 03 June 2012


Return to General