The New Sudoku Players' Forum

by **Mathimagics** » Mon Oct 04, 2021 3:27 am

1to9only wrote:The thread mentions JigsawExplainer used to rate the jigsaws, it is no longer distributed as the project has been inactive for some time.

Thanks for mentioning this, and the "hardest JS" thread info ...

by **Mathimagics** » Mon Oct 04, 2021 4:33 am

I have completed the process of verification and re-org for my full catalog of (symmetric and "nice") jigsaw layouts (JL's). The re-org was made to separate the valid JL's from the invalid JL's.

The catalog contains:

3,571,610 valid JL's

1,182,316 JL's which are also valid for JSB (Sudoku Boxes)

7,415 invalid JL's (JL's with no solutions)

The JL verification used SAT, and took about 7 hours ( 140 JLs/sec) ...

by **sigh** » Sun Oct 31, 2021 9:55 am

Thanks to Mathimagics's data, I could play around with methods of validating JS and JSB layouts. Here's what I found:

- All the valid layouts (and most invalid ones) are fast to check if you find the correct region to seed with 1-9. Choosing the wrong region can lead to significantly worse times.
- To find a good region, I did a very short search bounded search from every region. If any of these complete, then we are done. If not, then do a full search using the region which progressed the furthest - i.e. the one which was able to reject the most branches.
- In general, I found this progress calculation could provided a reasonable estimate of how much time was remaining for a search. (I display this in my solver, and was surprised at how useful and accurate it was).

For searching, what I found effective (above the basics) were:

- Automatically deriving extra constraints in advance, such as determining where the law-of-leftovers can be applied. The actual constraints generated check whether two sets of cells have the same values, so are quick to evaluate during the search.
- Choose the next candidate cell based on which ones have lead to the most conflicts previously in the search. This allows the search to very quickly reject later branches once a conflict is found.

The timings I got (on the layouts provided by Mathimagics) were:

Checking 2522 valid JS layouts: 7s
Checking 7415 invalid JS layouts: 69s
Checking 2500 valid JSB layouts: 16s

The most difficult layout for my solver, taking 3.2 seconds, was this invalid layout (link to layout):

Code: Select all: AAAAABBBBCCCAADBBBCCAADDDDBCDDDDEEEBCCCEEEFFFGEEEHHHHFGHHHHIIFFGGGHIIFFFGGGGIIIII

By contrast the hardest JS layout to validate only took 13ms. This makes me suspect that all invalid layouts should have simple refutations. I would be interested if anyone can see one for the one above, as it might gives clues as how to improve the slower cases.

by **Mathimagics** » Sun Oct 31, 2021 7:29 pm

Some general observations:

you might also consider rows and columns also as "seeds" for the validator?

checking the JL for diagonal-symmetry, you can then reduce the house (seed) choices from 9 to 5

sigh wrote:The most difficult layout for my solver, taking 3.2 seconds, was this invalid layout ... by contrast the hardest [valid] JS layout to validate only took 13ms. This makes me suspect that all invalid layouts should have simple refutations.

Surely testing valid layouts is generally going to be quicker than testing invalid layouts, since the solver can return immediately a solution is found. So I'm not so sure about your last conjecture! I'd suggest that, while most IJL's will have simple refutations, some will not.

The IJL in your example is a case in point. It takes my DFS solver anywhere from 3s to 30s depending on the seed, so even the best case still reflects an
inherent difficulty wrt detecting contradictions.

Interestingly, the best time of all for verifying the invalidity of this JL (other than the SAT method) is ~0.050s, which was obtained by a "find disjoint transversals" method using DLX. This method gives similarly good results for most IJL's, but there are many pathological cases in the 7415 file for which it takes 30s+, and of course it doesn't really shed any light on the general invalid JL recognition problem.

Cheers
MM

by **sigh** » Mon Nov 01, 2021 8:39 am

Mathimagics wrote:you might also consider rows and columns also as "seeds" for the validator?

Yes, sorry I was being imprecise. I check all 9-cell all-different constraints. So this would include things like sudoku-x diagonals, or windoku boxes if they exist also.

Mathimagics wrote:checking the JL for diagonal-symmetry, you can then reduce the house (seed) choices from 9 to 5

True, but the phase of finding a good seed is short, so would only make a difference to cases which are already fast.

Mathimagics wrote:Surely testing valid layouts is generally going to be quicker than testing invalid layouts, since the solver can return immediately a solution is found. So I'm not so sure about your last conjecture! I'd suggest that, while most IJL's will have simple refutations, some will not.

If there was a small bound on the amount of effort required to find solution to a valid layout, then detecting invalid layouts would be easy: just keep searching until you've exceeded the bound. Likewise, if there exists layouts which have hard to find solutions, it would make more sense that invalid layouts would also be hard. Even deep into a search, it could be the case that an obscure solution exists. If not, then what exactly is the search still trying to rule out?

A simple example of wasted work is handling the symmetry due to permutation of digits. Without breaking this symmetry the solver would do 9! times as much work, despite the fact the it has already done enough to disprove validity after checking one permutation. If one did not realise this, it would seem like checking for invalidity was at least 9! times harder than checking for validity. This doesn't mean that proving invalidity was difficult, just that the solver did not have the correct tools (symmetry breaking) to make the proof.

Mathimagics wrote:Interestingly, the best time of all for verifying the invalidity of this JL (other than the SAT method) is ~0.050s, which was obtained by a "find disjoint transversals" method using DLX.

Ah, I don't know this technique. I'll look into that when I get some time.

The New Sudoku Players' Forum

Jigsaw Layouts (Generate / Test)

Re: Jigsaw Layouts (Generate / Test)

Re: Jigsaw Layouts (Generate / Test)

Re: Jigsaw Layouts (Generate / Test)

Re: Jigsaw Layouts (Generate / Test)

Re: Jigsaw Layouts (Generate / Test)