The New Sudoku Players' Forum

by **dobrichev** » Tue Apr 16, 2013 5:22 pm

Here is my elementary code for backdoors.

Hidden Text: Show

Code: Select all: unsigned long long solveSingles(const char* in) { game g = defaultGame; //start from a copy of the empty game g.maxSolutions = 1; init(g, in); //set all known digits if(g.mode & MODE_STOP_PROCESSING) //invalid or solved by direct eliminations return g.nSolutions; checkForLastOccurenceInGroup(g); if(g.mode & MODE_STOP_PROCESSING) //invalid or solved by singles return g.nSolutions; return 0; //no solution } extern int solverBackdoor(char* in, const bool verbose) { char sol[256]; int found = 0; if(solve(in, 2, sol) != 1) return -1; if(solveSingles(in)) { //solved with singles, no need of backdoor examination return 0; } for(int i = 0; i < 81; i++) { if(in[i]) continue; in[i] = sol[i]; if(solveSingles(in)) { //cell at i is a backdoor found = 1; goto level1_done; } in[i] = 0; } level1_done: if(found) return 1; for(int i = 0; i < 80; i++) { if(in[i]) continue; in[i] = sol[i]; for(int j = i + 1; j < 81; j++) { if(in[j]) continue; in[j] = sol[j]; if(solveSingles(in)) { //cells at i,j are a backdoor found = 1; goto level2_done; } in[j] = 0; } in[i] = 0; } level2_done: if(found) return 2; for(int i = 0; i < 79; i++) { if(in[i]) continue; in[i] = sol[i]; for(int j = i + 1; j < 80; j++) { if(in[j]) continue; in[j] = sol[j]; for(int k = j + 1; k < 81; k++) { if(in[k]) continue; in[k] = sol[k]; if(solveSingles(in)) { //cells at i,j,k are a backdoor found = 1; goto level3_done; } in[k] = 0; } in[j] = 0; } in[i] = 0; } level3_done: if(found) return 3; for(int i = 0; i < 78; i++) { if(in[i]) continue; in[i] = sol[i]; for(int j = i + 1; j < 79; j++) { if(in[j]) continue; in[j] = sol[j]; for(int k = j + 1; k < 80; k++) { if(in[k]) continue; in[k] = sol[k]; for(int l = k + 1; l < 81; l++) { if(in[l]) continue; in[l] = sol[l]; if(solveSingles(in)) { //cells at i,j,k,l are a backdoor found = 1; //dump the backdoor if(verbose) { printf("\n%d\t%d\t%d\t%d",i, j, k, l); } else goto level4_done; } in[l] = 0; } in[k] = 0; } in[j] = 0; } in[i] = 0; } level4_done: if(found) return 4; return 5; }

Yes, some functions not included here are used, but they are in the published source of gridchecker.
Function input is array with char(0) for non-givens, char(1) for "1", etc. Note that the input is destroyed so first print the puzzle and then call the function.

Website · by **denis_berthier** » Tue Apr 16, 2013 5:42 pm

eleven wrote: -qFN -f'%#bM'

This seems to be working. I'll let it run overnight and see tomorrow morning what I get.

by **dobrichev** » Tue Apr 16, 2013 5:56 pm

eleven wrote:Here is my proof, that this 33 clue puzzle (and all it's minimals) has backdoor size 4....

Very nice. Congratulations!

eleven wrote:Backdoors and T&E are indeed very different....

What is the minimal number of jokers you need to solve a puzzle knowing only singles as techniques? That is the backdoor size in singles. A joker eliminates all wrong candidates in the cell.
The moment of asking for a joker doesn't matter. It could be at the beginning (blindly add clues from the known solution as in my code) or at the middle of the
solving process (when stuck, use a joker, but try all yet unsolved cells in the same way to ensure there is no shorter path if you use the joker over some of them).
Denis, as I understand, tries to pass over this logic by examining all but the right candidate in the questionable cell, hoping that all these candidates need less jokers later. Depth is counted, which is equal to counting the joker only once. This could work.
I am still interesting in an example. If my understanding is correct, then from the millions of bd3 puzzles I have in hands, possibly there is one which requires T&E(3).

by **eleven** » Tue Apr 16, 2013 6:58 pm

Mladen, thanks for appreciating my proof.

When you look at the 3 backdoor needing patterns in the bd3 proof, you can see, that each one can be resolved by trying at most 2 wrong candidates and using singles only. So this bd3 puzzle is T&E(1).

When i searched for hard puzzles, i adopted Brian Turners program to see, if the puzzles are in T&E(2) (but i don't know, if i have it anymore, maybe it's in a big zip file). The only changes i had to make were, that solution digits are not tried, and the recursion is limited to depth 2.

by **dobrichev** » Tue Apr 16, 2013 9:05 pm

Hmm. At this point I am totally lost.
It seems that the goal of T&E is to eliminate one by one the candidates, and a successful try is this which leads to contradiction, else is error. The guessing is the opposite. Furthermore guessing has nothing in common to T&E and a successful guess isn't an errored try.
Making 2 T&E leads to conclusion that the puzzle can be solved in T&E(1). Does 0 T&E lead to T&E(-1) in this notation?
Brian Turner's solver chooses a cell with minimal number of candidates for T&E (or guessing?). Statistically it is expected less T&E to be done working in this way. One pseudorandom cell from those with least candidates is chosen, and no other cells are tried at the same level. Isn't so measured T&E depth always the shortest one, or it is almost always?
A candidate elimination in the middle of the process adds more value than fixing the value of the same cell in the beginning.
Have to follow the example, hoping that it lies in the bd4 or bd3 proof. Still a chance to open my eyes. Will do it soon, but not today.
Else I will fly to Mars seeking kindred spirits.

Anyway, thanks for the comments.

by **eleven** » Tue Apr 16, 2013 10:41 pm

Think, it's just one of these naming and terminology problems, we have so often here.

dobrichev wrote:It seems that the goal of T&E is to eliminate one by one the candidates, and a successful try is this which leads to contradiction, else is error. The guessing is the opposite. Furthermore guessing has nothing in common to T&E and a successful guess isn't an errored try.

No, a succesful guess (which solves a puzzle) is an errored try (thats what i found funny). But normally an errored try is just getting stuck.

Making 2 T&E leads to conclusion that the puzzle can be solved in T&E(1). Does 0 T&E lead to T&E(-1) in this notation?

T&E(n) means, that the highest recursion level for singles chains to solve the puzzle is n (iow the hardest step needed level n), no matter how much or how long (recursice) chains you needed.

Brian Turner's solver chooses a cell with minimal number of candidates for T&E (or guessing?).

Both. If a candidate in the cell leads to a contradiction, he eliminates it, if it leads to a solution, he reports it.

Statistically it is expected less T&E to be done working in this way. One pseudorandom cell from those with least candidates is chosen, and no other cells are tried at the same level. Isn't so measured T&E depth always the shortest one, or it is almost always?

I don't know your understanding/definition of "T&E depth", may it be the total number of tries, or the maximum/average recursion level for this algorithm. But in this context we mean, what i described above (i hope i can say that, because Denis did not correct my previous interpretation).

A candidate elimination in the middle of the process adds more value than fixing the value of the same cell in the beginning.

Not for the progress of solving, but a candidate added counts one backdoor, and a candidate eliminated nothing, if it did not need a higher recursion level than the candidates eliminated before.

Else I will fly to Mars seeking kindred spirits.

Oh, if there is a free seat, Mars will be interesting anyway

Website · by **denis_berthier** » Wed Apr 17, 2013 3:33 am

dobrichev wrote:If my understanding is correct, then from the millions of bd3 puzzles I have in hands, possibly there is one which requires T&E(3).

I bet my bike against your seat to Mars that you won't find any.

AFAIK, there had never been any systematic search for large backdoors before you tried it. So, nothing really made it unlikely that you could find a bd4.

But there has been much work on hard puzzles (wrt SER), using various approaches. My BpB classification results (that take all this work into account and that required many CPU days) show not only that all the known puzzles are in T&E(2) but that they are in B7B. So, before finding a 9x9 puzzle in T&E(3), you'd have to find one in B8B, .... B36B, ... [T&E(2) = BpB with p = infinity]. I explicitly state 9x9 because, for larger puzzles, we have examples in T&E(3), T&E(4), ...
As there is some correlation between the BpB rating and SER, you'd have to find a puzzle with SER >> 12.
Of course, this is not a proof of impossibility - but ... I'm already preparing my backpack for Mars.

Website · by **denis_berthier** » Wed Apr 17, 2013 4:16 am

This thread is probably not the best place for this, but the discussion came out here...

There has been much discussion in the past about T&E and it seems to remain a difficult topic. Maybe a problem of terminology, as eleven says, but probably a deeper problem of misunderstanding what's at stake.

As for me, I always refer to my definition - AFAIK the only available formal definition. You can find it in many places (e.g. on this forum and in my last book) and I recommend you read it very carefully before going further on this topic. There are a few subtleties that one may overlook, such has having to try the same candidate several times. As eleven mentions it above, he easily adapted Brian Turner's solver to comply with it.

I claim that my definition is the proper formalisation of Ruud's original one (I don't know if you've known Ruud, but he was the manager of the main 3 forums):
"try a candidate and if it leads to a contradiction delete it" (as far as I can remember - I can't find a copy of it).

Of course, this was very vague:

- how may it lead to a contradiction ? After all, any candidate that is not the solution is in a logical contradiction with the givens, so if you don't say how, you haven't said much. This is IMO the main point at stake with T&E. However, it seems obvious that what he and many people had in mind was using only Singles and Elementary Constraints Propagation. So I call it T&E(Singles).

- Could one use more rules ? I.e. could this be generalised to any resolution theory T so as to define T&E(T) ? Yes ... provided that T has the confluence property: the result mustn't depend on the order the rules are applied, otherwise the procedure is not well defined. There's no problem with T&E(Singles) because the confluence property is obviously true. This should answer the questions about how to choose which candidates to try: it has no influence on the final result. As for the computation times, I'm not sure at all that choosing candidates in bivalue (rc, rn, cn or bn) cells is the best thing.

- what do you do if it doesn't lead to a contradiction ? Ruud didn't evoke this. But here again it seems obvious that the answer was: nothing.

- how can one iterate T&E, i.e. how can one define T&E(n) ? Ruud didn't evoke the question.

I don't think it is useful to speak of T&E in terms of success and failure. In stats also, these terms may be misleading: e.g. for a Bernouilli distribution, it is often the case that the "success" of a random variable (i.e. value 1) may be a failure from the application POV.

A definition is useful for what it allows to prove and to do:
- basically, my definition shows the importance of the confluence property;
-it allows to prove the T&E(T) vs T-braids theorem, making a clear link between T&E and pattern-based solutions;
- [added:] As shown in my last book, the above two remarks remain valid for any finite Constraint Satisfaction Problem;
- and it allows a rough classification of puzzles in 3 levels: the trivial T&E(0), T&E(1), T&E(2), with detailed sub-classifications for the last two.

Website · by **denis_berthier** » Wed Apr 17, 2013 5:20 am

The calculations of backdoor size (bd) for the controlled-bias collection (cb) are over.
The hardest parts were:
- to find the right command for gsf's program (thanks eleven)
- to find my old stats programs and cb collections.
Here are the conclusions.

Preliminaries:
The collection (produced a few years ago with the controlled-bias generator) contains 5,926,343 puzzles, all of which are in T&E(1), i.e. solvable by braids (and, in this case, even by whips).
I therefore use the W rating as a measure of complexity of a puzzle. As I've shown long ago, taking instead the B rating (or the gW, or the gB, or adding Subsets in the rules) wouldn't change the stats.

Results:
At first sight, I was surprised to find a high correlation coefficient between W and bd. However, a closer look at the results showed that this is completely meaningless:
- as might be expected, there is no bd 3 or bd 4 puzzle;
- there are only 7,726 bd 2 puzzles (i.e. 0,13%), which entails that they have almost no impact on the correlation coefficient;
- the apparently high correlation is therefore the mere result of the trivial equivalences: bd = 0 <=> W = 0 (always), bd = 1 <=> W > 0 (in this collection).

[added] I also computed the correlation coefficient between backdoor size and number of clues: 0.14 (no correlation).

Conclusions:
- hard [in the broad sense of not in T&E(1)] puzzles are very rare (estimated less than 1 in 30,000,000); "hardest" puzzles (in the vague meaning this word has in the "hardest" thread) are still much rarer;
- puzzles with large bd are very rare (and we have no estimate of rareness in this case);
- there's no sign of any correlation between the two properties (all the available partial results point to more or less contradictory conjectures).

One of the problems that appear every time we try to talk about puzzles with rare properties is the lack of a random (unbiased or with known bias) uncorrelated collection of puzzles not in T&E(1) (and I have no idea how this could be generated in reasonable computation times).

by **eleven** » Wed Apr 17, 2013 7:26 am

Thanks for the results, Denis.
At least it is nice to know, how rare bd2 puzzles are.

Website · by **denis_berthier** » Wed Apr 17, 2013 8:04 am

eleven wrote:At least it is nice to know, how rare bd2 puzzles are.

I realise I should have also given the following distribution for the cd sample:

Code: Select all: bd0: 2,079,210 35,08 % bd1: 3,839,407 64,79 % bd2: 7,726 0,13 %

by **dobrichev** » Thu Apr 18, 2013 8:44 am

Some words about the generation process.
I started by filtering backdoor-3-in-singles (bd3) from the hardest sudokus. Added some stuff from a cache with not extermely hard (~se 10.0). These were several thousands of bd3 puzzles.
Next I tried variety of vicinity search methods and after about 10 generations a set of ~50K puzzles is formed.
At this point I saw that sufficiently large amount of new puzzles is found at each next generation even with simple {-+1} transformations (remove one given in all possible ways, then add one clue in all possible positions with all possible values, then ignore invalid, non-unique, and non-bd3+ puzzles).
After few more generations, the new puzzles with small and large number of givens disappeared. But still large amount of 23-24 given puzzles appeared.
Here are the counts of the last generations showing still very stable exponential growth with factor ~2, even using simple mutations only.

Code: Select all: gen# puz_count 19 3,025,000 20 6,084,448 21 12,354,974 22 25,210,983

by **eleven** » Thu Apr 18, 2013 8:59 pm

Interesting.
Could you please make a test on a set of these puzzles, to try to add solution digits and look, if the puzzle still has the same bd size ?
I wonder, if it was an exception with the bd4's, that there was a (non mimimal) 33 clue, which already had the property, and a 42 clue with all (more than 20000) bd3's below.
Also i would like to see (other) small patterns, which need a backdoor.

by **dobrichev** » Fri Apr 19, 2013 3:04 am

eleven wrote:Could you please make a test on a set of these puzzles, to try to add solution digits and look, if the puzzle still has the same bd size ?
I wonder, if it was an exception with the bd4's, that there was a (non mimimal) 33 clue, which already had the property, and a 42 clue with all (more than 20000) bd3's below.

Here is one 42 clue with 1716 bd3.

Code: Select all: 1234567.9.5.18923..89237.51246..8...37.6...4...5.4..6.53.9.1..8..2.......6..24.7.

It is coming from another fruitful grid with ~50k minimal bd3 puzzles, maybe a candidate for a secondary bd4.

eleven wrote:Also i would like to see (other) small patterns, which need a backdoor.

There is plenty of bd1 puzzles. Maybe small bd patterns are not rare there?

Website · by **denis_berthier** » Fri Apr 19, 2013 4:35 am

Eleven, Mladen,

What do you call a small pattern? I don't remember seeing this expression before. Do you mean a "small" number of givens?

The New Sudoku Players' Forum

One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Re: One flew over the backdoors

Just a few words about T&E

backdoor size vs W rating

Re: One flew over the backdoors

Re: One flew over the backdoors

the method

Re: One flew over the backdoors

bd patterns

Re: One flew over the backdoors