Hi sdk_fan,
though it's really strange to write a program which tries to solve a sudoku in a way that is more complicated and a lot slower than public ones, without delivering an output that can be followed manually, and though i am very short of time ... i could not resist making an attempt to mimic your program in the way i sketched above.
In the time i had, i did not manage to fix all the bugs. When i ran it against the first 200 in champagne's list today, it failed for one puzzle with 0 solutions, while the others were correct. So i guess it's just a minor bug concerning a special property of this puzzle. Maybe i'll fix that, but probably not. I think the questions i had are answered sufficiently.
Btw you inspired me to another favorite method to
solve the hardest puzzles on paper without any computer help.
Instead of
guessing a number, making a copy and trying to show a solution/contradiction, until all copies are resolved, i would now select a unit and guess a full assignment. This would give me say 30 copies of the puzzle with this unit resolved (i noticed that there are between 20 and 1000 possible assignments for the 27 units in the puzzles i looked at). They would have about newspaper-puzzle difficulty, mostly singles puzzles and a few harder ones. But for 29 of them i would not have to find a solution, only a contradiction. This could be done in a day or two, and i am sure it's more interesting than mimicking Brian's method on paper. However, when i changed the bb_solver to do that, it was 60 to 120% slower, whether i selected the smallest or an average-size assignment set to start with.
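To illustrate where the 20 to 1000 assignments per unit come from, here is a simplified sketch (my own illustration, not the bb_solver code) that counts the complete assignments of one unit by backtracking over its 9 cells:

```cpp
#include <cstdint>

// Count all complete assignments of one unit (row/column/box): each of
// the 9 cells gets a distinct digit 1..9 that is among its candidates.
// cand[i] is a bitmask with bit d-1 set iff digit d is allowed in cell i.
// (Hypothetical helper for illustration only.)
int countAssignments(const uint16_t cand[9], int cell = 0, uint16_t used = 0) {
    if (cell == 9) return 1;                 // one full assignment found
    int n = 0;
    for (int d = 0; d < 9; ++d) {
        uint16_t bit = uint16_t(1) << d;
        if ((cand[cell] & bit) && !(used & bit))
            n += countAssignments(cand, cell + 1, used | bit);
    }
    return n;
}
```

A fully open unit gives 9! = 362880 assignments; the givens and the eliminations from the rest of the grid cut that down to the 20 to 1000 range i saw.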
Now the questions i was interested in were mainly how big the set of possible solutions per level would become, and how strongly it is influenced by the selected order of units to add (in your case groups, but for the hardest puzzles each group consists of all unsolved cells of a unit).
I don't know an upper limit, because on my 1 GB RAM notebook i had to stop at 4.5 million solutions. But finding a good order is very important. Since i guess that orders other than alternately completing a full band and a full stack until all are resolved can only be worse, i tried all 36 ways of band/stack combinations (with a fixed order of units within the band/stack). Though all these 199 puzzles could be solved within this 4.5 million limit, there were big differences in the maximum size, e.g. i saw about 250000 as the minimum, while some other orders could not make it within the limit.
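My reading of the 36 orders (an assumption, since i didn't spell it out above): 3! permutations of the three bands times 3! permutations of the three stacks, interleaved alternately. A small sketch of enumerating them:

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Enumerate the 3! * 3! = 36 band/stack orders: bands are numbered 0..2,
// stacks 3..5, and the order alternates band, stack, band, stack, ...
// (Illustrative sketch, not the solver's actual representation.)
std::vector<std::array<int, 6>> bandStackOrders() {
    std::vector<std::array<int, 6>> orders;
    std::array<int, 3> bands{0, 1, 2};
    do {
        std::array<int, 3> stacks{3, 4, 5};
        do {
            orders.push_back({bands[0], stacks[0], bands[1],
                              stacks[1], bands[2], stacks[2]});
        } while (std::next_permutation(stacks.begin(), stacks.end()));
    } while (std::next_permutation(bands.begin(), bands.end()));
    return orders;
}
```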
The bad thing is that i don't have an idea how to find a good order. Always taking the band/stack with the fewest possible assignments in its units is often a bad strategy. I suspect that no better heuristic can be found which doesn't take long to evaluate. So defining the order is a kind of
guessing.
My only alternative to avoid heavy memory problems (at least on the notebook) was to use a
backtracking method: try the first order, and if it exceeds the 4.5 million, try the next. (Of course here i can stop when i am through with all units. All solutions are known then - at least once the bug is fixed.) This took about 40 minutes for the 200 puzzles @2.67 GHz.
You asked for the
data structures:
The solutions are stored in an STL list. When i add a unit, i first make a list of all its possible assignments, and from that an array of vectors which contain the possible assignments for each specific pattern of the intersecting (engaging) cells.
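Roughly, the bucketing step looks like this (a simplified sketch with names of my own choosing, not the real code):

```cpp
#include <array>
#include <map>
#include <vector>

// The possible assignments of the unit being added are bucketed by the
// digits they place on the cells that intersect the already-combined
// units. Extending a stored partial solution is then a single lookup
// of its intersection pattern. (Sketch only; a map stands in for the
// "array of vectors" described above.)
using Assignment = std::array<int, 9>;  // digits for the unit's 9 cells
using Pattern = std::vector<int>;       // digits on the intersecting cells

std::map<Pattern, std::vector<Assignment>> bucketByIntersection(
        const std::vector<Assignment>& assignments,
        const std::vector<int>& intersecting)  // indices of intersecting cells
{
    std::map<Pattern, std::vector<Assignment>> buckets;
    for (const Assignment& a : assignments) {
        Pattern p;
        for (int i : intersecting) p.push_back(a[i]);
        buckets[p].push_back(a);  // all assignments matching this pattern
    }
    return buckets;
}
```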
Of course you could
avoid any memory problems by doing this depth-first. Take the first assignment for the first unit, then the first of the combinations with the next assignment, etc., until you have nothing left or a solution. And then ...
backtrack. All you would need to keep in memory is how to get to the next solution per level. This can easily be done when all these vector arrays are kept in memory for all levels (units to combine) - you can calculate them as soon as you have defined the unit order. Then you just need a pointer to where you are.
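The per-level pointer idea can be sketched like this (my illustration only; the real traversal would descend only into the bucket matching the current intersection pattern instead of the full product):

```cpp
#include <cstddef>
#include <vector>

// Depth-first sketch: each level holds its precomputed vector of
// assignments (ints stand in for assignments here); the entire traversal
// state is one index per level, advanced like an odometer. A wrap-around
// at a level is exactly the "backtrack" step.
struct DepthFirst {
    std::vector<std::vector<int>> levels;  // assignments per level
    std::vector<std::size_t> pos;          // "pointer to where you are"
    bool started = false;

    explicit DepthFirst(std::vector<std::vector<int>> l)
        : levels(std::move(l)), pos(levels.size(), 0) {}

    // Advance to the next combination; returns false when exhausted.
    bool next() {
        if (!started) { started = true; return true; }
        for (std::size_t i = levels.size(); i-- > 0; ) {
            if (++pos[i] < levels[i].size()) return true;
            pos[i] = 0;                    // level exhausted: backtrack
        }
        return false;
    }
};
```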
Then i can't see any performance advantage for the breadth-first method.