tarek wrote:The need to guess & how the solver guesses is what makes a difference in these bigger puzzles. The need to guess is reduced with intersections & locked sets. there will a trade off between Reduced guessing & Reduced performance with an unpredictable influence on speed.
tarek wrote:The how is down to how clever your BF algorithm is designed to attack the puzzle space.
Well, the BF algorithm is in fact not very sophisticated, it attacks the puzzle space with all the subtlety of a sledgehammer, but there is a very good reason for this.
It's worth reviewing the architecture of this solver, which I call
dSolver16, as I think this will explain much, and it won't take long because it's really quite simple!
- dSolver16 is essentially Mladen dobrichev's fsss2 solver, as used in GridChecker.
- ST's are elementary - Singles only - so it really is very much like a DLX solver, and in fact typically makes the same number of guesses for a given puzzle. But 20 times faster!
- It gets a massive speed advantage over DLX by using bit-mask operations to process these singles. This aspect is highly optimisable, particularly with modern CPU's that have very fast instructions for doing operations on 128-bit and 256-bit variables. For standard Sudoku the entire puzzle state can be packed into 9 x 128-bit variables (144 bytes!). For 16x16 we need 32 x 256-bit variables, but that is still only 1024 bytes. When represented this way, the most heavily performed "tasks", such as eliminating candidates when a digit is placed in a cell, are achieved with a handful of logical bit-mask operations.
- when no singles are available, it goes into guess mode, and the NCS (next candidate selection) is essentially the same as for DLX, namely pick a cell with the least number of possible values, then pick the first of these values to make the guess.
I have done extensive experiments with alternate strategies, and have come to the conclusion that it is pointless trying to find a better one. For simple strategies (where no analysis of the puzzle state is done), it really is a case of "your guess is as good as mine", the solution space below is unknown, and we can't know which guesses will turn out to be good or bad until we make them.
Any attempt to introduce analysis (other ST's) is pointless here because it will always take far longer to perform than the time it will save. That is certainly true for 9x9 Sudoku, and probably true also for 16x16. Another possibility is "look-ahead" - if there are several cells with the least possible values, and generally this is always the case, we can sneak-preview each one to see which choice eliminates the most candidates. This reduces the guess count, but on avarage doubles the actual time!
So
tarek's point about reduced guessing and reduced performance is bang on the money here.
On the other hand, this solver model is very flexible - it can solve in any variant mode that is based on additional/modified houses, we simply change the bit tables that tell the solver which cells are affected when we make set a value in any cell.
For larger puzzles, 25x25, etc, the BF solver becomes pretty useless, as the solution space explodes super-exponentially, and the number of "registers" we need to use in our data model becomes unwieldy. I tend to agree that SAT and similar approaches will become more attractive/effective for these.
Fortunately I have no interest whatsoever in anything beyond 16x16 - if it doesn't fit on a piece of A4 paper it's quite useless to me. I only do P&P puzzles, it's one way to get me off this computer!