@denis_berthier: thank you!
@eleven Big question! Hope you don't mind a somewhat long answer.
I am a researcher in cognition and my interest in artificial neural nets comes from that standpoint -- I want to understand how they mimic or differ from human cognition. I'm using Sudoku as a testbed to analyse that. Accordingly, I set up a net that solves Sudoku like a human would, one move at a time. You feed in a board state and it returns a board state with one more entry filled in.* You then feed that state back into the net and repeat N times, where N is the number of empty squares in the initial puzzle.
*This is a bit of a simplification -- it returns a probability distribution on possible moves for each empty square. I take the highest-probability move and throw away the rest.
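To make the loop concrete, here is a rough Python sketch of the "one move at a time" procedure described above. It is not my actual code: `toy_predict` is a hypothetical stand-in for the net (it just spreads probability uniformly over the legal digits of each empty square), and picking the single most probable move across the whole board is one reading of "take the highest-probability move". All names here are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Board = List[List[int]]        # 9x9 grid, 0 marks an empty square
Move = Tuple[int, int, int]    # (row, col, digit)
MoveProbs = Dict[Move, float]  # probability assigned to each candidate move


def toy_predict(board: Board) -> MoveProbs:
    """Hypothetical stand-in for the net: uniform over legal digits per empty square."""
    probs: MoveProbs = {}
    for r in range(9):
        for c in range(9):
            if board[r][c] != 0:
                continue
            # Digits already used in this square's row, column and 3x3 box.
            seen = set(board[r]) | {board[i][c] for i in range(9)}
            br, bc = 3 * (r // 3), 3 * (c // 3)
            seen |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
            legal = [d for d in range(1, 10) if d not in seen]
            for d in legal:
                probs[(r, c, d)] = 1.0 / len(legal)
    return probs


def solve_stepwise(
    board: Board, predict: Callable[[Board], MoveProbs]
) -> Tuple[Board, List[Move]]:
    """Fill one square per step, feeding each new board state back into `predict`."""
    board = [row[:] for row in board]  # work on a copy
    n_empty = sum(cell == 0 for row in board for cell in row)
    moves: List[Move] = []
    for _ in range(n_empty):
        probs = predict(board)
        if not probs:  # no candidate moves left (e.g. an earlier move was wrong)
            break
        # Keep only the highest-probability move and discard the rest.
        r, c, d = max(probs, key=probs.get)
        board[r][c] = d
        moves.append((r, c, d))
    return board, moves


# Tiny usage example: a valid solved grid with three squares blanked out.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [row[:] for row in solved]
for r, c in [(0, 0), (4, 4), (8, 2)]:
    puzzle[r][c] = 0
final, moves = solve_stepwise(puzzle, toy_predict)
```

The recorded `moves` list is the step-by-step trace; that is the kind of thing that can be compared against a human's solving order.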
Now, the net has to learn everything from first principles -- during training it gets given millions of Sudoku problems, and feedback on whether it makes correct moves or not. So it's not at any point being told what an X-wing is, or even given puzzles that specifically have X-wings in; it just gets faced with Sudoku and has to make progress somehow.
In order to create a 'fair' comparison point for the net, I tackled Sudoku myself the same way -- I solved the 320 problems in the book 'The Original Sudoku' from scratch, without looking up anything about standard solving techniques. [So e.g. I have no idea what an X-wing actually is -- please don't 'spoil' me yet as it will mess up my methodology.] Then I compared the net's step-by-step solutions to my own and asked questions like 'did we hit the same "bottlenecks"?', i.e. did we find the same problem states 'hard'? The answer was yes -- you can see that in this image:
In terms of my goal, I wanted to point out to researchers that it doesn't mean much to say that e.g. a net gets 95% accuracy on the 17-clues problem set. It's just a trivial bar. It would be nice to have an alternative dataset to point people at, and after talking to you all I think that the Patterns Game dataset is a good one for covering the spread of difficulties. I should say though that I'm not currently trying to get the net to solve the hardest problems possible -- I want it to solve hard enough problems that there is some substance to analyse, but beyond that I'm aiming to have it be as small as possible, so that I can try to understand how it's solving problems. Right now it's better at Sudoku than me, which is good enough!
Edit: graagh, the image is huge again. What am I doing wrong?