Creating a balanced test set for Sudoku software

Programs which generate, solve, and analyze Sudoku puzzles

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 8:23 pm

Very interesting.
Hope, someone can help you with graph/image formatting. Not too bad, all can be seen.
Interesting for me, that it seems to see the 8 before the single 9, which then leads to the 8 for a manual solver.

What i am curious about: Is it consistent for equivalent puzzles, that is those, which you can get by switching bands or stacks, or rows in a band, columns in a stack, or mirroring at the diagonal ? I think, it should be, or the training would be biased.
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Mon Apr 18, 2022 8:30 pm

Not too bad, all can be seen.

Feels antisocial to post huge images like that, though!

Is it consistent for equivalent puzzles, that is those, which you can get by switching bands or stacks, or rows in a band, columns in a stack, or mirroring at at the diagonal ?


All versions of the net have built into their structure an 'understanding' of the symmetries that arise from "switching bands or stacks, or rows in a band, columns in a stack". (I'm assuming band = 3 rows, stack = 3 columns!) They don't understand mirroring at the diagonal, as that's technically tricky, so it's genuinely the case that mirroring the problem can make the net more/less able to solve it.

There's one more symmetry you don't mention, which is the symmetry of 1,2,3,4,5,6,7,8,9. The early versions of the net, including the one I illustrated, did not understand that symmetry.* There were therefore stages where they just became over-keen on a number; I have a pic somewhere of the net going mad and filling in as many 6s as possible (often in incorrect locations). The versions of the net I've constructed in the last few months do understand that 1-9 symmetry.

*This was deliberate; I wanted to understand whether how well net would 'learn' about the symmetry on its own. Those versions were trained on ~100 million problems to minimise possible bias.
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 8:49 pm

Ah, nice.
Another point is about candidate grids. For harder puzzles manual solvers would use them. They allow to eliminate single digits in a cell, which does not give a solution number, but increases the chance for finding one.
[Added:]The question is: does it impliciteley (in some way) use a similar strategy in finding a number, in a state, where it is hard to proceed ?
Last edited by eleven on Mon Apr 18, 2022 9:02 pm, edited 1 time in total.
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Mon Apr 18, 2022 8:59 pm

@eleven Do you mean pencilmarks/Sukaku? I deliberately avoided using those as a human, but it turned out to be difficult to stop the net from storing that kind of information. [There is a way to do it, but it requires a lot more computational power than I have access to.]
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 9:04 pm

Ah thanks, i just added my question above, which you already answered.
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Mon Apr 18, 2022 9:10 pm

BTW, do you (or anyone) know whether there is a cutoff SE rating above which you need to use candidate grids to solve problems?
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 9:27 pm

It starts about ER 7, where you normally need chains to solve it. Of course, at that rating the chain often will give a number directly, but the rating does not distinguish, if it is a helping move or one, which already solves a cell (for manual solvers this is a big difference).
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 9:47 pm

I don't know, if you are aware, that the PG puzzles all have to be symmetric, which is a hard limitation. There are open source programs, which can generate also hard puzzles (without that restrictions) very quickly. if you are interested, i think, that PG people can help you to do that yourself.
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Mon Apr 18, 2022 9:51 pm

@eleven Thanks for the info. I think I'm fine with symmetry. The only thing I didn't understand when I started this thread was that there was a reason that (e.g.) 5.1 was a rare rating. I thought that the PG dataset must have been biased in some odd way.
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby eleven » Mon Apr 18, 2022 10:00 pm

Well i am not fine with a solver, which is specialized for symmetric puzzles only ;)
I think the ER 5.1 question is answered: some ratings (solution techniques) are common in puzzles, some rare (but often the rare ones are easier for manual solvers).
eleven
 
Posts: 3186
Joined: 10 February 2008

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Mon Apr 18, 2022 10:06 pm

@eleven My solver was trained entirely on asymmetric problems, so it can't have learned to exploit symmetry -- but I agree that is a problem in general. If there was an existing asymmetric dataset that was as comprehensive as PG, I'd use it! But PG has a certain status which a dataset I created would not have, plus time is short and I want to focus on the net!
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby denis_berthier » Tue Apr 19, 2022 2:29 am

Clouded wrote: I set up a net that solves Sudoku like a human would, one move at a time. You feed in a board state and it returns a board state with one more entry filled in.* You then feed that state back into the net and repeat N times, where N is the no. empty squares in the initial puzzle.

This is not what human solvers do, except maybe for the easiest puzzles. It sounds much like guessing. You say you solve puzzles this way. How do you choose the next step?

In human solving, or more generally in logic solving, a step is rarely the assertion of a value; it is the elimination of a candidate. If you consider all the known resolution rules, they are almost always of the elimination type. There are puzzles that require 20 or more eliminations before any value can be asserted.
[Edit 1:] I see this topic has already been talked about, but I'm not satisfied with the answers.[/Edit]


Clouded wrote: during training it gets given millions of Sudoku problems, and feedback on whether it makes correct moves or not. So it's not at any point being told what an X-wing is, or even given puzzles that specifically have X-wings in; it just gets faced with Sudoku and has to make progress somehow.

If the feedback consists of giving the correct move as (one of) the assertion(s) that eliminates the largest number of candidates(*), what you teach your net is a steepest descent method. And if this is also your way of manually solving the puzzles, no wonder that the net finds the same hard points as you.
Notice that, if you took elimination of a candidate instead of assertion of a value as a basic step, this remark would still apply.
I think we need to know more about your solving method.

(*) I can't see any other local criterion of choice. But even this is not enough to eliminate inconsistent choices; so you must somehow guess what's right or wrong.

[Edit 2:] I hadn't checked before, but your exemple solves with Singles. In this case, I would say no step is harder than any other and the resolution path depends on the solver's way of scanning the grid.[/Edit]
denis_berthier
2010 Supporter
 
Posts: 4275
Joined: 19 June 2007
Location: Paris

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Tue Apr 19, 2022 4:04 am

@denis_berthier I know that beyond a certain difficulty it's necessary to use candidate grids to solve problems. I decided when I started that I (as a human) was not going to do that, or give the net access to the same information. It's just a constraint I put on what I was allowed to do. (I did let myself write down some limited information, but not the full candidate grid.) One reasons for this was that what was interesting from a cognitive perspective was what techniques a human would come up with to work around the 'limited memory'. That's often a key factor shaping cognitive problems of all kinds. I'm sure the problems I solved are very easy indeed by your standards (the hardest puzzle in the 'Original Sudoku' book is a SE 3.4), but some of them were hard for me when I was coming to it without knowing any Sudoku techniques or, indeed, ever having solved a Sudoku problem before!

Incidentally, this is why I asked @eleven whether there was a SE cutoff above which you needed candidate grids to progress.

How do you choose the next step?


Whatever I spot? I don't follow a predetermined algorithm, if that's what you mean. As I worked I built up a collection of heuristics for where to look next. E.g. early on, look for pairs of rows (in the same group of 3 rows) with the same number in. Let's say you spot a pair of 7s; that tells you that another 7 must be in one of three squares. Or another heuristic is, if you spot a 3x3 block with this...

xy.
z..
...

Then it's worth looking in the row and col that have '...' in; if any number that is not one of xyz occurs in both row and column, you know where it must go.

But even this is not enough to eliminate inconsistent choices; so you must somehow guess what's right or wrong.


I don't guess as a human; I only fill in a number when I can give you a proof of why it must be there. The net does guess, in that it doesn't generate a proof. There is a specific technical sense in which I penalise the net for guessing, though. One of the things I was interested in was the degree to which the reasoning and the net's approach converged.

BTW, it would indeed have been more interesting to allow the net to store data between stages -- but I would not want to feed in a candidate grid; instead I'd want it to be able to mark arbitrary 'annotations' on the grid, and have those fed from stage to stage. Seeing what information the net chose to pass would have been fascinating. Such an approach would have needed reinforcement learning, though, and I didn't have the computational power for that. So I went with the more tractable problem.

---

I've thought of an analogy which might help. The way a human does algebra is completely unlike the way a computer algebra system does algebra -- the human just spots things and tries whatever seems best next, whereas the CAS system uses a predetermined procedure. When you talk about Sudoku, you sound like you're using an approach more like the CAS system. But that's not what I'm interested in studying -- I'm interested in the more naive, untrained, fumbling human approach, because it tells us more about cognition.
Clouded
 
Posts: 18
Joined: 10 December 2020

Re: Creating a balanced test set for Sudoku software

Postby denis_berthier » Tue Apr 19, 2022 5:25 am

Clouded wrote:@denis_berthier I know that beyond a certain difficulty it's necessary to use candidate grids to solve problems. I decided when I started that I (as a human) was not going to do that, or give the net access to the same information. It's just a constraint I put on what I was allowed to do. (I did let myself write down some limited information, but not the full candidate grid.) One reasons for this was that what was interesting from a cognitive perspective was what techniques a human would come up with to work around the 'limited memory'. That's often a key factor shaping cognitive problems of all kinds.

Writing the pencilmarks is the main method for limiting the memory load. But let's take your constraints as such.

Clouded wrote:Incidentally, this is why I asked @eleven whether there was a SE cutoff above which you needed candidate grids to progress.

I have no reason not to take eleven's word about his personal limit being about 7. But:
- I think eleven has an additional limit: not too many candidates after Singles (and Whips[1]). Eleven, can you confirm this?
- there are very few people in the world that can solve such Sudokus without writing the PM (and I'm not one of them; actually I can't remember ever trying).

Clouded wrote:
denis_berthier wrote:How do you choose the next step?

Whatever I spot? I don't follow a predetermined algorithm, if that's what you mean. As I worked I built up a collection of heuristics for where to look next. E.g. early on, look for pairs of rows (in the same group of 3 rows) with the same number in. Let's say you spot a pair of 7s; that tells you that another 7 must be in one of three squares. Or another heuristic is, if you spot a 3x3 block with this...
xy.
z..
...
Then it's worth looking in the row and col that have '...' in; if any number that is not one of xyz occurs in both row and column, you know where it must go.

Looks like elementary braids analysis, but I'm no expert on this.


Clouded wrote:I've thought of an analogy which might help. The way a human does algebra is completely unlike the way a computer algebra system does algebra -- the human just spots things and tries whatever seems best next, whereas the CAS system uses a predetermined procedure. When you talk about Sudoku, you sound like you're using an approach more like the CAS system. But that's not what I'm interested in studying -- I'm interested in the more naive, untrained, fumbling human approach, because it tells us more about cognition.

I've always been extremely clear about the difference between the global strategies used by CSP-Rules or by a human solver: simplest-first (unless modified by different preferences) vs first-found-first used - even though both rely on the same resolution rules. But this is far from your topic.

I remain curious to know what you'll find about human cognition.
denis_berthier
2010 Supporter
 
Posts: 4275
Joined: 19 June 2007
Location: Paris

Re: Creating a balanced test set for Sudoku software

Postby Clouded » Tue Apr 19, 2022 7:34 am

Writing the pencilmarks is the main method for limiting the memory load.


I'm told there is something called 'Snyder notation' which is simpler than full pencilmarks? I have not yet looked up what it is, because (as mentioned) I wanted to see what I came up with without looking at the literature.

Edit: There's also solving-without-pencilmarks-t3640.html although I haven't read it (for the usual reason of not wanting to see 'spoilers').
Clouded
 
Posts: 18
Joined: 10 December 2020

PreviousNext

Return to Software