The New Sudoku Players' Forum

by **eleven** » Fri Jun 14, 2013 10:08 pm

denis_berthier wrote:
Red Ed wrote:What are "p. 155" and "p. 160"? A reference to a book?

My last book

Red Ed wrote:That's one impressive tome!

Unfortunately i could not afford it - there were so much other books, i had to buy.

Website · by **denis_berthier** » Sat Jun 15, 2013 4:32 am

eleven wrote:Unfortunately i could not afford it - there were so much other books, i had to buy.

There's a free pdf version here: http://arxiv.org/abs/1304.1628

by **Red Ed** » Sat Jun 15, 2013 1:04 pm

I have some results to share from comparing subtree enumeration to what you might call linear probing (like suexg-cb).

I used a set of 1000 randomly-generated puzzles, looping over this repeatedly to build up the statistics. The results will be roughly representative of running the estimation algorithms for real, which is good enough for the purposes of this post. I used Brian Turner's bb_solve as the core solving engine for both subtree enumeration and linear probing, to ensure parity.

The linear probing algorithm was implemented as: pick a random 30-clue subgrid; confirm it has a unique solution; delete clues at random until you get multiple solutions; replace the last clue and then check minimality (removal of any other clue also gives multiple solutions); increment the relevant counters as you go. The results after ten minutes:

Code: Select all: Number of solution grids: 21457975 Number of 30-clue subgrids that were proper: 215645 Total number of minimal puzzles in those 215645 30-clue subgrids: +----+----------+--------------+------------+ | Cl | Count | E(nr/grid) | E(rel err) | +----+----------+--------------+------------+ | 23 | 2 | 8.8918e+012 | 70.71% | | 24 | 8 | 8.5954e+013 | 35.36% | | 25 | 29 | 7.1041e+014 | 18.57% | | 26 | 21 | 1.1080e+015 | 21.82% | | 27 | 14 | 1.5047e+015 | 26.73% | | 28 | 7 | 1.4510e+015 | 37.80% | +----+----------+--------------+------------+

Next, I ran subtree enumeration using the depth-first searcher posted on the forum on 23 Dec 2009. After ten minutes, the following results were reported (and I've added the linear probing figures for comparison):

Code: Select all: Number of solution grids: 694676 Number of 30-clue subgrids that were proper: 7154 Total number of minimal puzzles in those 7154 30-clue subgrids: +----+----------+--------------+------------+ | Cl | Count | E(nr/grid) | E(rel err) | +----+----------+--------------+------------+ Linear probing | 20 | 20 | 4.4984e+006 | 36.06% | E(rel err) for | 21 | 1124 | 1.5421e+009 | 11.55% | comparison ... | 22 | 19257 | 1.7614e+011 | 5.42% | | 23 | 106227 | 7.1658e+012 | 2.80% | 70.71% | 24 | 190937 | 1.0672e+014 | 1.94% | 35.36% | 25 | 119587 | 6.3499e+014 | 1.73% | 18.57% | 26 | 25332 | 1.5065e+015 | 2.15% | 21.82% | 27 | 1797 | 1.4694e+015 | 4.70% | 26.73% | 28 | 41 | 6.0348e+014 | 19.66% | 37.80% | 29 | 2 | 7.8010e+014 | 70.71% | +----+----------+--------------+------------+

To see how much faster subtree enumeration is than linear probing, divide the squares of the relative errors. For c=28 clues, for example, subtree probing's around 3 or 4 times faster. At c=25 clues, it looks like subtree enumeration's about 100 times faster. At c=22 clues, extrapolating back from Denis' full results, subtree numeration is about 3500 times faster than linear probing.

This isn't even the best we can achieve with subtree enumeration. Starting at 30 clues is best only for the upper end of the distribution. For c=25 to 27, it's better to start at 29 clues. For c=20 to 24, it's better to start at 28 clues. It wouldn't be hard to adapt the depth-first searcher to run a combined strategy (you just prune the search space appropriately), giving the best of all worlds. I hope to find time to do that soon.

Website · by **denis_berthier** » Sat Jun 15, 2013 3:15 pm

Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).

How miraculous that, after only 10 minutes of computation with the same program, everything is changed!

by **Red Ed** » Sat Jun 15, 2013 4:06 pm

denis_berthier wrote:
Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).
How miraculous that, after only 10 minutes of computation with the same program, everything is changed!

Really? In what sense does anything I've just said relate to the supersets method? :roll:

In other news ... a quick update on this:

Red Ed wrote:It wouldn't be hard to adapt the depth-first searcher to run a combined strategy (you just prune the search space appropriately), giving the best of all worlds.

Sadly, this doesn't help, due to the overhead of reaching the prune-points in the first place. Oh well.

Website · by **denis_berthier** » Sat Jun 15, 2013 4:52 pm

Red Ed wrote:
denis_berthier wrote:
Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).
How miraculous that, after only 10 minutes of computation with the same program, everything is changed!
Really? In what sense does anything I've just said relate to the supersets method?

If it doesn't, what does it relate to in this thread? What do these tables try to prove?

by **Red Ed** » Sat Jun 15, 2013 5:07 pm

denis_berthier wrote:If it doesn't, what does it relate to in this thread?

Let that be an exercise for the reader. Please, Denis, just read the opening post.

Website · by **denis_berthier** » Sat Jun 15, 2013 5:31 pm

Red Ed wrote:
denis_berthier wrote:If it doesn't, what does it relate to in this thread?
Let that be an exercise for the reader

OK, a post with no purpose.

Red Ed wrote:Please, Denis, just read the opening post.

I already said what I think: it requires a serious re-writing after Blue's major contribution. But this is your thread. If you don't want to update your post, it's your decision, I won't harass you about it.

Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).

Do you have anything new that could change this old conclusion ?

by **Red Ed** » Sat Jun 15, 2013 5:42 pm

I hope Blue won't mind me opining that the major contribution might be more the method than the commentary.

Onward ...

denis_berthier wrote:(quoting observations on the supersets method ...) Do you have anything new that could change this old conclusion ?

Nope, I've not revisited supersets since 2009, so I suppose the conclusions about >30 clue minimals are unchanged; but I don't understand why you think that part of the problem space is relevant to the results I just reported.

Please read the opening post. :lol:

by **blue** » Sun Jun 16, 2013 1:18 am

Red Ed wrote:I hope Blue won't mind me opining that the major contribution might be more the method than the commentary.

Not at all, and I do agree. I've been away, or I would have remarked that my posts were only meant to shed some light on some of the things that Denis was objecting to, so that he would read on and embrace the big picture.

Best Regards,
Blue.

Website · by **denis_berthier** » Sun Jun 16, 2013 4:46 am

Red Ed wrote:
denis_berthier wrote:
Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).

Do you have anything new that could change this old conclusion ?

Nope, I've not revisited supersets since 2009

OK, let me know if some day you have anything new.

by **Red Ed** » Sun Jun 16, 2013 7:27 am

denis_berthier wrote:OK, let me know if some day you have anything new.

The new bit, my dear inattentive correspondent, is the quantification of just how much better one can do than the path-based method in the main part of the distribution (below 30 clues). If this isn't new to you then what prompted your "everything is changed!" comment? Look, don't worry about it, I know you're not interested, so I'll just summarise the results in the opening post and we can move on.

by **Red Ed** » Thu Jun 20, 2013 7:31 pm

in his Wed Jun 19, 2013 3:55 pm edit, denis_berthier wrote:
Red Ed wrote:These methods were described originally back in 2009, in the context of counting minimal puzzles

This is an obvious falsification of history. If the "subset method" had been described four years ago, how could anyone believe you didn't run it 10 minutes

Hello again. May I refer you to my post of Sat Jul 18, 2009 2:15 pm on RDMP page 11?

Website · by **denis_berthier** » Fri Jun 21, 2013 3:27 am

Red Ed wrote:
in his Wed Jun 19, 2013 3:55 pm edit, denis_berthier wrote:
Red Ed wrote:These methods were described originally back in 2009, in the context of counting minimal puzzles

This is an obvious falsification of history. If the "subset method" had been described four years ago, how could anyone believe you didn't run it 10 minutes, instead of speaking only of the "superset method" and admitting that
Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).

Hello again. May I refer you to my post of Sat Jul 18, 2009 2:15 pm on RDMP page 11?

OK, you had some definition, but it doesn't explain why you didn't run it and didn't even mention it in your overall conclusion recalled above. Part of your current algorithm must be recent.
Moreover, you didn't have your "generic method".
So there is indeed a falsification of history in your post and the absurd comparisons you present there.

BTW, your "generic method" remains devoid of any content without a probability distribution being first defined on P(M). This is so basic probability theory that I can't understand you persist refusing to see it.

by **Red Ed** » Fri Jun 21, 2013 7:04 am

denis_berthier wrote:OK, you had some definition, but it doesn't explain why you didn't run it and didn't even mention it in your overall conclusion recalled above. Part of your current algorithm must be recent.

Denis, I am running code from 2009. It starts:

Code: Select all: /* minsubs.c : simple DFS subsets method for the minimals-counting problem * * This version placed on www.sudoku.com on 23 December 2009

(perhaps someone else has a copy?) plus a few new lines of suexg-cb simulation to allow comparison with that approach. The results from running the 2009 code would have been reported in the other minimals-counting thread, which appears to have been lost to the server crash. Snippets from the 2009 results are present in the RDMP thread. There's no "falsification of history".

BTW, your "generic method" remains devoid of any content without a probability distribution being first defined on P(M). This is so basic probability theory that I can't understand you persist refusing to see it.

The "generic method" is just a convenient framework within which to explain the practical applications (subsets method, etc) and, in and of itself, needs no further definition of the distribution underpinning A.

The New Sudoku Players' Forum

Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.