The New Sudoku Players' Forum

Website · by **denis_berthier** » Fri Jun 21, 2013 7:27 am

Red Ed wrote:
denis_berthier wrote:OK, you had some definition, but it doesn't explain why you didn't run it and didn't even mention it in your overall conclusion recalled above. Part of your current algorithm must be recent.
Denis, I am running code from 2009. It starts:
Code: Select all
/* minsubs.c : simple DFS subsets method for the minimals-counting problem * * This version placed on www.sudoku.com on 23 December 2009
(perhaps someone else has a copy?) plus a few new lines of suexg-cb simulation to allow comparison with that approach. The results from running the 2009 code would have been reported in the other minimals-counting thread, which appears to have been lost to the server crash. Snippets from the 2009 results are present in the RDMP thread. There's no "falsification of history".

Red Ed wrote:
denis_berthier wrote:OK, let me know if some day you have anything new.

The new bit, my dear inattentive correspondent, is the quantification of just how much better one can do than the path-based method in the main part of the distribution (below 30 clues).

You can't even be consistent with yourself. The "new bit" that allows your absurd comparisons of timings rely on major pieces of code you didn't have in 2009 and that drastically change the efficiency of your algorithm.
The minimum you could do is state it clearly in your post. This wouldn't change the absurdity of the comparison, but at least it would be honest.

Generally speaking, I don't think it's a good practice to modify posts so many times after they have been commented. It can only lead to modifications of comments, modifications of modifications...

Red Ed wrote:
BTW, your "generic method" remains devoid of any content without a probability distribution being first defined on P(M). This is so basic probability theory that I can't understand you persist refusing to see it.
The "generic method" is just a convenient framework within which to explain the practical applications (subsets method, etc) and, in and of itself, needs no further definition of the distribution underpinning A.

As writing Pr(m) ≡ ΣA∋m Pr(A=A) before defining Pr(A=A) is pure nonsense, it can't be a "framework" for explaining anything.

by **Red Ed** » Fri Jun 21, 2013 8:18 am

denis_berthier wrote:The "new bit" ... rely on major pieces of code you didn't have in 2009 and that drastically change the efficiency of your algorithm.

No, the subsets algorithm code and core solver are unchanged from 2009. It has always been drastically more effective than path-probing for low c.

you wrote:As writing Pr(m) ≡ ΣA∋m Pr(A=A) before defining Pr(A=A) is pure nonsense, it can't be a "framework" for explaining anything.

Given a random variable, A, ranging over subsets of M, and some m in M, Pr(m) is defined to be the probability that A contains m. This is just a definition. I don't know how I can change the language to make it easier to understand

I'll return to this this evening, if necessary.

Website · by **denis_berthier** » Fri Jun 21, 2013 8:55 am

Red Ed wrote:
denis_berthier wrote:The "new bit" ... rely on major pieces of code you didn't have in 2009 and that drastically change the efficiency of your algorithm.
No, the subsets algorithm code and core solver are unchanged from 2009. It has always been drastically more effective than path-probing for low c.

The question isn't about "low c" but about the main part of the distribution, i.e. for c between 21 and 29. If, in 2009, you had had a fast algorithm for the subset method, you would have mentioned it in your overall conclusion, recalled a few posts above.

Red Ed wrote:
you wrote:As writing Pr(m) ≡ ΣA∋m Pr(A=A) before defining Pr(A=A) is pure nonsense, it can't be a "framework" for explaining anything.
Given a random variable, A, ranging over subsets of M, and some m in M, Pr(m) is defined to be the probability that A contains m. This is just a definition. I don't know how I can change the language to make it easier to understand

Using Pr(A=A) in a "definition" of Pr(m), before Pr(A=A) has been defined, is pure nonsense.

by **blue** » Fri Jun 21, 2013 5:27 pm

Hi Red Ed,

Red Ed wrote:
denis_berthier wrote:OK, you had some definition, but it doesn't explain why you didn't run it and didn't even mention it in your overall conclusion recalled above. Part of your current algorithm must be recent.
Denis, I am running code from 2009. It starts:
Code: Select all
/* minsubs.c : simple DFS subsets method for the minimals-counting problem * * This version placed on www.sudoku.com on 23 December 2009
(perhaps someone else has a copy?) plus a few new lines of suexg-cb simulation to allow comparison with that approach.

I don't have your code, but I have my code from 2009, implementing both methods. It produces similar results.
I modified the code that writes the output table, to make for format match yours. The rest is from 2009.
I also modified the "controlled bias" code to start with a 30-clue subset, rather than doing what the original code did
-- on the assumption that for such a short run, no 30 clue minimals would be produced. (For Denis: that change leads to
a higher sample rate, and to better results -vs- the original code, that started with a 35 clue subset).

This is 10 min. granted to each method, again.

Code: Select all: CB_MODE 47309157 samples 481966 size 30 hits +----+----------+--------------+------------+ | Cl | Count | E(nr/grid) | E(rel err) | +----+----------+--------------+------------+ | 22 | 1 | 7.8610e+011 | 100.00% | | 23 | 4 | 8.0661e+012 | 50.00% | | 24 | 14 | 6.8225e+013 | 26.73% | | 25 | 64 | 7.1110e+014 | 12.50% | | 26 | 64 | 1.5316e+015 | 12.50% | | 27 | 33 | 1.6087e+015 | 17.41% | | 28 | 10 | 9.4016e+014 | 31.62% | +----+----------+--------------+------------+ SUBSET_MODE 4650845 samples 47286 size 30 hits +----+----------+--------------+------------+ | Cl | Count | E(nr/grid) | E(rel err) | +----+----------+--------------+------------+ "CB_MODE" | 20 | 117 | 3.9307e+006 | 26.82% | E(rel err) for | 21 | 6060 | 1.2419e+009 | 3.96% | comparison ... | 22 | 116056 | 1.5856e+011 | 1.77% | | 23 | 679751 | 6.8490e+012 | 1.05% | 50.00% | 24 | 1265701 | 1.0567e+014 | 0.75% | 26.73% | 25 | 788319 | 6.2522e+014 | 0.67% | 12.50% | 26 | 166912 | 1.4827e+015 | 0.84% | 12.50% | 27 | 12368 | 1.5106e+015 | 1.76% | 17.41% | 28 | 302 | 6.6395e+014 | 7.62% | 31.62% | 29 | 6 | 3.4956e+014 | 40.82% | +----+----------+--------------+------------+

Regards,
Blue.

by **Red Ed** » Fri Jun 21, 2013 6:22 pm

Thank you for the confirmation, Blue. It's quite a relief to see that I've not hallucinated all this stuff!

To Denis, that's not an "overall" conclusion you're referencing -- it reflects only the work done on the 30+ clues problem.

Website · by **denis_berthier** » Sat Jun 22, 2013 3:37 am

Hi Blue,

So now, we also have the white rabbit jumping out of the hat !
You joined this forum in March 2013 and the Programmer's forum in May 2010, but in 2009 you had a program doing the same computations as Red Ed.

And we now have two different people with (independent ?) programs supposedly able to do fast computations that could have led to an estimate of the distribution of clues, but, in 2009, none of them thought of running it 10 minutes and stating his results.

Website · by **denis_berthier** » Sat Jun 22, 2013 3:44 am

Red Ed wrote:To Denis, that's not an "overall" conclusion you're referencing -- it reflects only the work done on the 30+ clues problem.

One more lie. Just see the content and the context of this post.

Red Ed - Posted: Wed Oct 14, 2009 6:59 am p. 37 of the pdf files - wrote:On the parts of the distribution that we're each focussing on, it seems we each have a method tuned for the purpose: my supersets method has no hope of beating yours below 30 clues (nor does it try to); and your paths method has no hope of beating mine above 30 clues (nor do you proclaim any interest in doing).

Moreover

Red Ed - Posted: Sat Jan 09, 2010 2:01 pm p.43 (the last one) of the pdf files - wrote:As explained on the original minimal puzzles thread, non-correlation of puzzles is not necessary when
estimating the relative error of the (experimental) mean number of minimals. Link: <here>. There's a
computational cost associated with this "bonus" feature of non-correlation which makes suexg-cb less effective than subsets/supersets for minimal count estimation when n<26ish or n>29ish.

No claim at that time about the main part of the distribution.

And if your computation time estimates are not new, why did you write:

Red Ed, 16 Jun 2013, 09:27 wrote:The new bit, my dear inattentive correspondent, is the quantification of just how much better one can do than the path-based method in the main part of the distribution (below 30 clues).

So, you should decide: is there anything new or not?

by **Red Ed** » Sat Jun 22, 2013 7:39 am

Question my technical work all you like, Denis, but you can quit calling me a liar. Stop now.

No claim was made in 2009 about the main part of the distribution because I hadn't done the test. At the tails, it was obvious that suexg-cb was inferior, so I could claim that with confidence. In terms of hard evidence (a side-by-side test on the same platform), well, take a look at the last five posts of the RDMP thread: I offered, but you weren't interested.

by **blue** » Sat Jun 22, 2013 7:44 am

Hi Denis,

denis_berthier wrote:You joined this forum in March 2013 and the Programmer's forum in May 2010, but in 2009 you had a program doing the same computations as Red Ed.

I have been following both forums since late 2005.

Sometime in December 2006, shortly after this post by ocean, I had code implementing the methods and math in your "controled bias" method, (which, you'll recall, I said I preferred to cell the "known bias" method). It estimated both the number of minimal puzzles of each size (as you do), and the number of valid (minimal or non-minimal) puzzles of each size. It was not optimized to focus exclusively on the "minimal puzzle" estimates, and thier convergence rate was poor. I remember the "~257000 itterations per minimal puzzle counted" number from that time. That number, you'll recall, is specific to the "known/controlled bias" method. When the discussions in 2009 started, I dug out the old code and modified it to focus on the minimal puzzle estimates, and optimized it accordingly. I also played around with code implementing Red Ed's subset and superset methods.

denis_berthier wrote:And we now have two different people with (independent ?) programs supposedly able to do fast computations that could have led to an estimate of the distribution of clues, but, in 2009, none of them thought of running it 10 minutes and stating his results.

Independent programs, yes.
As you say, I didn't join this forum until recently.

Regards,
Blue.

Website · by **denis_berthier** » Sat Jun 22, 2013 7:58 am

Red Ed wrote:No claim was made in 2009 about the main part of the distribution because I hadn't done the test.

So, the fact is, you didn't have in 2009 the results you are now presenting in the introductory post in a way suggesting that you had them.

Red Ed wrote:At the tails, it was obvious that suexg-cb was inferior, so I could claim that with confidence. In terms of hard evidence (a side-by-side test on the same platform), well, take a look at the last five posts of the RDMP thread: I offered, but you weren't interested.

Since when do you need my authorisation to make a test?

It's true that I wasn't interested and I'm no more now. As you wrote it, our goals are different. I need uncorrelated samples. And I think that your comparisons of computing times are totally absurd; why don't you compare your algorithm with a top-down generator?

by **Red Ed** » Sat Jun 22, 2013 10:14 pm

Denis> So, the fact is, you didn't have in 2009 the results you are now presenting in the introductory post ...
Denis> ... It's true that I wasn't interested and I'm no more now.
:roll:

Denis> And I think that your comparisons of computing times are totally absurd
The relative error of an estimate from pretty much any sensible program will be k/sqrt(time), where k is a constant, so estimating the relative error after ten minutes gives you k/sqrt(600). Lower k -- equivalently lower k/sqrt(600) -- is better. Doesn't seem absurd to me at all. Perhaps you could explain your objection.

Website · by **denis_berthier** » Sun Jun 23, 2013 4:57 am

Red Ed wrote:Denis> And I think that your comparisons of computing times are totally absurd; why don't you compare your algorithm with a top-down generator?
Perhaps you could explain your objection.

If you didn't truncate my objections, you'd need no more explanations.

Moreover, comparison can't be limited to #clues:

Good tactics not to answer my objections and to ask me instead to explain mine, but it doesn't work. You haven't yet given any sensible explanation why you didn't run the program you claim you had 4 years ago. Who could believe you had a program that would have proven your technique was "better" (in your limited view) than mine and you wouldn't have run it 10 minutes? Anyone reading the "real distribution" thread will see how motivated you were for showing you were the best (which may not be the best motivation of all, but I don't consider it so illegitimate either).

by **Red Ed** » Sun Jun 23, 2013 6:41 am

Denis, this is a thread about counting minimal puzzles. Do you not agree that the timings presented here prove that your "controlled bias" approach converges drastically slower than the "subsets" method for low c? If yes, there's nothing more to say about my "absurd" ten-minute experiment; if no, then are you able to suggest a better approach to comparing the two methods?

Moreover, comparison can't be limited to #clues:

Look at the thread title. That limitation is exactly the right one.

... the program you claim you had 4 years ago. Who could believe you ...

Do not call me a liar. Final warning. If the original minimals-counting thread (containing the code and sample output) had not been lost then I wouldn't have to put up with this sniping. Focus your responses on technical matters, not on who-did-what-when, and we might get somewhere.

Website · by **denis_berthier** » Sun Jun 23, 2013 6:54 am

by **ronk** » Sun Jun 23, 2013 2:59 pm

Red Ed wrote:
denis_berthier wrote:... the program you claim you had 4 years ago. Who could believe you ...
Do not call me a liar. Final warning. If the original minimals-counting thread (containing the code and sample output) had not been lost then I wouldn't have to put up with this sniping. Focus your responses on technical matters, not on who-did-what-when, and we might get somewhere.

Red Ed, I believe you. Over the years you have established a reputation for integrity that is unquestionable IMO.

The New Sudoku Players' Forum

Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.

Re: Counting minimal puzzles: subsets, supersets, etc.