High clue tamagotchis

Everything about Sudoku that doesn't fit in one of the other sections

Re: High clue tamagotchis

Postby gsf » Sun Sep 05, 2010 3:54 pm

if you look further in the that thread
the algorithm posts tie distance/similarity to the validity-preserving transforms
so we are talking about a new distance measure here
maybe tour distance or neighborhood distance:

definition wrote:given two puzzles A and B
the neighborhood distance between A and B is the the smallest n where {-n+n} applied to A includes an isomorph of B


does this describe what you are doing dobrichev?

ronk wrote:
dobrichev wrote:
gsf wrote:all of the hamming/similarity discussions on the players and programmers forums only allow the 9 sudoku transforms (or their equivalent)

9 sudoku validity-preserving transformations, which is not the case.

I agree with your usage of the Hamming distance term, as my terse 2007 post implies.

A percentage of 39s, maybe even a significant percentage, were found with "neighborhood searches" of known 39s, as with -go{-4+4} using gsf's program. Were none or only one of two such neighbors subsequently (iso)morphed, I believe gsf's -C option would find the correct Hamming distance. However, when both are morphed, as with canonicalization, the probability of its finding the correct Hamming distance may be quite low.
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: High clue tamagotchis

Postby gsf » Sun Sep 05, 2010 3:56 pm

Red Ed wrote:It's really only my port of someone else's code.

The old version of that now-fixed line of code was something like:
Code: Select all
        tscore += ((gpermed[i/9][i%9]==0) ^ (gtarget[i/9][i%9]==0));

I look forward to MD's next post. I he'll have a bug that is bigger and hairer than the tscore one ...

can you post the puzzles that exposed the bug?
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: High clue tamagotchis

Postby Red Ed » Sun Sep 05, 2010 4:36 pm

It was the pair posted by MD. My code permutes one 3359232 ways whilst keeping the other fixed; and for each, it solves the assignment problem (using the Munkres code). The choice of which is fixed vs. which was permuted is what made the difference. The difference is only present in the wrapper code, not the Munkres part.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: High clue tamagotchis

Postby Red Ed » Sun Sep 05, 2010 9:18 pm

Ah-ha. MD edited his post without telling us; now it says
dobrichev wrote:If there is no bug in my program, the Hamming distance between the following two puzzles is <= 5.
[EDIT: There was a bug and the result is INCORRECT. See the posts below.]
I call that round to the Old School :ugeek:
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: High clue tamagotchis

Postby ronk » Sun Sep 05, 2010 9:22 pm

ronk wrote:A percentage of 39s, maybe even a significant percentage, were found with "neighborhood searches" of known 39s, as with -go{-4+4} using gsf's program. Were none or only one of two such neighbors subsequently (iso)morphed, I believe gsf's -C option would find the correct Hamming distance. However, when both are morphed, as with canonicalization, the probability of its finding the correct Hamming distance may be quite low.

RETRACTION: My last sentence above makes absolutely no sense to me now, so I await the info from dobrichev.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Re: High clue tamagotchis

Postby Red Ed » Sun Sep 05, 2010 9:45 pm

Bugs aside, it looks like MD's calculating the same distance that gsf is and I am. The distance comes from finding the right: validity-preserving permutation of cells (3359232 choices), consistent relabelling (9! choices) and single-cell edits (81c5 x 9^5 choices if there are five edits). MD works this out by looping over the edits and then using canonicalisation to solve for permutation & relabelling. Gsf and I loop over permutations, use the Munkres assignment algorithm to solve for relabelling, and then just read off the edits. The latter approach is much faster.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: High clue tamagotchis

Postby dobrichev » Sun Sep 05, 2010 9:50 pm

ronk wrote:...
dobrichev, many of the 39s are not close neighbors. What caused you to choose this close pair :?:
Red Ed wrote:
dobrichev wrote:Can repeat and obtain transformations if there is interest.
Yes please.

I'm interested as well.

Unfortunately the example was wrong. Sorry.
I didn't choose a pair, but run the program against a list of 75 known 39s. It should process each puzzle and print the similar ones. I messed up the puzzle lists so the program processed the second puzzle and found its isomorph but wrongly displayed it as a transformation of the first puzzle (which is in another list, ordered differently).

Nevertheless I got some similar results.

I ran the "relabel 3 clues", w/o +1/-1, against the same list. Here are the results.
Code: Select all
103056709056009002709210605210030500307065200065120308000000000000091807071000903
103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=9

103056709056009002709210605210030006307065200065120308000091807000000000071000903
103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=7

103056780056080020708201560210003050307065200065102390000000000000018970071000830
103056780400780100708013064205870046004035870800604050302000610501360000000000000
c[18]=8,c[20]=7,c[45]=7

103056780056080020708201560210003600307065200065102390000018970000000000071000830
103056780400780100708013064205870046004035870007604050302000610501360000000000000
c[18]=8,c[20]=7,c[47]=8

103056780400780100708013064205870046004035870800604050302000610501360000000000000
103056780056080020708201560210003050307065200065102390000000000000018970071000830
c[24]=6,c[25]=5,c[34]=6

103056780400780100708013064205870046004035870007604050302000610501360000000000000
103056780056080020708201560210003600307065200065102390000018970000000000071000830
c[24]=6,c[25]=5,c[33]=5

120400780406700100780102064208305610000000000630201058370004800804003070062007040
120400780406700100780102064208301650000000000630205018370004800804003070062007040
c[0]=5,c[15]=5,c[21]=5

120400780406700100780102064208301650000000000630205018370004800804003070062007040
120400780406700100780102064208305610000000000630201058370004800804003070062007040
c[0]=5,c[15]=5,c[21]=5

103056709406709103790100500065004900304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903
c[24]=5,c[26]=6,c[35]=5

103056709406709103790100500065004007304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903
c[24]=5,c[26]=6,c[33]=6

The transformations are in ugly form (sorry again) and must be read as follows.
Apply the transformation from the third row to the min-row-lex normalized puzzle on the second row.
Then min-row-lex normalize the transformed puzzle and you will get the result on first row.
c[6]=9 means relabel cell at zero based position 6 to value of 9 .
In other words, the puzzle at first row can be obtained from the puzzle from second row, applying the transformation at third row. Both puzzles are in row-min-lex format.

So, these pairs are at Hamming distance of 3. Up to errors :oops: .

Next, I ran "relabel 3 then apply -1/+1". This time against a list of 82 known 39s.
Calculation is still in progress, ~14 minutes per puzzle @ 2.8 GHz.
Here are the results for the first 8 puzzles.
Hidden Text: Show
Code: Select all
003400700450780020780023460038090640600308950900000000070800290090032070802970530
c[10]=1,c[43]=1,c[78]=1,c[67]=0,c[54]=3
003400700410780020780023460038090640600308910900000000370800290090002070802970130
003400700450780020780023460038090640600308950900000000370800290090002070802970530

003400700450780020780023460038090640600308950900000000070800290090032070802970530
c[25]=1,c[33]=1,c[36]=1,c[67]=0,c[54]=3
003400700450780020780023410038090140100308950900000000370800290090002070802970530
003400700450780020780023460038090640600308950900000000370800290090002070802970530
.
003400700450780020780023460038090640600308950900000000370800290090002070802970530
c[10]=1,c[43]=1,c[78]=1,c[54]=0,c[67]=3
003400700410780020780023460038090640600308910900000000070800290090032070802970130
003400700450780020780023460038090640600308950900000000070800290090032070802970530

003400700450780020780023460038090640600308950900000000370800290090002070802970530
c[25]=1,c[33]=1,c[36]=1,c[54]=0,c[67]=3
003400700450780020780023410038090140100308950900000000070800290090032070802970530
003400700450780020780023460038090640600308950900000000070800290090032070802970530
.
003400700450780020780023460038090640600308950900000000090800270070032090802970530
c[10]=1,c[43]=1,c[78]=1,c[67]=0,c[54]=3
003400700410780020780023460038090640600308910900000000390800270070002090802970130
003400700450780020780023460038090640600308950900000000390800270070002090802970530

003400700450780020780023460038090640600308950900000000090800270070032090802970530
c[25]=1,c[33]=1,c[36]=1,c[67]=0,c[54]=3
003400700450780020780023410038090140100308950900000000390800270070002090802970530
003400700450780020780023460038090640600308950900000000390800270070002090802970530
.
003400700450780020780023460038090640600308950900000000390800270070002090802970530
c[10]=1,c[43]=1,c[78]=1,c[54]=0,c[67]=3
003400700410780020780023460038090640600308910900000000090800270070032090802970130
003400700450780020780023460038090640600308950900000000090800270070032090802970530

003400700450780020780023460038090640600308950900000000390800270070002090802970530
c[25]=1,c[33]=1,c[36]=1,c[54]=0,c[67]=3
003400700450780020780023410038090140100308950900000000090800270070032090802970530
003400700450780020780023460038090640600308950900000000090800270070032090802970530
...
103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=8,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=1,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=2,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=3,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=4,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=5,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004007304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[35]=6,c[35]=0,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903
.
103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=1,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=2,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=3,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=4,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=5,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=6,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=7
103056907406709103790100500065004700304907005970065204000000000630000401000601302
103056709056009002709210605210030006307065200065120308000091807000000000071000903

103056709406709103790100500065004900304907005970065204000000000630000401000601302
c[6]=9,c[8]=7,c[33]=8,c[33]=0,c[35]=9
103056907406709103790100500065004009304907005970065204000000000630000401000601302
103056709056009002709210605210030500307065200065120308000000000000091807071000903
.

The rows are: take the puzzle from line 1, apply transformation from line 2, and you get the new puzzle at line 3, which is isomorph of the puzzle at line 4.

Now some fun from the first puzzle.
c[10]=1,c[43]=1,c[78]=1,c[67]=0,c[54]=3 actually means "replace all occurrences of "5" with "1", then apply -1 at pos 67 and +1 at pos 54". The relabeling is unnecessary, and the actual Hamming distance is 2. That is side effect because I assumed a cheaper transformations are already done at previous steps, and a "black list" with knowns is formed. For these experiments I excluded the black list and each puzzle is compared only to itself to avoid outputting the isomorphs.

MD
dobrichev
2016 Supporter
 
Posts: 1871
Joined: 24 May 2010

Re: High clue tamagotchis

Postby dobrichev » Sun Sep 05, 2010 9:54 pm

Red Ed wrote:Ah-ha. MD edited his post without telling us; now it says
dobrichev wrote:If there is no bug in my program, the Hamming distance between the following two puzzles is <= 5.
[EDIT: There was a bug and the result is INCORRECT. See the posts below.]
I call that round to the Old School :ugeek:

Sorry for the delay between editing and posting what actually was wrong.
dobrichev
2016 Supporter
 
Posts: 1871
Joined: 24 May 2010

Re: High clue tamagotchis

Postby dobrichev » Sun Sep 05, 2010 9:58 pm

Red Ed wrote:Bugs aside, it looks like MD's calculating the same distance that gsf is and I am. The distance comes from finding the right: validity-preserving permutation of cells (3359232 choices), consistent relabelling (9! choices) and single-cell edits (81c5 x 9^5 choices if there are five edits). MD works this out by looping over the edits and then using canonicalisation to solve for permutation & relabelling. Gsf and I loop over permutations, use the Munkres assignment algorithm to solve for relabelling, and then just read off the edits. The latter approach is much faster.


Sure. It is pretty good to know what you are doing.
My goal was not distance calculation.
dobrichev
2016 Supporter
 
Posts: 1871
Joined: 24 May 2010

Re: High clue tamagotchis

Postby dobrichev » Sun Sep 05, 2010 10:47 pm

gsf wrote:if you look further in the that thread
the algorithm posts tie distance/similarity to the validity-preserving transforms
so we are talking about a new distance measure here
maybe tour distance or neighborhood distance:

definition wrote:given two puzzles A and B
the neighborhood distance between A and B is the the smallest n where {-n+n} applied to A includes an isomorph of B


does this describe what you are doing dobrichev?

No.
From Wikipedia, "Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. Put another way, it measures the minimum number of substitutions required to change one string into the other, or the number of errors that transformed one string into the other."
Since in this definition the word "symbol" is most questionable, it is the value of the cell. In the puzzles, there is a symbol marking the cell as "unknown", and it is not different in any way to the rest symbols [1..9].

What am I doing? I am trying to collect some observations on puzzle similarities. Later I'll try to form some criteria about the promising unavoidable sets which are converting one valid puzzle to another valid puzzle. You are close.
My first observations are that small sized UA don't transform so frequently the minimal 17s to valid ones. There are some peaks in U6, U12, U18, U20, but this can be an effect of the UA generation process.
From other side, collecting more UA (which increases the average size) causes decreasing of the proportion between "good" and "bad" UA.

PS. Actually UA are transforming solutions, not puzzles. One more variable - how many clues from a puzzle are hitting UA set which transforms a fruitful grid to another fruitful grid?
dobrichev
2016 Supporter
 
Posts: 1871
Joined: 24 May 2010

Re: High clue tamagotchis

Postby dobrichev » Sun Sep 12, 2010 3:44 am

Are these 39s already known?
Code: Select all
.....8.53.853.497.39....8...428...95.........95.4.2187.21....39.792.351.53.1..72.
.....8..5..894.6.34163.5.892..4.95.6.........6.45.2..114.2.3.68.638.41..8.21..3..
...........3..6..2.1..8.639....73..5.758.1.63.3865..71..71...2..81.27.96.29568.17
...........3..8..4.9..1.835....73..2.721.9.83.3182..97..79...4..19.47.58.45281.79

Edit: updated to isomorphic equivalents not in minlex form.
Last edited by dobrichev on Sun Sep 12, 2010 7:30 pm, edited 1 time in total.
dobrichev
2016 Supporter
 
Posts: 1871
Joined: 24 May 2010

Re: High clue tamagotchis

Postby ronk » Sun Sep 12, 2010 11:45 am

dobrichev wrote:Are these 39s already known?

Not to my knowledge. Congratulations!

Can I persuade you to [edit your post to] "scramble" those 39s to avoid publishing in canonicalized form?
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Re: High clue tamagotchis

Postby Red Ed » Sun Sep 12, 2010 12:17 pm

ronk wrote:Can I persuade you to [edit your post to] "scramble" those 39s to avoid publishing in canonicalized form?

I can't ignore this ... what's wrong with "publishing" (to this site?) in (some) canonical form?
Personal confusion level 6.67e21 and rising ... :?
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby Pat » Sun Sep 12, 2010 1:02 pm

Red Ed wrote:what's wrong with "publishing" (to this site?) in (some) canonical form?


this
Code: Select all
1.3..6789..7..9..6.......1.2.196..75..85.19.2..5.28.613.28.56975..6.....7...92.5.
tends to spoil the puzzle for people wishing to solve it (notice row 1)

prefer this --
Code: Select all
12345.6..56.7.142...46..5..21786.3...3.1..862.........35.....4.7.154.23..42...7.5
User avatar
Pat
 
Posts: 4056
Joined: 18 July 2005

Re: High clue tamagotchis

Postby Red Ed » Sun Sep 12, 2010 3:04 pm

ack
Red Ed
 
Posts: 633
Joined: 06 June 2005

PreviousNext

Return to General