ravel wrote:When comparing these 38's
- Code: Select all
.........9..26...44.6.8...3....5..1.14..985.25.9.124382.5..6.416.4.213.531.5....6
.7.......9.1.87.4.4.6.91.871......2..4.9.2.5.5...1.4..2.5.7...4614.29.7578.15.26.
i got a distance of 30
I think that similarity is a more interesting measure than distance. Now wait! That's not just a "subtract from 81"-type comment. The two measures are quite different creatures (perhaps this has already been mentioned). Let ...
E = nr cells that are empty in both puzzles
S = nr occupied cells that are the same in both puzzles
D = nr occupied cells that are different in both puzzles
T = nr cells that have different type (occupied/empty) in the two puzzles
... so
E +
S +
D +
T = 81. You've been working with
distance = D + T. Now I can see that that's interesting from an edit-distance / neighbourhood tour point of view, but I say that you should also be interested in the measure
similarity = S.
Here's why. We can regard a puzzle as a graph with each vertex = a value in a cell and each edge = relationship between them (e.g. same value, same band; or different value, same row, different stack). Viewed that way, two puzzles are isomorphic iff their graphs are equal (modulo labelling) -- in fact this is how Gordon said that he checks whether or not a 17 is new. If two puzzles are not isomorphic then we should be interested in the size of the largest subgraph (= sub-puzzle) in common to both. That's "similarity" as defined above.
For example, comparing ravel's two 38s:
- Code: Select all
Target: .........9..26...44.6.8...3....5..1.14..985.25.9.124382.5..6.416.4.213.531.5....6
Permed (new): .3...5..99..21..3.1.......5.......6.3...685.28.6.524.3..3..69.16.9.2135441.5.9.86
distance = 30 ***
similarity = 16
Target: .........9..26...44.6.8...3....5..1.14..985.25.9.124382.5..6.416.4.213.531.5....6
Permed (new): ....6..9.94.2...8...6.8.2...6835.91...5.98..24.9612.585.3.46.2.6.4.21.35...5.....
distance = 31
similarity = 20 ***
The distance measure in this case says that you can get from one puzzle to (an isomorph of) the other by editing 30 cells; whereas the similarity measure says that, up to (a different) isomorphism, the largest sub-puzzle in common has 20 clues.
Both measures are interesting: we shouldn't just limit ourselves to "distance".
Now a challenge:
- The biggest distance I've seen between a pair of 17s is 16.
- The smallest similarity I've seen between a pair of 17s is 8.
Can anyone do better? (I only checked a smallish random sample.)
~~~
PS: regarding this ...
gsf wrote:I'm hoping there's a better way than running p! combinations on p matching positions,
but I don't see it right now
There is a
much better way: it's called the Hungarian (or Munkres')
Assignment Algorithm.