High density files for solution grids and 18 clues puzzles

Programs which generate, solve, and analyze Sudoku puzzles

Re: High density files for solution grids and 18 clues puzzl

Postby champagne » Fri Oct 25, 2024 9:16 am

Assuming the virtual catalog tested and cleaned, I am back to the topic "high density files..."

Although I see low interest currently, I am convinced that one day, the problem will come back.

Having in mind the last data base extensions in the field of the tridagon pattern, I think more and more that a canonical form

rank of the solution
81 bit field of the minimal sudoku pattern in this solution grid


is a good target.

This is in text mode something as

10 characters for the rank,
81/6 14 character for the pattern expressed in text mode, 6 bits per printable character


with then a potential for a .zip compressor.

Such a pattern is very easy to decompress using the virtual catalog.
Compression is a little harder with several steps:

Code: Select all
find the solution of the puzzle
morphs it to min lexical (and solve auto morphs if any)
Get the rank.


All tools are there now , but my brute force solver is not in DLL mode. I intend to build this DLL as next step, with all entries used in my programs.
champagne
2017 Supporter
 
Posts: 7654
Joined: 02 August 2007
Location: France Brittany

Re: High density files for solution grids and 18 clues puzzl

Postby coloin » Fri Oct 25, 2024 6:03 pm

champagne wrote:81 bit field of the minimal sudoku pattern in this solution grid ....
81/6 14 character for the pattern expressed in text mode, 6 bits per printable character....

thats a big leap again !!! but maybe it can be done to go from minlex solution grid and minlex pattern -> actual puzzle.

starting from a min lex pattern [of the puzzle] ... what would be the best way to describe and then define the 6^8 x 2 permutes ....[3,359,232] ?

identify a box to be box1 [1 - 9]
+/- transpose - 0 or 1
row swap 1-3 [012345]
col swap 1-3 [012345]
identify box to be box5 [0123]
row swap 4-6 [012345]
col swap 4-6 [012345]
row swap 7-9 [012345]
col swap 7-9 [012345]

9x2x6x6x4x6x6x6x6 = 3,359,232

I guess which ever is easiest to program efficiently ....
coloin
 
Posts: 2603
Joined: 05 May 2005
Location: Devon

Re: High density files for solution grids and 18 clues puzzl

Postby champagne » Fri Oct 25, 2024 7:33 pm

coloin wrote:
champagne wrote:81 bit field of the minimal sudoku pattern in this solution grid ....
81/6 14 character for the pattern expressed in text mode, 6 bits per printable character....

thats a big leap again !!! but maybe it can be done to go from minlex solution grid and minlex pattern -> actual puzzle.

starting from a min lex pattern [of the puzzle] ... what would be the best way to describe and then define the 6^8 x 2 permutes ....[3,359,232] ?

identify a box to be box1 [1 - 9]
+/- transpose - 0 or 1
row swap 1-3 [012345]
col swap 1-3 [012345]
identify box to be box5 [0123]
row swap 4-6 [012345]
col swap 4-6 [012345]
row swap 7-9 [012345]
col swap 7-9 [012345]

9x2x6x6x4x6x6x6x6 = 3,359,232

I guess which ever is easiest to program efficiently ....


In my DLL minlexing a given, the permutation is defined in the following way

Code: Select all
Transpose first or not
reorder rows
reorder columns
1296 * 1296 * 2 = 3 359 232 permutations


And the first rows gives the digit map.

The minlexing operation delivers also the list of auto morphs.

So where I am, I don't see this as a big leap. The only new point is to apply grid auto morphs to get the canonical view.

But a DLL for the brute force would be welcome, to have a relatively simple process.

Another issue is to have it accepted as the canonical form of any puzzle;
champagne
2017 Supporter
 
Posts: 7654
Joined: 02 August 2007
Location: France Brittany

Re: High density files for solution grids and 18 clues puzzl

Postby RichardGoodrich » Sat Jun 21, 2025 11:00 pm

dobrichev wrote:Hi Champagne,

A 10 bytes/grid implementation is here.
You need a 50GB lookup file if you want to extract a grid by its number. gridchecker laso has functionalities to get range of grids by index.

There is one free bit in the grid catalogue which can be used to store one of the 81-bit uncompressed bitmask of the givens. In total 20 bytes/puzzle, w/o 50GB catalogue, and each record is self-consistent.
Using grids catalogue you can store the grid index (17 bits) + somehow compressed givens bitmap.

On the other hand you can ignore the solution and represent the 18 givens in base-9 format (< 56 bits = 7 bytes assuming min-lex relabelling; < 10 bytes for the solution), again plus the pattern.

The pattern could be compressed by storing for each given the distance to previous one and somehow encoding into variable number of bits depending on statistics. If the distance can be encoded in less than 4.5 bits in average (81 bits uncompressed / 18 = 4.5) then pattern compression wins.
Assuming pattern min-lex, several positions at the beginning are guaranted empty and could be excluded from encoding - both uncompressed or compressed.

Note that applying such acrobatics would result in a file that isn't well compressable by standard utilities like ZIP.


[Aside]
Compression upon Compression is probably a path of diminishing returns? I have Python code that compresses and encodes/decodes a puzzle to a 58 char [I think] line with a special aphabet that encodes both the original puzzle and its solution. Could send a reference to the paper. However, that is NOT really relavent here!
Big1952
RichardGoodrich
 
Posts: 113
Joined: 12 December 2012
Location: Josephine, TX (north of Dallas, TX)

Previous

Return to Software