catalog of all essentially different grids

Re: catalog of all essentially different grids

Postby dukuso » Wed Jun 09, 2010 5:13 pm

you mentioned the issue above:
e.g. my method or (sadly lost in the crash) holdout's.

I have no idea what it is.
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby dukuso » Wed Jun 09, 2010 5:18 pm

M. T. Jacobson and P. Matthews, Generating uniformly distributed random Latin squares, J. Combinatorial Designs 4 (1996), 405-437

http://www.stasbusygin.org/programs/jacmat.c

copyright?


----------edit-----------

see this thread:
http://www.setbb.com/phpbb/viewtopic.ph ... rum=sudoku

unfortunately Red Ed didn't post it to that forum

I may have understood that algorithm once, but have now forgotten it.
It would be neat to have something like that for Sudokus.
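
For reference, a from-memory sketch of the Jacobson-Matthews move (the paper and jacmat.c above are the authoritative versions; the order N, the starting square, and the burn-in count here are illustrative). The square is held as a 0/1 incidence cube M[r][c][s] with all line sums equal to 1, and one cell is allowed to sit at -1 between moves (the "improper" states):

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 9
static int M[N][N][N];            /* incidence cube: M[r][c][s] */
static int pr, pc, ps, improper;  /* the -1 cell, when improper */

static int rnd(int n) { return rand() % n; }

/* pick a cell with value 1 on one line of the cube: it is unique in a
 * proper square, and one of two when the line carries the -1 cell */
static int pick_r(int c, int s) {
    int x, h[2], n = 0;
    for (x = 0; x < N; x++) if (M[x][c][s] == 1) h[n++] = x;
    return h[n > 1 ? rnd(2) : 0];
}
static int pick_c(int r, int s) {
    int x, h[2], n = 0;
    for (x = 0; x < N; x++) if (M[r][x][s] == 1) h[n++] = x;
    return h[n > 1 ? rnd(2) : 0];
}
static int pick_s(int r, int c) {
    int x, h[2], n = 0;
    for (x = 0; x < N; x++) if (M[r][c][x] == 1) h[n++] = x;
    return h[n > 1 ? rnd(2) : 0];
}

static void move(void) {
    int r, c, s, r1, c1, s1;
    if (improper) { r = pr; c = pc; s = ps; }  /* must fix the -1 cell */
    else do { r = rnd(N); c = rnd(N); s = rnd(N); } while (M[r][c][s]);
    r1 = pick_r(c, s); c1 = pick_c(r, s); s1 = pick_s(r, c);
    /* +/-1 on alternate corners of the box {r,r1}x{c,c1}x{s,s1} */
    M[r][c][s]++;  M[r][c1][s1]++; M[r1][c][s1]++; M[r1][c1][s]++;
    M[r][c1][s]--; M[r1][c][s]--;  M[r][c][s1]--;  M[r1][c1][s1]--;
    improper = M[r1][c1][s1] < 0;
    if (improper) { pr = r1; pc = c1; ps = s1; }
}

int main(void) {
    int r, c, s, i;
    srand((unsigned)time(NULL));
    for (r = 0; r < N; r++)           /* start from the cyclic square */
        for (c = 0; c < N; c++)
            M[r][c][(r + c) % N] = 1;
    do {                              /* burn in, ending in a proper state */
        for (i = 0; i < 10000; i++) move();
    } while (improper);
    for (r = 0; r < N; r++, putchar('\n'))
        for (c = 0; c < N; c++)
            for (s = 0; s < N; s++)
                if (M[r][c][s] == 1) putchar('1' + s);
    return 0;
}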
Last edited by dukuso on Wed Jun 09, 2010 5:59 pm, edited 1 time in total.
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby Red Ed » Wed Jun 09, 2010 5:43 pm

dukuso wrote:I have no idea what it is.
Unfortunately, no one can be told what The Matrix is. You have to see it for yourself. :ugeek:
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: catalog of all essentially different grids

Postby gsf » Wed Jun 09, 2010 5:53 pm

Red Ed wrote:
gsf wrote:I was thinking about the ~400K S-class generators (we don't have them yet)
Oh, that. Okay. If X-classes generate T-classes then maybe W-classes generate S-classes ... Whatever; it doesn't look like a win because of the overhead of keeping a cumulative version of the anti-dup bitstream that tells you how many new S-classes the given X- (or W- ... I can't keep up) class generates.

I'm not seeing any problem with my method or holdout's. What issue with those are you trying to solve?

for me the main issue is how to efficiently distribute the catalog of essentially different grids
I have it down to 5GiB (~8 bits/grid) => still too painful to download
but dukuso's idea of using ~400000 generator grids and a duplicate-grid bitstream could get it under 500MiB
I brought up random grid generation as a fleeting thought
but after writing it down, the idea of using it to generate random grids doesn't fly
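
as a back-of-envelope check of those two sizes (a sketch; 5.4727e9 is the S-class count implied by the catalog sizes in this thread, and 1.2e10 is the candidate-grid count dukuso gives below):

Code:
#include <stdio.h>

int main(void)
{
    double sclasses = 5.4727e9;    /* essentially different grids */
    double candidates = 1.2e10;    /* grids emitted by the generators */
    double GiB = 1024.0 * 1024 * 1024;

    /* ~8 bits/grid over the S-classes: the 5GiB figure */
    printf("catalog:   %.1f GiB\n", sclasses / GiB);
    /* 1 bit per *candidate* grid, not per S-class: the corrected
     * bitstream size (see the exchange below) */
    printf("bitstream: %.1f GiB\n", candidates / 8 / GiB);
    return 0;
}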
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: catalog of all essentially different grids

Postby Red Ed » Wed Jun 09, 2010 7:01 pm

You can trade download pain for CPU pain.

For example:

  • Let {c1,...,cN} be a subset of {1,...,81}. N=81 is the extreme "download pain" option; N=0 is the extreme "CPU pain" option.

  • Let C (for catalogue) be an empty set.

  • For each of the 5.4e9 S-classes:
    • Get a representative grid in a convenient canonical form (min-lex might not be convenient; the form in the Sudopedia article's probably better).
    • Add the N-long string of values at positions {c1,...,cN} to C.
  • Publish a compressed version of C.
The recipient then loops over N-tuples in C, 'solves' each like a puzzle, and keeps the results which are equal to their own canonical form.

I don't know what N would make sense, though; nor how to choose {c1,...,cN}, or indeed how to compress C. I think that just means the method's sufficiently ill-defined that you can't contradict me immediately ;)
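
Still, the recipient's side is mechanical. A minimal sketch (the choice of {c1,...,cN}, the canonical form, and the compression of C are all open above, so they are stubbed here): one catalog entry, expanded into an 81-character template with '.' in the unpublished cells, is read from stdin and its completions counted by plain backtracking; a real filter would canonicalize each completion and keep only those equal to their own canonical form.

Code:
#include <stdio.h>

static int grid[81];

/* bitmask of digits still legal in cell c (bits 1..9) */
static int candidates(int c)
{
    int r = c / 9, col = c % 9, b = r / 3 * 27 + col / 3 * 3, used = 0, i;
    for (i = 0; i < 9; i++) {
        used |= 1 << grid[r * 9 + i];               /* row    */
        used |= 1 << grid[i * 9 + col];             /* column */
        used |= 1 << grid[b + i / 3 * 9 + i % 3];   /* box    */
    }
    return ~used & 0x3fe;
}

/* count completions; assumes the published clues are mutually consistent */
static long complete(int c)
{
    long n = 0;
    int m, d;
    while (c < 81 && grid[c]) c++;      /* skip the published clues */
    if (c == 81) return 1;              /* one full grid to canonicalize */
    m = candidates(c);
    for (d = 1; d <= 9; d++)
        if (m & (1 << d)) {
            grid[c] = d;
            n += complete(c + 1);
        }
    grid[c] = 0;
    return n;
}

int main(void)
{
    char line[128];
    int i;
    if (!fgets(line, sizeof line, stdin)) return 1;
    for (i = 0; i < 81; i++)
        grid[i] = (line[i] >= '1' && line[i] <= '9') ? line[i] - '0' : 0;
    printf("%ld completions to filter\n", complete(0));
    return 0;
}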
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: catalog of all essentially different grids

Postby dukuso » Wed Jun 09, 2010 7:19 pm

> but dukuso's idea of using ~400000 generator grids and a duplicate grid bitstream
> could get it under 500Mib

we have the 306693 G-classes, which should generate all S-classes by mini-column permutation,
each at least once, within 1 hour.
But they are not minimal, although it may be possible to make them minimal quickly (~5 hours)
using the 416-sextuple.

And why 500MB? You have to filter from 1.2e10 grids, so 1.2e10 bits ≈ 1.5GB.
It's possible that this can be compressed further: there may be regions where
almost all of the 1.2e10 are new,
or regions where so few of the 1.2e10 are new that it's better to list them directly.
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby gsf » Wed Jun 09, 2010 8:14 pm

dukuso wrote:And why 500MB? You have to filter from 1.2e10 grids, so 1.2e10 bits ≈ 1.5GB.
It's possible that this can be compressed further: there may be regions where
almost all of the 1.2e10 are new.

right, I incorrectly counted bits for the 5e9 grids, not the 1.2e10
that bit stream should compress, maybe quite well depending on the length of the same-bit runs
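
one generic way those runs could be stored (a sketch only; no format is agreed here, and the varint encoding is just one choice): write the first bit, then the run lengths of the new/duplicate bitstream

Code:
#include <stdio.h>

/* variable-length integer, 7 bits per byte, high bit = continuation */
static void put_varint(unsigned long n, FILE *out)
{
    while (n >= 0x80) {
        fputc((int)(n & 0x7f) | 0x80, out);
        n >>= 7;
    }
    fputc((int)n, out);
}

/* bits[i] is 1 if grid i is a new S-class, 0 if a duplicate (n > 0);
 * emit the first bit value, then the lengths of the same-bit runs */
static void rle_encode(const unsigned char *bits, unsigned long n, FILE *out)
{
    unsigned long i = 0, run;
    fputc(bits[0], out);
    while (i < n) {
        for (run = 1; i + run < n && bits[i + run] == bits[i]; run++) ;
        put_varint(run, out);
        i += run;
    }
}

int main(void)
{
    /* tiny illustration: 3 new grids, 5 duplicates, 2 new */
    static const unsigned char bits[] = { 1,1,1,0,0,0,0,0,1,1 };
    rle_encode(bits, sizeof bits, stdout);
    return 0;
}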
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: catalog of all essentially different grids

Postby dukuso » Fri Jun 11, 2010 9:58 am

for any triple of 416-classes, calculate the number of T-classes with that combination

for any pair of triples of 416-classes, calculate (+ print a representative) the number
of S-classes with that combination

for any triple of 44-classes (10895 do occur), calculate the number of T-classes with that combination

difficult to implement?
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby dukuso » Fri Jun 11, 2010 10:00 am

G-classes 279791, 279960, 279961 each create 82944 grids (with sorted rows in bands) by mcp

the next one generates only 27648; the minimum is 256
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby gsf » Wed Jun 16, 2010 9:46 pm

dukuso wrote:> so I'm thinking of a pay-it-forward scheme where I mail an 8GiB
> USB stick containing the catalog to one recipient, and that recipient mails it to the next etc.
>
> I assume it will mostly be forum oldies who want the catalog
> we can exchange snail mail addrs via pm
>
> post here if you want on the list
> include your country so the mailing can be optimized for distance

yes, I want it. These small (~1cm × 1cm) memory chips are easy to send in a letter or even a card,
but a USB stick is also OK. Do I pay by PayPal? Address by PM

I just got an 8GiB micro-SD with an SD adapter, and copied the catalog, with md5 sums, to it
1.7GiB unused
will put it in the mail tomorrow

"payment" is the current recipient paying the postage for the next recipient
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: catalog of all essentially different grids

Postby gsf » Thu Jun 17, 2010 11:52 pm

the catalog is off to the first recipient (dukuso)
should be there by Wednesday
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: catalog of all essentially different grids

Postby dukuso » Sat Jun 19, 2010 6:16 pm

thanks, I'll tell you when it's here.
I hope I can decompress it.
Can I (easily) generate the T-classes from it?

What else could we put on that card? ... (Sudoku only?)

this memory-card rotation seems like a good idea and could be practiced by other
groups in other areas, yet I've never heard of it being done



----------------------
G-classes
Latin squares, main classes
44 gangsters, 416 gangsters
classes of 4*16 bands
magic squares, cubes

a Wikipedia article on memory chips being circulated like this?
dukuso
 
Posts: 479
Joined: 25 June 2005

Re: catalog of all essentially different grids

Postby dobrichev » Fri Dec 03, 2010 11:56 pm

For those who have only 54GB of disk space for the catalog of all essentially different grids, are interested in running their own applications over the grids, and are too lazy to generate the catalog using gsf's tools:

A 54GB catalog (compressed to 10 bytes per grid) is currently available at
ed2k://|file|grids.bin|54727305380|D2CE27215EE3E7CFD50C7D1F73D20DC8|h=RCHPLX67WPCIZO3AK776G4INGLJWKZX6|/

Source code in C, with an on-the-fly decompressor and example generic routines for grid processing, is available at
ed2k://|file|gridcatalog.zip|10215|890F3B7B1179130628C6CBE7B3207B87|h=5E6SG5XIPFGPQURE7B4CH3WXUF2USRWO|/

I am sharing these files at a friend's request and will keep them available for the next week.

Note that it is faster to uncompress the catalog from gsf's archive than to download 54GB. If disk space is limited, it is possible to extract chunks from gsf's archive (one band at a time), compress the chunks to the 10 B/grid format, erase the uncompressed data, and concatenate the re-compressed chunks.

The download will speed up significantly if enough people are interested and leave the file in their ed2k clients (eMule) for several days.

MD
dobrichev
2016 Supporter
 
Posts: 1850
Joined: 24 May 2010

Re: catalog of all essentially different grids

Postby gsf » Sun Dec 05, 2010 9:19 am

dobrichev wrote:For those who have only 54GB of disk space for the catalog of all essentially different grids, are interested in running their own applications over the grids, and are too lazy to generate the catalog using gsf's tools:

A 54GB catalog (compressed to 10 bytes per grid) is currently available at
ed2k://|file|grids.bin|54727305380|D2CE27215EE3E7CFD50C7D1F73D20DC8|h=RCHPLX67WPCIZO3AK776G4INGLJWKZX6|/

Source code in C, with an on-the-fly decompressor and example generic routines for grid processing, is available at
ed2k://|file|gridcatalog.zip|10215|890F3B7B1179130628C6CBE7B3207B87|h=5E6SG5XIPFGPQURE7B4CH3WXUF2USRWO|/

I am sharing these files at a friend's request and will keep them available for the next week.

Note that it is faster to uncompress the catalog from gsf's archive than to download 54GB. If disk space is limited, it is possible to extract chunks from gsf's archive (one band at a time), compress the chunks to the 10 B/grid format, erase the uncompressed data, and concatenate the re-compressed chunks.

The download will speed up significantly if enough people are interested and leave the file in their ed2k clients (eMule) for several days.

the 5.7GiB compressed catalog's decompression rate is ~100K grids/sec/GHz
what is the decompression rate for the 54GB catalog?
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Re: catalog of all essentially different grids

Postby dobrichev » Sun Dec 05, 2010 10:11 am

gsf wrote:the 5.7GiB compressed catalog's decompression rate is ~100K grids/sec/GHz
what is the decompression rate for the 54GB catalog?

~210K grids/sec/GHz for sequential access, and that could be improved slightly, say by 20%. For random access the bottleneck is the disk.
The two rates are of the same order of magnitude. Furthermore, processing the obtained grid is expected to be an order of magnitude slower than decompressing it.
Implementing an API with getGridByIndex() and findGridIndex() in your catalog would make the 54GB one obsolete.
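
A sketch of that interface over a toy in-memory stand-in (the only property relied on is that the catalog lists canonical grids in sorted order; a real implementation would decompress the block around the probe point rather than index an array, and the short strings below stand in for 81-character grids):

Code:
#include <stdio.h>
#include <string.h>

/* stand-ins for 81-character canonical grids, in sorted order */
static const char *catalog[] = {
    "123456789...", "123456798...", "123456879...", "123456897...",
};
static const long ngrids = sizeof catalog / sizeof catalog[0];

/* random access by catalog index */
const char *getGridByIndex(long i)
{
    return (i >= 0 && i < ngrids) ? catalog[i] : NULL;
}

/* reverse lookup by binary search; relies entirely on sorted order */
long findGridIndex(const char *grid)
{
    long lo = 0, hi = ngrids - 1, mid;
    int cmp;
    while (lo <= hi) {
        mid = lo + (hi - lo) / 2;
        cmp = strcmp(catalog[mid], grid);
        if (cmp == 0) return mid;
        if (cmp < 0) lo = mid + 1; else hi = mid - 1;
    }
    return -1;                 /* not in the catalog */
}

int main(void)
{
    printf("%ld\n", findGridIndex("123456879..."));   /* prints 2 */
    return 0;
}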
dobrichev
2016 Supporter
 
Posts: 1850
Joined: 24 May 2010
