Using DLX for finding base/cover set logic eliminations

Programs which generate, solve, and analyze Sudoku puzzles

Using DLX for finding base/cover set logic eliminations

Postby PIsaacson » Tue Apr 28, 2009 6:07 am

Rather than using Knuth's Dancing Links (DLX) implementation of his Algorithm X to competely solve a sudoku puzzle, I am instead going to discuss how to use DLX to generate all possible covers sets for a given base set.

Some preliminary requirements:

1) Download libexact and read the PDF manual. http://www.cs.helsinki.fi/u/pkaski/libexact/libexact-1.0.tar.gz
2) Read Allan Barker's brief introduction to set/link-set logic. http://sudokuone.com/sweb/guide.htm
3) Basic knowledge of C/C++ and STL is important, but not necessarily critical.

The base/cover algorithm requires some outside method to discover/generate a base set. Currently, I'm using doubly-linked ALSs to generate a vector describing all the cells of both ALSs, but you can use whatever technique you want to generate such lists. However, the following code only works well for rank-0 logical structures, so if you want rank-1 chains etc, you'll have to adjust the filter to allow base+1 sized solutions, modify the elimination code to correctly test the overlap counts, and probably switch from cell base sets to lists of the actual sets used to link candidates.

Basically the technique works like this:

1) Find any interesting set of cells to examine. This becomes the base set.
2) Setup the DLX problem using the libexact (modified) API.
3) Execute libexact exact_solve to find all valid combinations of rows/cols/boxes that exactly cover the base cells.
4) For each solution, any outside candidate contained in a cover set but not in a base set can be eliminated.

This is similar to fishing algorithms that use base/cover sets for single digit analysis. In fact, I adapted it from my experimental fishing engine that also uses DLX in this manner. Here, it is being used in a more general fashion.

So, on to the code (written in C++ and depending heavily on STL):
Code: Select all
     1   int base_filter (void *param, int sz, const int *soln, int col)
     2   {
     3       bitset<m_cols> *set_map = (bitset<m_cols> *) param;
     4   
     5       return (set_map->_Unchecked_test (col) == 0);
     6   }
     7   
     8   int level_filter (void *param, int sz, const int *soln)
     9   {
    10       int max_sz = (int) param;
    11   
    12       return (sz <= max_sz);     // exact level_filter skips any solutions when sz > max_size from param when registered
    13   }
    14   
    15   int do_base_cover (bitset<m_cells> *sloc_map, bitset<m_rows> *updt_map, grid_t *grid, char *caller)
    16   {
    17       exact_t *e = ge;           // pre-allocated in initialization code exact_t *ge = exact_alloc ();
    18       exact_reset (e);           // new call for a fast re-initialization of the exact_t structure
    19   
    20       cand_map.reset ();         // STL bitset<729> for defining possible candidates to eliminate
    21       base_set.reset ();         // STL bitset<729> for defining our input bitset based on the bitset<81> sloc_map cells
    22   
    23       set_map.reset ();          // STL bitset<324> for testing whether or not we've encountered a new cover set
    24   
    25       int nh = 0;                // counter for row/col headers used for new call exact_declare to setup our DLX problem
    26       int ne = 0;                // counter for entry headers used for new call exact_declare
    27       int bc = sloc_map->count (); // Get the number of cells (base count)
    28   
    29       int update = 0;            // track how many updates (eliminations) were performed
    30   
    31       // Scan our current PM grid to initialize the base_set and cand_set, plus build the row/col headers and entries
    32   
    33       for (int n = 0; n < m_cells; n++)            // m_cells = 81
    34       {
    35           if (grid->cell[n/m_n][n%m_n].digit > 0)  // check to see if the cell is already solved
    36               continue;
    37   
    38           int r = n/m_n;                           // compute row
    39           int c = n%m_n;                           // compute col
    40           int b = rc2b[r][c];                      // lookup box
    41           
    42           int mbits = grid->cell[r][c].mbits;      // fetch the current PM candidates for this cell
    43   
    44           for (int d = 0; d < m_n; d++)            // m_n = 9 so scan all the digits
    45           {
    46               if ((mbits & (1 << d)) == 0)         // test if candidate d is set
    47                   continue;
    48   
    49               int vx = NRC2V (d, r, c);            // convert digit/row/col to NRC[729] address vx = r*81+c*9+d
    50   
    51               cand_map[vx] = 1;                    // assign this as a valid candidate for potential elimination
    52   
    53               if (sloc_map->_Unchecked_test (n) == 0) // test if this is one of our base cells
    54                   continue;                           // no - then skip the DLX row/col/entry setup
    55   
    56               base_set[vx] = 1;                       // yes - assign this as a valid base set candidate
    57   
    58               // Compute constraint indexes for this candidate
    59   
    60               int nx = r*m_n+c;                       // RC space index as a constraint (first 81)
    61               int rx = r*m_n+d + m_cells;             // RN space index as a constraint (second 81)
    62               int cx = c*m_n+d + m_cells*2;           // CN space index as a constraint (third 81)
    63               int bx = b*m_n+d + m_cells*3;           // BN space index as a constraint (forth 81)
    64   
    65               if (set_map[nx] == 0)                   // Test if this is the first time we saw this RC cell
    66               {
    67                   set_map[nx] = 1;
    68                   // e_headers[nh++] = (TYPE_COL << 16 | nx); // Do nothing - may need this later for bases other than cells
    69               }
    70               if (set_map[rx] == 0)                   // Test is this is the first time we saw this RN entry
    71               {
    72                   set_map[rx] = 1;
    73                   e_headers[nh++] = (TYPE_COL << 16 | rx);  // Yes - setup the correct TYPE_COL header
    74               }
    75               if (set_map[cx] == 0)                   // Test if this is the first time we saw this CN
    76               {
    77                   set_map[cx] = 1;
    78                   e_headers[nh++] = (TYPE_COL << 16 | cx);  // Yes - setup the correct TYPE_COL header
    79               }
    80               if (set_map[bx] == 0)                   // Test if this is the first time we saw this BN
    81               {
    82                   set_map[bx] = 1;
    83                   e_headers[nh++] = (TYPE_COL << 16 | bx);  // Yes - setup the correct TYPE_COL header
    84               }
    85   
    86               e_headers[nh++] = (TYPE_ROW << 16 | vx); // Now setup the correct TYPE_ROW header
    87   
    88               // e_entries[ne++] = (vx << 16) | nx;    // Do nothing - may need this later for bases other than cells
    89               e_entries[ne++] = (vx << 16) | rx;       // Add the ROW/COL RN entry
    90               e_entries[ne++] = (vx << 16) | cx;       // Add the ROW/COL CN entry
    91               e_entries[ne++] = (vx << 16) | bx;       // Add the ROW/COL BN entry
    92           }
    93       }
    94   
    95       // The DLX row/col headers and entries have been initialized - call exact_declare to setup the problem
    96   
    97       int status = exact_declare (e, nh, ne, e_headers, e_entries); // new call that eliminates the multiple exact_declare_xxx

calls
    98   
    99       if (status)                                                   // all the calls (most of them anyway) now return status
   100       {
   101           printf ("do_base_cover - exact_declare failed with error %d\n", status);
   102           exit (1);
   103       }
   104   
   105       int sc;                                           // exact_solve count for number of entries the following vector
   106       const int *s;                                     // exact_solve solution vector
   107   
   108       // exact_filter (e, base_filter, &set_map);       // Do nothing - base filter not currently used
   109       exact_level (e, level_filter, (void *) bc);       // Install the level filter to restrict solutions to same sized base and

cover sets
   110   
   111       int ec = 0;                                       // Track solution counter
   112   
   113       // Execute the DLX and find all exact cover solutions
   114   
   115       while (s = exact_solve (e, &sc))                  // Standard exact_solve loop - returns NULL for no/last solution
   116       {
   117           link_set.reset ();                            // STL bitset<729> used for the union of all the cover sets
   118   
   119           for (int n = 0; n < sc; n++)                  // sc contains the number of entries in the solution stack s
   120               link_set |= m_covers->cover[s[n]].index;  // m_cover is a table of 324 bitset<729> defining the 9 peers for the s[n]

cover set
   121   
   122           int changed = 0;                              // initialize the changed switch
   123   
   124           memset (c_counts, 0, sizeof (c_counts));      // initialize the 729 cover counters
   125   
   126           // Compute the cover counts for each NRC entry in the base_set and cand_set maps
   127   
   128           for (int vx = 0; vx < m_rows; vx++)           // m_rows = 729       
   129           {
   130               if (base_set[vx])                         // decrement cover count if this is in the base set
   131                   c_counts[vx]--;
   132   
   133               if (link_set[vx] && cand_map[vx])         // increment cover count if this is in the candidate set and in the

solution cover set
   134                   c_counts[vx]++;
   135           }
   136   
   137           // Test the cover counts for any eliminations due to this solution
   138   
   139           for (int vx = 0; vx < m_rows; vx++)
   140           {
   141               if (c_counts[vx] != 0)                    // used to be (c_counts[vx] < 0 || c_counts[vx] > (sc - bc)) where sc - bc

= fin count
   142               {
   143                   int r = V2R (vx);                     // compute row = vx/81
   144                   int c = V2C (vx);                     // compute col = vx/9 & 0x1ff
   145                   int d = V2N (vx);                     // compute digit = vx%9
   146   
   147                   int change = update_mbit (d, r, c, grid);  // see if we can eliminate this candidate (may already be done if

change = 0)
   148                   update += change;                          // increment our update count (if any)
   149                   changed += change;                         // increment out changed count (if any)
   150   
   151                   if (change)                                // set our updated map for generating an xsudo sud case
   152                       updt_map->_Unchecked_set (vx);
   153   
   154                   if (sw_xverbose || sw_verbose && change)   // generate a log entry for the elimination, if requested
   155                   {
   156                       printf ("%s - reducing r%dc%d.%s by <%d> base/cover\n", caller, r+1, c+1,
   157                           int2bin (grid->cell[r][c].mbits | (1 << d)), d+1);
   158                   }
   159               }
   160           }
   161   
   162           if (sw_xverbose || sw_verbose && changed)          // generate a log entry for the base/cover, if requested and any

eliminations
   163           {
   164               printf ("%s - base/cover %s %s\n", caller, bs2bin (sloc_map, bc), cs2bin (s, sc));
   165           }
   166       }
   167   
   168       return (update);                                   // return the number of updates along with the updated map positions
   169   }

If there is of any interest, I'll post the modified libexact code as well as the driver code that takes a set of cells and generates multiple calls to do_base_cover for all possible combinations.

I'm hoping other programmers will become likewise interested in coding better base/cover analysis code that we can all share. Obviously there are many parts missing, but this is just a "teaser" and I'll release all the code embedded in my testbed ALS engine along with the code to generate sud case collections.

Cheers,
Paul
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby PIsaacson » Wed Apr 29, 2009 2:07 am

Updated build files for my testbed ALS engine:

Makefile.txt needs to be renamed to Makefile and all the others need to be placed in the same directory for correct building. I'm now using the mingw compilers under cygwin, so this should build correctly on just mingw.

http://pisaacson.fileave.com/Bernhard/Makefile.txt
http://pisaacson.fileave.com/Bernhard/combination.h
http://pisaacson.fileave.com/Bernhard/exact.h
http://pisaacson.fileave.com/Bernhard/sudoku.h
http://pisaacson.fileave.com/Bernhard/exact.cpp
http://pisaacson.fileave.com/Bernhard/sudoku.cpp

Pre-built library and executable

http://pisaacson.fileave.com/Bernhard/libexact.a
http://pisaacson.fileave.com/Bernhard/sudoku.exe

You should re-make the executable if you have the correct cygwin/mingw environment, then test using:

sudoku -bvgw -Xprsak -A9 -B0 -K2 -T-1 royle17.txt royle17.sud > royle17.log

This assumes you have downloaded the new 48010 royle17 collection into the current working directory from http://people.csse.uwa.edu.au/gordon/sudoku17
The command will run the ALS engine and locate all doubly-linked ALSs and Death Blossoms and generate an Xsudo collection in royle17.sud with the output logged in royle17.log

If all goes well, you should match (use "tail -16 royle17.log") the following statistics for the updates/call and % eff. The others depend upon the speed of the processor, but should be similar if normalized.
Code: Select all
Puzzle call statistics:

  do_pinnings      updates/calls 64/6              1066.67% eff       0.1539 msec tot       0.0257 msec/call       0.0024 msec/update

Puzzle 48010 took 0.2137 msec for 1 solution(s)

Sudoku final statistics:

  tot_pinnings     updates/calls 3069183/476098     644.65% eff   16025.4457 msec tot       0.0337 msec/call       0.0052 msec/update
  tot_restrictions updates/calls 341490/52473       650.79% eff    2683.6008 msec tot       0.0511 msec/call       0.0079 msec/update
  tot_subsets      updates/calls 139002/20415       680.88% eff    2054.8852 msec tot       0.1007 msec/call       0.0148 msec/update
  tot_alschains    updates/calls 45815/10768        425.47% eff  783604.2690 msec tot      72.7716 msec/call      17.1037 msec/update
  tot_deathblossom updates/calls 77352/4107        1883.42% eff  225936.5771 msec tot      55.0126 msec/call       2.9209 msec/update
  tot_dlx          updates/calls 87/87              100.00% eff      78.2792 msec tot       0.8998 msec/call       0.8998 msec/update

C:\home\paul\bernhard\sudoku.exe -bvgw -Xprsak -A9 -B0 -K2 -T-1 royle17.txt royle17.sud

I don't have a really fast cpu (old AMD X2 4200), so the above batch run takes somewhere between 12 and 15 minutes depending on what else I'm doing while the test is running.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby PIsaacson » Wed Apr 29, 2009 9:41 pm

For those who may have generate the above royle17.sud file and attempted to directly load it into Xsudo, it won't work because there are 62561 cases in one file. To circumvent this and to match Allan's new 2000 case limit, I have a small program that reads large sud collections, sorts them based on the keys that I have internally stored in the collection (see write_xsudo), then breaks them up into multiple 2000 max chunks.

Download: http://pisaacson.fileave.com/Bernhard/sortsud.cpp

Place it in the same directory as the other files and compile with:
g++ -O -I. sortsud.cpp -o sortsud

Execute as sortsud <filename> as in:
sortsud royle17.sud

Again, if you've been following along, this will generate 32 output files with the following output:
Code: Select all
70> sortsud royle17.sud
there are 62561 cases in this file
royle17 - output root
Opening file royle17_1.sud
Opening file royle17_2.sud
Opening file royle17_3.sud
Opening file royle17_4.sud
Opening file royle17_5.sud
Opening file royle17_6.sud
Opening file royle17_7.sud
Opening file royle17_8.sud
Opening file royle17_9.sud
Opening file royle17_10.sud
Opening file royle17_11.sud
Opening file royle17_12.sud
Opening file royle17_13.sud
Opening file royle17_14.sud
Opening file royle17_15.sud
Opening file royle17_16.sud
Opening file royle17_17.sud
Opening file royle17_18.sud
Opening file royle17_19.sud
Opening file royle17_20.sud
Opening file royle17_21.sud
Opening file royle17_22.sud
Opening file royle17_23.sud
Opening file royle17_24.sud
Opening file royle17_25.sud
Opening file royle17_26.sud
Opening file royle17_27.sud
Opening file royle17_28.sud
Opening file royle17_29.sud
Opening file royle17_30.sud
Opening file royle17_31.sud
Opening file royle17_32.sud
71>


The cases are sorted by descending number of eliminations, ascending number of base cells, ascending puzzle number, and ascending step within puzzle. The first one contains the (debatably) more interesting ALSs/DBs. The eliminations are flagged with a small "x" so it is easy to check whether or not the base/cover algorithm agrees with Allan's Xsudo found eliminations. There is a deliberate "bug" in capturing the exact grid position so sometimes there are many eliminations listed that were found due to prior ALSs/DBs.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby PIsaacson » Wed May 13, 2009 3:20 am

I've moved this thread to http://www.setbb.com/phpbb/viewtopic.php?t=1747

I'll leave these initial postings for a few more weeks before deleting them since I don't want to track two different forums for the same topic.

Meet you there,
Paul
PIsaacson
 
Posts: 249
Joined: 02 July 2008


Return to Software