3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Programs which generate, solve, and analyze Sudoku puzzles

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:10 am

# ZSolver: Core Idea — Band-Centered 27-Integer State Representation and Pipeline-Friendly Loop Unrolling

**Author**: Zhou Yundong

## Abstract
This paper presents the core implementation ideas behind ZSolver. The key lies in a compact, band-centered state representation and a unified elimination framework built upon it. For any given digit, its state within a single band requires only 27 bits to describe. With 9 digits and 3 bands, just 27 integers are sufficient to capture all core state information needed during solving. On top of this representation, intra-band elimination, inter-band elimination, and box elimination can each be compressed into just a few lines of code. Furthermore, by fully unrolling the 27-iteration loop, relative index accesses such as F[i] are transformed into fixed absolute address accesses such as F[5], which makes extremely effective use of the CPU pipeline. This results in a significant speedup from approximately 8 microseconds to just over 4 microseconds. This paper explains this implementation path from three perspectives: state representation, elimination code structure, and execution efficiency.

**Keywords**: Sudoku solving; ZSolver; band representation; bitwise operations; loop unrolling; CPU pipeline; high-performance implementation

## 1. Introduction
The performance of a Sudoku solver is often assumed to depend primarily on its search strategy, pruning rules, and constraint system. In practice, however, the underlying state representation and code organization often have a far greater impact on the final speed. When pursuing microsecond-level performance in particular, whether a program can complete state updates, constraint propagation, and contradiction detection in the shortest possible path matters more than high-level strategy.

ZSolver's core experience demonstrates that when the state representation is compact enough, constraint propagation is unified enough, and the code structure is well-suited to the execution characteristics of modern CPUs, overall performance can improve dramatically. The real key is not to accumulate complex logic, but to compress the core problem into a small number of integers and a small amount of short code — and then to convert that structural advantage into an execution advantage at the processor level.

## 2. Band-Centered State Representation
The central foundation of ZSolver is a band-centered state representation.

For any given digit, its state within a single band can be represented using just 27 bits. Since a full Sudoku board has 9 digits and 3 bands, only 27 integers in total are needed to describe all core state. This representation is extremely compact and, crucially, completely uniform.

The significance here goes beyond simply "saving memory." More importantly:

1. The state structure is identical for every digit;
2. The processing method is identical for every band;
3. All subsequent elimination operations can be built on the same integer representation;
4. Data is highly concentrated, making it easier for compilers to optimize and for CPUs to execute efficiently.

In typical implementations, state information is scattered across rows, columns, boxes, and cells, requiring constant switching between different data views and structures. Under this representation, many originally dispersed relationships are folded into a fixed set of integers, allowing subsequent operations to be performed directly as bitwise computations.

## 3. Simplification of Intra-Band Elimination
On the basis of the above representation, intra-band elimination for any given digit requires only a single line of code.

This is critically important. It means that the primary constraint relationships within a band have been heavily absorbed by the state encoding itself. What the program actually needs to do is no longer a complex traversal with many conditional checks — it is just one very short operation.

This shows that the representation is not passively storing state, but actively participating in the solving process. It is precisely because the representation is so tightly aligned with the problem structure that the elimination logic can be written so concisely.

## 4. Unified Implementation of Inter-Band Elimination
On the same basis, inter-band elimination requires only two or three lines of code.

This shows that the representation is not only suitable for local updates within a single band, but also naturally supports constraint propagation between bands. In other words, once the state representation is established, eliminations at different structural levels no longer require separate large blocks of code — they can all be handled within the same unified framework.

For high-performance programs, this kind of uniformity is extremely valuable. Uniformity means:

1. Shorter code;
2. More stable logic;
3. More predictable execution paths;
4. Easier and more efficient processing by the machine.

## 5. Box Elimination as a Natural Extension
Box elimination, building on what has already been established, also requires only two or three lines of code.

This further demonstrates that this representation does not sacrifice the 3×3 box constraint — the most fundamental structural element of Sudoku — but instead incorporates it naturally into the same framework. As a result, intra-band elimination, inter-band elimination, and box elimination are not three separate implementations, but three direct operations all built on the same underlying state structure.

From an engineering perspective, this inheritance and uniformity is extremely valuable. It allows the program to avoid relying on a large layered rule system, depending instead on a single representation that is correct, compact, and consistent.

## 6. Each Full Elimination Pass Requires Only About Ten Lines of Code
Taking all the elimination types together, the total core code for one complete elimination pass is no more than around ten lines.

This is one of ZSolver's most distinctive characteristics. Its speed is not achieved through "vast amounts of code," but through "high-density execution of a small amount of code." The true complexity is no longer visible on the surface of the code — it has been compressed into the state representation and operational structure.

This short-code approach has two direct benefits:

1. The implementation is clear and does not require many intermediate layers;
2. It is well-suited for subsequent unrolling, scheduling, and low-level optimization.

## 7. Unrolling the 27-Iteration Loop
Given this foundation, fully unrolling the 27-iteration loop produces a notable change in the execution characteristics of the program, even though the total code length does not grow very large.

Before unrolling, the program typically accesses state via expressions like F[i] — an indexed access with a variable subscript.
After unrolling, those accesses become fixed expressions like F[5] — absolute address accesses.

Although this may appear to be merely a change in array index notation, its impact at the execution level is significant. Fixed address access means:

1. No repeated updating of loop variables;
2. No frequent boundary checks;
3. Easier register allocation by the compiler;
4. Greater freedom for instruction scheduling;
5. More direct and predictable data access paths.

This step is not simply about "removing the loop." It is about transforming the core code into a fixed, short instruction sequence that is as close as possible to the ideal execution form for the processor.

## 8. Effective Utilization of the CPU Pipeline
This unrolled implementation makes extremely effective use of the CPU's pipeline execution capability.

Based on practical experience, after adopting this style of loop unrolling and fixed address access, program speed improved significantly from approximately 8 microseconds to just over 4 microseconds. The magnitude of this improvement shows that what determines performance is not only "what the logic does," but also "how the code is executed by the machine."

Once code exhibits the following characteristics:

1. Stable execution paths;
2. Fixed memory access patterns;
3. Very few branches;
4. Tight, repetitive core operations;

the pipeline advantages of modern CPUs can be fully utilized. ZSolver's implementation path is precisely the result of unifying state representation, short code structure, and processor execution characteristics.

## 9. Natural Convergence of the Implementation
Looking at the implementation process as a whole, the initial method provided the direction, and the F[27] representation allowed the entire system to converge into a coherent whole.

Once this 27-integer representation was in place:

1. Intra-band elimination could be written in one line;
2. Inter-band elimination could be written in two or three lines;
3. Box elimination could likewise be written in two or three lines;
4. All eliminations combined amounted to only about ten lines of core code;
5. After 27-iteration unrolling, accesses became absolute addresses;
6. The result naturally became a pipeline-friendly, high-performance implementation.

ZSolver's core is therefore not any single isolated technique, but a continuous engineering path: first compress the state, then unify the operations, then unroll the core code, and finally convert the algorithmic advantage into an execution advantage at the machine level.

## 10. Conclusion
ZSolver's core ideas can be summarized as follows: represent all core state using 27 integers organized around bands; complete constraint propagation using extremely short and uniform elimination code; and, through 27-iteration loop unrolling, organize the program into an execution form that is highly friendly to the CPU pipeline.

This implementation path demonstrates that in Sudoku solving and similar constraint-satisfaction problems, the true performance ceiling is determined not only by the solving strategy itself, but also by whether the state representation is compact enough, whether the elimination code is uniform enough, and whether the implementation form is sufficiently aligned with the working characteristics of modern processors. ZSolver's performance improvement is the direct result of this unified design.

**Note**
ZSolver's public source code is available in the software section of www.enjoysoduku.com. The publicly released version is named "3.88-microsecond solver."
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:17 am

# ZSolver Guess() Optimization Discussion Report

## Scope

This report summarizes a focused discussion on deep-guess optimization for ZSolver, specifically around the `Guess()` function in the ZSolver codebase. The goal of this report is not to deliver a finished optimization patch, but to record the key ideas, separate authorship clearly, and hand off the next stage of work to other contributors.

Relevant code context:

- `Guess()` is implemented in `ZSolver/sd.cpp`.
- `Rollback()` is the paired backtracking routine.
- `Calculate()` is the main search loop that calls `Guess()`.
- `CGame` in `ZSolver/sd.h` is the snapshot structure used by the explicit stack-based search.

## Confirmed Context

- `Guess()` is the core target of the "deep-guess optimization" discussion.
- The current implementation is an explicit-stack search rather than a recursive search.
- The current heuristic is roughly: prefer branches whose candidate count is `2`; otherwise choose the smallest candidate count available.
- The implementation is naturally organized around `Block + Line`, where each guess step is aligned with the current 27-integer representation.

## Viewpoints From zhouyundong

The following points were proposed by **zhouyundong** during the discussion:

1. **zhouyundong:** Deep-guess optimization specifically means optimizing `Guess()`.
2. **zhouyundong:** When the board contains a large blank area, a larger-step strategy may be better than the current fine-grained guessing. In that context, "guessing nine cells at once" was proposed as an intuitive direction.
3. **zhouyundong:** Once the board is no longer extremely empty, "guessing three cells in one step" is likely better than a larger nine-cell step.
4. **zhouyundong:** The main effort should not be spent on writing detailed optimization code first. The main effort should be spent on correctly distinguishing the three board phases, referred to as `1 / 2 / 3`.
5. **zhouyundong:** In the current representation, `U` is the most practical signal for phase classification. `F` is not convenient or reliable enough as the main basis for this classification.
6. **zhouyundong:** If necessary, it is acceptable to sacrifice a little `Update()` performance in order to expose more useful signals for phase classification.
7. **zhouyundong:** There is no strong need to complete the full concrete optimization now. The more meaningful output at this stage is a clean conceptual handoff so that others can continue the work.

## Viewpoints From AI

The following points were proposed by **AI** during the discussion:

1. **AI:** The current `Guess()` heuristic is strong when the board already contains tight local constraints, especially when many branches have only `2` or `3` candidates. It becomes less effective when the board is still very sparse and candidate counts remain large almost everywhere.
2. **AI:** The real optimization problem is not only "how to rewrite a few lines in `Guess()`", but "how to detect which stage the board is currently in and dispatch to the right strategy".
3. **AI:** A practical three-phase model is:

- Phase `1`: very sparse board, weak propagation, suitable for a larger-step strategy.
- Phase `2`: mid-game board, stronger structure, suitable for a "three-cells-in-one-step" strategy.
- Phase `3`: late-game board, many tight branches, suitable for the current MRV-like small-branch `Guess()`.
4. **AI:** "Guessing nine cells at once" should not mean blindly assigning nine unrelated cells. If such a strategy is ever used, it should be a structured large-step guess used only in very sparse positions.
5. **AI:** When the board is no longer very sparse, "one step, three cells" is more natural than "one step, nine cells" because it matches the existing `Block + Line` organization of the 27-integer representation and is more compatible with the current stack and rollback model.
6. **AI:** The first version of the phase classifier can be built mainly from `U`, without depending heavily on `F`.
7. **AI:** If a small performance cost in `Update()` is acceptable, the best addition is not a heavy new analysis pass, but lightweight counters or statistics generated during propagation.
8. **AI:** Useful signals for distinguishing phases may include:

- how many `U` positions are already in a unique or nearly unique state,
- how much `U` changes during one `Update()` cycle,
- how many new uniquely resolved positions appear after one guess followed by propagation,
- whether propagation is producing strong follow-up effects or almost no shrinkage at all.
9. **AI:** The most valuable next step is to define a stable and cheap `1 / 2 / 3` classifier first. Strategy implementation should come after that.

## Joint Direction Emerging From The Discussion

Although the detailed optimization was intentionally not implemented, the discussion converged on the following direction:

1. The future of `Guess()` optimization should likely be **phase-based**, not one-size-fits-all.
2. The most important research task is to distinguish board phase `1`, `2`, and `3` reliably.
3. The main signal source for this phase classification should begin with `U`.
4. A limited extra cost inside `Update()` is acceptable if it makes phase classification significantly more accurate.
5. Concrete strategy work can be handed off:

- phase `1` strategy can be explored by others,
- phase `2` "one-step three-cells" strategy can be explored by others,
- phase `3` can continue to rely on the current refined `Guess()` style or a close variant.

## Handoff Note

This report intentionally stops at the level of problem framing, phase separation, and heuristic direction. It does not attempt to provide a finished optimization implementation.

That decision is deliberate:

- the conceptual split is more important than premature code changes,
- the next contributor can pick up the baton from a cleaner starting point,
- and the discussion has already produced a useful division of labor between problem definition and later engineering execution.

## Attribution Summary

- All user-originated viewpoints in this report are attributed to **zhouyundong**.
- All assistant-originated viewpoints in this report are attributed to **AI**.
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:43 am

Code: Select all
#ifndef __sd_h__
#define __sd_h__

#define TotalBoard          50000 //Total Games

//Answer Attribute
#define NoAnswer          0
#define UniqAnswer          1
#define MoreAnswer          2

//High, Middle, Low 9 bit of a 27-bit number
#define HIGH_9_BIT(v)       ((v) >> 18)
#define MID_9_BIT(v)       (((v) >> 9) & 0x1FF)
#define LOW_9_BIT(v)       ((v) & 0x1FF)

#define HML_9_BIT(v, l)       ((v) >> TblMult9[l] & 0x1FF)

#define FULL_TO_COLUMN(v)    (TblShrinkHigh[(v) >> 18] | TblShrinkMid[((v) >> 9) & 0x1FF] | TblShrinkLow[(v) & 0x1FF]) //Full Mask(27-bit) to 9 bit
#define FULL_TO_SHRINK(v)    (((v) | ((v) >> 9) | ((v) >> 18)) & 0x1FF)

#define BIT_SET_27          0x07FFFFFF

#define AN(v, n)          (v) &= ~(n)

struct CGame {
  public:
   int             F[27]; //27-bit Full Mask
   int             U[27]; //3-bit Unique in Palace
   int             Block; //The Block Number of guess [0, 27)
   int             Line; //The Line Number of guess [0, 3)
   int             Mask; //The 9-bit Mask of guess
   int             Left; //The 9-bit Left Mask of guess
};

//Common Table
extern int             TblMult3[81];
extern int             TblDivide3[81];
extern int             TblRmder3[81];
extern int             TblMult9[9];
//Basis Table
extern int             TblCombineMask[512];
extern int             TblTailZero[512];
extern int             TblNumberOne[512];
extern int             TblAnother1[27];
extern int             TblAnother2[27];
extern int             TblShiftLeft[27];
//Table of Decompose Board
extern int             TblBoard_Palace[81];
extern int             TblBoard_Block[81];
extern int             TblBoard_BlockMask[81];
extern int             TblBoard_GridUniq[81];
//Table of Initial
extern int             TblSelfMask[81];
extern int             TblOtherMask[81];
//Complex Friend Method
extern int             TblMaskSingle[512];
extern int             TblMaskDouble[512];
//Other Table
extern int             TblPalaceMask[8];
extern bool             TblUniqFlag[8];
//Shrink Mask
extern int             TblShrinkLow[512];
extern int             TblShrinkMid[512];
extern int             TblShrinkHigh[512];
//Processed state cache
extern int             CompColumn[27];
extern int             CompShrink[27];
//Complex Method
extern int             TblComplexMask[512];
extern int             TblColumnSingle[512];
extern int             TblShrinkSingle[512];

extern void             CreateTable();

#endif
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:44 am

Code: Select all
#include <string.h>
#include "sd.h"

//Common Table
int                   TblMult3[81];
int                   TblDivide3[81];
int                   TblRmder3[81];
int                   TblMult9[9];
//Basis Table
int                   TblCombineMask[512];
int                   TblTailZero[512];
int                   TblNumberOne[512];
int                   TblAnother1[27];
int                   TblAnother2[27];
int                   TblShiftLeft[27];
//Table of Decompose Board
int                   TblBoard_Palace[81];
int                   TblBoard_Block[81];
int                   TblBoard_BlockMask[81];
int                   TblBoard_GridUniq[81];
//Table of Initial
int                   TblSelfMask[81];
int                   TblOtherMask[81];
//Complex Friend Method
int                   TblMaskSingle[512];
int                   TblMaskDouble[512];
//Other Table
int                   TblPalaceMask[8];
bool                TblUniqFlag[8];
//Shrink Mask
int                   TblShrinkLow[512];
int                   TblShrinkMid[512];
int                   TblShrinkHigh[512];
//Complex Method
int                   TblComplexMask[512];
int                   TblColumnSingle[512];
int                   TblShrinkSingle[512];


void CreateTblCommon()
{
   for (int i = 0; i < 81; i++)
   {
      TblMult3[i] = i * 3;
      TblDivide3[i] = i / 3;
      TblRmder3[i] = i % 3;
   }
   for (int i = 0; i < 9; i++)
   {
      TblMult9[i] = i * 9;
   }
}

void CreateTblCombineMask()
{
   for (int i = 0; i < 512; i++)
   {
      TblCombineMask[i] = (i | (i << 9) | (i << 18)) ^ BIT_SET_27;
   }
}

void CreateTblAnother()
{
   for (int i = 0; i < 27; i++)
   {
      switch (i % 3)
      {
      case 0: //At Top
         TblAnother1[i] = i + 1;
         TblAnother2[i] = i + 2;
         break;
      case 1: //At Middle
         TblAnother1[i] = i - 1;
         TblAnother2[i] = i + 1;
         break;
      case 2: //At Bottom
         TblAnother1[i] = i - 2;
         TblAnother2[i] = i - 1;
         break;
      }
   }
}

void CreateTblTailZero()
{
   memset(TblTailZero, 0, sizeof(TblTailZero));
   for (int i = 0; i < 512; i++)
   {
      for (int j = 0; j < 9; j++)
      {
         if (i & (1 << j))
            break;
         TblTailZero[i]++;
      }
   }
}

void CreateTblNumberOne()
{
   memset(TblNumberOne, 0, sizeof(TblNumberOne));
   for (int i = 0; i < 512; i++)
   {
      for (int j = 0; j < 9; j++)
      {
         if (i & (1 << j))
            TblNumberOne[i]++;
      }
   }
}

void CreateTblShiftLeft()
{
   for (int i = 0; i < 27; i++)
   {
      TblShiftLeft[i] = 1 << i;
   }
}

void CreateTblBasis()
{
   CreateTblCombineMask();
   CreateTblAnother();
   CreateTblTailZero();
   CreateTblNumberOne();
   CreateTblShiftLeft();
}

void CreateTblBoard()
{
   for (int i = 0; i < 81; i++)
   {
      TblBoard_Palace[i] = TblMult3[i / 27] + TblDivide3[i % 9];
      TblBoard_Block[i] = i / 27;
      TblBoard_BlockMask[i] = 1 << i % 27;
      TblBoard_GridUniq[i] = 1 << TblBoard_Palace[i] % 3;
   }
}

void CreateTblSelfMask()
{
   int Mask[3] = { 0x7E3F1F8, 0x71F8FC7, 0x0FC7E3F };
   int Row[3] = { 0x7FFFE00, 0x7FC01FF, 0x3FFFF };
   for (int i = 0; i < 81; i++)
   {
      int Palace = TblBoard_Palace[i % 27];
      TblSelfMask[i] = Mask[Palace] & Row[i % 27 / 9] | TblBoard_BlockMask[i];
   }
}

void CreateTblOtherMask()
{
   for (int i = 0; i < 81; i++)
   {
      TblOtherMask[i] = TblCombineMask[1 << i % 9];
   }
}

void CreateTblMaskSingle()
{
   for (int i = 0; i < 512; i++)
   {
      TblMaskSingle[i] = i;
      int v = i >> 6; //High 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblMaskSingle[i] &= 0x3F;
      v = i >> 3 & 0x7; //Middle 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblMaskSingle[i] &= 0x1C7;
      v = i & 0x7; //Low 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblMaskSingle[i] &= 0x1F8;
      TblMaskSingle[i] = TblCombineMask[TblMaskSingle[i]];
   }
}

void CreateTblMaskDouble()
{
   for (int i = 0; i < 512; i++)
   {
      TblMaskDouble[i] = i;
      int v = i >> 6; //High 3 bit
      if (v != 6 && v != 5 && v != 3)
         TblMaskDouble[i] &= 0x3F;
      v = i >> 3 & 0x7; //Middle 3 bit
      if (v != 6 && v != 5 && v != 3)
         TblMaskDouble[i] &= 0x1C7;
      v = i & 0x7; //Low 3 bit
      if (v != 6 && v != 5 && v != 3)
         TblMaskDouble[i] &= 0x1F8;
      TblMaskDouble[i] = TblCombineMask[TblMaskDouble[i]];
   }
}

void CreateTblPalaceMask()
{
   memset(TblPalaceMask, 0, sizeof(TblPalaceMask));
   for (int i = 0; i < 8; i++)
   {
      for (int j = 0; j < 3; j++)
      {
         if (i & (1 << j))
            TblPalaceMask[i] |= 0x1C0E07 << TblMult3[j];
      }
   }
}

void CreateTblUniqFlag()
{
   memset(TblUniqFlag, 0, sizeof(TblUniqFlag));
   for (int i = 0; i < 8; i++)
   {
      if (TblNumberOne[i] >= 2)
         TblUniqFlag[i] = true;
   }
}

void CreateShrinkMask()
{
   int PalaceMask[3] = { 0x7, 0x38, 0x1C0 };

   memset(TblShrinkLow, 0, sizeof(TblShrinkLow));
   for (int i = 0; i < 512; i++)
   {
      for (int j = 0; j < 3; j++)
      {
         if (i & PalaceMask[j])
            TblShrinkLow[i] |= 1 << j;
      }
   }
   for (int i = 0; i < 512; i++)
   {
      TblShrinkMid[i] = TblShrinkLow[i] << 3;
      TblShrinkHigh[i] = TblShrinkLow[i] << 6;
   }
}

void CreateComplexMask()
{
   int MaskH[9] = { 0x7FFFE07, 0x7FFFE38, 0x7FFFFC0, 0x7FC0FFF, 0x7FC71FF, 0x7FF81FF, 0x1FFFFF, 0xE3FFFF, 0x703FFFF }; //A B C
   int MaskV[9] = { 0x7E3F1FF, 0x71F8FFF, 0xFC7FFF, 0x7E3FFF8, 0x71FFFC7, 0xFFFE3F, 0x7FFF1F8, 0x7FF8FC7, 0x7FC7E3F }; //A 0 0

   int index;
   for (int i = 0; i < 512; i++)
   {
      if (!(i & 0x1C0))
      {
         TblComplexMask[i] = 0;
         continue;
      }
      TblComplexMask[i] = BIT_SET_27;
      //Like Locked Candidate
      for (int j = 0; j < 3; j++)
      {
         for (int k = 0; k < 3; k++)
         {
            index = TblMult3[k] + j;
            if ((i & (1 << index)) && !(i & (1 << (TblMult3[TblAnother1[k]] + j))) && !(i & (1 << (TblMult3[TblAnother2[k]] + j))))
               TblComplexMask[i] &= MaskH[index]; //Horizontal
            index = TblMult3[j] + k;
            if ((i & (1 << index)) && !(i & (1 << TblAnother1[index])) && !(i & (1 << TblAnother2[index])))
               TblComplexMask[i] &= MaskV[index]; //Vertical
         }
      }
   }
   for (int i = 0; i < 512; i++)
   {
      if ((i & FULL_TO_COLUMN(TblComplexMask[i])) == 0)
         TblComplexMask[i] = 0;
      else if ((i & FULL_TO_COLUMN(TblComplexMask[i])) == i)
         TblComplexMask[i] = BIT_SET_27;
   }
}

void CreateTblColumnSingle()
{
   int ColumnMask[3] = { 0x49, 0x92, 0x124 };

   for (int i = 0; i < 512; i++)
   {
      TblColumnSingle[i] = 7;
      for (int j = 0; j < 3; j++)
      {
         int v = i & ColumnMask[j];
         if (v != 1 && v != 8 && v != 64)
            TblColumnSingle[i] &= 7 ^ (1 << j);
      }
   }
   for (int i = 0; i < 512; i++)
   {
      TblShrinkSingle[i] = 7;
      int v = i >> 6; //High 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblShrinkSingle[i] &= 0x3;
      v = i >> 3 & 0x7; //Middle 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblShrinkSingle[i] &= 0x5;
      v = i & 0x7; //Low 3 bit
      if (v != 4 && v != 2 && v != 1)
         TblShrinkSingle[i] &= 0x6;
   }
}

void CreateTable()
{
   CreateTblCommon();
   CreateTblBasis();
   CreateTblBoard();
   CreateTblSelfMask();
   CreateTblOtherMask();
   CreateTblMaskSingle();
   CreateTblMaskDouble();
   CreateTblPalaceMask();
   CreateTblUniqFlag();
   CreateShrinkMask();
   CreateComplexMask();
   CreateTblColumnSingle();
}
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:45 am

Code: Select all
bool Update()
{
   int Shrink = ~0;
   while (Shrink)
   {
      Shrink = 0;
      if (CompF[0] != F[0])
      {
         Shrink = FULL_TO_SHRINK(F[0]);
         if ((F[0] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[0] = F[0];
         int Column = FULL_TO_COLUMN(F[0]);
         int C1 = Column | FULL_TO_COLUMN(F[1]);
         int C2 = Column | FULL_TO_COLUMN(F[2]);
         F[1] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[2] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[0];
         if (Single)
         {
            U[0] |= Single;
            int S = F[0] & TblPalaceMask[Single];
            AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[1] != F[1])
      {
         Shrink = FULL_TO_SHRINK(F[1]);
         if ((F[1] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[1] = F[1];
         int Column = FULL_TO_COLUMN(F[1]);
         int C1 = Column | FULL_TO_COLUMN(F[0]);
         int C2 = Column | FULL_TO_COLUMN(F[2]);
         F[0] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[2] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[1];
         if (Single)
         {
            U[1] |= Single;
            int S = F[1] & TblPalaceMask[Single];
            AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[2] != F[2])
      {
         Shrink = FULL_TO_SHRINK(F[2]);
         if ((F[2] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[2] = F[2];
         int Column = FULL_TO_COLUMN(F[2]);
         int C1 = Column | FULL_TO_COLUMN(F[0]);
         int C2 = Column | FULL_TO_COLUMN(F[1]);
         F[0] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[1] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[2];
         if (Single)
         {
            U[2] |= Single;
            int S = F[2] & TblPalaceMask[Single];
            AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[3] != F[3])
      {
         Shrink = FULL_TO_SHRINK(F[3]);
         if ((F[3] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[3] = F[3];
         int Column = FULL_TO_COLUMN(F[3]);
         int C1 = Column | FULL_TO_COLUMN(F[4]);
         int C2 = Column | FULL_TO_COLUMN(F[5]);
         F[4] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[5] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[3];
         if (Single)
         {
            U[3] |= Single;
            int S = F[3] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[4] != F[4])
      {
         Shrink = FULL_TO_SHRINK(F[4]);
         if ((F[4] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[4] = F[4];
         int Column = FULL_TO_COLUMN(F[4]);
         int C1 = Column | FULL_TO_COLUMN(F[3]);
         int C2 = Column | FULL_TO_COLUMN(F[5]);
         F[3] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[5] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[4];
         if (Single)
         {
            U[4] |= Single;
            int S = F[4] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[5] != F[5])
      {
         Shrink = FULL_TO_SHRINK(F[5]);
         if ((F[5] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[5] = F[5];
         int Column = FULL_TO_COLUMN(F[5]);
         int C1 = Column | FULL_TO_COLUMN(F[3]);
         int C2 = Column | FULL_TO_COLUMN(F[4]);
         F[3] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[4] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[5];
         if (Single)
         {
            U[5] |= Single;
            int S = F[5] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[6] != F[6])
      {
         Shrink = FULL_TO_SHRINK(F[6]);
         if ((F[6] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[6] = F[6];
         int Column = FULL_TO_COLUMN(F[6]);
         int C1 = Column | FULL_TO_COLUMN(F[7]);
         int C2 = Column | FULL_TO_COLUMN(F[8]);
         F[7] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[8] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[6];
         if (Single)
         {
            U[6] |= Single;
            int S = F[6] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[7] != F[7])
      {
         Shrink = FULL_TO_SHRINK(F[7]);
         if ((F[7] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[7] = F[7];
         int Column = FULL_TO_COLUMN(F[7]);
         int C1 = Column | FULL_TO_COLUMN(F[6]);
         int C2 = Column | FULL_TO_COLUMN(F[8]);
         F[6] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[8] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[7];
         if (Single)
         {
            U[7] |= Single;
            int S = F[7] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[8] != F[8])
      {
         Shrink = FULL_TO_SHRINK(F[8]);
         if ((F[8] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[8] = F[8];
         int Column = FULL_TO_COLUMN(F[8]);
         int C1 = Column | FULL_TO_COLUMN(F[6]);
         int C2 = Column | FULL_TO_COLUMN(F[7]);
         F[6] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[7] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[8];
         if (Single)
         {
            U[8] |= Single;
            int S = F[8] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[9] != F[9])
      {
         Shrink = FULL_TO_SHRINK(F[9]);
         if ((F[9] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[9] = F[9];
         int Column = FULL_TO_COLUMN(F[9]);
         int C1 = Column | FULL_TO_COLUMN(F[10]);
         int C2 = Column | FULL_TO_COLUMN(F[11]);
         F[10] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[11] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[9];
         if (Single)
         {
            U[9] |= Single;
            int S = F[9] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[10] != F[10])
      {
         Shrink = FULL_TO_SHRINK(F[10]);
         if ((F[10] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[10] = F[10];
         int Column = FULL_TO_COLUMN(F[10]);
         int C1 = Column | FULL_TO_COLUMN(F[9]);
         int C2 = Column | FULL_TO_COLUMN(F[11]);
         F[9] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[11] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[10];
         if (Single)
         {
            U[10] |= Single;
            int S = F[10] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[11] != F[11])
      {
         Shrink = FULL_TO_SHRINK(F[11]);
         if ((F[11] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[11] = F[11];
         int Column = FULL_TO_COLUMN(F[11]);
         int C1 = Column | FULL_TO_COLUMN(F[9]);
         int C2 = Column | FULL_TO_COLUMN(F[10]);
         F[9] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[10] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[11];
         if (Single)
         {
            U[11] |= Single;
            int S = F[11] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[12] != F[12])
      {
         Shrink = FULL_TO_SHRINK(F[12]);
         if ((F[12] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[12] = F[12];
         int Column = FULL_TO_COLUMN(F[12]);
         int C1 = Column | FULL_TO_COLUMN(F[13]);
         int C2 = Column | FULL_TO_COLUMN(F[14]);
         F[13] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[14] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[12];
         if (Single)
         {
            U[12] |= Single;
            int S = F[12] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[13] != F[13])
      {
         Shrink = FULL_TO_SHRINK(F[13]);
         if ((F[13] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[13] = F[13];
         int Column = FULL_TO_COLUMN(F[13]);
         int C1 = Column | FULL_TO_COLUMN(F[12]);
         int C2 = Column | FULL_TO_COLUMN(F[14]);
         F[12] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[14] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[13];
         if (Single)
         {
            U[13] |= Single;
            int S = F[13] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[14] != F[14])
      {
         Shrink = FULL_TO_SHRINK(F[14]);
         if ((F[14] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[14] = F[14];
         int Column = FULL_TO_COLUMN(F[14]);
         int C1 = Column | FULL_TO_COLUMN(F[12]);
         int C2 = Column | FULL_TO_COLUMN(F[13]);
         F[12] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[13] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[14];
         if (Single)
         {
            U[14] |= Single;
            int S = F[14] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[15] != F[15])
      {
         Shrink = FULL_TO_SHRINK(F[15]);
         if ((F[15] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[15] = F[15];
         int Column = FULL_TO_COLUMN(F[15]);
         int C1 = Column | FULL_TO_COLUMN(F[16]);
         int C2 = Column | FULL_TO_COLUMN(F[17]);
         F[16] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[17] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[15];
         if (Single)
         {
            U[15] |= Single;
            int S = F[15] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[18], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[16] != F[16])
      {
         Shrink = FULL_TO_SHRINK(F[16]);
         if ((F[16] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[16] = F[16];
         int Column = FULL_TO_COLUMN(F[16]);
         int C1 = Column | FULL_TO_COLUMN(F[15]);
         int C2 = Column | FULL_TO_COLUMN(F[17]);
         F[15] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[17] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[16];
         if (Single)
         {
            U[16] |= Single;
            int S = F[16] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[19], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[17] != F[17])
      {
         Shrink = FULL_TO_SHRINK(F[17]);
         if ((F[17] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[17] = F[17];
         int Column = FULL_TO_COLUMN(F[17]);
         int C1 = Column | FULL_TO_COLUMN(F[15]);
         int C2 = Column | FULL_TO_COLUMN(F[16]);
         F[15] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[16] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[17];
         if (Single)
         {
            U[17] |= Single;
            int S = F[17] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[20], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[18] != F[18])
      {
         Shrink = FULL_TO_SHRINK(F[18]);
         if ((F[18] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[18] = F[18];
         int Column = FULL_TO_COLUMN(F[18]);
         int C1 = Column | FULL_TO_COLUMN(F[19]);
         int C2 = Column | FULL_TO_COLUMN(F[20]);
         F[19] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[20] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[18];
         if (Single)
         {
            U[18] |= Single;
            int S = F[18] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[21], S); AN(F[24], S);
         }
      }
      if (CompF[19] != F[19])
      {
         Shrink = FULL_TO_SHRINK(F[19]);
         if ((F[19] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[19] = F[19];
         int Column = FULL_TO_COLUMN(F[19]);
         int C1 = Column | FULL_TO_COLUMN(F[18]);
         int C2 = Column | FULL_TO_COLUMN(F[20]);
         F[18] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[20] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[19];
         if (Single)
         {
            U[19] |= Single;
            int S = F[19] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[22], S); AN(F[25], S);
         }
      }
      if (CompF[20] != F[20])
      {
         Shrink = FULL_TO_SHRINK(F[20]);
         if ((F[20] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[20] = F[20];
         int Column = FULL_TO_COLUMN(F[20]);
         int C1 = Column | FULL_TO_COLUMN(F[18]);
         int C2 = Column | FULL_TO_COLUMN(F[19]);
         F[18] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[19] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[20];
         if (Single)
         {
            U[20] |= Single;
            int S = F[20] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[23], S); AN(F[26], S);
         }
      }
      if (CompF[21] != F[21])
      {
         Shrink = FULL_TO_SHRINK(F[21]);
         if ((F[21] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[21] = F[21];
         int Column = FULL_TO_COLUMN(F[21]);
         int C1 = Column | FULL_TO_COLUMN(F[22]);
         int C2 = Column | FULL_TO_COLUMN(F[23]);
         F[22] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[23] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[21];
         if (Single)
         {
            U[21] |= Single;
            int S = F[21] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[24], S);
         }
      }
      if (CompF[22] != F[22])
      {
         Shrink = FULL_TO_SHRINK(F[22]);
         if ((F[22] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[22] = F[22];
         int Column = FULL_TO_COLUMN(F[22]);
         int C1 = Column | FULL_TO_COLUMN(F[21]);
         int C2 = Column | FULL_TO_COLUMN(F[23]);
         F[21] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[23] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[22];
         if (Single)
         {
            U[22] |= Single;
            int S = F[22] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[25], S);
         }
      }
      if (CompF[23] != F[23])
      {
         Shrink = FULL_TO_SHRINK(F[23]);
         if ((F[23] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[23] = F[23];
         int Column = FULL_TO_COLUMN(F[23]);
         int C1 = Column | FULL_TO_COLUMN(F[21]);
         int C2 = Column | FULL_TO_COLUMN(F[22]);
         F[21] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[22] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[23];
         if (Single)
         {
            U[23] |= Single;
            int S = F[23] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[26], S);
         }
      }
      if (CompF[24] != F[24])
      {
         Shrink = FULL_TO_SHRINK(F[24]);
         if ((F[24] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[24] = F[24];
         int Column = FULL_TO_COLUMN(F[24]);
         int C1 = Column | FULL_TO_COLUMN(F[25]);
         int C2 = Column | FULL_TO_COLUMN(F[26]);
         F[25] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[26] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[24];
         if (Single)
         {
            U[24] |= Single;
            int S = F[24] & TblPalaceMask[Single];
            AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S);
         }
      }
      if (CompF[25] != F[25])
      {
         Shrink = FULL_TO_SHRINK(F[25]);
         if ((F[25] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[25] = F[25];
         int Column = FULL_TO_COLUMN(F[25]);
         int C1 = Column | FULL_TO_COLUMN(F[24]);
         int C2 = Column | FULL_TO_COLUMN(F[26]);
         F[24] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[26] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[25];
         if (Single)
         {
            U[25] |= Single;
            int S = F[25] & TblPalaceMask[Single];
            AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S);
         }
      }
      if (CompF[26] != F[26])
      {
         Shrink = FULL_TO_SHRINK(F[26]);
         if ((F[26] &= TblComplexMask[Shrink]) == 0)
            return false;
         CompF[26] = F[26];
         int Column = FULL_TO_COLUMN(F[26]);
         int C1 = Column | FULL_TO_COLUMN(F[24]);
         int C2 = Column | FULL_TO_COLUMN(F[25]);
         F[24] &= TblMaskSingle[Column] & TblMaskDouble[C2];
         F[25] &= TblMaskSingle[Column] & TblMaskDouble[C1];
         int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[26];
         if (Single)
         {
            U[26] |= Single;
            int S = F[26] & TblPalaceMask[Single];
            AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S);
         }
      }
   }
   return true;
}

zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 2:48 am

Code: Select all
#include <windows.h>
#include <stdio.h>
#include "sd.h"

/*
 * First Step : Compile this project with __ProgramWritter used and Run it
 * Second Step: Replace the function Update in the bottom of this file with all contents of Update.c created by First Step in current directory
 * Third Step : Recompile this project without __ProgramWritter and Run it
 *
 *           Warning: Can NOT comment __ProgramWritter until finish 3 steps!
 *
 * Good Luck! Shanghai China 2012/3/14 Zhou Yun Dong
 */
#define __ProgramWritter

byte                Board[81]; //Board
CGame                G[50]; //Guess Board

int                   Index; //Guess Index
int                   BlockMask[27]; //Block
int                   BlockMaskSum[3]; //Sum of BlockMask
int                   F[27]; //27-bit Full Mask
int                   U[27]; //3-bit Unique in Palace
int                   CompF[27]; //Use for Compare
int                   CompColumn[27]; //Processed 9-bit column mask
int                   CompShrink[27]; //Processed 9-bit shrink mask

byte                AllBoard[TotalBoard][81]; //All Boards
int                   TestCaseNumber; //Actual Number of Test Case

bool                Update();
void                Guess();
bool                Rollback();
byte                Calculate();

void InitSodoku()
{
   Index = 1;
   BlockMaskSum[0] = BlockMaskSum[1] = BlockMaskSum[2] = 0;
   for (int B = 26; B >= 0; --B)
   {
      BlockMask[B] = U[B] = 0;
      CompF[B] = F[B] = -1;
      CompColumn[B] = -1;
      CompShrink[B] = -1;
   }
   //Get Full Mask
   for (int i = 80; i >= 0; --i)
   {
      if (Board[i] == '0')
         continue;

      int B = TblMult3[Board[i] - '1'] + TblBoard_Block[i];
      //Get BlockMask and Sum of BlockMask
      BlockMask[B] |= TblBoard_BlockMask[i];
      BlockMaskSum[TblBoard_Block[i]] |= BlockMask[B];
      //Get Full Mask
      F[B] &= TblSelfMask[i];
      F[TblAnother1[B]] &= TblOtherMask[i];
      F[TblAnother2[B]] &= TblOtherMask[i];
      U[B] |= TblBoard_GridUniq[i];
   }
   for (int B = 26; B >= 0; --B)
      F[B] &= BlockMaskSum[TblRmder3[B]] ^ BlockMask[B] ^ BIT_SET_27;

   memcpy(G[0].F, F, sizeof(F));
}

void Guess()
{
   int Least = 9;
   for (int i = 0; i < 27; i++)
   {
      if (TblUniqFlag[U[i]])
         continue;
     
      G[Index].Block = i;
      if (TblNumberOne[LOW_9_BIT(F[i])] == 2)
      {
         G[Index].Line = 0;
         goto FoundLeast;
      }
      if (TblNumberOne[MID_9_BIT(F[i])] == 2)
      {
         G[Index].Line = 1;
         goto FoundLeast;
      }
      if (TblNumberOne[HIGH_9_BIT(F[i])] == 2)
      {
         G[Index].Line = 2;
         goto FoundLeast;
      }
   }
   Least = 9;
   for (int i = 0; i < 27; i++)
   {
      for (int j = 0; j < 3; j++)
      {
         int NumberOne = TblNumberOne[HML_9_BIT(F[i], j)];
         if (NumberOne > 1 && Least > NumberOne)
         {
            G[Index].Block = i;
            G[Index].Line = j;
            if ((Least = NumberOne) == 2)
               goto FoundLeast;
         }
      }
   }
FoundLeast:
   int Left = HML_9_BIT(F[G[Index].Block], G[Index].Line);
   int Mask = Left & -Left;
   //Save Data
   G[Index].Mask = Mask;
   G[Index].Left = Left;
   for (int B = 26; B >= 0; --B)
   {
      G[Index].F[B] = CompF[B] = F[B];
      G[Index].U[B] = U[B];
   }
   F[G[Index].Block] &= TblSelfMask[TblTailZero[Mask] + TblMult9[G[Index].Line]];
   CompColumn[G[Index].Block] = -1;
   CompShrink[G[Index].Block] = -1;
   Index++;
}

bool Rollback()
{
   if (Index-- == 1)
      return false;
   //Restore Data
   int Block = G[Index].Block;
   int Line = G[Index].Line;
   for (int B = 26; B >= 0; --B)
   {
      CompF[B] = F[B] = G[Index].F[B];
      U[B] = G[Index].U[B];
      CompColumn[B] = FULL_TO_COLUMN(F[B]);
      CompShrink[B] = FULL_TO_SHRINK(F[B]);
   }
   G[Index].Left ^= G[Index].Mask;
   G[Index].Mask = G[Index].Left & -G[Index].Left;
   F[Block] &= TblSelfMask[TblTailZero[G[Index].Mask] + TblMult9[Line]];
   CompColumn[Block] = -1;
   CompShrink[Block] = -1;
   //Whether The Last Guess
   Index += TblNumberOne[G[Index].Left] != 1;
   return true;
}
byte Calculate()
{
   bool FoundAnswer = false;
   while (true)
   {
      if (Update())
      {
         if (U[0] + U[1] + U[2] != 21 || U[3] + U[4] + U[5] != 21 || U[6] + U[7] + U[8] != 21
             || U[9] + U[10] + U[11] != 21 || U[12] + U[13] + U[14] != 21 || U[15] + U[16] + U[17] != 21
             || U[18] + U[19] + U[20] != 21 || U[21] + U[22] + U[23] != 21 || U[24] + U[25] + U[26] != 21)
         {
            Guess();
            continue;
         }
         if (FoundAnswer) //Find Answer
            return MoreAnswer;
         FoundAnswer = true;
      }
      if (!Rollback()) //Backtrack Over
         return FoundAnswer ? UniqAnswer : NoAnswer;
   }
}

#ifdef __ProgramWritter
void ProgramWritter()
{
   char str[256];

   FILE *fp = fopen("Update.c", "w+");
   fputs("bool Update()\n", fp);
   fputs("{\n", fp);
   fputs("\tint Changed = 1;\n", fp);
   fputs("\twhile (Changed)\n", fp);
   fputs("\t{\n", fp);
   fputs("\t\tChanged = 0;\n", fp);
   for (int i = 0; i < 27; i++)
   {
      sprintf(str, "\t\tint Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\tint Column%d = FULL_TO_COLUMN(F[%d]);\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\tif (CompShrink[%d] != Shrink%d)\n", i, i);
      fputs(str, fp);
      fputs("\t\t{\n", fp);
      sprintf(str, "\t\t\tint LastColumn%d = CompColumn[%d];\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\tif ((F[%d] &= TblComplexMask[Column%d]) == 0)\n", i, i);
      fputs(str, fp);
      fputs("\t\t\t\treturn false;\n", fp);
      fputs("\t\t\tChanged = 1;\n", fp);
      sprintf(str, "\t\t\tint Column%d = FULL_TO_COLUMN(F[%d]);\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\tCompColumn[%d] = Column%d;\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\tCompShrink[%d] = Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\tif (LastColumn%d != Column%d)\n", i, i);
      fputs(str, fp);
      fputs("\t\t\t{\n", fp);
      sprintf(str, "\t\t\t\tint S1 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother1[i]);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tint S2 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother2[i]);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S2];\n", TblAnother1[i], i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S1];\n", TblAnother2[i], i);
      fputs(str, fp);
      fputs("\t\t\t}\n", fp);
      sprintf(str, "\t\t\tint Single = (TblShrinkSingle[Shrink%d] & TblColumnSingle[Column%d]) ^ U[%d];\n", i, i, i);
      fputs(str, fp);
      fputs("\t\t\tif (Single)\n", fp);
      fputs("\t\t\t{\n", fp);
      sprintf(str, "\t\t\t\tU[%d] |= Single;\n", i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tint S = F[%d] & TblPalaceMask[Single];\n", i);
      fputs(str, fp);
      fputs("\t\t\t\t", fp);
      for (int j = TblRmder3[i]; j < 27; j += 3)
      {
         if (j != i)
         {
            sprintf(str, "AN(F[%d], S); ", j);
            fputs(str, fp);
         }
      }
      fputs("\n", fp);
      fputs("\t\t\t}\n", fp);
      fputs("\t\t}\n", fp);
      fputs("\t\telse\n", fp);
      fputs("\t\t{\n", fp);
      sprintf(str, "\t\t\tif (CompColumn[%d] != Column%d)\n", i, i);
      fputs(str, fp);
      fputs("\t\t\t{\n", fp);
      sprintf(str, "\t\t\t\tint LastColumn%d = CompColumn[%d];\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tif ((F[%d] &= TblComplexMask[Column%d]) == 0)\n", i, i);
      fputs(str, fp);
      fputs("\t\t\t\t\treturn false;\n", fp);
      fputs("\t\t\t\tChanged = 1;\n", fp);
      sprintf(str, "\t\t\t\tColumn%d = FULL_TO_COLUMN(F[%d]);\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tCompColumn[%d] = Column%d;\n", i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tCompShrink[%d] = Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i, i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\tif (LastColumn%d != Column%d)\n", i, i);
      fputs(str, fp);
      fputs("\t\t\t\t{\n", fp);
      sprintf(str, "\t\t\t\t\tint S1 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother1[i]);
      fputs(str, fp);
      sprintf(str, "\t\t\t\t\tint S2 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother2[i]);
      fputs(str, fp);
      sprintf(str, "\t\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S2];\n", TblAnother1[i], i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S1];\n", TblAnother2[i], i);
      fputs(str, fp);
      fputs("\t\t\t\t}\n", fp);
      sprintf(str, "\t\t\t\tint Single = (TblShrinkSingle[Shrink%d] & TblColumnSingle[Column%d]) ^ U[%d];\n", i, i, i);
      fputs(str, fp);
      fputs("\t\t\t\tif (Single)\n", fp);
      fputs("\t\t\t\t{\n", fp);
      sprintf(str, "\t\t\t\t\tU[%d] |= Single;\n", i);
      fputs(str, fp);
      sprintf(str, "\t\t\t\t\tint S = F[%d] & TblPalaceMask[Single];\n", i);
      fputs(str, fp);
      fputs("\t\t\t\t\t", fp);
      for (int j = TblRmder3[i]; j < 27; j += 3)
      {
         if (j != i)
         {
            sprintf(str, "AN(F[%d], S); ", j);
            fputs(str, fp);
         }
      }
      fputs("\n", fp);
      fputs("\t\t\t\t}\n", fp);
      fputs("\t\t\t}\n", fp);
      fputs("\t\t}\n", fp);
   }
   fputs("\t}\n", fp);
   fputs("\treturn true;\n", fp);
   fputs("}", fp);
   fclose(fp);
}
#endif

bool ReadFile()
{
   FILE *file = fopen("TestCase.txt", "r");
   if (!file)
   {
      printf("Can not open TestCase.txt in current directory");
      getchar();
      return false;
   }
   TestCaseNumber = 0;
   while (fgets((char *)AllBoard[TestCaseNumber], 1024, file))
   {
      if (++TestCaseNumber == TotalBoard)
         break;
   }
   fclose(file);
   return true;
}

bool main()
{
   CreateTable();

#ifdef __ProgramWritter
   ProgramWritter();
   return true;
#endif

   if (!ReadFile())
      return false;

   printf("Press Enter to Slove ...");
   getchar();

   //Set Highest Priority
   DWORD dwOldProcessP = GetPriorityClass(GetCurrentProcess());
   DWORD dwOldThreadP  = GetThreadPriority(GetCurrentThread());

   SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
   SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

   //Get Frequency
   LARGE_INTEGER Frequency;
   ::QueryPerformanceFrequency(&Frequency);

   //Get Start Time
   LARGE_INTEGER start;
   ::QueryPerformanceCounter(&start);

   int Answer = 0;
   //Procedure of Calculate
   for (int i = 0; i < TestCaseNumber; i++)
   {
      memcpy(Board, AllBoard[i], 81);
      InitSodoku();
      if ((Answer = Calculate()) != UniqAnswer)
         break;
   }

   //Get Stop Time
   LARGE_INTEGER stop;
   ::QueryPerformanceCounter(&stop);

   double spend = (double)(stop.QuadPart - start.QuadPart) / (double)Frequency.QuadPart;

   printf("Spend   = %f us\n", spend * 1000 * 1000);
   printf("Total   = %d\n", TestCaseNumber);
   printf("Average = %f us\n", spend * 1000 * 1000 / TestCaseNumber);
   if (Answer != UniqAnswer)
      printf("***************Result: %d   ***************\n", Answer);

   //Restore Priority
   SetThreadPriority(GetCurrentThread(), dwOldThreadP);
   SetPriorityClass(GetCurrentProcess(), dwOldProcessP);

   return true;
}

#ifdef __ProgramWritter
bool Update()
{
   return true;
}
#else
//open Update.c and copy to here
#endif
-----------1. Just replace Update() 2. What is different compared to old version? FULL_TO_SHRINK is heavy and FULL_TO_COLUMN is light in old version. FULL_TO_SHRINK is light and FULL_TO_COLUMN is heavy in new version.
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Postby zhouyundong_2012 » Tue Apr 07, 2026 3:09 am

b26 b17 b08 / b25 b16 b07 / b24 b15 b06
b23 b14 b05 / b22 b13 b04 / b21 b12 b03
b20 b11 b02 / b19 b10 b01 / b18 b09 b00

Shrink: b18 b09 b00
Column: b06 b03 b00
zhouyundong_2012
 
Posts: 155
Joined: 10 February 2012

Previous

Return to Software