The New Sudoku Players' Forum

by **zhouyundong_2012** » Tue Apr 07, 2026 2:10 am

# ZSolver: Core Idea — Band-Centered 27-Integer State Representation and Pipeline-Friendly Loop Unrolling

**Author**: Zhou Yundong

## Abstract
This paper presents the core implementation ideas behind ZSolver. The key lies in a compact, band-centered state representation and a unified elimination framework built upon it. For any given digit, its state within a single band requires only 27 bits to describe. With 9 digits and 3 bands, just 27 integers are sufficient to capture all core state information needed during solving. On top of this representation, intra-band elimination, inter-band elimination, and box elimination can each be compressed into just a few lines of code. Furthermore, by fully unrolling the 27-iteration loop, relative index accesses such as F[i] are transformed into fixed absolute address accesses such as F[5], which makes extremely effective use of the CPU pipeline. This results in a significant speedup from approximately 8 microseconds to just over 4 microseconds. This paper explains this implementation path from three perspectives: state representation, elimination code structure, and execution efficiency.

**Keywords**: Sudoku solving; ZSolver; band representation; bitwise operations; loop unrolling; CPU pipeline; high-performance implementation

## 1. Introduction
The performance of a Sudoku solver is often assumed to depend primarily on its search strategy, pruning rules, and constraint system. In practice, however, the underlying state representation and code organization often have a far greater impact on the final speed. When pursuing microsecond-level performance in particular, whether a program can complete state updates, constraint propagation, and contradiction detection in the shortest possible path matters more than high-level strategy.

ZSolver's core experience demonstrates that when the state representation is compact enough, constraint propagation is unified enough, and the code structure is well-suited to the execution characteristics of modern CPUs, overall performance can improve dramatically. The real key is not to accumulate complex logic, but to compress the core problem into a small number of integers and a small amount of short code — and then to convert that structural advantage into an execution advantage at the processor level.

## 2. Band-Centered State Representation
The central foundation of ZSolver is a band-centered state representation.

For any given digit, its state within a single band can be represented using just 27 bits. Since a full Sudoku board has 9 digits and 3 bands, only 27 integers in total are needed to describe all core state. This representation is extremely compact and, crucially, completely uniform.

The significance here goes beyond simply "saving memory." More importantly:

1. The state structure is identical for every digit;
2. The processing method is identical for every band;
3. All subsequent elimination operations can be built on the same integer representation;
4. Data is highly concentrated, making it easier for compilers to optimize and for CPUs to execute efficiently.

In typical implementations, state information is scattered across rows, columns, boxes, and cells, requiring constant switching between different data views and structures. Under this representation, many originally dispersed relationships are folded into a fixed set of integers, allowing subsequent operations to be performed directly as bitwise computations.

## 3. Simplification of Intra-Band Elimination
On the basis of the above representation, intra-band elimination for any given digit requires only a single line of code.

This is critically important. It means that the primary constraint relationships within a band have been heavily absorbed by the state encoding itself. What the program actually needs to do is no longer a complex traversal with many conditional checks — it is just one very short operation.

This shows that the representation is not passively storing state, but actively participating in the solving process. It is precisely because the representation is so tightly aligned with the problem structure that the elimination logic can be written so concisely.

## 4. Unified Implementation of Inter-Band Elimination
On the same basis, inter-band elimination requires only two or three lines of code.

This shows that the representation is not only suitable for local updates within a single band, but also naturally supports constraint propagation between bands. In other words, once the state representation is established, eliminations at different structural levels no longer require separate large blocks of code — they can all be handled within the same unified framework.

For high-performance programs, this kind of uniformity is extremely valuable. Uniformity means:

1. Shorter code;
2. More stable logic;
3. More predictable execution paths;
4. Easier and more efficient processing by the machine.

## 5. Box Elimination as a Natural Extension
Box elimination, building on what has already been established, also requires only two or three lines of code.

This further demonstrates that this representation does not sacrifice the 3×3 box constraint — the most fundamental structural element of Sudoku — but instead incorporates it naturally into the same framework. As a result, intra-band elimination, inter-band elimination, and box elimination are not three separate implementations, but three direct operations all built on the same underlying state structure.

From an engineering perspective, this inheritance and uniformity is extremely valuable. It allows the program to avoid relying on a large layered rule system, depending instead on a single representation that is correct, compact, and consistent.

## 6. Each Full Elimination Pass Requires Only About Ten Lines of Code
Taking all the elimination types together, the total core code for one complete elimination pass is no more than around ten lines.

This is one of ZSolver's most distinctive characteristics. Its speed is not achieved through "vast amounts of code," but through "high-density execution of a small amount of code." The true complexity is no longer visible on the surface of the code — it has been compressed into the state representation and operational structure.

This short-code approach has two direct benefits:

1. The implementation is clear and does not require many intermediate layers;
2. It is well-suited for subsequent unrolling, scheduling, and low-level optimization.

## 7. Unrolling the 27-Iteration Loop
Given this foundation, fully unrolling the 27-iteration loop produces a notable change in the execution characteristics of the program, even though the total code length does not grow very large.

Before unrolling, the program typically accesses state via expressions like F[i] — an indexed access with a variable subscript.
After unrolling, those accesses become fixed expressions like F[5] — absolute address accesses.

Although this may appear to be merely a change in array index notation, its impact at the execution level is significant. Fixed address access means:

1. No repeated updating of loop variables;
2. No frequent boundary checks;
3. Easier register allocation by the compiler;
4. Greater freedom for instruction scheduling;
5. More direct and predictable data access paths.

This step is not simply about "removing the loop." It is about transforming the core code into a fixed, short instruction sequence that is as close as possible to the ideal execution form for the processor.

## 8. Effective Utilization of the CPU Pipeline
This unrolled implementation makes extremely effective use of the CPU's pipeline execution capability.

Based on practical experience, after adopting this style of loop unrolling and fixed address access, program speed improved significantly from approximately 8 microseconds to just over 4 microseconds. The magnitude of this improvement shows that what determines performance is not only "what the logic does," but also "how the code is executed by the machine."

Once code exhibits the following characteristics:

1. Stable execution paths;
2. Fixed memory access patterns;
3. Very few branches;
4. Tight, repetitive core operations;

the pipeline advantages of modern CPUs can be fully utilized. ZSolver's implementation path is precisely the result of unifying state representation, short code structure, and processor execution characteristics.

## 9. Natural Convergence of the Implementation
Looking at the implementation process as a whole, the initial method provided the direction, and the F[27] representation allowed the entire system to converge into a coherent whole.

Once this 27-integer representation was in place:

1. Intra-band elimination could be written in one line;
2. Inter-band elimination could be written in two or three lines;
3. Box elimination could likewise be written in two or three lines;
4. All eliminations combined amounted to only about ten lines of core code;
5. After 27-iteration unrolling, accesses became absolute addresses;
6. The result naturally became a pipeline-friendly, high-performance implementation.

ZSolver's core is therefore not any single isolated technique, but a continuous engineering path: first compress the state, then unify the operations, then unroll the core code, and finally convert the algorithmic advantage into an execution advantage at the machine level.

## 10. Conclusion
ZSolver's core ideas can be summarized as follows: represent all core state using 27 integers organized around bands; complete constraint propagation using extremely short and uniform elimination code; and, through 27-iteration loop unrolling, organize the program into an execution form that is highly friendly to the CPU pipeline.

This implementation path demonstrates that in Sudoku solving and similar constraint-satisfaction problems, the true performance ceiling is determined not only by the solving strategy itself, but also by whether the state representation is compact enough, whether the elimination code is uniform enough, and whether the implementation form is sufficiently aligned with the working characteristics of modern processors. ZSolver's performance improvement is the direct result of this unified design.

**Note**
ZSolver's public source code is available in the software section of www.enjoysoduku.com. The publicly released version is named "3.88-microsecond solver."

by **zhouyundong_2012** » Tue Apr 07, 2026 2:17 am

# ZSolver Guess() Optimization Discussion Report

## Scope

This report summarizes a focused discussion on deep-guess optimization for ZSolver, specifically around the `Guess()` function in the ZSolver codebase. The goal of this report is not to deliver a finished optimization patch, but to record the key ideas, separate authorship clearly, and hand off the next stage of work to other contributors.

Relevant code context:

- `Guess()` is implemented in `ZSolver/sd.cpp`.
- `Rollback()` is the paired backtracking routine.
- `Calculate()` is the main search loop that calls `Guess()`.
- `CGame` in `ZSolver/sd.h` is the snapshot structure used by the explicit stack-based search.

## Confirmed Context

- `Guess()` is the core target of the "deep-guess optimization" discussion.
- The current implementation is an explicit-stack search rather than a recursive search.
- The current heuristic is roughly: prefer branches whose candidate count is `2`; otherwise choose the smallest candidate count available.
- The implementation is naturally organized around `Block + Line`, where each guess step is aligned with the current 27-integer representation.

## Viewpoints From zhouyundong

The following points were proposed by **zhouyundong** during the discussion:

1. **zhouyundong:** Deep-guess optimization specifically means optimizing `Guess()`.
2. **zhouyundong:** When the board contains a large blank area, a larger-step strategy may be better than the current fine-grained guessing. In that context, "guessing nine cells at once" was proposed as an intuitive direction.
3. **zhouyundong:** Once the board is no longer extremely empty, "guessing three cells in one step" is likely better than a larger nine-cell step.
4. **zhouyundong:** The main effort should not be spent on writing detailed optimization code first. The main effort should be spent on correctly distinguishing the three board phases, referred to as `1 / 2 / 3`.
5. **zhouyundong:** In the current representation, `U` is the most practical signal for phase classification. `F` is not convenient or reliable enough as the main basis for this classification.
6. **zhouyundong:** If necessary, it is acceptable to sacrifice a little `Update()` performance in order to expose more useful signals for phase classification.
7. **zhouyundong:** There is no strong need to complete the full concrete optimization now. The more meaningful output at this stage is a clean conceptual handoff so that others can continue the work.

## Viewpoints From AI

The following points were proposed by **AI** during the discussion:

1. **AI:** The current `Guess()` heuristic is strong when the board already contains tight local constraints, especially when many branches have only `2` or `3` candidates. It becomes less effective when the board is still very sparse and candidate counts remain large almost everywhere.
2. **AI:** The real optimization problem is not only "how to rewrite a few lines in `Guess()`", but "how to detect which stage the board is currently in and dispatch to the right strategy".
3. **AI:** A practical three-phase model is:

- Phase `1`: very sparse board, weak propagation, suitable for a larger-step strategy.
- Phase `2`: mid-game board, stronger structure, suitable for a "three-cells-in-one-step" strategy.
- Phase `3`: late-game board, many tight branches, suitable for the current MRV-like small-branch `Guess()`.
4. **AI:** "Guessing nine cells at once" should not mean blindly assigning nine unrelated cells. If such a strategy is ever used, it should be a structured large-step guess used only in very sparse positions.
5. **AI:** When the board is no longer very sparse, "one step, three cells" is more natural than "one step, nine cells" because it matches the existing `Block + Line` organization of the 27-integer representation and is more compatible with the current stack and rollback model.
6. **AI:** The first version of the phase classifier can be built mainly from `U`, without depending heavily on `F`.
7. **AI:** If a small performance cost in `Update()` is acceptable, the best addition is not a heavy new analysis pass, but lightweight counters or statistics generated during propagation.
8. **AI:** Useful signals for distinguishing phases may include:

- how many `U` positions are already in a unique or nearly unique state,
- how much `U` changes during one `Update()` cycle,
- how many new uniquely resolved positions appear after one guess followed by propagation,
- whether propagation is producing strong follow-up effects or almost no shrinkage at all.
9. **AI:** The most valuable next step is to define a stable and cheap `1 / 2 / 3` classifier first. Strategy implementation should come after that.

## Joint Direction Emerging From The Discussion

Although the detailed optimization was intentionally not implemented, the discussion converged on the following direction:

1. The future of `Guess()` optimization should likely be **phase-based**, not one-size-fits-all.
2. The most important research task is to distinguish board phase `1`, `2`, and `3` reliably.
3. The main signal source for this phase classification should begin with `U`.
4. A limited extra cost inside `Update()` is acceptable if it makes phase classification significantly more accurate.
5. Concrete strategy work can be handed off:

- phase `1` strategy can be explored by others,
- phase `2` "one-step three-cells" strategy can be explored by others,
- phase `3` can continue to rely on the current refined `Guess()` style or a close variant.

## Handoff Note

This report intentionally stops at the level of problem framing, phase separation, and heuristic direction. It does not attempt to provide a finished optimization implementation.

That decision is deliberate:

- the conceptual split is more important than premature code changes,
- the next contributor can pick up the baton from a cleaner starting point,
- and the discussion has already produced a useful division of labor between problem definition and later engineering execution.

## Attribution Summary

- All user-originated viewpoints in this report are attributed to **zhouyundong**.
- All assistant-originated viewpoints in this report are attributed to **AI**.

by **zhouyundong_2012** » Tue Apr 07, 2026 2:43 am

Code: Select all: #ifndef __sd_h__ #define __sd_h__ #define TotalBoard 50000 //Total Games //Answer Attribute #define NoAnswer 0 #define UniqAnswer 1 #define MoreAnswer 2 //High, Middle, Low 9 bit of a 27-bit number #define HIGH_9_BIT(v) ((v) >> 18) #define MID_9_BIT(v) (((v) >> 9) & 0x1FF) #define LOW_9_BIT(v) ((v) & 0x1FF) #define HML_9_BIT(v, l) ((v) >> TblMult9[l] & 0x1FF) #define FULL_TO_COLUMN(v) (TblShrinkHigh[(v) >> 18] | TblShrinkMid[((v) >> 9) & 0x1FF] | TblShrinkLow[(v) & 0x1FF]) //Full Mask(27-bit) to 9 bit #define FULL_TO_SHRINK(v) (((v) | ((v) >> 9) | ((v) >> 18)) & 0x1FF) #define BIT_SET_27 0x07FFFFFF #define AN(v, n) (v) &= ~(n) struct CGame { public: int F[27]; //27-bit Full Mask int U[27]; //3-bit Unique in Palace int Block; //The Block Number of guess [0, 27) int Line; //The Line Number of guess [0, 3) int Mask; //The 9-bit Mask of guess int Left; //The 9-bit Left Mask of guess }; //Common Table extern int TblMult3[81]; extern int TblDivide3[81]; extern int TblRmder3[81]; extern int TblMult9[9]; //Basis Table extern int TblCombineMask[512]; extern int TblTailZero[512]; extern int TblNumberOne[512]; extern int TblAnother1[27]; extern int TblAnother2[27]; extern int TblShiftLeft[27]; //Table of Decompose Board extern int TblBoard_Palace[81]; extern int TblBoard_Block[81]; extern int TblBoard_BlockMask[81]; extern int TblBoard_GridUniq[81]; //Table of Initial extern int TblSelfMask[81]; extern int TblOtherMask[81]; //Complex Friend Method extern int TblMaskSingle[512]; extern int TblMaskDouble[512]; //Other Table extern int TblPalaceMask[8]; extern bool TblUniqFlag[8]; //Shrink Mask extern int TblShrinkLow[512]; extern int TblShrinkMid[512]; extern int TblShrinkHigh[512]; //Processed state cache extern int CompColumn[27]; extern int CompShrink[27]; //Complex Method extern int TblComplexMask[512]; extern int TblColumnSingle[512]; extern int TblShrinkSingle[512]; extern void CreateTable(); #endif

by **zhouyundong_2012** » Tue Apr 07, 2026 2:44 am

Code: Select all: #include <string.h> #include "sd.h" //Common Table int TblMult3[81]; int TblDivide3[81]; int TblRmder3[81]; int TblMult9[9]; //Basis Table int TblCombineMask[512]; int TblTailZero[512]; int TblNumberOne[512]; int TblAnother1[27]; int TblAnother2[27]; int TblShiftLeft[27]; //Table of Decompose Board int TblBoard_Palace[81]; int TblBoard_Block[81]; int TblBoard_BlockMask[81]; int TblBoard_GridUniq[81]; //Table of Initial int TblSelfMask[81]; int TblOtherMask[81]; //Complex Friend Method int TblMaskSingle[512]; int TblMaskDouble[512]; //Other Table int TblPalaceMask[8]; bool TblUniqFlag[8]; //Shrink Mask int TblShrinkLow[512]; int TblShrinkMid[512]; int TblShrinkHigh[512]; //Complex Method int TblComplexMask[512]; int TblColumnSingle[512]; int TblShrinkSingle[512]; void CreateTblCommon() { for (int i = 0; i < 81; i++) { TblMult3[i] = i * 3; TblDivide3[i] = i / 3; TblRmder3[i] = i % 3; } for (int i = 0; i < 9; i++) { TblMult9[i] = i * 9; } } void CreateTblCombineMask() { for (int i = 0; i < 512; i++) { TblCombineMask[i] = (i | (i << 9) | (i << 18)) ^ BIT_SET_27; } } void CreateTblAnother() { for (int i = 0; i < 27; i++) { switch (i % 3) { case 0: //At Top TblAnother1[i] = i + 1; TblAnother2[i] = i + 2; break; case 1: //At Middle TblAnother1[i] = i - 1; TblAnother2[i] = i + 1; break; case 2: //At Bottom TblAnother1[i] = i - 2; TblAnother2[i] = i - 1; break; } } } void CreateTblTailZero() { memset(TblTailZero, 0, sizeof(TblTailZero)); for (int i = 0; i < 512; i++) { for (int j = 0; j < 9; j++) { if (i & (1 << j)) break; TblTailZero[i]++; } } } void CreateTblNumberOne() { memset(TblNumberOne, 0, sizeof(TblNumberOne)); for (int i = 0; i < 512; i++) { for (int j = 0; j < 9; j++) { if (i & (1 << j)) TblNumberOne[i]++; } } } void CreateTblShiftLeft() { for (int i = 0; i < 27; i++) { TblShiftLeft[i] = 1 << i; } } void CreateTblBasis() { CreateTblCombineMask(); CreateTblAnother(); CreateTblTailZero(); CreateTblNumberOne(); CreateTblShiftLeft(); } void CreateTblBoard() { for (int i = 0; i < 81; i++) { TblBoard_Palace[i] = TblMult3[i / 27] + TblDivide3[i % 9]; TblBoard_Block[i] = i / 27; TblBoard_BlockMask[i] = 1 << i % 27; TblBoard_GridUniq[i] = 1 << TblBoard_Palace[i] % 3; } } void CreateTblSelfMask() { int Mask[3] = { 0x7E3F1F8, 0x71F8FC7, 0x0FC7E3F }; int Row[3] = { 0x7FFFE00, 0x7FC01FF, 0x3FFFF }; for (int i = 0; i < 81; i++) { int Palace = TblBoard_Palace[i % 27]; TblSelfMask[i] = Mask[Palace] & Row[i % 27 / 9] | TblBoard_BlockMask[i]; } } void CreateTblOtherMask() { for (int i = 0; i < 81; i++) { TblOtherMask[i] = TblCombineMask[1 << i % 9]; } } void CreateTblMaskSingle() { for (int i = 0; i < 512; i++) { TblMaskSingle[i] = i; int v = i >> 6; //High 3 bit if (v != 4 && v != 2 && v != 1) TblMaskSingle[i] &= 0x3F; v = i >> 3 & 0x7; //Middle 3 bit if (v != 4 && v != 2 && v != 1) TblMaskSingle[i] &= 0x1C7; v = i & 0x7; //Low 3 bit if (v != 4 && v != 2 && v != 1) TblMaskSingle[i] &= 0x1F8; TblMaskSingle[i] = TblCombineMask[TblMaskSingle[i]]; } } void CreateTblMaskDouble() { for (int i = 0; i < 512; i++) { TblMaskDouble[i] = i; int v = i >> 6; //High 3 bit if (v != 6 && v != 5 && v != 3) TblMaskDouble[i] &= 0x3F; v = i >> 3 & 0x7; //Middle 3 bit if (v != 6 && v != 5 && v != 3) TblMaskDouble[i] &= 0x1C7; v = i & 0x7; //Low 3 bit if (v != 6 && v != 5 && v != 3) TblMaskDouble[i] &= 0x1F8; TblMaskDouble[i] = TblCombineMask[TblMaskDouble[i]]; } } void CreateTblPalaceMask() { memset(TblPalaceMask, 0, sizeof(TblPalaceMask)); for (int i = 0; i < 8; i++) { for (int j = 0; j < 3; j++) { if (i & (1 << j)) TblPalaceMask[i] |= 0x1C0E07 << TblMult3[j]; } } } void CreateTblUniqFlag() { memset(TblUniqFlag, 0, sizeof(TblUniqFlag)); for (int i = 0; i < 8; i++) { if (TblNumberOne[i] >= 2) TblUniqFlag[i] = true; } } void CreateShrinkMask() { int PalaceMask[3] = { 0x7, 0x38, 0x1C0 }; memset(TblShrinkLow, 0, sizeof(TblShrinkLow)); for (int i = 0; i < 512; i++) { for (int j = 0; j < 3; j++) { if (i & PalaceMask[j]) TblShrinkLow[i] |= 1 << j; } } for (int i = 0; i < 512; i++) { TblShrinkMid[i] = TblShrinkLow[i] << 3; TblShrinkHigh[i] = TblShrinkLow[i] << 6; } } void CreateComplexMask() { int MaskH[9] = { 0x7FFFE07, 0x7FFFE38, 0x7FFFFC0, 0x7FC0FFF, 0x7FC71FF, 0x7FF81FF, 0x1FFFFF, 0xE3FFFF, 0x703FFFF }; //A B C int MaskV[9] = { 0x7E3F1FF, 0x71F8FFF, 0xFC7FFF, 0x7E3FFF8, 0x71FFFC7, 0xFFFE3F, 0x7FFF1F8, 0x7FF8FC7, 0x7FC7E3F }; //A 0 0 int index; for (int i = 0; i < 512; i++) { if (!(i & 0x1C0)) { TblComplexMask[i] = 0; continue; } TblComplexMask[i] = BIT_SET_27; //Like Locked Candidate for (int j = 0; j < 3; j++) { for (int k = 0; k < 3; k++) { index = TblMult3[k] + j; if ((i & (1 << index)) && !(i & (1 << (TblMult3[TblAnother1[k]] + j))) && !(i & (1 << (TblMult3[TblAnother2[k]] + j)))) TblComplexMask[i] &= MaskH[index]; //Horizontal index = TblMult3[j] + k; if ((i & (1 << index)) && !(i & (1 << TblAnother1[index])) && !(i & (1 << TblAnother2[index]))) TblComplexMask[i] &= MaskV[index]; //Vertical } } } for (int i = 0; i < 512; i++) { if ((i & FULL_TO_COLUMN(TblComplexMask[i])) == 0) TblComplexMask[i] = 0; else if ((i & FULL_TO_COLUMN(TblComplexMask[i])) == i) TblComplexMask[i] = BIT_SET_27; } } void CreateTblColumnSingle() { int ColumnMask[3] = { 0x49, 0x92, 0x124 }; for (int i = 0; i < 512; i++) { TblColumnSingle[i] = 7; for (int j = 0; j < 3; j++) { int v = i & ColumnMask[j]; if (v != 1 && v != 8 && v != 64) TblColumnSingle[i] &= 7 ^ (1 << j); } } for (int i = 0; i < 512; i++) { TblShrinkSingle[i] = 7; int v = i >> 6; //High 3 bit if (v != 4 && v != 2 && v != 1) TblShrinkSingle[i] &= 0x3; v = i >> 3 & 0x7; //Middle 3 bit if (v != 4 && v != 2 && v != 1) TblShrinkSingle[i] &= 0x5; v = i & 0x7; //Low 3 bit if (v != 4 && v != 2 && v != 1) TblShrinkSingle[i] &= 0x6; } } void CreateTable() { CreateTblCommon(); CreateTblBasis(); CreateTblBoard(); CreateTblSelfMask(); CreateTblOtherMask(); CreateTblMaskSingle(); CreateTblMaskDouble(); CreateTblPalaceMask(); CreateTblUniqFlag(); CreateShrinkMask(); CreateComplexMask(); CreateTblColumnSingle(); }

by **zhouyundong_2012** » Tue Apr 07, 2026 2:45 am

Code: Select all: bool Update() { int Shrink = ~0; while (Shrink) { Shrink = 0; if (CompF[0] != F[0]) { Shrink = FULL_TO_SHRINK(F[0]); if ((F[0] &= TblComplexMask[Shrink]) == 0) return false; CompF[0] = F[0]; int Column = FULL_TO_COLUMN(F[0]); int C1 = Column | FULL_TO_COLUMN(F[1]); int C2 = Column | FULL_TO_COLUMN(F[2]); F[1] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[2] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[0]; if (Single) { U[0] |= Single; int S = F[0] & TblPalaceMask[Single]; AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[1] != F[1]) { Shrink = FULL_TO_SHRINK(F[1]); if ((F[1] &= TblComplexMask[Shrink]) == 0) return false; CompF[1] = F[1]; int Column = FULL_TO_COLUMN(F[1]); int C1 = Column | FULL_TO_COLUMN(F[0]); int C2 = Column | FULL_TO_COLUMN(F[2]); F[0] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[2] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[1]; if (Single) { U[1] |= Single; int S = F[1] & TblPalaceMask[Single]; AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[2] != F[2]) { Shrink = FULL_TO_SHRINK(F[2]); if ((F[2] &= TblComplexMask[Shrink]) == 0) return false; CompF[2] = F[2]; int Column = FULL_TO_COLUMN(F[2]); int C1 = Column | FULL_TO_COLUMN(F[0]); int C2 = Column | FULL_TO_COLUMN(F[1]); F[0] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[1] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[2]; if (Single) { U[2] |= Single; int S = F[2] & TblPalaceMask[Single]; AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[3] != F[3]) { Shrink = FULL_TO_SHRINK(F[3]); if ((F[3] &= TblComplexMask[Shrink]) == 0) return false; CompF[3] = F[3]; int Column = FULL_TO_COLUMN(F[3]); int C1 = Column | FULL_TO_COLUMN(F[4]); int C2 = Column | FULL_TO_COLUMN(F[5]); F[4] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[5] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[3]; if (Single) { U[3] |= Single; int S = F[3] & TblPalaceMask[Single]; AN(F[0], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[4] != F[4]) { Shrink = FULL_TO_SHRINK(F[4]); if ((F[4] &= TblComplexMask[Shrink]) == 0) return false; CompF[4] = F[4]; int Column = FULL_TO_COLUMN(F[4]); int C1 = Column | FULL_TO_COLUMN(F[3]); int C2 = Column | FULL_TO_COLUMN(F[5]); F[3] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[5] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[4]; if (Single) { U[4] |= Single; int S = F[4] & TblPalaceMask[Single]; AN(F[1], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[5] != F[5]) { Shrink = FULL_TO_SHRINK(F[5]); if ((F[5] &= TblComplexMask[Shrink]) == 0) return false; CompF[5] = F[5]; int Column = FULL_TO_COLUMN(F[5]); int C1 = Column | FULL_TO_COLUMN(F[3]); int C2 = Column | FULL_TO_COLUMN(F[4]); F[3] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[4] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[5]; if (Single) { U[5] |= Single; int S = F[5] & TblPalaceMask[Single]; AN(F[2], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[6] != F[6]) { Shrink = FULL_TO_SHRINK(F[6]); if ((F[6] &= TblComplexMask[Shrink]) == 0) return false; CompF[6] = F[6]; int Column = FULL_TO_COLUMN(F[6]); int C1 = Column | FULL_TO_COLUMN(F[7]); int C2 = Column | FULL_TO_COLUMN(F[8]); F[7] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[8] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[6]; if (Single) { U[6] |= Single; int S = F[6] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[7] != F[7]) { Shrink = FULL_TO_SHRINK(F[7]); if ((F[7] &= TblComplexMask[Shrink]) == 0) return false; CompF[7] = F[7]; int Column = FULL_TO_COLUMN(F[7]); int C1 = Column | FULL_TO_COLUMN(F[6]); int C2 = Column | FULL_TO_COLUMN(F[8]); F[6] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[8] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[7]; if (Single) { U[7] |= Single; int S = F[7] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[8] != F[8]) { Shrink = FULL_TO_SHRINK(F[8]); if ((F[8] &= TblComplexMask[Shrink]) == 0) return false; CompF[8] = F[8]; int Column = FULL_TO_COLUMN(F[8]); int C1 = Column | FULL_TO_COLUMN(F[6]); int C2 = Column | FULL_TO_COLUMN(F[7]); F[6] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[7] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[8]; if (Single) { U[8] |= Single; int S = F[8] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[9] != F[9]) { Shrink = FULL_TO_SHRINK(F[9]); if ((F[9] &= TblComplexMask[Shrink]) == 0) return false; CompF[9] = F[9]; int Column = FULL_TO_COLUMN(F[9]); int C1 = Column | FULL_TO_COLUMN(F[10]); int C2 = Column | FULL_TO_COLUMN(F[11]); F[10] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[11] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[9]; if (Single) { U[9] |= Single; int S = F[9] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[10] != F[10]) { Shrink = FULL_TO_SHRINK(F[10]); if ((F[10] &= TblComplexMask[Shrink]) == 0) return false; CompF[10] = F[10]; int Column = FULL_TO_COLUMN(F[10]); int C1 = Column | FULL_TO_COLUMN(F[9]); int C2 = Column | FULL_TO_COLUMN(F[11]); F[9] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[11] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[10]; if (Single) { U[10] |= Single; int S = F[10] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[11] != F[11]) { Shrink = FULL_TO_SHRINK(F[11]); if ((F[11] &= TblComplexMask[Shrink]) == 0) return false; CompF[11] = F[11]; int Column = FULL_TO_COLUMN(F[11]); int C1 = Column | FULL_TO_COLUMN(F[9]); int C2 = Column | FULL_TO_COLUMN(F[10]); F[9] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[10] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[11]; if (Single) { U[11] |= Single; int S = F[11] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[12] != F[12]) { Shrink = FULL_TO_SHRINK(F[12]); if ((F[12] &= TblComplexMask[Shrink]) == 0) return false; CompF[12] = F[12]; int Column = FULL_TO_COLUMN(F[12]); int C1 = Column | FULL_TO_COLUMN(F[13]); int C2 = Column | FULL_TO_COLUMN(F[14]); F[13] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[14] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[12]; if (Single) { U[12] |= Single; int S = F[12] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[13] != F[13]) { Shrink = FULL_TO_SHRINK(F[13]); if ((F[13] &= TblComplexMask[Shrink]) == 0) return false; CompF[13] = F[13]; int Column = FULL_TO_COLUMN(F[13]); int C1 = Column | FULL_TO_COLUMN(F[12]); int C2 = Column | FULL_TO_COLUMN(F[14]); F[12] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[14] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[13]; if (Single) { U[13] |= Single; int S = F[13] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[14] != F[14]) { Shrink = FULL_TO_SHRINK(F[14]); if ((F[14] &= TblComplexMask[Shrink]) == 0) return false; CompF[14] = F[14]; int Column = FULL_TO_COLUMN(F[14]); int C1 = Column | FULL_TO_COLUMN(F[12]); int C2 = Column | FULL_TO_COLUMN(F[13]); F[12] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[13] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[14]; if (Single) { U[14] |= Single; int S = F[14] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[15] != F[15]) { Shrink = FULL_TO_SHRINK(F[15]); if ((F[15] &= TblComplexMask[Shrink]) == 0) return false; CompF[15] = F[15]; int Column = FULL_TO_COLUMN(F[15]); int C1 = Column | FULL_TO_COLUMN(F[16]); int C2 = Column | FULL_TO_COLUMN(F[17]); F[16] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[17] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[15]; if (Single) { U[15] |= Single; int S = F[15] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[18], S); AN(F[21], S); AN(F[24], S); } } if (CompF[16] != F[16]) { Shrink = FULL_TO_SHRINK(F[16]); if ((F[16] &= TblComplexMask[Shrink]) == 0) return false; CompF[16] = F[16]; int Column = FULL_TO_COLUMN(F[16]); int C1 = Column | FULL_TO_COLUMN(F[15]); int C2 = Column | FULL_TO_COLUMN(F[17]); F[15] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[17] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[16]; if (Single) { U[16] |= Single; int S = F[16] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[19], S); AN(F[22], S); AN(F[25], S); } } if (CompF[17] != F[17]) { Shrink = FULL_TO_SHRINK(F[17]); if ((F[17] &= TblComplexMask[Shrink]) == 0) return false; CompF[17] = F[17]; int Column = FULL_TO_COLUMN(F[17]); int C1 = Column | FULL_TO_COLUMN(F[15]); int C2 = Column | FULL_TO_COLUMN(F[16]); F[15] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[16] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[17]; if (Single) { U[17] |= Single; int S = F[17] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[20], S); AN(F[23], S); AN(F[26], S); } } if (CompF[18] != F[18]) { Shrink = FULL_TO_SHRINK(F[18]); if ((F[18] &= TblComplexMask[Shrink]) == 0) return false; CompF[18] = F[18]; int Column = FULL_TO_COLUMN(F[18]); int C1 = Column | FULL_TO_COLUMN(F[19]); int C2 = Column | FULL_TO_COLUMN(F[20]); F[19] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[20] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[18]; if (Single) { U[18] |= Single; int S = F[18] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[21], S); AN(F[24], S); } } if (CompF[19] != F[19]) { Shrink = FULL_TO_SHRINK(F[19]); if ((F[19] &= TblComplexMask[Shrink]) == 0) return false; CompF[19] = F[19]; int Column = FULL_TO_COLUMN(F[19]); int C1 = Column | FULL_TO_COLUMN(F[18]); int C2 = Column | FULL_TO_COLUMN(F[20]); F[18] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[20] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[19]; if (Single) { U[19] |= Single; int S = F[19] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[22], S); AN(F[25], S); } } if (CompF[20] != F[20]) { Shrink = FULL_TO_SHRINK(F[20]); if ((F[20] &= TblComplexMask[Shrink]) == 0) return false; CompF[20] = F[20]; int Column = FULL_TO_COLUMN(F[20]); int C1 = Column | FULL_TO_COLUMN(F[18]); int C2 = Column | FULL_TO_COLUMN(F[19]); F[18] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[19] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[20]; if (Single) { U[20] |= Single; int S = F[20] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[23], S); AN(F[26], S); } } if (CompF[21] != F[21]) { Shrink = FULL_TO_SHRINK(F[21]); if ((F[21] &= TblComplexMask[Shrink]) == 0) return false; CompF[21] = F[21]; int Column = FULL_TO_COLUMN(F[21]); int C1 = Column | FULL_TO_COLUMN(F[22]); int C2 = Column | FULL_TO_COLUMN(F[23]); F[22] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[23] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[21]; if (Single) { U[21] |= Single; int S = F[21] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[24], S); } } if (CompF[22] != F[22]) { Shrink = FULL_TO_SHRINK(F[22]); if ((F[22] &= TblComplexMask[Shrink]) == 0) return false; CompF[22] = F[22]; int Column = FULL_TO_COLUMN(F[22]); int C1 = Column | FULL_TO_COLUMN(F[21]); int C2 = Column | FULL_TO_COLUMN(F[23]); F[21] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[23] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[22]; if (Single) { U[22] |= Single; int S = F[22] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[25], S); } } if (CompF[23] != F[23]) { Shrink = FULL_TO_SHRINK(F[23]); if ((F[23] &= TblComplexMask[Shrink]) == 0) return false; CompF[23] = F[23]; int Column = FULL_TO_COLUMN(F[23]); int C1 = Column | FULL_TO_COLUMN(F[21]); int C2 = Column | FULL_TO_COLUMN(F[22]); F[21] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[22] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[23]; if (Single) { U[23] |= Single; int S = F[23] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[26], S); } } if (CompF[24] != F[24]) { Shrink = FULL_TO_SHRINK(F[24]); if ((F[24] &= TblComplexMask[Shrink]) == 0) return false; CompF[24] = F[24]; int Column = FULL_TO_COLUMN(F[24]); int C1 = Column | FULL_TO_COLUMN(F[25]); int C2 = Column | FULL_TO_COLUMN(F[26]); F[25] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[26] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[24]; if (Single) { U[24] |= Single; int S = F[24] & TblPalaceMask[Single]; AN(F[0], S); AN(F[3], S); AN(F[6], S); AN(F[9], S); AN(F[12], S); AN(F[15], S); AN(F[18], S); AN(F[21], S); } } if (CompF[25] != F[25]) { Shrink = FULL_TO_SHRINK(F[25]); if ((F[25] &= TblComplexMask[Shrink]) == 0) return false; CompF[25] = F[25]; int Column = FULL_TO_COLUMN(F[25]); int C1 = Column | FULL_TO_COLUMN(F[24]); int C2 = Column | FULL_TO_COLUMN(F[26]); F[24] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[26] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[25]; if (Single) { U[25] |= Single; int S = F[25] & TblPalaceMask[Single]; AN(F[1], S); AN(F[4], S); AN(F[7], S); AN(F[10], S); AN(F[13], S); AN(F[16], S); AN(F[19], S); AN(F[22], S); } } if (CompF[26] != F[26]) { Shrink = FULL_TO_SHRINK(F[26]); if ((F[26] &= TblComplexMask[Shrink]) == 0) return false; CompF[26] = F[26]; int Column = FULL_TO_COLUMN(F[26]); int C1 = Column | FULL_TO_COLUMN(F[24]); int C2 = Column | FULL_TO_COLUMN(F[25]); F[24] &= TblMaskSingle[Column] & TblMaskDouble[C2]; F[25] &= TblMaskSingle[Column] & TblMaskDouble[C1]; int Single = (TblShrinkSingle[Shrink] & TblColumnSingle[Column]) ^ U[26]; if (Single) { U[26] |= Single; int S = F[26] & TblPalaceMask[Single]; AN(F[2], S); AN(F[5], S); AN(F[8], S); AN(F[11], S); AN(F[14], S); AN(F[17], S); AN(F[20], S); AN(F[23], S); } } } return true; }

by **zhouyundong_2012** » Tue Apr 07, 2026 2:48 am

Code: Select all: #include <windows.h> #include <stdio.h> #include "sd.h" /* * First Step : Compile this project with __ProgramWritter used and Run it * Second Step: Replace the function Update in the bottom of this file with all contents of Update.c created by First Step in current directory * Third Step : Recompile this project without __ProgramWritter and Run it * * Warning: Can NOT comment __ProgramWritter until finish 3 steps! * * Good Luck! Shanghai China 2012/3/14 Zhou Yun Dong */ #define __ProgramWritter byte Board[81]; //Board CGame G[50]; //Guess Board int Index; //Guess Index int BlockMask[27]; //Block int BlockMaskSum[3]; //Sum of BlockMask int F[27]; //27-bit Full Mask int U[27]; //3-bit Unique in Palace int CompF[27]; //Use for Compare int CompColumn[27]; //Processed 9-bit column mask int CompShrink[27]; //Processed 9-bit shrink mask byte AllBoard[TotalBoard][81]; //All Boards int TestCaseNumber; //Actual Number of Test Case bool Update(); void Guess(); bool Rollback(); byte Calculate(); void InitSodoku() { Index = 1; BlockMaskSum[0] = BlockMaskSum[1] = BlockMaskSum[2] = 0; for (int B = 26; B >= 0; --B) { BlockMask[B] = U[B] = 0; CompF[B] = F[B] = -1; CompColumn[B] = -1; CompShrink[B] = -1; } //Get Full Mask for (int i = 80; i >= 0; --i) { if (Board[i] == '0') continue; int B = TblMult3[Board[i] - '1'] + TblBoard_Block[i]; //Get BlockMask and Sum of BlockMask BlockMask[B] |= TblBoard_BlockMask[i]; BlockMaskSum[TblBoard_Block[i]] |= BlockMask[B]; //Get Full Mask F[B] &= TblSelfMask[i]; F[TblAnother1[B]] &= TblOtherMask[i]; F[TblAnother2[B]] &= TblOtherMask[i]; U[B] |= TblBoard_GridUniq[i]; } for (int B = 26; B >= 0; --B) F[B] &= BlockMaskSum[TblRmder3[B]] ^ BlockMask[B] ^ BIT_SET_27; memcpy(G[0].F, F, sizeof(F)); } void Guess() { int Least = 9; for (int i = 0; i < 27; i++) { if (TblUniqFlag[U[i]]) continue; G[Index].Block = i; if (TblNumberOne[LOW_9_BIT(F[i])] == 2) { G[Index].Line = 0; goto FoundLeast; } if (TblNumberOne[MID_9_BIT(F[i])] == 2) { G[Index].Line = 1; goto FoundLeast; } if (TblNumberOne[HIGH_9_BIT(F[i])] == 2) { G[Index].Line = 2; goto FoundLeast; } } Least = 9; for (int i = 0; i < 27; i++) { for (int j = 0; j < 3; j++) { int NumberOne = TblNumberOne[HML_9_BIT(F[i], j)]; if (NumberOne > 1 && Least > NumberOne) { G[Index].Block = i; G[Index].Line = j; if ((Least = NumberOne) == 2) goto FoundLeast; } } } FoundLeast: int Left = HML_9_BIT(F[G[Index].Block], G[Index].Line); int Mask = Left & -Left; //Save Data G[Index].Mask = Mask; G[Index].Left = Left; for (int B = 26; B >= 0; --B) { G[Index].F[B] = CompF[B] = F[B]; G[Index].U[B] = U[B]; } F[G[Index].Block] &= TblSelfMask[TblTailZero[Mask] + TblMult9[G[Index].Line]]; CompColumn[G[Index].Block] = -1; CompShrink[G[Index].Block] = -1; Index++; } bool Rollback() { if (Index-- == 1) return false; //Restore Data int Block = G[Index].Block; int Line = G[Index].Line; for (int B = 26; B >= 0; --B) { CompF[B] = F[B] = G[Index].F[B]; U[B] = G[Index].U[B]; CompColumn[B] = FULL_TO_COLUMN(F[B]); CompShrink[B] = FULL_TO_SHRINK(F[B]); } G[Index].Left ^= G[Index].Mask; G[Index].Mask = G[Index].Left & -G[Index].Left; F[Block] &= TblSelfMask[TblTailZero[G[Index].Mask] + TblMult9[Line]]; CompColumn[Block] = -1; CompShrink[Block] = -1; //Whether The Last Guess Index += TblNumberOne[G[Index].Left] != 1; return true; } byte Calculate() { bool FoundAnswer = false; while (true) { if (Update()) { if (U[0] + U[1] + U[2] != 21 || U[3] + U[4] + U[5] != 21 || U[6] + U[7] + U[8] != 21 || U[9] + U[10] + U[11] != 21 || U[12] + U[13] + U[14] != 21 || U[15] + U[16] + U[17] != 21 || U[18] + U[19] + U[20] != 21 || U[21] + U[22] + U[23] != 21 || U[24] + U[25] + U[26] != 21) { Guess(); continue; } if (FoundAnswer) //Find Answer return MoreAnswer; FoundAnswer = true; } if (!Rollback()) //Backtrack Over return FoundAnswer ? UniqAnswer : NoAnswer; } } #ifdef __ProgramWritter void ProgramWritter() { char str[256]; FILE *fp = fopen("Update.c", "w+"); fputs("bool Update()\n", fp); fputs("{\n", fp); fputs("\tint Changed = 1;\n", fp); fputs("\twhile (Changed)\n", fp); fputs("\t{\n", fp); fputs("\t\tChanged = 0;\n", fp); for (int i = 0; i < 27; i++) { sprintf(str, "\t\tint Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i); fputs(str, fp); sprintf(str, "\t\tint Column%d = FULL_TO_COLUMN(F[%d]);\n", i, i); fputs(str, fp); sprintf(str, "\t\tif (CompShrink[%d] != Shrink%d)\n", i, i); fputs(str, fp); fputs("\t\t{\n", fp); sprintf(str, "\t\t\tint LastColumn%d = CompColumn[%d];\n", i, i); fputs(str, fp); sprintf(str, "\t\t\tif ((F[%d] &= TblComplexMask[Column%d]) == 0)\n", i, i); fputs(str, fp); fputs("\t\t\t\treturn false;\n", fp); fputs("\t\t\tChanged = 1;\n", fp); sprintf(str, "\t\t\tint Column%d = FULL_TO_COLUMN(F[%d]);\n", i, i); fputs(str, fp); sprintf(str, "\t\t\tCompColumn[%d] = Column%d;\n", i, i); fputs(str, fp); sprintf(str, "\t\t\tCompShrink[%d] = Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i, i); fputs(str, fp); sprintf(str, "\t\t\tif (LastColumn%d != Column%d)\n", i, i); fputs(str, fp); fputs("\t\t\t{\n", fp); sprintf(str, "\t\t\t\tint S1 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother1[i]); fputs(str, fp); sprintf(str, "\t\t\t\tint S2 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother2[i]); fputs(str, fp); sprintf(str, "\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S2];\n", TblAnother1[i], i); fputs(str, fp); sprintf(str, "\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S1];\n", TblAnother2[i], i); fputs(str, fp); fputs("\t\t\t}\n", fp); sprintf(str, "\t\t\tint Single = (TblShrinkSingle[Shrink%d] & TblColumnSingle[Column%d]) ^ U[%d];\n", i, i, i); fputs(str, fp); fputs("\t\t\tif (Single)\n", fp); fputs("\t\t\t{\n", fp); sprintf(str, "\t\t\t\tU[%d] |= Single;\n", i); fputs(str, fp); sprintf(str, "\t\t\t\tint S = F[%d] & TblPalaceMask[Single];\n", i); fputs(str, fp); fputs("\t\t\t\t", fp); for (int j = TblRmder3[i]; j < 27; j += 3) { if (j != i) { sprintf(str, "AN(F[%d], S); ", j); fputs(str, fp); } } fputs("\n", fp); fputs("\t\t\t}\n", fp); fputs("\t\t}\n", fp); fputs("\t\telse\n", fp); fputs("\t\t{\n", fp); sprintf(str, "\t\t\tif (CompColumn[%d] != Column%d)\n", i, i); fputs(str, fp); fputs("\t\t\t{\n", fp); sprintf(str, "\t\t\t\tint LastColumn%d = CompColumn[%d];\n", i, i); fputs(str, fp); sprintf(str, "\t\t\t\tif ((F[%d] &= TblComplexMask[Column%d]) == 0)\n", i, i); fputs(str, fp); fputs("\t\t\t\t\treturn false;\n", fp); fputs("\t\t\t\tChanged = 1;\n", fp); sprintf(str, "\t\t\t\tColumn%d = FULL_TO_COLUMN(F[%d]);\n", i, i); fputs(str, fp); sprintf(str, "\t\t\t\tCompColumn[%d] = Column%d;\n", i, i); fputs(str, fp); sprintf(str, "\t\t\t\tCompShrink[%d] = Shrink%d = FULL_TO_SHRINK(F[%d]);\n", i, i, i); fputs(str, fp); sprintf(str, "\t\t\t\tif (LastColumn%d != Column%d)\n", i, i); fputs(str, fp); fputs("\t\t\t\t{\n", fp); sprintf(str, "\t\t\t\t\tint S1 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother1[i]); fputs(str, fp); sprintf(str, "\t\t\t\t\tint S2 = Shrink%d | FULL_TO_SHRINK(F[%d]);\n", i, TblAnother2[i]); fputs(str, fp); sprintf(str, "\t\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S2];\n", TblAnother1[i], i); fputs(str, fp); sprintf(str, "\t\t\t\t\tF[%d] &= TblMaskSingle[Shrink%d] & TblMaskDouble[S1];\n", TblAnother2[i], i); fputs(str, fp); fputs("\t\t\t\t}\n", fp); sprintf(str, "\t\t\t\tint Single = (TblShrinkSingle[Shrink%d] & TblColumnSingle[Column%d]) ^ U[%d];\n", i, i, i); fputs(str, fp); fputs("\t\t\t\tif (Single)\n", fp); fputs("\t\t\t\t{\n", fp); sprintf(str, "\t\t\t\t\tU[%d] |= Single;\n", i); fputs(str, fp); sprintf(str, "\t\t\t\t\tint S = F[%d] & TblPalaceMask[Single];\n", i); fputs(str, fp); fputs("\t\t\t\t\t", fp); for (int j = TblRmder3[i]; j < 27; j += 3) { if (j != i) { sprintf(str, "AN(F[%d], S); ", j); fputs(str, fp); } } fputs("\n", fp); fputs("\t\t\t\t}\n", fp); fputs("\t\t\t}\n", fp); fputs("\t\t}\n", fp); } fputs("\t}\n", fp); fputs("\treturn true;\n", fp); fputs("}", fp); fclose(fp); } #endif bool ReadFile() { FILE *file = fopen("TestCase.txt", "r"); if (!file) { printf("Can not open TestCase.txt in current directory"); getchar(); return false; } TestCaseNumber = 0; while (fgets((char *)AllBoard[TestCaseNumber], 1024, file)) { if (++TestCaseNumber == TotalBoard) break; } fclose(file); return true; } bool main() { CreateTable(); #ifdef __ProgramWritter ProgramWritter(); return true; #endif if (!ReadFile()) return false; printf("Press Enter to Slove ..."); getchar(); //Set Highest Priority DWORD dwOldProcessP = GetPriorityClass(GetCurrentProcess()); DWORD dwOldThreadP = GetThreadPriority(GetCurrentThread()); SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS); SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL); //Get Frequency LARGE_INTEGER Frequency; ::QueryPerformanceFrequency(&Frequency); //Get Start Time LARGE_INTEGER start; ::QueryPerformanceCounter(&start); int Answer = 0; //Procedure of Calculate for (int i = 0; i < TestCaseNumber; i++) { memcpy(Board, AllBoard[i], 81); InitSodoku(); if ((Answer = Calculate()) != UniqAnswer) break; } //Get Stop Time LARGE_INTEGER stop; ::QueryPerformanceCounter(&stop); double spend = (double)(stop.QuadPart - start.QuadPart) / (double)Frequency.QuadPart; printf("Spend = %f us\n", spend * 1000 * 1000); printf("Total = %d\n", TestCaseNumber); printf("Average = %f us\n", spend * 1000 * 1000 / TestCaseNumber); if (Answer != UniqAnswer) printf("***************Result: %d ***************\n", Answer); //Restore Priority SetThreadPriority(GetCurrentThread(), dwOldThreadP); SetPriorityClass(GetCurrentProcess(), dwOldProcessP); return true; } #ifdef __ProgramWritter bool Update() { return true; } #else //open Update.c and copy to here #endif

-----------1. Just replace Update() 2. What is different compared to old version? FULL_TO_SHRINK is heavy and FULL_TO_COLUMN is light in old version. FULL_TO_SHRINK is light and FULL_TO_COLUMN is heavy in new version.

by **zhouyundong_2012** » Tue Apr 07, 2026 3:09 am

b26 b17 b08 / b25 b16 b07 / b24 b15 b06
b23 b14 b05 / b22 b13 b04 / b21 b12 b03
b20 b11 b02 / b19 b10 b01 / b18 b09 b00

Shrink: b18 b09 b00
Column: b06 b03 b00

by **zhouyundong_2012** » Wed Apr 08, 2026 2:15 am

Is the new code faster or slower on 17-clues?

by **champagne** » Wed Apr 08, 2026 5:22 am

I thought that you had the answer and that you was prepared to publish benchmarks.

I feel confortable with the current version and would make several comments:

Our benchmarks are often done with a single thread. Where we are, this can give a false view. Our computers run several threads in parallel and the bottleneck turns usually to the memory bus access. Intense use of lookup tables for example, can be good with one thread, but have a reverse effect in the parallel process. The overall effect of a change in not easy to weight.

To-day, on my side, a key use of the brute force is to test some multiple solution puzzles to get a list of unavoidable sets. The best code can be different.

Seen the very good speed of my code derived from yours, improving this code is not important, but I am curious to see in the next days what is new here.

by **zhouyundong_2012** » Wed Apr 08, 2026 7:28 am

I am also very curious on your very good speed. I hope to see the speed of current solvers of yours and others. Derived from ZSolver or not, that's not important. Speed Speed Speed!

by **dobrichev** » Wed Apr 08, 2026 12:55 pm

Hi zhouyundong,

The latest performance comparison to my knowledge is done by Tom Dillon here.

I agree with champagne that speed is meaningful only within the context of the task you are executing.
As an example, once I executed batches where exotic pencilmarks appeared, once in a thousands to millions solver calls, taking over 5 seconds per call.
After short experimenting, I changed the guess algorithm in my solver, halving the performance for the majority of puzzles, but doubling the overall performance, just because of the improvement in these exotic puzzles.

BTW it would be great if AI is capable to assist us in reading and understanding our own old code.

by **champagne** » Wed Apr 08, 2026 5:54 pm

My last public (and private) brute force code is the DLL. I had a look to the code.

After a quick overview on your recent posts, I think that the main difference is an implementation, on my side, of the 'last in cell'.

If I wanted to-day improve the code, I would do 2 things,

replace the preprocessor ' in line ' implementation by a classical C++ in line code, easier to read

Improve the 'last cell' research using what I am doing in my last 'quick solver'.

In this code, each cell receive at the start the 9 possible digits, and the list is updated at each assignment (<= 81 assignments)
9 '81 bits fields' help to follow cells with a low count.

A loop on cells to assign is very efficient in several parts of the path.

by **champagne** » Sat Apr 11, 2026 6:53 am

I tried to see what could be done on my last code, the DLL

https://github.com/GPenet/brute-force-solver-DLL

At the end, pre processor expansion seems a good way to optimize the code.
As I wrote, compared to the code posted above, the main point is the search of singles in cells and the use of cells in priority in the guess part of the code.

I think that there is room for an optimisation in a multi thread context. Optimization of the cache effect is not at all done in this code.
This can increase slightly the run time in a stand alone run, but should be better with a hevy parallel process.
I also remember having red that the recent computers are anticipating next instructions in some conditions.

I am curious to try something in this direction. As far as I can see, the process itself would be nearly unchanged.

The test to build is to run in parallel many threads (I have a computer with 16 threads and I often run 14 threads in parallel).
The same run (say around one hour) with the right sample of puzzles )should give a good idea of the comparison with my current process.

Anybody having skills in the computers organization is welcome to help me.

I should be in a position to run such a test in May.

by **zhouyundong_2012** » Sat Apr 11, 2026 9:10 am

The first one: AI. You also can let AI help you to analysis codes and optimize codes, and write documents.

by **champagne** » Sat Apr 11, 2026 1:12 pm

zhouyundong_2012 wrote:The first one: AI. You also can let AI help you to analysis codes and optimize codes, and write documents.

What says your tool on my code ?? :roll:

The New Sudoku Players' Forum

3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)