How to represent a sudoku with deleted candidates as string?

Everything about Sudoku that doesn't fit in one of the other sections

How to represent a sudoku with deleted candidates as string?

Postby Stephan Heuscher » Sat Feb 10, 2007 1:54 pm

Hi,

I was wondering if there is a standard way to represent a sudoku as a string.

The only thing I found was the standard 81-number string with the numbers one to nine indicating the set number-cells, but there's no way to store deleted candidates in this form.

Does any one know a solution?

Greets

Stephan

P.S. After some thought I've come up with http://heuscher.ch/SudokuHelp#encoding ... I'm sure there are better solutions.
Stephan Heuscher
 
Posts: 11
Joined: 19 February 2006

Postby daj95376 » Sat Feb 10, 2007 6:10 pm

See my reply in the Programmers Forum to the same question you asked there.
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby re'born » Sun Feb 11, 2007 12:51 am

daj,

I just read your response on the programmer's forum and I think you misunderstood the meaning of the post. Stephan is looking for a way to represent a Sudoku that allows the pencilmark progress to be conveyed. That is, how can one encode the current pencilmark grid (not just the solved cells) into a single line?

rep'nA
re'born
 
Posts: 551
Joined: 31 May 2007

Postby daj95376 » Sun Feb 11, 2007 2:07 am

rep'nA wrote:daj,

I just read your response on the programmer's forum and I think you misunderstood the meaning of the post. Stephan is looking for a way to represent a Sudoku that allows the pencilmark progress to be conveyed. That is, how can one encode the current pencilmark grid (not just the solved cells) into a single line?

rep'nA

I am often mistaken when reading posts.

Since I haven't seen anyone post a PM in a single-line format, I presented examples of the four most acceptable ways that I've seen grids presented. The last one was of a PM that uses multiple lines. Otherwise, he's free to devise any method he likes ... but it's going to be one loooooong line for Ruud's example of a PM with maximal number of candidates!
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby ravel » Sun Feb 11, 2007 3:21 am

daj95376 wrote:... Ruud's example of a PM with maximal number of candidates!
Please can you give me a link or post it ?
As i understood it, Stephan wants to encode a candidate grid in 81 characters and also found a way, how to do that. I think, it would be very useful, when we could post them line by line like sudokus.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby daj95376 » Sun Feb 11, 2007 5:31 am

ravel wrote:
daj95376 wrote:... Ruud's example of a PM with maximal number of candidates!
Please can you give me a link or post it ?
As i understood it, Stephan wants to encode a candidate grid in 81 characters and also found a way, how to do that. I think, it would be very useful, when we could post them line by line like sudokus.

http://forum.enjoysudoku.com/viewtopic.php?p=36986#p36986
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Clarification

Postby Stephan Heuscher » Sun Feb 11, 2007 12:05 pm

ravel wrote:...
As i understood it, Stephan wants to encode a candidate grid in 81 characters and also found a way, how to do that. I think, it would be very useful, when we could post them line by line like sudokus.


Thank you ravel for clarifying the issue. My aim is to capture the complete state of a sudoku in 81 characters so that it can easily be shared.

It does not have to be easily readable by humans (although this would not hurt), but it has to be understandable for humans. The main goal is that the complete state of a sudoku in progress can be captured and recreated.

For example, the sudoku

003 078 240
528 000 700
700 002 810
250 700 604
060 000 002
800 000 300
070 001 030
005 600 000
009 000 100

has three candidates in the first cell (1, 6, 9). All other candidates can be excluded by applying basic checks (see here). If a user decides to delete the "9" from the list of candidates, then this should be represented.

My proposition is to record the exclusion in the string resulting in

D03 078 240
528 000 700
700 002 810
250 700 604
060 000 002
800 000 300
070 001 030
005 600 000
009 000 100

(the first "0" has been switched to a "D"), see here.

Unless you are fluent in bit-manipulation and know all ISO 8859-1 code points, you will not see, that the "D" represents "100" in binary , i.e., the third candidate has been deleted...

Thank you for the encouraging replies.

Stephan
Stephan Heuscher
 
Posts: 11
Joined: 19 February 2006

re: "must be no more than 7 candidates per cell"

Postby Pat » Sun Feb 11, 2007 12:11 pm

Stephan Heuscher wrote:
I was wondering if there is a standard way to represent a sudoku as a string.

After some thought I've come up with




hey Stephan Heuscher, i was not aware of this requirement --

~ Pat
User avatar
Pat
 
Posts: 4056
Joined: 18 July 2005

Postby udosuk » Sun Feb 11, 2007 12:59 pm

The way I see it, you need a single character to represent the state of each cell, which has a total of 2^9=512 different possibilities (each of the 9 candidates being in or out). Every ASCII-byte can at most represent 2^8=256 different possibilities, so you need to perform some magical tricks to compress the 512 different possibilities into a byte of 8 bits...

I myself used a simple scheme of using 9 bits to represent each cell, which translate to a 324-chars octary-base representation, as shown in the link Danny provided...
udosuk
 
Posts: 2698
Joined: 17 July 2005

Re: re: "must be no more than 7 candidates per cell&quo

Postby Stephan Heuscher » Sun Feb 11, 2007 2:35 pm

Pat wrote:
Stephan Heuscher wrote:2. There must be no more than 7 candidates per number-cell (which is the case for any legal sudoku)

hey Stephan Heuscher, i was not aware of this requirement --


You caught me :-)...
BUT: Can you show me a single solution 9x9 sudoku with eight candidates in any number-cell? I've never seen one.

Stephan
Stephan Heuscher
 
Posts: 11
Joined: 19 February 2006

Postby Stephan Heuscher » Sun Feb 11, 2007 2:40 pm

udosuk wrote:The way I see it, you need a single character to represent the state of each cell, which has a total of 2^9=512 different possibilities (each of the 9 candidates being in or out). Every ASCII-byte can at most represent 2^8=256 different possibilities, so you need to perform some magical tricks to compress the 512 different possibilities into a byte of 8 bits...


You are totally right.

The trick that there is a maximum of 7 candidates (= 7 bits) per cell. If the sudoku only has one solution and the impossible numbers are removed, e.g., if there is a "3" in a row, then all 3s are removed from the candidate list in all cells in this row (same for column and block).

Stephan
Stephan Heuscher
 
Posts: 11
Joined: 19 February 2006

Re: re: "must be no more than 7 candidates per cell&

Postby udosuk » Sun Feb 11, 2007 3:07 pm

Stephan Heuscher wrote:BUT: Can you show me a single solution 9x9 sudoku with eight candidates in any number-cell? I've never seen one.

From the link Danny showed us, there is this puzzle:
Code: Select all
123456789 2356789   123456789 1235679   12356789  123456789 12456789  1245789   123456789
123456789 123456789 12456789  135679    123456789 123456789 12456789  15789     12456789
134567    23456789  12456789  1345679   123456789 123456789 12456789  145789    12456789
12456789  123456789 2456789   135679    12356789  12356789  1245679   13579     1235679
3456789   3456789   456789    12345679  12356789  123456789 123456789 123456789 12345679
2456789   256789    2456789   123456789 1256789   123456789 12456789  12345789  123456789
12346789  236789    246789    12345679  1256789   2346789   12456789  12456789  123456789
12346789  123456789 12456789  1345679   123456789 123456789 1456789   123456789 12345689
123456789 2356789   123456789 1345679   123456789 123456789 123456789 134589    12345689

This puzzle has candidate lists of 8 or 9 in length (a lot of them:) ), yet it has a unique solution...

Even if you're only looking for normal Sudoku with initial givens, there is this one from tso:
Code: Select all
....12.34
....56.78
.........
.........
41.....85
29.....67
.........
12..39...
98..47...

Where you'll see cells containing 9 candidates... Feed it into your solver, and check that it has a unique solution...

And also we often do variants such as Sudoku-X, Windoku, Killer Sudoku, etc and they usually have 13 initial givens or fewer and it's very likely you'll find cells with 8 or 9 candidates... If I were you I'd design a scheme which accomodates these variants as well...:idea:
udosuk
 
Posts: 2698
Joined: 17 July 2005

More than 7 candidates

Postby Stephan Heuscher » Sun Feb 11, 2007 11:40 pm

udosuk wrote:...
Code: Select all
....12.34
....56.78
.........
.........
41.....85
29.....67
.........
12..39...
98..47...

Where you'll see cells containing 9 candidates... Feed it into your solver, and check that it has a unique solution...

And also we often do variants such as Sudoku-X, Windoku, Killer Sudoku, etc and they usually have 13 initial givens or fewer and it's very likely you'll find cells with 8 or 9 candidates... If I were you I'd design a scheme which accomodates these variants as well...:idea:


Thanks you very much for your help!

I modified the encoding so that it now uses Chinese characters (starting with Unicode code point U5000) to encode the deleted numbers. What do you think of this:
Code: Select all
....12.34
....56.78
..倌......
.........
41.....85
29.....67
.........
12..39...
98..47...


It's the result of deleting the candidates 3 and 4 from the cell in the 3rd row/column, which has 9 possible candidates.

I'll admit that it's a bit far-fetched, but it works. Tell me what you think.

Stephan

For more information, see http://heuscher.ch/SudokuHelp
Stephan Heuscher
 
Posts: 11
Joined: 19 February 2006

Postby udosuk » Mon Feb 12, 2007 2:07 am

Personally I read and understand the Chinese characters well, so it looks less awkward to me...:)

But memory-wise, because each Unicode Chinese character is worth 2 ASCII-chars i.e. 16 bits, you're practically wasting 7 bits per cell, which isn't too efficient IMHO...

If you're only worrying about presenting this to the users in a human-readable way, it works well for me, but I'm not sure about others...:?:

Notice the English chars and Chinese chars are displayed in a different width on screen, so you probably should also use the Chinese full-width chars for the digits and the dots/zeros...
udosuk
 
Posts: 2698
Joined: 17 July 2005

Postby Smythe Dakota » Mon Feb 12, 2007 5:02 am

Here's a way to represent a Sudoku grid (including candidates) in about 60 bytes, on the average.

First, form a 9-digit number as follows: The first digit is the number of 1's among the givens, the next digit is the number of 2's, etc. For example, the 9-digit number 431232243 would mean that there are four 1's, three 2's, one 3, etc. among the givens.

A nine-digit number can be represented in 30 bits. Call this number the "header", and start your string with these 30 bits.

Next, we deal with the location of the givens. The first given 1 can be in any of the 81 cells, so it takes 7 bits to specify the location. The next 1 can be in any of 64 cells (delete the row and column the first 1 is in, never mind about boxes), so it takes 6 bits to specify this location. The locations of the remaining 1's can be specified in 6 bits (49 possible cells), 6 bits (36), 5 bits (25), 4 bits (16), 4 bits (9), 2 bits (4), and 1 bit (1). In our example, there are only four 1's, so all we need is 7+6+6+6, or 25 bits, to specify the locations of all the given 1's. Similarly, we then need only 19 bits to specify the locations of the three 2's, 7 bits for the one 3, etc. In our example, with a header of 431232243, it adds up to only 153 bits for all the givens.

Next come the candidates. These need be specified only for the remaining (empty) cells, and the scheme above knows which cells those are.

Theoretically, it would take 9 bits per cell to specify whether each digit (1 to 9) is a candidate in that cell. In practice, however, we can skip the "automatic non-candidates". By this I mean that, for example, the digit 4 is an automatic non-candidate for r5c6 if there is a 4 anywhere in either r5 and/or c6 and/or the center box. By now, which digits are automatic non-candidates is known to whatever goofy software has been devised to grok this scheme, so many bits can be saved. If, for example, the digits 4, 7, 9 are automatic non-candidates for the first empty cell, then the candidates for this cell can be represented in just 6 bits (one each for the digits 1, 2, 3, 5, 6, 8) instead of 9. Typically, an initial-position Sudoku will average about 4 automatic non-candidates per empty cell, so that only 5 bits will be required for each of the 60 or so empty cells. This comes out to 300 bits.

So we have a complete representation in 30+153+300, or 483 bits, which is 61 bytes.

This scheme will even work for illegal (multi-solution) grids, though the length of the string may be greater. A completely blank grid, for example, would require 759 bits, or 95 bytes.

Bill Smythe
Smythe Dakota
 
Posts: 564
Joined: 11 February 2006

Next

Return to General