And now, detailed answers to Mladen.
dobrichev wrote:
To me it is obvious that more UA sets are required for efficient scanning, more than those of size 12 used for 16-clue search.
I have some ideas for extremely fast UA collection, maybe similar to McGuire's process up to some level.
I share your views, although this is not a clear point in McGuire's article.
I have not worked that much on optimizing the data collection; I fall back to the brute-force solver when the list of UAs is exhausted, which is a reasonable starting point in terms of performance.
The code itself will have to be improved if the number of such calls gets too big.
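To make the fallback concrete, here is a minimal sketch in the spirit of McGuire's process; countSolutions() and extractNewUA() are hypothetical placeholders for the brute-force solver, not the actual code:
[code]
#include <cstdint>
#include <vector>

int countSolutions(uint64_t clues, int limit);   // brute force, stops at 'limit' solutions
uint64_t extractNewUA(uint64_t clues);           // cells where two solutions differ

struct Node {
    std::vector<uint64_t> uas;  // still-unhit UAs, one 64-bit cell mask each
    uint64_t clues;             // cells already chosen as givens
};

// Called when the UA list of the current node is empty.
// Returns true if the clue set is a valid (unique-solution) puzzle.
bool onUaListExhausted(Node& node) {
    if (countSolutions(node.clues, 2) == 1)
        return true;                             // valid puzzle found
    // Multiple solutions: the cells where two of them differ form a new
    // unavoidable set, so the search continues instead of stopping here.
    node.uas.push_back(extractNewUA(node.clues));
    return false;
}
[/code]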
One problem is that the number of UAs of size > 12 grows very fast, so there is an optimization trade-off here:
more UAs added -> more work to optimize the cell priority (in that process)
fewer UAs added -> more brute-force calls later
Another optimization factor is "how far upstream do you enter the added UAs" into the ancestor UA tables.
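To illustrate that knob, a minimal sketch assuming one UA table per search depth; uaTables and UPSTREAM_LEVELS are names I invent here for illustration, not the actual code:
[code]
#include <cstdint>
#include <vector>

constexpr int UPSTREAM_LEVELS = 3;  // tuning: 0 = enter the UA in the current node only

// One UA table per depth; uaTables[d] is what depth d restarts from
// when a sibling branch is explored.
std::vector<std::vector<uint64_t>> uaTables;

void addUaUpstream(uint64_t ua, int depth) {
    // The new UA is unhit by the current clues, hence also by every
    // ancestor clue set (they are subsets), so upstream insertion is safe.
    int lowest = (depth > UPSTREAM_LEVELS) ? depth - UPSTREAM_LEVELS : 0;
    for (int d = depth; d >= lowest; --d)
        uaTables[d].push_back(ua);
}
[/code]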
One thing I noticed in the tests is that sometimes (late in the process) one of the added UAs can be "dead", killing the branch.

dobrichev wrote:
Do you have statistics on how the existing 17s deviate from the minimal number of clues required for band completion,
for the band with the highest minimal clue count (the criterion you used to reorder the catalogue)?
Such statistics are easy to collect, and if for half of the 17s the number of clues matches, then this order can help.
If almost all of the 17s have "redundant" clues in this band, I see no reason to prefer it as a starting point.
Or, at least, the correlation between the preferred band and the band with maximal clues in the 17s can be investigated.
Just another answer to that, to explain why I think this is not relevant.
The true question IMO, considering the planned 3-step strategy, is how many "sets of given" solving bands 1+2 will be below the limit.
I have a provisional answer out of my ongoing test.
8% of SGs (sets of given) require one more clue than the minimum;
2% require 2 more clues;
the rest is peanuts (3 and even 4 more clues).
In that example, the 17 is part of the 8%.
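For reference, such a distribution is cheap to tally; a minimal sketch, where SG, cluesUsed and minClues are hypothetical names for illustration only:
[code]
#include <cstdio>
#include <vector>

struct SG { int cluesUsed; int minClues; };  // hypothetical record per set of given

void tallyExcess(const std::vector<SG>& sgs) {
    if (sgs.empty()) return;
    int excess[5] = {0, 0, 0, 0, 0};
    for (const SG& sg : sgs) {
        int e = sg.cluesUsed - sg.minClues;  // clues above the bands 1+2 minimum
        if (e > 4) e = 4;
        ++excess[e];
    }
    for (int e = 0; e <= 4; ++e)
        printf("+%d clue(s): %.1f%%\n", e, 100.0 * excess[e] / sgs.size());
}
[/code]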
dobrichev wrote:
Also, it is still unclear how the loops should be arranged so that as much as possible of the collected data remains reusable for the next grids. Isn't it?
IMO, this is clearly achieved by calling steps 2 and 3 for each solution of {bands 1+2}.
It is too early for me to enter into that task; a preliminary feasibility step is to get some timing on phase 1, the production of valid bands 1+2.
A grid producing a 17 is generally a worst case, so I intend to apply phase 1 as soon as possible to the 1/10^6 sample of the catalog.
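To fix ideas, here is a minimal sketch of the loop nesting I have in mind; all the types and function names are placeholders for illustration, not the actual code:
[code]
#include <vector>

struct Grid {}; struct CollectedData {};
struct Bands12Solution {}; struct Band3Set {};

// Placeholders for the real routines:
CollectedData collectUAs(const Grid&);
std::vector<Bands12Solution> solveBands12(const Grid&, const CollectedData&);
std::vector<Band3Set> step2Band3Candidates(const Grid&, const Bands12Solution&,
                                           const CollectedData&);
void step3CheckValidity(const Grid&, const Bands12Solution&, const Band3Set&);

void scanGrid(const Grid& grid) {
    CollectedData data = collectUAs(grid);    // collected once, reused below
    // Phase 1: enumerate the valid clue sets of bands 1+2.
    for (const Bands12Solution& b12 : solveBands12(grid, data))
        // Steps 2 and 3 run once per bands 1+2 solution, reusing 'data'.
        for (const Band3Set& b3 : step2Band3Candidates(grid, b12, data))
            step3CheckValidity(grid, b12, b3);
}
[/code]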
dobrichev wrote:
The cell selection strategy is somewhat controversial. What is the benefit of a faster reduction of the unhit UA list against the opposite
- leaving as many UAs as possible unhit, in order for the impossibility of covering all of them to be detected earlier?
This is an open point. In my case, the expected benefits are the following (see the sketch after this list):
- shrinking the table is faster. Just a remark: with a 64-bit description of a UA, copying the UAs still active into the new table is likely the fastest way to proceed. Revising the cell order is another question and could be changed.
- with more dead cells, the early filter that cancels a branch should work better. This is an alternative view to the higher-degree UAs in the McGuire process.
- going quickly to an empty UA list, we should have (again, in the strategy followed here) fewer calls to "look for more UAs".
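As an illustration of the first two points, a minimal sketch assuming each UA is stored as a single 64-bit cell mask (the 54 cells of bands 1+2 fit in one word); the function names are mine:
[code]
#include <cstdint>
#include <vector>

// Copy into 'next' the UAs of 'current' not hit by the newly chosen clue:
// a single linear pass over 64-bit words, hence a cheap shrink.
void shrinkUaTable(const std::vector<uint64_t>& current,
                   std::vector<uint64_t>& next, uint64_t newClueBit) {
    next.clear();
    for (uint64_t ua : current)
        if (!(ua & newClueBit))      // UA still unhit: keep it
            next.push_back(ua);
}

// Early filter: a UA lying entirely inside the dead cells can never be
// hit in this branch, so the branch can be cancelled at once.
bool branchIsDead(const std::vector<uint64_t>& uas, uint64_t deadCells) {
    for (uint64_t ua : uas)
        if ((ua & ~deadCells) == 0)  // every cell of this UA is dead
            return true;
    return false;
}
[/code]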