I continued to work on zhouyundong_2012's code, starting from where I left off in the last post.
a) a file of 4.847.466 puzzles rated 1.2 to 4.4, with the pattern of a past game
fss: 1mn 10, against 1mn 18 for zhouyundong_2012's code
I made two improvements:
1) replacing the Update() loop with expanded (unrolled) code, which avoids the loop overhead of the original version
2) applying a similar change to the guess selection, so that a bi-value cell is picked in priority
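
To illustrate what step 1 means in general, here is a minimal sketch of a loop and its expanded equivalent. It is not the actual Update() code: the State layout (27 words, as 9 digits x 3 bands), the UpdateOne() helper and the mask argument are all hypothetical and only serve to show the transformation.

```cpp
#include <array>
#include <cstdint>

// Hypothetical state: 27 words (9 digits x 3 bands). NOT the real solver layout.
using State = std::array<uint32_t, 27>;

// Hypothetical per-word step: clears the bits of `mask` in `w`
// and reports whether the word changed.
inline bool UpdateOne(uint32_t &w, uint32_t mask) {
    uint32_t old = w;
    w &= ~mask;
    return w != old;
}

// Loop form: one counter test and one index increment per word.
bool UpdateLoop(State &s, uint32_t mask) {
    bool changed = false;
    for (int i = 0; i < 27; ++i) changed |= UpdateOne(s[i], mask);
    return changed;
}

// Expanded form: the same 27 calls written out through macros,
// with constant indices and no loop counter.
#define UPD(i)  changed |= UpdateOne(s[i], mask);
#define UPD3(i) UPD(i) UPD(i + 1) UPD(i + 2)
#define UPD9(i) UPD3(i) UPD3(i + 3) UPD3(i + 6)

bool UpdateExpanded(State &s, uint32_t mask) {
    bool changed = false;
    UPD9(0) UPD9(9) UPD9(18)
    return changed;
}
```

Both functions give the same result on the same input. Whether the expanded form is actually faster depends on the compiler and on the real loop body, so it has to be measured.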
The first step cut the runtime from 1mn 18 to 53 seconds, which is now faster than my version of fss.
The gain from the second step was very small (about 1 second).
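
For readers not familiar with the idea behind step 2, here is a minimal sketch of a guess selection that prefers a bi-value cell (a cell with exactly two candidates left). It only shows the selection rule, not the expanded code of the real solver. The layout (one 9-bit candidate mask per cell) and the names Cands and PickGuessCell are assumptions made for the example.

```cpp
#include <array>
#include <bitset>
#include <cstdint>

// Hypothetical layout: one 9-bit candidate mask per cell, 81 cells.
using Cands = std::array<uint16_t, 81>;

// Returns the index of the cell to guess on, or -1 if every cell
// has at most one candidate left.
int PickGuessCell(const Cands &c) {
    int best = -1;
    int bestCount = 10;  // more than any possible candidate count
    for (int i = 0; i < 81; ++i) {
        int n = static_cast<int>(std::bitset<16>(c[i]).count());
        if (n == 2) return i;  // bi-value cell: take it immediately
        if (n > 2 && n < bestCount) {  // otherwise remember the smallest cell
            best = i;
            bestCount = n;
        }
    }
    return best;
}
```

A guess on a bi-value cell gives only two branches, and if the first one fails the second is forced, which is why such cells are usually the preferred choice.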
So the new solver, with this heavily optimised central loop, appears to be about 20% faster than fss. The ratio is slightly better (a little more than 30%) on the file of potential hardest.
I still have some cleaning to do to get my final version in C++, but I do not expect a significant effect on performance.