eleven wrote: Hi champagne,
champagne wrote: Based on your experience, what would be the price to pay to generate a pseudo-unbiased sample for higher ratings?
I don't know how one could do that in reasonable time without an incalculable bias.
I have serious doubts that a sample with such a low count can be used to draw any conclusions in that area.
Puzzles with a higher rating are so rare that the overall percentage of JE puzzles among ER 8.5+ or ER 9+ puzzles should be very near to that in the sample.
My idea had been that you compare just the puzzles with fixed ERs, like 8.3 and 9.0, in your sets and mine. If there had been a good correlation, then e.g. your percentages for 10.0 puzzles would also likely have been near those of random puzzles. However, the correlation seems to be very bad, even for fixed ERs.
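To make the comparison concrete, here is a minimal sketch of what I mean, with made-up ER values and made-up JE percentages for the two sets (none of the numbers come from the real data), computing the Pearson correlation between the two series.

Code:
# Hypothetical percentages of JE puzzles at a few fixed ERs,
# in two independently generated sets (all numbers invented).
ers   = [8.3, 8.6, 8.9, 9.0]
set_a = [0.12, 0.30, 0.45, 0.50]   # percentages in one sample
set_b = [0.10, 0.25, 0.60, 0.80]   # percentages in the other sample

n = len(ers)
mean_a = sum(set_a) / n
mean_b = sum(set_b) / n
cov   = sum((a - mean_a) * (b - mean_b) for a, b in zip(set_a, set_b))
var_a = sum((a - mean_a) ** 2 for a in set_a)
var_b = sum((b - mean_b) ** 2 for b in set_b)
r = cov / (var_a * var_b) ** 0.5   # Pearson correlation coefficient

print("correlation r = %.3f" % r)  # r near 1 would support extrapolating to 10.0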
I am no expert in statistics, but I remember from my school days that assessing the value of a sample is a tough task.
Here it's clear that the sample has no value for ratings above 9.2, and likely close to none for ratings 9.0 to 9.2.
I can easily share your view for ratings below 9.0. The generation of puzzles in the vicinity of the potential hardest is heavily biased as soon as you leave the high-ratings area: the greater the distance to the cut-off, the greater the bias.
I have started the task of rating the whole file of the so-called "grey area" in my sample. This is at least a one-month task, but at the end I'll be in a position to produce a two-dimensional table giving the observed frequency ratio in the sample per rating and number of clues.
This could give a better idea of how far apart our two approaches are.
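Just as a sketch of what such a tally could look like: assuming a hypothetical input file ("grey_area_rated.txt", one puzzle per line in the invented form "ER;clue_count" — the real file format is surely different), this would build the table of counts per rating and clue count.

Code:
from collections import Counter

# Hypothetical input: one line per rated puzzle, e.g. "9.0;22".
table = Counter()
with open("grey_area_rated.txt") as f:      # file name is an assumption
    for line in f:
        er, clues = line.strip().split(";")
        table[(float(er), int(clues))] += 1

ratings     = sorted({r for r, _ in table})
clue_counts = sorted({c for _, c in table})

# Print the rating x clue-count grid of observed counts.
print("ER\\clues", *clue_counts, sep="\t")
for r in ratings:
    print(r, *(table[(r, c)] for c in clue_counts), sep="\t")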
EDIT
I don't know what you have in mind when you say
"However, the correlation seems to be very bad, even for fixed ERs,"
but it's clear to me that serate's ratings from 7.6 to 8.2 are over-evaluated (lack of group handling).
The true barrier is 8.3 for puzzles requiring the "region or cell" conflict.
In the range 8.6 to 9.0, serate IMO behaves homogeneously.