dailysudoku.com

wkbcat · Joined: 11 Jun 2006 Posts: 2

hi-
this q may have been answered before, but i have just looked
in the forum for the first time and didn't see anything on first glance.

i have been doing these and my newspaper puzzles for some time now-
and have observed the following on many occasions:
my newspaper saturday puzzle is six-star (but like most ratings,
not very meaningful). often, when i enter it into the draw/play,
the dailysudoku program claims it is "too hard". then, i play it out
a dozen steps or so, checking the "grade" as i go. usually, at some
point, in ONE step the grade changes from "too hard" to "easy",
whereas as far as i am concerned, each step is roughly equally tricky
to determine.

obviously this question is more about the grading program than the
puzzles themselves-
any ideas?
thanks- -wkbcat

keith · Posted: Sun Jun 11, 2006 10:48 pm Post subject: Questions and answers

wkbcat,

The Saturday puzzle you refer to is routinely posted and discussed on this thread. Look for "DB Saturday Puzzle", followed by the date.

The solver on this site has a limited repertoire of solution methods. If it cannot solve a puzzle, it says it is "too hard". There have been numerous posts on this - I think samgj (the site owner) will probably direct you to some of them, or simply answer your question in more detail.

Briefly, the solver does the basics: naked and hidden singles, pairs, and line/box interactions. More recently, it also handles X-wings.
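To make the "limited repertoire" point concrete, here is a minimal Python sketch of a grader that knows only naked and hidden singles. It is not the dailysudoku.com solver; the names `candidates`, `find_single` and `grade`, and the two grade labels, are assumptions for illustration only. It grades whatever is left of the puzzle, so one manual step outside its repertoire can plausibly flip the result from "too hard" to "easy".

```python
from typing import List, Optional, Set, Tuple

Grid = List[List[int]]  # 9x9 grid, 0 marks an empty cell

# All 27 units (9 rows, 9 columns, 9 boxes) as lists of (row, col) pairs.
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(br + i, bc + j) for i in range(3) for j in range(3)]
       for br in (0, 3, 6) for bc in (0, 3, 6)]
)


def candidates(grid: Grid, r: int, c: int) -> Set[int]:
    """Digits not yet used in the row, column or box of an empty cell."""
    if grid[r][c] != 0:
        return set()
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def find_single(grid: Grid) -> Optional[Tuple[int, int, int]]:
    """Return (row, col, digit) for a naked or hidden single, else None."""
    # Naked single: an empty cell with exactly one candidate.
    for r in range(9):
        for c in range(9):
            cand = candidates(grid, r, c)
            if len(cand) == 1:
                return r, c, next(iter(cand))
    # Hidden single: a digit that fits in exactly one cell of a unit.
    for unit in UNITS:
        for d in range(1, 10):
            spots = [(r, c) for r, c in unit if d in candidates(grid, r, c)]
            if len(spots) == 1:
                return spots[0][0], spots[0][1], d
    return None


def grade(grid: Grid) -> str:
    """Grade the REMAINING puzzle using singles only (hypothetical labels)."""
    work = [row[:] for row in grid]  # do not modify the caller's grid
    while any(0 in row for row in work):
        move = find_single(work)
        if move is None:
            return "too hard"  # stuck: the repertoire is exhausted
        r, c, d = move
        work[r][c] = d
    return "easy"
```

Calling `grade` again after each manual move mimics checking the grade while playing: the answer depends only on the current position, not on how hard the earlier steps were.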

Keith

Ruud · Joined: 18 Jan 2006 Posts: 31

wkbcat,

there is no easy answer to your question about ratings, but I may have a little more to tell about this subject than Keith.

This belongs in the list of heavily debated sudoku issues, like T&E vs. logic & the use of uniqueness-based methods.

I've done a lot of research into sudoku rating, and took part in a number of discussions on other forums.

Ratings are supposed to indicate relative toughness of a puzzle. Ratings can therefore never be universal, but they can be meaningful to compare items in a single collection. A sudoku that I make is rated by my program. Relative toughness can be deduced from the ratings of 2 of my sudokus, but not from one of mine and one of Sam's.

Ratings should reflect relative toughness for human solvers, but instead they usually count the number of 'tricks' encountered by a computer program.

What you think is tough is not what I consider tough, because we do not solve these puzzles in the same way: we have different levels of expertise and different arsenals of tricks. A rating can only reflect the difficulty for the average sudoku solver. A slight change in your solving techniques can make a certain measuring system useless.

Some sites, like Mike Mepham's, use times returned by the players as input for ratings. There are 2 drawbacks. First, you cannot trust the data returned by the players. Many use bogus times or have used computer programs to solve the puzzle. The downward rounding due to personal pride also plays a role, but this affects all data equally. Second, you need to find some attributes in the puzzle that correlate with the solving time. Any statistician can tell you how easy it is to make errors in complex models, like humans solving a puzzle.

So, having considered all these issues, I have decided to use a different rating system.

It considers the following:
- Find the shortest path to solve the puzzle.
- For each step on this path:
- Measure the search domain (how many empty cells left, how many candidates available)
- Count the alternative moves, and give each move a difficulty rating.
- Discount points from moves that logically follow the previous move (same digit, same row, column, box, revealed by previous move)
- Search charge is calculated based on domain size divided by number of alternatives.
- Admin charge is calculated for techniques that require pencilmarks. However, as soon as one of these techniques is selected in the shortest path, the admin charge is dropped, because a human solver would keep pencilmarking right up to the end once it has been started.
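The list above can be sketched in Python roughly as follows. This is only one reading of the description, not Ruud's actual program: the `Step` fields, the constants `ADMIN_CHARGE` and `FOLLOW_DISCOUNT`, the choice of "empty cells plus candidates" as the domain size, and charging the admin cost once on the first pencilmark step are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

ADMIN_CHARGE = 10.0     # assumed one-off cost of starting pencilmarks
FOLLOW_DISCOUNT = 0.5   # assumed factor for moves that follow the previous one


@dataclass
class Step:
    empty_cells: int        # empty cells before this move
    candidates: int         # candidates available before this move
    alternatives: int       # how many alternative moves were available
    difficulty: float       # difficulty rating of the technique used
    follows_previous: bool  # same digit/row/column/box as the previous move
    needs_pencilmarks: bool


def search_charge(step: Step) -> float:
    """Domain size divided by the number of alternative moves."""
    domain = step.empty_cells + step.candidates  # assumed domain measure
    return domain / max(1, step.alternatives)


def rate_path(steps: List[Step]) -> float:
    """Sum the per-step charges along an (already found) shortest path."""
    total = 0.0
    pencilmarking = False  # becomes True once a pencilmark technique is used
    for step in steps:
        difficulty = step.difficulty
        if step.follows_previous:
            difficulty *= FOLLOW_DISCOUNT
        admin = 0.0
        if step.needs_pencilmarks and not pencilmarking:
            admin = ADMIN_CHARGE  # charged once, then dropped
            pencilmarking = True
        total += search_charge(step) + difficulty + admin
    return total
```

Finding the shortest path itself is the hard part and is left out here; `rate_path` only shows how the charges might combine once that path is known.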

Certain techniques set a lower boundary for the difficulty rating, because only a few people know how to use these techniques. This can sometimes produce skewed puzzles. There are sudokus that solve easily with singles up to a certain point and then require a hi-tech move. These puzzles I do not publish normally, but they make great technique examples or trainers. I have a benchmark list that is full of such puzzles.

To summarize:

I rate puzzles based on:

1. The number of steps needed to solve them
2. The estimated search time for each of these steps
3. The difficulty of each step (techniques)

cheers,
Ruud.

wkbcat · Joined: 11 Jun 2006 Posts: 2

thank you both-
at the suggestion, i did look down at the db saturday june 10
message just below. it was one of my examples, going from
too hard to easy in one step. it was a fun puzzle to solve.
-wkbcat