This post describes in detail the procedure of box marking, the first of two basic solving stages of Systematic Sudoku. Box marking identifies and marks slinks and naked subsets within boxes, as well as new clues. A walkthrough illustrates the dynamic action of box marking, guided by tracing rules that organize the process to capture every marking opportunity. The two-dimensional Sysudoku trace was introduced in January 2013.
The beginner’s page has snapshots of box marking in action, illustrating how box slinks, aligned triples and naked subsets reveal new clues. The previous post showed that slink marking eliminates many candidates that would otherwise clutter the grid. Here you will see box marking in motion, where the next step is determined, and the process just swings along.
Sysudoku box marking is very systematic. You follow a path that determines the next move. When the puzzle is expected to be hard, I make a record of this path, a two-dimensional trace. Sysudoku traces describe what is done, but not why. Traces make the solving reproducible and correctable. You read the trace by updating a puzzle grid as you follow the trace. We’ll follow a trace on our box marking walkthrough.
Let’s return to Sue de Coq’s celebrated post of October 24, 2005, titled “Two-Sector Disjoint Subsets”, to walk through the box marking of the puzzle. To see the grid on each move, just record each move on your own copy, as we go.
For each number, box marking finds every combination of dublex and crosshatch, using clues, slinks and aligned triples, that reduces the possible locations in every box to one or two, or three aligned cells. When that happens, we add clues or slinks or aligned triples to the box, and follow up by marking the effects of the addition. Boxes are scanned for these effects North to South, and in each band (row of boxes), West to East.
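For readers who think in code, the clue-only part of a crosshatch can be sketched as below. This is a minimal Python sketch under stated assumptions, not the full Sysudoku procedure: it sweeps with clues only (no slinks or aligned triples as sweeping lines), the grid is assumed to be a 9x9 list of rows with 0 for an empty cell, and the names BOXES, open_cells and classify are made up for illustration.

```python
# Box names in scanning order: North to South, and West to East within each band.
BOXES = ["NW", "N", "NE", "W", "C", "E", "SW", "S", "SE"]

def open_cells(grid, box, n):
    """Cells (row, col), 0-based, of `box` still open to n after a clue-only crosshatch."""
    b = BOXES.index(box)
    r0, c0 = 3 * (b // 3), 3 * (b % 3)
    rows = range(r0, r0 + 3)
    cols = range(c0, c0 + 3)
    # A box that already holds its n-clue needs no marking.
    if any(grid[r][c] == n for r in rows for c in cols):
        return []
    return [(r, c) for r in rows for c in cols
            if grid[r][c] == 0
            and n not in grid[r]                         # row sweep
            and all(grid[k][c] != n for k in range(9))]  # column sweep

def classify(cells):
    """Name the box marking event for a list of open cells."""
    if len(cells) == 1:
        return "clue"
    if len(cells) == 2:
        return "slink"
    if len(cells) == 3:
        same_row = len({r for r, _ in cells}) == 1
        same_col = len({c for _, c in cells}) == 1
        if same_row or same_col:
            return "aligned triple"
    return "no mark"

# Hypothetical grid: two 1-clues sweep the top two rows of the N box.
grid = [[0] * 9 for _ in range(9)]
grid[0][0] = 1
grid[1][8] = 1
print(classify(open_cells(grid, "N", 1)))  # aligned triple
```

Looping over BOXES for each number reproduces the scanning order described above. Feeding slinks and aligned triples back in as sweeping lines is left out of this sketch.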
As we go through the box marking, I’ll explain the Sysudoku tracing rules that make it a sequence all sysudokie trace writers follow.
The box marking trace is a set of lists, one for each number. Each list is actually a nested list of causes and effects. The top line for the number lists the effects from the dublexes and crosshatches into the boxes, or from fills, covered below. Effects are processed as causes from left to right. All of the effects, and effects of effects, of a cause are listed below it.
Top lists for 3, 6 and 7 are empty. That means that not all clues for the number are placed, but there are no effects to be recorded in the initial scan of numbers. Sometimes, when it is time to generate the top list, every box has its clue. In that case, the list header “n:” is omitted.
Now let’s verify the 1: list, from the initial grid. The sweep into NW leaves 6 cells for 1, but the crosshatch into N leaves two, for the slink Nm. The “m” stands for “marks”. We record this effect on the grid as “pencil” 1’s in the top left corner of the two cells reserved for the N 1-clue. The final step is to look at Nm as a cause of effects. Every effect gets its turn as a cause for more effects. Nothing here.
Continuing the scan of boxes for the 1: list, the crosshatch into C comes close, but no cigar. As you go through the boxes, you pick up every combination of two clues or slinks sweeping into one or two boxes. We get another slink when a crosshatch of clue and dublex leaves SEm. You decide where the marks go, and continue the marking on your copy, up to the first clue, the S4 effect.
Here is the grid, now box marked through the 7: list. The 2: list had one effect; the 3: list, none, when you arrive at S4. The absence of any lower case letters says this effect is a clue. It could have been marked as “Shs4” to say that it is a hidden single. It’s a hidden single in c5, meaning the only cell left for a 4. We do that only where the distinction is important. Likewise, trace readers don’t need to be told it’s a dublex.
There’s a lot to explain under the S4, however. The parentheses surround a list of effects. The indentation puts effect under cause. As a cause, S4 has two effects. One is that S4 completes a dublex into SW, cutting the 4-candidates from 5 to 2. The new slink SWm has no effects, and we move on to S3m. That’s the first fill effect. When we fill an S cell with 4, it cuts three 3-candidates in S to two. Slink!
Now we have a notational mystery to clear up. Why “SWm”, and not “SW4m”? Because we are processing the 4: list. Inside this list, we have to identify the number if it isn’t 4.
The second effect, when it becomes a cause, has an effect. As a cause, it cuts five 3-candidates to three aligned ones. Is that a fill or a dublex? It’s both, and you are about ready to not care which.
Now you probably want to know why SWm was listed first. To make everybody’s list consistent, we list effects in the same number as the cause first, so that they also come first as causes, and then the others, lower numbers first. There’s an overriding rule for fill effects. But you can forget about all of this until you become a sysudokie trace writer.
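The ordering and the naming convention can be put in a few lines of Python. This is only a sketch: the (box, number, kind) tuples and the function names are invented for illustration, the overriding rule for fill effects is not modeled, and it assumes from the S4 entry that clues always show their number while marks drop it inside their own number’s list.

```python
def order_effects(cause_number, effects):
    """Order immediate effects: the cause's own number first, then lower numbers first.

    Each effect is a (box, number, kind) tuple; kind is "" for a clue, "m" for a slink.
    """
    return sorted(effects, key=lambda e: (e[1] != cause_number, e[1]))

def label(effect, list_number):
    """Trace label for an effect written inside the `list_number:` list."""
    box, number, kind = effect
    # Assumption: marks omit the number when it matches the list; clues keep it.
    if kind and number == list_number:
        return box + kind
    return f"{box}{number}{kind}"

# The two effects of S4 from the walkthrough, given in the "wrong" order.
effects = order_effects(4, [("S", 3, "m"), ("SW", 4, "m")])
print([label(e, 4) for e in effects])  # ['SWm', 'S3m']
```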
But one more thing. When was S3m marked on the grid? It was marked as an effect of S4. All of the immediate effects are put in the trace list and marked on the grid at the same time. They are all there when the first one is inspected for effects. It often happens that an earlier cause steals an effect from a later effect on the list, before the latter becomes a cause, if you know what I mean, which you probably don’t quite yet. It’s OK when that happens. The important part is that no effect gets overlooked as a cause.
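If it helps to see that bookkeeping spelled out, here is a small Python sketch of the cause and effect walk. It is an illustration under assumptions, not the trace writer’s actual tool: the function names are hypothetical, find_effects stands in for the solver looking at the grid, and the labels A to E are placeholders rather than entries from the Sue de Coq trace.

```python
def trace(cause, find_effects):
    """Return indented trace lines, giving every effect its turn as a cause."""
    lines = []
    marked = {cause}  # everything already marked on the grid

    def visit(c, depth):
        lines.append("  " * depth + c)
        # All immediate effects are marked together, before any one of them
        # is examined as a cause.
        new = [e for e in find_effects(c) if e not in marked]
        marked.update(new)
        for e in new:  # left to right
            visit(e, depth + 1)

    visit(cause, 0)
    return lines

# Placeholder effects table: both B and C would produce D.
table = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"]}
for line in trace("A", lambda c: table.get(c, [])):
    print(line)
```

In this toy table, B “steals” D from C, so D appears under B only. The set of marked effects is what guarantees that nothing is listed twice and nothing is overlooked as a cause.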
Doing this on ©PowerPoint where erasures are no problem, I use this bookkeeping trick. I leave the clue effects listed after the first in pencil sized font on the grid, and then promote them to clue sized font when they become causes.
Now continue your trace reading to E5, where we are going to complete the effects list of C5 with the latest, and probably the last, addition to Sysudoku box marking. It’s called the 3-fill, and it now applies to c6.
When there is only one cell left for the missing number, or two cells left for two missing numbers, we fill them, with a clue or what we call a naked pair. With two clues, the overriding rule mentioned above is to put the number forced into its place first on the list. That gives priority to the forcing effects.
With three cells left, we can often fill them with:
In this case, it’s the second part. Once the 1 is forced into its place, the naked pair 67 is a 2-fill. Of course, we should have seen that before now, when Nm was marked. The point is that Sysudoku is now including three cells left among the limited set of events, like the wall or the square, to look at in box marking. The C5 cause triggers the 3-fill rule.
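The fill rule can be sketched in Python as well. This is a hedged illustration, not a definitive statement of the rule: the function name and the cell names are hypothetical (the post does not say which rows of c6 are open), and the input is assumed to be, for each empty cell of a line or box, the set of numbers that crossing clues still allow there.

```python
def fill(allowed):
    """Fill events for the empty cells of one line or box, forced clues first.

    `allowed` maps each empty cell to the set of numbers it can still hold.
    """
    cells = {cell: set(nums) for cell, nums in allowed.items()}
    events = []
    forced = True
    while forced:
        forced = False
        for n in sorted(set().union(*cells.values())):
            homes = [cell for cell, nums in cells.items() if n in nums]
            if len(homes) == 1:
                # n has only one home left, so it is forced into its place.
                events.append(("clue", homes[0], n))
                del cells[homes[0]]
                for nums in cells.values():
                    nums.discard(n)
                forced = True
                break
    if len(cells) == 2:
        remaining = sorted(set().union(*cells.values()))
        if len(remaining) == 2:
            events.append(("naked pair", tuple(cells), tuple(remaining)))
    return events

# A 3-fill like the one in c6: 1 is forced, leaving the naked pair 67.
print(fill({"r2c6": {1, 6, 7}, "r5c6": {6, 7}, "r8c6": {6, 7}}))
```

With two cells and two clues, the same loop lists the forced number first, which matches the priority given above to forcing effects.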
Now you’ll have no problem completing your grid for line marking in the next post. There are a couple of box marking events that did not come up in the classic Sue de Coq puzzle: the box/line and subsets beyond the naked pair mentioned above. They will be explained fully as they arise.
If you’re a beginner, you might just explore how far box marking alone can take you, before going into the even more systematic and “busier” line marking in the next post. Line marking is necessary for harder puzzles. It finds all the remaining candidates. But for many Sudoku, there aren’t any remaining candidates after Sysudoku box marking. If you do find some still standing, then you’ll be better motivated to “do the work” of line marking.
Going through this again and am understanding things better. I have a better understanding of the difference between the trace scheme you use and the one I have been using. Mine tended to show the results and yours the reason. Anyway, here is one point of confusion. I think I figured it out, but your other readers may be having issues. In your trace on the Sue de Coq in this post you start using ‘dx’ and ‘x’ notation, and while this is explained somewhat on this page itself, it is NOT explained in your Sysudoku Trace Page.
Also I see a couple of entries here that are either a mistake or I just don’t understand.
1. 2: dx46=>C2m (should it be dxc46 ?)
2. 8: (2nd part) dxc13=>8m – shouldn’t that be dxc13=>NWm
Where ‘dx’ = your dublex elimination, and ‘x’ = crosshatching notation
Anyway, I think an overt explanation in the Trace page should explain in detail the ‘x’ and ‘dx’ notation. Especially on ‘dx’ notation: is it always ‘dxc’ or ‘dxr’, or is there a case for just ‘dx’?
One other issue: I realize your Trace Notation is showing the reason for an event and that the actual results of that notation are shown on the grid. My problem with that is the sequence of the results is not explicit with that notation. So if you take your reason notation and add my result notation you get all of that without having a grid. This way a purely text description can explain both the reason and the results in sequence as they occur. There is a link – I will find it – that does exactly that. I will research it and get back with you. Also I appreciate your attempts at producing a consistent order to your trace – a canonical form if you will – but one area that is not clear is how to ‘order’ the multiple results of an action. E.G. the order of the items in the parenthetical expression. Is there a way to order them?
That link is: Sudoku step by step. It is NOT as comprehensive as your work, but it might give us ideas on how to document reason and results in some methodical fashion showing the sequence of both, without having to have a grid, e.g. an entirely text explanation.
I am a human solver. I have been making advances using Hodoku by Bernhard Hobiger :
Hodoku offers default puzzle levels: Easy, Intermediate, Hard, Unfair and Extreme. Each strategy is given a default score. Puzzle levels are then assigned based on the total score for all the strategies ‘needed’ to solve a given puzzle. All of the scores and assigned levels can be set by the user.
Up until recently I had thought that the distinctions between Hard and Unfair puzzles were quite appropriate. Using the defaults, Unfair puzzles contained Chains and ALS type strategies.
Hodoku offers the standard telephone keyboard configuration for unsolved cells.
I have just begun to study Sysudoku which takes a very different approach to unsolved cells. I am finding that even the basics of Sysudoku have a steep learning curve. I think part of this curve shape for me is because my understanding of links is poor. I have, until recently, been learning strategies without really clearly understanding their basis.
Nevertheless, it is my impression that Sysudoku may dramatically blur the distinctions between Hard and Unfair.
Thank you, Gordon. I’ll follow up on your excellent comment, and look into Hodoku puzzle ratings and solving strategies in the coming year.
I think when you are comfortable with strong vs weak links between candidates, it makes everything easier, and carries you a long way.
I remind all new readers to start at the beginning and work through those early posts. Even the beginner’s page has fundamental concepts too often missing.
Recent 2014 posts suggest doing box marking with clues only before slink marking, especially for easier puzzles. This process, the dublex bypass, changes nothing about box marking or line marking. It just bypasses some of the marking work for easier puzzles.