Programming a Sudoku solver

Date: 2/1/2021. By: David Luna

Tag(s):

Sudoku
Algorithms

Repository with complete source code can be found here

Sudoku and its complexity

Sudoku is a great game to play on its own, and it’s also a great to try to solve programmatically. The game has simple rules, but it also offers a challenge. The rules are simple:

There is a 9x9 grid with cells that can be filled with numbers from 1 to 9.
Each number cannot be repeated in the same row or column.
Each 3x3 subgrid must not have repeated numbers in it.

An example starting grid:

(Image taken from Wikipedia’s Sudoku Article )

The most simple solution would be to use a bruteforce approach to fill in the blanks in the grid, but this would not that optimal and it would take a long time with the harder puzzles that have fewer filled cells at the beginning. Considering that there are 81 cells in the grid, and each one of them can have any number from 1 to 9, it can be assumed that the worst case complexity for the puzzle taking the bruteforce approach is of O(81^9). Of course, the number of free cells in the grid at the beginning is lower than 81, but trying all of the possible numbers in each cell is still a problem, since it leads to a lot of dead ends and wasting time trying solutions that don’t work or even make sense.

There are certain steps that can be taken to mitigate this:

Check for violations of the rules when assigning the value of a cell to avoid expanding any further.
Do some backtracking after violations of the rules are detected to avoid expanding a branch that does not lead to an answer.
Limit the possible values of a cell to reduce the domain and complexity of the problem.

Sudoku and Constraint Satisfaction Problems

Constraint Satisfaction Problems (CSPs for short) are problems whose states must meet a set of requirements to be considered valid. In a sense, the steps mentioned previously are that, a way to establish constraints that must be met in order to have a Sudoku grid that makes sense and doesn’t break the rules. When defining a CSP it’s important to determine the variables of the problem, the domain of each one of these and the constraints the problem. In the case of sudoku, it looks like this:

The variables are each cell of the 9x9 grid: X = {X1, X2, ..., X81}.
The domain of each variable is the set of numbers from 1 to 9, so D = {1,2,3,4,5,6,7,8,9}.
The constraints are just the rules of the game: two variables in the same row or column can’t have the same value and no two variables in the same 3x3 subgrid can share the same value.

Once that the problem has been modeled as a CSP, one can take one of many different approaches to solve it. To define the sudoku CSP the following data structures are needed:

using VarType = Loc;
using VarValType = int;

class Loc {
  public:
	uint8_t row, col;
	Loc(uint8_t x, uint8_t y);
};

bool operator<(const Loc &loc1, const Loc &loc2);
bool operator==(const Loc &loc1, const Loc &loc2);

class CSP {
  public:
	std::map<VarType, std::set<VarValType>> domains;
	std::map<VarType, std::vector<VarType>> arcs;

	CSP();
	void printCSP(bool prettyPrint);
	void doBacktracking();
	void readCSP();

  private:
	std::vector<VarType> sortedVars; // For use in the backtracking solution

	bool isVarConsistent(const VarType &var);
	void buildArcs(const VarType &var);
	bool solutionHelper(unsigned long depth);
};

Note: the use of operators < and == for the class Loc is needed to use the class as key value for map and set. The implementation can be found here .

The class Loc is helpful to have a the coordinates (x,y) to locate each cell in the 9x9 grid of the game. CSP has the following public members:

domains defines the set of values that each cell on the grid can take.
arcs defines which cells in the grid have a constraint with each other.

The CSP constructor defines the arcs of each cell in the sudoku grid. It goes through every position and it calls the buildArcs function. The called function adds every variable in the same row, column and 3x3 subgrid to the set of neighbors of the cell at loc.

const uint8_t nRows = 9;
const uint8_t nCols = 9;
const uint8_t sSize = 3;

CSP::CSP() {
	for (uint8_t i = 0; i < nRows; ++i) {
		for (uint8_t j = 0; j < nCols; ++j) {
			Loc tempLoc(i, j);
			arcs.insert(std::make_pair(tempLoc, std::vector<Loc>{}));
			buildArcs(tempLoc);
		}
	}
}

void CSP::buildArcs(const Loc &loc) {
	for (uint8_t i = 0; i < nRows; ++i) {
		if (i != loc.row) {
			arcs.find(loc)->second.push_back(Loc(i, loc.col));
		}
	}
	for (uint8_t j = 0; j < nCols; ++j) {
		if (j != loc.col) {
			arcs.find(loc)->second.push_back(Loc(loc.row, j));
		}
	}
	uint8_t sRow = loc.row / sSize;
	uint8_t sCol = loc.col / sSize;
	for (uint8_t i = sRow * sSize; i < (sRow + 1) * sSize; ++i) {
		if (i != loc.row) {
			for (uint8_t j = sCol * sSize; j < (sCol + 1) * sSize; ++j) {
				if (j != loc.col) {
					arcs.find(loc)->second.push_back(Loc(i, j));
				}
			}
		}
	}
}

The AC-3 Algorithm

The AC-3 Algorithm is an algorithm that helps to reduce the domain of variables by establishing the arc consistency of each constraint. The algorithm checks every constraint between a pair variables (x,y) and removes the values of the domain of x and y that don’t meet said constraint. In this case, the algorithm is useful to reduce the possible values of each cell from {1,2,...,9} to only those that don’t break the rules.

With the CSP defined, the AC-3 algorithm uses two functions to reduce the domains of each variable and ensure that constraints have been met.

ac3Algorithm: this function creates a queue which stores all of the pairs (x,y) of variables that share a constraint. Afterwards, it enters a loop that calls the function removeInconsistencies with each pair on the queue. If the function call returns true, then all of the neighbors of x are inserted in the queue with the pair (neighbor, x). The loop ends when the queue has been emptied.
removeInconsistencies: this function receives loc1 and loc2, which are x and y from ac3Algorithm respectively. It checks if any value in the domain of loc1 does not meet the constraint with loc2. With the sudoku problem, such a violation would be perceived when the domain of loc2 is of size 1, and the only value in its domain is in the domain of loc1. Since the domain of loc2 can’t be empty, the value is removed from the domain of loc1. If no violations are found, the functions returns false, or true otherwise.

bool removeInconsistencies(CSP &sudo, Loc loc1, Loc loc2) {
	std::set<int> *currSet = &sudo.domains.find(loc2)->second;
	for (auto curr : sudo.domains.find(loc1)->second) {
		if (currSet->size() == 1 && currSet->count(curr)) {
			sudo.domains.find(loc1)->second.erase(curr);
			return true;
		}
	}
	return false;
}

void ac3Algorithm(CSP &csp) {
	std::queue<std::pair<Loc, Loc>> arcQueue;
	for (auto cell : csp.arcs) {
		auto loc = cell.first;
		if (csp.domains.find(loc)->second.size() > 1) {
			for (auto neighbor : cell.second) {
				arcQueue.push(std::make_pair(loc, neighbor));
			}
		}
	}

	while (!arcQueue.empty()) {
		std::pair<Loc, Loc> curr = arcQueue.front();
		arcQueue.pop();
		if (removeInconsistencies(csp, curr.first, curr.second)) {
			for (auto neighbor : csp.arcs.find(curr.first)->second) {
				arcQueue.push(std::make_pair(neighbor, curr.first));
			}
		}
	}
}

Backtracking and Minimum Remaining Values Heuristic

Now that the domain of each variable has been reduced to meet the constraints of the problem, the next step is to apply backtracking to solve the problem. It can be noted that in some cases applying the AC-3 algorithm will be enough to solve the problem, but not in the hardest ones.

A useful heuristic to use with backtracking is the Minimum Remaining Values Heuristic. When using it, the nodes with the smallest domain are explored first, leaving the variables with big domains at the end of the tree. When an constraint violation is detected while doing the backtracking, a larger amount of branches will be cut off with this heuristic.

To use it, the variables with a domain bigger than one must be sorted according to the size of their domains. The vector sortedVars will be used to keep track of which variables have the smallest domains. The function call to std::sort with the help of the comparator function locComp takes care of this sorting step. Afterwards, the recursive helper function is called to start the process of backtracking.

void CSP::doBacktracking() {
	auto locComp = [this](const Loc &loc1, const Loc &loc2) {
		return domains.find(loc1)->second.size() < domains.find(loc2)->second.size();
	};

	for (auto loc : domains) {
		sortedVars.push_back(loc.first);
	}
	std::sort(sortedVars.begin(), sortedVars.end(), locComp);
	solutionHelper(0);
}

The solutionHelper function is a recursive function that receives a depth or index from the sortedVars vector. It selects the loc variable from that index and uses it to establish the domain of it. The steps are the following:

If the depth or index passed is greater or equal to the size of sortedVars, return true.
Select the variable loc and create a temporary domain set with the current domain of it.
Iterate through the original domain of loc, one variable at a time.
Take the value var and make it the only one in the domain of loc, while also removing it from the neighboring variables that have a constraint with loc.
If there are no inconsistencies, i.e. a neighbor with an empty domain and the recursive call to solutionHelper returns true, return true.
If the previous step failed, return the value of var to the domains of the neighboring variables.
If no variable in the original domain of loc can be used to set the value of the cell without violating any constraints, return false.

bool CSP::solutionHelper(unsigned long depth) {
	if (depth >= sortedVars.size()) {
		return true;
	}
	const Loc *loc = &sortedVars.at(depth);
	std::set<int> domain = domains.find(*loc)->second;
	for (int var : domain) {
		domains.find(*loc)->second = std::set<int>{var};
		std::vector<bool> wasErased;
		for (auto neighbor : arcs.find(*loc)->second) {
			auto *neighborDomain = &domains.find(neighbor)->second;
			if (neighborDomain->count(var)) {
				neighborDomain->erase(var);
				wasErased.push_back(true);
			} else {
				wasErased.push_back(false);
			}
		}
		if (isVarConsistent(*loc) && solutionHelper(depth + 1)) {
			return true;
		}
		int i = 0;
		for (auto neighbor : arcs.find(*loc)->second) {
			if (wasErased.at(i++)) {
				domains.find(neighbor)->second.insert(var);
			}
		}
	}
	domains.find(*loc)->second = domain;
	return false;
}

Putting it all together

The only remaining step is defining the readCSP and printCSP functions. For readCSP, the code is the following:

std::set<int> initialDomain = {1, 2, 3, 4, 5, 6, 7, 8, 9};

void CSP::readCSP() {
	for (uint8_t i = 0; i < numRows; ++i) {
		std::string temp;
		std::cin >> temp;
		for (unsigned long int j = 0; j < temp.size(); ++j) {
			uint8_t curr = temp.at(j) - '0';
			if (curr) {
				domains.insert(std::make_pair(Loc(i, j), std::set<int>{curr}));
			} else {
				domains.insert(std::make_pair(Loc(i, j), initialDomain));
			}
		}
	}
}

Printing the CSP is a matter of iterating through domains and print the domain of each cell. The implementation can be found here

With all of this implemented, the way to put it all together is as simple as this:


int main() {
	CSP csp;

	csp.readCSP();
	ac3Algorithm(csp);
	csp.doBacktracking();
	std::cout << "Output:\n";
	csp.printCSP(true);

	return 0;
}