Skip to main content davidlunadeleon 

Programming a Sudoku generator


Date: 4/2/2021. By: David Luna


  • Repository with complete source code of the project can be found here

Picking up where we left last time

In my previous post I wrote about a Sudoku solver made in C++, which used the AC-3 Algorithm to achieve arc consistency, and then used the Minimum Remaining Values heuristic to guide the search towards a solved Sudoku given a certain input. If you have not read the post, I recommend it as a starting point that will give context to this post. It can be found here

Working smarter, not harder

When faced with the question on how to generate a new sudoku from scratch, it can be hard to come up with a technique that makes sure that the puzzle generate will be valid and solvable. We have to take into account the constraints of the problem, just like we did with the solver algorithm. A cell in the grid can’t have the same value as another cell in the same row, column or 3x3 subsquare.

The good thing is that we have as a starting point an algorithm that can solve any partially filled sudoku we feed to it. The only problem is that it will always produce the same output given the same input (which is not really a bad thing when it comes to solving, but for as a generator it can be a problem). We have to somehow use randomness to feed a partially filled puzzle to the solver, which will give us a complete grid back. Afterwards, we can clear as many cells as we want and there it is, a randomly generated Sudoku puzzle!

The following code is written in GDScript, because I wrote a Sudoku game with the Godot Game Engine. The language is quite readable and nice to work with. Let’s get started.

Writing a solver (again)

Since the explanation for the Sudoku solver was covered in my previous post, I will skip the implementation here. The GDScript version of it can be found here

Writing the generator

As a starting point for our generator, we have to take care of which random input to use and feed to the solver algorithm we previously wrote. A simple approach I found is choosing a random number between 1 and 9 to position in each column of the grid. We have to take care that the numbers are positioned in different rows, and also different 3x3 subsquares. Since the positioning of the number must meet certain constraints, we can use a function that was used in the solving algorithm: is_var_consistent.

func is_var_consistent(key):
	if domains.get(key).size() > 1:
		return false
	var val = domains.get(key).front()
	for neighbor in constraints.get(key):
		var domain = domains.get(neighbor)
		if domain.empty() || (domain.size() == 1 && domain.has(val)):
			return false
	return true

This function takes a key (a [x][y] position in the grid) and checks if its domain has conflicting values with respect to its neighbors. If the domain of the key is bigger than 1, we have nothing to worry about, since deleting one value of the domain won’t leave the key without any other value to choose. If both this cell and another cell that share a constraint such as X =/= Y both have just one value in their domain, then one has to remove the value from its domain and ends up with no other value to choose, leading to an inconsistent state.

We can generate and position the randomly generated number until the initial grid we created is consistent.

func verify_rows(rows):
	for i in range(NCOLS):
		var row = rows[i]
		var key = str(row) + str(i)
		if !is_var_consistent(key):
			return false
	return true

func gen(difficulty):
	var randNum = (randi() % 9) + 1 # a random int where 1 <= randNum <= 9
	var rows = range(9) # rows = [0,1,2,3,4,5,6,7,8]
	var verified = false
	while !verified:
		clear_domains()
		rows.shuffle()
		for x in range(NCOLS): # NCOLS = 9
			var row = rows[x]
			grid[row][x] = randNum
			domains[str(row) + str(x)] = [randNum]
		verified = verify_rows(rows)
    ...

The initial random number to place in the grid is selected randomly. The same goes for the row in which the value of each column will be placed in. The function verify_rows takes care of saying if we can continue or not.

The placing process begins with the function clear_domains, which sets the domain of each cell in the grid to the initial domain of {1,2,3,4,5,6,7,8,9}. If a constraint is violated with the placing of our initial grid, then the function will return false and the process to choose an initial grid will begin again. Each iteration inside the while loop we clear the grid to start from scratch and end up with only the randNum in it. After this step we could end up with something like this:

Partially filled sudoku

Once we have this partially filled grid, it can be fed to the solving algorithm, which will in turn solve the puzzle for us. With a solved Sudoku, we just have to delete a clear a random number of cells in the grid, with the random number based on the difficulty chosen by the user. The position of the cleared cells is also chosen randomly.

	...
	solve()
	var keys_to_del = domains.keys().duplicate()
	keys_to_del.shuffle()
	var n_to_del
	if difficulty == 0: 		# easy
		 n_to_del = rng.randi_range(15, 25)
	elif difficulty == 1: 	# medium
		 n_to_del = rng.randi_range(30, 45)
	else: 				# hard
		 n_to_del = rng.randi_range(50, 60)
	keys_to_del.resize(n_to_del)
	while !keys_to_del.empty():
		var key = keys_to_del.pop_back()
		var x = key[0].to_int()
		var y = key[1].to_int()
		clear_cell(Vector2(x, y))

Since we already have the problem modelled as a CSP, we can take the list of keys in our domains dictionary to get a list of all cells in the grid. To randomly select which keys to delete, we can shuffle this array and then select a random number of keys to delete. This is just what we do in the if part of the function, the number of keys to delete is selected randomly based on difficulty and then the cells are cleared calling the clear_cell function, which just sets the domain of the cell to the initial domain.

The result

Generated sudoku grids

The result is a unique and solvable Sudoku grid. This algorithm is quite simple in its approach to generating Sudoku puzzles and it’s not perfect. It doesn’t make sure that the generated puzzle has a single possible solution. This extra step could be added at the end to make the generated grid even better, but I will leave it as it is. Feel free to take it as a base for a new Sudoku generating algorithm.