Puzzle generation

Oct 6, 2024

This is the third post in a dev log about Spilled Mushrooms, a turn-based card solitaire puzzle game available exclusively for Playdate.

As noted in the previous post about inspirations and design philosophies, I wanted Spilled Mushrooms to guarantee every puzzle is solvable. During the initial prototyping phase, many puzzles were simply unsolvable, which led to a lot of frustration. If I struggled to solve a puzzle, there was no way to know if it was because I was missing some key strategy, or if the puzzle just sucked.

The earliest naive attempt at guaranteeing solvability was to explore the entire search space until a solution was found or the space was exhausted. This proved to be extremely problematic on the Playdate, whose tiny little CPU started sweating at just the thought of evaluating nearly 280k move trees. Since thoroughly evaluating a puzzle was out of the question, I needed to look for a procedural approach that could guarantee solvability.

Looking at other puzzle games, it's fairly common to design a puzzle backwards: start at the goal and move backwards until you reach a starting state you're happy with, or run out of moves / time. This is a particularly effective strategy for games like mahjong solitaire, which can be dealt in reverse to guarantee solvability. Unfortunately, Spilled Mushrooms has effects that can alter the results of the turns before or after them. Consequently, playing cards in reverse doesn't really have any advantage over just playing them forwards, as it's possible that earlier turns can enhance or impede later turns, causing the game to over- or undershoot the original target.

The final implementation uses a different approach, suggested by my brother: instead of starting at the goal and playing backwards until you reach a good starting state, just start at the beginning and play forwards until you reach a good goal:

shuffle and draw 3 area cards and 8 critter cards
start each area with an arbitrarily high number of mushrooms
play randomly for 7 turns
calculate the number of mushrooms gathered from each area
use the gathered mushrooms as the starting counts

In a sense, this can be thought of as playing in reverse: rather than collecting mushrooms, the played critters are spilling the mushrooms as they go. However, the sequencing of the player's and the generator's moves will be the same instead of reversed.
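The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the game's actual Lua code: the deck contents, the `HIGH` constant, and the rule that a critter simply collects a fixed `power` are all assumptions standing in for the real cards and trait logic.

```python
import random

HIGH = 1000  # stand-in for the "arbitrarily high" starting count (assumed value)
TURNS = 7

def generate(area_deck, critter_deck, rng):
    """Play random turns forward and turn the gathered totals into a puzzle."""
    areas = rng.sample(area_deck, 3)        # shuffle and draw 3 area cards
    critters = rng.sample(critter_deck, 8)  # shuffle and draw 8 critter cards
    remaining = {area: HIGH for area in areas}
    solution = []
    # sample() already returns the cards in random order, so the first
    # seven critters form a random play order.
    for name, power in critters[:TURNS]:
        area = rng.choice(areas)
        remaining[area] -= power            # placeholder for real trait effects
        solution.append((name, area))
    # Whatever was gathered from each area becomes its starting count.
    targets = {area: HIGH - left for area, left in remaining.items()}
    return areas, critters, targets, solution

# Hypothetical decks, for illustration only.
AREA_DECK = [f"area {i}" for i in range(6)]
CRITTER_DECK = [(f"critter {i}", 4 + i % 3) for i in range(12)]

areas, critters, targets, solution = generate(AREA_DECK, CRITTER_DECK, random.Random(1))
```

Because the recorded `solution` is exactly the sequence that produced `targets`, replaying it in the same order is guaranteed to clear every area, which is the whole point of generating forwards.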

Finding fun

Unfortunately, evaluating a totally random move tree will almost always yield a very boring puzzle. Over half of all move trees completely skip an area or send a majority of critters to one area, leaving the other areas suspiciously lacking in mushrooms. Combined with the various trait effects in the game, any given random puzzle was very likely to come up with at least one area below 5 mushrooms, and frequently with an area at 0 mushrooms. Puzzles with these types of mushroom distributions are incredibly obvious to solve.

To counteract this, the puzzle generator uses two tricks. First, it guarantees that each area will have between one and three cards played at it. This narrows the search space from nearly 280k move trees to about 134k move trees and substantially increases the likelihood that a puzzle provides some challenge to the player by eliminating a lot of very obvious solo critter plays. Second, it repeatedly tries random move trees until it finds a puzzle with at least 10 mushrooms in each area. Because most critters are loosely balanced around collecting 4-6 mushrooms in isolation, this usually results in a solution with a fairly balanced distribution of critters. Several other acceptance criteria were experimented with, but most of them had a tendency to make incredibly boring puzzles.
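Both tricks amount to a form of rejection sampling, which can be sketched as below. This is an illustrative Python sketch under simplifying assumptions: critters are reduced to a list of fixed `powers`, trait effects are ignored, and the names `random_assignment`, `find_puzzle`, and `max_attempts` are hypothetical.

```python
import random

AREAS = 3
TURNS = 7
MIN_PER_AREA = 10  # acceptance threshold described above

def random_assignment(rng):
    """Pick an area for each turn so that every area gets 1 to 3 cards."""
    while True:
        plays = [rng.randrange(AREAS) for _ in range(TURNS)]
        if all(1 <= plays.count(area) <= 3 for area in range(AREAS)):
            return plays

def find_puzzle(powers, rng, max_attempts=30):
    """Retry random move trees until every area reaches the threshold."""
    for _ in range(max_attempts):
        # Choose which 7 of the drawn critters are played, and in what order.
        order = rng.sample(range(len(powers)), TURNS)
        plays = random_assignment(rng)
        gathered = [0] * AREAS
        for critter, area in zip(order, plays):
            gathered[area] += powers[critter]  # placeholder for trait logic
        if min(gathered) >= MIN_PER_AREA:
            return order, plays, gathered
    return None  # no acceptable puzzle found for this draw
```

With seven cards and a cap of three per area, the only possible splits are 3/3/1 and 3/2/2, so in this simplified model a lone critter has to collect at least 10 mushrooms by itself for a 3/3/1 split to pass the threshold.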

Disguising the delay

While the second trick ensures a better player experience, it comes at the expense of computation time. After a moderate (but not extensive) amount of optimization, each random move tree takes approximately 30-40ms to evaluate on hardware*. An acceptable puzzle can usually be found for a given draw within about 30 random attempts, which translates to about one second of waiting. However, not every set of random cards can even meet these constraints: some areas with restrictive or negative traits can impede a set of 8 critters sufficiently that no move tree will collect at least 10 mushrooms from the area. To avoid getting trapped in an unlucky draw, the area and critter cards are periodically discarded and redrawn.

To cover up this "loading" time, a shuffling animation quickly slides the currently drawn critter cards across the screen. When an entire move tree is evaluated each frame, the generator chokes the CPU and the animation frequently drops frames in inconsistent and unpleasant ways. To reduce the CPU load, the generator evaluates only four turns per frame, meaning one full move tree is evaluated every 1.75 frames. This slightly slows down generation, as some of the frame budget is wasted each frame, but the animation quality is greatly improved. Because the animation plays for a fixed duration (48 frames, or 1.6 seconds), at least 27 move trees are evaluated per shuffle**, with some being pruned early for invalid moves. Out of all the evaluated move trees, the puzzle with the highest total mushroom count is selected, as high mushroom counts tend to result from satisfying combos. The garbage collector is disabled during generation and triggered as soon as all of the cards are off screen*** to help conceal the large pause it causes.

* Spilled Mushrooms is written entirely in Lua. I most certainly could have squeezed out a lot more performance if I'd written the core game logic in C, but, uh, I didn't. I did experiment with designing a domain-specific language for the game's mechanics, but abandoned it due to timeline constraints with the planned launch date.

** Changing the duration of the shuffle animation changes the number of random move trees attempted, which can change the results in interesting ways. For example, increasing the number of attempts tends to compress the appearance rates of all cards, yielding a more balanced suite of puzzles. After analyzing several shuffle durations, I found 48 frames to be the sweet spot, with a shuffle success rate of just under 50%. Decreasing the shuffle duration to 16 frames (0.53 sec) yielded a shuffle success rate of just under 30%, while increasing it to 112 frames (3.73 sec) yielded a success rate of just over 60%.

*** Annoyingly, I think the final build shipped with an off-by-one error that causes the last card to just barely be on screen when the garbage collector is triggered. Such is life.
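The four-turns-per-frame budgeting and the "keep the highest total" selection described in this section can be sketched like this. It is a simplified Python illustration, not the shipped Lua: `evaluate_turn` is a hypothetical callback, and pruning, the acceptance criteria, and garbage collection are all left out.

```python
import random

TURNS_PER_TREE = 7
TURNS_PER_FRAME = 4
SHUFFLE_FRAMES = 48  # 1.6 seconds at 30 frames per second

def run_shuffle(evaluate_turn):
    """Spread move-tree evaluation across the shuffle animation's frames."""
    best_total, trees_done = None, 0
    turn, total = 0, 0
    for _frame in range(SHUFFLE_FRAMES):
        for _ in range(TURNS_PER_FRAME):
            total += evaluate_turn(turn)  # hypothetical per-turn evaluation
            turn += 1
            if turn == TURNS_PER_TREE:    # one full move tree finished
                trees_done += 1
                if best_total is None or total > best_total:
                    best_total = total    # keep the highest mushroom total
                turn, total = 0, 0
        # ... the shuffle animation would draw one frame here ...
    return trees_done, best_total

rng = random.Random(0)
trees, best = run_shuffle(lambda turn: rng.randint(4, 6))
print(trees)  # 27: 48 frames * 4 turns = 192 turns = 27 full trees plus 3 spare turns
```

The count of 27 is a floor: a tree that is pruned early for an invalid move uses fewer than seven turns, which leaves room for more attempts within the same 48 frames.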

Designing within the generator

For the puzzle generator to work as intended, traits need to be designed with it in mind. In particular, two traits designed before the puzzle generator needed to be reworked for compatibility: Ambush and Burrow.

Ambush was intended to emulate the feeling of waiting until the last second to attack, and so it had originally been designed to do exactly that: a critter with this trait would not collect mushrooms until it was able to collect all of the remaining mushrooms. For the generator, this effectively meant it would never collect mushrooms, as the arbitrarily high mushroom count during generation was never low enough to trigger Ambush. This meant that in the generator's eyes, any critter with Ambush was a dead card: puzzles that included it either didn't use it, or ended up far too easy because they only needed six critters to solve.
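A tiny sketch makes the problem concrete. The function below is a guess at how the original Ambush rule might be modeled, reduced to a single `power` number; the name `ambush_collect` and the `HIGH` constant are assumptions for illustration.

```python
HIGH = 1000  # stand-in for the generator's "arbitrarily high" starting count

def ambush_collect(power, remaining):
    """Original Ambush as described: collect only if that empties the area."""
    return remaining if power >= remaining else 0

print(ambush_collect(5, 4))     # 4: in play, the critter can finish off the area
print(ambush_collect(5, HIGH))  # 0: during generation the trigger never fires
```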

Burrow was designed to mimic the cartoonish behavior of a Gopher popping in and out of holes: whenever an area ran out of mushrooms, a Burrow critter would migrate to the area with the most mushrooms left. Again, the generator struggled: with no areas ever being reduced to zero, Burrow never activated. This meant that puzzles generated using Gopher were often quite easy as well: by playing the Gopher early, a player could really capitalize on its trait with synergies like Forest's Abundant, Moose's Inspiring, or Elephant's Nurturing.

Because these two traits inherently led to overly easy puzzles, the traits themselves felt bland and uninteresting. Any puzzle with a Gopher or Tiger was almost always completely trivial. Consequently, both of them received a redesign that worked within the constraints of the new puzzle generator. That is, they now behave the same regardless of the number of mushrooms in a given area. Ironically, I find that Gopher tends to find itself in some of the hardest puzzles now.

Discovering synergies and bugs

One of the most interesting outcomes of the puzzle generator is that it's able to "discover" interesting combos and synergies to use in puzzles without me being aware of them. This often puts me, the designer and creator of all of the game's mechanics, in a very peculiar position, loosely summarized as: "how the #$@! am I supposed to collect 85 mushrooms in 7 turns?"

I'm confident this is a feeling shared by the designers of many other card games once players get their hands on them. There are countless cards that have forced rule changes or bans as the players discover combos the designers never thought of that "break" the game in one way or another*. However, Spilled Mushrooms strikes me as unique because it's not the players discovering the broken combos - it's the game.

* On Playdate, Hand of the Divine received a balance patch that reset the leaderboards because of an unintended combo that completely dominated the high scores.

The puzzle generator is inherently unaware of any particular card mechanics or synergies - it just plays cards randomly until it's happy with itself. However, because it selects the puzzle with the highest mushroom count from each shuffle, it tends to lean towards puzzles that rely on strong combos to hit the target mushroom counts and avoid puzzles that play cards in suboptimal ways - even if it doesn't understand what a combo or a mistake is.

On the flip side, it's also able to discover bugs! Early in development I stumbled upon a puzzle that I played for well over an hour before giving up and revealing the solution. The reason I wasn't able to solve it was that I wasn't aware of a bug in the interaction between two cards. The puzzle generator, unaware of what's intended and what isn't, accepted it happily.

Difficulty

Despite the generator preferring high-count puzzles, it does regularly build puzzles using suboptimal plays. Rather than seeing this as a problem, I believe it's an essential part of the game's difficulty. If every puzzle requires optimal play for every turn, then puzzles become exceedingly difficult. However, the occasional "mistake" during generation reduces the average difficulty of puzzles and gives the players some freedom to choose their own strategy and play style.

To further pull away from viewing it as a negative, the game will reward players with trophies for one-upping the generated solution, for example by completing a delivery early or playing a critter that collects zero mushrooms. These trophies can add an extra challenge for players, as well: if the puzzles are too easy, a player can change their focus from just completing the delivery to completing the delivery with a particular trophy in mind, or trying to collect as many trophies for each puzzle as they can. Players may also find it particularly challenging to find a solution that collects exactly as many mushrooms as the generated solution.

When it comes to quantifying difficulty, I've found no reliable metrics from my own experience. There are some general trends that can be observed, such as a lower number of valid solutions or a higher mushroom total generally meaning a puzzle is more difficult, but there are plenty of counterexamples for any trend that I observed. The strongest correlation I've seen with either number of attempts taken or total time to solve is the presence of a Gopher. I do not enjoy puzzles with a Gopher.

Balancing cards

A more pleasant consequence of the puzzle generator is that cards generally don't need to be balanced relative to each other. While each critter is generally designed to collect between 4 and 6 mushrooms in isolation, if there's a powerful combo that allows it to collect substantially more than that, the puzzle generator will compensate by generating a puzzle with a higher mushroom count.

However, card balance does still need to be considered. For example, critters that are too weak or circumstantial won't appear in puzzles as often, and when they do appear, will often go unused. Likewise, areas with overwhelmingly restrictive traits will rarely show up in a puzzle. Most cards are designed with a handful of strong synergies in mind, and restrictive traits tend to have several counters in place. Unfortunately, with single-digit stat lines, it can be difficult to fine-tune the potency of a trait, so some traits do fall outside the target appearance rate.

I believe I covered many of the important aspects of the puzzle generator here, but feel free to reach out to me directly! You can find me on Discord as @scizzorz or reach out to me via email from the support link on Catalog.