How It Works
Wobble is powered by a reinforcement learning agent trained to solve Wordle-like puzzles using a simplified Q-learning algorithm.
The Learning Process
The agent plays Wordle-style games repeatedly, learning from trial and error. Each game is an episode, and after thousands of episodes the model improves its ability to pick strong guesses.
- State: After each guess, the environment reports how many letters are correct and in the right position (greens) and how many are correct but misplaced (yellows); this feedback makes up the state the agent observes (see the feedback sketch after the reward rule below).
- Actions: The agent chooses a word to guess. Over time it learns which guesses lead to better outcomes.
- Reward: The agent receives positive points for greens and yellows, a large bonus for solving the word, and a penalty if it fails to solve it within 6 tries:
    if guessed:
        reward += 100 - 15 * log2(7 - turn)  # solve bonus that depends on the turn used (log2 from the math module)
    else:
        reward -= 1000                       # heavy penalty for failing to solve within 6 guesses
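As a concrete illustration of the green/yellow feedback that forms the state, here is a minimal sketch of a scoring function; the name score_guess and the counting approach are assumptions for illustration, not taken from the project's code.

    from collections import Counter

    def score_guess(guess: str, answer: str) -> tuple[int, int]:
        """Return (greens, yellows) for a guess against the hidden answer."""
        greens = sum(g == a for g, a in zip(guess, answer))
        # Count letters shared between guess and answer (capped per letter),
        # then subtract exact matches so nothing is counted as both green and yellow.
        shared = sum((Counter(guess) & Counter(answer)).values())
        return greens, shared - greens

For example, score_guess("crane", "trace") returns (3, 1): r, a, and e are green, and c is present but misplaced.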
Learning is done using the Q-learning update rule, which adjusts the value of each state-action pair:
    Q[s][a] ← Q[s][a] + α * (reward + γ * max(Q[s']) - Q[s][a])
- α is the learning rate (how much new info overrides old).
- γ is the discount factor (importance of future rewards).
- max(Q[s']) is the best future value from the next state s'.
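For concreteness, a minimal sketch of this update in Python, assuming the Q-table is a plain nested dictionary keyed by state and then by action (the function and variable names are illustrative, not the project's actual identifiers):

    def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
        """Apply one Q-learning update to the state-action value Q[s][a]."""
        old = Q.setdefault(s, {}).get(a, 0.0)
        # Best value achievable from the next state; 0 if that state is unseen.
        best_next = max(Q[s_next].values()) if Q.get(s_next) else 0.0
        Q[s][a] = old + alpha * (reward + gamma * best_next - old)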
Training
The agent is trained by playing against a large set of possible words. Over time, it learns which strategies increase the chance of solving the puzzle within the allowed attempts. The results of training are stored in a Q-table, which the bot then uses to make informed decisions during play.
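A rough outline of such a training loop, assuming a hypothetical environment with reset and step methods and an ε-greedy exploration rule; none of these names or defaults come from the project itself, so treat them as placeholders:

    import random
    from collections import defaultdict

    def train(env, words, episodes=10_000, alpha=0.1, gamma=0.9, epsilon=0.1):
        """Fill a Q-table by playing many episodes against randomly chosen target words."""
        Q = defaultdict(lambda: defaultdict(float))  # Q[state][word] -> estimated value
        for _ in range(episodes):
            state = env.reset(random.choice(words))  # pick a new hidden answer
            done = False
            while not done:
                # Explore with probability epsilon, otherwise exploit the best known guess.
                if random.random() < epsilon or not Q[state]:
                    action = random.choice(words)
                else:
                    action = max(Q[state], key=Q[state].get)
                next_state, reward, done = env.step(action)
                # Q-learning update (same rule as above).
                best_next = max(Q[next_state].values(), default=0.0)
                Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
                state = next_state
        return Q

The returned Q dictionary plays the role of the Q-table described above: at play time the bot can look up Q[state] and choose the highest-valued word.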
Outcome
After training, the agent is able to approach Wordle systematically — starting with informative guesses, narrowing down possibilities, and converging on the correct word more efficiently than random play.
Note:
- No official Wordle™ code, data, or other resources are used.
- All training is done locally with custom word lists and environments.
Wordle is a trademark of The New York Times Company. This project is not affiliated with or endorsed by The New York Times Company.