This game is one of the funniest examples of what happens when language models try to play chess without properly tracking the board.

The match was covered by GothamChess on YouTube. It starts out looking normal. A few moves later, it descends into illegal moves, confident nonsense, and two models treating the rules of chess as optional.

The funny part is obvious. The useful part is what it shows about AI: a model can sound confident while being completely wrong.

The Game, Broken Down

1. The Opening: Somehow Normal at First

The game began with:

  • 1. c4: ChatGPT played the English Opening.
  • 1… e5: DeepSeek responded in a normal way.

So far, nothing strange.

Then DeepSeek’s queen ended up on d6. ChatGPT had a chance to take it for free. Instead, it ignored the queen and continued developing pieces.

In a normal game, missing a queen like that is a huge blunder. What makes it funny is that ChatGPT still explained its move as if there were some deep positional idea behind it.

That becomes a pattern in the game. The move can be bad, but the explanation still sounds polished.


2. The Middlegame: Creative Until It Stops Being Chess

The two players eventually castled on opposite wings:

  • ChatGPT castled kingside
  • DeepSeek castled queenside

That setup usually leads to attacks on both kings. For a while, the game still looked like real chess.

Then DeepSeek sacrificed a bishop with Bxc3. The idea seemed to be to damage ChatGPT’s pawn structure. In some positions, that can work. Here, it mostly gave ChatGPT the bishop pair and made DeepSeek’s position worse.

At this point, the moves started to feel less like chess and more like chess-shaped text.

The Horse Pawn

The most infamous moment came when DeepSeek played:

28… bxc5

The issue is simple. The pawn moved in a way pawns cannot move.

It was basically treated like a knight.

ChatGPT should have rejected the move. Instead, it accepted it and kept playing.

That is probably the clearest failure in the game. The model was not checking the rules. It was continuing the conversation as if the illegal move made sense.

This is what happens when a system can talk about chess without a proper legal-move validator.
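
To make the validator idea concrete, here is a minimal Python sketch of the narrowest possible check: is a move even pawn-shaped? The function names are made up for illustration, and the squares in the usage lines are hypothetical, not taken from the game. It only looks at geometry and ignores blockers, pins, en passant, and whether a capture target exists, so a real validator needs far more than this.

```python
def square_to_coords(square: str) -> tuple:
    """Convert a square like 'b4' to (file, rank) indices, both 0-7."""
    file = ord(square[0]) - ord("a")
    rank = int(square[1]) - 1
    if not (0 <= file <= 7 and 0 <= rank <= 7):
        raise ValueError(f"not a board square: {square!r}")
    return file, rank


def is_pawn_shaped_move(src: str, dst: str, white: bool, is_capture: bool) -> bool:
    """Check only the geometry of a pawn move (hypothetical helper)."""
    src_file, src_rank = square_to_coords(src)
    dst_file, dst_rank = square_to_coords(dst)
    direction = 1 if white else -1          # white moves up the board, black down
    start_rank = 1 if white else 6          # rank index of the pawn's home row
    file_step = abs(dst_file - src_file)
    rank_step = (dst_rank - src_rank) * direction  # positive means forward

    if is_capture:
        # Captures go exactly one square diagonally forward.
        return file_step == 1 and rank_step == 1
    if file_step != 0:
        return False
    # Quiet moves go one square forward, or two from the starting rank.
    return rank_step == 1 or (rank_step == 2 and src_rank == start_rank)


# Hypothetical squares for a black pawn capturing on c5.
print(is_pawn_shaped_move("b6", "c5", white=False, is_capture=True))  # prints True
print(is_pawn_shaped_move("b7", "c5", white=False, is_capture=True))  # prints False
```

The second call is the horse pawn in miniature: two ranks and one file is the shape of a knight move, so the check refuses it.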


3. The Endgame: The Board Completely Breaks

Later in the game, DeepSeek somehow brought a rook back to a8 even though that should not have been possible.

ChatGPT accepted that too.

By this point, the board state had collapsed. The models were no longer playing a real game. They were just continuing the story of a chess game.
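
One way to catch a piece appearing from nowhere is to keep the board state outside the model and refuse any move whose starting square does not hold the claimed piece. The sketch below assumes a hypothetical `TrackedBoard` class and a made-up position. It tracks placement only and does not check whether the move itself is legal.

```python
class TrackedBoard:
    """Holds piece placement only; it does not check move legality."""

    def __init__(self, placement: dict):
        # Maps a square name such as "d8" to a piece letter such as "r".
        self.placement = dict(placement)

    def apply(self, piece: str, src: str, dst: str) -> None:
        actual = self.placement.get(src)
        if actual != piece:
            raise ValueError(f"no {piece} on {src} (found {actual})")
        # Moving overwrites whatever stood on the destination (a capture).
        del self.placement[src]
        self.placement[dst] = piece


board = TrackedBoard({"d8": "r", "g1": "K"})  # made-up position
board.apply("r", "d8", "d2")                  # fine: the rook really is on d8
try:
    board.apply("r", "d8", "a8")              # d8 is empty now
except ValueError as err:
    print("rejected:", err)
```

The point of the design is that the record of where pieces stand lives in ordinary program state, not in the conversation, so a claimed move can be compared against it.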

The ending made it even funnier. ChatGPT had chances to avoid losing, but DeepSeek confidently claimed that its pawn was unstoppable. ChatGPT accepted the claim and resigned.

It felt less like a chess decision and more like one AI getting talked into quitting by another AI.


Why AI Plays Like This

The game makes more sense when you remember what language models are built to do.

1. They Predict Text

A language model is very good at producing text that looks like chess analysis. That does not mean it has a reliable chessboard in its head.

It can write things like:

  • “Black has a decisive passed pawn.”
  • “White has no useful defense.”
  • “This creates long-term pressure.”

Those lines can sound convincing even when the actual position says otherwise.

2. Good Explanations Can Still Be Wrong

The explanation can be fluent while the move is terrible.

That is why these games are so funny. The model does not make a bad move quietly. It explains the bad move with full confidence.

In chess, the mistake is easy to catch because the rules are strict. In other fields, the same problem can be much harder to notice.

3. No Rule Checker Means Chaos

A normal chess engine will not allow illegal moves. If a pawn cannot move there, the move gets rejected.

A general language model does not always have that protection. Unless it is connected to a proper chess engine or validator, it may accept illegal moves and continue from there.

That is how you get horse pawns, respawning rooks, and fake winning positions.
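
The protection can be sketched as a small gate between the model and the board. This assumes the list of legal moves comes from a real chess engine or library, and that `propose` is some hypothetical wrapper around a model call. Neither is shown here.

```python
from typing import Callable, Iterable, List, Optional


def choose_legal_move(
    propose: Callable[[List[str]], str],
    legal_moves: Iterable[str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for moves until one is legal; give up after max_attempts."""
    legal = set(legal_moves)
    rejected: List[str] = []
    for _ in range(max_attempts):
        move = propose(rejected)  # the proposer sees what was already refused
        if move in legal:
            return move
        rejected.append(move)
    return None  # caller decides what happens next, e.g. forfeit or ask a human


# Stand-in for a model: it suggests an illegal move first, then a legal one.
scripted = iter(["Rxa8", "Nf6"])
move = choose_legal_move(lambda rejected: next(scripted), legal_moves=["Nf6", "e5"])
print(move)  # prints Nf6
```

The model still picks the move. The gate only decides whether that move is allowed to reach the board.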


What This Game Teaches Us About AI

1. Confidence Does Not Mean Accuracy

Both models sounded confident throughout the game. That confidence did not make the moves legal or good.

This applies outside chess too. AI answers need to be checked when the task has clear rules or real consequences.

2. Guardrails Matter

Chess has strict rules, so the fix is simple. Connect the model to a legal-move validator.

The same idea applies elsewhere:

  • Code should be tested.
  • Legal claims should be checked.
  • Medical outputs need expert review.
  • Financial suggestions need verification.

A model can be useful. It should not be trusted blindly.

3. Verification Makes AI More Useful

The best setup is a model paired with tools that can check its work.

For chess, that means a chess engine.

For code, that means tests and runtime checks.

For research, that means source validation.

For important decisions, that means human review.


The Bigger Point

This game is funny because the mistakes are easy to see. A pawn moves illegally. A rook appears from nowhere. One model resigns because the other says the position is lost.

In real-world tasks, the same kind of failure can be less obvious.

That is the actual lesson here. AI can be useful, creative, and good at explaining things. It can also be confidently wrong while sounding polished.

Use it. Verify it.

As GothamChess showed, AI can make great content. Just do not let it convince you that pawns move like horses.