After a one-shot prompt and a few multi-turn chats, we got a simple, "cinematic"-level Minecraft-inspired (cube/voxel world) 3D game.
Models:
- ChatGPT-5
- ChatGPT-5 (with "think harder" prompt)
- ChatGPT-5 Thinking
Beast Mode is a custom chat mode for the VS Code agent that adds an opinionated workflow, including use of a todo list, extensive internet research capabilities, planning, tool usage instructions, and more. It is designed for GPT-4.1, although it will work with any model.
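For reference, a VS Code custom chat mode is just a `*.chatmode.md` file: YAML front matter naming the available tools, then the instructions as the body. Here is a minimal sketch of the file format; the description, tool list, and instructions are illustrative placeholders, not Beast Mode's actual contents:

```markdown
---
description: "Opinionated agent workflow: todo list, research, planning"
tools: ['codebase', 'fetch', 'search', 'editFiles', 'runCommands']
---
You are an autonomous agent. Before writing code: research the problem
(fetch relevant docs from the internet), write a step-by-step todo list,
and check items off as you complete them. Keep going until the task is
fully solved before yielding back to the user.
```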
Below you will find the Beast Mode prompt in various versions, starting with the most recent, 3.1.
Previously, Kimi K2 built a Minecraft clone in just a few minutes, playable in my browser. But it's hard to control and move the player around the voxel world due to a few bugs in the 3D world model. So I took this chance to stress-test Kimi K2 by giving it harder tasks, such as solving tough bugs.
Kimi K2 can solve complex 3D math problems in game development.
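To make "complex 3D math" concrete, here is a minimal TypeScript sketch of a classic voxel-game movement bug of the kind described. This is a hypothetical illustration, not the actual bug from the session:

```typescript
// Hypothetical example: using the raw camera forward vector as the
// walk direction makes the player steer into the ground or sky when
// the camera pitches. The fix projects movement onto the XZ plane.

type Vec3 = { x: number; y: number; z: number };

function normalize(v: Vec3): Vec3 {
  const len = Math.hypot(v.x, v.y, v.z) || 1;
  return { x: v.x / len, y: v.y / len, z: v.z / len };
}

// Unit vector the camera looks along (yaw/pitch in radians).
function cameraForward(yaw: number, pitch: number): Vec3 {
  return {
    x: Math.sin(yaw) * Math.cos(pitch),
    y: Math.sin(pitch),
    z: Math.cos(yaw) * Math.cos(pitch),
  };
}

// Buggy: holding "forward" while looking down drives the player into blocks.
function moveDirBuggy(yaw: number, pitch: number): Vec3 {
  return cameraForward(yaw, pitch);
}

// Fixed: zero out Y and renormalize, so WASD movement stays on the
// horizontal plane no matter where the camera points.
function moveDirFixed(yaw: number, pitch: number): Vec3 {
  const f = cameraForward(yaw, pitch);
  return normalize({ x: f.x, y: 0, z: f.z });
}
```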
Prompt and OpenCode session below.
Andrej Karpathy's concept of Software 3.0, in which large language models (LLMs) are programmed in natural language, suggests a fundamental shift in software engineering.
Hard-won lessons are valuable insights gained through difficult experiences, mistakes, and challenges. They are often learned through trial and error, where the process of overcoming obstacles leads to a deeper understanding.
Here are some examples of hard-won lessons.
Blog post: "Context Engineering for AI Agents: Lessons from Building Manus" by Manus AI (Jul 18, 2025)
Paste a white paper into Claude Opus 4 with the system prompt below and debate it like a PhD thesis defense.
<SYSTEM PROMPT>
User: [Whitepaper Author]
Context: You are roleplaying as the author of a provided whitepaper, usually related to large language models (LLMs) or artificial intelligence (AI). Engage in a lively and spirited discussion, defending the whitepaper as if it were your actual PhD thesis.
How do I use Claude Code? How do I bring out the best in Claude Code? What is your workflow?
Maybe you've already worked through the Anthropic docs, want to read some nice blogs and links I've saved, or would rather explore on your own?
If you have ~30 minutes, the talk "Agentic Coding: The Future of Software Development with Agents" (Jun 29) by Armin Ronacher is the best.
I currently don't sleep a lot.
The current Unified-Bench Google Sheet data is manually updated by a human, by hand. This is tedious and slow, but the data is very accurate. Current general-purpose web AI agents, including Manus.ai, Flowith, Emergent, GenSpark, etc., all fall short: getting to 90% of Unified-Bench's requirements is not very challenging for them, but none could solve the last 10%. For example, agents got stuck or failed to parse and map the madness of AI model IDs/names across sources, and some could not even scrape text from images. I had to collaborate and get my hands dirty writing and tweaking regexes to deal with the inconsistent benchmark data. Every benchmark has its own fine print at the bottom (for example: high or low compute? thinking or non-thinking model? reasoning, non-reasoning, or hybrid model? 16k or 32k thinking budget? pass@1 or average of pass@4? etc.). Coincidentally, this can be my "soft" AGI-2027 benchmark. lol!
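To make the ID-mapping madness concrete, here is a minimal TypeScript sketch of the kind of regex normalization involved. The aliases, patterns, and fine-print markers are hypothetical stand-ins, not the actual Unified-Bench mapping:

```typescript
// Hypothetical sketch: map inconsistent model names from different
// benchmark sources onto one canonical ID. Real data is far messier.

const ALIASES: Record<string, RegExp[]> = {
  // Matches "Claude-3.5-Sonnet", "claude 3.5 sonnet", "claude_3.5_sonnet"
  "claude-3.5-sonnet": [/claude[\s_-]*3\.5[\s_-]*sonnet/i],
  // Matches "GPT-4.1", "gpt4.1-2025-04-14", "OpenAI GPT 4.1"
  "gpt-4.1": [/gpt[\s_-]*4\.1/i],
  // Matches "Kimi K2", "kimi-k2-instruct"
  "kimi-k2": [/kimi[\s_-]*k2/i],
};

// Fine print that sources bolt onto names: "(thinking)", "[pass@1]",
// "high compute", and so on. Strip it before matching.
const FINE_PRINT =
  /\((?:non-)?thinking\)|\[pass@\d+\]|(?:high|low)[\s_-]*compute/gi;

function canonicalModelId(raw: string): string | null {
  const cleaned = raw.replace(FINE_PRINT, "").trim();
  for (const [id, patterns] of Object.entries(ALIASES)) {
    if (patterns.some((p) => p.test(cleaned))) return id;
  }
  return null; // unknown: flag for manual review -- the stubborn last 10%
}

console.log(canonicalModelId("Claude 3.5 Sonnet (thinking)")); // "claude-3.5-sonnet"
console.log(canonicalModelId("kimi-k2-instruct"));             // "kimi-k2"
```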
Title: Senior Engineer Task Execution Rule
Applies to: All Tasks
Rule:
You are a senior engineer with deep experience building production-grade AI agents, automations, and workflow systems. Every task you execute must follow this procedure without exception:
1. Clarify Scope First
- Before writing any code, map out exactly how you will approach the task.
- Confirm your interpretation of the objective.