llms-for-R.md

LLMs for code generation

Currently three main styles:

Autocomplete
- e.g. GitHub Copilot, Windsurf
- Inline "ghost text" as you type
- Sometimes amazingly good; often pretty useless
- Need to train yourself to ignore spurious suggestions
Chat
- e.g. ChatGPT, Claude, Positron Assistant
- Claude best for R code
"Agentic"
- e.g. Claude Code, Cursor
- Has ability to run code and change files
- Comes up with a plan, iterates, and works away on it
- Can take a long time and cost a lot of tokens

Amazing for quickly generating 95% demos
- e.g. Shiny app from a hand drawing
- "Create a data frame containing data about X"
Great at translations:
- command-line curl -> httr(2)
- LaTeX -> Quarto
- R code -> Stan
- JSON -> unit tests
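
As an illustration of the curl -> httr2 translation, here is a minimal sketch. The URL, header, and JSON payload are made up for the example, and it assumes the httr2 package is installed and R >= 4.1 (for the native pipe). The request is only built, not sent.

```r
library(httr2)

# Hypothetical curl command being translated:
#   curl https://example.com/api/items \
#     -H "Accept: application/json" \
#     -d '{"name": "widget"}'

req <- request("https://example.com/api/items") |>
  req_headers(Accept = "application/json") |>
  # Supplying a body makes httr2 send this as a POST
  req_body_json(list(name = "widget"))

# Print a summary of the request without sending anything
print(req)

# To actually send it and parse the JSON response:
# resp <- req_perform(req)
# resp_body_json(resp)
```

httr2 also provides `curl_translate()`, which does this kind of translation mechanically, so it is a useful cross-check on what an LLM produces.
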
Very helpful at finding words/algorithms that you don't know about.
Don't forget that it doesn't just generate code; it can explain and critique code too.

Less good at making incremental changes to large existing codebases.
Doesn't know about the latest package versions. At some level it is a weighted average of all code on the internet, so it tends toward older, more popular idioms rather than modern approaches.
Output always looks plausible, but functions and arguments can be hallucinations. You cannot (MUST NOT!) blindly trust its output.
Gets stuck in local optima in long conversations; start fresh sessions frequently.
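
One cheap guard against hallucinated functions and arguments is to check a suggested call against what is actually installed before running it. The sketch below uses only base R; `check_call()` is a hypothetical helper written for these notes, not part of any package. It only checks names, so it does not replace reading the documentation or testing the code.

```r
# Report problems with a suggested call: missing package, missing function,
# or argument names that the function does not accept.
# Returns a character vector of problems (empty if none were found).
check_call <- function(pkg, fn, args = character()) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(sprintf("package '%s' is not installed", pkg))
  }
  if (!fn %in% getNamespaceExports(pkg)) {
    return(sprintf("'%s' is not exported by '%s'", fn, pkg))
  }
  f <- getExportedValue(pkg, fn)
  if (!is.function(f)) {
    return(sprintf("'%s::%s' is not a function", pkg, fn))
  }
  # args() also works for most primitives, where formals() alone returns NULL
  sig <- args(f)
  if (is.null(sig)) {
    return(character())  # signature not available, nothing to check
  }
  known <- names(formals(sig))
  if ("..." %in% known) {
    # Extra arguments are absorbed by `...`, so names cannot be verified here
    return(character())
  }
  unknown <- setdiff(args, known)
  if (length(unknown) > 0) {
    return(sprintf("'%s::%s' has no argument '%s'", pkg, fn, unknown))
  }
  character()
}

check_call("stats", "sd", "na.rm")      # real function, real argument
check_call("stats", "sd", "remove_na")  # real function, made-up argument
check_call("stats", "standard_dev")     # made-up function
```

The first call reports no problems; the second and third each return a message naming the unknown argument or function.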