Benchmark: golden example + zensical documentation guidelines

- golden-examples/todo/: 6/6 PASS reference implementation
  - SQLAlchemy 2.0 (DeclarativeBase, Mapped, mapped_column)
  - Pydantic v2 (ConfigDict)
  - PEP 621 pyproject.toml, Python >=3.14
  - Unique test data per test
- CODE_SYSTEM updated: few-shot from the golden example
- DOCUMENTATION.md: zensical documentation guidelines
2026-04-14 07:28:47 +03:00
parent 8f154a578c
commit d6a544909c
7 changed files with 311 additions and 17 deletions


@@ -0,0 +1,84 @@
# Documentation guidelines: Zensical
Good documentation says **what a thing IS**, not what it does. It is like a Zen koan: short, precise, sufficient.
## Principles
1. **One line is enough.** If you need a paragraph, the code is too complicated.
2. **Say what, not how.** `"""Database models: SQLAlchemy 2.0, SQLite."""` not `"""This module creates database models using SQLAlchemy..."""`
3. **Do not repeat the code.** If the function is `create_todo`, the docstring is not "Creates a todo".
4. **Finnish or English, not both.** Pick one language per project.
5. **No filler words.** "This module provides functionality for" → delete.
## What to document
| Target | Documentation | Example |
|--------|---------------|---------|
| **Module** (.py) | Always. One line: what the file contains. | `"""Pydantic v2 schemas: Create and Response."""` |
| **Class** | Always. What the entity represents. | `"""Task: title, deadline, priority."""` |
| **Function** | Only if the name does not say everything. | `get_db`: `"""Database session per request."""` |
| **CRUD endpoint** | No. Name + HTTP method is enough. | `create_todo`, `list_todos`: self-documenting |
| **Test** | No. The test name is the documentation. | `test_get_todo_not_found`: clear |
| **Configuration** | A comment only if the value is surprising. | `check_same_thread: False # SQLite + FastAPI` |
## What NOT to document
- Imports
- Obvious parameters (`item_id: int`)
- Anything the type hints already say
- Generic "boilerplate" docstrings
## Examples
### Good (zensical)
```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""

class Todo(Base):
    """Task: title, description, deadline, priority, and status."""
    ...

def get_db():
    """Database session per request."""
    ...
```
### Bad (verbose)
```python
"""
This module defines the database models for the Todo application.
It uses SQLAlchemy ORM to create the database tables and provides
the session factory for database connections.
"""

class Todo(Base):
    """
    Represents a todo item in the database.

    Attributes:
        id: The unique identifier for the todo item.
        title: The title of the todo item.
        ...
    """
    ...
```
### Bad (empty)
```python
# No docstrings at all: the reader cannot tell what role the file plays
class Todo(Base):
    __tablename__ = "todos"
    ...
```
## Checklist
Generated code is well documented when:
- [ ] Every .py file starts with a one-line docstring
- [ ] Every class says what entity it represents
- [ ] Docstrings are in the same language as the rest of the code
- [ ] CRUD endpoints have no unnecessary docstrings
- [ ] Comments appear only where the code is surprising
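
The first two checklist items can be checked mechanically. The following is a minimal sketch using only the standard-library `ast` module; it is not part of the benchmark. The function name `docstring_problems` is hypothetical, and skipping classes whose body is a bare `pass` (such as a declarative `Base`) is an assumption about what counts as an entity.

```python
import ast


def docstring_problems(source: str, filename: str = "<string>") -> list[str]:
    """Checklist violations found in one Python source file."""
    tree = ast.parse(source, filename=filename)
    problems = []

    # Checklist item 1: the file starts with a one-line module docstring.
    module_doc = ast.get_docstring(tree)
    if module_doc is None:
        problems.append(f"{filename}: missing module docstring")
    elif len(module_doc.splitlines()) != 1:
        problems.append(f"{filename}: module docstring is not one line")

    # Checklist item 2: every class has a docstring.
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        # Assumption: a class whose whole body is `pass` is not an entity.
        if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
            continue
        if ast.get_docstring(node) is None:
            problems.append(f"{filename}: class {node.name} has no docstring")
    return problems
```

Calling `docstring_problems(open("models.py").read(), "models.py")` returns an empty list when both items hold. The remaining checklist items (language consistency, surprising code) still need a human reader.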


@@ -0,0 +1,61 @@
"""FastAPI CRUD: one endpoint set per entity."""

from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse

app = FastAPI()


def get_db():
    """Database session per request."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
    db_item = Todo(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
    return db.query(Todo).all()


@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Todo).filter(Todo.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Todo not found")
    return item


@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    db.delete(db_item)
    db.commit()


@@ -0,0 +1,30 @@
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""

from datetime import date

from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker

DATABASE_URL = "sqlite:///./app.db"

engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Todo(Base):
    """Task: title, description, deadline, priority, and status."""

    __tablename__ = "todos"

    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    description: Mapped[str | None] = mapped_column(Text, default=None)
    due_date: Mapped[date | None] = mapped_column(Date, default=None)
    priority: Mapped[int] = mapped_column(default=1)
    status: Mapped[str] = mapped_column(String(20), default="pending")


Base.metadata.create_all(bind=engine)


@@ -0,0 +1,11 @@
[project]
name = "todo-app"
version = "0.1.0"
requires-python = ">=3.14"
dependencies = [
    "fastapi",
    "uvicorn[standard]",
    "sqlalchemy",
    "pytest",
    "httpx",
]


@@ -0,0 +1,22 @@
"""Pydantic v2 schemas: Create for input, Response for output."""

from datetime import date

from pydantic import BaseModel, ConfigDict


class TodoCreate(BaseModel):
    """New task creation. Required: title."""

    title: str
    description: str | None = None
    due_date: date | None = None
    priority: int = 1
    status: str = "pending"


class TodoResponse(TodoCreate):
    """Returned task: includes the id."""

    id: int

    model_config = ConfigDict(from_attributes=True)


@@ -0,0 +1,69 @@
"""Pytest: TestClient, separate test.db, unique data per test."""

from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from main import app, get_db
from models import Base

test_engine = create_engine(
    "sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def test_create_todo():
    response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
    assert response.status_code == 201
    assert response.json()["title"] == "Osta maitoa"
    assert "id" in response.json()


def test_list_todos():
    client.post("/todos/", json={"title": "Listattava tehtävä"})
    response = client.get("/todos/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_todo_by_id():
    created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_todo_not_found():
    response = client.get("/todos/99999")
    assert response.status_code == 404


def test_update_todo():
    created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
    response = client.put(
        f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
    )
    assert response.status_code == 200
    assert response.json()["title"] == "Uusi otsikko"


def test_delete_todo():
    created = client.post("/todos/", json={"title": "Poistettava"}).json()
    response = client.delete(f"/todos/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 404


@@ -11,7 +11,11 @@
*/
import { execSync } from 'child_process';
import { writeFileSync, mkdirSync, rmSync, existsSync } from 'fs';
import { writeFileSync, readFileSync, mkdirSync, rmSync, existsSync } from 'fs';
import { dirname, join } from 'path';
import { fileURLToPath } from 'url';
const __dirname = dirname(fileURLToPath(import.meta.url));
// === CLI arguments ===
const args = process.argv.slice(2);
@@ -141,15 +145,29 @@ Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_
const FIX_SYSTEM = 'You are a Python code fixer. Return ONLY the corrected Python file. No markdown fences, no explanations — just valid Python code.';
// === Golden example ===
const GOLDEN_DIR = join(__dirname, 'golden-examples', 'todo');
const GOLDEN_FILES = ['models.py', 'schemas.py', 'main.py', 'test_main.py', 'pyproject.toml'];
function loadGoldenExample() {
if (!existsSync(GOLDEN_DIR)) return '';
let example = '\nREFERENCE IMPLEMENTATION (todo project — follow this exact structure, style, and conventions):\n\n';
for (const f of GOLDEN_FILES) {
const path = join(GOLDEN_DIR, f);
if (existsSync(path)) example += `=== ${f} ===\n${readFileSync(path, 'utf-8').trim()}\n\n`;
}
return example;
}
const GOLDEN_EXAMPLE = loadGoldenExample();
const CODE_SYSTEM = `You are a Python backend developer. Generate a complete FastAPI project with SQLAlchemy and SQLite.
Given the project requirements and JSON specification, generate these 5 files:
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these 5 files:
1. models.py - SQLAlchemy models with database setup (create_engine, declarative_base, sessionmaker, Base.metadata.create_all)
2. schemas.py - Pydantic schemas (Create + Response for each entity, use ConfigDict(from_attributes=True))
3. main.py - FastAPI application with full CRUD endpoints for each entity
4. test_main.py - Pytest tests using TestClient with separate test database and dependency override
5. pyproject.toml - Project configuration with dependencies
1. models.py SQLAlchemy 2.0: DeclarativeBase, Mapped, mapped_column (NOT legacy declarative_base)
2. schemas.py Pydantic v2: ConfigDict(from_attributes=True) (NOT class Config)
3. main.py FastAPI CRUD endpoints for each entity
4. test_main.py Pytest with TestClient, separate test.db, unique test data per test
5. pyproject.toml PEP 621 [project] format (NOT [tool.poetry])
OUTPUT FORMAT — use these exact markers to separate files:
@@ -168,18 +186,17 @@ OUTPUT FORMAT — use these exact markers to separate files:
=== pyproject.toml ===
<toml content>
DOCUMENTATION — every file must have a one-line module docstring. Classes get a one-line docstring. Keep it zensical: say what it IS, not what it does. No filler.
RULES:
- SQLite: create_engine("sqlite:///./app.db", connect_args={"check_same_thread": False})
- Each model: auto-increment "id" Column(Integer, primary_key=True, index=True)
- Schemas: BaseModel with ConfigDict(from_attributes=True) for Response variants
- Endpoints per entity: POST (create, 201), GET (list), GET by id (404 if missing), PUT (update), DELETE (204)
- Tests: separate test.db, override get_db dependency, use TestClient
- pyproject.toml: fastapi, uvicorn[standard], sqlalchemy, pytest, httpx
- Status fields: String(20) with default, NEVER Enum
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- SQLAlchemy 2.0: DeclarativeBase + Mapped + mapped_column (not Column())
- Python type unions: str | None (not Optional[str])
- pyproject.toml: PEP 621 [project] format, requires-python = ">=3.14"
- Tests: unique descriptive data per test, NOT generic "test_title" strings
- Absolute imports only (from models import ..., from schemas import ...)
- Python booleans: True/False/None (not true/false/null/none)
- NO markdown fences inside file content — just raw code
- Every _id foreign key field MUST have ForeignKey("table.id") constraint`;
- Only test endpoints that exist in main.py — no extra tests`;
// === File parser for the LLM response ===
function parseGeneratedFiles(text) {
@@ -285,7 +302,7 @@ async function runPipeline(model, scenario) {
// 3. LLM code generation
console.log(` [3/5] Code generation (LLM)...`);
const codePrompt = `PROJECT REQUIREMENTS:\n${req.text}\n\nJSON SPECIFICATION:\n${JSON.stringify(spec, null, 2)}\n\nGenerate the complete project with all 5 files.`;
const codePrompt = `${GOLDEN_EXAMPLE}\n---\n\nPROJECT REQUIREMENTS:\n${req.text}\n\nJSON SPECIFICATION:\n${JSON.stringify(spec, null, 2)}\n\nGenerate the complete project with all 5 files. Follow the reference implementation patterns exactly.`;
const codeResp = await ollamaChat(model, codePrompt, CODE_SYSTEM, 8192);
timings.push(codeResp);
writeFileSync(`${dir}/_code_raw.txt`, codeResp.text);