377 Commits

Author SHA1 Message Date
2d1b1d3ec6 initial commit: agentic office
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:14:39 +03:00
1d701ae75e CodeBench: --boss orkestroija — iso malli analysoi, pieni korjaa
- --boss + --boss-ollama: iso malli tekee spekin JA analysoi käännösvirheet
- Boss muotoilee selkeät korjausohjeet ("add import X, change line Y")
- Worker (pieni malli) toteuttaa korjauksen täsmällisen ohjeen perusteella
- Boss ei generoi koodia itse — vain analysoi ja ohjaa
2026-04-15 00:51:31 +03:00
a32c4787f8 CodeBench: plaintext-speksi pienille malleille
- spec-plain.md: "entity Author (authors): name string, email string"
- extractPlainSpec() parseri plaintext → {entities, relationships}
- Small-profiili käyttää plain-formaattia, large JSON
- specText muuttuja: plaintext tai JSON prompteihin
- Ei voi mennä syntaktisesti rikki kuten JSON
2026-04-15 00:37:34 +03:00
6ccf6fb0e1 CodeBench: stripaa === marker === file-by-file outputista 2026-04-15 00:28:57 +03:00
a3ea0c2fda CodeBench: file-by-file build-validointi vasta kun kaikki tiedostot valmiina
- Vaihe 1: generoi kaikki 4 tiedostoa peräkkäin (konteksti kasvaa)
- Vaihe 2: go build kaikilla tiedostoilla → virheet per tiedosto → korjaus
- Ratkaisee: models.go yksinään ei käänny (ei main-funktiota)
- Virheet ryhmitellään tiedostoittain, korjataan vain viallinen tiedosto
2026-04-15 00:21:36 +03:00
3b1a02a9af CodeBench: go build output ei vuoda terminaaliin (stdio: pipe) 2026-04-15 00:12:57 +03:00
56133b5d19 CodeBench: siistimpi file-by-file virheilmoitus 2026-04-15 00:12:20 +03:00
9670c85750 CodeBench: file-by-file välitön go build -validointi + korjausloop
- Jokainen tiedosto (paitsi _test.go) validoidaan go buildilla heti
- Käännösvirheet palautetaan mallille korjattavaksi (max 2 yritystä)
- Virhe korjataan yhden tiedoston kontekstissa, ei koko projektia
- Testitiedosto validoidaan vasta lopussa go testillä (vaihe 5)
2026-04-15 00:10:06 +03:00
20a1e5f015 CodeBench: --convert-model Python→Go pipeline
- 8b generoi Pythonia (osaa sen), 30b konvertoi Go:ksi
- convert-go.md prompti: FastAPI→Chi, SQLAlchemy→database/sql mappaukset
- Koodigenerointi käyttää Python golden+promptia kun convert-model asetettu
- Vaihe [3.5/5] konvertointia varten
2026-04-14 23:54:45 +03:00
a16c33f4fb CodeBench: go.mod-generointi ennen missing-tarkistusta 2026-04-14 23:37:09 +03:00
afef340eb8 CodeBench: file-by-file logiin tokenit, rivimäärä ja tok/s per tiedosto 2026-04-14 23:36:29 +03:00
a65a25c56c CodeBench: --file-by-file generointi pienille malleille
- Generoi yksi tiedosto kerrallaan: models.go → handlers.go → main.go → tests
- Edellisten tiedostojen koodi kontekstissa seuraavalle
- Max 2048 tok per tiedosto (vs 10240 kaikki kerralla)
- go.mod generoidaan aina golden examplesta (ei mallin tuotoksesta)
- Promptissa "Write ONLY the file X" + "Start with package main"
2026-04-14 23:31:20 +03:00
178bef1277 CodeBench: korvaa go.mod aina golden versiolla — pienet mallit tuottavat vääriä moduulipolkuja 2026-04-14 23:14:07 +03:00
1649d2e864 CodeBench: --spec-ollama lippu eri Ollama-instanssille spec-vaiheissa 2026-04-14 23:05:07 +03:00
3caefa2f6e CodeBench: automaattinen go.mod-korjaus pienille malleille 2026-04-14 23:02:44 +03:00
65e7365e75 Poistettu virheelliset 8b Go-tulokset (väärä promptti: code-small → Python) 2026-04-14 23:00:31 +03:00
9aa4a46768 CodeBench: kielisuffiksi priorisoituu prompttivalinnassa (code-go > code-small) 2026-04-14 22:58:14 +03:00
61966783e3 CodeBench: spec-simple.md pienille malleille
- Yksinkertaistettu JSON-skeema: ei sa_type/py_type, vain type-kenttä
- Small-profiili käyttää automaattisesti spec-simple promptia
- Vähemmän kenttiä per entity → pienempi output → 8b selviytyy
2026-04-14 22:06:29 +03:00
7bcba3daf8 CodeBench: --spec-model lippu — eri malli spec-vaiheille (1-2) 2026-04-14 21:43:44 +03:00
7d49d62f81 CodeBench: kompaktoitu Go handlers.go golden — error handling yhdelle riville 2026-04-14 21:15:57 +03:00
5b8919ef89 CodeBench: sekunnit aikaleimaan (T17-58 → T17-58-23) 2026-04-14 21:10:52 +03:00
a4942edb9f CodeBench: --no-orchestrate lippu orkestroinnin ohittamiseen 2026-04-14 21:08:34 +03:00
8fc31f2a53 CodeBench: kierroskohtainen output-dir + tiivistetty Go golden example
- runPipeline saa round-parametrin, dir: model__scenario__r1, __r2 jne.
- todo-go.md testit 6→4 (poistettu list+update toisteiset), 466→370 riviä
2026-04-14 20:57:50 +03:00
01364b7031 CodeBench: korjaa Go-pipeline — tiedostoparseri + go mod tidy
- parseGeneratedFiles regex: lisätty .go ja .mod päätteet
- Markdown fence strippaus: lisätty go/gomod
- Dockerfile.go-test: go mod tidy ennen testejä (go.sum generoidaan)
- Testattu: 6/6 golden example ilman go.sum
2026-04-14 19:27:19 +03:00
f3cd1347ab CodeBench: Go-tuki — Chi + SQLite + httptest
- Golden example: todo-go/ (6/6 testit läpi)
- todo-go.md golden reference
- prompts/code-go.md koodigenerointi-prompti
- Dockerfile.go-test (golang:1.23-alpine)
- benchmark.mjs: LANG_CONFIG, parseTestOutput, prompt/golden-valinta Go:lle
- Käyttö: node benchmark.mjs --lang go --models qwen2.5-coder:32b
2026-04-14 19:20:18 +03:00
5ea2540588 CodeBench: promptit kokonaan englanniksi — poistettu suomenkieliset esimerkit 2026-04-14 18:58:20 +03:00
b91253235e CodeBench: lisätty qwen2.5-coder:32b profiili Rust-kandidaatiksi 2026-04-14 18:56:51 +03:00
ac2e3e92fc CodeBench: orkestrointi kaikille malleille ja kielille kun >1 entiteetti
Aiemmin vain small+python. Nyt kaikki multi-entity skenaariot pilkotaan
entiteetti kerrallaan — myös Rust ja large-mallit.
Author toimii jo, Article inkrementaalisena lisäyksenä helpompi.
2026-04-14 18:47:34 +03:00
0975385101 CodeBench: reqwest 0.13 + Docker volume cache + rust:latest
- reqwest 0.12 → 0.13, rustls-tls → rustls (golden, Dockerfile, promptit)
- Docker volume cache: kipina-cargo-registry + kipina-cargo-target
- rust:latest (1.94) + cmake (aws-lc-sys vaatii)
- Dockerfile yksinkertaistettu — esikäännös ei toimi, volume hoitaa
- Golden example 10/10 testattu uudella setupilla
2026-04-14 18:42:05 +03:00
bb8be3ffb4 CodeBench: revert-if-worse + erillinen testFixRounds-laskuri
- Seurataan parasta testitulosta (bestPassed/bestFiles)
- Jos korjaus huonontaa: palautetaan paras versio ja lopetetaan
- fixRounds laskee vain testikorjaukset, ei cargo check -kierroksia
- Estää 4/7 → 0/1 regressiot korjaussilmukassa
2026-04-14 18:24:46 +03:00
8fbb8eda2d CodeBench: esikäännä Rust-riippuvuudet Docker-imageen — 35x nopeampi
Dummy-projekti samalla Cargo.toml:llä: cargo check + cargo build --tests
imageen. Runtime kääntyy vain itse projekti (~2.5s vs ~90s).
2026-04-14 18:16:15 +03:00
742f331d93 CodeBench: Rust cargo check -vaihe ennen testejä + käännösvirheiden itsekorjaus
- Vaihe 4/5: cargo check Docker-kontissa ennen cargo test -ajoa
- Käännösvirheet syötetään mallille korjattavaksi (max 2 kierrosta)
- Estää turhat cargo test -ajot kun koodi ei käänny
2026-04-14 17:52:45 +03:00
2f602717b8 CodeBench: tiivistetty todo-rs.md golden example 540→331 riviä
- handlers.rs: tiiviimpi muotoilu, kommentit kuvaavat patternia
- tests: 10 testiä → 4 avaintestiä (create, get, not_found, delete)
- spawn_server tiivistetty
- Kaikki kriittiset patternit säilyvät: RETURNING, fetch_optional, rows_affected
2026-04-14 17:50:19 +03:00
d003f73217 CodeBench: tyhjennä GPU-muisti jokaisen kierroksen alussa
clearVram()-funktio vapauttaa kaikki Ollama-mallit VRAM:sta ennen testiä.
2026-04-14 17:47:55 +03:00
882bcece06 CodeBench: kirjoita results.json jokaisen kierroksen jälkeen
Välitulokset näkyvät heti tiedostossa, ei tarvitse odottaa koko ajon loppua.
2026-04-14 17:45:54 +03:00
477c21efd0 CodeBench: Rust golden example — todo-rs.md + kielitietoinen valinta
- Luotu todo-rs.md golden example Rust-referenssitoteutuksesta
- getGoldenForModel() huomioi nyt LANG: todo.md → todo-rs.md Rust-moodissa
- Korjattu golden-compact-rs.md /:id → /{id} bugi
- Juurisyy: malli sai Python golden examplen mutta piti generoida Rustia
2026-04-14 17:37:38 +03:00
088bad7b21 CodeBench: code-rs.md — spawn_server-esimerkki, {id} vahvistus, init_db yksinkertaistus
- Eksplisiittinen spawn_server()-koodi testien promptiin (async move wrappaus)
- {param} reittiohje vahvistettu kahdesti, chaining-ohje
- init_db: .expect() ei Result
- "You MUST generate ALL 6 files"
2026-04-14 17:08:26 +03:00
de3e33d46e CodeBench: code-rs.md — korjaa Rust-prompti kolmeen kriittiseen ongelmaan
- sqlx::query_as::<_, T>() runtime-funktiot, EI query_as!() compile-time makroja
- Route path {id} syntaksi, ei /:id (axum 0.8)
- app(pool) ottaa SqlitePool ja kutsuu .with_state(pool)
- Lisätty RETURNING, Result-palautustyypit, testiohjeistus
2026-04-14 16:40:15 +03:00
dcdb360098 Benchmark-tulokset: orkestrointi nosti 8b blogin 0p → 80p (med)
Orkestroitu 5 kierrosta: [0, 80, 80, 0, 80] med:80
3/5 kierrosta 100% testit läpi (11/11, 12/12, 11/11).
2/5 kaatui JSON-speksi -vaiheessa (ei orkestroinnin ongelma).
2026-04-14 15:45:27 +03:00
0b926c2cad CodeBench Taso 3: orkestrointi — pilko entiteetti kerrallaan pienille malleille
Small-profiili + >1 entiteettiä → generoi entiteetti kerrallaan:
1. Author (yksi entiteetti, 8b osaa)
2. Syötä Author-koodi → generoi Post + FK
3. Yhdistä ja testaa

Large-profiili jatkaa kuten ennen (kaikki kerralla).
2026-04-14 15:00:40 +03:00
a8f731d38e CodeBench: palautetaan 8b todo-readme.md — combined liian iso, hukkuu
combined-readme aiheutti uusia virheitä (POST 200 vs 201, 405).
Malli ei pysty käsittelemään kahta esimerkkiä kerralla.
2026-04-14 14:58:15 +03:00
5d0baf3ff1 CodeBench: combined-readme.md — todo + blog golden example 8b:lle
Molemmat esimerkit (single entity + FK relaatio) yhdessä tiedostossa.
1699 tokenia, 10.4% kontekstista. 8b näkee konkreettisen FK-patternen.
2026-04-14 14:54:12 +03:00
8e9fbc5422 CodeBench: code-small — FK update-testiesimerkki (author_id mukana PUT:ssa)
8b:n test_update_article palautti 422 koska author_id puuttui.
Lisätty konkreettinen test_update_post esimerkki FK-kentän kanssa.
2026-04-14 14:15:09 +03:00
06089a58b2 CodeBench: code-small — ForeignKey importin tarkennus (sqlalchemy, ei .orm)
8b importtaa ForeignKey väärästä paikasta (sqlalchemy.orm).
Lisätty eksplisiittinen "NOT from sqlalchemy.orm!" -varoitus.
2026-04-14 14:05:59 +03:00
a25c52cff4 CodeBench: mallikohtainen golden example (profiles.json → golden kenttä)
qwen3-coder:30b → todo.md (annotaatiot)
qwen3:8b → todo-readme.md (GitHub README -muoto, tutuin koulutusdata)
Golden example ladataan dynaamisesti per malli pipelinen sisällä.
2026-04-14 14:04:28 +03:00
0c3303a640 CodeBench: tyhjennä VRAM automaattisesti ennen testiajoa 2026-04-14 14:00:38 +03:00
ba48b737f2 CodeBench: --scenarios tukee yksittäistä skenaariota (todo/users/blog) 2026-04-14 13:58:38 +03:00
a3f1ead3e6 CodeBench: code-small — test_list assert >= 1 (ei == 1)
8b:n blog kaatui koska test_list assertoi tarkkaa määrää
vaikka testit jakavat saman tietokannan.
2026-04-14 13:58:13 +03:00
7fe72480b1 CodeBench: qwen3:8b primary-rooliin, FK-esimerkit code-small promptissa
profiles.json: role-kenttä (primary/backup/minimal/retired).
code-small.md: lisätty konkreettinen FK-pattern ja testi-esimerkki
relaatioille — 8b:n blog-skenaario kaatui koska ei osannut FK:ta.
2026-04-14 13:55:40 +03:00
92964e322f CodeBench: mallikohtaiset promptiprofiilit (profiles.json)
- profiles.json: malli → profiili → prompti -mappaus
- code-small.md: tiivistetty prompti pienille malleille (8b, 4b)
- benchmark valitsee automaattisesti oikean promptin mallin perusteella
- qwen3-coder:30b → code.md (large), qwen3:8b → code-small.md (small)
2026-04-14 13:54:26 +03:00
e54c1b057c Golden example: tarkat 6 testiä per entiteetti, ei ylimääräisiä
Malli generoi test_search, test_filter yms. joita ei ole endpointeissa.
Nyt todo.md listaa tarkalleen 6 testiä per entiteetti nimillä.
2026-04-14 12:56:50 +03:00
1de7e5c90b CodeBench: nopea syntaksitarkistus ennen Docker-ajoa
py_compile tarkistaa .py-tiedostot millisekunneissa.
Syntaksivirhe → suoraan itsekorjaukseen, ohitetaan Docker (~10s säästö).
2026-04-14 12:52:03 +03:00
e360896436 CodeBench Taso 4: itsekorjaava looppi — syötä pytest-virhe mallille
Jos testit epäonnistuvat, LLM saa virheilmoituksen + koodin ja korjaa.
Max 3 korjauskierrosta. Testattu: qwen3:8b users 0/6 → korjaus → 6/6.
2026-04-14 12:46:06 +03:00
6a40ca5730 CodeBench: golden example markdown-muodossa (koodi + selitykset)
todo.md yhdistää koodin ja annotaatiot: miksi pattern on valittu,
mitä EI saa tehdä. 1567 tokenia (vs raaka 1340, compact 335).
Benchmark lataa .md-version oletuksena, fallback erillisiin tiedostoihin.
2026-04-14 12:38:25 +03:00
2d470ee418 CodeBench: deprecated-patterns.md + inline deprecated-säännöt promptissa
Lisätty SQLAlchemy 2.0, Pydantic v2, FastAPI ja Python deprecated →
modern patterniparit. Uusimmat dokumentaatiot tarkistettu 2026-04-14.
2026-04-14 12:28:35 +03:00
062e6af776 CodeBench: vahvista CRITICAL-sääntö — ei ylimääräisiä kenttiä
qwen3:14b lisäsi created_at spekin ulkopuolelta ja käytti
server_default=datetime.now (virheellinen). Nostettu CRITICAL-tasolle.
2026-04-14 12:27:10 +03:00
75870c1100 CodeBench: korjaa aikaleima-sääntö — ei lisää ylimääräisiä kenttiä, func import
func.now() aiheutti NameError koska mallit eivät importanneet func:ia.
Uusi lähestymistapa: kielletään ylimääräiset kentät, ja JOS speksissä
on aikaleimat niin käytetään server_default + func import.
2026-04-14 12:18:36 +03:00
6e83fad31d CodeBench: 3 uutta promptisääntöä 5-kierroksen virheanalyysistä
1. Aikaleimakentät: server_default=func.now(), ei pakollisia Create-schemassa
2. Ei ylimääräisiä filter/search-endpointeja
3. Ei ylimääräisiä kenttiä spekin ulkopuolelta
2026-04-14 12:14:36 +03:00
0f3310996e CodeBench: oletus-URL 127.0.0.1 localhostin sijaan (Node 18 IPv6-ongelma) 2026-04-14 11:08:21 +03:00
e2a16b8ff6 CodeBench: väliraportti jokaisen kierroksen jälkeen
Näyttää mediaanin, kaikkien kierrosten pisteet ja trendin (▲▼─).
2026-04-14 11:04:51 +03:00
a0d3748faf CodeBench: --rounds N toistaa testiajot 1-10 kertaa
Kierrosyhteenveto näyttää mediaanin, min/max ja pass-raten per kierros.
Käyttö: node benchmark.mjs --models qwen3:14b --scenarios all --rounds 3
2026-04-14 11:03:00 +03:00
01b4fb8e22 CodeBench: --compact tiivistää golden examplen templaatiksi
Python: 1340 → 335 tokenia (−75%)
Rust: 3383 → 445 tokenia (−87%)
Käyttö: node benchmark.mjs --compact --models qwen3:4b
2026-04-14 10:59:39 +03:00
e7b33b7d6f CodeBench: Rust-tuki (--lang rust), golden example todo-rs, Dockerfile.cargo-test
- golden-examples/todo-rs/: Axum 0.8 + SQLx + SQLite, 10 testiä
- prompts/code-rs.md: Rust-koodingenerointiprompt
- Dockerfile.cargo-test: rust:1.87-slim testikontti
- benchmark.mjs: --lang python|rust, kieliriippuvainen golden example,
  parseri tukee cargo test -tuloksia, src/ alihakemistot
2026-04-14 10:55:50 +03:00
9da5540ca2 Golden example: todo-rs (Axum + SQLx + SQLite) 2026-04-14 10:50:16 +03:00
838d5fbd73 SUPERAGENTS.md: mallisuositukset VRAM-luokittain benchmark-datan perusteella
4GB qwen3:4b, 8GB qwen3:8b (95p), 16GB qwen3:14b (100p),
24GB qwen3-coder:30b (97p, 123 tok/s), 48GB molemmat rinnakkain.
Thinking-moodi huonontaa tuloksia. Yleismallit voittivat kooderimallit.
2026-04-14 10:33:35 +03:00
d02f6a51c1 CodeBench: --think lippu thinking-moodin testaamiseen
think:true + 3× token-raja (ajattelu vie ~2/3 tokeneista).
Käyttö: node benchmark.mjs --think --models qwen3:14b
2026-04-14 10:12:44 +03:00
8ba9ef83a3 CodeBench: num_ctx 16384 — rajoita konteksti-ikkuna VRAM-säästöksi
256K konteksti varaa ~15 GB KV-cachea vaikka benchmark käyttää ~3K.
16K riittää hyvin ja säästää merkittävästi VRAM:ia.
2026-04-14 09:49:30 +03:00
f50dc884a3 CodeBench: automaattinen aikaleima ja arkistointi results/-kansioon
Output-hakemisto /tmp/kipina-benchmark/2026-04-14T12-30/
Tulokset kopioidaan automaattisesti results/<aikaleima>.json/.html
2026-04-14 09:47:32 +03:00
7b27800390 Siirrä kipina-codebench projektin päätasolle 2026-04-14 09:44:14 +03:00
b93ae2fd1b Golden examples: README.md — ohje uusien esimerkkien luomiseen 2026-04-14 09:44:02 +03:00
4c116428c3 kipina-codebench: itsenäinen benchmark-moduli git-submoduliksi
Refaktoroitu tests/-kansiosta omaksi moduliksi:
- prompts/ — kaikki promptit erillisinä .md-tiedostoina
- golden-examples/ — todo (taso 1) + blog (taso 2)
- benchmark.mjs lataa promptit ja esimerkit dynaamisesti
- Dockerfile.pytest, report-template.html, package.json, README.md
- results/ — tallennetut benchmark-tulokset
2026-04-14 09:42:20 +03:00
542230f091 Benchmark: promptisääntö — update-testidatan pitää sisältää kaikki pakolliset kentät
qwen3-coder:30b users 6/7: test_update_user lähetti luontipaiva=None
NOT NULL -kenttään. Uusi sääntö estää tämän.
2026-04-14 09:31:42 +03:00
c217271907 Benchmark-tulokset 2026-04-14: mistral-perhe ja top3-vertailu
top3: qwen3-coder:30b ★★★★★ 97p, codestral:22b ★★★★☆ 88p, qwen3.5:35b 40p
mistral: codestral:22b 80p, mistral-small3.1 30p, devstral:24b 44p
2026-04-14 09:30:40 +03:00
a08b5f3893 Benchmark: think:false — kytke ajattelu pois Ollama-kutsuissa
Thinking-mallit (qwen3.5) käyttivät kaikki tokenit ajatteluun
eikä content-kenttään jäänyt mitään. think:false pakottaa
suoran vastauksen ilman ajattelublokkia.
2026-04-14 08:48:03 +03:00
25b9ab0c37 Benchmark: käytä thinking-kenttää fallbackina jos content tyhjä
qwen3.5 palauttaa vastauksen thinking-kentässä kun content on tyhjä.
Lisätty debug-logi thinking-malleille.
2026-04-14 08:45:06 +03:00
62c9b6e17e Benchmark: nosta token-rajoja thinking-malleja varten
qwen3.5 palauttaa ajattelun erillisessä thinking-kentässä,
content jää tyhjäksi jos tokenit loppuvat kesken.
Vaatimukset 1024→2048, speksi 2048→4096.
2026-04-14 08:42:32 +03:00
ad097ca712 Benchmark: HTML-raportti laskee pisteet itse (toimii vanhoilla tuloksilla) 2026-04-14 08:29:47 +03:00
868d116961 Benchmark: HTML-webbiraportit tuloksista
Standalone HTML-tiedosto joka sisältää:
- Yhteenvetokortit (keskiarvo, paras malli, nopein, testit)
- Mallikohtainen taulukko palkkikaavioilla
- Yksittäiset tulokset sortattavassa taulussa
- Dark mode, ei ulkoisia dependencyjä
2026-04-14 08:27:01 +03:00
02e3701d77 Benchmark: output-tokenit yhteenvetotaulussa per skenaario ja yhteensä 2026-04-14 08:20:32 +03:00
b3abf4e89f Benchmark: mallikohtainen yhteenvetotaulu + kokonaisaika
Näyttää per malli: testit ja aika per skenaario, kokonaisläpäisy,
kokonaisaika, keskimääräinen tok/s ja keskipisteet.
2026-04-14 08:19:27 +03:00
9f2899b83d Benchmark: pisteytys (0-100) ja tähtiluokitus tuloksissa
Pisteytys: speksi 10p + koodi 10p + testit 60p + korjaukset 20p.
Tähdet: ★★★★★ (90+), ★★★★☆ (70+), ★★★☆☆ (50+), ★★☆☆☆ (25+), ★☆☆☆☆ (1+).
Näkyy per-ajo rivillä, tulostaulussa ja yhteenvedossa.
2026-04-14 08:10:27 +03:00
4a811e4171 Benchmark: näytä kontekstin koko (promptin token-arvio) tuloksissa 2026-04-14 08:05:59 +03:00
8efbf96295 Golden example: blog (taso 2, relaatiot Author → Post)
13 testiä, ForeignKey-relaatio, uniikki suomalainen testidata
(Aleksis Kivi, Tove Jansson jne). Testattu Docker-kontissa.
2026-04-14 08:03:21 +03:00
16f40a7536 Benchmark: pytest ajetaan Docker-kontissa (kipina-pytest)
Kontti hoitaa uv init + uv add + pytest eristetyssä ympäristössä.
Python 3.14, ei VIRTUAL_ENV-ongelmia, täysi toistettavuus.
Image: docker build -t kipina-pytest -f tests/Dockerfile.pytest tests/
2026-04-14 07:39:23 +03:00
42ee959781 Benchmark: uv init + uv add hoitaa projektiasetuksen
LLM generoi enää 4 tiedostoa (ei pyproject.toml).
Pipeline: uv init → uv add deps → kirjoita .py → pytest.
Poistaa Poetry-yhteensopivuusongelmat kokonaan.
2026-04-14 07:34:06 +03:00
0850a139f1 Benchmark: fallback korvaa Poetry-pyproject.toml PEP 621 -versiolla
Kaikki testatut mallit generoivat [tool.poetry] -muodon
vaikka kultainen esimerkki näyttää [project]-muodon.
uv sync ei ymmärrä Poetrya → pytest ei asennu → kaatuu.
Fallback korvaa rikkinäisen pyproject.toml kultaisella versiolla.
2026-04-14 07:30:55 +03:00
d6a544909c Benchmark: kultainen esimerkki + zensical-dokumentointiohjeet
- golden-examples/todo/: 6/6 PASS referenssitoteutus
  - SQLAlchemy 2.0 (DeclarativeBase, Mapped, mapped_column)
  - Pydantic v2 (ConfigDict)
  - PEP 621 pyproject.toml, Python >=3.14
  - Uniikki testidata per testi
- CODE_SYSTEM päivitetty: few-shot kultaisesta esimerkistä
- DOCUMENTATION.md: zensical-dokumentointiohjeet
2026-04-14 07:28:47 +03:00
8f154a578c SUPERAGENTS.md: benchmark-arkkitehtuuri kehityksen todentamiseen
Moniulotteinen pisteytys (0-100), portaittaiset vaikeustasot
(CRUD → relaatiot → liiketoimintalogiikka → kehittyneet patternit),
historiavertailu ja regressiotunnistus.
2026-04-14 07:16:37 +03:00
7221f5e920 SUPERAGENTS.md: itseoppivan koodausagentin arkkitehtuuri ja toteutussuunnitelma 2026-04-14 07:14:17 +03:00
34a56e408d Benchmark: stripThinking tukee myös qwen3/3.5 <think>-tageja 2026-04-14 06:58:18 +03:00
ecd4bc2ac3 Benchmark: nosta koodigeneroinnin token-raja 4096 → 8192
gemma4:e4b tuotti 323 riviä ja tokenit loppuivat kesken,
pyproject.toml ei mahtunut vastaukseen.
2026-04-14 06:38:40 +03:00
7dc2af59c3 Benchmark: stripThinking poistaa gemma4-ajattelutagit vastauksista 2026-04-14 06:35:31 +03:00
4aa09e1025 Benchmark: LLM generoi koodin templaattien sijaan
Vaihe 3 käyttää nyt oikeaa LLM-kutsua (CODE_SYSTEM-prompti)
koodin generointiin. Templaattifunktiot poistettu kokonaan.
Tämä mittaa mallin todellista koodingenerointikykyä.
2026-04-13 22:23:35 +03:00
20cea8f268 Model benchmark: testaa kaikki Ollama-mallit järjestelmällisesti
Ajaa täyden pipeline-kierroksen per malli × skenaario:
1. Client-prompti → vaatimukset
2. Manager/SPEC_SYSTEM → JSON-speksi
3. Template-generointi → koodi
4. Validointi + LLM-korjaussilmukka
5. uv sync + pytest

Tuottaa vertailutaulukon: speksin laatu, testien tulos, nopeus.
Tukee suoraa Ollamaa (--ollama) ja hub-reittiä (--hub).
2026-04-13 22:08:47 +03:00
38a18c555b Debug: reititys logittaa kaikki solmut ja niiden tilat 2026-04-13 21:53:40 +03:00
8138e41aa1 native-noden tuunausta 2026-04-13 21:29:05 +03:00
6ee5bdf960 Native node: lämmittelykutsu lataa mallin VRAM:iin heti käynnistyksessä 2026-04-13 21:23:56 +03:00
cf3bf54bf8 kipina-node: automaattinen versiopäivitys build-hashilla
Poistettu interaktiivinen "haluatko korvata?" -kysely. Tilalle:
- Bootstrap hakee .build-hash palvelimelta joka käynnistyksellä
- Vertaa paikalliseen kipina-node-bin.hash
- Lataa uuden automaattisesti jos hash eroaa
- Näyttää version käynnistyksen yhteydessä

Ei enää tilannetta jossa vanha binääri jää vahingossa ajoon.
2026-04-13 21:21:48 +03:00
56f21a96c9 TUI: VRAM-tila värikoodattu (vihreä=100% GPU, keltainen=osittainen, punainen=CPU) 2026-04-13 21:12:50 +03:00
763b93396c Reititys: busy-solmut suodatetaan pois — työ jakautuu solmuille
Aiemmin busy-lukko luettiin mutta sitä ei käytetty suodatukseen,
joten sama solmu valittiin aina uudelleen vaikka se oli varattu.
Nyt matching-lista suodattaa pois busy-solmut, joten toinen
vapaa solmu saa tehtävän. Heavy-fallback kevyempään solmuun
jos kaikki isot mallit ovat varattuja.
2026-04-13 21:09:24 +03:00
e09962940a Native node: VRAM-tila TUI:ssa (ollama ps)
- fetch_ps(): hakee /api/ps ja palauttaa ModelVramStatus
- ModelVramStatus: size vs size_vram → 100% GPU / osittainen / CPU
- TUI: uusi "VRAM: ✓ qwen3:32b (20.1 GB) — 100% GPU" -rivi
- Taustapäivitys 30s välein
- Tuore linux-x86_64 binääri
2026-04-13 21:06:27 +03:00
5e44b63b0c Native node: tuore linux-x86_64 -binääri (reconnect, timestamp, node_id) 2026-04-13 16:54:28 +03:00
0f3881aa02 Fix: async RwLock read ennen Mutex-scopea (Send-yhteensopivuus) 2026-04-13 16:34:51 +03:00
fa85dcc5b3 Älykäs reititys: capability=heavy priorisoi isoimman mallin solmun
Hub:
- Parsii node_models:sta suurimman mallin parametrimäärän (B)
  per solmu (esim. qwen3:32b → 32, qwen2.5-coder:7b → 7)
- Tallentaa node_max_param_b: HashMap<u64, u32>
- ChatCompletionRequest: uusi capability-kenttä ("heavy"/"light")
- Reitityslogiikka: capability=heavy → valitsee solmun jolla on
  suurin malli; oletus → natiivi ensin kuten ennenkin

Frontend (pipeline):
- JSON-speksin generointi: capability=heavy
- QA-korjaussilmukan koodikorjaus: capability=heavy
- Observer/README-arviointi: capability=heavy
- Vaatimukset (Client): oletus (kevyt, kelpaa pieni malli)

Tämä mahdollistaa sen, että A40-koneella pyörivä Qwen3:32B
saa raskaat tehtävät ja selaimen 0.5B-malli hoitaa kevyet.
2026-04-13 16:30:47 +03:00
58d93613f0 Hero-kuvat: oikeat kipina.tech-kuvat (forge, serpent, gecko) 2026-04-13 14:33:11 +03:00
66b4435362 Teemavalitsin: painike kiertää gecko/forge/serpent, oletus forge
- Teemapainike (emoji) oikeaan yläkulmaan kuten kipina.tech:ssä
- Oletus forge (syaani), tallennetaan localStorage:iin
- Hero-kuva vaihtuu teeman mukaan fade-efektillä
- Kolme hero-kuvaa: gecko_hero, forge_hero (hämähäkki), serpent_hero
2026-04-13 14:29:14 +03:00
3a00de9b8e Kolme kipina.tech-teemaa: gecko, forge, serpent — satunnaisvalinta
Tuodaan kipina.techin kolme visuaalista teemaa kipina.studioon:
- gecko: lämmin kulta/oranssi (#ff7b00)
- forge: kyber-sininen/syaani (#00e5ff)
- serpent: neon-turkoosi (#00ffff)

Teema arvotaan satunnaisesti joka sivulatauksella. Kaikki aiemmin
hardcoodatut #ff6b00-aksenttivärit korvattu CSS-muuttujilla
(--hero-accent, --hero-glow) jotka mukautuvat teemaan.
2026-04-13 14:22:33 +03:00
670141c8c3 QA-korjaussilmukka: validointi delegoi ongelmat Coder-agentille
Aiemmin mekaaninen validateProjectCode() vain listasi ongelmat terminaaliin.
Nyt pipeline toimii näin:
1. QA-agentti ajaa mekaanisen validoinnin
2. Jos ongelmia → ryhmittelee ne tiedostoittain
3. Delegoi jokaisen tiedoston korjauksen oikealle agentille (Coder/Data/QA)
4. Agentti (LLM) palauttaa korjatun tiedoston
5. Validointi ajetaan uudelleen — max 2 korjauskierrosta
6. Lopullinen tulos näytetään vihreänä/punaisena
7. Tarkkailija arvioi lopullisen version

Kaikki korjausvaiheet tallentuvat promptLog:iin → näkyvät oppimispolussa.
2026-04-13 14:09:10 +03:00
59daebbd38 Template pipeline: docker-compose.yml ja .dockerignore mukaan generointiin
Jokainen generoitu projekti sisältää nyt:
- Dockerfile (oli jo)
- docker-compose.yml (uusi: build + portti 8000 + named volume)
- .dockerignore (uusi: .venv, __pycache__, *.db, .git)

Testattu: docker compose build + kontin käynnistys + API-kutsu OK.
2026-04-13 13:27:50 +03:00
42b71dbf77 Templatejen laatu: declarative_base, ConfigDict, ForeignKey
- models.py: sqlalchemy.ext.declarative → sqlalchemy.orm (poistaa
  MovedIn20Warning-varoituksen)
- schemas.py: class Config → model_config = ConfigDict() (poistaa
  PydanticDeprecatedSince20-varoituksen)
- models.py: _id-kentät saavat ForeignKey("taulu.id") kun speksissä
  on relationship-merkintä

Testattu: 10 erilaista projektia, 78/78 testiä läpi, 0 varoitusta.
2026-04-13 13:18:11 +03:00
b88a741f85 Template pipeline: JS→Python -arvomuunnokset korjattu
Ongelma: generoiduissa Python-tiedostoissa JS-booleanit (false/true)
päätyvät sellaisenaan Python-koodiin, jossa ne eivät ole valideja.
Lisäksi datetime-importit puuttuivat kun LLM antoi extra_imports-kentässä
pelkän "datetime"-merkkijonon eikä kokonaista import-lausetta.

Korjaukset:
- pyLiteral(): muuntaa JS-arvot Python-literaaleiksi (false→False jne.)
- pyJsonLiteral(): testidatan serialisointi Python-dict-muodossa
- tmplSchemas: datetime-importit tunnistetaan automaattisesti kentistä
- tmplModels + tmplSchemas: oletusarvot pyLiteral()-funktion kautta
- tmplTests: JSON.stringify korvattu pyJsonLiteral():lla
- Validaattori: tunnistaa nyt datetime-import-puutteet ja JS-booleanit

Testattu: molemmat aiemmin rikkinäiset speksit generoivat nyt toimivan
koodin — 6/6 pytest-testiä läpi molemmilla.
2026-04-13 12:44:08 +03:00
Jaakko Vanhala
68c7195d54 TEMPLATING.md: periaatteet rakennuspalapohjaiselle koodigeneroinnille 2026-04-13 06:59:12 +03:00
Jaakko Vanhala
3d20238eef Uudelleenreititystä ja templatingia 2026-04-13 06:54:56 +03:00
Jaakko Vanhala
8b8ba01af3 Toipuminen yhteyskatkoksesta: hub ilmoittaa API:lle, node reconnectaa
- Hub: kun node katoaa kesken tehtävän, palauttaa virheen API-kutsulle
- Hub: node_active_task seuraa mikä tehtävä on kesken
- Hub: timeout 600s → 120s
- Node: reconnect nollaa busy-tilan ja näyttää sen TUI:ssa
2026-04-13 06:50:45 +03:00
Jaakko Vanhala
a3b95a56e8 Native node: timestamp lokiin, node_id otsikkoon, yhdistetty-tila 2026-04-13 06:32:11 +03:00
Jaakko Vanhala
5b20ebe800 Opas: terminologia korjattu — relaatio on taulu, relationship on yhteys 2026-04-12 20:32:52 +03:00
Jaakko Vanhala
ffe9bd6902 Opas päivitetty: relaatiotuki, architect-agentin rooli, vertailuluvut 2026-04-12 20:27:41 +03:00
Jaakko Vanhala
d27068b11a UI-korjaus korjattu (GTFO gemini) 2026-04-12 20:18:39 +03:00
Jaakko Vanhala
8468724a4c Architect-prompti parannettu, relaatiotuki templateihin, englanti-sääntö
- SPEC_SYSTEM: chain-of-thought, domain-esimerkit, anti-patternit, relaatiosäännöt
- Speksi-puhdistus: korjaa sa_type | None -virheet automaattisesti
- Etusivun teksti päivitetty
- Koodissa käytetään aina englantia (entity/field names)
2026-04-12 20:15:22 +03:00
Jaakko Vanhala
6ef71b7e5c Templates for different tasks 2026-04-12 20:02:25 +03:00
Jaakko Vanhala
b2ee8b9031 Pipelinen parannuksia building blockeilla 2026-04-12 18:48:14 +03:00
Jaakko Vanhala
c1a5f8aff5 ZIP-tiedostonimi lyhennetty max 3 sanaan 2026-04-12 16:07:46 +03:00
Jaakko Vanhala
8ee997cb56 Projektin ZIP-lataus projektikorttiin
Lataa .zip -nappi renderöidään projektikortin headeriin.
ZIP rakennetaan selaimessa ilman ulkoisia kirjastoja (CRC-32 + ZIP-rakenne inline).
Kansiorakenne säilyy: prompts/*.md -tiedostot menevät alihakemistoon.
2026-04-12 15:59:14 +03:00
Jaakko Vanhala
cd67562a67 QA katselmoi, DevOps keskittyy deploymenttiin
- Review-luuppi siirretty DevOps→QA: QA katselmoi koodin ja
  lähettää korjausvaatimukset Coderille (max 3 kierrosta)
- QA:n prompt laajennettu: review-checklist + testien kirjoitus
- DevOps:n prompt uusittu: Dockerfile + deployment -fokus
- Pipeline: Client→Manager→Coder→QA review↔Coder fix→QA testit→DevOps Dockerfile→Observer
- AGENTS_VERSION 4→5
2026-04-12 15:55:45 +03:00
Jaakko Vanhala
1f85c03624 Pipeline-rajoitteet kevennetty ja näkyville Asetukset-sivulle
- maxTokens: client/manager/devops/observer 512→1024
- Client: 200→400 sanaa, 3-5→3-8 ominaisuutta, MVP-rajoitus poistettu
- Manager: 4-5→8 tiedostoa, vapaa tila 6→8
- Terminaali: 100→300 riviä, CrewAI prompt truncation 20→50 riviä
- Uusi pipelineConfig-objekti (localStorage-persistenssi)
- Asetukset-sivulle Pipeline-rajoitteet -osio sliderien kanssa
- AGENTS_VERSION 3→4
2026-04-12 15:47:46 +03:00
Jaakko Vanhala
74a2045def Landing page + oppimispolku + esimerkkiprojektit
1) Landing: gecko hero, projektin syöttökenttä, "Käynnistä"-nappi
2) Oppimispolku-välilehti: promptLog step-by-step (system prompt, syöte, tulos)
3) Kolme esimerkkiprojektia: Käyttäjähallinta-API, UWB-data-analyysi, Todo-sovellus
4) Landing → App -siirtymä käynnistää pipelinen suoraan
2026-04-12 15:24:44 +03:00
Jaakko Vanhala
9b2b7767b5 Depoa paranneltu 2026-04-12 14:28:58 +03:00
Jaakko Vanhala
1718805978 CrewAI-yhteensopiva projektioutput: agents.yaml, tasks.yaml, crew.py, prompts/
Pipeline kerää promptLog-listan jokaisesta agenttikutsusta (system prompt +
syöte + tulos) ja generoi lopuksi CrewAI-rakenteen files-objektiin.
Korjattu myös template.order.length-kaatuminen vapaassa tilassa.
2026-04-12 13:41:04 +03:00
Jaakko Vanhala
7fcc97f525 docker-compose.prod: poistettu dist-volume mount joka yliajoi Docker-imagen frontendin 2026-04-12 12:00:21 +03:00
Jaakko Vanhala
7ce990b42a Dockerfile.prod: frontend COPY-polut korjattu (src/ → ./src/) 2026-04-12 11:56:47 +03:00
Jaakko Vanhala
dc71829430 Riippuvuuksien siivous: burn, smollm, phi3, uuid, log, console poistettu 2026-04-12 11:53:36 +03:00
Jaakko Vanhala
5d4a553520 riippuvuuksia karsittu 2026-04-12 11:49:08 +03:00
Jaakko Vanhala
5e82c798b1 vcachet kusee 2026-04-12 11:46:23 +03:00
Jaakko Vanhala
5f147b774f deployment kokonaan uusiksi 2026-04-12 11:41:09 +03:00
Jaakko Vanhala
4983217ee0 korjailtu depon cacheja 2026-04-12 11:20:06 +03:00
Jaakko Vanhala
27c33e41c3 v0.3.2: Asiakas-agentti, dynaaminen pipeline, /api/chat, kpn stop 2026-04-12 11:09:24 +03:00
Jaakko Vanhala
2b33980be4 buildia viilattu 2026-04-12 11:05:35 +03:00
Jaakko Vanhala
8995bcef30 ui updates 2026-04-12 10:40:56 +03:00
Jaakko Vanhala
2f140c8a15 uusi projekti 2026-04-12 10:28:57 +03:00
Jaakko Vanhala
094b183c17 toimii suht ok 2026-04-12 08:02:17 +03:00
Jaakko Vanhala
a91b9539b3 Promptin generointiin muutoksia 2026-04-12 07:43:59 +03:00
Jaakko Vanhala
6e2f85daa8 Lisätty *.log gitignoreen, poistettu native-node.log seurannasta 2026-04-12 07:41:34 +03:00
Jaakko Vanhala
466e61d730 Cache-busting: kipina-node lataus- ja asennusskripti ohittaa välimuistin
StatusBar ja kipina-node-skripti käyttävät ?v=timestamp-parametria
välimuistin ohittamiseen. Binäärin uudelleenlataus oletuksena Y.
deploy-binaries.sh kopioi myös kipina-node-skriptin palvelimelle.
2026-04-12 07:40:33 +03:00
Jaakko Vanhala
5f00582053 UI:n system prompt ja sampling-parametrit välittyvät inferenssiin asti
Frontend lähettää agentin asetukset (system_prompt, temperature, top_k,
max_tokens, repeat_penalty, stop) API:lle. Hub välittää ne solmulle.
Native-node ja Wasm-coder käyttävät välitettyjä arvoja hardkoodattujen
sijaan.
2026-04-12 07:39:41 +03:00
Jaakko Vanhala
e272b0d124 TUI build korjattu 2026-04-12 06:43:12 +03:00
Jaakko Vanhala
d3affb3a09 TUI again 2026-04-12 06:33:10 +03:00
Jaakko Vanhala
1377e72f78 TUI inc 2026-04-12 06:26:34 +03:00
Jaakko Vanhala
403f35efdc TUI inc 2026-04-12 06:22:52 +03:00
Jaakko Vanhala
ce0ccbddd3 Jotain jännää 2026-04-11 19:17:48 +03:00
Jaakko Vanhala
80806498e0 Remote start stop control 2026-04-11 19:14:20 +03:00
Jaakko Vanhala
660e80c2bc natiivinodehommajuttuja 2026-04-11 18:14:08 +03:00
Jaakko Vanhala
591cfcb04b Päivitetyt kipina-node-binäärit: macOS, Linux x86/ARM, Windows
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:04:53 +03:00
Jaakko Vanhala
3cda57f0bc Hub: solmujen mallilistaus muistiin + /api/tags palauttaa verkon mallit
Natiivisolmun auth-viestistä tallennetaan mallilistaus node_models-mappiin.
/api/tags priorisoi verkon solmujen malleja lokaalin Ollaman edelle.
api_hardware käyttää tietokannan litteää rakennetta.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:04:41 +03:00
Jaakko Vanhala
23e7b92d03 kipina-node: auth-viesti välittää mallinimen ja Ollama-mallilistauksen hubille
build_auth_message käyttää nyt oikeaa mallinimeä hardkoodatun sijaan.
Lisäksi natiivisolmu hakee Ollaman mallilistauksen ja lähettää sen
auth-viestissä hubille.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:04:23 +03:00
Jaakko Vanhala
9f58febe21 Deploy-putki: Windows-build + automaattinen binäärikäännös
build-binaries.sh: lisätty Windows x86_64 (mingw-w64) neljänneksi
kohteeksi. deploy.sh: binäärit käännetään automaattisesti ennen
Docker-buildia, jolloin ne päätyvät Astron kautta kipina.studioon.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:03:53 +03:00
Jaakko Vanhala
b1de0d37f7 lisätty admin laitteistonäkymä 2026-04-11 17:42:17 +03:00
Jaakko Vanhala
4ff626ab88 broadcastit pois 2026-04-11 17:37:16 +03:00
Jaakko Vanhala
a45616046d Hub: broadcast-viestittely korvattu kohdennetulla reitityksellä
API-vastaukset käyttävät nyt oneshot-kanavaa broadcast-suodatuksen
sijaan, ja user_text lähetetään vain lähettäjäsolmulle. Stats-broadcast
säilyy UI:lle ja adminille.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:36:24 +03:00
Jaakko Vanhala
ee048b0b68 kipina-node: automaattinen Ollama-instanssien haku + konttituki
Skripti skannaa localhost, 127.0.0.1, ollama, host.docker.internal
ja tarjoaa valikon jos useampi löytyy. Ei vaadi enää paikallista
ollama-binääriä — toimii myös Docker-konttia tai remote-instanssia
vasten. OLLAMA_URL välitetään Rust-binäärille.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:19:00 +03:00
Jaakko Vanhala
4e83569194 Konsoliloki näyttää mallin nimen: ✓ qwen2.5-coder:3b | 438 tok | 4952ms | 93.4 tok/s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:01:32 +03:00
Jaakko Vanhala
f42b692eeb Lyhennetty konsolilogi: yksi rivi per pyyntö + yksi rivi per tulos
Ennen: koko prompti + vastaus logitettiin (satoja rivejä)
Jälkeen:
  → task_id:abc | 42r prompti | "Write ONLY models.py..."
  ✓ 128 tok | 3200ms | 40.0 tok/s | "from sqlalchemy import..."

llm_done-viestissä prompt lyhennetty viimeiseen riviin (ei koko kontekstia).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:00:39 +03:00
Jaakko Vanhala
f79bb16f3d kipina-node binäärijakelu: download-skripti + macOS ARM64 binääri
kipina.studio/kipina-node — shell-skripti joka:
1. Tunnistaa OS/arch (macOS ARM, Linux x86/ARM)
2. Tarkistaa Ollaman (asennettu? käynnissä?)
3. Lataa kielimallin automaattisesti
4. Lataa oikean binäärin kipina.studio/download/
5. Käynnistää noden → yhdistää hubiin

Käyttö: curl -sSL https://kipina.studio/kipina-node | bash
Tai:    curl -sSL https://kipina.studio/kipina-node -o kipina-node && chmod +x kipina-node && ./kipina-node

build-binaries.sh — kääntää binäärit kaikille alustoille (Docker).
macOS ARM64 binääri (4.9MB) valmis, Linux x86_64 build käynnissä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:51:31 +03:00
Jaakko Vanhala
e81fc33faf Join-dialogi: kaksi selkeää vaihetta (Ollama + kipina-node binääri)
Vaihe 1: Asenna Ollama
  curl -fsSL https://ollama.ai/install.sh | sh
  (+ brew/Windows-vaihtoehdot)

Vaihe 2: Lataa ja käynnistä kipina-node
  curl -sSL https://kipina.studio/kipina-node -o kipina-node && chmod +x kipina-node && ./kipina-node

Ei vaadi Rustia — valmis binääri ladataan suoraan.
Molemmat komennot kopioitavissa yhdellä klikkauksella.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:56:00 +03:00
Jaakko Vanhala
433726c553 Palautettu docker-compose.prod.yml: vain Caddy + Hub (ei Ollamaa palvelimella)
Ollama ajetaan käyttäjien omilla koneilla join.sh:n kautta,
ei palvelimella. Selain-Wasm toimii fallbackina.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:53:06 +03:00
Jaakko Vanhala
dec2e24e2f "Liitä koneesi" -nappi + join.sh + Docker native-node
UI: status-palkissa vihreä "+ Liitä koneesi" -nappi joka avaa dialogin:
  curl -sSL https://kipina.studio/join.sh | bash

join.sh:
- Tarkistaa Ollaman → tarjoaa asennusta jos puuttuu
- Käynnistää Ollaman jos ei pyöri
- Lataa kielimallin (qwen2.5-coder:3b)
- Käynnistää native-noden → yhdistää wss://kipina.studio/ws

Docker: Dockerfile.native + docker-compose.prod.yml päivitetty
ollama + native-node -konteilla palvelinpuolelle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:46:22 +03:00
Jaakko Vanhala
9058033669 Poistettu fonttiskaalaus (A-/A+) — ei vaikuttanut terminaaliin
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:33:20 +03:00
Jaakko Vanhala
8bd86e6325 Fonttikoon A-/A+ säädin: ±20% viidessä askeleessa
Oikeassa yläkulmassa A- ja A+ napit. Skaalaa 80-120%, tallennetaan localStorageen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:31:33 +03:00
Jaakko Vanhala
c1133bb075 Terminaalin fontti 15→16px
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:27:31 +03:00
Jaakko Vanhala
6502d75efc Terminaalin syöttökenttä korostettu: sininen reunus, varjo, isompi fontti 16px
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:25:37 +03:00
Jaakko Vanhala
9f8b7fe920 UI-fonttikoot kasvatettu: body 16px, terminaali 15px, tabit 15px, status 14px
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:23:46 +03:00
Jaakko Vanhala
746bc20fcb Agenttikuvakkeet kasvatettu: 50→64px kuva, 72→90px kortti, isompi fontti
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:22:42 +03:00
Jaakko Vanhala
93f6baa0ea UI kasvatettu: container 1200→1600px, terminaali korkeampi, padding leveämpi
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:18:34 +03:00
Jaakko Vanhala
cc8e871735 deploy-fast.sh: luo hakemisto palvelimelle ennen rsync:iä
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:15:19 +03:00
Jaakko Vanhala
e90f3460c3 deploy-fast.sh: päivitä vain frontend ilman kontin uudelleenkäynnistystä
docker-compose.prod.yml: frontend/dist mountataan volumena (read-only).
Hub servaa tiedostot suoraan — rsync päivittää ne lennossa.

Kolme deploy-tasoa:
1. deploy-fast.sh — vain frontend (sekunteja, ei downtime)
2. deploy-light.sh — rsync + remote Docker build (minuutteja)
3. deploy.sh — lokaali build + image siirto (hidas mutta varma)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:12:10 +03:00
Jaakko Vanhala
4d74c38618 Dockerfile: poistettu turha COPY pkg (Astro kopioi public/:n automaattisesti)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:06:39 +03:00
Jaakko Vanhala
8a1b204179 v0.3.1: Avatarit WebP (18MB→256KB), PNG:t temp-kansioon
Kaikki avatar-viittaukset .png → .webp (200px, quality 80).
Alkuperäiset PNG:t siirretty temp/avatars-png/ (gitignored).
Hub-versio 0.3.0 → 0.3.1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:06:02 +03:00
Jaakko Vanhala
b19f5a3518 deploy-light.sh: rsync + remote build (ei image-siirtoa)
Lähettää vain lähdekoodin rsync:llä (~2MB muuttuneet tiedostot),
palvelin buildaa Docker-imagen itse. Nopeampi kuin 80MB imagen siirto.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:56:01 +03:00
Jaakko Vanhala
38dc36e846 Dockerfile wasm-builder: lisätty dummy-cratet workspace-yhteensopivuuteen
cargo metadata vaatii kaikkien workspace-jäsenten Cargo.toml:n.
Lisätty hub/, native-node/, cli/ dummy-tiedostot wasm-builder-vaiheeseen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:07:28 +03:00
Jaakko Vanhala
4fe6931b5f Revolutionized 2026-04-10 08:06:15 +03:00
Jaakko Vanhala
b8e8a83e49 Dockerfile.prod päivitetty Astro-frontendille
Muutokset:
- Vaihe 1: Node.js buildaa Astro-frontendin (frontend/dist/)
- Vaihe 2: wasm-pack buildaa Wasm-moduulin erikseen
- Vaihe 3: Hub rakennetaan Rustilla
- Vaihe 4: Yhdistetään dist/ + pkg/ + avatars/ + templates/ + GUIDE.md

STATIC_DIR=/app/frontend/dist (ei enää /app/static)
Avatarit, templates ja GUIDE.md kopioidaan erikseen koska
Astro ei kopioi public/-tiedostoja dist/:iin buildin yhteydessä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:04:49 +03:00
Jaakko Vanhala
3d6914974d Prompti-textarea kasvaa automaattisesti sisällön mukaan
Koko prompti näkyy kerralla kun avatarin klikkaa — ei scrollausta.
Textarea saa overflow:hidden + auto-height sekä avatessa että kirjoittaessa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:03:18 +03:00
Jaakko Vanhala
9aff2ec154 uv kauttaaltaan + Tarkkailijan raportti rakenteelliseksi markdowniksi
uv-päivitykset:
- Koodarin NEVER-lista: ei requirements.txt, ei pip, käytä uv
- Template pyproject.toml: PEP 621, uv-yhteensopiva
- Raportin Quick Start: uv sync + uv run uvicorn

Tarkkailijan raportti uudessa formaatissa:
- Overview (yksi kappale)
- Files (taulukko: tiedosto + tarkoitus)
- Quick Start (uv-komennot koodiblokissa)
- Docker (build + run koodiblokissa)
- API Endpoints (taulukko: method, path, description)
- Architecture (rakenne ja päätökset)
- Risk Assessment (taulukko: severity, issue)

Malli saa taulukkopohjat valmiina → täyttää ne oikealla datalla.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:59:10 +03:00
Jaakko Vanhala
ecd4525a7f Review-korjauskierrokset nostettu 2→3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:51:41 +03:00
Jaakko Vanhala
7a3e5278b9 Review-korjausluuppi: DevOps tarkistaa korjaukset, max 2 kierrosta
Pipeline:
1. DevOps review → löytää virheitä
2. Koodari korjaa → päivittää files-objektin
3. DevOps review (kierros 2) → tarkistaa korjaukset
4. Jos yhä virheitä → Koodari korjaa uudelleen
5. LGTM tai max 2 kierrosta → eteenpäin

Terminaalissa näkyy kierrosnumero: "koodikatselmointi (kierros 2)"
LGTM merkitään vihreällä ✓-merkillä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:50:43 +03:00
Jaakko Vanhala
8dcf269b42 strip_code_fences: poistetaan kaikki backtick-rivit aggressiivisesti
Ollama tuottaa \`\`\`python ... \`\`\` -blokkeja vaikka system prompt
kieltää ne. Nyt kaikki rivit jotka alkavat \`\`\` suodatetaan pois,
myös keskeltä vastausta (useita koodiblokkeja per vastaus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:47:03 +03:00
Jaakko Vanhala
cb16f35265 Tiedostot kulkevat agentilta toiselle: korjaukset päivittyvät, QA saa korjatun koodin
Kontekstiketju nyt:
1. Data → models.py
2. Koodari → schemas.py (saa models.py kontekstina)
3. Koodari → main.py (saa models.py + schemas.py)
4. Koodari → pyproject.toml
5. DevOps → review (saa kaikki tiedostot)
6. Koodari → korjaukset → parsitaan takaisin files-objektiin (--- filename ---)
7. QA → testit (saa KORJATUT tiedostot, ei alkuperäisiä)
8. DevOps → Dockerfile (saa kaikki tiedostot + testit)
9. Tarkkailija → README (saa kaiken)

Aiemmin: korjaukset menivät terminaaliin mutta eivät päivittyneet files-objektiin.
Nyt: korjattu koodi parsitaan --- filename --- -erottimilla takaisin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:44:01 +03:00
Jaakko Vanhala
b9d340b4b4 install.sh: Debian/Ubuntu-asennusskripti
Asentaa automaattisesti:
1. Build-työkalut (build-essential, pkg-config, libssl-dev)
2. Rust (rustup)
3. Node.js 22 (nodesource)
4. Ollama
5. qwen2.5-coder:3b -malli

Käyttö: ./network-poc/install.sh && ./network-poc/local.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:42:27 +03:00
Jaakko Vanhala
dd07e536f0 Korjattu duplikaatti const tst -määrittely kpnProject:ssa
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:39:51 +03:00
Jaakko Vanhala
9af481a022 DevOps-agentin prompti laajennettu staattiseksi koodianalyysiksi
9-kohdan checklist: importit, nimeämiset, tyypit, virheenkäsittely,
resurssivuodot, tietoturva, endpointit, Pydantic v2, täydellisyys.

Aiemmin 7 kohtaa, nyt 9 — lisätty: type hints, tietoturva (raw SQL,
hardcoded secrets), Pydantic v2 (model_dump, from_attributes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:36:57 +03:00
Jaakko Vanhala
529a30a6e1 Korjattu harhaanjohtava GPU-viesti: Ollama käyttää GPU:ta automaattisesti
Kun --no-default-features (ei wgpu/nvml), viesti on nyt:
"GPU-tunnistus ei käytössä. Ollama käyttää GPU:ta automaattisesti."
eikä "GPU:ta ei havaittu — CPU-moodissa" (joka oli väärä M2:lla).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:35:00 +03:00
Jaakko Vanhala
7d842529b1 Tarkkailijan raportti: klikkaa avataria → modal + kehäväri arvosanalla
Tarkkailijan vastaus alkaa VERDICT-rivillä:
- GREEN → vihreä kehä → "OK"
- ORANGE → oranssi kehä → "HUOMIOITA"
- RED → punainen kehä → "KRIITTISTÄ"

Kehäväri ja glow jäävät näkyviin pipelinen jälkeen.
Klikkaamalla Tarkkailija-avataria avautuu raportti-modal jossa
README.md renderöidään markdown-muotoiltuna (taulukot, koodi, listat).
Modal sulkeutuu ✕-napista tai klikkaamalla taustaa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:33:11 +03:00
Jaakko Vanhala
c731c18360 DevOps generoi Dockerfilen + Tarkkailija kirjoittaa README.md-raportin
Uudet pipeline-vaiheet:

DevOps — Dockerfile:
- python:3.12-slim + uv (astral-sh)
- Oikea COPY-järjestys (pyproject.toml → sync → source)
- Expose 8000, CMD uvicorn

Tarkkailija — README.md:
- Markdown-raportti joka sisältää:
  - Tiedostolista ja kuvaukset
  - Käyttöohjeet (uv + Docker)
  - API Endpoints -taulukko
  - Arkkitehtuurihuomiot
  - Riskiarviointi
- README.md lisätään projektikorttin tiedostoihin
  → avattavissa editorissa

Pipeline nyt: Data → Koodari → DevOps review → korjaukset →
QA testit → DevOps Dockerfile → Tarkkailija README → Valmis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:29:45 +03:00
Jaakko Vanhala
5498eb6cbb Kaikki 6 agenttia osallistuvat pipeline-projektiin
Pipeline-vaiheet:
1. Data-agentti: models.py (tietokanta-asiantuntija)
2. Koodari: schemas.py, main.py (ohjelmistokehittäjä)
3. Koodari: pyproject.toml
4. DevOps: koodikatselmointi (importit, nimeämiset, virheet)
5. Koodari: korjaukset (jos DevOps löysi ongelmia)
6. QA: pytest-testit (test_main.py lisätään projektiin)
7. Tarkkailija: riskianalyysi (arkkitehtuuri, tietoturva)

Data-agentti valitaan automaattisesti models.py/database.py -tiedostoille.
Jokainen vaihe highlightaa oikean avatarin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:26:57 +03:00
Jaakko Vanhala
43f0aebf54 Poistettu pystylayout — highlight toimii vaakarivillä pipelinen aikana
Avatarit pysyvät aina vaakarivillä. Aktiivinen agentti saa
glow-highlightin kun pipeline etenee (koodari → testaaja → koodari).
Highlight poistuu kun pipeline valmistuu.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:23:06 +03:00
Jaakko Vanhala
6413f0238f Avatarit vaihtuvat pystyyn pipelinen aikana, aktiivinen agentti highlightataan
Pipeline käynnistyessä:
- Avataririvi vaihtuu pystyyn (column) → näkee kuka tekee mitä
- Aktiivinen agentti saa glow-highlightin (coder sininen, tester sininen)
- Korjausvaiheessa koodari highlightataan uudelleen

Pipeline valmistuttua:
- Palautetaan vaakarivi (row)
- Poistetaan highlight

Poistettu manuaalinen layout-toggle-nappi — layout on automaattinen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:17:30 +03:00
Jaakko Vanhala
6b0394586e Mermaid-kaaviot oppaaseen + avatareiden vaaka/pysty-toggle
Mermaid ladataan CDN:stä (esm module). Opas-sivun renderMd tunnistaa
\`\`\`mermaid -koodiblokit ja renderöi ne SVG-kaavioiksi dark-teemalla.

toggleAgentLayout vaihtaa avatareiden suunnan (row/column),
tallennetaan localStorageen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:11:01 +03:00
Jaakko Vanhala
108094b06a Opas-sivun markdown-renderöijä laajennettu: taulukot, inline-muotoilu, listat
Lisätty tuki:
- Taulukot (| header | ... | -parsinta, thead/tbody, border-collapse)
- Inline: **bold**, *italic*, \`code\` (kaikissa elementeissä)
- Numeroidut listat (1. 2. 3.)
- Parempi tyhjien rivien käsittely (8px spacer, ei <br>)
- Otsikkojen border-bottom h1/h2:lle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:01:54 +03:00
Jaakko Vanhala
d7c974792d Agenttien päivitys kysyy käyttäjältä: OK = oletukset, Peruuta = säilytä omat
Kun AGENTS_VERSION kasvaa eikä localStorage ole tyhjä, näytetään confirm-dialogi:
"Agenttien oletuspromptit on päivitetty. Haluatko ottaa uudet käyttöön?"
- OK: ylikirjoitetaan oletuksilla
- Peruuta: käyttäjän muokkaukset säilyvät

Ensimmäisellä käyttökerralla (tyhjä localStorage) ladataan oletukset ilman kysymystä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:00:08 +03:00
Jaakko Vanhala
1987eb57a0 Agenttien oletuspromptit päivittyvät automaattisesti (AGENTS_VERSION)
Kun AGENTS_VERSION kasvaa, localStorage ylikirjoitetaan uusilla oletuksilla.
Ei tarvitse enää manuaalisesti tyhjentää localStorage.removeItem('kpn-agents').
Kasvata AGENTS_VERSION aina kun oletusprompteja muutetaan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 06:58:30 +03:00
Jaakko Vanhala
12ca87415c Poistettu native-noden kovakoodattu system prompt — agentin prompti toimii nyt
Ollaman system-kenttä yliajoi agentin konfiguroiman promptin.
Nyt system-kenttää ei lähetetä ollenkaan — agentin prompti tulee
osana prompt-kenttää (kpnRun koostaa sen frontendissä).

Tämä mahdollistaa per-agentti promptien toimimisen myös natiivilaskennalla.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 06:54:18 +03:00
Jaakko Vanhala
a0e52faa44 Localhost vapautettu IP-yhteysrajasta, tuotannon raja nostettu 20:een
Kehitysympäristössä (127.0.0.1) ei enää yhteysrajaa — useita
selainikkunoita ja native-nodeja voi yhdistää vapaasti.
Tuotannossa raja 10→20 per ulkoinen IP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:14:06 +03:00
Jaakko Vanhala
f910cd8c61 Kattavat promptit + opettavat tooltipit jokaiselle parametrille
Promptit laajennettu moninkertaisiksi jokaiselle agentille:
- Manageri: RULES + EXAMPLE OUTPUT -formaatti
- Koodari: 8 CRITICAL RULES + NEVER-lista (importit, nimeäminen, Pydantic v2)
- Data: SQLAlchemy-spesifit ohjeet (String(length), connect_args, sessionmaker)
- QA: pytest-testirakenne (5 testitapausta enumeroituna)
- DevOps: 7-kohdan CHECKLIST + LGTM/ISSUE-vastausformaatti
- Tarkkailija: 4-alueen arviointi + RISK-formaatti + SHIP IT/NEEDS WORK

Tooltipit (hover 💡):
- Temperature: milloin matala/korkea, suositukset per rooli
- Max tokens: milloin nostaa/laskea, ~1 token ≈ 4 merkkiä
- Top-K: milloin muuttaa, harvoin tarpeen
- Repetition penalty: miksi liian korkea rikkoo koodin
- System prompt: hyvän promptin rakenne (rooli → säännöt → esimerkit → kiellot)

Prompt-tekstikenttä kasvatettu 4 → 8 riviä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:11:11 +03:00
Jaakko Vanhala
91dc7579bc Per-agentti sampling-parametrit: temperature, top-k, max tokens, repetition penalty
Jokainen agentti saa omat parametrit jotka näkyvät avatarin konfigurointipaneelissa:
- Temperature: Manageri 0.5 (tarkka), Koodari 0.7, Testaaja 0.3 (deterministinen)
- Max tokens: Manageri 512, Koodari 1024, Testaaja 512
- Top-K ja Repetition penalty per agentti
- Sliderit reaaliaikaisilla arvoilla

Parametrit tallentuvat localStorageen agentin mukana.
Perustelut: manageri ja testaaja hyötyvät matalasta temperaturesta
(determinismi tärkeää), koodari tarvitsee enemmän tokeneita ja
hieman korkeamman temperaturen luovempiin ratkaisuihin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:03:11 +03:00
Jaakko Vanhala
90c9a7e4fa Asetukset-välilehti: kaikki LLM-parametrit muokattavissa UI:sta
Uusi "Asetukset"-tab jossa:
- System Prompt (tekstikenttä, Courier-fontti)
- Temperature (slider 0-1.5, reaaliaikainen arvo)
- Top-K (slider 1-100)
- Repetition Penalty (slider 1.0-2.0)
- Max Tokens (slider 64-4096)
- Stop-sekvenssit (yksi per rivi)
- Mallinvalinta (dropdown: 1.5B/3B/7B Q4/7B)
- "Palauta oletukset" -nappi

Kaikki tallentuvat localStorageen (kpn-settings).
Jokainen parametri selitetty hint-tekstillä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:44:03 +03:00
Jaakko Vanhala
1216e016c2 Template-pohjainen projektipipeline + opettavat selitykset
Uusi lähestymistapa: sen sijaan, että malli keksii rakenteen tyhjästä,
sille annetaan mallipohja (template) joka sisältää:
- Tiedostojärjestys (models → schemas → main → pyproject.toml)
- Esimerkkikoodi jokaiselle tiedostolle (few-shot)
- Yksityiskohtaiset ohjeet (importit, nimeämiskäytännöt, patternit)

Jokainen vaihe selitetään terminaalissa:
💡 models.py — "Define the SQLAlchemy model. Always include engine
   with check_same_thread=False for SQLite..."
💡 Koodikatselmointi — "Testaaja tarkistaa importit, nimeämiset..."
💡 Tulos — "Aja: uv run uvicorn main:app --reload"

templates/fastapi-crud.json sisältää täydellisen esimerkkiprojektin
jota malli adaptoi käyttäjän kuvaukseen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:32:03 +03:00
Jaakko Vanhala
d85cab4bc0 Native-noden vastausten siivous: stop-sekvenssit + selitystekstien poisto
Stop-sekvenssit laajennettu: Please note, This is, Example, ```
strip_code_fences laajennettu poistamaan:
- Selitystekstit lopusta (Please note, This is a basic, Note that, ...)
- Johdantolauseet alusta (Sure!, Here is, Certainly!)
System prompt vahvistettu: "No 'Please note' or 'Here is' text"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:04:43 +03:00
Jaakko Vanhala
4fef8824e1 Kattavat oletuspromptit kaikille agenteille
Manageri: arkkitehtuuri + tiedostojako + pyproject.toml
Koodari: importit + Pydantic/SQLAlchemy-erottelu + moderni Python
Data: normalisointi + SQLAlchemy + migration-ystävälliset patternit
QA: pytest + TestClient + edge caset + mock
DevOps: koodireview + importtien tarkistus + LGTM-protokolla
Tarkkailija: riskianalyysi + tietoturva + arkkitehtuurihuomiot

Promptit näkyvät klikkaamalla avataria → konfigurointipaneeli.
Tyhjennä localStorage (kpn-agents) jotta uudet promptit tulevat voimaan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:59:19 +03:00
Jaakko Vanhala
009bf492c8 Parannetut koodarin promptit + token-raja 512→1024
Koodarin prompti sisältää nyt:
- Import-vihjeen: "from models import ..." aiemmista tiedostoista
- Nimeämisvihjeen: Pydantic-schemat (UserCreate) vs SQLAlchemy (User)
- "Include all necessary imports. Write complete, working code."

Native-noden max_tokens nostettu 512→1024 jotta CRUD-endpointit
mahtuvat yhteen vastaukseen.

Testattu API:n kautta: 3B-malli tuottaa nyt oikeat importit,
erilliset Pydantic-schemat ja kaikki 5 CRUD-endpointtia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:56:24 +03:00
Jaakko Vanhala
f7e0e8dff8 Status-palkki tunnistaa natiivisolmun: ei enää turhaa Wasm-latausta
Sivulatauksessa tarkistetaan onko hubissa jo laskentasolmu (natiivi/Wasm).
Jos on → "Valmis (natiivi)", llmReady=true, ei Wasm-latausta.
Jos 503 → Wasm-fallback "Alusta"-napista.

Poistettu automaattinen Wasm-käynnistys (ensureNode) sivulatauksessa
koska natiivisolmu hoitaa laskennan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:48:23 +03:00
Jaakko Vanhala
eb57ee7b92 Avatarit ja tyylit palautettu main-haarasta: glassmorphism-kortit, oikeat hahmot
Avatarit: karhunpentu (Manageri), kipina (Koodari), pesukarhu (Data),
susi (QA), laiskiainen (DevOps), aikuinen_susi (Tarkkailija).

CSS: glassmorphism-taustat, hover-animaatio, active-glow,
pyöristetyt reunat, varjostukset. Sama tyyli kuin main-haarassa.

Huom: tyhjennä localStorage (kpn-agents) selainkonsolin kautta
jotta uudet oletusavatarit tulevat voimaan:
  localStorage.removeItem('kpn-agents')

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:46:21 +03:00
Jaakko Vanhala
84d13153ed Korjattu: agents ja defaultAgents siirretty scriptin alkuun (let hoisting)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:40:35 +03:00
Jaakko Vanhala
8beac57b50 Agenttien konfigurointi: promptit, mallit, järjestys, omat agentit
Klikatessa avataria avautuu konfigurointipaneeli:
- Nimen muokkaus
- Mallinvalinta (0.5B/1.5B/3B/7B)
- System promptin muokkaus
- Pipeline-järjestys (drag & drop avatarit ja pipeline-tagit)
- Agentin poisto

"+" -nappi luo uuden agentin satunnaisella avatarilla.
Konfiguraatio tallennetaan localStorageen (kpn-agents).
Pipeline (kpn project) käyttää agenttien prompteja ja malleja.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:38:30 +03:00
Jaakko Vanhala
44067efdb6 Monaco-lataus refaktoroitu: singleton Promise, parempi virheenkäsittely
initMonaco palauttaa nyt aina saman Promisen (ei moninkertaista latausta).
AMD-loader virheet ja CDN-virheet napataan ja logitetaan.
monacoLoaded-lippu korvattu Promise-pohjaisella tilanhallinnalla.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:24:08 +03:00
Jaakko Vanhala
5528be1812 Korjattu monacoLoaded: siirretty scriptin alkuun ennen switchTab-kutsua
let ei hoistu — monacoLoaded pitää olla määritelty ennen initMonaco-kutsua.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:21:22 +03:00
Jaakko Vanhala
f4cf4c73b9 Korjattu syntaksivirhe: ylimääräinen }); poistettu openInEditorista
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:18:58 +03:00
Jaakko Vanhala
e19852a509 Korjattu Monaco: window.monaco asetetaan eksplisiittisesti AMD-loaderin jälkeen
openInEditor käyttää nyt async/await initMonacoa ja window.monaco-globaalia
eikä oleta AMD-moduulin scopea. Korjaa "monaco is not defined" -virheen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:16:20 +03:00
Jaakko Vanhala
6de0df365e Korjattu projektikortin JSON-parsintavirhe: tiedostot globaaliin muuttujaan
Koodin sisältämät lainausmerkit ja erikoismerkit rikkoivat data-files
HTML-attribuutin. Nyt tiedostot tallennetaan window._projectFiles[id]:hen
ja onclick-handlerit viittaavat siihen suoraan. Ei JSON DOM:ssa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:11:11 +03:00
Jaakko Vanhala
28f620f901 kpnRun lähettää suoraan hubille, Wasm-fallback vain 503:lla
Ei enää odoteta Wasm-latausta ennen API-kutsua. Jos natiivisolmu
on hubissa, vastaus tulee suoraan. Jos hub palauttaa 503 (ei solmua),
käynnistetään Wasm-fallback ja yritetään uudelleen.

Tämä korjaa tilanteen jossa native-node on jo käynnissä mutta
selain yritti silti ladata Wasmia ensin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:09:26 +03:00
Jaakko Vanhala
3497f66db7 Agenttiavatarit palautettu: AgentBar-komponentti viidellä roolilla
Manageri (pöllö), Koodari (kameleontti), Testaaja (rukoilijasirkka),
QA (kilpikonna), Data (norsu). Klikattavat, highlight aktiiviselle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:53:35 +03:00
Jaakko Vanhala
1c7362c9b0 Native-node oletusmalli: qwen2.5-coder:3b
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:46:48 +03:00
Jaakko Vanhala
9983c80ef1 Native-noden oletusmalli vaihdettu kvantisoiduksi: qwen2.5-coder:7b-instruct-q4_K_M
Q4-kvantisointi: ~4GB (vs. 7GB), ~40 tok/s M2:lla (vs. ~25 tok/s).
Parempi nopeus/laatu-suhde.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:35:11 +03:00
Jaakko Vanhala
fc1fb33d5e local.sh: käynnistää native-noden automaattisesti jos Ollama on käynnissä
Käynnistysjärjestys: frontend build → hub → native-node (jos Ollama löytyy).
Ilman Ollamaa käytetään selaimen Wasm-laskentaa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:28:56 +03:00
Jaakko Vanhala
3bee8e8020 Kaikki pipeline-vaiheet käyttävät qwen-coder -mallia (smollm-135m poistettu)
Testaaja ja QA saivat 503 koska smollm-solmua ei ole ladattu.
Kaikki agentit käyttävät nyt samaa qwen-coder -mallia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:25:04 +03:00
Jaakko Vanhala
f8ea5ed76e Yksinkertaistettu reititys: poistettu busy-tila ja jonotus
Pipelinen peräkkäiset kpnRun-kutsut saivat 503 koska hub merkitsi
solmun busyksi eikä vapauttanut sitä ajoissa. Reititetään aina
ensimmäiselle matchaavalle solmulle. LLM_BUSY suojaa Wasm-puolella.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:22:53 +03:00
Jaakko Vanhala
6c7c2d6dd3 local.sh: npm install automaattisesti jos node_modules puuttuu
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:18:47 +03:00
Jaakko Vanhala
c179b4ab7e .gitignore: node_modules, dist, .astro poistettu gitistä
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:18:16 +03:00
Jaakko Vanhala
a8c4af0975 Frontend uudelleenrakennettu: Astro-komponentit, Wasm pääsäikeessä, ei Workeria
Vanha frontend siirretty temp/. Uusi rakenne:
- StatusBar.astro, Terminal.astro, Editor.astro, Guide.astro
- global.css erillinen
- Wasm pääsäikeessä (ei Worker — yksinkertainen, debugattava)
- Tab-completion, dropdown, projektikortti, Monaco, GUIDE.md
- Ei tokenisointia eikä koodilaboratoriota

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:17:39 +03:00
Jaakko Vanhala
e3fdb91ac5 local.sh: buildaa frontendin ja käynnistää hubin yhdellä komennolla
Käyttö: ./network-poc/local.sh → http://localhost:3000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:42:24 +03:00
Jaakko Vanhala
9925079729 Korjattu laskenta: poistettu warmup (aiheutti busy-konfliktin) + parempi solmun odotus
Ongelma: warmup-prompt käynnisti inferenssin heti yhdistymisen jälkeen,
jolloin oikea kpnRun-prompt tuli solmulle kun se oli vielä busy.
Solmu lähetti llm_error mutta UI:ssa ei ollut käsittelijää → ikuinen odotus.

Korjaukset:
1. Poistettu warmup ensureCoderNode:sta — ei tarvita koska kpnRun
   käynnistää solmun automaattisesti
2. kpnRun odottaa coderWsReady-lippua max 15s (pollaus 500ms välein)
   kiinteän 1s viiveen sijaan
3. Selkeä virheilmoitus jos solmu ei käynnisty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:37:28 +03:00
Jaakko Vanhala
6031737f83 Laskentasolmu käynnistyy automaattisesti: kpnRun + refresh-autostart
Kaksi korjausta laskentaan:
1. kpnRun kutsuu ensureCoderNode() automaattisesti jos solmu ei ole
   vielä käynnissä — käyttäjän ei tarvitse muistaa kpn load
2. localStorage-autostart: jos malli oli ladattu ennen refreshiä,
   ensureCoderNode() ajetaan automaattisesti sivulatauksessa

Tämä korjaa "Ei vapaata solmua" -virheen kpn run coder -komennoissa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:06:51 +03:00
Jaakko Vanhala
b6a8fa2671 Monaco Editor -välilehti: selainpohjainen koodieditori agenttien tuottamalle koodille
Uusi "Editor"-välilehti jossa:
- Monaco Editor (VS Coden ydin) CDN:stä, dark-teema
- Tiedostopuu vasemmalla (klikataan tiedostoa)
- Välilehdet ylhäällä (useita tiedostoja auki)
- Kielitunnistus tiedostopäätteestä (Python, Rust, JS, ...)
- "Avaa editorissa" -nappi projektikorteissa

Monaco ladataan taustalla requestIdleCallback:llä — ei hidasta
sivun käynnistymistä. Editor alustetaan vasta kun sitä tarvitaan.

Projektikortin "Avaa editorissa" -nappi:
1. Avaa Editor-välilehden
2. Luo Monaco-mallit jokaiselle tiedostolle
3. Renderöi tiedostopuun ja välilehdet
4. Avaa ensimmäisen tiedoston editoriin

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:04:59 +03:00
Jaakko Vanhala
0dc53dba1c Korjattu Worker-laskentasolmun timing: odotetaan WS-yhteyden avautumista
Ongelma: start_agent_node palautui heti ennen kuin WebSocket ehti avautua.
Worker lähetti 'started' ja warmup lähti liian aikaisin → hub ei löytänyt
solmua → "Ei vapaata solmua" -virhe.

Korjaukset:
1. worker.js: kuuntelee "Yhteys Hubiin avattu" -logia Wasmista ja
   resolveaa started-Promisen vasta sen jälkeen (15s timeout)
2. worker.js: onerror + onunhandledrejection käsittelijät
3. worker.js: console.error välitetään pääsäikeelle
4. index.astro: ensureCoderNode odottaa (await) workerStarted-Promisea
   ennen warmupia ja pending-promptin lähetystä

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:55:22 +03:00
Jaakko Vanhala
857afbe111 feat: complete revolution architecture modernization
- Decoupled robust frontend into an Astro framework in `frontend/`.
- Replaced direct WebSocket broadcast with Smart Routing to distribute workload only to idle capable nodes, preventing 503 errors and duplicate responses.
- Rewrote WASM panic points (`unwrap()` handling) into panic-safe match blocks in qwen_coder.rs preventing Node Web Workers from crashing.
- Integrated robust dynamic Three.js 3D visualization.
- Resolved mermaid and THREE.js frontend hydration issues.
2026-04-09 16:38:24 +03:00
Jaakko Vanhala
84b78eb9c6 GPU-tunnistus valinnainen: cargo run --no-default-features toimii ilman nvml/wgpu
Native-node kääntyy nyt macOS:llä ja muilla koneilla ilman NVIDIA-ajureita:
  cargo run --no-default-features  ← vain Ollama, ei GPU-tunnistusta
  cargo run                        ← oletus: GPU-tunnistus mukana (nvml + wgpu)

Feature flag "gpu-detect" kontrolloi nvml-wrapper ja wgpu -riippuvuuksia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:42:35 +03:00
Jaakko Vanhala
4f18377a3b Native-node lähettää NODE_API_KEY auth-viestissä hubille
Luetaan NODE_API_KEY-ympäristömuuttuja ja lisätään api_key-kenttä
auth-viestiin. Hub tarkistaa avaimen ja hylkää solmun jos se ei täsmää.

Käyttö:
  NODE_API_KEY=kpn_sk_abc123 HUB_URL=ws://hub:3000/ws cargo run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:39:48 +03:00
Jaakko Vanhala
7f5bb45138 API-avain -autentikaatio natiivisolmuille
Natiivisolmujen (node_type: native) auth-viesti vaatii api_key-kentän
joka vastaa hubin NODE_API_KEY-ympäristömuuttujaa. Virheellinen avain
sulkee WebSocket-yhteyden.

Selainsolmut eivät vaadi avainta (Origin-validointi suojaa niitä).
Jos NODE_API_KEY ei ole asetettu, kaikki natiivisolmut hyväksytään
(kehitysympäristö).

Käyttö:
  Hub:  NODE_API_KEY=kpn_sk_abc123 cargo run
  Node: NODE_API_KEY=kpn_sk_abc123 cargo run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:39:05 +03:00
973d7a69c7 Galleria: hahmokuvaukset ja ehdotetut roolit kaikille 21 avatarille
Jokaiselle avatarille lisätty:
- Ehdotettu rooli (Arkkitehti, CI/CD, UI/UX, Tech Lead, SRE jne.)
- Persoonakuvaus joka kuvaa hahmon luonnetta ja erikoisosaamista

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:57:21 +03:00
aebc64e76e Tofuist poistettu Playgroundilta (jää Agent Builder -esimerkiksi)
Poistettu: avatar-kortti, gallery-head, agentPrompts-entry,
avatarMap, värimapit. Tofuist on edelleen Agent Builderin
esimerkkiagentti ja docs/tofu-cheatsheet.md säilytetään.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:43:15 +03:00
48c832c61b Module import absoluuttiseksi: ./pkg/node.js → /pkg/node.js
Suhteellinen polku rikkoi sivun kun navigoitiin suoraan
/avatars/ tai muuhun alihakemistoon (selain yritti ladata
/avatars/pkg/node.js jota ei ollut).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:37:47 +03:00
8435bd32a9 Hahmogalleria: #gallery -tabi kaikilla 21 avatarilla
- Uusi välilehti: Galleria (navigointi #gallery)
- Grid-näkymä: avatar, nimi, rooli, tiedostonimi
- Klikkaa korttia → kopioi polku leikepöydälle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:33:25 +03:00
ece41dd622 Export CrewAI: generoi ZIP-projektin agenttitiimistä
Export-nappi Agent Builderissa generoi kipina-crew.zip sisältäen:
- crew.py: CrewAI agentit, tehtävät ja sequential pipeline
- Dockerfile: Python 3.12 + uv
- docker-compose.yml: Ollama + healthcheck + mallilataus + crew
- pyproject.toml: crewai[tools] riippuvuus
- .env: Ollama-asetukset
- README.md: käynnistysohjeet

Käynnistys: docker compose run crew uv run python crew.py "projekti"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:26:27 +03:00
c7f3b0d79f Agent Builder: tooltip-ohjeistukset kaikille kentille
- System Prompt: miten kirjoittaa hyvä prompti, esimerkit hyvä/huono
- Temperature: mitä arvot tarkoittavat (0.0–1.0+)
- Top-k: tokenivalinnan laajuus
- Max tokens: vastauksen pituus
- Malli: Ollama-malliesimerkit
- Docs: referenssidokumentin käyttö
- CSS .builder-tip ::after tooltip (white-space: pre-wrap)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:20:28 +03:00
8905b50f41 Builder-funktiot window-scopeen (onclick vaatii globaalin)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:54:01 +03:00
43b0612004 Agent Builder 2026-04-08 10:51:35 +03:00
599ac2d2d9 Agent Builder: kaikki 21 avataria valittavissa
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:50:49 +03:00
d1975bd55c Uudet Gemini-generoidut avatar-kuvat (13 kpl)
bear, beaver, chameleon, elephant, gecko, lion, mantis, owl,
penguin, serpent, spider, tortoise, walrus

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:50:21 +03:00
24a8139d3e Agent Builder UI: #builder -tabi, lomake, CRUD-integraatio
- Uusi välilehti: Agent Builder (navigointi #builder)
- Agenttilista: ladataan /api/v1/agents, näytetään kortteina
- Lomake: avatar-valitsin, rooli-template, malli, väri, docs, prompt, parametrit
- Tallenna → POST /api/v1/agents, Poista → DELETE /api/v1/agents/:id
- Avatar-grid: 8 valmista hahmoa valittavissa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:45:31 +03:00
21aac49a52 Agent Builder: SQLite-taulu + REST API (GET/POST/DELETE)
- DB skeemaversio 3: agents-taulu (id, name, avatar, role, model, color,
  docs, prompt, temperature, top_k, max_tokens, repetition_penalty)
- CRUD: upsert_agent, get_agents, delete_agent
- API: GET/POST /api/v1/agents, DELETE /api/v1/agents/:id
- Oletusagentteja (is_default=1) ei voi poistaa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:43:22 +03:00
8a5f1b753c AGENTBUILDER.md: Agent Builder -suunnitelma ja building blockit
Kuvaa hahmolomakkeen arkkitehtuurin: agenttiskeema, rooli-templatet,
malli-valitsin, avatar-grid, docs-kenttä, localStorage-tallennus,
export/import ja 4-vaiheinen toteutussuunnitelma.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:39:25 +03:00
1b0b5eb198 Eksaktit mallinimet agenteille: qwen-coder → qwen2.5-coder:7b
- Kaikki agentPrompts.model vaihdettu 'qwen-coder' → 'qwen2.5-coder:7b'
- Native-node selected_task: 'qwen2.5-coder:7b'
- Hub-reititys: qwen-perhe matchaa keskenään (selain qwen-coder-05b,
  natiivi qwen2.5-coder:7b) taaksepäin yhteensopivuuden vuoksi

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:33:43 +03:00
44c8a189b6 Tofuist malli qwen2.5-coder:7b, hub-reititys laajennettu
- Tofuist-agentin model vaihdettu qwen-coder → qwen2.5-coder:7b
- Hub: qwen2.5-coder:* matchaa nyt qwen-coder*-solmuille ja päinvastoin

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:31:53 +03:00
1a58324689 Tofuist-agentti: OpenTofu/IaC-asiantuntija gecko-avatarilla
- Uusi agentti: Tofuist (gecko-avatar, oranssinkulta #e3a336)
- System prompt: HCL-koodi, moduulit, lifecycle, state encryption
- docs-kenttä: lataa automaattisesti /docs/tofu-cheatsheet.md referenssiksi
- kpnRun: tukee nyt agentin docs-kenttää (haetaan kerran, cachetetaan)
- OpenTofu-dokumentaatio haettu GitHubista + tiivistetty cheatsheet
- Avatar, gallery-head, värimapit ja pipeline-tuet lisätty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:29:42 +03:00
afc7f9bcee Versio 0.2.4
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:52:21 +03:00
5d2027b2ca Native-node: automaattinen Ollama-haistelu käynnistyksessä
Jos OLLAMA_URL ei ole asetettu, kokeillaan järjestyksessä:
1. localhost:11434 (paikallinen Ollama)
2. 127.0.0.1:11434
3. ollama:11434 (Docker-verkko)
4. host.docker.internal:11434 (Docker-kontti → isäntä)

Ensimmäinen joka vastaa /api/version-kutsuun valitaan.
Timeout 2s per kokeilu. Jos OLLAMA_URL on asetettu, sitä käytetään suoraan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:41:44 +03:00
8a4d515eed Client-compose: Ollama lisätty jokaiseen profiiliin (nvidia/amd/cpu)
Ollama-palvelu puuttui client-composesta — native-node yritti yhdistää
ollamaan jota ei ollut. Nyt jokaisessa profiilissa on oma Ollama
(nvidia: latest+GPU, amd: rocm+/dev/kfd, cpu: latest) network alias
'ollama' jotta native-node löytää sen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:40:33 +03:00
987a370a05 Docker: Cargo.lock valinnainen, Ollama AMD ROCm -tuella
- Dockerfile.native-node: Cargo.lock kopioidaan glob-patternilla (ei kaadu
  jos puuttuu, esim. .gitignore poistaa sen)
- docker-compose: Ollama vaihdettu ollama/ollama:rocm -imageen AMD GPU:lle,
  /dev/kfd + /dev/dri laitemappaukset, poistettu nvidia deploy.resources

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:33:27 +03:00
f75e7f07e9 Opas: tokenisointiesimerkki korvattu oikealla kuvakaappauksella
- Staattinen tekstitokenisointiesimerkki korvattu kuvalla joka
  näyttää värikoodatut tokenit EN/FI-vertailussa
- Markdown-renderöijään lisätty ![alt](src) kuvatuki
- Kuva: static/images/tokenization-example.png

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:20:50 +03:00
eb6f720fcc Wasm tokenize_js() exportti oppaan live-tokenizeria varten
Lisätty #[wasm_bindgen] tokenize_js(text) → JSON-funktio joka lataa
tokenizerin IndexedDB:stä tai HuggingFacesta tarvittaessa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:18:07 +03:00
e25d0ea8f2 Versio 0.2.3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:15:05 +03:00
4d1e89da34 Avatar-kuvakkeet 48px → 50px (+5%)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:14:14 +03:00
bf535b6256 Raportti: syntaksikorostus, CSS-tooltipsit rivityksellä, parannettu layout
- Lisätty highlightCode() — regex-pohjainen korostus .py, .html, Dockerfile
- Swimlane-badget: title-attr → CSS ::after tooltip (white-space:pre-wrap)
- Tiedostokortit: kieli + rivimäärä headeriin
- Yleinen ilme: pyöristetyt kulmat, monospace-fontti, parempi line-height
- Pipeline-vaiheet: selkeämmät erottimet ja väripaletti

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:10:37 +03:00
29e1c440c6 Native-node osoittaa nyt tuotantohubiin (kipina.studio)
HUB_URL vaihdettu ws://agentic-poc:3000/ws → wss://kipina.studio/ws
jotta lokaali GPU-solmu palvelee tuotantoympäristöä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:05:19 +03:00
560bee1369 Laskentaverkko ja Koodilaboratorio -tabit takaisin näkyviin
Tabit olivat piilotettu display:none-tyylillä, mikä rikkoi myös
hash-navigoinnin (#codelab, #network).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:01:21 +03:00
b074e0cb49 Avatar-korttien opacity nostettu 0.5 → 0.8
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:55:46 +03:00
9307c75516 Pysty/vaaka-toggle avatareille
Lisätty nappi jolla org-chartin voi vaihtaa vaaka- ja pystyasennon välillä.
Vaaka on oletuksena (kompakti), pysty lisää vertical-CSS-luokan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:55:07 +03:00
86191fbb6c Avatarit vaakariviin: kompakti layout, säästää pystytilaa
Org-chart muutettu vertikaalisesta hierarkiasta horisontaaliseksi riviksi:
Asiakas → Manageri → [Koodari, Data, QA, DevOps, Tarkkailija]
- Connector-viivat muutettu pystysuuntaisista vaakanuoliksi
- Avatar-kortit 72px leveitä (oli 130px), kuvat 48px (oli 80px)
- Roolikuvaus poistettu korteista — pelkkä nimi riittää
- Tarkkailija siirretty absolute-asemasta rivin loppuun
- Responsive-tyylit päivitetty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:53:25 +03:00
a6a94f7688 Avatar-logiikka: poistettu kilpaileva llm_prompt-handler, korjattu vilkkumisjärjestys
Ongelma: kaksi erillistä avatar-aktivointilogiikkaa kilpaili:
1) pipelineStep() — oikea agentti statuksen perusteella
2) llm_prompt-handler — arvasi agentin prompt-tekstistä (usein väärin → DevOps vilkkui QA:n/Datan sijaan)

Korjaus:
- Poistettu llm_prompt- ja llm_done-handlerien avatar-heuristiikat
- pipelineStep() hoitaa kaiken: active → syttyy, done → sammuu heti
- Pipeline-lopussa kaikki avataret sammutetaan eksplisiittisesti

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:42:02 +03:00
8d5c5440d2 Avatar-vilahdus: oikea agentti aktivoituu vuorollaan, manageri-bugi korjattu
pipelineStep() aktivoi nyt oikean agentin avatarin (sekä card että gallery-head)
kun status on 'active', ja poistaa 'done'-statuksella sekunnin viiveellä.
Poistettu llm_done-handlerin turha manageri-aktivointi joka vilkutti aina manageria.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:38:08 +03:00
a12bd7ce7f One-liner koodi: system prompt vaatii rivinvaihdot + staattinen tarkistus
Ollaman system prompt: 'Use proper newlines and indentation'.
Staattinen analyysi: havaitsee jos .py-tiedosto on yhdellä rivillä.
Native node vaatii rebuildin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:07:51 +03:00
9ac90aa540 pyproject.toml esimerkki: lisätty httpx ja pytest riippuvuuksiin
TestClient vaatii httpx:n, testit vaativat pyTestin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:01:35 +03:00
32065d5818 Korjattu </script> index.html-esimerkissä joka katkaisi pääsivun JS:n
Sama bugi kuin aiemmin: template-literalin </script> sulkee
ulomman script-tagin. Pilkottu: '<'+'/script>'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:00:53 +03:00
321943ee3c Esimerkkiprojektit: täysi CRUD + HTML UI root-osoitteessa
Few-shot esimerkit päivitetty:
- main.py: GET/POST/PUT/DELETE + FileResponse("/") index.html:lle
- index.html: yksinkertainen UI fetch()-kutsuilla API:in
- /api/ -prefiksi JSON-endpointeille
- Esimerkkipromptit kuvaavat CRUD-operaatiot eksplisiittisesti

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:49:38 +03:00
1b75c89320 Raportti: Mermaid → custom swimlane (kompakti, tooltipsit)
Agenteittain ryhmitelty visualisointi:
  Manageri  ✓ Suunnittelu
  Koodari   ✓ models.py → ✓ main.py → ✓ index.html
  DevOps    ✓ Review → ✓ Dockerfile → ✓ Compose → ✓ README
  QA        ✓ Testit → ✓ Validointi
Hover näyttää selityksen + output-esikatselun.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:39:23 +03:00
01622a960f Mermaid-kaavio LR (vaakasuunta) + tooltipsit joka vaiheelle
graph TD → graph LR: kaavio kiemurtelee vasemmalta oikealle.
Hover näyttää vaiheen selityksen ja output-esikatselun.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:33:23 +03:00
4e4efda67d Korjattu </script> template-literalissa joka katkaisi pääsivun JS:n
Template-stringin sisällä oleva </script> sulki ulomman script-tagin.
Pilkottu string-konkatenoimalla: '</'+' script>'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:25:36 +03:00
f5db2eb034 Esimerkkiprojektit HTML UI:lla + Mermaid-kaavio raporttiin + tooltips
- Esimerkkipromptit sisältävät nyt HTML-käyttöliittymän
- Manageri generoi index.html tiedoston, Dockerfile kopioi sen
- README: docker compose up → http://localhost:8000
- Raporttiin Mermaid-kaavio agenttien workflowsta (CDN)
- Pipeline-vaiheiden hover näyttää selityksen

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:06:15 +03:00
77c8d46e7b Staattinen analyysi: tiedostojen väliset importit tarkistetaan
from db import get_db → tarkistaa onko get_db määritelty db.py:ssä
(def, class tai muuttuja). Löytää puuttuvat exportit ennen Docker-buildia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:34:41 +03:00
f14eba1b49 Projektiraportti: HTML-dokumentaatio pipeline-vaiheista ja tiedostoista
Pipeline generoi HTML-raportin joka sisältää:
- Pipeline-vaiheet prompteineen ja tuloksineen (avattavat details)
- Staattisen analyysin tulokset
- Kaikki generoidut tiedostot korostettuina
- Raportti avautuu uuteen välilehteen linkistä terminaalissa
- Projektikorttiin lisätty 📄 Raportti -nappi

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:22:03 +03:00
6d15298418 Parannettu staattinen analyysi + docker-compose template
Staattinen analyysi: tarkistaa 15 yleistä symbolia (Boolean, Text,
Depends, HTTPException jne.) puuttuvista importeista.
Models-esimerkki: lisätty Boolean ja Text importteihin.
docker-compose.yml: template ilman LLM:ää — ei enää version tai
turhaa PostgreSQL-containeria SQLite-projektissa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:16:19 +03:00
cea1961183 Testaaja: staattinen analyysi + strukturoitu LLM-arviointi
Staattinen analyysi selaimessa (ennen LLM:ää):
- Käyttämättömät importit
- Puuttuvat importit (FastAPI, Session)
- Tyhjät funktiot

LLM-arviointi 5 kohdalla (esimerkkivastauksineen):
1. Imports, 2. Database, 3. Endpoints, 4. Error handling, 5. Security

Korjausluuppi käynnistyy jos ✗ löytyy tai staattinen analyysi huomauttaa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:13:15 +03:00
21a8015ea3 SQLAlchemy esimerkki: declarative_base() → DeclarativeBase (v2.0+)
declarative_base() on poistettu SQLAlchemy 2.0:sta.
Päivitetty molemmat esimerkit käyttämään class Base(DeclarativeBase).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:07:23 +03:00
c3991193d9 Pipeline-vaiheet rivittyvät + fixedDockerfile viittaus korjattu
Poistettu fixedDockerfile-viittaus joka kaatoi pipelinen ennen
renderProjectCard:ia → ZIP ei generoitunut.
Pipeline-vaiheet käyttävät nyt flex-wrap:ia eikä overflow-x:ää.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:03:14 +03:00
02c6d67218 Korjausvaihe ei ylikirjoita Dockerfilea — template pysyy
QA-validoinnin korjausvaihe antoi LLM:n generoida uuden Dockerfilen
joka sekoitti pip:n ja uv:n. Nyt korjaus kohdistuu vain .py ja
pyproject.toml -tiedostoihin. Dockerfile pysyy templatena.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:58:49 +03:00
de1cf009fa pyproject.toml: stdlib-moduulit (sqlite3, os, sys) kielletty riippuvuuksista
Malli laittoi sqlite3:n dependencies-listaan → uv sync epäonnistui.
Koodarin, QA:n ja validoinnin prompteihin lisätty selkeä kielto.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:53:10 +03:00
060f36f479 ZIP-lataus: null-tarkistus tiedostoille + virheilmoitus
Lisätty guard puuttuvalle projectFiles-datalle ja null-safe content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:46:34 +03:00
e2ec0fa43d v0.2.2: responsiivinen UI, Ollama-proxy, mixed content korjaus
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:23:35 +03:00
8752c0f465 Responsiivinen korkeus: terminaali ja UI skaalautuvat viewport-korkeuteen
Terminaali: clamp(200px, 35vh, 500px) — skaalautuu ikkunan mukaan.
<900px korkeus: pienempi otsikko, tiiviimmät avataret, matalampi terminaali.
>1200px korkeus: isompi terminaali ja promptikenttä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:22:15 +03:00
8c95282654 Tiivistetty layout: terminaali 500→300px, pienemmät marginaalit
Mahtuu 1440px korkeuteen ilman vieritystä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:20:06 +03:00
a1bc1af646 Hardware API: Ollama-fallback kun wgpu ei tunnista GPU:ta Dockerissa
/api/v1/hardware tarkistaa nyt myös Ollaman tilan fallbackina.
kpn models näyttää ladattujen mallien määrän ja ✓ oikein.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:18:00 +03:00
6b27cbbade kpn load: rehellinen viesti — 'valittu' eikä 'ladattu ja aktiivinen'
Frontend ei tiedä onko malli oikeasti ladattu Ollamaan.
Nyt näytetään 'valittu — natiivisolmu lataa mallin' ja
varoitus ensimmäisen pyynnön hitaudesta.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:04:11 +03:00
4d9c51a86f .gitignore: *.db — ajonaikaiset tietokannat pois versionhallinnasta
nodes.db muuttui jatkuvasti ja esti deployn.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:59:35 +03:00
66d1e8c4b1 Ollama-kutsut hubin kautta: ei mixed content HTTPS-sivulla
Lisätty GET /api/v1/ollama/tags proxy-endpoint hubiin.
Poistettu suorat http://hostname:11434 -kutsut frontendistä.
Hub välittää Ollama-kutsut sisäisessä Docker-verkossa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:57:48 +03:00
2eeac255f6 Piilotetut paneelit tavoitettavissa hashilla: #network, #laskentaverkko, #codelab
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:52:57 +03:00
6097cfc263 Ylimääräiset rönsyt karsittu. Playground suht ok. 2026-04-07 08:51:47 +03:00
8aed9f97a2 Puhuvat päät ja simulaatio-nappi piilotettu (koodi säilytetty)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:45:39 +03:00
c0ccd76a4c v0.2.1: Ollama-integraatio, pipeline, prompt-editori
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:41:28 +03:00
d2edb38879 Alaotsikko: 'Hajautettu WebGPU Laskentaverkko' → 'AI-ohjelmistokehitystiimi'
Simulaatio-viittaukset poistettu näkyvistä. Käännökset päivitetty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:40:22 +03:00
2755794554 Agenttien valinta: klikkaus = yksi, Shift+klikkaus = multi-select
Normaali klikkaus valitsee yhden agentin (poistaa muut valinnat).
Shift+klikkaus lisää/poistaa agentin valinnasta yhteistä promptia varten.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:36:52 +03:00
dbb37b3c60 Laskentaverkko ja Koodilaboratorio piilotettu, Agents oletuksena
Simulaatio-välilehdet piilotettu display:none:lla (koodi säilytetty).
Agents & CLI on nyt oletusvälilehti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:34:57 +03:00
0e7497b627 Oletuspromptit agentille + klikattavat pipeline-vaiheet
- Jokaisella agentilla on nyt oletusprompt joka näkyy heti modalissa
- Muokatut promptit tallentuvat localStorageen
- Pipeline-vaiherivin (✓ Suunnittelu → ✓ models.py → ...) klikkaus
  avaa modalin jossa näkyy kyseisen vaiheen prompti + tulos
- 📋 Näytä prompti -nappi näkyy aina kun agentti on valittu

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:30:56 +03:00
6b756e2e83 Prompt-editori modal: avain-arvo-parit, editoitavat kentät
Klikkaa agenttia → 'Näytä viimeisin prompti' → modal-ikkuna jossa
prompti on pilkottu rakenteellisiin kenttiin (Project, CONSTRAINTS,
EXAMPLE jne.). Editoitavat kentät sinisellä ✏️, lukitut harmaalla 🔒.
'Aja uudelleen' kokoaa promptin kentistä ja ajaa sen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:24:29 +03:00
5a52f5113c QA validointi: listaa jokaisen tarkistuksen tuloksen ✓/✗
Aiemmin QA vastasi vain 'OK'. Nyt prompti vaatii raportin jokaisesta
6 tarkistuksesta (Dockerfile, deps, ports, README, testit, pyproject)
esimerkkivastauksen kanssa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:18:46 +03:00
7b0660e46e Korjattu illegal break: if(!task_id) break → if(task_id) { ... }
break ei ole sallittu if/else-lohkossa. Kääritty avatar-aktivointi
if(data.task_id) -ehtoon.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:16:02 +03:00
b35600b417 Few-shot esimerkit pipeline-prompteissa: manageri, koodari, QA
Pienet mallit tuottavat huomattavasti parempaa koodia kun promptissa
on konkreettinen esimerkki oikeasta vastauksesta. Lisätty:
- Manageri: esimerkki tiedostolistasta
- Koodari: esimerkki main.py ja models.py -tiedostoista
- QA: esimerkki pytest + TestClient -testeistä

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:12:25 +03:00
7693269e5d Dockerfile generoidaan templatesta, ei LLM:llä — ei enää pip/uv sekaannuksia
Malli sekoitti pip:n ja uv:n syntaksin (pip install --system ei toimi).
Nyt Dockerfile rakennetaan suoraan templatesta generoiduista tiedostoista:
pyproject.toml → uv sync, requirements.txt → uv pip install, tai fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:11:00 +03:00
702c9170ad Avatareiden aktivointi vain task_id:llisistä viesteistä
Hubin automaattiset 10s-broadcastit aktivoivat managerin avatarin.
Nyt tarkistetaan data.task_id ennen avatar-päivitystä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:10:00 +03:00
3feed22055 Agenttien promptit näkyvissä ja editoitavissa + Aja uudelleen -nappi
Klikkaa agenttia → näet viimeisimmän pipeline-promptin tekstikentässä.
Voit editoida promptia ja painaa 'Aja uudelleen' ajamaan sen samalla
mallilla. Pipeline tallentaa nyt koko promptin (ei vain kuvausta).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:03:48 +03:00
75310c989e QA validointivaihe: tarkistaa tiedostojen yhteensopivuuden
Uusi vaihe DevOps-vaiheiden jälkeen: QA tarkistaa että
Dockerfile, docker-compose, README ja testit viittaavat
oikeisiin tiedostoihin ja riippuvuuksiin. Jos ongelmia löytyy,
DevOps korjaa Dockerfilen automaattisesti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:43:13 +03:00
743946a391 Dockerfile-prompti dynaaminen: tarkistaa onko pyproject.toml generoitu
Jos pyproject.toml puuttuu, käytetään uv pip install suoraan.
COPY-rivi listaa vain oikeasti olemassa olevat .py-tiedostot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:41:54 +03:00
0bd5faa684 API rate limit 10→30 pyyntöä/min: pipeline tarvitsee ~12 vaihetta
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:35:26 +03:00
e0c8c3586b Mallin vaihto: spinner-indikaattori + pelkkä numero oikotienä
kpn load näyttää spinnerin kun Ollama lataa mallia.
Pelkkä numero (esim. '4') toimii oikotienä 'kpn load 4':lle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:33:14 +03:00
3a1c5c723c kpn models: numerot + ladattu-tila yhtenäisessä listassa
Sama lista kuin kpn load, mutta näyttää myös mitkä mallit
on ladattu Ollamaan (✓) ja WASM-tilan. Numerot toimivat
suoraan kpn load -komennolla.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:30:39 +03:00
3139d1ac65 kpn models: näyttää Ollamasta ladatut mallit + WASM-tilan
Hakee Ollaman /api/tags-endpointista ladatut mallit kokoneen,
parametreineen ja kvantisointitasoineen. WASM-tila näkyy myös.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:29:09 +03:00
49a1629646 TODO.md: turvallisuus, yksityisyys ja väärinkäytön esto
Hajautetun verkon riskit dokumentoitu: tulosten validointi,
promptien salaus, reputaatiojärjestelmä, rate limiting, token-talous.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:26:52 +03:00
13008ac693 ny mänöö hyvin 2026-04-07 07:20:57 +03:00
30e81875db Reconnect yhdellä rivillä: ei floodata terminaalia
Sama rivi päivittyy laskurilla: '↻ Yhdistetään uudelleen... (3)'
Rivi poistetaan kun yhteys palautuu.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:18:13 +03:00
73bcd3143a WebSocket auto-reconnect: yhteys palautuu 3s kuluttua katkoksesta
connectHub() luo uuden WebSocketin ja asettaa onopen/onclose/onmessage.
onclose käynnistää 3s timerin joka kutsuu connectHub() uudelleen.
Terminaaliin tulee '↻ Yhdistetään uudelleen...' -viesti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:14:52 +03:00
216b95d15c kpn load: laitteiston VRAM/RAM tarkistus, liian isot mallit merkitään
Hub: uusi GET /api/v1/hardware palauttaa natiivisolmun GPU/RAM-tiedot.
Frontend: kpn load hakee laitteistotiedon ja näyttää mallit joihin
laite riittää. Liian isot mallit näkyvät yliviivattuina + varoitus.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:08:34 +03:00
34ef19472a kpn load: Ollama-mallin vaihto lennossa (0.5b → 32b)
- Hub: uusi POST /api/v1/model endpoint, broadcastaa change_model
- Native node: kuuntelee change_model, kutsuu Ollaman pull + vaihtaa mallin
- Frontend: kpn load näyttää 5 mallia, numero vaihtaa Ollaman mallin
- Selain-WASM pysyy 0.5B:nä (kpn load 1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:05:57 +03:00
54a5af96c7 Tab-autokorjaus: korjattu ohitettu autocorrect Tab-handlerissa
Tab-painallus meni suoraan dropdown-getCandidatesiin eikä kutsunut
autocorrectiä. Nyt Tab yrittää ensin korjata typon, sitten täydentää.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:00:06 +03:00
842153a7ec uv-paketinhallinta: Dockerfile, README ja pyproject.toml käyttävät uv:tä
Dockerfile kopioi uv:n ghcr.io/astral-sh/uv:latest -imagesta.
README ohjeistaa uv sync + uv run. pyproject.toml pysyy ennallaan
(uv-yhteensopiva formaatti).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:45:08 +03:00
5c25c7f9c1 DevOps Dockerfile-prompti: pip-only, ei poetryä/condaa
Malli generoi poetry.lock-riippuvaisen Dockerfilen. Nyt prompti
kertoo tarkan riippuvuuksien asennustavan (pyproject.toml/requirements.txt/pip)
ja antaa valmiin CMD-rivin. Yksivaiheinen build riittää Pythonille.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:44:03 +03:00
ac698a766e DevOps-agentti: Dockerfile + docker-compose.yml + README pipeline-vaiheina
DevOps generoi nyt kolme tiedostoa:
- Dockerfile (multi-stage build, python:3.12-slim)
- docker-compose.yml (palvelut, volumet, portit)
- README.md (quick start docker compose up)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:41:34 +03:00
f1b57a6c53 Tab korjaa kirjoitusvirheet + fuzzy-match alikomennoille
Tab-painallus yrittää ensin autokorjausta (typo-taulukko + Levenshtein),
sitten normaalia tab-completionia. Myös alikomennot korjautuvat
fuzzy-matchilla (esim. "kpn rnu" → "kpn run").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:37:51 +03:00
b70cdbd24d Terminaalin autokorjaus: knp→kpn, kpn rnu→kpn run jne.
Typo-taulukko yleisimmille kirjoitusvirheille + Levenshtein-etäisyys
tuntemattomille ensimmäisille sanoille (max 2 merkin ero → kpn).
Korjaus näytetään terminaalissa keltaisella.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:35:30 +03:00
01d8b597e1 ZIP CRC-32 checksum lisätty: purkaminen ei enää epäonnistu
Local file header ja central directory entry -tietueista puuttui
CRC-32 kenttä. Lisätty crc32()-funktio ja kirjoitetaan checksum
molempiin ZIP-rakenteisiin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:31:42 +03:00
f2ca4890df Dockerfile: touch main.rs ennen buildia, estää stub-binaryn jäämisen
Cargo ei rekompiloi jos vanha binääri on olemassa.
touch pakottaa uudelleenkäännöksen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:26:33 +03:00
3eb0c4d939 Ollama-integraatio: GPU-inferenssi NVIDIA/AMD/Apple, ei Candle-rajoitteita
- docker-compose: Ollama-container GPU:lla + persistent volume malleille
- native-node: Candle poistettu, kutsuu Ollaman HTTP API:a (async)
- Dockerfile: yksinkertaistettu, ei CUDA SDK:ta (Ollama hoitaa GPU:n)
- Tukee kaikkia malleja: qwen2.5-coder:1.5b/3b/7b/14b/32b
- OLLAMA_MODEL ympäristömuuttujalla vaihdetaan malli
- kpn models näyttää Ollama-mallit nopeustiedoilla

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:22:11 +03:00
d8443792a3 kpn load ja kpn models selkeytetty: selain vs natiivi
kpn load lataa 0.5B selaimeen (ainoa joka toimii WASM:ssa).
kpn models näyttää molemmat vaihtoehdot nopeustiedoilla.
Ei enää harhaanjohtavia numerovalintoja.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:06:12 +03:00
ae379bdda4 Zippi korjattu 2026-04-07 06:00:49 +03:00
ed02e47158 ZIP-lataus korjattu: tiedostot globaaliin muuttujaan data-attribuutin sijaan
JSON data-attribuutissa heittomerkit katkaisivat HTML:n.
Nyt projectFiles[cardId] tallentaa tiedostot muistiin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:00:17 +03:00
959dc532bb native-laskentaan säätöä 2026-04-07 05:20:54 +03:00
1ef7f7c956 max_tokens per vaihe: manageri 200, koodari 512, testaaja 200, QA 512, DevOps 256
Hub ja natiivisolmu tukevat nyt max_tokens-kenttää API-pyynnöissä.
Pipeline-vaiheet käyttävät sopivan kokoisia token-rajoja.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:03:26 +03:00
e6e1f60935 Pipeline: QA kirjoittaa testit + DevOps tekee README:n
Uudet vaiheet koodiarvioinnin jälkeen:
- QA: kirjoittaa test_app.py (pytest, max 3 testiä)
- DevOps: kirjoittaa README.md (asennus, käynnistys, testaus)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:00:04 +03:00
322c98ff59 Pipeline-promptit: rajoitteet kerrottu managerille ja koodarille
Manageri tietää nyt 400 tokenin rajan per tiedosto ja pitää
tiedostomäärän max 3:ssa. Koodari kirjoittaa lyhyttä, fokusoidusti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:58:51 +03:00
406e2226f0 Native node max_tokens 64→512: koodi ei jää kesken
64 tokenia riitti vain funktion alkuun. 512 mahdollistaa
kokonaisten tiedostojen generoinnin pipeline-vaiheissa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:57:42 +03:00
9d7496157c Native node CPU-moodi: Candle 0.8 RMS-norm ei tue CUDA:a
candle-core 0.8 ei sisällä rms-norm CUDA-kerneliä → inferenssi epäonnistui.
Vaihdettu CPU:ksi joka on silti ~10-20× nopeampi kuin selaimen WASM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:52:50 +03:00
d332b7e910 Hub priorisoi natiivisolmut (GPU) selainsolmujen edelle
Lisätty node_types HashMap joka seuraa solmutyyppiä (native/browser).
API reitittää tehtävät ensin vapaalle natiivisolmulle, sitten selaimelle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:45:20 +03:00
8e55a15d66 bugifiksejä 2026-04-06 21:34:03 +03:00
4e3134d908 CUDA_COMPUTE_CAP=89: bindgen_cuda ei tarvitse nvidia-smi:tä buildissa
candle-kernels build vaatii GPU-arkkitehtuurin tunnistusta.
nvidia-smi ei ole saatavilla Docker build -vaiheessa, joten asetetaan
CUDA_COMPUTE_CAP manuaalisesti (RTX 4090 = sm_89).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:26:03 +03:00
cd45db001a Dockerfile.native-node: lisätty cli/ workspace-jäsen
Cargo workspace vaatii kaikkien jäsenten Cargo.toml:n kopioinnin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:23:48 +03:00
4ad8a8793e Native node CUDA Docker: nvidia/cuda base + GPU runtime
Dockerfile käyttää nvidia/cuda:12.6.3 -imagea jossa CUDA-kirjastot
ovat valmiina. docker-compose lisää runtime: nvidia + NVIDIA_VISIBLE_DEVICES.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:02:43 +03:00
b2694c232e Poistettu 1.5B Q4 -vaihtoehto: GGUF dequantisointi liian hidas WASM:ssa
1.5B Q4_K_M: ~33s/token (0.03 tok/s) — käyttökelvoton
0.5B F32:    ~2.5s/token (0.4 tok/s)  — käyttökelpoinen

kpn load lataa nyt suoraan 0.5B:n ilman valintalistaa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 20:19:34 +03:00
ba58236c52 Worker console.log välitetään pääsäikeelle → UI-kuuntelijat toimivat
Workerin WASM-logit (lataus, malli valmis, inferenssi) eivät näkyneet
pääsäikeessä. Nyt console.log on ylikirjoitettu Workerissa lähettämään
viestit postMessage:lla, ja pääsäie syöttää ne omaan console.log:iin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 20:12:42 +03:00
861f2a6902 Worker ES module: importScripts → import (wasm-pack --target web)
wasm-pack --target web generoi ES module -syntaksia (export).
Worker käyttää nyt type:'module' ja import-lauseita importScripts:n sijaan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 20:04:53 +03:00
11fd5b0c9e jotain tulee 2026-04-06 20:00:55 +03:00
b3646ae5d3 Web Worker: WASM-inferenssi erillisessä säikeessä, UI ei jäädy
- Poistettu kaikki web_sys::window() -kutsut Rust WASM:sta
- Uudet Worker-yhteensopivat apufunktiot: perf_now(), worker_fetch(), sleep_ms()
- worker.js lataa ja ajaa WASM-moduulin erillisessä säikeessä
- ensureCoderNode käynnistää Workerin pääsäikeen sijaan
- Selaimen UI pysyy responsiivisena inferenssin aikana

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:59:09 +03:00
fc95cf8c1b Terminaaliin varoitus inferenssin aikana + yield ennen blokkia
Käyttäjälle näytetään '(selain voi hidastua)' kun inferenssi alkaa.
setTimeout yield varmistaa statusrivin piirtämisen ennen WASM-blokkia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:31:25 +03:00
1ae1bf98e2 API timeout nostettu 120s → 600s: WASM-inferenssi on hidasta
Kvantisoidun 1.5B-mallin inferenssi on ~0.2 tok/s WASM:ssa.
Pipeline-tehtävät vaativat pidemmän odotusajan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:29:15 +03:00
f567fd3f8a Mallin automaattinen lataus poistettu — käyttäjä käynnistää kpn load:lla
Aiemmin localStorage muisti edellisen latauksen ja käynnisti mallin
automaattisesti sivulle tullessa. Nyt käyttäjä päättää itse milloin lataa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 18:03:17 +03:00
38367eac97 Terminaaliin latauksen tilaindikaattori (spinner + vaihe)
Mallin latauksen aikana terminaalissa näkyy animoitu spinner
ja nykyinen vaihe: WASM → tokenizer → malli (%) → rakennus → valmis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:29:33 +03:00
20716186bc Hub: qwen-coder reititys tunnistaa kaikki coder-solmut (05b, 3b, 1.5b)
API etsi vain 'qwen-coder-05b' tai 'qwen-coder', ei 'qwen-coder-3b'.
Nyt task.starts_with('qwen-coder') matchaa kaikki variantit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:27:49 +03:00
4e810ed4a2 Kaikki agentit käyttävät qwen-coder -mallia + valmis-viesti deduplikoitu
QA ja DevOps käyttivät smollm-135m:ää jota ei ole selaimessa ladattuna.
Nyt kaikki agentit käyttävät ladattua qwen-coder-mallia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:23:59 +03:00
91ff9e00f9 kvantisointia 2026-04-06 16:15:56 +03:00
e652bf7ab6 1.5B Q4_K_M: vaihdettu 3B→1.5B koska 3B ei mahdu WASM:iin (~1 GB vs ~2 GB)
3B GGUF vaati ~5 GB muistia parsinnassa → SIGILL WASM:n 4 GB rajalla.
1.5B Q4_K_M on ~1 GB ja mahtuu turvallisesti selaimeen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:14:41 +03:00
eb69893124 WASM release-build: GGUF dequantize vaatii optimointeja
Debug-moodi aiheutti SIGILL (Illegal Instruction) GGUF-tensorien
dequantisoinnissa. Release-build ratkaisee ongelman.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:07:02 +03:00
d18314bfc8 GGUF Q4_K_M -tuki 3B-mallille: kvantisoidtu versio (~1.9 GB) mahtuu selaimeen
Safetensors-muotoinen 3B (~6.2 GB) aiheutti WASM capacity overflow.
Nyt käytetään candle quantized_qwen2 -moduulia GGUF-tiedoston lataamiseen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:54:23 +03:00
99b011e399 Isomman qwen-mallin lataus 2026-04-06 13:40:19 +03:00
Jaakko Vanhala
3976bb6251 IP-yhteysraja nostettu 4→10: mahdollistaa useamman laitteen samasta IP:stä
Jokainen selain tarvitsee 2 WebSocket-yhteyttä (UI + coder-node).
Vanha raja 4 esti toisen koneen yhdistämisen samasta IP:stä (esim. kotiverkko).
Uusi raja 10 riittää 5 samanaikaiselle selaimelle / laitteelle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:56:36 +03:00
Jaakko Vanhala
0c32fecdc4 GUIDE.md: laajennettu tokenisaatio-osio suomi/englanti-vertailulla
Lisätty konkreettiset esimerkit Qwen2.5-Coder -tokenisaattorilla:
- Koodi-esimerkki: print vs. tulosta
- Kolme lauseparia taulukossa (The cat sat / Kissa istui jne.)
- Merkkejä/token -sarake näyttää tehokkuuseron
- Selitys miksi englanti on 30-50% tehokkaampaa
- Miksi tämä merkitsee: nopeus, konteksti, ymmärrys

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:41:12 +03:00
Jaakko Vanhala
801cc0371d Yhtenäinen kirjoitusasu: Qwen2.5-Coder:0.5B ja Qwen2.5-Coder:3B (kaksoispiste)
Korjattu agents-sivun status-palkki, codelab-loading ja GUIDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:27:22 +03:00
Jaakko Vanhala
176f2d6915 Mermaid-kaaviot oppaaseen + mallitiedot agents-sivun status-palkkiin
GUIDE.md:n ASCII-kaaviot korvattu Mermaid-kaavioilla:
- Projekti-pipeline: flowchart TD värikoodatuilla vaiheilla
- Prompttirakenne: system → agent → user → prefill ketju

Mermaid ladataan CDN:stä ja renderöidään automaattisesti dark-teemalla.
Fallback: kaavion lähdekoodi näkyy tekstinä jos Mermaid ei lataudu.

Agents-sivun compute-status näyttää nyt tarkan mallitiedon:
- "Qwen2.5-Coder-0.5B" tai "Qwen2.5-Coder-3B"
- Tooltip: parametrimäärä, runtime, max tokenit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:23:19 +03:00
Jaakko Vanhala
dd1945ab28 Opas-välilehti: GUIDE.md renderöidään sivustolle omana näkymänä
Uusi "Opas"-välilehti (panel-guide) lataa GUIDE.md:n fetchillä ja
renderöi sen inline markdown→HTML -parserilla:
- Otsikot (h1-h3) GitHub-tyylisesti
- Koodiblokit highlight.js-korostuksella
- Taulukot (header + body, border-collapse)
- Listat (bullet + numeroitu)
- Inline-muotoilu: **bold**, *italic*, `code`
- Horisontaaliviivat

GUIDE.md siirretty static/-hakemistoon jotta hub servaa sen suoraan.
Navigointi: #guide hash tai klikkaa "Opas"-välilehteä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:20:54 +03:00
Jaakko Vanhala
262fee3b49 GUIDE.md: opettavainen yhteenveto kielimalleista, tokeneista ja laadun parantamisesta
Kattaa:
- Kielimallit ja parametrimäärät (135M → 1800B vertailu)
- Tokenit: mitä ne ovat, miksi kieli vaikuttaa, token-budjetti
- Prompttirakenne: system/agent/user/prefill + miksi englanniksi
- Prefill-tekniikka: miten se toimii ja miksi se säästää tokeneita
- Sampling: temperature, top-k, repetition penalty selitettyinä
- Stop-sekvenssit: milloin generointi loppuu
- Projekti-pipeline: agenttitiimin työnkulku kaaviona
- Laadun parantaminen 10 eri keinolla:
  1. Isompi malli
  2. Paremmat promptit
  3. Kontekstin hallinta
  4. Iterointi (review-luuppi)
  5. Erikoistetut system promptit
  6. Few-shot esimerkit
  7. Temperature-säätö tehtävän mukaan
  8. Ensemble (sama prompti usealle mallille)
  9. Post-processing
  10. Fine-tuning (LoRA)
- Välimuistiarkkitehtuuri: miksi toinen lataus on nopea
- Käytännön lukuja: token-määrät, ajat, kustannukset

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:16:36 +03:00
Jaakko Vanhala
aa7540a6bf Prompt Inspector: [>]-nappi status-rivillä näyttää mitä mallille lähetettiin
Jokaisen kpnRun-tuloksen status-rivillä on [>]-nappi joka avaa inspektor-paneelin:
- system: inferenssin system prompt
- shared: kaikille agenteille yhteinen prompti (jos asetettu)
- agent: valitun agentin system prompt
- user: käyttäjän/pipelinen prompti (kokonaisuudessaan, scrollattava)
- prefill: ``` (ChatML prefill-tekniikka)
- Token-estimaatti: ~N tok in → M tok out

Paneeli avautuu/sulkeutuu klikkaamalla. Näyttää eksaktisti saman
mitä malli saa syötteeksi — hyödyllinen debuggaukseen ja promptien
kehittämiseen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:00:11 +03:00
Jaakko Vanhala
762066102a PROMPTS.md: kaikki järjestelmän promptit dokumentoitu eksaktisti
Kattaa kaikki 9 osa-aluetta:
1. Inferenssin system prompt (ChatML)
2. Agenttikohtaiset system promptit (7 agenttia)
3. Projekti-pipeline promptit (5 vaihetta + erikoistapaukset)
4. Yksinkertaisen pipelinen promptit
5. Yksittäiset komennot (run, hello, warmup)
6. Stop-sekvenssit (10 kpl)
7. Vastauksen siivous (4 vaihetta)
8. ChatML-promptin koostaminen (prefill-tekniikka)
9. Sampling-parametrit

Jokainen prompti on eksaktissa muodossaan muuttujamerkinnöillä.
Parsintasäännöt ja erikoistapaukset (pyproject.toml, requirements.txt)
dokumentoitu yksityiskohtaisesti.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 07:53:01 +03:00
Jaakko Vanhala
bef5b6fc3c uv/pyproject.toml tuki projektipipelineen, requirements.txt fallbackina
Managerin prompti ohjaa käyttämään pyproject.toml:ia (.toml sallittu).
Koodari saa pyproject.toml-tiedostolle eksplisiittisen esimerkkiformaatin
jossa [project] + dependencies + [project.scripts] start-komennolla.
requirements.txt toimii edelleen fallbackina jos malli tuottaa sen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 07:43:47 +03:00
Jaakko Vanhala
095b72d2d6 Managerin prompti: riippuvuusjärjestys (models.py ennen main.py)
Lisätty sääntö: "List dependencies first, then main app" jotta
koodari saa kirjoitettua riippuvuudet (models, schemas) ensin
ja pääsovelluksen (main.py) saa kontekstiksi oikeat importit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:52:25 +03:00
Jaakko Vanhala
4cb6128a27 Tiedostoparsinta: hyväksyy myös pelkät tiedostonimet ilman kuvausta
Manageri tuottaa toisinaan pelkän listan (app.py, requirements.txt)
ilman "filename: description" -formaattia. Parsija hyväksyy nyt molemmat.
Koodarin prompti vahvistettu: "Use the exact libraries mentioned in the
project description" estää Flaskiin vaihtamisen kun tehtävä sanoo FastAPI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:44:16 +03:00
Jaakko Vanhala
4dff534fbf Projektikortti: tiedostovälilehdet, kopioi per tiedosto, lataa ZIP
Pipeline-tulokset renderöidään interaktiivisena projektikorttina terminaaliin:

- Tiedostovälilehdet (klikkaa vaihtaaksesi: main.py | models.py | ...)
- Syntaksikorostus (highlight.js) jokaisessa tiedostossa
- "Kopioi"-nappi per tiedosto (leikepöydälle)
- "Kopioi kaikki" -nappi (kaikki tiedostot yhtenä tekstinä)
- "Lataa ZIP" -nappi (selaimessa generoitu ZIP ilman ulkoisia kirjastoja)

ZIP-generointi on toteutettu puhtaalla JavaScriptillä (uncompressed store)
ilman JSZip- tai muita riippuvuuksia.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:37:10 +03:00
Jaakko Vanhala
d5ab6272d3 Paranneltu project-pipelinen promptit ja tiedostoparsinta
Managerin prompti:
- Selkeämpi formaatti: "filename.py: what this file contains"
- Eksplisiittiset säännöt: max 4 tiedostoa, ei polkuja, vain tiedostonimet
- Sallitut tiedostopäätteet: .py, .txt, .json, .html

Tiedostoparsinta tiukennettu:
- Hylkää polut (chucknorris/fastapi/...) — vaatii ettei sisällä /
- Vaatii tiedostopäätteen (.xyz)
- Ei välilyöntejä nimessä

Koodarin prompti:
- "Project:" konteksti ensin, sitten tarkka tiedostokohtainen ohje
- "Write correct, working code. No explanations."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:34:23 +03:00
Jaakko Vanhala
2e7b86deeb Pipeline-vaiheiden visuaalinen seuranta agenttinäkymässä
Terminaalin yläpuolelle ilmestyy pipeline-progress-palkki:
  ✓ Suunnittelu → ✓ models.py → ◷ main.py → ◯ Review

Jokainen vaihe on hover-tooltip joka näyttää:
- Vaiheen nimi ja agentti (värikoodattu)
- Input: mitä agentti sai syötteeksi
- Output: mitä agentti tuotti (esikatselu 150 merkkiä)

Myös agenttien avatar-korttien tooltip päivittyy reaaliaikaisesti
näyttämään viimeisimmän vaiheen input/output.

Palkki tyhjenee automaattisesti uuden pipelinen alkaessa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:32:59 +03:00
Jaakko Vanhala
a6e49870d6 Monivaiheinen projektipipeline: kpn project -komento
Uusi kpn project -komento rakentaa ohjelmistoprojektin tiedosto kerrallaan:

1. Manageri pilkkoo projektin tiedostoiksi (max 5)
   → parsii "FILENAME: description" -rivit
2. Koodari generoi jokaisen tiedoston erikseen
   → saa kontekstina aiemmin generoidut tiedostot
3. Testaaja arvioi koko projektin
   → etsii bugeja ja puutteita
4. Korjausluuppi: jos testaaja löytää ongelmia
   → koodari saa review-palautteen ja korjaa
   → testaaja arvioi uudelleen

Fallback: jos manageri ei tuota tiedostolistaa, generoidaan yhtenä kokonaisuutena.

kpn pipeline säilyy yksinkertaisena 3-vaiheisena (manageri → koodari → testaaja).

Esimerkkejä:
  kpn project "FastAPI + SQLite REST API for users"
  kpn project "Flask todo app with database"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:30:38 +03:00
Jaakko Vanhala
d68882249e Token-raja 256→512: mahdollistaa pidemmät kooditiedostot
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:11:09 +03:00
Jaakko Vanhala
6a587cd080 Terminaalin status-rivit tiivistetty: yksi rivi per tehtävä jota päivitetään
Ennen (3 riviä):
  → qwen-coder käsittelee...
  → Reititetty solmulle #2
  ✓ Qwen2.5-Coder-0.5B-Instruct (75 tok)

Jälkeen (1 rivi jota päivitetään):
  → qwen-coder käsittelee...  →  → Reititetty solmulle #2 ▌  →  ✓ Qwen2.5-Coder 75 tok · 12.0s · 6.3 tok/s

Sama div päivitetään pyynnön elinkaaren läpi: käsittelee → reititys → valmis/virhe.
task_routed-viestit päivittävät samaa riviä uusien rivien sijaan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:34:07 +03:00
Jaakko Vanhala
f17fcf0f9d Terminaalin koodivastaus tiivistetty yhdelle riville, klikkaus laajentaa
Tuloste näyttää ensimmäisen koodirivin + "(+N riviä)":
  ✓ Qwen2.5-Coder (75 tok)
  ▶ fn fibonacci(n: usize) -> usize { (+8 riviä)

Klikkaus laajentaa/sulkee koko koodin highlight.js-korostuksella
ja vasemman reunan indikaattoriviivalla.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:29:18 +03:00
Jaakko Vanhala
ac15336c9f Stop-sekvenssit: katkaistaan myös "// Example usage" ja "# Example" kommentit
Malli tuottaa toisinaan esimerkkikoodia funktioiden jälkeen joka ei ole osa
varsinaista vastausta. Nyt generointi katkeaa ennen näitä.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:21:40 +03:00
445 changed files with 83253 additions and 1786 deletions

12
.gitignore vendored
View File

@@ -34,3 +34,15 @@ Cargo.lock
*.pdb
# End of https://www.toptal.com/developers/gitignore/api/rust,linux
# Ajonaikaiset tietokannat
*.db
# Lokitiedostot
*.log
# Wanha versio
temp/
# Muut
zipit/**

157
TEMPLATING.md Normal file
View File

@@ -0,0 +1,157 @@
# Templating — rakennuspalaset koodigeneroinnissa
## Perusperiaate
Kielimalli päättää **mitä** rakennetaan (entiteetit, kentät, tyypit, yhteydet).
Template-funktiot päättävät **miten** se rakennetaan (importit, engine setup, testikonfiguraatio).
```
Projektikuvaus → LLM → JSON-speksi → Templateit → Koodi → Validointi
```
LLM:n kontribuutio on yksi JSON-rakenne. Kaikki muu on determinististä —
sama speksi tuottaa aina saman koodin.
## Miksi tämä toimii
Pienen kielimallin (0.5B7B) vahvuudet ja heikkoudet ovat epäsymmetrisiä:
| Tehtävä | LLM:n kyky | Ratkaisu |
|---------|-----------|----------|
| Tunnista entiteetit kuvauksesta | Hyvä | LLM tekee |
| Valitse kenttätyypit | Hyvä | LLM tekee |
| Muista importit oikein | Huono | Template tekee |
| SQLite connect_args | Huono | Template tekee |
| Testikonfiguraatio | Huono | Template tekee |
| Dockerfile-rakenne | Huono | Template tekee |
Annetaan mallin tehdä se missä se on hyvä. Hoidetaan loput mekaanisesti.
## JSON-speksi
Kielimallin ainoa tuotos on JSON joka kuvaa projektin rakenteen:
```json
{
"project_name": "library-app",
"entities": [
{
"name": "Author",
"table_name": "authors",
"fields": [
{"name": "name", "sa_type": "String(255)", "py_type": "str", "nullable": false, "default": null}
]
},
{
"name": "Book",
"table_name": "books",
"fields": [
{"name": "title", "sa_type": "String(255)", "py_type": "str", "nullable": false, "default": null},
{"name": "author_id", "sa_type": "Integer", "py_type": "int", "nullable": false, "default": null}
]
}
],
"relationships": [
{"from": "Book", "field": "author_id", "to": "Author", "type": "many-to-one"}
],
"extra_imports": []
}
```
Speksin laatu ratkaisee kaiken. Hyvä speksi → hyvä projekti. Huono speksi →
teknisesti toimiva mutta sisällöllisesti väärä projekti.
## Architect-promptin rooli
Architect-agentti (JSON-speksin generoija) on kriittisin kohta koko pipelinessa.
Sitä ohjataan neljällä keinolla:
1. **Chain-of-thought** — malli miettii ensin entiteetit, sitten kentät,
sitten yhteydet, vasta lopuksi JSON
2. **Domain-esimerkit** — Todo, verkkokauppa, blogi — malli näkee miltä
hyvä speksi näyttää eri domaineissa
3. **Anti-patternit** — turhat ID-kentät, Enum-tyypit, suomenkieliset nimet
4. **Yhteyssäännöt** — jokainen `_id`-kenttä tarvitsee relationship-merkinnän
Isompi malli tässä yhdessä kohdassa parantaisi kaikkien projektien laatua.
## Templateit
Jokainen template on funktio joka ottaa speksin ja palauttaa koodia:
```
tmplModels(spec) → models.py (SQLAlchemy, ForeignKey, relationship)
tmplSchemas(spec) → schemas.py (Pydantic Create/Response/Detail)
tmplMain(spec) → main.py (FastAPI CRUD + nested endpoints + FK-validointi)
tmplTests(spec) → test_main.py (pytest + TestClient + helper-funktiot)
tmplPyproject(spec) → pyproject.toml (PEP 621)
tmplDockerfile() → Dockerfile (uv + non-root user)
```
Templateit generoivat automaattisesti:
- ForeignKey-constraintit ja relationship()-määrittelyt
- Nested endpointit (`GET /authors/{id}/books/`)
- FK-validointi (404 jos parent-entiteettiä ei ole)
- Detail-schemat (Book + author-data mukana)
- Test-helperit jotka luovat parent-entiteetit ensin
- Bad FK -testit (varmistaa että orpo-validointi toimii)
## Validointi
Generoitu koodi validoidaan mekaanisesti ennen käyttöä:
- Syntaksitarkistus (AST parse)
- Projektin sisäiset importit (löytyykö nimi lähdetiedostosta)
- SQLite connect_args
- Relatiiviset importit (kielletty)
- Testien rakenne (ei saa kopioida appia)
- pyproject.toml (ei poetryä)
- Dockerfile (ei poetryä, uv cache -oikeudet)
Docker-testi ajaa koko projektin: build → pytest → API smoke test.
## Rajoitukset
Templateit kattavat rakenteellisesti tunnetut projektit:
| Stack | Kattavuus |
|-------|-----------|
| FastAPI + SQLAlchemy CRUD | Toimii hyvin |
| Streamlit + DuckDB dashboard | Toimii hyvin |
| Muu | Ei templatea → ei toimi |
**Ei kata:**
- Custom business-logiikka (algoritmit, laskenta, ML)
- Epätyypilliset arkkitehtuurit (WebSocket, graafit, tapahtumapohjaiset)
- Frontend-sovellukset (React, Vue)
- Mikä tahansa mitä template ei tunne
Arvio: templateit kattavat ~20% kaikista mahdollisista projekteista, mutta juuri
sen 20% mitä opiskelu- ja prototyyppiympäristöissä tarvitaan useimmin.
## Laajentaminen
Uuden stackin lisääminen vaatii:
1. Uudet template-funktiot (käsityö, ~200400 riviä per stack)
2. JSON-speksin laajennos (uudet kentät jos tarvitaan)
3. Validointisäännöt uudelle stackille
4. Docker-testikonfiguraatio
Jokainen template on staattinen — se ei opi eikä sopeudu. Kattavuus kasvaa
vain kirjoittamalla lisää templateja.
## Hybridi: seuraava askel
Paras lopputulos syntyisi yhdistelmällä:
```
Speksi → Template (runko) → LLM (business-logiikka) → Validointi
```
Template tuottaa toimivan CRUD-pohjan. LLM lisää domain-kohtaisen logiikan
pienissä palasissa (yksi funktio kerrallaan). Mekaaninen validointi
tarkistaa jokaisen lisäyksen.
Tämä palauttaa LLM:n epäluotettavuuden takaisin peliin, mutta rajattuna:
virheet ovat paikallisia (yksi funktio) eivätkä rakenteellisia (koko projekti).

475
docker-errors.log Normal file
View File

@@ -0,0 +1,475 @@
[INFO]: Checking for the Wasm target...
info: downloading component rust-std
[INFO]: Compiling to Wasm...
Compiling node v0.1.0 (/app/node)
warning: unused imports: `DType`, `Device`, and `Tensor`
--> node/src/smollm.rs:1:19
|
1 | use candle_core::{Device, Tensor, DType};
| ^^^^^^ ^^^^^^ ^^^^^
|
= note: `#[warn(unused_imports)]` (part of `#[warn(unused)]`) on by default
warning: unused import: `candle_nn::VarBuilder`
--> node/src/smollm.rs:2:5
|
2 | use candle_nn::VarBuilder;
| ^^^^^^^^^^^^^^^^^^^^^
warning: unused imports: `Cache`, `LlamaConfig`, `LlamaEosToks`, and `Llama`
--> node/src/smollm.rs:3:42
|
3 | use candle_transformers::models::llama::{Llama, LlamaConfig, LlamaEosToks, Cache};
| ^^^^^ ^^^^^^^^^^^ ^^^^^^^^^^^^ ^^^^^
warning: unused imports: `DType`, `Device`, and `Tensor`
--> node/src/phi3.rs:1:19
|
1 | use candle_core::{Device, Tensor, DType};
| ^^^^^^ ^^^^^^ ^^^^^
warning: unused import: `candle_nn::VarBuilder`
--> node/src/phi3.rs:2:5
|
2 | use candle_nn::VarBuilder;
| ^^^^^^^^^^^^^^^^^^^^^
warning: unused imports: `Config as Phi3Config` and `Model as Phi3Model`
--> node/src/phi3.rs:3:41
|
3 | use candle_transformers::models::phi3::{Config as Phi3Config, Model as Phi3Model};
| ^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^
warning: unused import: `wasm_bindgen::JsCast`
--> node/src/phi3.rs:4:5
|
4 | use wasm_bindgen::JsCast;
| ^^^^^^^^^^^^^^^^^^^^
warning: unused import: `crate::storage`
--> node/src/phi3.rs:9:5
|
9 | use crate::storage;
| ^^^^^^^^^^^^^^
warning: unused import: `Int`
--> node/src/burn_smollm/attention.rs:2:46
|
2 | use burn::tensor::{backend::Backend, Tensor, Int};
| ^^^
warning: unused imports: `Mlp` and `RmsNorm`
--> node/src/burn_smollm/attention.rs:4:22
|
4 | use super::modules::{RmsNorm, Mlp};
| ^^^^^^^ ^^^
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/smollm.rs:174:23
|
174 | burn::tensor::Data::new(input_ids.iter().map(|&x| x as i32).collect::<Vec<_>>(), [input_len].into()),
| ^^^^
|
= note: `#[warn(deprecated)]` on by default
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/smollm.rs:200:27
|
200 | burn::tensor::Data::new(vec![next_token as i32], [1].into()),
| ^^^^
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/burn_smollm/loader.rs:1:46
|
1 | use burn::tensor::{backend::Backend, Tensor, Data};
| ^^^^
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/burn_smollm/loader.rs:17:16
|
17 | let data = Data::new(vec, shape_out_in.into());
| ^^^^
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/burn_smollm/loader.rs:32:16
|
32 | let data = Data::new(vec, shape.into());
| ^^^^
warning: use of deprecated struct `burn::tensor::Data`: the internal data format has changed, please use `TensorData` instead
--> node/src/burn_smollm/loader.rs:45:16
|
45 | let data = Data::new(vec, shape.into());
| ^^^^
error[E0061]: this function takes 2 arguments but 1 argument was supplied
--> node/src/smollm.rs:124:9
|
124 | burn_wgpu::init_async::<burn_wgpu::AutoGraphicsApi>(&Default::default()).await;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^--------------------- argument #2 of type `RuntimeOptions` is missing
|
note: function defined here
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cubecl-wgpu-0.2.0/src/runtime.rs:116:14
|
116 | pub async fn init_async<G: GraphicsApi>(device: &WgpuDevice, options: RuntimeOptions) {
| ^^^^^^^^^^
help: provide the argument
|
124 | burn_wgpu::init_async::<burn_wgpu::AutoGraphicsApi>(&Default::default(), /* RuntimeOptions */).await;
| ++++++++++++++++++++++
error[E0277]: the trait bound `TensorData: From<burn::tensor::Data<i32, 1>>` is not satisfied
--> node/src/smollm.rs:174:9
|
173 | let mut input_tensor = burn::tensor::Tensor::<B, 1, burn::tensor::Int>::from_data(
| ---------------------------------------------------------- required by a bound introduced by this call
174 | burn::tensor::Data::new(input_ids.iter().map(|&x| x as i32).collect::<Vec<_>>(), [input_len].into()),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `From<burn::tensor::Data<i32, 1>>` is not implemented for `TensorData`
|
= help: the following other types implement trait `From<T>`:
`TensorData` implements `From<&[E]>`
`TensorData` implements `From<&[usize]>`
`TensorData` implements `From<[E; A]>`
`TensorData` implements `From<[[E; B]; A]>`
`TensorData` implements `From<[[[E; C]; B]; A]>`
`TensorData` implements `From<[[[[E; D]; C]; B]; A]>`
`TensorData` implements `From<[[[[[Elem; E]; D]; C]; B]; A]>`
`TensorData` implements `From<[usize; A]>`
= note: required for `burn::tensor::Data<i32, 1>` to implement `Into<TensorData>`
note: required by a bound in `burn::tensor::Tensor::<B, D, K>::from_data`
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:719:12
|
717 | pub fn from_data<T>(data: T, device: &B::Device) -> Self
| --------- required by a bound in this associated function
718 | where
719 | T: Into<TensorData>,
| ^^^^^^^^^^^^^^^^ required by this bound in `Tensor::<B, D, K>::from_data`
error[E0061]: this method takes 2 arguments but 0 arguments were supplied
--> node/src/smollm.rs:183:51
|
183 | let next_token_tensor = last_logits.argmax(2).flatten::<1>();
| ^^^^^^^^^^^^-- two arguments of type `usize` and `usize` are missing
|
note: method defined here
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:292:12
|
292 | pub fn flatten<const D2: usize>(self, start_dim: usize, end_dim: usize) -> Tensor<B, D2, K> {
| ^^^^^^^
help: provide the arguments
|
183 | let next_token_tensor = last_logits.argmax(2).flatten::<1>(/* usize */, /* usize */);
| ++++++++++++++++++++++++
error[E0277]: the trait bound `TensorData: From<burn::tensor::Data<i32, 1>>` is not satisfied
--> node/src/smollm.rs:200:13
|
199 | let mut input_tensor = burn::tensor::Tensor::<B, 1, burn::tensor::Int>::from_data(
| ---------------------------------------------------------- required by a bound introduced by this call
200 | burn::tensor::Data::new(vec![next_token as i32], [1].into()),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `From<burn::tensor::Data<i32, 1>>` is not implemented for `TensorData`
|
= help: the following other types implement trait `From<T>`:
`TensorData` implements `From<&[E]>`
`TensorData` implements `From<&[usize]>`
`TensorData` implements `From<[E; A]>`
`TensorData` implements `From<[[E; B]; A]>`
`TensorData` implements `From<[[[E; C]; B]; A]>`
`TensorData` implements `From<[[[[E; D]; C]; B]; A]>`
`TensorData` implements `From<[[[[[Elem; E]; D]; C]; B]; A]>`
`TensorData` implements `From<[usize; A]>`
= note: required for `burn::tensor::Data<i32, 1>` to implement `Into<TensorData>`
note: required by a bound in `burn::tensor::Tensor::<B, D, K>::from_data`
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:719:12
|
717 | pub fn from_data<T>(data: T, device: &B::Device) -> Self
| --------- required by a bound in this associated function
718 | where
719 | T: Into<TensorData>,
| ^^^^^^^^^^^^^^^^ required by this bound in `Tensor::<B, D, K>::from_data`
error[E0061]: this method takes 2 arguments but 0 arguments were supplied
--> node/src/smollm.rs:207:50
|
207 | let next_token_tensor = logits.argmax(2).flatten::<1>();
| ^^^^^^^^^^^^-- two arguments of type `usize` and `usize` are missing
|
note: method defined here
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:292:12
|
292 | pub fn flatten<const D2: usize>(self, start_dim: usize, end_dim: usize) -> Tensor<B, D2, K> {
| ^^^^^^^
help: provide the arguments
|
207 | let next_token_tensor = logits.argmax(2).flatten::<1>(/* usize */, /* usize */);
| ++++++++++++++++++++++++
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:58:13
|
58 | q = q.reshape([batch, seq_len, self.num_heads, self.head_dim]).swap_dims(1, 2);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `3`, found `4`
|
= note: expected struct `burn::tensor::Tensor<_, 3>`
found struct `burn::tensor::Tensor<_, 4>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:59:13
|
59 | k = k.reshape([batch, seq_len, self.num_kv_heads, self.head_dim]).swap_dims(1, 2);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `3`, found `4`
|
= note: expected struct `burn::tensor::Tensor<_, 3>`
found struct `burn::tensor::Tensor<_, 4>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:60:13
|
60 | v = v.reshape([batch, seq_len, self.num_kv_heads, self.head_dim]).swap_dims(1, 2);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `3`, found `4`
|
= note: expected struct `burn::tensor::Tensor<_, 3>`
found struct `burn::tensor::Tensor<_, 4>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:63:31
|
63 | q = self.rope.forward(q, offset);
| ------- ^ expected `4`, found `3`
| |
| arguments to this method are incorrect
|
= note: expected struct `burn::tensor::Tensor<_, 4>`
found struct `burn::tensor::Tensor<_, 3>`
note: method defined here
--> node/src/burn_smollm/rope.rs:35:12
|
35 | pub fn forward(&self, x: Tensor<B, 4>, offset: usize) -> Tensor<B, 4> {
| ^^^^^^^ ---------------
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:63:13
|
63 | q = self.rope.forward(q, offset);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `3`, found `4`
|
= note: expected struct `burn::tensor::Tensor<_, 3>`
found struct `burn::tensor::Tensor<_, 4>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:64:31
|
64 | k = self.rope.forward(k, offset);
| ------- ^ expected `4`, found `3`
| |
| arguments to this method are incorrect
|
= note: expected struct `burn::tensor::Tensor<_, 4>`
found struct `burn::tensor::Tensor<_, 3>`
note: method defined here
--> node/src/burn_smollm/rope.rs:35:12
|
35 | pub fn forward(&self, x: Tensor<B, 4>, offset: usize) -> Tensor<B, 4> {
| ^^^^^^^ ---------------
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:64:13
|
64 | k = self.rope.forward(k, offset);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `3`, found `4`
|
= note: expected struct `burn::tensor::Tensor<_, 3>`
found struct `burn::tensor::Tensor<_, 4>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:68:41
|
68 | c.k = Tensor::cat(vec![c.k, k], 2);
| ^ expected `4`, found `3`
|
= note: expected struct `burn::tensor::Tensor<_, 4>`
found struct `burn::tensor::Tensor<_, 3>`
error[E0308]: mismatched types
--> node/src/burn_smollm/attention.rs:69:41
|
69 | c.v = Tensor::cat(vec![c.v, v], 2);
| ^ expected `4`, found `3`
|
= note: expected struct `burn::tensor::Tensor<_, 4>`
found struct `burn::tensor::Tensor<_, 3>`
error[E0308]: `if` and `else` have incompatible types
--> node/src/burn_smollm/attention.rs:72:13
|
67 | let (k, v) = if let Some(mut c) = cache {
| ______________________-
68 | | c.k = Tensor::cat(vec![c.k, k], 2);
69 | | c.v = Tensor::cat(vec![c.v, v], 2);
70 | | (c.k.clone(), c.v.clone())
| | -------------------------- expected because of this
71 | | } else {
72 | | (k.clone(), v.clone())
| | ^^^^^^^^^^^^^^^^^^^^^^ expected `4`, found `3`
73 | | };
| |_________- `if` and `else` have incompatible types
|
= note: expected tuple `(burn::tensor::Tensor<_, 4>, burn::tensor::Tensor<_, 4>)`
found tuple `(burn::tensor::Tensor<_, 3>, burn::tensor::Tensor<_, 3>)`
error[E0282]: type annotations needed
--> node/src/burn_smollm/attention.rs:75:38
|
75 | let new_cache = KVCache { k: k.clone(), v: v.clone() };
| ^ cannot infer type
error[E0282]: type annotations needed
--> node/src/burn_smollm/attention.rs:75:52
|
75 | let new_cache = KVCache { k: k.clone(), v: v.clone() };
| ^ cannot infer type
error[E0277]: the trait bound `TensorData: From<burn::tensor::Data<f32, 2>>` is not satisfied
--> node/src/burn_smollm/loader.rs:18:44
|
18 | let t_burn = Tensor::<B, 2>::from_data(data, device);
| ------------------------- ^^^^ the trait `From<burn::tensor::Data<f32, 2>>` is not implemented for `TensorData`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `From<T>`:
`TensorData` implements `From<&[E]>`
`TensorData` implements `From<&[usize]>`
`TensorData` implements `From<[E; A]>`
`TensorData` implements `From<[[E; B]; A]>`
`TensorData` implements `From<[[[E; C]; B]; A]>`
`TensorData` implements `From<[[[[E; D]; C]; B]; A]>`
`TensorData` implements `From<[[[[[Elem; E]; D]; C]; B]; A]>`
`TensorData` implements `From<[usize; A]>`
= note: required for `burn::tensor::Data<f32, 2>` to implement `Into<TensorData>`
note: required by a bound in `burn::tensor::Tensor::<B, D, K>::from_data`
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:719:12
|
717 | pub fn from_data<T>(data: T, device: &B::Device) -> Self
| --------- required by a bound in this associated function
718 | where
719 | T: Into<TensorData>,
| ^^^^^^^^^^^^^^^^ required by this bound in `Tensor::<B, D, K>::from_data`
error[E0277]: the trait bound `TensorData: From<burn::tensor::Data<f32, 1>>` is not satisfied
--> node/src/burn_smollm/loader.rs:33:53
|
33 | Ok(Param::from_tensor(Tensor::<B, 1>::from_data(data, device)))
| ------------------------- ^^^^ the trait `From<burn::tensor::Data<f32, 1>>` is not implemented for `TensorData`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `From<T>`:
`TensorData` implements `From<&[E]>`
`TensorData` implements `From<&[usize]>`
`TensorData` implements `From<[E; A]>`
`TensorData` implements `From<[[E; B]; A]>`
`TensorData` implements `From<[[[E; C]; B]; A]>`
`TensorData` implements `From<[[[[E; D]; C]; B]; A]>`
`TensorData` implements `From<[[[[[Elem; E]; D]; C]; B]; A]>`
`TensorData` implements `From<[usize; A]>`
= note: required for `burn::tensor::Data<f32, 1>` to implement `Into<TensorData>`
note: required by a bound in `burn::tensor::Tensor::<B, D, K>::from_data`
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:719:12
|
717 | pub fn from_data<T>(data: T, device: &B::Device) -> Self
| --------- required by a bound in this associated function
718 | where
719 | T: Into<TensorData>,
| ^^^^^^^^^^^^^^^^ required by this bound in `Tensor::<B, D, K>::from_data`
error[E0277]: the trait bound `TensorData: From<burn::tensor::Data<f32, 2>>` is not satisfied
--> node/src/burn_smollm/loader.rs:47:53
|
47 | Ok(Param::from_tensor(Tensor::<B, 2>::from_data(data, device)))
| ------------------------- ^^^^ the trait `From<burn::tensor::Data<f32, 2>>` is not implemented for `TensorData`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `From<T>`:
`TensorData` implements `From<&[E]>`
`TensorData` implements `From<&[usize]>`
`TensorData` implements `From<[E; A]>`
`TensorData` implements `From<[[E; B]; A]>`
`TensorData` implements `From<[[[E; C]; B]; A]>`
`TensorData` implements `From<[[[[E; D]; C]; B]; A]>`
`TensorData` implements `From<[[[[[Elem; E]; D]; C]; B]; A]>`
`TensorData` implements `From<[usize; A]>`
= note: required for `burn::tensor::Data<f32, 2>` to implement `Into<TensorData>`
note: required by a bound in `burn::tensor::Tensor::<B, D, K>::from_data`
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:719:12
|
717 | pub fn from_data<T>(data: T, device: &B::Device) -> Self
| --------- required by a bound in this associated function
718 | where
719 | T: Into<TensorData>,
| ^^^^^^^^^^^^^^^^ required by this bound in `Tensor::<B, D, K>::from_data`
error[E0599]: no function or associated item named `arange` found for struct `burn::tensor::Tensor<B, 1>` in the current scope
--> node/src/burn_smollm/rope.rs:19:33
|
19 | let t = Tensor::<B, 1>::arange(0..max_seq_len as i64, device).float().unsqueeze::<2>().transpose();
| ^^^^^^ function or associated item not found in `burn::tensor::Tensor<B, 1>`
|
note: if you're trying to build a new `burn::tensor::Tensor<B, 1>` consider using one of the following associated functions:
burn::tensor::Tensor::<B, D, K>::new
burn::tensor::Tensor::<B, D, K>::from_primitive
burn::tensor::Tensor::<B, D, K>::empty
burn::tensor::Tensor::<B, D, K>::from_data
and 9 others
--> /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/burn-tensor-0.14.0/src/tensor/api/base.rs:24:10
|
24 | #[derive(new, Clone, Debug)]
| ^^^
...
55 | pub fn from_primitive(tensor: K::Primitive<D>) -> Self {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
60 | pub fn empty<S: Into<Shape<D>>>(shape: S, device: &B::Device) -> Self {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
717 | / pub fn from_data<T>(data: T, device: &B::Device) -> Self
718 | | where
719 | | T: Into<TensorData>,
| |____________________________^
= note: the function or associated item was found for
- `burn::tensor::Tensor<B, 1, burn::tensor::Int>`
= note: this error originates in the derive macro `new` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: variable does not need to be mutable
--> node/src/burn_smollm/loader.rs:70:13
|
70 | let mut layer = &mut model.layers[i];
| ----^^^^^
| |
| help: remove this `mut`
|
= note: `#[warn(unused_mut)]` (part of `#[warn(unused)]`) on by default
warning: unused variable: `batch`
--> node/src/burn_smollm/model.rs:79:14
|
79 | let [batch, seq_len] = input_ids.dims();
| ^^^^^ help: if this is intentional, prefix it with an underscore: `_batch`
|
= note: `#[warn(unused_variables)]` (part of `#[warn(unused)]`) on by default
warning: unused variable: `seq_len`
--> node/src/burn_smollm/model.rs:79:21
|
79 | let [batch, seq_len] = input_ids.dims();
| ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_seq_len`
Some errors have detailed explanations: E0061, E0277, E0282, E0308, E0599.
For more information about an error, try `rustc --explain E0061`.
warning: `node` (lib) generated 19 warnings
error: could not compile `node` (lib) due to 21 previous errors; 19 warnings emitted
Error: Compiling your crate to WebAssembly failed
Caused by: Compiling your crate to WebAssembly failed
Caused by: failed to execute `cargo build`: exited with exit status: 101
full command: cd "/app/node" && "cargo" "build" "--lib" "--release" "--target" "wasm32-unknown-unknown"

View File

@@ -0,0 +1,4 @@
FROM rust:latest
RUN apt-get update && apt-get install -y pkg-config libssl-dev cmake && rm -rf /var/lib/apt/lists/*
WORKDIR /work
ENTRYPOINT ["sh", "-c", "cp -r /src/* . && cargo test 2>&1"]

View File

@@ -0,0 +1,4 @@
FROM golang:1.23-alpine
RUN apk add --no-cache gcc musl-dev
WORKDIR /work
ENTRYPOINT ["sh", "-c", "cp -r /src/* . && go mod tidy 2>&1 && go test -v -count=1 ./... 2>&1"]

View File

@@ -0,0 +1,5 @@
FROM python:3.14-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /work
ENV PYTHONPATH=/work
ENTRYPOINT ["sh", "-c", "uv init --no-readme --python '>=3.14' 2>/dev/null && rm -f hello.py main.py && uv add fastapi 'uvicorn[standard]' sqlalchemy pytest httpx 2>/dev/null && cp /src/*.py . && rm -f app.db test.db && uv run pytest test_main.py -v --tb=short 2>&1"]

View File

@@ -0,0 +1,95 @@
# Kipinä CodeBench
LLM-koodingenerointibenchmark. Testaa Ollama-mallien kykyä generoida toimivia FastAPI+SQLAlchemy-projekteja ja ajaa testit Docker-kontissa.
## Pikastart
```bash
# 1. Rakenna Docker-testikontti
docker build -t kipina-pytest -f Dockerfile.pytest .
# 2. Aja benchmark
node benchmark.mjs --ollama http://localhost:11434 --scenarios all
# 3. Avaa raportti
open /tmp/kipina-benchmark/report.html
```
## Pipeline
```
1. LLM → vaatimusmäärittely (prompts/client.md)
2. LLM → JSON-speksi (prompts/spec.md)
3. LLM → 4 Python-tiedostoa (prompts/code.md + golden-examples/)
4. Staattinen validointi + LLM-korjaus (prompts/fix.md)
5. Docker: uv init + uv add + pytest
```
## CLI-argumentit
| Argumentti | Oletus | Kuvaus |
|-----------|--------|--------|
| `--ollama` | `http://localhost:11434` | Ollama-palvelimen URL |
| `--hub` | - | Hub-reitti (vaihtoehto Ollamalle) |
| `--models` | kaikki | Pilkuilla erotettu mallilista |
| `--scenarios` | `default` (todo) | `all` = todo, users, blog |
| `--output` | `/tmp/kipina-benchmark` | Tuloshakemisto |
## Hakemistorakenne
```
kipina-codebench/
├── benchmark.mjs ← runner
├── Dockerfile.pytest ← Python 3.14 + uv testikontti
├── report-template.html ← HTML-raporttipohja
├── package.json
├── prompts/ ← muokattavat promptit
│ ├── client.md ← vaatimusmäärittely
│ ├── spec.md ← JSON-speksi
│ ├── code.md ← koodigenerointi
│ └── fix.md ← korjaus
├── golden-examples/ ← referenssitoteutukset
│ ├── todo/ ← taso 1: perus-CRUD (6 testiä)
│ ├── blog/ ← taso 2: relaatiot (13 testiä)
│ └── DOCUMENTATION.md ← zensical-dokumentointiohjeet
└── results/ ← tallennetut tulokset
```
## Promptien muokkaus
Promptit ovat `prompts/`-kansiossa Markdown-tiedostoina. Muokkaa suoraan — benchmark lataa ne käynnistyksessä.
Esimerkki: lisää sääntö `prompts/code.md`:hen:
```
- Tests: PUT/update test data MUST include ALL required fields
```
## Kultaiset esimerkit
`golden-examples/todo/` syötetään LLM:lle referenssinä. Malli näkee tarkalleen millaista koodia odotetaan:
- SQLAlchemy 2.0 (DeclarativeBase, Mapped, mapped_column)
- Pydantic v2 (ConfigDict)
- Python 3.14 syntaksi (str | None)
- Uniikki testidata per testi
Lisää uusia esimerkkejä luomalla hakemisto (esim. `golden-examples/shop/`).
## Pisteytys
| Komponentti | Pisteet | Peruste |
|---|---|---|
| Speksi OK | 10p | JSON-speksi onnistui |
| Koodi generoitu | 10p | Kaikki 4 tiedostoa syntyneet |
| Testit | 060p | passed/total × 60 |
| Korjaukset | 020p | 0 kierrosta = 20p, 1 = 10p, 2+ = 0p |
Tähdet: ★★★★★ (90+), ★★★★☆ (70+), ★★★☆☆ (50+), ★★☆☆☆ (25+), ★☆☆☆☆ (1+)
## Käyttö git-submodulena
```bash
git submodule add <repo-url> tools/codebench
cd tools/codebench
docker build -t kipina-pytest -f Dockerfile.pytest .
node benchmark.mjs --ollama http://localhost:11434 --scenarios all
```

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,84 @@
# Dokumentointiohjeet — Zensical
Hyvä dokumentointi kertoo **mitä asia ON**, ei mitä se tekee. Se on kuin zen-koan: lyhyt, tarkka, riittävä.
## Periaatteet
1. **Yksi rivi riittää.** Jos tarvitset kappaleen, koodi on liian monimutkainen.
2. **Kerro mitä, älä miten.** `"""Tietokantamallit — SQLAlchemy 2.0, SQLite."""` ei `"""This module creates database models using SQLAlchemy..."""`
3. **Älä toista koodia.** Jos funktio on `create_todo`, docstring ei ole "Creates a todo".
4. **Suomi tai englanti, ei molempia.** Valitse yksi kieli per projekti.
5. **Ei täytesanoja.** "This module provides functionality for" → poista.
## Mitä dokumentoidaan
| Kohde | Dokumentointi | Esimerkki |
|-------|--------------|-----------|
| **Moduuli** (.py) | Aina. Yksi rivi: mitä tiedosto sisältää. | `"""Pydantic v2 -skeemat — Create ja Response."""` |
| **Luokka** | Aina. Mitä entiteetti edustaa. | `"""Tehtävä — otsikko, deadline, prioriteetti."""` |
| **Funktio** | Vain jos nimi ei kerro kaikkea. | `get_db``"""Tietokantasessio per pyyntö."""` |
| **CRUD-endpoint** | Ei. Nimi + HTTP-metodi riittää. | `create_todo`, `list_todos` — itsedokumentoivia |
| **Testi** | Ei. Testin nimi on dokumentaatio. | `test_get_todo_not_found` — selvä |
| **Konfiguraatio** | Kommentti vain jos arvo yllättää. | `check_same_thread: False # SQLite + FastAPI` |
## Mitä EI dokumentoida
- Importteja
- Ilmeisiä parametreja (`item_id: int`)
- Tyyppivihjeitä jotka kertovat saman asian
- Geneerisiä "boilerplate"-docstringejä
## Esimerkkejä
### Hyvä (zensical)
```python
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
class Todo(Base):
"""Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status."""
...
def get_db():
"""Tietokantasessio per pyyntö."""
...
```
### Huono (verbose)
```python
"""
This module defines the database models for the Todo application.
It uses SQLAlchemy ORM to create the database tables and provides
the session factory for database connections.
"""
class Todo(Base):
"""
Represents a todo item in the database.
Attributes:
id: The unique identifier for the todo item.
title: The title of the todo item.
...
"""
...
```
### Huono (tyhjä)
```python
# Ei docstringejä ollenkaan — lukija ei tiedä mikä tiedoston rooli on
class Todo(Base):
__tablename__ = "todos"
...
```
## Tarkistuslista
Generoitu koodi on hyvin dokumentoitu kun:
- [ ] Jokainen .py-tiedosto alkaa yksirivisellä docstringillä
- [ ] Jokainen luokka kertoo mitä entiteetti edustaa
- [ ] Docstringit ovat saman kielen kuin muu koodi
- [ ] CRUD-endpointeilla ei ole turhia docstringejä
- [ ] Kommentteja on vain siellä missä koodi yllättää

View File

@@ -0,0 +1,123 @@
# Golden Examples — referenssitoteutukset
Kultaiset esimerkit ovat **täydellisiä, testattuja** FastAPI-projekteja joita LLM käyttää mallina koodigeneroinnissa. Malli näkee esimerkin ja tuottaa vastaavan rakenteen uudelle projektille.
## Uuden esimerkin luominen
### 1. Luo hakemisto
```bash
mkdir golden-examples/shop
```
Nimeä hakemisto skenaarion mukaan (todo, blog, shop, booking...).
### 2. Luo 4 tiedostoa
| Tiedosto | Sisältö |
|----------|---------|
| `models.py` | SQLAlchemy 2.0 -mallit (DeclarativeBase, Mapped, mapped_column) |
| `schemas.py` | Pydantic v2 -skeemat (ConfigDict, `str \| None` -syntaksi) |
| `main.py` | FastAPI CRUD -endpointit (POST 201, GET, GET/:id 404, PUT, DELETE 204) |
| `test_main.py` | Pytest + TestClient, erillinen test.db, uniikki data per testi |
### 3. Noudata konventioita
**Python-versio:** >=3.14
**SQLAlchemy 2.0** (ei legacy):
```python
# Oikein
class Base(DeclarativeBase):
pass
class Todo(Base):
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
status: Mapped[str] = mapped_column(String(20), default="pending")
# Väärin
Base = declarative_base()
id = Column(Integer, primary_key=True)
```
**Pydantic v2** (ei v1):
```python
# Oikein
class TodoResponse(TodoCreate):
id: int
model_config = ConfigDict(from_attributes=True)
# Väärin
class Config:
orm_mode = True
```
**Tyypitys:**
```python
# Oikein
description: Mapped[str | None] = mapped_column(Text, default=None)
# Väärin
description: Mapped[Optional[str]]
```
**Dokumentointi (zensical):**
```python
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
class Todo(Base):
"""Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status."""
```
Yksi rivi riittää. Kerro mitä asia ON, älä mitä se tekee. Katso [DOCUMENTATION.md](DOCUMENTATION.md).
**Testidata — uniikki ja kuvaava:**
```python
# Oikein
def test_create_todo():
response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
def test_update_todo():
created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
# Väärin — geneerinen data
def test_create_todo():
response = client.post("/todos/", json={"title": "test", "priority": 1})
```
### 4. Testaa Docker-kontissa
```bash
rm -rf /tmp/golden-test && mkdir /tmp/golden-test
cp golden-examples/shop/*.py /tmp/golden-test/
docker run --rm -v /tmp/golden-test:/src:ro kipina-pytest
```
**Kaikkien testien pitää mennä läpi.** Ei varoituksia, ei deprecation-viestejä.
### 5. Vaikeustasot
| Taso | Esimerkit | Haaste |
|------|-----------|--------|
| 1 — Perus-CRUD | `todo/`, `users/`, `notes/` | Yksi entiteetti |
| 2 — Relaatiot | `blog/`, `library/`, `school/` | Foreign key, 23 entiteettiä |
| 3 — Liiketoimintalogiikka | `shop/`, `booking/` | Custom endpointit, validointi |
Aloita tasosta 1 ja etene. Tason 1 esimerkkien pitää olla yksinkertaisia — ne opettavat mallille perusrakenteen.
## Miten esimerkit vaikuttavat
Benchmark lataa `todo/`-esimerkin ja syöttää sen LLM:lle osana koodingenerointipromptia:
```
REFERENCE IMPLEMENTATION (todo project — follow this exact structure):
=== models.py ===
<todo/models.py sisältö>
=== schemas.py ===
...
```
Malli näkee tarkan esimerkin ja tuottaa vastaavan rakenteen uudelle projektille. Mitä parempi esimerkki, sitä parempi tulos.

View File

@@ -0,0 +1,110 @@
"""FastAPI CRUD — kaksi endpoint-settiä, Author ja Post."""
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from models import SessionLocal, Author, Post
from schemas import AuthorCreate, AuthorResponse, PostCreate, PostResponse
app = FastAPI()
def get_db():
"""Tietokantasessio per pyyntö."""
db = SessionLocal()
try:
yield db
finally:
db.close()
# --- Author ---
@app.post("/authors/", response_model=AuthorResponse, status_code=201)
def create_author(item: AuthorCreate, db: Session = Depends(get_db)):
db_item = Author(**item.model_dump())
db.add(db_item)
db.commit()
db.refresh(db_item)
return db_item
@app.get("/authors/", response_model=list[AuthorResponse])
def list_authors(db: Session = Depends(get_db)):
return db.query(Author).all()
@app.get("/authors/{item_id}", response_model=AuthorResponse)
def get_author(item_id: int, db: Session = Depends(get_db)):
item = db.query(Author).filter(Author.id == item_id).first()
if not item:
raise HTTPException(status_code=404, detail="Author not found")
return item
@app.put("/authors/{item_id}", response_model=AuthorResponse)
def update_author(item_id: int, item: AuthorCreate, db: Session = Depends(get_db)):
db_item = db.query(Author).filter(Author.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Author not found")
for key, value in item.model_dump().items():
setattr(db_item, key, value)
db.commit()
db.refresh(db_item)
return db_item
@app.delete("/authors/{item_id}", status_code=204)
def delete_author(item_id: int, db: Session = Depends(get_db)):
db_item = db.query(Author).filter(Author.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Author not found")
db.delete(db_item)
db.commit()
# --- Post ---
@app.post("/posts/", response_model=PostResponse, status_code=201)
def create_post(item: PostCreate, db: Session = Depends(get_db)):
db_item = Post(**item.model_dump())
db.add(db_item)
db.commit()
db.refresh(db_item)
return db_item
@app.get("/posts/", response_model=list[PostResponse])
def list_posts(db: Session = Depends(get_db)):
return db.query(Post).all()
@app.get("/posts/{item_id}", response_model=PostResponse)
def get_post(item_id: int, db: Session = Depends(get_db)):
item = db.query(Post).filter(Post.id == item_id).first()
if not item:
raise HTTPException(status_code=404, detail="Post not found")
return item
@app.put("/posts/{item_id}", response_model=PostResponse)
def update_post(item_id: int, item: PostCreate, db: Session = Depends(get_db)):
db_item = db.query(Post).filter(Post.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Post not found")
for key, value in item.model_dump().items():
setattr(db_item, key, value)
db.commit()
db.refresh(db_item)
return db_item
@app.delete("/posts/{item_id}", status_code=204)
def delete_post(item_id: int, db: Session = Depends(get_db)):
db_item = db.query(Post).filter(Post.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Post not found")
db.delete(db_item)
db.commit()

View File

@@ -0,0 +1,45 @@
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, ForeignKey-relaatiot."""
from datetime import datetime
from sqlalchemy import String, Text, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Author(Base):
"""Kirjoittaja — nimi, sähköposti ja bio."""
__tablename__ = "authors"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
name: Mapped[str] = mapped_column(String(255))
email: Mapped[str] = mapped_column(String(255), unique=True)
bio: Mapped[str | None] = mapped_column(Text, default=None)
posts: Mapped[list["Post"]] = relationship(back_populates="author")
class Post(Base):
"""Blogipostaus — otsikko, sisältö, kirjoittaja, julkaisuaika ja tila."""
__tablename__ = "posts"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
content: Mapped[str] = mapped_column(Text)
author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
published_at: Mapped[datetime | None] = mapped_column(DateTime, default=None)
status: Mapped[str] = mapped_column(String(20), default="draft")
author: Mapped["Author"] = relationship(back_populates="posts")
Base.metadata.create_all(bind=engine)

View File

@@ -0,0 +1,37 @@
"""Pydantic v2 -skeemat — Create sisääntulolle, Response vastaukselle."""
from datetime import datetime
from pydantic import BaseModel, ConfigDict
class AuthorCreate(BaseModel):
"""Uuden kirjoittajan luonti. Pakolliset: name, email."""
name: str
email: str
bio: str | None = None
class AuthorResponse(AuthorCreate):
"""Palautettava kirjoittaja — sisältää id:n."""
id: int
model_config = ConfigDict(from_attributes=True)
class PostCreate(BaseModel):
"""Uuden postauksen luonti. Pakolliset: title, content, author_id."""
title: str
content: str
author_id: int
published_at: datetime | None = None
status: str = "draft"
class PostResponse(PostCreate):
"""Palautettava postaus — sisältää id:n."""
id: int
model_config = ConfigDict(from_attributes=True)

View File

@@ -0,0 +1,164 @@
"""Pytest — TestClient, erillinen test.db, uniikki data per testi."""
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base
test_engine = create_engine(
"sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)
def override_get_db():
db = TestSession()
try:
yield db
finally:
db.close()
app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
def _create_author(name="Eino Leino", email=None):
"""Apufunktio kirjoittajan luomiseen testeissä."""
if email is None:
email = f"{name.lower().replace(' ', '.')}@example.com"
return client.post(
"/authors/", json={"name": name, "email": email}
).json()
# --- Author-testit ---
def test_create_author():
response = client.post(
"/authors/",
json={"name": "Aleksis Kivi", "email": "aleksis@example.com", "bio": "Suomen kansalliskirjailija"},
)
assert response.status_code == 201
assert response.json()["name"] == "Aleksis Kivi"
assert response.json()["bio"] == "Suomen kansalliskirjailija"
assert "id" in response.json()
def test_list_authors():
_create_author("Minna Canth", "minna.canth@example.com")
response = client.get("/authors/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_author_by_id():
created = _create_author("Väinö Linna", "vaino.linna@example.com")
response = client.get(f"/authors/{created['id']}")
assert response.status_code == 200
assert response.json()["id"] == created["id"]
def test_get_author_not_found():
response = client.get("/authors/99999")
assert response.status_code == 404
def test_update_author():
created = _create_author("Vanha Nimi", "vanha.nimi@example.com")
response = client.put(
f"/authors/{created['id']}",
json={"name": "Uusi Nimi", "email": "uusi.nimi@example.com"},
)
assert response.status_code == 200
assert response.json()["name"] == "Uusi Nimi"
def test_delete_author():
created = _create_author("Poistettava Kirjailija", "poistettava@example.com")
response = client.delete(f"/authors/{created['id']}")
assert response.status_code == 204
response = client.get(f"/authors/{created['id']}")
assert response.status_code == 404
# --- Post-testit ---
def test_create_post():
author = _create_author("Tove Jansson", "tove.jansson@example.com")
response = client.post(
"/posts/",
json={"title": "Muumipeikko ja pyrstötähti", "content": "Eräänä aamuna...", "author_id": author["id"]},
)
assert response.status_code == 201
assert response.json()["title"] == "Muumipeikko ja pyrstötähti"
assert response.json()["author_id"] == author["id"]
assert response.json()["status"] == "draft"
def test_list_posts():
author = _create_author("Juhani Aho", "juhani.aho@example.com")
client.post(
"/posts/",
json={"title": "Rautatie", "content": "Junasta kertova novelli.", "author_id": author["id"]},
)
response = client.get("/posts/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_post_by_id():
author = _create_author("Elias Lönnrot", "elias.lonnrot@example.com")
created = client.post(
"/posts/",
json={"title": "Kalevala", "content": "Vaka vanha Väinämöinen.", "author_id": author["id"]},
).json()
response = client.get(f"/posts/{created['id']}")
assert response.status_code == 200
assert response.json()["id"] == created["id"]
def test_get_post_not_found():
response = client.get("/posts/99999")
assert response.status_code == 404
def test_update_post():
author = _create_author("Joel Lehtonen", "joel.lehtonen@example.com")
created = client.post(
"/posts/",
json={"title": "Vanha otsikko", "content": "Alkuperäinen teksti.", "author_id": author["id"]},
).json()
response = client.put(
f"/posts/{created['id']}",
json={"title": "Päivitetty otsikko", "content": "Muokattu teksti.", "author_id": author["id"], "status": "published"},
)
assert response.status_code == 200
assert response.json()["title"] == "Päivitetty otsikko"
assert response.json()["status"] == "published"
def test_delete_post():
author = _create_author("Aino Kallas", "aino.kallas@example.com")
created = client.post(
"/posts/",
json={"title": "Poistettava postaus", "content": "Tämä poistetaan.", "author_id": author["id"]},
).json()
response = client.delete(f"/posts/{created['id']}")
assert response.status_code == 204
response = client.get(f"/posts/{created['id']}")
assert response.status_code == 404
def test_post_belongs_to_author():
author = _create_author("Sofi Oksanen", "sofi.oksanen@example.com")
post = client.post(
"/posts/",
json={"title": "Puhdistus", "content": "Romaani Virosta.", "author_id": author["id"]},
).json()
assert post["author_id"] == author["id"]

View File

@@ -0,0 +1,204 @@
# Example 1: Todo App (single entity)
## models.py
```python
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
from datetime import date
from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Todo(Base):
__tablename__ = "todos"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
description: Mapped[str | None] = mapped_column(Text, default=None)
due_date: Mapped[date | None] = mapped_column(Date, default=None)
priority: Mapped[int] = mapped_column(default=1)
status: Mapped[str] = mapped_column(String(20), default="pending")
Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
from datetime import date
from pydantic import BaseModel, ConfigDict
class TodoCreate(BaseModel):
title: str
description: str | None = None
due_date: date | None = None
priority: int = 1
status: str = "pending"
class TodoResponse(TodoCreate):
id: int
model_config = ConfigDict(from_attributes=True)
```
## test_main.py — exactly 6 tests per entity
```python
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base
test_engine = create_engine("sqlite:///./test.db", connect_args={"check_same_thread": False})
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)
def override_get_db():
db = TestSession()
try: yield db
finally: db.close()
app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
def test_create_todo():
response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
assert response.status_code == 201
assert "id" in response.json()
def test_list_todos():
client.post("/todos/", json={"title": "Listattava"})
response = client.get("/todos/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_todo_by_id():
created = client.post("/todos/", json={"title": "Haettava"}).json()
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 200
def test_get_todo_not_found():
response = client.get("/todos/99999")
assert response.status_code == 404
def test_update_todo():
created = client.post("/todos/", json={"title": "Vanha"}).json()
response = client.put(f"/todos/{created['id']}", json={"title": "Uusi"})
assert response.status_code == 200
def test_delete_todo():
created = client.post("/todos/", json={"title": "Poistettava"}).json()
response = client.delete(f"/todos/{created['id']}")
assert response.status_code == 204
```
# Example 2: Blog (two entities with ForeignKey)
NOTE: ForeignKey is imported from sqlalchemy, NOT from sqlalchemy.orm!
## models.py
```python
from datetime import datetime
from sqlalchemy import String, Text, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Author(Base):
__tablename__ = "authors"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
name: Mapped[str] = mapped_column(String(255))
email: Mapped[str] = mapped_column(String(255), unique=True)
bio: Mapped[str | None] = mapped_column(Text, default=None)
posts: Mapped[list["Post"]] = relationship(back_populates="author")
class Post(Base):
__tablename__ = "posts"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
content: Mapped[str] = mapped_column(Text)
author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
published_at: Mapped[datetime | None] = mapped_column(DateTime, default=None)
status: Mapped[str] = mapped_column(String(20), default="draft")
author: Mapped["Author"] = relationship(back_populates="posts")
Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
from datetime import datetime
from pydantic import BaseModel, ConfigDict
class AuthorCreate(BaseModel):
name: str
email: str
bio: str | None = None
class AuthorResponse(AuthorCreate):
id: int
model_config = ConfigDict(from_attributes=True)
class PostCreate(BaseModel):
title: str
content: str
author_id: int
published_at: datetime | None = None
status: str = "draft"
class PostResponse(PostCreate):
id: int
model_config = ConfigDict(from_attributes=True)
```
## test_main.py — 6 tests per entity, create parent FIRST for child tests
```python
client = TestClient(app) # same setup as above
def _create_author(name="Kirjailija", email=None):
if email is None:
email = f"{name.lower().replace(' ', '.')}@example.com"
return client.post("/authors/", json={"name": name, "email": email}).json()
def test_create_author():
response = client.post("/authors/", json={"name": "Aleksis Kivi", "email": "aleksis@example.com"})
assert response.status_code == 201
def test_list_authors():
_create_author("Minna Canth", "minna@example.com")
response = client.get("/authors/")
assert response.status_code == 200
assert len(response.json()) >= 1
# ... (same pattern: get_by_id, not_found, update, delete)
def test_create_post():
author = _create_author("Tove Jansson", "tove@example.com")
response = client.post("/posts/", json={"title": "Artikkeli", "content": "Sisältö", "author_id": author["id"]})
assert response.status_code == 201
def test_update_post():
author = _create_author("Joel Lehtonen", "joel@example.com")
created = client.post("/posts/", json={"title": "Vanha", "content": "Teksti", "author_id": author["id"]}).json()
response = client.put(f"/posts/{created['id']}", json={"title": "Uusi", "content": "Muokattu", "author_id": author["id"]})
assert response.status_code == 200
def test_delete_post():
author = _create_author("Aino Kallas", "aino@example.com")
created = client.post("/posts/", json={"title": "Poistettava", "content": "Poistetaan", "author_id": author["id"]}).json()
response = client.delete(f"/posts/{created['id']}")
assert response.status_code == 204
```

View File

@@ -0,0 +1,325 @@
# Todo — reference implementation (Go + Chi + SQLite)
This is a complete example. Generate equivalent structure for the given project.
Use ONLY the fields from the JSON spec — do not add extras.
## go.mod
Chi v5 router, modernc.org/sqlite (pure Go, no CGO).
```
module todo-go
go 1.23.0
toolchain go1.23.12
require (
github.com/go-chi/chi/v5 v5.2.1
modernc.org/sqlite v1.37.1
)
require (
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect
golang.org/x/sys v0.33.0 // indirect
modernc.org/libc v1.65.7 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
)
```
## models.go
Data structs: Todo (full row), CreateTodo (POST), UpdateTodo (PUT, all fields optional pointers).
```go
package main
// Todo represents a task with priority and status tracking.
type Todo struct {
ID int64 `json:"id"`
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority int64 `json:"priority"`
Status string `json:"status"`
}
// CreateTodo is the request body for creating a new todo.
type CreateTodo struct {
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
// UpdateTodo is the request body for updating an existing todo.
type UpdateTodo struct {
Title *string `json:"title,omitempty"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
```
## handlers.go
CRUD handlers as closures taking *sql.DB. Key patterns: INSERT RETURNING, sql.ErrNoRows for 404, RowsAffected for delete.
```go
package main
import (
"database/sql"
"encoding/json"
"net/http"
"strconv"
"github.com/go-chi/chi/v5"
)
// POST — decode JSON, defaults with nil-check, INSERT RETURNING, StatusCreated.
func createTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
var input CreateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest); return
}
priority := int64(1)
if input.Priority != nil { priority = *input.Priority }
status := "pending"
if input.Status != nil { status = *input.Status }
var todo Todo
err := db.QueryRow(
`INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?) RETURNING id, title, description, due_date, priority, status`,
input.Title, input.Description, input.DueDate, priority, status,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusCreated)
json.NewEncoder(w).Encode(todo)
}
}
// GET list — db.Query + rows.Scan loop, empty slice not nil.
func listTodos(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
rows, err := db.Query("SELECT id, title, description, due_date, priority, status FROM todos")
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
defer rows.Close()
todos := []Todo{}
for rows.Next() {
var t Todo
rows.Scan(&t.ID, &t.Title, &t.Description, &t.DueDate, &t.Priority, &t.Status)
todos = append(todos, t)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todos)
}
}
// GET by id — QueryRow + sql.ErrNoRows → 404.
func getTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var todo Todo
err := db.QueryRow(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err == sql.ErrNoRows { http.Error(w, "not found", http.StatusNotFound); return }
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todo)
}
}
// PUT — fetch existing, merge with input nil-checks, UPDATE RETURNING.
func updateTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var existing Todo
err := db.QueryRow("SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&existing.ID, &existing.Title, &existing.Description, &existing.DueDate, &existing.Priority, &existing.Status)
if err == sql.ErrNoRows { http.Error(w, "not found", http.StatusNotFound); return }
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
var input UpdateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest); return
}
if input.Title != nil { existing.Title = *input.Title }
if input.Description != nil { existing.Description = input.Description }
if input.DueDate != nil { existing.DueDate = input.DueDate }
if input.Priority != nil { existing.Priority = *input.Priority }
if input.Status != nil { existing.Status = *input.Status }
var updated Todo
err = db.QueryRow(
`UPDATE todos SET title=?, description=?, due_date=?, priority=?, status=? WHERE id=?
RETURNING id, title, description, due_date, priority, status`,
existing.Title, existing.Description, existing.DueDate, existing.Priority, existing.Status, id,
).Scan(&updated.ID, &updated.Title, &updated.Description, &updated.DueDate, &updated.Priority, &updated.Status)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(updated)
}
}
// DELETE — Exec + RowsAffected == 0 → 404, else 204.
func deleteTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
result, err := db.Exec("DELETE FROM todos WHERE id = ?", id)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
rows, _ := result.RowsAffected()
if rows == 0 { http.Error(w, "not found", http.StatusNotFound); return }
w.WriteHeader(http.StatusNoContent)
}
}
```
## main.go
Entry point: SQLite connection, table init, Chi router on port 3000.
```go
package main
import (
"database/sql"
"fmt"
"log"
"net/http"
"github.com/go-chi/chi/v5"
_ "modernc.org/sqlite"
)
// InitDB creates tables if they don't exist.
func InitDB(db *sql.DB) {
_, err := db.Exec(`CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)`)
if err != nil {
log.Fatal(err)
}
}
// NewRouter creates a chi router with all routes.
func NewRouter(db *sql.DB) http.Handler {
r := chi.NewRouter()
r.Post("/todos", createTodo(db))
r.Get("/todos", listTodos(db))
r.Get("/todos/{id}", getTodo(db))
r.Put("/todos/{id}", updateTodo(db))
r.Delete("/todos/{id}", deleteTodo(db))
return r
}
func main() {
db, err := sql.Open("sqlite", "file:app.db?mode=rwc")
if err != nil {
log.Fatal(err)
}
defer db.Close()
InitDB(db)
fmt.Println("Server running: http://127.0.0.1:3000")
log.Fatal(http.ListenAndServe("127.0.0.1:3000", NewRouter(db)))
}
```
## handlers_test.go
Integration tests: setupTestServer with httptest.NewServer + :memory: SQLite, unique data per test.
```go
package main
import (
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
_ "modernc.org/sqlite"
)
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil { t.Fatal(err) }
InitDB(db)
return httptest.NewServer(NewRouter(db)), db
}
func TestCreateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Buy groceries","priority":2}`))
if err != nil { t.Fatal(err) }
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated { t.Fatalf("expected 201, got %d", resp.StatusCode) }
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "Buy groceries" { t.Fatalf("expected 'Buy groceries', got %v", body["title"]) }
if body["id"] == nil { t.Fatal("expected id") }
}
func TestGetTodoByID(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Fetchable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK { t.Fatalf("expected 200, got %d", resp.StatusCode) }
}
func TestGetTodoNotFound(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Get(ts.URL + "/todos/99999")
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound { t.Fatalf("expected 404, got %d", resp.StatusCode) }
}
func TestDeleteTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Deletable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodDelete, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id), nil)
resp, _ = http.DefaultClient.Do(req)
defer resp.Body.Close()
if resp.StatusCode != http.StatusNoContent { t.Fatalf("expected 204, got %d", resp.StatusCode) }
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound { t.Fatalf("expected 404 after delete, got %d", resp.StatusCode) }
}
```

View File

@@ -0,0 +1,23 @@
module todo-go
go 1.23.0
toolchain go1.23.12
require (
github.com/go-chi/chi/v5 v5.2.1
modernc.org/sqlite v1.37.1
)
require (
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect
golang.org/x/sys v0.33.0 // indirect
modernc.org/libc v1.65.7 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
)

View File

@@ -0,0 +1,49 @@
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto=
github.com/go-chi/chi/v5 v5.2.1 h1:KOIHODQj58PmL80G2Eak4WdvUzjSJSm0vG72crDCqb8=
github.com/go-chi/chi/v5 v5.2.1/go.mod h1:L2yAIGWB3H+phAw1NxKwWM+7eUH/lU8pOMm5hHcoops=
github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e h1:ijClszYn+mADRFY17kjQEVQ1XRhq2/JR1M3sGqeJoxs=
github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e/go.mod h1:boTsfXsheKC2y+lKOCMpSfarhxDeIzfZG1jqGcPl3cA=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/ncruces/go-strftime v0.1.9 h1:bY0MQC28UADQmHmaF5dgpLmImcShSi2kHU9XLdhx/f4=
github.com/ncruces/go-strftime v0.1.9/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94icq4NjY3clb7Lk8O1qJ8BdBEF8z0ibU0rE=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 h1:R84qjqJb5nVJMxqWYb3np9L5ZsaDtB+a39EqjV0JSUM=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0/go.mod h1:S9Xr4PYopiDyqSyp5NjCrhFrqg6A5zA2E/iPHPhqnS8=
golang.org/x/mod v0.24.0 h1:ZfthKaKaT4NrhGVZHO1/WDTwGES4De8KtWO0SIbNJMU=
golang.org/x/mod v0.24.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/sync v0.14.0 h1:woo0S4Yywslg6hp4eUFjTVOyKt0RookbpAHG4c1HmhQ=
golang.org/x/sync v0.14.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.33.0 h1:q3i8TbbEz+JRD9ywIRlyRAQbM0qF7hu24q3teo2hbuw=
golang.org/x/sys v0.33.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/tools v0.33.0 h1:4qz2S3zmRxbGIhDIAgjxvFutSvH5EfnsYrRBj0UI0bc=
golang.org/x/tools v0.33.0/go.mod h1:CIJMaWEY88juyUfo7UbgPqbC8rU2OqfAV1h2Qp0oMYI=
modernc.org/cc/v4 v4.26.1 h1:+X5NtzVBn0KgsBCBe+xkDC7twLb/jNVj9FPgiwSQO3s=
modernc.org/cc/v4 v4.26.1/go.mod h1:uVtb5OGqUKpoLWhqwNQo/8LwvoiEBLvZXIQ/SmO6mL0=
modernc.org/ccgo/v4 v4.28.0 h1:rjznn6WWehKq7dG4JtLRKxb52Ecv8OUGah8+Z/SfpNU=
modernc.org/ccgo/v4 v4.28.0/go.mod h1:JygV3+9AV6SmPhDasu4JgquwU81XAKLd3OKTUDNOiKE=
modernc.org/fileutil v1.3.1 h1:8vq5fe7jdtEvoCf3Zf9Nm0Q05sH6kGx0Op2CPx1wTC8=
modernc.org/fileutil v1.3.1/go.mod h1:HxmghZSZVAz/LXcMNwZPA/DRrQZEVP9VX0V4LQGQFOc=
modernc.org/gc/v2 v2.6.5 h1:nyqdV8q46KvTpZlsw66kWqwXRHdjIlJOhG6kxiV/9xI=
modernc.org/gc/v2 v2.6.5/go.mod h1:YgIahr1ypgfe7chRuJi2gD7DBQiKSLMPgBQe9oIiito=
modernc.org/libc v1.65.7 h1:Ia9Z4yzZtWNtUIuiPuQ7Qf7kxYrxP1/jeHZzG8bFu00=
modernc.org/libc v1.65.7/go.mod h1:011EQibzzio/VX3ygj1qGFt5kMjP0lHb0qCW5/D/pQU=
modernc.org/mathutil v1.7.1 h1:GCZVGXdaN8gTqB1Mf/usp1Y/hSqgI2vAGGP4jZMCxOU=
modernc.org/mathutil v1.7.1/go.mod h1:4p5IwJITfppl0G4sUEDtCr4DthTaT47/N3aT6MhfgJg=
modernc.org/memory v1.11.0 h1:o4QC8aMQzmcwCK3t3Ux/ZHmwFPzE6hf2Y5LbkRs+hbI=
modernc.org/memory v1.11.0/go.mod h1:/JP4VbVC+K5sU2wZi9bHoq2MAkCnrt2r98UGeSK7Mjw=
modernc.org/opt v0.1.4 h1:2kNGMRiUjrp4LcaPuLY2PzUfqM/w9N23quVwhKt5Qm8=
modernc.org/opt v0.1.4/go.mod h1:03fq9lsNfvkYSfxrfUhZCWPk1lm4cq4N+Bh//bEtgns=
modernc.org/sortutil v1.2.1 h1:+xyoGf15mM3NMlPDnFqrteY07klSFxLElE2PVuWIJ7w=
modernc.org/sortutil v1.2.1/go.mod h1:7ZI3a3REbai7gzCLcotuw9AC4VZVpYMjDzETGsSMqJE=
modernc.org/sqlite v1.37.1 h1:EgHJK/FPoqC+q2YBXg7fUmES37pCHFc97sI7zSayBEs=
modernc.org/sqlite v1.37.1/go.mod h1:XwdRtsE1MpiBcL54+MbKcaDvcuej+IYSMfLN6gSKV8g=
modernc.org/strutil v1.2.1 h1:UneZBkQA+DX2Rp35KcM69cSsNES9ly8mQWD71HKlOA0=
modernc.org/strutil v1.2.1/go.mod h1:EHkiggD70koQxjVdSBM3JKM7k6L0FbGE5eymy9i3B9A=
modernc.org/token v1.1.0 h1:Xl7Ap9dKaEs5kLoOQeQmPWevfnk/DM5qcLcYlA8ys6Y=
modernc.org/token v1.1.0/go.mod h1:UGzOrNV1mAFSEB63lOFHIpNRUVMvYTc6yu1SMY/XTDM=

View File

@@ -0,0 +1,155 @@
package main
import (
"database/sql"
"encoding/json"
"net/http"
"strconv"
"github.com/go-chi/chi/v5"
)
func createTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
var input CreateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
priority := int64(1)
if input.Priority != nil {
priority = *input.Priority
}
status := "pending"
if input.Status != nil {
status = *input.Status
}
var todo Todo
err := db.QueryRow(
`INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status`,
input.Title, input.Description, input.DueDate, priority, status,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusCreated)
json.NewEncoder(w).Encode(todo)
}
}
func listTodos(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
rows, err := db.Query("SELECT id, title, description, due_date, priority, status FROM todos")
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
defer rows.Close()
var todos []Todo
for rows.Next() {
var t Todo
if err := rows.Scan(&t.ID, &t.Title, &t.Description, &t.DueDate, &t.Priority, &t.Status); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
todos = append(todos, t)
}
if todos == nil {
todos = []Todo{}
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todos)
}
}
func getTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var todo Todo
err := db.QueryRow(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err == sql.ErrNoRows {
http.Error(w, "not found", http.StatusNotFound)
return
}
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todo)
}
}
func updateTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var existing Todo
err := db.QueryRow(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&existing.ID, &existing.Title, &existing.Description, &existing.DueDate, &existing.Priority, &existing.Status)
if err == sql.ErrNoRows {
http.Error(w, "not found", http.StatusNotFound)
return
}
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
var input UpdateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
if input.Title != nil {
existing.Title = *input.Title
}
if input.Description != nil {
existing.Description = input.Description
}
if input.DueDate != nil {
existing.DueDate = input.DueDate
}
if input.Priority != nil {
existing.Priority = *input.Priority
}
if input.Status != nil {
existing.Status = *input.Status
}
var updated Todo
err = db.QueryRow(
`UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ?
RETURNING id, title, description, due_date, priority, status`,
existing.Title, existing.Description, existing.DueDate, existing.Priority, existing.Status, id,
).Scan(&updated.ID, &updated.Title, &updated.Description, &updated.DueDate, &updated.Priority, &updated.Status)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(updated)
}
}
func deleteTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
result, err := db.Exec("DELETE FROM todos WHERE id = ?", id)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
rows, _ := result.RowsAffected()
if rows == 0 {
http.Error(w, "not found", http.StatusNotFound)
return
}
w.WriteHeader(http.StatusNoContent)
}
}

View File

@@ -0,0 +1,171 @@
package main
import (
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
_ "modernc.org/sqlite"
)
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil {
t.Fatal(err)
}
InitDB(db)
return httptest.NewServer(NewRouter(db)), db
}
func TestCreateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Buy groceries","priority":2}`))
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated {
t.Fatalf("expected 201, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "Buy groceries" {
t.Fatalf("expected title 'Buy groceries', got %v", body["title"])
}
if body["id"] == nil {
t.Fatal("expected id to be present")
}
}
func TestListTodos(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Listable task"}`))
resp, err := http.Get(ts.URL + "/todos")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body []map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if len(body) < 1 {
t.Fatal("expected at least 1 todo")
}
}
func TestGetTodoByID(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Fetchable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
resp, err := http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["id"] != id {
t.Fatalf("expected id %.0f, got %v", id, body["id"])
}
}
func TestGetTodoNotFound(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Get(ts.URL + "/todos/99999")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound {
t.Fatalf("expected 404, got %d", resp.StatusCode)
}
}
func TestUpdateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Old title"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodPut, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id),
strings.NewReader(`{"title":"New title"}`))
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "New title" {
t.Fatalf("expected 'New title', got %v", body["title"])
}
}
func TestDeleteTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Deletable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodDelete, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id), nil)
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusNoContent {
t.Fatalf("expected 204, got %d", resp.StatusCode)
}
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound {
t.Fatalf("expected 404 after delete, got %d", resp.StatusCode)
}
}

View File

@@ -0,0 +1,49 @@
package main
import (
"database/sql"
"fmt"
"log"
"net/http"
"github.com/go-chi/chi/v5"
_ "modernc.org/sqlite"
)
// InitDB creates tables if they don't exist.
func InitDB(db *sql.DB) {
_, err := db.Exec(`CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)`)
if err != nil {
log.Fatal(err)
}
}
// NewRouter creates a chi router with all routes.
func NewRouter(db *sql.DB) http.Handler {
r := chi.NewRouter()
r.Post("/todos", createTodo(db))
r.Get("/todos", listTodos(db))
r.Get("/todos/{id}", getTodo(db))
r.Put("/todos/{id}", updateTodo(db))
r.Delete("/todos/{id}", deleteTodo(db))
return r
}
func main() {
db, err := sql.Open("sqlite", "file:app.db?mode=rwc")
if err != nil {
log.Fatal(err)
}
defer db.Close()
InitDB(db)
fmt.Println("Server running: http://127.0.0.1:3000")
log.Fatal(http.ListenAndServe("127.0.0.1:3000", NewRouter(db)))
}

View File

@@ -0,0 +1,29 @@
package main
// Todo represents a task with priority and status tracking.
type Todo struct {
ID int64 `json:"id"`
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority int64 `json:"priority"`
Status string `json:"status"`
}
// CreateTodo is the request body for creating a new todo.
type CreateTodo struct {
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
// UpdateTodo is the request body for updating an existing todo.
type UpdateTodo struct {
Title *string `json:"title,omitempty"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}

View File

@@ -0,0 +1,217 @@
# Todo App — FastAPI + SQLAlchemy + SQLite
A simple todo CRUD API. Uses only the fields defined in the spec — no extra fields.
## Project Structure
```
models.py # SQLAlchemy 2.0 models
schemas.py # Pydantic v2 schemas
main.py # FastAPI CRUD endpoints
test_main.py # Pytest with TestClient
```
## models.py
```python
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
from datetime import date
from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Todo(Base):
"""Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status."""
__tablename__ = "todos"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
description: Mapped[str | None] = mapped_column(Text, default=None)
due_date: Mapped[date | None] = mapped_column(Date, default=None)
priority: Mapped[int] = mapped_column(default=1)
status: Mapped[str] = mapped_column(String(20), default="pending")
Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
"""Pydantic v2 -skeemat — Create sisääntulolle, Response vastaukselle."""
from datetime import date
from pydantic import BaseModel, ConfigDict
class TodoCreate(BaseModel):
"""Uuden tehtävän luonti. Pakolliset: title."""
title: str
description: str | None = None
due_date: date | None = None
priority: int = 1
status: str = "pending"
class TodoResponse(TodoCreate):
"""Palautettava tehtävä — sisältää id:n."""
id: int
model_config = ConfigDict(from_attributes=True)
```
## main.py
```python
"""FastAPI CRUD — yksi endpoint-setti per entiteetti."""
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse
app = FastAPI()
def get_db():
"""Tietokantasessio per pyyntö."""
db = SessionLocal()
try:
yield db
finally:
db.close()
@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
db_item = Todo(**item.model_dump())
db.add(db_item)
db.commit()
db.refresh(db_item)
return db_item
@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
return db.query(Todo).all()
@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
item = db.query(Todo).filter(Todo.id == item_id).first()
if not item:
raise HTTPException(status_code=404, detail="Todo not found")
return item
@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
for key, value in item.model_dump().items():
setattr(db_item, key, value)
db.commit()
db.refresh(db_item)
return db_item
@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
db.delete(db_item)
db.commit()
```
## test_main.py
Exactly 6 tests per entity. Database is shared — use `>= 1` not `== 1` in list tests.
For child entities with foreign keys: create parent FIRST, then child with parent's id.
```python
"""Pytest — TestClient, erillinen test.db, uniikki data per testi."""
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base
test_engine = create_engine(
"sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)
def override_get_db():
db = TestSession()
try:
yield db
finally:
db.close()
app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
def test_create_todo():
response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
assert response.status_code == 201
assert response.json()["title"] == "Osta maitoa"
assert "id" in response.json()
def test_list_todos():
client.post("/todos/", json={"title": "Listattava tehtävä"})
response = client.get("/todos/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_todo_by_id():
created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 200
assert response.json()["id"] == created["id"]
def test_get_todo_not_found():
response = client.get("/todos/99999")
assert response.status_code == 404
def test_update_todo():
created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
response = client.put(
f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
)
assert response.status_code == 200
assert response.json()["title"] == "Uusi otsikko"
def test_delete_todo():
created = client.post("/todos/", json={"title": "Poistettava"}).json()
response = client.delete(f"/todos/{created['id']}")
assert response.status_code == 204
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 404
```

View File

@@ -0,0 +1,331 @@
# Todo — referenssitoteutus (Axum 0.8 + SQLx + SQLite)
Tämä on täydellinen esimerkki. Generoi vastaava rakenne annetulle projektille.
Käytä VAIN JSON-spekin kenttiä — älä lisää ylimääräisiä.
## Cargo.toml
Axum 0.8, SQLx SQLite-featurella, serde JSON-serialisointiin, tower-http CORS-tukeen.
```toml
[package]
name = "todo-rs"
version = "0.1.0"
edition = "2024"
[dependencies]
axum = "0.8"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.8", features = ["sqlite", "runtime-tokio"] }
tower-http = { version = "0.6", features = ["cors"] }
[dev-dependencies]
reqwest = { version = "0.13", default-features = false, features = ["json", "rustls"] }
tokio = { version = "1", features = ["full", "test-util"] }
```
## src/models.rs
Serde-rakenteet: `Todo` (FromRow), `CreateTodo` (POST), `UpdateTodo` (PUT, kaikki kentät valinnaisia).
```rust
//! Tietomallit — Todo, CreateTodo, UpdateTodo serde-rakenteina.
use serde::{Deserialize, Serialize};
/// Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status.
#[derive(Debug, Serialize, Deserialize, sqlx::FromRow)]
pub struct Todo {
pub id: i64,
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: i64,
pub status: String,
}
/// Uuden tehtävän luonti. Pakolliset: title.
#[derive(Debug, Deserialize)]
pub struct CreateTodo {
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
/// Tehtävän päivitys — kaikki kentät valinnaisia.
#[derive(Debug, Deserialize)]
pub struct UpdateTodo {
pub title: Option<String>,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
```
## src/handlers.rs
CRUD-käsittelijät. Avainpatternit: INSERT RETURNING, fetch_optional+404, rows_affected+204.
```rust
//! Käsittelijät — CRUD-operaatiot todo-entiteetille.
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::Json;
use sqlx::SqlitePool;
use crate::models::{CreateTodo, Todo, UpdateTodo};
/// POST — INSERT RETURNING, oletusarvot unwrap_or:lla.
pub async fn create_todo(
State(pool): State<SqlitePool>,
Json(input): Json<CreateTodo>,
) -> Result<(StatusCode, Json<Todo>), StatusCode> {
let priority = input.priority.unwrap_or(1);
let status = input.status.unwrap_or_else(|| "pending".to_string());
let result = sqlx::query_as::<_, Todo>(
"INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status",
)
.bind(&input.title)
.bind(&input.description)
.bind(&input.due_date)
.bind(priority)
.bind(&status)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok((StatusCode::CREATED, Json(result)))
}
/// GET list — fetch_all.
pub async fn list_todos(
State(pool): State<SqlitePool>,
) -> Result<Json<Vec<Todo>>, StatusCode> {
let todos = sqlx::query_as::<_, Todo>("SELECT id, title, description, due_date, priority, status FROM todos")
.fetch_all(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(todos))
}
/// GET by id — fetch_optional, None → 404.
pub async fn get_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<Json<Todo>, StatusCode> {
let todo = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
match todo {
Some(t) => Ok(Json(t)),
None => Err(StatusCode::NOT_FOUND),
}
}
/// PUT — hae olemassaoleva, merge kentät, UPDATE RETURNING.
pub async fn update_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
Json(input): Json<UpdateTodo>,
) -> Result<Json<Todo>, StatusCode> {
let existing = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let existing = existing.ok_or(StatusCode::NOT_FOUND)?;
let updated = sqlx::query_as::<_, Todo>(
"UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ? RETURNING id, title, description, due_date, priority, status",
)
.bind(input.title.unwrap_or(existing.title))
.bind(input.description.or(existing.description))
.bind(input.due_date.or(existing.due_date))
.bind(input.priority.unwrap_or(existing.priority))
.bind(input.status.unwrap_or(existing.status))
.bind(id)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(updated))
}
/// DELETE — rows_affected == 0 → 404, muuten 204.
pub async fn delete_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<StatusCode, StatusCode> {
let result = sqlx::query("DELETE FROM todos WHERE id = ?")
.bind(id)
.execute(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
if result.rows_affected() == 0 { return Err(StatusCode::NOT_FOUND); }
Ok(StatusCode::NO_CONTENT)
}
```
## src/lib.rs
Kirjastomoduuli: reititin `app()` ja taulun alustus `init_db()` — julkinen API integraatiotesteille.
```rust
//! Kirjastomoduuli — julkinen API integraatiotesteille.
pub mod handlers;
pub mod models;
use axum::routing::{delete, get, post, put};
use axum::Router;
use sqlx::SqlitePool;
use tower_http::cors::CorsLayer;
/// Luo reititin annetulla tietokantapoolilla.
pub fn app(pool: SqlitePool) -> Router {
Router::new()
.route("/todos", post(handlers::create_todo))
.route("/todos", get(handlers::list_todos))
.route("/todos/{id}", get(handlers::get_todo))
.route("/todos/{id}", put(handlers::update_todo))
.route("/todos/{id}", delete(handlers::delete_todo))
.layer(CorsLayer::permissive())
.with_state(pool)
}
/// Alusta tietokantataulu.
pub async fn init_db(pool: &SqlitePool) {
sqlx::query(
"CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)",
)
.execute(pool)
.await
.expect("Taulun luonti epäonnistui");
}
```
## src/main.rs
Käynnistyspiste: SQLite-pooli, taulun alustus, Axum-palvelin portissa 3000.
```rust
//! Axum CRUD — yksi endpoint-setti per entiteetti, SQLite-tietokanta.
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
#[tokio::main]
async fn main() {
let pool = SqlitePoolOptions::new()
.max_connections(5)
.connect("sqlite:./app.db?mode=rwc")
.await
.expect("Tietokantayhteys epäonnistui");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("Portin kuuntelu epäonnistui");
println!("Palvelin käynnissä: http://127.0.0.1:3000");
axum::serve(listener, app(pool)).await.unwrap();
}
```
## tests/api_test.rs
Integraatiotestit: spawn_server (muistinvarainen SQLite, satunnaisportti), CRUD-testit uniikilla datalla.
```rust
//! Integraatiotestit — muistinvarainen SQLite, uniikki data per testi.
use axum::http::StatusCode;
use reqwest::Client;
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
/// Käynnistä testipalvelin satunnaisessa portissa.
async fn spawn_server() -> (Client, String) {
let pool = SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("Testitietokanta epäonnistui");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0")
.await
.expect("Testiportin kuuntelu epäonnistui");
let base_url = format!("http://{}", listener.local_addr().unwrap());
let router = app(pool);
tokio::spawn(async move { axum::serve(listener, router).await.unwrap() });
(Client::new(), base_url)
}
#[tokio::test]
async fn test_create_todo() {
let (client, url) = spawn_server().await;
let res = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Osta maitoa", "priority": 2}))
.send().await.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Osta maitoa");
assert!(body["id"].is_number());
}
#[tokio::test]
async fn test_get_todo_by_id() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Haettava tehtävä"}))
.send().await.unwrap().json().await.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client.get(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["id"], id);
}
#[tokio::test]
async fn test_get_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client.get(format!("{url}/todos/99999")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Poistettava"}))
.send().await.unwrap().json().await.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client.delete(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
let res = client.get(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
```

View File

@@ -0,0 +1 @@
target/

View File

@@ -0,0 +1,16 @@
[package]
name = "todo-rs"
version = "0.1.0"
edition = "2024"
[dependencies]
axum = "0.8"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.8", features = ["sqlite", "runtime-tokio"] }
tower-http = { version = "0.6", features = ["cors"] }
[dev-dependencies]
reqwest = { version = "0.13", default-features = false, features = ["json", "rustls"] }
tokio = { version = "1", features = ["full", "test-util"] }

View File

@@ -0,0 +1,122 @@
//! Käsittelijät — CRUD-operaatiot todo-entiteetille.
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::Json;
use sqlx::SqlitePool;
use crate::models::{CreateTodo, Todo, UpdateTodo};
/// Luo uusi tehtävä.
pub async fn create_todo(
State(pool): State<SqlitePool>,
Json(input): Json<CreateTodo>,
) -> Result<(StatusCode, Json<Todo>), StatusCode> {
let priority = input.priority.unwrap_or(1);
let status = input.status.unwrap_or_else(|| "pending".to_string());
let result = sqlx::query_as::<_, Todo>(
"INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status",
)
.bind(&input.title)
.bind(&input.description)
.bind(&input.due_date)
.bind(priority)
.bind(&status)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok((StatusCode::CREATED, Json(result)))
}
/// Listaa kaikki tehtävät.
pub async fn list_todos(
State(pool): State<SqlitePool>,
) -> Result<Json<Vec<Todo>>, StatusCode> {
let todos = sqlx::query_as::<_, Todo>("SELECT id, title, description, due_date, priority, status FROM todos")
.fetch_all(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(todos))
}
/// Hae tehtävä id:llä.
pub async fn get_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<Json<Todo>, StatusCode> {
let todo = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
match todo {
Some(t) => Ok(Json(t)),
None => Err(StatusCode::NOT_FOUND),
}
}
/// Päivitä tehtävä id:llä.
pub async fn update_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
Json(input): Json<UpdateTodo>,
) -> Result<Json<Todo>, StatusCode> {
let existing = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let existing = existing.ok_or(StatusCode::NOT_FOUND)?;
let title = input.title.unwrap_or(existing.title);
let description = input.description.or(existing.description);
let due_date = input.due_date.or(existing.due_date);
let priority = input.priority.unwrap_or(existing.priority);
let status = input.status.unwrap_or(existing.status);
let updated = sqlx::query_as::<_, Todo>(
"UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ?
RETURNING id, title, description, due_date, priority, status",
)
.bind(&title)
.bind(&description)
.bind(&due_date)
.bind(priority)
.bind(&status)
.bind(id)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(updated))
}
/// Poista tehtävä id:llä.
pub async fn delete_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<StatusCode, StatusCode> {
let result = sqlx::query("DELETE FROM todos WHERE id = ?")
.bind(id)
.execute(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
if result.rows_affected() == 0 {
return Err(StatusCode::NOT_FOUND);
}
Ok(StatusCode::NO_CONTENT)
}

View File

@@ -0,0 +1,38 @@
//! Kirjastomoduuli — julkinen API integraatiotesteille.
pub mod handlers;
pub mod models;
use axum::routing::{delete, get, post, put};
use axum::Router;
use sqlx::SqlitePool;
use tower_http::cors::CorsLayer;
/// Luo reititin annetulla tietokantapoolilla.
pub fn app(pool: SqlitePool) -> Router {
Router::new()
.route("/todos", post(handlers::create_todo))
.route("/todos", get(handlers::list_todos))
.route("/todos/{id}", get(handlers::get_todo))
.route("/todos/{id}", put(handlers::update_todo))
.route("/todos/{id}", delete(handlers::delete_todo))
.layer(CorsLayer::permissive())
.with_state(pool)
}
/// Alusta tietokantataulu.
pub async fn init_db(pool: &SqlitePool) {
sqlx::query(
"CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)",
)
.execute(pool)
.await
.expect("Taulun luonti epäonnistui");
}

View File

@@ -0,0 +1,22 @@
//! Axum CRUD — yksi endpoint-setti per entiteetti, SQLite-tietokanta.
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
#[tokio::main]
async fn main() {
let pool = SqlitePoolOptions::new()
.max_connections(5)
.connect("sqlite:./app.db?mode=rwc")
.await
.expect("Tietokantayhteys epäonnistui");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("Portin kuuntelu epäonnistui");
println!("Palvelin käynnissä: http://127.0.0.1:3000");
axum::serve(listener, app(pool)).await.unwrap();
}

View File

@@ -0,0 +1,34 @@
//! Tietomallit — Todo, CreateTodo, UpdateTodo serde-rakenteina.
use serde::{Deserialize, Serialize};
/// Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status.
#[derive(Debug, Serialize, Deserialize, sqlx::FromRow)]
pub struct Todo {
pub id: i64,
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: i64,
pub status: String,
}
/// Uuden tehtävän luonti. Pakolliset: title.
#[derive(Debug, Deserialize)]
pub struct CreateTodo {
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
/// Tehtävän päivitys — kaikki kentät valinnaisia.
#[derive(Debug, Deserialize)]
pub struct UpdateTodo {
pub title: Option<String>,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}

View File

@@ -0,0 +1,262 @@
//! Integraatiotestit — muistinvarainen SQLite, uniikki data per testi.
use axum::http::StatusCode;
use reqwest::Client;
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
/// Käynnistä testipalvelin satunnaisessa portissa.
async fn spawn_server() -> (Client, String) {
let pool = SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("Testitietokanta epäonnistui");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0")
.await
.expect("Testiportin kuuntelu epäonnistui");
let addr = listener.local_addr().unwrap();
let base_url = format!("http://{addr}");
let router = app(pool);
tokio::spawn(async move {
axum::serve(listener, router).await.unwrap();
});
(Client::new(), base_url)
}
#[tokio::test]
async fn test_create_todo() {
let (client, url) = spawn_server().await;
let res = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Osta maitoa", "priority": 2}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Osta maitoa");
assert_eq!(body["priority"], 2);
assert!(body["id"].is_number());
}
#[tokio::test]
async fn test_create_todo_defaults() {
let (client, url) = spawn_server().await;
let res = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Oletusarvotesti"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["priority"], 1);
assert_eq!(body["status"], "pending");
assert!(body["description"].is_null());
}
#[tokio::test]
async fn test_list_todos() {
let (client, url) = spawn_server().await;
client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Listattava tehtävä"}))
.send()
.await
.unwrap();
let res = client.get(format!("{url}/todos")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: Vec<serde_json::Value> = res.json().await.unwrap();
assert!(body.len() >= 1);
}
#[tokio::test]
async fn test_get_todo_by_id() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Haettava tehtävä"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.get(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["id"], id);
assert_eq!(body["title"], "Haettava tehtävä");
}
#[tokio::test]
async fn test_get_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.get(format!("{url}/todos/99999"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_update_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Vanha otsikko"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.put(format!("{url}/todos/{id}"))
.json(&serde_json::json!({"title": "Uusi otsikko"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Uusi otsikko");
}
#[tokio::test]
async fn test_update_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.put(format!("{url}/todos/99999"))
.json(&serde_json::json!({"title": "Ei löydy"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Poistettava"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.delete(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
let res = client
.get(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.delete(format!("{url}/todos/99999"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_full_lifecycle() {
let (client, url) = spawn_server().await;
// Luo
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({
"title": "Elinkaaritesti",
"description": "Testataan koko CRUD-kierto",
"due_date": "2026-12-31",
"priority": 3,
"status": "in_progress"
}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
assert_eq!(created["title"], "Elinkaaritesti");
assert_eq!(created["description"], "Testataan koko CRUD-kierto");
assert_eq!(created["due_date"], "2026-12-31");
assert_eq!(created["priority"], 3);
assert_eq!(created["status"], "in_progress");
// Päivitä
let updated: serde_json::Value = client
.put(format!("{url}/todos/{id}"))
.json(&serde_json::json!({"status": "done"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
assert_eq!(updated["status"], "done");
assert_eq!(updated["title"], "Elinkaaritesti");
// Poista
let res = client
.delete(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
}

View File

@@ -0,0 +1,230 @@
# Todo — referenssitoteutus (FastAPI + SQLAlchemy 2.0 + SQLite)
Tämä on täydellinen esimerkki. Generoi vastaava rakenne annetulle projektille.
Käytä VAIN JSON-spekin kenttiä — älä lisää ylimääräisiä.
## models.py
SQLAlchemy 2.0: `DeclarativeBase` + `Mapped` + `mapped_column`. EI `Column()`, EI `declarative_base()`.
```python
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
from datetime import date
from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Todo(Base):
"""Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status."""
__tablename__ = "todos"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
description: Mapped[str | None] = mapped_column(Text, default=None)
due_date: Mapped[date | None] = mapped_column(Date, default=None)
priority: Mapped[int] = mapped_column(default=1)
status: Mapped[str] = mapped_column(String(20), default="pending")
Base.metadata.create_all(bind=engine)
```
Huomaa:
- `str | None` (ei `Optional[str]`)
- `String(20)` status-kentälle (ei Enum)
- Vain spekin kentät — ei `created_at` tai muita ylimääräisiä
## schemas.py
Pydantic v2: `ConfigDict(from_attributes=True)`. EI `class Config: orm_mode = True`.
```python
"""Pydantic v2 -skeemat — Create sisääntulolle, Response vastaukselle."""
from datetime import date
from pydantic import BaseModel, ConfigDict
class TodoCreate(BaseModel):
"""Uuden tehtävän luonti. Pakolliset: title."""
title: str
description: str | None = None
due_date: date | None = None
priority: int = 1
status: str = "pending"
class TodoResponse(TodoCreate):
"""Palautettava tehtävä — sisältää id:n."""
id: int
model_config = ConfigDict(from_attributes=True)
```
## main.py
FastAPI CRUD: POST 201, GET list, GET by id 404, PUT, DELETE 204. Käytä `model_dump()` (ei `.dict()`).
```python
"""FastAPI CRUD — yksi endpoint-setti per entiteetti."""
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse
app = FastAPI()
def get_db():
"""Tietokantasessio per pyyntö."""
db = SessionLocal()
try:
yield db
finally:
db.close()
@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
db_item = Todo(**item.model_dump())
db.add(db_item)
db.commit()
db.refresh(db_item)
return db_item
@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
return db.query(Todo).all()
@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
item = db.query(Todo).filter(Todo.id == item_id).first()
if not item:
raise HTTPException(status_code=404, detail="Todo not found")
return item
@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
for key, value in item.model_dump().items():
setattr(db_item, key, value)
db.commit()
db.refresh(db_item)
return db_item
@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
db.delete(db_item)
db.commit()
```
## test_main.py
Testit: erillinen test.db, `override_get_db`, `TestClient`. Uniikki suomenkielinen data per testi.
PUT-testi lähettää KAIKKI pakolliset kentät.
Generoi TARKALLEEN nämä 6 testiä per entiteetti — ei enempää, ei vähempää:
1. `test_create_{entity}` — POST, assert 201 + id
2. `test_list_{entities}` — POST ensin, GET lista, assert len >= 1
3. `test_get_{entity}_by_id` — POST, GET by id, assert id täsmää
4. `test_get_{entity}_not_found` — GET /99999, assert 404
5. `test_update_{entity}` — POST, PUT kaikilla pakollisilla kentillä, assert 200
6. `test_delete_{entity}` — POST, DELETE assert 204, GET uudestaan assert 404
Ei search-, filter- tai muita ylimääräisiä testejä.
```python
"""Pytest — TestClient, erillinen test.db, uniikki data per testi."""
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base
test_engine = create_engine(
"sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)
def override_get_db():
db = TestSession()
try:
yield db
finally:
db.close()
app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
def test_create_todo():
response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
assert response.status_code == 201
assert response.json()["title"] == "Osta maitoa"
assert "id" in response.json()
def test_list_todos():
client.post("/todos/", json={"title": "Listattava tehtävä"})
response = client.get("/todos/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_todo_by_id():
created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 200
assert response.json()["id"] == created["id"]
def test_get_todo_not_found():
response = client.get("/todos/99999")
assert response.status_code == 404
def test_update_todo():
created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
response = client.put(
f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
)
assert response.status_code == 200
assert response.json()["title"] == "Uusi otsikko"
def test_delete_todo():
created = client.post("/todos/", json={"title": "Poistettava"}).json()
response = client.delete(f"/todos/{created['id']}")
assert response.status_code == 204
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 404
```

View File

@@ -0,0 +1,61 @@
"""FastAPI CRUD — yksi endpoint-setti per entiteetti."""
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse
app = FastAPI()
def get_db():
"""Tietokantasessio per pyyntö."""
db = SessionLocal()
try:
yield db
finally:
db.close()
@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
db_item = Todo(**item.model_dump())
db.add(db_item)
db.commit()
db.refresh(db_item)
return db_item
@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
return db.query(Todo).all()
@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
item = db.query(Todo).filter(Todo.id == item_id).first()
if not item:
raise HTTPException(status_code=404, detail="Todo not found")
return item
@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
for key, value in item.model_dump().items():
setattr(db_item, key, value)
db.commit()
db.refresh(db_item)
return db_item
@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
db_item = db.query(Todo).filter(Todo.id == item_id).first()
if not db_item:
raise HTTPException(status_code=404, detail="Todo not found")
db.delete(db_item)
db.commit()

View File

@@ -0,0 +1,30 @@
"""Tietokantamallit — SQLAlchemy 2.0, Mapped-tyypitys, SQLite."""
from datetime import date
from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker
DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
class Base(DeclarativeBase):
pass
class Todo(Base):
"""Tehtävä — otsikko, kuvaus, deadline, prioriteetti ja status."""
__tablename__ = "todos"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
description: Mapped[str | None] = mapped_column(Text, default=None)
due_date: Mapped[date | None] = mapped_column(Date, default=None)
priority: Mapped[int] = mapped_column(default=1)
status: Mapped[str] = mapped_column(String(20), default="pending")
Base.metadata.create_all(bind=engine)

View File

@@ -0,0 +1,11 @@
[project]
name = "todo-app"
version = "0.1.0"
requires-python = ">=3.14"
dependencies = [
"fastapi",
"uvicorn[standard]",
"sqlalchemy",
"pytest",
"httpx",
]

View File

@@ -0,0 +1,22 @@
"""Pydantic v2 -skeemat — Create sisääntulolle, Response vastaukselle."""
from datetime import date
from pydantic import BaseModel, ConfigDict
class TodoCreate(BaseModel):
"""Uuden tehtävän luonti. Pakolliset: title."""
title: str
description: str | None = None
due_date: date | None = None
priority: int = 1
status: str = "pending"
class TodoResponse(TodoCreate):
"""Palautettava tehtävä — sisältää id:n."""
id: int
model_config = ConfigDict(from_attributes=True)

View File

@@ -0,0 +1,69 @@
"""Pytest — TestClient, erillinen test.db, uniikki data per testi."""
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base
test_engine = create_engine(
"sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)
def override_get_db():
db = TestSession()
try:
yield db
finally:
db.close()
app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)
def test_create_todo():
response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
assert response.status_code == 201
assert response.json()["title"] == "Osta maitoa"
assert "id" in response.json()
def test_list_todos():
client.post("/todos/", json={"title": "Listattava tehtävä"})
response = client.get("/todos/")
assert response.status_code == 200
assert len(response.json()) >= 1
def test_get_todo_by_id():
created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 200
assert response.json()["id"] == created["id"]
def test_get_todo_not_found():
response = client.get("/todos/99999")
assert response.status_code == 404
def test_update_todo():
created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
response = client.put(
f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
)
assert response.status_code == 200
assert response.json()["title"] == "Uusi otsikko"
def test_delete_todo():
created = client.post("/todos/", json={"title": "Poistettava"}).json()
response = client.delete(f"/todos/{created['id']}")
assert response.status_code == 204
response = client.get(f"/todos/{created['id']}")
assert response.status_code == 404

View File

@@ -0,0 +1,13 @@
{
"name": "kipina-codebench",
"version": "0.1.0",
"description": "LLM-koodingenerointibenchmark — testaa Ollama-mallien kykyä generoida toimivia FastAPI-projekteja",
"type": "module",
"bin": {
"codebench": "./benchmark.mjs"
},
"scripts": {
"bench": "node benchmark.mjs --scenarios all",
"docker:build": "docker build -t kipina-pytest -f Dockerfile.pytest ."
}
}

View File

@@ -0,0 +1,65 @@
{
"models": {
"qwen3-coder:30b": {
"profile": "large",
"role": "primary",
"prompt": "code",
"golden": "todo.md",
"vram": "24GB",
"notes": "Pääkooderi. 97p, 188 tok/s. Noudattaa pitkiä sääntölistoja."
},
"qwen3:8b": {
"profile": "small",
"role": "primary",
"prompt": "code-small",
"golden": "todo-readme.md",
"vram": "8GB",
"notes": "Kevyt pääkooderi. Todo/users 100p, blog heikko. README-muoto golden examplelle."
},
"codestral:22b": {
"profile": "large",
"role": "backup",
"prompt": "code",
"golden": "todo.md",
"vram": "16GB",
"notes": "Mistral-varamalli. 88p, 44 tok/s."
},
"qwen3:4b": {
"profile": "small",
"role": "minimal",
"prompt": "code-small",
"golden": "todo.md",
"vram": "4GB",
"notes": "Minimaali. Vain todo toimii."
},
"qwen2.5-coder:32b": {
"profile": "large",
"role": "candidate",
"prompt": "code",
"golden": "todo.md",
"vram": "24GB",
"notes": "Edellinen sukupolvi. Vahva Rust-osaaminen."
},
"qwen3:14b": {
"profile": "large",
"role": "retired",
"prompt": "code",
"golden": "todo.md",
"vram": "16GB",
"notes": "Poistettu. Ei lisäarvoa 30b:hen verrattuna, blog epävakaa."
}
},
"profiles": {
"large": {
"prompt": "code",
"golden": "todo.md",
"description": "Täysi prompti + säännöt. Malleille >=14B."
},
"small": {
"prompt": "code-small",
"golden": "todo.md",
"description": "Tiivistetty prompti. Malleille <=8B."
}
},
"default_profile": "large"
}

View File

@@ -0,0 +1,15 @@
You are a product owner who turns vague ideas into clear, actionable software requirements.
GIVEN a short project description from the user, produce a structured brief:
1. PROJECT NAME: a short, descriptive name
2. GOAL: one sentence explaining what the software does and who it's for
3. CORE FEATURES: numbered list of 3-8 concrete features (not vague wishes)
4. DATA MODEL: list the main entities and their key fields (include field types)
5. API ENDPOINTS: list the REST endpoints (method + path + purpose)
6. CONSTRAINTS: any technical constraints (e.g. "must use SQLite", "no auth needed")
RULES:
- Be specific: "User can filter todos by status" not "todo management"
- Use plain English, no code
- Maximum 400 words total

View File

@@ -0,0 +1,69 @@
You are a Go backend developer. Generate a Chi web project with SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these files:
1. go.mod — module declaration, go-chi/chi/v5, modernc.org/sqlite
2. models.go — Structs with json tags
3. handlers.go — Handler closures for each CRUD endpoint
4. main.go — Entry point with InitDB(), NewRouter(), main()
5. handlers_test.go — Integration tests using httptest against in-memory SQLite
Do NOT generate any other files. Do NOT generate go.sum.
OUTPUT FORMAT — use these exact markers to separate files:
=== go.mod ===
<module content>
=== models.go ===
<go code>
=== handlers.go ===
<go code>
=== main.go ===
<go code>
=== handlers_test.go ===
<go code>
DOCUMENTATION — structs get // one-line comments. Keep it brief.
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- Chi router with chi.URLParam(r, "param") for path parameters
- database/sql + modernc.org/sqlite (pure Go driver, no CGO required)
- Import the driver as blank import: _ "modernc.org/sqlite"
- Handlers are closures: func handler(db *sql.DB) http.HandlerFunc
- INSERT/UPDATE queries use RETURNING clause to get the row back via QueryRow + Scan
- POST returns 201 (http.StatusCreated), DELETE returns 204 (http.StatusNoContent), GET missing returns 404
- Use sql.ErrNoRows for not-found checks: if err == sql.ErrNoRows { ... }
- No compile-time query macros — use db.QueryRow(), db.Query(), db.Exec() directly
- Empty slice not nil for list endpoints: if items == nil { items = []Item{} }
- Optional fields use pointer types (*string, *int64) with json tag omitempty
- Set Content-Type header: w.Header().Set("Content-Type", "application/json")
- Parse path ID with strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
- InitDB uses log.Fatal on error, NewRouter returns http.Handler
- main() opens "file:app.db?mode=rwc" and listens on 127.0.0.1:3000
- No markdown fences inside file content — just raw code
- You MUST generate ALL 5 files. Do not stop early.
TESTS — follow this exact setupTestServer pattern:
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil {
t.Fatal(err)
}
InitDB(db)
return httptest.NewServer(NewRouter(db)), db
}
- Each test function calls setupTestServer(t) to get (ts, db)
- defer ts.Close() and defer db.Close() in every test
- Use standard library: http.Post, http.Get, http.NewRequest for PUT/DELETE
- Use strings.NewReader for JSON request bodies
- Decode responses with json.NewDecoder(resp.Body).Decode(&body)
- Unique descriptive data, NOT generic "test" strings
- Format IDs with fmt.Sprintf("%.0f", id) when building URLs from float64

View File

@@ -0,0 +1,73 @@
You are a Rust backend developer. Generate an Axum web project with SQLx and SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these files:
1. Cargo.toml — axum 0.8, tokio, serde/serde_json, sqlx (sqlite, runtime-tokio), tower-http, reqwest 0.13 with features ["json", "rustls"] (for tests)
2. src/models.rs — Structs with Serialize, Deserialize, FromRow derives
3. src/handlers.rs — Async handler functions for each CRUD endpoint
4. src/lib.rs — Public app(pool) function returning Router, init_db() for table creation
5. src/main.rs — Binary entry point, connect to SQLite, bind to port
6. tests/api_test.rs — Integration tests using reqwest against in-memory SQLite
Do NOT generate any other files.
OUTPUT FORMAT — use these exact markers to separate files:
=== Cargo.toml ===
<toml content>
=== src/models.rs ===
<rust code>
=== src/handlers.rs ===
<rust code>
=== src/lib.rs ===
<rust code>
=== src/main.rs ===
<rust code>
=== tests/api_test.rs ===
<rust code>
DOCUMENTATION — every file starts with //! one-line module doc. Structs get /// one-line doc. Zensical: say what it IS, not what it does.
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- Use axum 0.8 API: Router, Json, Path, State, StatusCode
- ROUTING: use {param} NOT :param — e.g. .route("/items/{id}", get(get_item))
- ROUTING: one .route() call per path, chain methods: .route("/items", post(create).get(list))
- State is SqlitePool wrapped in axum::extract::State
- app() takes SqlitePool as argument and calls .with_state(pool) on the Router
- Handlers return Result<(StatusCode, Json<T>), StatusCode> or Result<StatusCode, StatusCode>
- POST returns 201 (StatusCode::CREATED), DELETE returns 204 (StatusCode::NO_CONTENT), GET missing returns 404
- CRITICAL: Use sqlx::query_as::<_, T>("SQL") runtime functions with .bind() — NEVER use sqlx::query_as!() or sqlx::query!() compile-time macros (they require DATABASE_URL at compile time)
- Use sqlx::query("SQL") for writes (DELETE, etc.), sqlx::query_as::<_, T>("SQL") for reads
- Use RETURNING clause in INSERT/UPDATE queries to get the created/updated row back
- DateTime fields: store as TEXT, use String type in Rust structs
- init_db: use .expect("msg") not Result return — keep it simple
- NO markdown fences inside file content — just raw code
- Edition 2024 in Cargo.toml
- You MUST generate ALL 6 files. Do not stop early.
TESTS — follow this exact spawn_server pattern:
async fn spawn_server() -> (reqwest::Client, String) {
let pool = sqlx::sqlite::SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("DB failed");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.expect("Bind failed");
let addr = listener.local_addr().unwrap();
let base_url = format!("http://{addr}");
let router = app(pool);
tokio::spawn(async move { axum::serve(listener, router).await.unwrap() });
(reqwest::Client::new(), base_url)
}
- Each #[tokio::test] calls spawn_server() to get (client, url)
- Unique descriptive data, NOT generic "test" strings
- Use serde_json::json!() for request bodies

View File

@@ -0,0 +1,58 @@
Generate a FastAPI project with SQLAlchemy and SQLite. Follow the REFERENCE IMPLEMENTATION exactly.
Generate these 4 files with === markers:
=== models.py ===
=== schemas.py ===
=== main.py ===
=== test_main.py ===
Key patterns (copy from reference):
- class Base(DeclarativeBase): pass
- Mapped[str] = mapped_column(String(255))
- Mapped[str | None] = mapped_column(Text, default=None)
- model_config = ConfigDict(from_attributes=True)
- model_dump() not dict()
- POST 201, GET list, GET by id 404, PUT, DELETE 204
FOREIGN KEYS (when spec has relationships):
- Child entity gets parent_id field: Mapped[int] = mapped_column(ForeignKey("parents.id"))
- Import: from sqlalchemy import ForeignKey (NOT from sqlalchemy.orm!)
- Create schema includes parent_id: int
- Test creates parent FIRST, then child with parent's id
Example FK pattern in models.py:
```
class Author(Base):
__tablename__ = "authors"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
name: Mapped[str] = mapped_column(String(255))
class Post(Base):
__tablename__ = "posts"
id: Mapped[int] = mapped_column(primary_key=True, index=True)
title: Mapped[str] = mapped_column(String(255))
author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
```
Example FK test patterns:
```
def test_create_post():
author = client.post("/authors/", json={"name": "Jane Austen"}).json()
response = client.post("/posts/", json={"title": "First Post", "author_id": author["id"]})
assert response.status_code == 201
def test_update_post():
author = client.post("/authors/", json={"name": "Mark Twain"}).json()
created = client.post("/posts/", json={"title": "Old Title", "author_id": author["id"]}).json()
response = client.put(f"/posts/{created['id']}", json={"title": "New Title", "author_id": author["id"]})
assert response.status_code == 201
```
CRITICAL:
- Use ONLY fields from the JSON spec — no created_at or extra fields
- Generate EXACTLY 6 tests per entity: create, list, get_by_id, not_found, update, delete
- No search, filter, or other extra tests
- test_list: assert len(response.json()) >= 1, NEVER assert == 1 (database is shared between tests)
- test_create for child entities: create parent FIRST, use parent's id
- No markdown fences in output

View File

@@ -0,0 +1,47 @@
You are a Python backend developer. Generate a FastAPI project with SQLAlchemy and SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these 4 files:
1. models.py — SQLAlchemy 2.0: DeclarativeBase, Mapped, mapped_column (NOT legacy declarative_base)
2. schemas.py — Pydantic v2: ConfigDict(from_attributes=True) (NOT class Config)
3. main.py — FastAPI CRUD endpoints for each entity
4. test_main.py — Pytest with TestClient, separate test.db, unique test data per test
Do NOT generate pyproject.toml — it is created separately with uv.
OUTPUT FORMAT — use these exact markers to separate files:
=== models.py ===
<python code>
=== schemas.py ===
<python code>
=== main.py ===
<python code>
=== test_main.py ===
<python code>
DOCUMENTATION — every file must have a one-line module docstring. Classes get a one-line docstring. Keep it zensical: say what it IS, not what it does. No filler.
NEVER USE DEPRECATED PATTERNS:
- ✗ declarative_base() → ✓ class Base(DeclarativeBase): pass
- ✗ Column(Type) → ✓ Mapped[type] = mapped_column(Type)
- ✗ class Config: orm_mode = True → ✓ model_config = ConfigDict(from_attributes=True)
- ✗ .dict() → ✓ .model_dump()
- ✗ Optional[str] → ✓ str | None
- ✗ session.query(Model).all() → ✓ session.execute(select(Model)).scalars().all()
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- SQLAlchemy 2.0: DeclarativeBase + Mapped + mapped_column (not Column())
- Python type unions: str | None (not Optional[str])
- Tests: unique descriptive data per test, NOT generic "test_title" strings
- Tests: PUT/update test data MUST include ALL required (non-nullable) fields, not just the field being updated
- Do NOT add filter/search endpoints — only standard CRUD (create, list, get, update, delete)
- CRITICAL: Use ONLY the fields listed in the JSON spec. NEVER add created_at, updated_at, or any field not in the spec
- If the spec happens to include timestamp fields: use server_default=func.now() (from sqlalchemy import func) and make them Optional in Create schema
- Absolute imports only (from models import ..., from schemas import ...)
- NO markdown fences inside file content — just raw code
- Only test endpoints that exist in main.py — no extra tests

View File

@@ -0,0 +1,25 @@
Convert the following Python FastAPI project to Go using Chi router and modernc.org/sqlite.
OUTPUT: Return ALL files with === markers:
=== go.mod ===
=== models.go ===
=== handlers.go ===
=== main.go ===
=== handlers_test.go ===
CONVERSION RULES:
- package main for all files
- Pydantic models → Go structs with json tags
- SQLAlchemy ORM → database/sql with raw SQL and RETURNING clause
- FastAPI routes → Chi router: r.Post("/path", handler(db))
- Handlers are closures: func handler(db *sql.DB) http.HandlerFunc
- Depends(get_db) → State passed via closure over *sql.DB
- HTTPException(404) → http.Error(w, "not found", http.StatusNotFound)
- POST returns http.StatusCreated (201), DELETE returns http.StatusNoContent (204)
- sql.ErrNoRows for not-found checks
- TestClient → httptest.NewServer + setupTestServer helper
- test.db → sql.Open("sqlite", ":memory:")
- Empty list: return []Entity{} not nil
- import _ "modernc.org/sqlite" (pure Go driver, no CGO)
- import "github.com/go-chi/chi/v5"
- No markdown fences in output — just raw code

View File

@@ -0,0 +1,31 @@
DEPRECATED PATTERNS — do NOT generate these. Use the modern alternative.
SQLAlchemy:
✗ from sqlalchemy.ext.declarative import declarative_base → ✓ from sqlalchemy.orm import DeclarativeBase
✗ Base = declarative_base() → ✓ class Base(DeclarativeBase): pass
✗ Column(Integer, primary_key=True) → ✓ Mapped[int] = mapped_column(primary_key=True)
✗ Column(String(255)) → ✓ Mapped[str] = mapped_column(String(255))
✗ session.query(User).filter_by(name="x").all() → ✓ session.execute(select(User).filter_by(name="x")).scalars().all()
✗ session.query(User).get(5) → ✓ session.get(User, 5)
✗ MetaData(bind=engine) → ✓ metadata.create_all(engine)
Pydantic:
✗ class Config: orm_mode = True → ✓ model_config = ConfigDict(from_attributes=True)
✗ .dict() → ✓ .model_dump()
✗ .json() → ✓ .model_dump_json()
✗ parse_obj() → ✓ model_validate()
@validator → ✓ @field_validator
@root_validator → ✓ @model_validator
✗ Optional[str] (auto-None in v1) → ✓ str | None = None (explicit default in v2)
✗ ConstrainedInt → ✓ Annotated[int, Field(ge=0)]
FastAPI:
✗ status_code=201 → ✓ status_code=status.HTTP_201_CREATED (readable)
✗ Manual exception strings → ✓ HTTPException(status_code=404, detail="Not found")
✗ .dict() in handlers → ✓ .model_dump() (Pydantic v2)
Python:
✗ Optional[str] → ✓ str | None (PEP 604, Python 3.10+)
✗ List[str] → ✓ list[str] (PEP 585, Python 3.9+)
✗ Dict[str, int] → ✓ dict[str, int]
✗ Tuple[int, ...] → ✓ tuple[int, ...]

View File

@@ -0,0 +1 @@
You are a Python code fixer. Return ONLY the corrected Python file. No markdown fences, no explanations — just valid Python code.

View File

@@ -0,0 +1,36 @@
REFERENCE PATTERNS (follow exactly):
STACK: SQLAlchemy 2.0 + Pydantic v2 + FastAPI + SQLite
models.py:
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase): pass
Fields: Mapped[type] = mapped_column(SqlType, default=...)
Nullable: Mapped[str | None] = mapped_column(Text, default=None)
Status: Mapped[str] = mapped_column(String(20), default="pending")
FK: Mapped[int] = mapped_column(ForeignKey("table.id"))
End: Base.metadata.create_all(bind=engine)
schemas.py:
class EntityCreate(BaseModel): fields with defaults
class EntityResponse(EntityCreate):
id: int
model_config = ConfigDict(from_attributes=True)
main.py:
def get_db(): yield SessionLocal(); finally close
POST /{table}/ → 201, model_dump()
GET /{table}/ → list
GET /{table}/{id} → 404 if not found
PUT /{table}/{id} → model_dump(), setattr loop
DELETE /{table}/{id} → 204
test_main.py:
test.db + override_get_db + TestClient
Unique descriptive data per test ("Buy milk", "Fetchable task"...)
test_create → 201 + assert "id" in json
test_list → post first, get, assert len >= 1
test_get_by_id → post, get by id, assert id matches
test_not_found → get /99999 → 404
test_update → post, put with ALL required fields, assert 200
test_delete → post, delete 204, get again → 404

View File

@@ -0,0 +1,43 @@
REFERENCE PATTERNS (follow exactly):
STACK: Axum 0.8 + SQLx + SQLite + Tokio + Serde
Cargo.toml:
edition = "2024"
deps: axum 0.8, tokio (full), serde (derive), serde_json, sqlx (sqlite, runtime-tokio), tower-http (cors)
dev: reqwest 0.13 (rustls)
src/models.rs:
#[derive(Debug, Serialize, Deserialize, FromRow)]
struct Entity { id: i64, field: String, optional: Option<String> }
struct CreateEntity { field: String, optional: Option<String> }
Status fields: String with default "pending"
src/handlers.rs:
async fn create(State(pool), Json(input)) -> (StatusCode, Json<Entity>)
POST → StatusCode::CREATED, sqlx::query("INSERT...").execute + query_as last_insert_rowid
GET list → query_as("SELECT * FROM table").fetch_all
GET by id → query_as.fetch_optional, return 404 if None
PUT → query("UPDATE...SET...WHERE id=?"), rows_affected == 0 → 404
DELETE → StatusCode::NO_CONTENT, rows_affected == 0 → 404
src/lib.rs:
pub fn app(pool: SqlitePool) -> Router
pub async fn init_db(pool: &SqlitePool) → CREATE TABLE IF NOT EXISTS
Routes: .route("/{table}", post(create).get(list))
.route("/{table}/{id}", get(get_one).put(update).delete(delete_one))
src/main.rs:
SqlitePool::connect("sqlite:./app.db"), init_db, bind 0.0.0.0:3000
tests/api_test.rs:
Each test: SqlitePool::connect("sqlite::memory:"), init_db, app(pool)
Spawn on random port: TcpListener::bind("127.0.0.1:0"), axum::serve
reqwest::Client for HTTP calls
Unique descriptive data ("Buy milk", "Fetchable task"...)
test_create → 201 + assert id exists
test_list → post first, get, assert len >= 1
test_get_by_id → post, get, assert id matches
test_not_found → 404
test_update → post, put with ALL fields, assert 200
test_delete → post, delete 204, get → 404

View File

@@ -0,0 +1,19 @@
You design database schemas. Output ONLY the schema in this exact format, nothing else.
FORMAT (one entity per line):
project: project-name
entity EntityName (table_name): field1 type, field2 type, field3 type=default
entity ChildName (table_name): field1 type, parent_id int->ParentName, field2 type
TYPES: string, text, int, float, bool, date, datetime
RULES:
- id is automatic, do NOT include it
- FK fields end with _id and use -> to reference parent
- Parent entities BEFORE children
- Max 7 fields per entity, max 3 entities
- Status fields: string with =default (e.g. status string=draft)
EXAMPLE:
project: blog-api
entity Author (authors): name string, email string, bio text
entity Post (posts): title string, content text, author_id int->Author, published_at datetime, status string=draft

View File

@@ -0,0 +1,17 @@
You design database schemas. Output ONLY valid JSON, no explanations.
SCHEMA:
{"project_name":"name","entities":[{"name":"Entity","table_name":"entities","fields":[{"name":"field","type":"string","nullable":false,"default":null}]}],"relationships":[{"from":"Child","field":"parent_id","to":"Parent"}]}
FIELD TYPES: string, text, int, float, bool, date, datetime
- Status fields: type "string", default "draft" or "pending"
- id is automatic — do NOT include it
- FK fields: type "int", name ends with _id
RULES:
- Parent entities BEFORE children in array
- Every _id field needs a relationship entry
- Max 7 fields, max 3 entities
- English names only
EXAMPLE: Blog → Author: name(string), email(string) / Post: title(string), content(text), author_id(int)→Author, status(string,default="draft")

View File

@@ -0,0 +1,31 @@
You are a software architect who designs database schemas for Python web applications.
THINK STEP BY STEP before outputting JSON:
1. What are the main ENTITIES (nouns) in this project?
2. What FIELDS does each entity need? (name, type, required?)
3. Which entities REFERENCE each other? (e.g. "a Book belongs to an Author" → Book has author_id)
4. Are there Date/DateTime fields? → add extra_imports
Then output ONLY valid JSON (no explanations before or after).
SCHEMA:
{"project_name":"short-name","description":"One sentence","entities":[{"name":"EntityName","table_name":"entity_names","fields":[{"name":"field_name","sa_type":"String(255)","py_type":"str","nullable":false,"default":null}]}],"relationships":[{"from":"ChildEntity","field":"parent_id","to":"ParentEntity","type":"many-to-one"}],"extra_imports":[]}
FIELD RULES:
- sa_type: String(N), Text, Integer, Date, DateTime, Boolean, Float
- py_type: str, int, float, bool, date, datetime — append " | None" if nullable
- Status fields: use String(20) with default value, NEVER Enum
- Every entity gets "id" automatically — do NOT add id or redundant ID fields
- Use snake_case for field names
RELATIONSHIP RULES:
- If entity A "belongs to" entity B → A has b_id field (Integer, nullable=false) + relationship entry
- EVERY _id field MUST have a matching relationship entry
- Parent entities must appear BEFORE children in the entities array
- If no relationships, set "relationships": []
AVOID: redundant ID fields, generic names, more than 7 fields or 3 entities, non-English entity/field names (ALWAYS English even if description is Finnish)
EXAMPLES (adapt, don't copy):
Todo app → Todo: title(str), description(Text|None), due_date(Date|None), status(String20="pending")
Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_id→Author, published_at(DateTime|None), status(String20="draft")

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = /*DATA_PLACEHOLDER*/[];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,422 @@
[
{
"model": "qwen3.5:9b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 65901,
"totalTokens": 5056,
"avgTokPerSec": 82.99139473832963,
"promptChars": 12334,
"promptTokensEst": 3084,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3.5:9b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 2,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 74087,
"totalTokens": 5645,
"avgTokPerSec": 83.57073831360164,
"promptChars": 10757,
"promptTokensEst": 2689,
"score": 20,
"stars": "★☆☆☆☆",
"error": null
},
{
"model": "qwen3.5:9b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 49830,
"totalTokens": 3803,
"avgTokPerSec": 83.26266260763309,
"promptChars": 10826,
"promptTokensEst": 2707,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 57032,
"totalTokens": 4924,
"avgTokPerSec": 106.02334905805122,
"promptChars": 11313,
"promptTokensEst": 2828,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 5,
"testsFailed": 2,
"totalDurationMs": 54307,
"totalTokens": 5060,
"avgTokPerSec": 106.89447491163497,
"promptChars": 11225,
"promptTokensEst": 2806,
"score": 83,
"stars": "★★★★☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 2,
"testsFailed": 9,
"totalDurationMs": 57080,
"totalTokens": 5310,
"avgTokPerSec": 106.64914988130955,
"promptChars": 11791,
"promptTokensEst": 2948,
"score": 51,
"stars": "★★★☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 22377,
"totalTokens": 3534,
"avgTokPerSec": 201.24475679283708,
"promptChars": 11479,
"promptTokensEst": 2870,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 8,
"fixRounds": 2,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 44520,
"totalTokens": 7495,
"avgTokPerSec": 201.87149050701015,
"promptChars": 11886,
"promptTokensEst": 2972,
"score": 20,
"stars": "★☆☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 20136,
"totalTokens": 3338,
"avgTokPerSec": 200.86152095722105,
"promptChars": 11228,
"promptTokensEst": 2807,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:7b",
"scenario": "todo",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
},
{
"model": "qwen2.5-coder:7b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 20012,
"totalTokens": 2119,
"avgTokPerSec": 122.7557304112134,
"promptChars": 10342,
"promptTokensEst": 2586,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:7b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 26133,
"totalTokens": 2715,
"avgTokPerSec": 121.94987205993503,
"promptChars": 11193,
"promptTokensEst": 2798,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 44757,
"totalTokens": 2156,
"avgTokPerSec": 60.77636586631207,
"promptChars": 9635,
"promptTokensEst": 2409,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41166,
"totalTokens": 2282,
"avgTokPerSec": 61.14821289733007,
"promptChars": 9575,
"promptTokensEst": 2394,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 66478,
"totalTokens": 3681,
"avgTokPerSec": 60.493817783668725,
"promptChars": 10500,
"promptTokensEst": 2625,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 29801,
"totalTokens": 2249,
"avgTokPerSec": 98.5661742189331,
"promptChars": 9615,
"promptTokensEst": 2404,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 22974,
"totalTokens": 2050,
"avgTokPerSec": 101.2398768597589,
"promptChars": 9273,
"promptTokensEst": 2318,
"score": 85,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 39335,
"totalTokens": 3537,
"avgTokPerSec": 100.10984073540648,
"promptChars": 10525,
"promptTokensEst": 2631,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:4b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 58668,
"totalTokens": 7134,
"avgTokPerSec": 141.76822189196028,
"promptChars": 15202,
"promptTokensEst": 3801,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:4b",
"scenario": "users",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
},
{
"model": "qwen3:4b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":186642,"totalTokens":10237,"avgTokPerSec":59.06411550065281,"promptChars":10576,"promptTokensEst":2644,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":121848,"totalTokens":6735,"avgTokPerSec":59.85231850668119,"promptChars":9684,"promptTokensEst":2421,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":11,"testsPassed":9,"testsFailed":2,"totalDurationMs":83491,"totalTokens":4677,"avgTokPerSec":60.222832434869694,"promptChars":10423,"promptTokensEst":2606,"score":89,"stars":"★★★★☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":56288,"totalTokens":5235,"avgTokPerSec":99.60027546406452,"promptChars":9307,"promptTokensEst":2327,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":5,"testsFailed":1,"totalDurationMs":59639,"totalTokens":5526,"avgTokPerSec":99.6742208632186,"promptChars":9158,"promptTokensEst":2290,"score":90,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":11,"testsPassed":10,"testsFailed":1,"totalDurationMs":131793,"totalTokens":11779,"avgTokPerSec":97.17878362853351,"promptChars":10390,"promptTokensEst":2598,"score":95,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 186642,
"totalTokens": 10237,
"avgTokPerSec": 59.06411550065281,
"promptChars": 10576,
"promptTokensEst": 2644,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 121848,
"totalTokens": 6735,
"avgTokPerSec": 59.85231850668119,
"promptChars": 9684,
"promptTokensEst": 2421,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 83491,
"totalTokens": 4677,
"avgTokPerSec": 60.222832434869694,
"promptChars": 10423,
"promptTokensEst": 2606,
"score": 89,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 56288,
"totalTokens": 5235,
"avgTokPerSec": 99.60027546406452,
"promptChars": 9307,
"promptTokensEst": 2327,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 5,
"testsFailed": 1,
"totalDurationMs": 59639,
"totalTokens": 5526,
"avgTokPerSec": 99.6742208632186,
"promptChars": 9158,
"promptTokensEst": 2290,
"score": 90,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 131793,
"totalTokens": 11779,
"avgTokPerSec": 97.17878362853351,
"promptChars": 10390,
"promptTokensEst": 2598,
"score": 95,
"stars": "★★★★★",
"error": null
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":66903,"totalTokens":5454,"avgTokPerSec":86.45918994499432,"promptChars":9985,"promptTokensEst":2496,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":87618,"totalTokens":7150,"avgTokPerSec":87.21782190501095,"promptChars":9922,"promptTokensEst":2481,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":9,"testsPassed":5,"testsFailed":4,"totalDurationMs":78398,"totalTokens":6427,"avgTokPerSec":85.52353711143463,"promptChars":10737,"promptTokensEst":2684,"score":73,"stars":"★★★★☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":8,"testsPassed":7,"testsFailed":1,"totalDurationMs":82750,"totalTokens":10054,"avgTokPerSec":139.90690936146032,"promptChars":9360,"promptTokensEst":2340,"score":93,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":32233,"totalTokens":4404,"avgTokPerSec":143.4997404058814,"promptChars":9310,"promptTokensEst":2328,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":88563,"totalTokens":11575,"avgTokPerSec":141.54675017528362,"promptChars":10567,"promptTokensEst":2642,"score":40,"stars":"★★☆☆☆","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 66903,
"totalTokens": 5454,
"avgTokPerSec": 86.45918994499432,
"promptChars": 9985,
"promptTokensEst": 2496,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 87618,
"totalTokens": 7150,
"avgTokPerSec": 87.21782190501095,
"promptChars": 9922,
"promptTokensEst": 2481,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 5,
"testsFailed": 4,
"totalDurationMs": 78398,
"totalTokens": 6427,
"avgTokPerSec": 85.52353711143463,
"promptChars": 10737,
"promptTokensEst": 2684,
"score": 73,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 82750,
"totalTokens": 10054,
"avgTokPerSec": 139.90690936146032,
"promptChars": 9360,
"promptTokensEst": 2340,
"score": 93,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 32233,
"totalTokens": 4404,
"avgTokPerSec": 143.4997404058814,
"promptChars": 9310,
"promptTokensEst": 2328,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 88563,
"totalTokens": 11575,
"avgTokPerSec": 141.54675017528362,
"promptChars": 10567,
"promptTokensEst": 2642,
"score": 40,
"stars": "★★☆☆☆",
"error": null
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":9,"testsPassed":6,"testsFailed":3,"totalDurationMs":50350,"totalTokens":2797,"avgTokPerSec":60.919860198859574,"promptChars":9858,"promptTokensEst":2465,"score":80,"stars":"★★★★☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":8,"testsPassed":6,"testsFailed":2,"totalDurationMs":46557,"totalTokens":2584,"avgTokPerSec":60.88834523948,"promptChars":9544,"promptTokensEst":2386,"score":85,"stars":"★★★★☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":15,"testsPassed":2,"testsFailed":13,"totalDurationMs":90761,"totalTokens":4979,"avgTokPerSec":60.19247492391319,"promptChars":10521,"promptTokensEst":2630,"score":48,"stars":"★★☆☆☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":27360,"totalTokens":2466,"avgTokPerSec":100.9922018173994,"promptChars":9767,"promptTokensEst":2442,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat"},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":7,"testsPassed":7,"testsFailed":0,"totalDurationMs":20920,"totalTokens":1876,"avgTokPerSec":101.60760023892685,"promptChars":8782,"promptTokensEst":2196,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":10,"testsPassed":9,"testsFailed":1,"totalDurationMs":35766,"totalTokens":3217,"avgTokPerSec":100.40066102398943,"promptChars":10334,"promptTokensEst":2584,"score":94,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 50350,
"totalTokens": 2797,
"avgTokPerSec": 60.919860198859574,
"promptChars": 9858,
"promptTokensEst": 2465,
"score": 80,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 46557,
"totalTokens": 2584,
"avgTokPerSec": 60.88834523948,
"promptChars": 9544,
"promptTokensEst": 2386,
"score": 85,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 15,
"testsPassed": 2,
"testsFailed": 13,
"totalDurationMs": 90761,
"totalTokens": 4979,
"avgTokPerSec": 60.19247492391319,
"promptChars": 10521,
"promptTokensEst": 2630,
"score": 48,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 27360,
"totalTokens": 2466,
"avgTokPerSec": 100.9922018173994,
"promptChars": 9767,
"promptTokensEst": 2442,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat"
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 20920,
"totalTokens": 1876,
"avgTokPerSec": 101.60760023892685,
"promptChars": 8782,
"promptTokensEst": 2196,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 9,
"testsFailed": 1,
"totalDurationMs": 35766,
"totalTokens": 3217,
"avgTokPerSec": 100.40066102398943,
"promptChars": 10334,
"promptTokensEst": 2584,
"score": 94,
"stars": "★★★★★",
"error": null
}
]

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,947 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 30801,
"totalTokens": 2333,
"avgTokPerSec": 122.77922150989748,
"promptChars": 10015,
"promptTokensEst": 2504,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 25495,
"totalTokens": 2714,
"avgTokPerSec": 122.70970007652487,
"promptChars": 9891,
"promptTokensEst": 2473,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 37153,
"totalTokens": 3979,
"avgTokPerSec": 121.9183958236036,
"promptChars": 11158,
"promptTokensEst": 2790,
"score": 95,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 43456,
"totalTokens": 2411,
"avgTokPerSec": 60.89226084568145,
"promptChars": 9831,
"promptTokensEst": 2458,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 40376,
"totalTokens": 2237,
"avgTokPerSec": 61.028627032662456,
"promptChars": 9343,
"promptTokensEst": 2336,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 2,
"testsFailed": 10,
"totalDurationMs": 68620,
"totalTokens": 3796,
"avgTokPerSec": 60.47793268944476,
"promptChars": 10497,
"promptTokensEst": 2624,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 25235,
"totalTokens": 2269,
"avgTokPerSec": 101.24212769079884,
"promptChars": 9294,
"promptTokensEst": 2324,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21720,
"totalTokens": 1942,
"avgTokPerSec": 101.65074583709965,
"promptChars": 9020,
"promptTokensEst": 2255,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 39006,
"totalTokens": 3509,
"avgTokPerSec": 100.43593706181406,
"promptChars": 10372,
"promptTokensEst": 2593,
"score": 95,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 21989,
"totalTokens": 2339,
"avgTokPerSec": 122.8454095677367,
"promptChars": 10052,
"promptTokensEst": 2513,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23997,
"totalTokens": 2551,
"avgTokPerSec": 122.23722733560855,
"promptChars": 9973,
"promptTokensEst": 2493,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 30169,
"totalTokens": 3249,
"avgTokPerSec": 123.04696524796096,
"promptChars": 11097,
"promptTokensEst": 2774,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 47091,
"totalTokens": 2602,
"avgTokPerSec": 60.962687726457375,
"promptChars": 9633,
"promptTokensEst": 2408,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41747,
"totalTokens": 2313,
"avgTokPerSec": 60.949025583617605,
"promptChars": 9373,
"promptTokensEst": 2343,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 2,
"testsFailed": 10,
"totalDurationMs": 66888,
"totalTokens": 3699,
"avgTokPerSec": 60.49540514685331,
"promptChars": 10323,
"promptTokensEst": 2581,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 27036,
"totalTokens": 2434,
"avgTokPerSec": 101.01399069228444,
"promptChars": 9513,
"promptTokensEst": 2378,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 20927,
"totalTokens": 1872,
"avgTokPerSec": 101.45096098956486,
"promptChars": 8881,
"promptTokensEst": 2220,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 26919,
"totalTokens": 2889,
"avgTokPerSec": 123.63666629145064,
"promptChars": 10162,
"promptTokensEst": 2541,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 27592,
"totalTokens": 2946,
"avgTokPerSec": 122.33273400152825,
"promptChars": 9469,
"promptTokensEst": 2367,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 35734,
"totalTokens": 3827,
"avgTokPerSec": 122.65156559717951,
"promptChars": 11086,
"promptTokensEst": 2772,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 50372,
"totalTokens": 2795,
"avgTokPerSec": 60.91611850918806,
"promptChars": 9758,
"promptTokensEst": 2440,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 38716,
"totalTokens": 2144,
"avgTokPerSec": 61.0412890406478,
"promptChars": 9415,
"promptTokensEst": 2354,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 7,
"testsFailed": 7,
"totalDurationMs": 74882,
"totalTokens": 4130,
"avgTokPerSec": 60.32640855026445,
"promptChars": 10506,
"promptTokensEst": 2627,
"score": 70,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 35913,
"totalTokens": 3218,
"avgTokPerSec": 100.38516205100154,
"promptChars": 11338,
"promptTokensEst": 2835,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 20974,
"totalTokens": 1880,
"avgTokPerSec": 101.52450928280543,
"promptChars": 8803,
"promptTokensEst": 2201,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 36005,
"totalTokens": 3243,
"avgTokPerSec": 100.44301406462307,
"promptChars": 10414,
"promptTokensEst": 2604,
"score": 89,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 1,
"testsFailed": 6,
"totalDurationMs": 23071,
"totalTokens": 2469,
"avgTokPerSec": 124.09643322620661,
"promptChars": 9960,
"promptTokensEst": 2490,
"score": 49,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 2,
"testsFailed": 6,
"totalDurationMs": 27062,
"totalTokens": 2907,
"avgTokPerSec": 123.35530975346687,
"promptChars": 9558,
"promptTokensEst": 2390,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 9,
"testsFailed": 0,
"totalDurationMs": 29395,
"totalTokens": 3156,
"avgTokPerSec": 123.22575073561812,
"promptChars": 10574,
"promptTokensEst": 2644,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 39590,
"totalTokens": 2198,
"avgTokPerSec": 61.051945510465806,
"promptChars": 9664,
"promptTokensEst": 2416,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 36950,
"totalTokens": 2042,
"avgTokPerSec": 61.01436784429489,
"promptChars": 9225,
"promptTokensEst": 2306,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 2,
"testsFailed": 12,
"totalDurationMs": 80600,
"totalTokens": 4437,
"avgTokPerSec": 60.29371170543078,
"promptChars": 10688,
"promptTokensEst": 2672,
"score": 49,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 29125,
"totalTokens": 2619,
"avgTokPerSec": 100.90587777586212,
"promptChars": 9899,
"promptTokensEst": 2475,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 21847,
"totalTokens": 1957,
"avgTokPerSec": 101.44111070734304,
"promptChars": 8946,
"promptTokensEst": 2237,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 21127,
"totalTokens": 2245,
"avgTokPerSec": 124.22714049663371,
"promptChars": 9972,
"promptTokensEst": 2493,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 30281,
"totalTokens": 3079,
"avgTokPerSec": 123.00254714651271,
"promptChars": 9562,
"promptTokensEst": 2391,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 39630,
"totalTokens": 4274,
"avgTokPerSec": 123.08303937451802,
"promptChars": 11119,
"promptTokensEst": 2780,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38032,
"totalTokens": 2104,
"avgTokPerSec": 61.05445464163662,
"promptChars": 9455,
"promptTokensEst": 2364,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 39620,
"totalTokens": 2193,
"avgTokPerSec": 61.04565233675101,
"promptChars": 9481,
"promptTokensEst": 2370,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 63579,
"totalTokens": 3520,
"avgTokPerSec": 60.51513453009977,
"promptChars": 10493,
"promptTokensEst": 2623,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 30845,
"totalTokens": 2777,
"avgTokPerSec": 100.79046137130972,
"promptChars": 9507,
"promptTokensEst": 2377,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21413,
"totalTokens": 1914,
"avgTokPerSec": 101.25525436264132,
"promptChars": 8804,
"promptTokensEst": 2201,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
}
]

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,947 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 33892,
"totalTokens": 2675,
"avgTokPerSec": 88.07409036121237,
"promptChars": 9688,
"promptTokensEst": 2422,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 30647,
"totalTokens": 2549,
"avgTokPerSec": 88.4488185974085,
"promptChars": 9594,
"promptTokensEst": 2399,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 13,
"testsPassed": 6,
"testsFailed": 7,
"totalDurationMs": 44371,
"totalTokens": 3678,
"avgTokPerSec": 88.172616246191,
"promptChars": 10432,
"promptTokensEst": 2608,
"score": 68,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 18385,
"totalTokens": 2375,
"avgTokPerSec": 147.62230806597154,
"promptChars": 9478,
"promptTokensEst": 2370,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 13968,
"totalTokens": 1904,
"avgTokPerSec": 148.3084817167518,
"promptChars": 8837,
"promptTokensEst": 2209,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 25642,
"totalTokens": 3476,
"avgTokPerSec": 146.49556892944076,
"promptChars": 10734,
"promptTokensEst": 2684,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 19982,
"totalTokens": 2937,
"avgTokPerSec": 191.2786317674431,
"promptChars": 10281,
"promptTokensEst": 2570,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 17114,
"totalTokens": 2903,
"avgTokPerSec": 190.51221206765385,
"promptChars": 9654,
"promptTokensEst": 2414,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 22352,
"totalTokens": 3776,
"avgTokPerSec": 190.56628728306987,
"promptChars": 11134,
"promptTokensEst": 2784,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 31217,
"totalTokens": 2463,
"avgTokPerSec": 88.6684646675098,
"promptChars": 9598,
"promptTokensEst": 2400,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 27520,
"totalTokens": 2288,
"avgTokPerSec": 88.64765360012593,
"promptChars": 9612,
"promptTokensEst": 2403,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 3,
"testsFailed": 9,
"totalDurationMs": 41874,
"totalTokens": 3474,
"avgTokPerSec": 88.22266853318554,
"promptChars": 10408,
"promptTokensEst": 2602,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 24781,
"totalTokens": 3240,
"avgTokPerSec": 146.89167309934365,
"promptChars": 10179,
"promptTokensEst": 2545,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 19148,
"totalTokens": 2605,
"avgTokPerSec": 147.55250620481297,
"promptChars": 9634,
"promptTokensEst": 2409,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 23816,
"totalTokens": 3232,
"avgTokPerSec": 147.25857324533817,
"promptChars": 9226,
"promptTokensEst": 2307,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 16639,
"totalTokens": 2369,
"avgTokPerSec": 191.61273045157245,
"promptChars": 10048,
"promptTokensEst": 2512,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 18588,
"totalTokens": 3163,
"avgTokPerSec": 190.86975006725547,
"promptChars": 10048,
"promptTokensEst": 2512,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 22677,
"totalTokens": 3828,
"avgTokPerSec": 190.15611016906482,
"promptChars": 11090,
"promptTokensEst": 2773,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 26449,
"totalTokens": 2063,
"avgTokPerSec": 88.77498453063184,
"promptChars": 9608,
"promptTokensEst": 2402,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 27510,
"totalTokens": 2289,
"avgTokPerSec": 88.74699253414485,
"promptChars": 9418,
"promptTokensEst": 2355,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 3,
"testsFailed": 9,
"totalDurationMs": 45105,
"totalTokens": 3738,
"avgTokPerSec": 88.04788102995212,
"promptChars": 10564,
"promptTokensEst": 2641,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 19204,
"totalTokens": 2480,
"avgTokPerSec": 147.91758782382294,
"promptChars": 9391,
"promptTokensEst": 2348,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 12990,
"totalTokens": 1769,
"avgTokPerSec": 148.2616673700717,
"promptChars": 8898,
"promptTokensEst": 2225,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 10,
"testsFailed": 2,
"totalDurationMs": 25831,
"totalTokens": 3500,
"avgTokPerSec": 146.86924785880186,
"promptChars": 9465,
"promptTokensEst": 2366,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 19453,
"totalTokens": 2845,
"avgTokPerSec": 191.37382231956113,
"promptChars": 10157,
"promptTokensEst": 2539,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 9,
"testsFailed": 0,
"totalDurationMs": 21570,
"totalTokens": 3529,
"avgTokPerSec": 190.65454623497536,
"promptChars": 9732,
"promptTokensEst": 2433,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 25537,
"totalTokens": 4300,
"avgTokPerSec": 189.94521619124598,
"promptChars": 11127,
"promptTokensEst": 2782,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 31923,
"totalTokens": 2522,
"avgTokPerSec": 88.62182881661799,
"promptChars": 9700,
"promptTokensEst": 2425,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 26000,
"totalTokens": 2163,
"avgTokPerSec": 88.86878707672254,
"promptChars": 9288,
"promptTokensEst": 2322,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 43275,
"totalTokens": 3588,
"avgTokPerSec": 88.24995936347965,
"promptChars": 10173,
"promptTokensEst": 2543,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 0,
"testsFailed": 14,
"totalDurationMs": 30045,
"totalTokens": 3913,
"avgTokPerSec": 146.51683619371713,
"promptChars": 10334,
"promptTokensEst": 2584,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 5,
"testsFailed": 4,
"totalDurationMs": 17076,
"totalTokens": 2321,
"avgTokPerSec": 147.99547121069506,
"promptChars": 9451,
"promptTokensEst": 2363,
"score": 73,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 23890,
"totalTokens": 3243,
"avgTokPerSec": 147.20125507974117,
"promptChars": 9217,
"promptTokensEst": 2304,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21812,
"totalTokens": 3246,
"avgTokPerSec": 191.07801335688654,
"promptChars": 10249,
"promptTokensEst": 2562,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 20325,
"totalTokens": 3441,
"avgTokPerSec": 190.10241840094508,
"promptChars": 9930,
"promptTokensEst": 2483,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 26087,
"totalTokens": 4387,
"avgTokPerSec": 189.8005689388054,
"promptChars": 11109,
"promptTokensEst": 2777,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 30287,
"totalTokens": 2388,
"avgTokPerSec": 88.72243320918638,
"promptChars": 9695,
"promptTokensEst": 2424,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 31212,
"totalTokens": 2601,
"avgTokPerSec": 88.71289036919063,
"promptChars": 9619,
"promptTokensEst": 2405,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 15,
"testsPassed": 3,
"testsFailed": 12,
"totalDurationMs": 50939,
"totalTokens": 4217,
"avgTokPerSec": 88.06125722020734,
"promptChars": 10743,
"promptTokensEst": 2686,
"score": 52,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 17913,
"totalTokens": 2310,
"avgTokPerSec": 148.0291268001691,
"promptChars": 9357,
"promptTokensEst": 2339,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 13948,
"totalTokens": 1898,
"avgTokPerSec": 148.37907379944423,
"promptChars": 8725,
"promptTokensEst": 2181,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 15229,
"totalTokens": 2119,
"avgTokPerSec": 192.33007410215646,
"promptChars": 9827,
"promptTokensEst": 2457,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 18223,
"totalTokens": 3093,
"avgTokPerSec": 190.71372054282037,
"promptChars": 9641,
"promptTokensEst": 2410,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 1,
"testsFailed": 9,
"totalDurationMs": 21215,
"totalTokens": 3589,
"avgTokPerSec": 190.49493540345176,
"promptChars": 11180,
"promptTokensEst": 2795,
"score": 46,
"stars": "★★☆☆☆",
"error": null,
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":21688,"totalTokens":2243,"avgTokPerSec":121.7719614197307,"promptChars":11588,"promptTokensEst":2897,"score":100,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,22 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 21688,
"totalTokens": 2243,
"avgTokPerSec": 121.7719614197307,
"promptChars": 11588,
"promptTokensEst": 2897,
"score": 100,
"stars": "★★★★★",
"error": null
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":23521,"totalTokens":2090,"avgTokPerSec":100.94324085271073,"promptChars":10962,"promptTokensEst":2741,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":1,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":33680,"totalTokens":3003,"avgTokPerSec":100.52754588753601,"promptChars":10171,"promptTokensEst":2543,"score":90,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui"}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,62 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23521,
"totalTokens": 2090,
"avgTokPerSec": 100.94324085271073,
"promptChars": 10962,
"promptTokensEst": 2741,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 33680,
"totalTokens": 3003,
"avgTokPerSec": 100.52754588753601,
"promptChars": 10171,
"promptTokensEst": 2543,
"score": 90,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":8,"testsPassed":6,"testsFailed":2,"totalDurationMs":97470,"totalTokens":8786,"avgTokPerSec":97.96636139685832,"promptChars":11290,"promptTokensEst":2823,"score":65,"stars":"★★★☆☆","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":18951,"totalTokens":1666,"avgTokPerSec":101.807593927545,"promptChars":10293,"promptTokensEst":2573,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":126005,"totalTokens":11056,"avgTokPerSec":96.6373549161171,"promptChars":11878,"promptTokensEst":2970,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe"}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,62 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 97470,
"totalTokens": 8786,
"avgTokPerSec": 97.96636139685832,
"promptChars": 11290,
"promptTokensEst": 2823,
"score": 65,
"stars": "★★★☆☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 18951,
"totalTokens": 1666,
"avgTokPerSec": 101.807593927545,
"promptChars": 10293,
"promptTokensEst": 2573,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 126005,
"totalTokens": 11056,
"avgTokPerSec": 96.6373549161171,
"promptChars": 11878,
"promptTokensEst": 2970,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe"
}
]

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,947 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 25444,
"totalTokens": 2661,
"avgTokPerSec": 122.06801173056196,
"promptChars": 11849,
"promptTokensEst": 2962,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 24447,
"totalTokens": 2537,
"avgTokPerSec": 121.11837170891442,
"promptChars": 11045,
"promptTokensEst": 2761,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 38071,
"totalTokens": 3965,
"avgTokPerSec": 120.37309655579647,
"promptChars": 12702,
"promptTokensEst": 3176,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38459,
"totalTokens": 2106,
"avgTokPerSec": 60.889088461567745,
"promptChars": 10951,
"promptTokensEst": 2738,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 35959,
"totalTokens": 1966,
"avgTokPerSec": 60.9684885562545,
"promptChars": 10698,
"promptTokensEst": 2675,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 2,
"testsFailed": 11,
"totalDurationMs": 269370,
"totalTokens": 14361,
"avgTokPerSec": 57.79069860126629,
"promptChars": 11838,
"promptTokensEst": 2960,
"score": 29,
"stars": "★★☆☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23199,
"totalTokens": 2054,
"avgTokPerSec": 101.09280595816365,
"promptChars": 10854,
"promptTokensEst": 2714,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 72665,
"totalTokens": 6586,
"avgTokPerSec": 99.40636298490288,
"promptChars": 10157,
"promptTokensEst": 2539,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 136309,
"totalTokens": 12036,
"avgTokPerSec": 97.02525169408467,
"promptChars": 10823,
"promptTokensEst": 2706,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 28177,
"totalTokens": 2946,
"avgTokPerSec": 121.23541038097,
"promptChars": 11836,
"promptTokensEst": 2959,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 22631,
"totalTokens": 2352,
"avgTokPerSec": 121.93930190168658,
"promptChars": 10440,
"promptTokensEst": 2610,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 40394,
"totalTokens": 4225,
"avgTokPerSec": 120.84107397324551,
"promptChars": 12362,
"promptTokensEst": 3091,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 46081,
"totalTokens": 2542,
"avgTokPerSec": 60.93046828700026,
"promptChars": 11412,
"promptTokensEst": 2853,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41323,
"totalTokens": 2272,
"avgTokPerSec": 60.99406174164295,
"promptChars": 10884,
"promptTokensEst": 2721,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 14,
"testsPassed": 2,
"testsFailed": 12,
"totalDurationMs": 262591,
"totalTokens": 14129,
"avgTokPerSec": 57.91340837830759,
"promptChars": 12143,
"promptTokensEst": 3036,
"score": 29,
"stars": "★★☆☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 24007,
"totalTokens": 2137,
"avgTokPerSec": 101.05982103292858,
"promptChars": 10756,
"promptTokensEst": 2689,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 68739,
"totalTokens": 6199,
"avgTokPerSec": 98.9825675198183,
"promptChars": 10313,
"promptTokensEst": 2578,
"score": 71,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23472,
"totalTokens": 2427,
"avgTokPerSec": 120.85293828875076,
"promptChars": 11663,
"promptTokensEst": 2916,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 25864,
"totalTokens": 2671,
"avgTokPerSec": 120.6883137195962,
"promptChars": 11148,
"promptTokensEst": 2787,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 41074,
"totalTokens": 4275,
"avgTokPerSec": 120.33351485161673,
"promptChars": 12664,
"promptTokensEst": 3166,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 40457,
"totalTokens": 2229,
"avgTokPerSec": 61.093615619948345,
"promptChars": 10905,
"promptTokensEst": 2726,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 77506,
"totalTokens": 4268,
"avgTokPerSec": 60.19655522627278,
"promptChars": 11135,
"promptTokensEst": 2784,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 74791,
"totalTokens": 3590,
"avgTokPerSec": 60.549298891176214,
"promptChars": 11653,
"promptTokensEst": 2913,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 26402,
"totalTokens": 2358,
"avgTokPerSec": 100.76936895480246,
"promptChars": 11243,
"promptTokensEst": 2811,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20751,
"totalTokens": 1837,
"avgTokPerSec": 101.05480893032836,
"promptChars": 10553,
"promptTokensEst": 2638,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 22098,
"totalTokens": 2283,
"avgTokPerSec": 121.81254413612446,
"promptChars": 11503,
"promptTokensEst": 2876,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 65403,
"totalTokens": 6779,
"avgTokPerSec": 118.13288294758586,
"promptChars": 10939,
"promptTokensEst": 2735,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 36044,
"totalTokens": 3748,
"avgTokPerSec": 120.14822967005487,
"promptChars": 12639,
"promptTokensEst": 3160,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38501,
"totalTokens": 2113,
"avgTokPerSec": 61.01814139430428,
"promptChars": 10929,
"promptTokensEst": 2732,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 1,
"testsFailed": 7,
"totalDurationMs": 147057,
"totalTokens": 7799,
"avgTokPerSec": 56.209406465865904,
"promptChars": 11207,
"promptTokensEst": 2802,
"score": 28,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 227508,
"totalTokens": 12026,
"avgTokPerSec": 58.52888492610325,
"promptChars": 11809,
"promptTokensEst": 2952,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 131964,
"totalTokens": 11403,
"avgTokPerSec": 97.10963264920952,
"promptChars": 11786,
"promptTokensEst": 2947,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38820,
"totalTokens": 1826,
"avgTokPerSec": 101.07773707712924,
"promptChars": 10568,
"promptTokensEst": 2642,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 39797,
"totalTokens": 3776,
"avgTokPerSec": 120.91801837211113,
"promptChars": 11435,
"promptTokensEst": 2859,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 87836,
"totalTokens": 9343,
"avgTokPerSec": 119.28783662683314,
"promptChars": 10718,
"promptTokensEst": 2680,
"score": 73,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 36644,
"totalTokens": 3897,
"avgTokPerSec": 122.28607796191666,
"promptChars": 12598,
"promptTokensEst": 3150,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 127532,
"totalTokens": 3919,
"avgTokPerSec": 34.13133325491828,
"promptChars": 11352,
"promptTokensEst": 2838,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 217365,
"totalTokens": 7764,
"avgTokPerSec": 38.67613170588518,
"promptChars": 10834,
"promptTokensEst": 2709,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 14,
"testsPassed": 7,
"testsFailed": 7,
"totalDurationMs": 248311,
"totalTokens": 13443,
"avgTokPerSec": 58.05680015263308,
"promptChars": 12219,
"promptTokensEst": 3055,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38326,
"totalTokens": 2079,
"avgTokPerSec": 100.89778087504016,
"promptChars": 10908,
"promptTokensEst": 2727,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 60823,
"totalTokens": 1772,
"avgTokPerSec": 96.76383996716295,
"promptChars": 10378,
"promptTokensEst": 2595,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 81654,
"totalTokens": 3458,
"avgTokPerSec": 95.65675360193613,
"promptChars": 11914,
"promptTokensEst": 2979,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1 @@
[]

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,317 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 97527,
"totalTokens": 2228,
"avgTokPerSec": 100.69171830800946,
"promptChars": 11566,
"promptTokensEst": 2892,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 39549,
"totalTokens": 1960,
"avgTokPerSec": 100.98265593129491,
"promptChars": 11073,
"promptTokensEst": 2768,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 131339,
"totalTokens": 11518,
"avgTokPerSec": 96.52358107464266,
"promptChars": 12388,
"promptTokensEst": 3097,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20658,
"totalTokens": 1808,
"avgTokPerSec": 101.0081173861862,
"promptChars": 11057,
"promptTokensEst": 2764,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 5,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 320031,
"totalTokens": 11985,
"avgTokPerSec": 54.915025374575386,
"promptChars": 12517,
"promptTokensEst": 3129,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 28654,
"totalTokens": 1877,
"avgTokPerSec": 100.70920643946336,
"promptChars": 10747,
"promptTokensEst": 2687,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 67943,
"totalTokens": 6002,
"avgTokPerSec": 98.29436788902672,
"promptChars": 12389,
"promptTokensEst": 3097,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20203,
"totalTokens": 1774,
"avgTokPerSec": 100.9066297884274,
"promptChars": 10905,
"promptTokensEst": 2726,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 12,
"testsFailed": 1,
"totalDurationMs": 148491,
"totalTokens": 12747,
"avgTokPerSec": 95.18237885727869,
"promptChars": 12476,
"promptTokensEst": 3119,
"score": 75,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23830,
"totalTokens": 2102,
"avgTokPerSec": 100.641489789061,
"promptChars": 11404,
"promptTokensEst": 2851,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 122453,
"totalTokens": 7285,
"avgTokPerSec": 94.12482830400619,
"promptChars": 11400,
"promptTokensEst": 2850,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 147125,
"totalTokens": 9893,
"avgTokPerSec": 97.37021605085566,
"promptChars": 12455,
"promptTokensEst": 3114,
"score": 75,
"stars": "★★★★☆",
"error": null,
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":64124,"totalTokens":5689,"avgTokPerSec":98.61378134916481,"promptChars":12098,"promptTokensEst":3025,"score":90,"stars":"★★★★★","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":126014,"totalTokens":11162,"avgTokPerSec":97.09858655726343,"promptChars":12101,"promptTokensEst":3025,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":3}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,69 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 64124,
"totalTokens": 5689,
"avgTokPerSec": 98.61378134916481,
"promptChars": 12098,
"promptTokensEst": 3025,
"score": 90,
"stars": "★★★★★",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 126014,
"totalTokens": 11162,
"avgTokPerSec": 97.09858655726343,
"promptChars": 12101,
"promptTokensEst": 3025,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":10,"testsFailed":2,"totalDurationMs":139308,"totalTokens":11782,"avgTokPerSec":96.85039238572556,"promptChars":11148,"promptTokensEst":2787,"score":70,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":132306,"totalTokens":11671,"avgTokPerSec":96.88921767777383,"promptChars":11267,"promptTokensEst":2817,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":126092,"totalTokens":11132,"avgTokPerSec":96.98598556369416,"promptChars":11292,"promptTokensEst":2823,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":3}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,71 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 10,
"testsFailed": 2,
"totalDurationMs": 139308,
"totalTokens": 11782,
"avgTokPerSec": 96.85039238572556,
"promptChars": 11148,
"promptTokensEst": 2787,
"score": 70,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 132306,
"totalTokens": 11671,
"avgTokPerSec": 96.88921767777383,
"promptChars": 11267,
"promptTokensEst": 2817,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 126092,
"totalTokens": 11132,
"avgTokPerSec": 96.98598556369416,
"promptChars": 11292,
"promptTokensEst": 2823,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 3
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":11,"testsPassed":9,"testsFailed":2,"totalDurationMs":75178,"totalTokens":9916,"avgTokPerSec":142.94675043471062,"promptChars":10516,"promptTokensEst":2629,"score":69,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":1,"fixRounds":5,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":98787,"totalTokens":12904,"avgTokPerSec":141.16873850064812,"promptChars":11810,"promptTokensEst":2953,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":81763,"totalTokens":10277,"avgTokPerSec":134.82946940948588,"promptChars":11534,"promptTokensEst":2884,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":88517,"totalTokens":11280,"avgTokPerSec":136.63597159351744,"promptChars":10568,"promptTokensEst":2642,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":9,"testsFailed":3,"totalDurationMs":87817,"totalTokens":11171,"avgTokPerSec":136.1538785139482,"promptChars":11627,"promptTokensEst":2907,"score":65,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,117 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 75178,
"totalTokens": 9916,
"avgTokPerSec": 142.94675043471062,
"promptChars": 10516,
"promptTokensEst": 2629,
"score": 69,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 5,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 98787,
"totalTokens": 12904,
"avgTokPerSec": 141.16873850064812,
"promptChars": 11810,
"promptTokensEst": 2953,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 81763,
"totalTokens": 10277,
"avgTokPerSec": 134.82946940948588,
"promptChars": 11534,
"promptTokensEst": 2884,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 88517,
"totalTokens": 11280,
"avgTokPerSec": 136.63597159351744,
"promptChars": 10568,
"promptTokensEst": 2642,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 9,
"testsFailed": 3,
"totalDurationMs": 87817,
"totalTokens": 11171,
"avgTokPerSec": 136.1538785139482,
"promptChars": 11627,
"promptTokensEst": 2907,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":79193,"totalTokens":10304,"avgTokPerSec":141.2083113764173,"promptChars":12199,"promptTokensEst":3050,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":10,"testsPassed":6,"testsFailed":4,"totalDurationMs":66764,"totalTokens":8896,"avgTokPerSec":142.57944640796882,"promptChars":12391,"promptTokensEst":3098,"score":56,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":76403,"totalTokens":9962,"avgTokPerSec":137.0023398819064,"promptChars":12432,"promptTokensEst":3108,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":13,"testsPassed":7,"testsFailed":6,"totalDurationMs":81345,"totalTokens":10535,"avgTokPerSec":139.42076339875726,"promptChars":11419,"promptTokensEst":2855,"score":52,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":72723,"totalTokens":9567,"avgTokPerSec":141.2709378394512,"promptChars":11416,"promptTokensEst":2854,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,117 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 79193,
"totalTokens": 10304,
"avgTokPerSec": 141.2083113764173,
"promptChars": 12199,
"promptTokensEst": 3050,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 10,
"testsPassed": 6,
"testsFailed": 4,
"totalDurationMs": 66764,
"totalTokens": 8896,
"avgTokPerSec": 142.57944640796882,
"promptChars": 12391,
"promptTokensEst": 3098,
"score": 56,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 76403,
"totalTokens": 9962,
"avgTokPerSec": 137.0023398819064,
"promptChars": 12432,
"promptTokensEst": 3108,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 7,
"testsFailed": 6,
"totalDurationMs": 81345,
"totalTokens": 10535,
"avgTokPerSec": 139.42076339875726,
"promptChars": 11419,
"promptTokensEst": 2855,
"score": 52,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 72723,
"totalTokens": 9567,
"avgTokPerSec": 141.2709378394512,
"promptChars": 11416,
"promptTokensEst": 2854,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":56798,"totalTokens":5105,"avgTokPerSec":99.4097006568848,"promptChars":11326,"promptTokensEst":2832,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":114297,"totalTokens":10163,"avgTokPerSec":97.19131591932717,"promptChars":12182,"promptTokensEst":3046,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":112008,"totalTokens":9892,"avgTokPerSec":97.0586619009377,"promptChars":12406,"promptTokensEst":3102,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,113 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 56798,
"totalTokens": 5105,
"avgTokPerSec": 99.4097006568848,
"promptChars": 11326,
"promptTokensEst": 2832,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 114297,
"totalTokens": 10163,
"avgTokPerSec": 97.19131591932717,
"promptChars": 12182,
"promptTokensEst": 3046,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 112008,
"totalTokens": 9892,
"avgTokPerSec": 97.0586619009377,
"promptChars": 12406,
"promptTokensEst": 3102,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":143640,"totalTokens":12611,"avgTokPerSec":96.28061629672216,"promptChars":12125,"promptTokensEst":3031,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":12,"testsPassed":12,"testsFailed":0,"totalDurationMs":116061,"totalTokens":10181,"avgTokPerSec":96.63321228455318,"promptChars":12435,"promptTokensEst":3109,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":113792,"totalTokens":10022,"avgTokPerSec":96.96815077469971,"promptChars":12260,"promptTokensEst":3065,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,113 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 143640,
"totalTokens": 12611,
"avgTokPerSec": 96.28061629672216,
"promptChars": 12125,
"promptTokensEst": 3031,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 116061,
"totalTokens": 10181,
"avgTokPerSec": 96.63321228455318,
"promptChars": 12435,
"promptTokensEst": 3109,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 113792,
"totalTokens": 10022,
"avgTokPerSec": 96.96815077469971,
"promptChars": 12260,
"promptTokensEst": 3065,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":10508,"promptTokensEst":2627,"score":0,"stars":"","error":"Puuttuvat: Cargo.toml, src/models.rs, src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,107 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 10508,
"promptTokensEst": 2627,
"score": 0,
"stars": "",
"error": "Puuttuvat: Cargo.toml, src/models.rs, src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":3,"testsPassed":0,"testsFailed":3,"totalDurationMs":217110,"totalTokens":21602,"avgTokPerSec":114.70956637458333,"promptChars":12612,"promptTokensEst":3153,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":3,"testsPassed":0,"testsFailed":3,"totalDurationMs":204772,"totalTokens":20717,"avgTokPerSec":114.45999021594592,"promptChars":12743,"promptTokensEst":3186,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":4,"testsPassed":0,"testsFailed":4,"totalDurationMs":180501,"totalTokens":18467,"avgTokPerSec":115.23583963958032,"promptChars":12392,"promptTokensEst":3098,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":25,"testsPassed":0,"testsFailed":25,"totalDurationMs":282681,"totalTokens":27665,"avgTokPerSec":111.29688837623901,"promptChars":12675,"promptTokensEst":3169,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":5,"testsPassed":0,"testsFailed":5,"totalDurationMs":171686,"totalTokens":17525,"avgTokPerSec":114.88288274375243,"promptChars":12618,"promptTokensEst":3155,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,117 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 3,
"testsPassed": 0,
"testsFailed": 3,
"totalDurationMs": 217110,
"totalTokens": 21602,
"avgTokPerSec": 114.70956637458333,
"promptChars": 12612,
"promptTokensEst": 3153,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 3,
"testsPassed": 0,
"testsFailed": 3,
"totalDurationMs": 204772,
"totalTokens": 20717,
"avgTokPerSec": 114.45999021594592,
"promptChars": 12743,
"promptTokensEst": 3186,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 4,
"testsPassed": 0,
"testsFailed": 4,
"totalDurationMs": 180501,
"totalTokens": 18467,
"avgTokPerSec": 115.23583963958032,
"promptChars": 12392,
"promptTokensEst": 3098,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 25,
"testsPassed": 0,
"testsFailed": 25,
"totalDurationMs": 282681,
"totalTokens": 27665,
"avgTokPerSec": 111.29688837623901,
"promptChars": 12675,
"promptTokensEst": 3169,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 5,
"testsPassed": 0,
"testsFailed": 5,
"totalDurationMs": 171686,
"totalTokens": 17525,
"avgTokPerSec": 114.88288274375243,
"promptChars": 12618,
"promptTokensEst": 3155,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":18,"testsPassed":0,"testsFailed":18,"totalDurationMs":208078,"totalTokens":20783,"avgTokPerSec":114.94478559756693,"promptChars":13278,"promptTokensEst":3320,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13362,"promptTokensEst":3341,"score":0,"stars":"","error":"Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":9,"testsPassed":0,"testsFailed":9,"totalDurationMs":221174,"totalTokens":22354,"avgTokPerSec":114.09551344946065,"promptChars":13234,"promptTokensEst":3309,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13317,"promptTokensEst":3329,"score":0,"stars":"","error":"Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":8795,"totalTokens":954,"avgTokPerSec":124.86009274372915,"promptChars":13335,"promptTokensEst":3334,"score":0,"stars":"☆☆☆☆☆","error":"fetch failed","profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,113 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 18,
"testsPassed": 0,
"testsFailed": 18,
"totalDurationMs": 208078,
"totalTokens": 20783,
"avgTokPerSec": 114.94478559756693,
"promptChars": 13278,
"promptTokensEst": 3320,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13362,
"promptTokensEst": 3341,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 9,
"testsPassed": 0,
"testsFailed": 9,
"totalDurationMs": 221174,
"totalTokens": 22354,
"avgTokPerSec": 114.09551344946065,
"promptChars": 13234,
"promptTokensEst": 3309,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13317,
"promptTokensEst": 3329,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 8795,
"totalTokens": 954,
"avgTokPerSec": 124.86009274372915,
"promptChars": 13335,
"promptTokensEst": 3334,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "fetch failed",
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":133173,"totalTokens":13174,"avgTokPerSec":117.52479437665707,"promptChars":14102,"promptTokensEst":3526,"score":30,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":5,"testsPassed":0,"testsFailed":5,"totalDurationMs":267561,"totalTokens":27021,"avgTokPerSec":113.5812238661422,"promptChars":14052,"promptTokensEst":3513,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13914,"promptTokensEst":3479,"score":0,"stars":"","error":"Puuttuvat: src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":162271,"totalTokens":16343,"avgTokPerSec":115.53039090208604,"promptChars":14062,"promptTokensEst":3516,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":211367,"totalTokens":21183,"avgTokPerSec":113.22772767359652,"promptChars":14038,"promptTokensEst":3510,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,115 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 133173,
"totalTokens": 13174,
"avgTokPerSec": 117.52479437665707,
"promptChars": 14102,
"promptTokensEst": 3526,
"score": 30,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 5,
"testsPassed": 0,
"testsFailed": 5,
"totalDurationMs": 267561,
"totalTokens": 27021,
"avgTokPerSec": 113.5812238661422,
"promptChars": 14052,
"promptTokensEst": 3513,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13914,
"promptTokensEst": 3479,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 162271,
"totalTokens": 16343,
"avgTokPerSec": 115.53039090208604,
"promptChars": 14062,
"promptTokensEst": 3516,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 211367,
"totalTokens": 21183,
"avgTokPerSec": 113.22772767359652,
"promptChars": 14038,
"promptTokensEst": 3510,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":38807,"totalTokens":5667,"avgTokPerSec":183.83891911423427,"promptChars":21818,"promptTokensEst":5455,"score":40,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":178290,"totalTokens":26265,"avgTokPerSec":168.77786498646262,"promptChars":21840,"promptTokensEst":5460,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":151603,"totalTokens":22725,"avgTokPerSec":170.74115131582644,"promptChars":21750,"promptTokensEst":5438,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":41059,"totalTokens":6288,"avgTokPerSec":183.76827829344424,"promptChars":21848,"promptTokensEst":5462,"score":40,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":187666,"totalTokens":27278,"avgTokPerSec":166.24197655672018,"promptChars":21694,"promptTokensEst":5424,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

View File

@@ -0,0 +1,117 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 38807,
"totalTokens": 5667,
"avgTokPerSec": 183.83891911423427,
"promptChars": 21818,
"promptTokensEst": 5455,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 178290,
"totalTokens": 26265,
"avgTokPerSec": 168.77786498646262,
"promptChars": 21840,
"promptTokensEst": 5460,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 151603,
"totalTokens": 22725,
"avgTokPerSec": 170.74115131582644,
"promptChars": 21750,
"promptTokensEst": 5438,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 41059,
"totalTokens": 6288,
"avgTokPerSec": 183.76827829344424,
"promptChars": 21848,
"promptTokensEst": 5462,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 187666,
"totalTokens": 27278,
"avgTokPerSec": 166.24197655672018,
"promptChars": 21694,
"promptTokensEst": 5424,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

View File

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":4,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":231122,"totalTokens":22952,"avgTokPerSec":113.75113825466987,"promptChars":17604,"promptTokensEst":4401,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":5,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":260314,"totalTokens":26144,"avgTokPerSec":113.40388181735229,"promptChars":17539,"promptTokensEst":4385,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":4,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":227228,"totalTokens":22381,"avgTokPerSec":113.5362722539456,"promptChars":17630,"promptTokensEst":4408,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":102052,"totalTokens":9984,"avgTokPerSec":117.77973450501808,"promptChars":17571,"promptTokensEst":4393,"score":30,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":146321,"totalTokens":14445,"avgTokPerSec":115.61186488022163,"promptChars":17589,"promptTokensEst":4397,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')}${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistetta</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

Some files were not shown because too many files have changed in this diff Show More