108 Commits

Author SHA1 Message Date
2d1b1d3ec6 initial commit: agentic office
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:14:39 +03:00
1d701ae75e CodeBench: --boss orchestrator: the large model analyzes, the small one fixes
- --boss + --boss-ollama: the large model writes the spec AND analyzes compile errors
- Boss formulates clear fix instructions ("add import X, change line Y")
- Worker (small model) implements the fix based on the precise instruction
- Boss does not generate code itself; it only analyzes and directs
2026-04-15 00:51:31 +03:00
a32c4787f8 CodeBench: plaintext spec for small models
- spec-plain.md: "entity Author (authors): name string, email string"
- extractPlainSpec() parser: plaintext → {entities, relationships}
- The small profile uses the plain format, large uses JSON
- specText variable: plaintext or JSON for the prompts
- Cannot break syntactically the way JSON can
2026-04-15 00:37:34 +03:00
6ccf6fb0e1 CodeBench: strip === marker === from file-by-file output 2026-04-15 00:28:57 +03:00
a3ea0c2fda CodeBench: file-by-file build validation only once all files are ready
- Phase 1: generate all 4 files in sequence (context grows)
- Phase 2: go build with all files → errors per file → fix
- Solves: models.go does not compile on its own (no main function)
- Errors are grouped by file; only the faulty file is fixed
2026-04-15 00:21:36 +03:00
3b1a02a9af CodeBench: go build output no longer leaks to the terminal (stdio: pipe) 2026-04-15 00:12:57 +03:00
56133b5d19 CodeBench: cleaner file-by-file error message 2026-04-15 00:12:20 +03:00
9670c85750 CodeBench: file-by-file immediate go build validation + fix loop
- Every file (except _test.go) is validated with go build immediately
- Compile errors are fed back to the model for fixing (max 2 attempts)
- An error is fixed in the context of a single file, not the whole project
- The test file is validated only at the end with go test (phase 5)
2026-04-15 00:10:06 +03:00
20a1e5f015 CodeBench: --convert-model Python→Go pipeline
- 8b generates Python (which it handles well), 30b converts it to Go
- convert-go.md prompt: FastAPI→Chi, SQLAlchemy→database/sql mappings
- Code generation uses the Python golden example + prompt when convert-model is set
- Phase [3.5/5] for the conversion
2026-04-14 23:54:45 +03:00
a16c33f4fb CodeBench: go.mod generation before the missing check 2026-04-14 23:37:09 +03:00
afef340eb8 CodeBench: file-by-file log gets tokens, line count and tok/s per file 2026-04-14 23:36:29 +03:00
a65a25c56c CodeBench: --file-by-file generation for small models
- Generates one file at a time: models.go → handlers.go → main.go → tests
- Code of the previous files is in context for the next one
- Max 2048 tok per file (vs 10240 for everything at once)
- go.mod is always generated from the golden example (not from model output)
- Prompt contains "Write ONLY the file X" + "Start with package main"
2026-04-14 23:31:20 +03:00
178bef1277 CodeBench: always replace go.mod with the golden version; small models produce wrong module paths 2026-04-14 23:14:07 +03:00
1649d2e864 CodeBench: --spec-ollama flag for a separate Ollama instance in the spec phases 2026-04-14 23:05:07 +03:00
3caefa2f6e CodeBench: automatic go.mod fix for small models 2026-04-14 23:02:44 +03:00
65e7365e75 Removed invalid 8b Go results (wrong prompt: code-small → Python) 2026-04-14 23:00:31 +03:00
9aa4a46768 CodeBench: language suffix takes priority in prompt selection (code-go > code-small) 2026-04-14 22:58:14 +03:00
61966783e3 CodeBench: spec-simple.md for small models
- Simplified JSON schema: no sa_type/py_type, only a type field
- The small profile automatically uses the spec-simple prompt
- Fewer fields per entity → smaller output → 8b copes
2026-04-14 22:06:29 +03:00
7bcba3daf8 CodeBench: --spec-model flag: a different model for the spec phases (1-2) 2026-04-14 21:43:44 +03:00
7d49d62f81 CodeBench: compacted Go handlers.go golden example, error handling on one line 2026-04-14 21:15:57 +03:00
5b8919ef89 CodeBench: seconds in the timestamp (T17-58 → T17-58-23) 2026-04-14 21:10:52 +03:00
a4942edb9f CodeBench: --no-orchestrate flag to skip orchestration 2026-04-14 21:08:34 +03:00
8fc31f2a53 CodeBench: per-round output dir + condensed Go golden example
- runPipeline takes a round parameter, dir: model__scenario__r1, __r2 etc.
- todo-go.md tests 6→4 (removed redundant list+update), 466→370 lines
2026-04-14 20:57:50 +03:00
01364b7031 CodeBench: fix the Go pipeline: file parser + go mod tidy
- parseGeneratedFiles regex: added .go and .mod extensions
- Markdown fence stripping: added go/gomod
- Dockerfile.go-test: go mod tidy before the tests (go.sum is generated)
- Tested: 6/6 golden example without go.sum
2026-04-14 19:27:19 +03:00
f3cd1347ab CodeBench: Go support: Chi + SQLite + httptest
- Golden example: todo-go/ (6/6 tests pass)
- todo-go.md golden reference
- prompts/code-go.md code generation prompt
- Dockerfile.go-test (golang:1.23-alpine)
- benchmark.mjs: LANG_CONFIG, parseTestOutput, prompt/golden selection for Go
- Usage: node benchmark.mjs --lang go --models qwen2.5-coder:32b
2026-04-14 19:20:18 +03:00
5ea2540588 CodeBench: prompts entirely in English; removed Finnish-language examples 2026-04-14 18:58:20 +03:00
b91253235e CodeBench: added a qwen2.5-coder:32b profile as a Rust candidate 2026-04-14 18:56:51 +03:00
ac2e3e92fc CodeBench: orchestration for all models and languages when >1 entity
Previously only small+python. Now all multi-entity scenarios are split
one entity at a time, including Rust and large models.
Author already works, so Article is easier as an incremental addition.
2026-04-14 18:47:34 +03:00
0975385101 CodeBench: reqwest 0.13 + Docker volume cache + rust:latest
- reqwest 0.12 → 0.13, rustls-tls → rustls (golden, Dockerfile, prompts)
- Docker volume cache: kipina-cargo-registry + kipina-cargo-target
- rust:latest (1.94) + cmake (required by aws-lc-sys)
- Dockerfile simplified: precompilation does not work, the volume handles it
- Golden example 10/10 tested with the new setup
2026-04-14 18:42:05 +03:00
bb8be3ffb4 CodeBench: revert-if-worse + separate testFixRounds counter
- Track the best test result (bestPassed/bestFiles)
- If a fix makes things worse: restore the best version and stop
- fixRounds counts only test fixes, not cargo check rounds
- Prevents 4/7 → 0/1 regressions in the fix loop
2026-04-14 18:24:46 +03:00
8fbb8eda2d CodeBench: precompile Rust dependencies into the Docker image, 35x faster
Dummy project with the same Cargo.toml: cargo check + cargo build --tests
baked into the image. At runtime only the project itself is compiled (~2.5s vs ~90s).
2026-04-14 18:16:15 +03:00
742f331d93 CodeBench: Rust cargo check phase before tests + self-correction of compile errors
- Phase 4/5: cargo check in a Docker container before running cargo test
- Compile errors are fed to the model for fixing (max 2 rounds)
- Prevents pointless cargo test runs when the code does not compile
2026-04-14 17:52:45 +03:00
2f602717b8 CodeBench: condensed todo-rs.md golden example 540→331 lines
- handlers.rs: more compact formatting, comments describe the pattern
- tests: 10 tests → 4 key tests (create, get, not_found, delete)
- spawn_server condensed
- All critical patterns preserved: RETURNING, fetch_optional, rows_affected
2026-04-14 17:50:19 +03:00
d003f73217 CodeBench: clear GPU memory at the start of every round
The clearVram() function unloads all Ollama models from VRAM before the test.
2026-04-14 17:47:55 +03:00
882bcece06 CodeBench: write results.json after every round
Intermediate results appear in the file immediately; no need to wait for the whole run to finish.
2026-04-14 17:45:54 +03:00
477c21efd0 CodeBench: Rust golden example: todo-rs.md + language-aware selection
- Created the todo-rs.md golden example from the Rust reference implementation
- getGoldenForModel() now respects LANG: todo.md → todo-rs.md in Rust mode
- Fixed the golden-compact-rs.md /:id → /{id} bug
- Root cause: the model received the Python golden example but had to generate Rust
2026-04-14 17:37:38 +03:00
088bad7b21 CodeBench: code-rs.md: spawn_server example, {id} reinforcement, init_db simplification
- Explicit spawn_server() code in the test prompt (async move wrapping)
- {param} route instruction stated twice, plus a chaining instruction
- init_db: .expect(), not Result
- "You MUST generate ALL 6 files"
2026-04-14 17:08:26 +03:00
de3e33d46e CodeBench: code-rs.md: fix three critical problems in the Rust prompt
- sqlx::query_as::<_, T>() runtime functions, NOT query_as!() compile-time macros
- Route path {id} syntax, not /:id (axum 0.8)
- app(pool) takes a SqlitePool and calls .with_state(pool)
- Added RETURNING, Result return types, testing guidance
2026-04-14 16:40:15 +03:00
dcdb360098 Benchmark results: orchestration raised the 8b blog score 0p → 80p (median)
Orchestrated, 5 rounds: [0, 80, 80, 0, 80] med:80
3/5 rounds passed 100% of the tests (11/11, 12/12, 11/11).
2/5 failed in the JSON spec phase (not an orchestration problem).
2026-04-14 15:45:27 +03:00
0b926c2cad CodeBench Level 3: orchestration: split one entity at a time for small models
Small profile + >1 entity → generate one entity at a time:
1. Author (a single entity, which 8b can handle)
2. Feed in the Author code → generate Post + FK
3. Merge and test

The large profile continues as before (everything at once).
2026-04-14 15:00:40 +03:00
a8f731d38e CodeBench: restore 8b todo-readme.md: combined is too big, the model gets lost
combined-readme caused new errors (POST 200 vs 201, 405).
The model cannot handle two examples at once.
2026-04-14 14:58:15 +03:00
5d0baf3ff1 CodeBench: combined-readme.md: todo + blog golden example for 8b
Both examples (single entity + FK relation) in one file.
1699 tokens, 10.4% of the context. 8b sees a concrete FK pattern.
2026-04-14 14:54:12 +03:00
8e9fbc5422 CodeBench: code-small: FK update test example (author_id included in PUT)
8b's test_update_article returned 422 because author_id was missing.
Added a concrete test_update_post example with the FK field.
2026-04-14 14:15:09 +03:00
06089a58b2 CodeBench: code-small: clarified the ForeignKey import (sqlalchemy, not .orm)
8b imports ForeignKey from the wrong place (sqlalchemy.orm).
Added an explicit "NOT from sqlalchemy.orm!" warning.
2026-04-14 14:05:59 +03:00
a25c52cff4 CodeBench: per-model golden example (profiles.json → golden field)
qwen3-coder:30b → todo.md (annotations)
qwen3:8b → todo-readme.md (GitHub README format, the most familiar training data)
The golden example is loaded dynamically per model inside the pipeline.
2026-04-14 14:04:28 +03:00
0c3303a640 CodeBench: clear VRAM automatically before a test run 2026-04-14 14:00:38 +03:00
ba48b737f2 CodeBench: --scenarios supports a single scenario (todo/users/blog) 2026-04-14 13:58:38 +03:00
a3f1ead3e6 CodeBench: code-small: test_list assert >= 1 (not == 1)
8b's blog run failed because test_list asserted an exact count
even though the tests share the same database.
2026-04-14 13:58:13 +03:00
7fe72480b1 CodeBench: qwen3:8b moved to the primary role, FK examples in the code-small prompt
profiles.json: role field (primary/backup/minimal/retired).
code-small.md: added a concrete FK pattern and test example
for relations; 8b's blog scenario failed because it could not handle FKs.
2026-04-14 13:55:40 +03:00
92964e322f CodeBench: per-model prompt profiles (profiles.json)
- profiles.json: model → profile → prompt mapping
- code-small.md: condensed prompt for small models (8b, 4b)
- benchmark automatically picks the right prompt based on the model
- qwen3-coder:30b → code.md (large), qwen3:8b → code-small.md (small)
2026-04-14 13:54:26 +03:00
e54c1b057c Golden example: exactly 6 tests per entity, no extras
The model generates test_search, test_filter etc. for endpoints that do not exist.
todo.md now lists exactly 6 tests per entity by name.
2026-04-14 12:56:50 +03:00
1de7e5c90b CodeBench: fast syntax check before the Docker run
py_compile checks the .py files in milliseconds.
Syntax error → straight to self-correction, skipping Docker (~10s saved).
2026-04-14 12:52:03 +03:00
e360896436 CodeBench Level 4: self-correcting loop: feed the pytest error to the model
If tests fail, the LLM gets the error message + the code and fixes it.
Max 3 fix rounds. Tested: qwen3:8b users 0/6 → fix → 6/6.
2026-04-14 12:46:06 +03:00
6a40ca5730 CodeBench: golden example in markdown format (code + explanations)
todo.md combines code and annotations: why the pattern was chosen,
what must NOT be done. 1567 tokens (vs raw 1340, compact 335).
Benchmark loads the .md version by default, falling back to the separate files.
2026-04-14 12:38:25 +03:00
2d470ee418 CodeBench: deprecated-patterns.md + inline deprecated rules in the prompt
Added SQLAlchemy 2.0, Pydantic v2, FastAPI and Python deprecated →
modern pattern pairs. Latest documentation checked 2026-04-14.
2026-04-14 12:28:35 +03:00
062e6af776 CodeBench: strengthen the CRITICAL rule: no extra fields
qwen3:14b added created_at from outside the spec and used
server_default=datetime.now (incorrect). Raised to CRITICAL level.
2026-04-14 12:27:10 +03:00
75870c1100 CodeBench: fix the timestamp rule: do not add extra fields, func import
func.now() caused a NameError because the models did not import func.
New approach: extra fields are forbidden, and IF the spec has
timestamps, use server_default + the func import.
2026-04-14 12:18:36 +03:00
6e83fad31d CodeBench: 3 new prompt rules from a 5-round error analysis
1. Timestamp fields: server_default=func.now(), not required in the Create schema
2. No extra filter/search endpoints
3. No extra fields from outside the spec
2026-04-14 12:14:36 +03:00
0f3310996e CodeBench: default URL 127.0.0.1 instead of localhost (Node 18 IPv6 issue) 2026-04-14 11:08:21 +03:00
e2a16b8ff6 CodeBench: interim report after every round
Shows the median, the scores of all rounds and the trend (▲▼─).
2026-04-14 11:04:51 +03:00
a0d3748faf CodeBench: --rounds N repeats the test runs 1-10 times
The round summary shows the median, min/max and pass rate per round.
Usage: node benchmark.mjs --models qwen3:14b --scenarios all --rounds 3
2026-04-14 11:03:00 +03:00
01b4fb8e22 CodeBench: --compact condenses the golden example into a template
Python: 1340 → 335 tokens (−75%)
Rust: 3383 → 445 tokens (−87%)
Usage: node benchmark.mjs --compact --models qwen3:4b
2026-04-14 10:59:39 +03:00
e7b33b7d6f CodeBench: Rust support (--lang rust), golden example todo-rs, Dockerfile.cargo-test
- golden-examples/todo-rs/: Axum 0.8 + SQLx + SQLite, 10 tests
- prompts/code-rs.md: Rust code generation prompt
- Dockerfile.cargo-test: rust:1.87-slim test container
- benchmark.mjs: --lang python|rust, language-dependent golden example,
  parser supports cargo test results, src/ subdirectories
2026-04-14 10:55:50 +03:00
9da5540ca2 Golden example: todo-rs (Axum + SQLx + SQLite) 2026-04-14 10:50:16 +03:00
838d5fbd73 SUPERAGENTS.md: model recommendations by VRAM class based on benchmark data
4GB qwen3:4b, 8GB qwen3:8b (95p), 16GB qwen3:14b (100p),
24GB qwen3-coder:30b (97p, 123 tok/s), 48GB both in parallel.
Thinking mode worsens results. General-purpose models beat the coder models.
2026-04-14 10:33:35 +03:00
d02f6a51c1 CodeBench: --think flag for testing thinking mode
think:true + 3× token limit (thinking takes ~2/3 of the tokens).
Usage: node benchmark.mjs --think --models qwen3:14b
2026-04-14 10:12:44 +03:00
8ba9ef83a3 CodeBench: num_ctx 16384: limit the context window to save VRAM
A 256K context reserves ~15 GB of KV cache even though the benchmark uses ~3K.
16K is plenty and saves a significant amount of VRAM.
2026-04-14 09:49:30 +03:00
f50dc884a3 CodeBench: automatic timestamp and archiving into the results/ folder
Output directory /tmp/kipina-benchmark/2026-04-14T12-30/
Results are copied automatically to results/<timestamp>.json/.html
2026-04-14 09:47:32 +03:00
7b27800390 Move kipina-codebench to the project top level 2026-04-14 09:44:14 +03:00
b93ae2fd1b Golden examples: README.md: guide for creating new examples 2026-04-14 09:44:02 +03:00
4c116428c3 kipina-codebench: standalone benchmark module as a git submodule
Refactored out of the tests/ folder into its own module:
- prompts/: all prompts as separate .md files
- golden-examples/: todo (level 1) + blog (level 2)
- benchmark.mjs loads prompts and examples dynamically
- Dockerfile.pytest, report-template.html, package.json, README.md
- results/: saved benchmark results
2026-04-14 09:42:20 +03:00
542230f091 Benchmark: prompt rule: update test data must include all required fields
qwen3-coder:30b users 6/7: test_update_user sent luontipaiva=None
to a NOT NULL field. The new rule prevents this.
2026-04-14 09:31:42 +03:00
c217271907 Benchmark results 2026-04-14: mistral family and top-3 comparison
top3: qwen3-coder:30b ★★★★★ 97p, codestral:22b ★★★★☆ 88p, qwen3.5:35b 40p
mistral: codestral:22b 80p, mistral-small3.1 30p, devstral:24b 44p
2026-04-14 09:30:40 +03:00
a08b5f3893 Benchmark: think:false: turn thinking off in Ollama calls
Thinking models (qwen3.5) spent all their tokens on thinking
and nothing was left for the content field. think:false forces
a direct answer without a thinking block.
2026-04-14 08:48:03 +03:00
25b9ab0c37 Benchmark: use the thinking field as a fallback if content is empty
qwen3.5 returns the answer in the thinking field when content is empty.
Added a debug log for thinking models.
2026-04-14 08:45:06 +03:00
62c9b6e17e Benchmark: raise token limits for thinking models
qwen3.5 returns its thinking in a separate thinking field;
content stays empty if the tokens run out.
Requirements 1024→2048, spec 2048→4096.
2026-04-14 08:42:32 +03:00
ad097ca712 Benchmark: the HTML report computes the scores itself (works with old results) 2026-04-14 08:29:47 +03:00
868d116961 Benchmark: HTML web reports of the results
Standalone HTML file containing:
- Summary cards (average, best model, fastest, tests)
- Per-model table with bar charts
- Individual results in a sortable table
- Dark mode, no external dependencies
2026-04-14 08:27:01 +03:00
02e3701d77 Benchmark: output tokens in the summary table per scenario and in total 2026-04-14 08:20:32 +03:00
b3abf4e89f Benchmark: per-model summary table + total time
Shows per model: tests and time per scenario, overall pass rate,
total time, average tok/s and average score.
2026-04-14 08:19:27 +03:00
9f2899b83d Benchmark: scoring (0-100) and star rating in the results
Scoring: spec 10p + code 10p + tests 60p + fixes 20p.
Stars: ★★★★★ (90+), ★★★★☆ (70+), ★★★☆☆ (50+), ★★☆☆☆ (25+), ★☆☆☆☆ (1+).
Shown on the per-run row, in the results table and in the summary.
2026-04-14 08:10:27 +03:00
4a811e4171 Benchmark: show the context size (prompt token estimate) in the results 2026-04-14 08:05:59 +03:00
8efbf96295 Golden example: blog (level 2, relations Author → Post)
13 tests, ForeignKey relation, unique Finnish test data
(Aleksis Kivi, Tove Jansson etc.). Tested in a Docker container.
2026-04-14 08:03:21 +03:00
16f40a7536 Benchmark: pytest runs in a Docker container (kipina-pytest)
The container handles uv init + uv add + pytest in an isolated environment.
Python 3.14, no VIRTUAL_ENV problems, full reproducibility.
Image: docker build -t kipina-pytest -f tests/Dockerfile.pytest tests/
2026-04-14 07:39:23 +03:00
42ee959781 Benchmark: uv init + uv add handles the project setup
The LLM now generates only 4 files (no pyproject.toml).
Pipeline: uv init → uv add deps → write .py → pytest.
Removes the Poetry compatibility problems entirely.
2026-04-14 07:34:06 +03:00
0850a139f1 Benchmark: fallback replaces a Poetry pyproject.toml with the PEP 621 version
All tested models generate the [tool.poetry] format
even though the golden example shows the [project] format.
uv sync does not understand Poetry → pytest is not installed → the run fails.
The fallback replaces the broken pyproject.toml with the golden version.
2026-04-14 07:30:55 +03:00
d6a544909c Benchmark: golden example + zensical documentation guidelines
- golden-examples/todo/: 6/6 PASS reference implementation
  - SQLAlchemy 2.0 (DeclarativeBase, Mapped, mapped_column)
  - Pydantic v2 (ConfigDict)
  - PEP 621 pyproject.toml, Python >=3.14
  - Unique test data per test
- CODE_SYSTEM updated: few-shot from the golden example
- DOCUMENTATION.md: zensical documentation guidelines
2026-04-14 07:28:47 +03:00
8f154a578c SUPERAGENTS.md: benchmark architecture for verifying progress
Multi-dimensional scoring (0-100), graded difficulty levels
(CRUD → relations → business logic → advanced patterns),
history comparison and regression detection.
2026-04-14 07:16:37 +03:00
7221f5e920 SUPERAGENTS.md: architecture and implementation plan for a self-learning coding agent 2026-04-14 07:14:17 +03:00
34a56e408d Benchmark: stripThinking also supports qwen3/3.5 <think> tags 2026-04-14 06:58:18 +03:00
ecd4bc2ac3 Benchmark: raise the code generation token limit 4096 → 8192
gemma4:e4b produced 323 lines and ran out of tokens,
so pyproject.toml did not fit in the response.
2026-04-14 06:38:40 +03:00
7dc2af59c3 Benchmark: stripThinking removes gemma4 thinking tags from responses 2026-04-14 06:35:31 +03:00
4aa09e1025 Benchmark: the LLM generates code instead of templates
Phase 3 now uses a real LLM call (CODE_SYSTEM prompt)
for code generation. Template functions removed entirely.
This measures the model's actual code generation ability.
2026-04-13 22:23:35 +03:00
20cea8f268 Model benchmark: systematically tests all Ollama models
Runs a full pipeline round per model × scenario:
1. Client prompt → requirements
2. Manager/SPEC_SYSTEM → JSON spec
3. Template generation → code
4. Validation + LLM fix loop
5. uv sync + pytest

Produces a comparison table: spec quality, test results, speed.
Supports direct Ollama (--ollama) and the hub route (--hub).
2026-04-13 22:08:47 +03:00
38a18c555b Debug: routing logs all nodes and their states 2026-04-13 21:53:40 +03:00
8138e41aa1 native node tuning 2026-04-13 21:29:05 +03:00
6ee5bdf960 Native node: warm-up call loads the model into VRAM right at startup 2026-04-13 21:23:56 +03:00
cf3bf54bf8 kipina-node: automatic version update via build hash
Removed the interactive "do you want to replace?" prompt. Instead:
- Bootstrap fetches .build-hash from the server on every startup
- Compares it with the local kipina-node-bin.hash
- Downloads the new binary automatically if the hash differs
- Shows the version at startup

No more situations where an old binary accidentally keeps running.
2026-04-13 21:21:48 +03:00
56f21a96c9 TUI: VRAM status color-coded (green=100% GPU, yellow=partial, red=CPU) 2026-04-13 21:12:50 +03:00
763b93396c Routing: busy nodes are filtered out, so work spreads across nodes
Previously the busy lock was read but not used for filtering,
so the same node was always picked again even when it was busy.
Now the matching list filters out busy nodes, so another
free node gets the task. Heavy requests fall back to a lighter node
if all large models are busy.
2026-04-13 21:09:24 +03:00
e09962940a Native node: VRAM status in the TUI (ollama ps)
- fetch_ps(): fetches /api/ps and returns ModelVramStatus
- ModelVramStatus: size vs size_vram → 100% GPU / partial / CPU
- TUI: new "VRAM: ✓ qwen3:32b (20.1 GB) — 100% GPU" row
- Background refresh every 30 s
- Fresh linux-x86_64 binary
2026-04-13 21:06:27 +03:00
5e44b63b0c Native node: fresh linux-x86_64 binary (reconnect, timestamp, node_id) 2026-04-13 16:54:28 +03:00
0f3881aa02 Fix: async RwLock read before the Mutex scope (Send compatibility) 2026-04-13 16:34:51 +03:00
fa85dcc5b3 Smart routing: capability=heavy prioritizes the node with the largest model
Hub:
- Parses the parameter count (B) of the largest model from node_models
  per node (e.g. qwen3:32b → 32, qwen2.5-coder:7b → 7)
- Stores node_max_param_b: HashMap<u64, u32>
- ChatCompletionRequest: new capability field ("heavy"/"light")
- Routing logic: capability=heavy → picks the node with
  the largest model; default → native first as before

Frontend (pipeline):
- JSON spec generation: capability=heavy
- QA fix loop code fix: capability=heavy
- Observer/README evaluation: capability=heavy
- Requirements (Client): default (light, a small model is fine)

This lets the Qwen3:32B running on the A40 machine
take the heavy tasks while the browser's 0.5B model handles the light ones.
2026-04-13 16:30:47 +03:00
58d93613f0 Hero images: the real kipina.tech images (forge, serpent, gecko) 2026-04-13 14:33:11 +03:00
66b4435362 Theme picker: button cycles gecko/forge/serpent, default forge
- Theme button (emoji) in the top right corner, as on kipina.tech
- Default forge (cyan), stored in localStorage
- Hero image changes with the theme using a fade effect
- Three hero images: gecko_hero, forge_hero (spider), serpent_hero
2026-04-13 14:29:14 +03:00
3a00de9b8e Three kipina.tech themes: gecko, forge, serpent, picked at random
Brings the three visual themes of kipina.tech to kipina.studio:
- gecko: warm gold/orange (#ff7b00)
- forge: cyber blue/cyan (#00e5ff)
- serpent: neon turquoise (#00ffff)

The theme is drawn at random on every page load. All previously
hardcoded #ff6b00 accent colors are replaced with CSS variables
(--hero-accent, --hero-glow) that adapt to the theme.
2026-04-13 14:22:33 +03:00
670141c8c3 QA fix loop: validation delegates problems to the Coder agent
Previously the mechanical validateProjectCode() only listed problems in the terminal.
Now the pipeline works like this:
1. The QA agent runs the mechanical validation
2. If there are problems → groups them by file
3. Delegates the fix of each file to the right agent (Coder/Data/QA)
4. The agent (LLM) returns the fixed file
5. Validation is run again, max 2 fix rounds
6. The final result is shown in green/red
7. The Observer evaluates the final version

All fix steps are saved to promptLog → visible in the learning path.
2026-04-13 14:09:10 +03:00
181 changed files with 23944 additions and 73 deletions


@@ -0,0 +1,4 @@
FROM rust:latest
RUN apt-get update && apt-get install -y pkg-config libssl-dev cmake && rm -rf /var/lib/apt/lists/*
WORKDIR /work
ENTRYPOINT ["sh", "-c", "cp -r /src/* . && cargo test 2>&1"]


@@ -0,0 +1,4 @@
FROM golang:1.23-alpine
RUN apk add --no-cache gcc musl-dev
WORKDIR /work
ENTRYPOINT ["sh", "-c", "cp -r /src/* . && go mod tidy 2>&1 && go test -v -count=1 ./... 2>&1"]


@@ -0,0 +1,5 @@
FROM python:3.14-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /work
ENV PYTHONPATH=/work
ENTRYPOINT ["sh", "-c", "uv init --no-readme --python '>=3.14' 2>/dev/null && rm -f hello.py main.py && uv add fastapi 'uvicorn[standard]' sqlalchemy pytest httpx 2>/dev/null && cp /src/*.py . && rm -f app.db test.db && uv run pytest test_main.py -v --tb=short 2>&1"]


@@ -0,0 +1,95 @@
# Kipinä CodeBench
LLM code generation benchmark. Tests the ability of Ollama models to generate working FastAPI+SQLAlchemy projects and runs the tests in a Docker container.
## Quick start
```bash
# 1. Build the Docker test container
docker build -t kipina-pytest -f Dockerfile.pytest .
# 2. Run the benchmark
node benchmark.mjs --ollama http://localhost:11434 --scenarios all
# 3. Open the report
open /tmp/kipina-benchmark/report.html
```
## Pipeline
```
1. LLM → requirements specification (prompts/client.md)
2. LLM → JSON spec (prompts/spec.md)
3. LLM → 4 Python files (prompts/code.md + golden-examples/)
4. Static validation + LLM fix (prompts/fix.md)
5. Docker: uv init + uv add + pytest
```
## CLI arguments
| Argument | Default | Description |
|-----------|--------|--------|
| `--ollama` | `http://localhost:11434` | Ollama server URL |
| `--hub` | - | Hub route (alternative to Ollama) |
| `--models` | all | Comma-separated list of models |
| `--scenarios` | `default` (todo) | `all` = todo, users, blog |
| `--output` | `/tmp/kipina-benchmark` | Output directory |
## Directory structure
```
kipina-codebench/
├── benchmark.mjs ← runner
├── Dockerfile.pytest ← Python 3.14 + uv test container
├── report-template.html ← HTML report template
├── package.json
├── prompts/ ← editable prompts
│ ├── client.md ← requirements specification
│ ├── spec.md ← JSON spec
│ ├── code.md ← code generation
│ └── fix.md ← fixing
├── golden-examples/ ← reference implementations
│ ├── todo/ ← level 1: basic CRUD (6 tests)
│ ├── blog/ ← level 2: relations (13 tests)
│ └── DOCUMENTATION.md ← zensical documentation guidelines
└── results/ ← saved results
```
## Editing prompts
The prompts are Markdown files in the `prompts/` folder. Edit them directly; the benchmark loads them at startup.
Example: add a rule to `prompts/code.md`:
```
- Tests: PUT/update test data MUST include ALL required fields
```
## Golden examples
`golden-examples/todo/` is fed to the LLM as a reference. The model sees exactly what kind of code is expected:
- SQLAlchemy 2.0 (DeclarativeBase, Mapped, mapped_column)
- Pydantic v2 (ConfigDict)
- Python 3.14 syntax (str | None)
- Unique test data per test
Add new examples by creating a directory (e.g. `golden-examples/shop/`).
## Scoring
| Component | Points | Basis |
|---|---|---|
| Spec OK | 10p | JSON spec succeeded |
| Code generated | 10p | All 4 files were produced |
| Tests | 0-60p | passed/total × 60 |
| Fixes | 0-20p | 0 rounds = 20p, 1 = 10p, 2+ = 0p |
Stars: ★★★★★ (90+), ★★★★☆ (70+), ★★★☆☆ (50+), ★★☆☆☆ (25+), ★☆☆☆☆ (1+)
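The scoring table and star thresholds can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `benchmark.mjs`: the function names `score` and `stars` are hypothetical, rounding the test component to the nearest integer is an assumption, and the table does not say what rating a score of 0 receives.

```python
def score(spec_ok: bool, code_ok: bool, passed: int, total: int, fix_rounds: int) -> int:
    """Combine the four components of the scoring table into a 0-100 score."""
    points = 0
    points += 10 if spec_ok else 0   # spec OK: 10p
    points += 10 if code_ok else 0   # code generated: 10p
    # Tests: passed/total x 60. Rounding to the nearest integer is an assumption.
    points += round(passed / total * 60) if total > 0 else 0
    # Fixes: 0 rounds = 20p, 1 round = 10p, 2 or more = 0p.
    points += {0: 20, 1: 10}.get(fix_rounds, 0)
    return points


def stars(points: int) -> str:
    """Map a score to the star thresholds (90+, 70+, 50+, 25+, 1+)."""
    thresholds = [(90, "★★★★★"), (70, "★★★★☆"), (50, "★★★☆☆"), (25, "★★☆☆☆"), (1, "★☆☆☆☆")]
    for threshold, rating in thresholds:
        if points >= threshold:
            return rating
    return "☆☆☆☆☆"  # assumption: the table defines no rating for 0 points
```

For example, a run with a valid spec, all files generated, 3 of 6 tests passing and one fix round would come to 10 + 10 + 30 + 10 = 60 points, which falls in the three-star band.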
## Usage as a git submodule
```bash
git submodule add <repo-url> tools/codebench
cd tools/codebench
docker build -t kipina-pytest -f Dockerfile.pytest .
node benchmark.mjs --ollama http://localhost:11434 --scenarios all
```

File diff suppressed because it is too large


@@ -0,0 +1,84 @@
# Documentation guidelines: Zensical
Good documentation says **what a thing IS**, not what it does. It is like a zen koan: short, precise, sufficient.
## Principles
1. **One line is enough.** If you need a paragraph, the code is too complex.
2. **Say what, not how.** `"""Database models: SQLAlchemy 2.0, SQLite."""` not `"""This module creates database models using SQLAlchemy..."""`
3. **Do not repeat the code.** If the function is `create_todo`, the docstring is not "Creates a todo".
4. **Finnish or English, not both.** Pick one language per project.
5. **No filler words.** "This module provides functionality for" → delete.
## What to document
| Target | Documentation | Example |
|-------|--------------|-----------|
| **Module** (.py) | Always. One line: what the file contains. | `"""Pydantic v2 schemas: Create and Response."""` |
| **Class** | Always. What the entity represents. | `"""Task: title, deadline, priority."""` |
| **Function** | Only if the name does not say it all. | `get_db` → `"""Database session per request."""` |
| **CRUD endpoint** | No. Name + HTTP method is enough. | `create_todo`, `list_todos`: self-documenting |
| **Test** | No. The test name is the documentation. | `test_get_todo_not_found`: self-explanatory |
| **Configuration** | A comment only if the value is surprising. | `check_same_thread: False # SQLite + FastAPI` |
## What NOT to document
- Imports
- Obvious parameters (`item_id: int`)
- Type hints that say the same thing
- Generic "boilerplate" docstrings
## Examples
### Good (zensical)
```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""

class Todo(Base):
    """Task: title, description, deadline, priority and status."""
    ...

def get_db():
    """Database session per request."""
    ...
```
### Bad (verbose)
```python
"""
This module defines the database models for the Todo application.
It uses SQLAlchemy ORM to create the database tables and provides
the session factory for database connections.
"""

class Todo(Base):
    """
    Represents a todo item in the database.

    Attributes:
        id: The unique identifier for the todo item.
        title: The title of the todo item.
        ...
    """
    ...
```
### Bad (empty)
```python
# No docstrings at all: the reader does not know what the file's role is
class Todo(Base):
    __tablename__ = "todos"
    ...
```
## Checklist
Generated code is well documented when:
- [ ] Every .py file starts with a one-line docstring
- [ ] Every class states what the entity represents
- [ ] Docstrings are in the same language as the rest of the code
- [ ] CRUD endpoints have no unnecessary docstrings
- [ ] Comments appear only where the code is surprising
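
The first two checklist items can be verified mechanically. The following is a minimal sketch using Python's standard `ast` module. The function name `check_docstrings` is hypothetical and not part of the benchmark, and skipping classes whose body is only `pass` (such as a declarative `Base`) is an assumption about what counts as an entity.

```python
import ast


def check_docstrings(source: str) -> list[str]:
    """Return checklist violations found in one Python source string."""
    tree = ast.parse(source)
    problems = []

    # Item 1: the file starts with a one-line module docstring.
    module_doc = ast.get_docstring(tree)
    if module_doc is None:
        problems.append("module: missing docstring")
    elif len(module_doc.splitlines()) != 1:
        problems.append("module: docstring is not a single line")

    # Item 2: every class has a docstring.
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        # Assumption: a class whose body is only `pass` is not an entity.
        if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
            continue
        if ast.get_docstring(node) is None:
            problems.append(f"class {node.name}: missing docstring")
    return problems
```

The remaining items (language consistency, unnecessary docstrings, surprising code) need human judgment and are left out of the sketch.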


@@ -0,0 +1,123 @@
# Golden Examples: reference implementations
Golden examples are **complete, tested** FastAPI projects that the LLM uses as a model during code generation. The model sees the example and produces an equivalent structure for a new project.
## Creating a new example
### 1. Create a directory
```bash
mkdir golden-examples/shop
```
Name the directory after the scenario (todo, blog, shop, booking...).
### 2. Create 4 files
| File | Contents |
|----------|---------|
| `models.py` | SQLAlchemy 2.0 models (DeclarativeBase, Mapped, mapped_column) |
| `schemas.py` | Pydantic v2 schemas (ConfigDict, `str \| None` syntax) |
| `main.py` | FastAPI CRUD endpoints (POST 201, GET, GET/:id 404, PUT, DELETE 204) |
| `test_main.py` | Pytest + TestClient, separate test.db, unique data per test |
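
The file layout in the table can be checked before any tests are run. This is a minimal sketch that assumes only the four file names listed in the table. `missing_files` is a hypothetical helper, not part of the benchmark.

```python
from pathlib import Path

# The four files every golden example is expected to contain (from the table).
REQUIRED_FILES = ("models.py", "schemas.py", "main.py", "test_main.py")


def missing_files(example_dir: Path) -> list[str]:
    """Return the required file names that are absent from example_dir."""
    return [name for name in REQUIRED_FILES if not (example_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical path, matching the mkdir example in step 1.
    print(missing_files(Path("golden-examples/shop")))
```

An empty list means the directory is structurally complete. It says nothing about whether the contents follow the conventions.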
### 3. Follow the conventions
**Python version:** >=3.14
**SQLAlchemy 2.0** (not legacy):
```python
# Right
class Base(DeclarativeBase):
    pass


class Todo(Base):
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    status: Mapped[str] = mapped_column(String(20), default="pending")


# Wrong
Base = declarative_base()
id = Column(Integer, primary_key=True)
```
**Pydantic v2** (not v1):
```python
# Right
class TodoResponse(TodoCreate):
    id: int
    model_config = ConfigDict(from_attributes=True)


# Wrong
class Config:
    orm_mode = True
```
**Typing:**
```python
# Right
description: Mapped[str | None] = mapped_column(Text, default=None)
# Wrong
description: Mapped[Optional[str]]
```
**Documentation (zensical):**
```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""


class Todo(Base):
    """Task: title, description, deadline, priority and status."""
```
One line is enough. Say what the thing IS, not what it does. See [DOCUMENTATION.md](DOCUMENTATION.md).
**Test data (unique and descriptive):**
```python
# Right
def test_create_todo():
    response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})


def test_update_todo():
    created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()


# Wrong: generic data
def test_create_todo():
    response = client.post("/todos/", json={"title": "test", "priority": 1})
```
### 4. Test in a Docker container
```bash
rm -rf /tmp/golden-test && mkdir /tmp/golden-test
cp golden-examples/shop/*.py /tmp/golden-test/
docker run --rm -v /tmp/golden-test:/src:ro kipina-pytest
```
**All tests must pass.** No warnings, no deprecation messages.
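
One way to make warnings fail the run is pytest's `-W error` option, which turns Python warnings (including `DeprecationWarning`) into errors. The sketch below assumes pytest is installed for the current interpreter. Whether the `kipina-pytest` image already applies such a flag is not stated here, and `strict_pytest_command` and `run_strict` are hypothetical names.

```python
import subprocess
import sys
from pathlib import Path


def strict_pytest_command(test_dir: Path) -> list[str]:
    """Build a pytest invocation that treats every warning as an error."""
    return [sys.executable, "-m", "pytest", "-W", "error", str(test_dir)]


def run_strict(test_dir: Path) -> bool:
    """Run the tests and report whether pytest exited with status 0."""
    result = subprocess.run(strict_pytest_command(test_dir), check=False)
    return result.returncode == 0
```

With this flag a deprecation warning raised by a legacy import surfaces as a failing test instead of a line in the summary.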
### 5. Difficulty levels
| Level | Examples | Challenge |
|------|-----------|--------|
| 1: Basic CRUD | `todo/`, `users/`, `notes/` | One entity |
| 2: Relations | `blog/`, `library/`, `school/` | Foreign key, 2-3 entities |
| 3: Business logic | `shop/`, `booking/` | Custom endpoints, validation |
Start at level 1 and work upward. Level 1 examples must stay simple, because they teach the model the basic structure.
## How the examples affect generation
The benchmark loads the `todo/` example and feeds it to the LLM as part of the code generation prompt:
```
REFERENCE IMPLEMENTATION (todo project — follow this exact structure):
=== models.py ===
<contents of todo/models.py>
=== schemas.py ===
...
```
The model sees a precise example and produces an equivalent structure for a new project. The better the example, the better the result.
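
The `=== file ===` sections of that prompt could be assembled roughly as follows. This is a sketch under stated assumptions, not the benchmark's actual code: the names `REFERENCE_FILES`, `format_reference` and `load_example` are hypothetical, and the file order beyond `models.py` and `schemas.py` is assumed.

```python
from pathlib import Path

# Assumed order; the prompt excerpt only shows models.py followed by schemas.py.
REFERENCE_FILES = ("models.py", "schemas.py", "main.py", "test_main.py")


def format_reference(files: dict[str, str]) -> str:
    """Join file contents under '=== name ===' markers, in REFERENCE_FILES order."""
    sections = []
    for name in REFERENCE_FILES:
        if name in files:
            sections.append(f"=== {name} ===\n{files[name].rstrip()}")
    return "\n".join(sections)


def load_example(example_dir: Path) -> dict[str, str]:
    """Read whichever reference files exist in one golden example directory."""
    return {
        name: (example_dir / name).read_text(encoding="utf-8")
        for name in REFERENCE_FILES
        if (example_dir / name).is_file()
    }
```

Calling `format_reference(load_example(Path("golden-examples/todo")))` would give the body that follows the `REFERENCE IMPLEMENTATION` header line.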


@@ -0,0 +1,110 @@
"""FastAPI CRUD: two endpoint sets, Author and Post."""

from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from models import SessionLocal, Author, Post
from schemas import AuthorCreate, AuthorResponse, PostCreate, PostResponse

app = FastAPI()


def get_db():
    """Database session per request."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


# --- Author ---


@app.post("/authors/", response_model=AuthorResponse, status_code=201)
def create_author(item: AuthorCreate, db: Session = Depends(get_db)):
    db_item = Author(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/authors/", response_model=list[AuthorResponse])
def list_authors(db: Session = Depends(get_db)):
    return db.query(Author).all()


@app.get("/authors/{item_id}", response_model=AuthorResponse)
def get_author(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Author).filter(Author.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Author not found")
    return item


@app.put("/authors/{item_id}", response_model=AuthorResponse)
def update_author(item_id: int, item: AuthorCreate, db: Session = Depends(get_db)):
    db_item = db.query(Author).filter(Author.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Author not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/authors/{item_id}", status_code=204)
def delete_author(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Author).filter(Author.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Author not found")
    db.delete(db_item)
    db.commit()


# --- Post ---


@app.post("/posts/", response_model=PostResponse, status_code=201)
def create_post(item: PostCreate, db: Session = Depends(get_db)):
    db_item = Post(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/posts/", response_model=list[PostResponse])
def list_posts(db: Session = Depends(get_db)):
    return db.query(Post).all()


@app.get("/posts/{item_id}", response_model=PostResponse)
def get_post(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Post).filter(Post.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Post not found")
    return item


@app.put("/posts/{item_id}", response_model=PostResponse)
def update_post(item_id: int, item: PostCreate, db: Session = Depends(get_db)):
    db_item = db.query(Post).filter(Post.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Post not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/posts/{item_id}", status_code=204)
def delete_post(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Post).filter(Post.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Post not found")
    db.delete(db_item)
    db.commit()


@@ -0,0 +1,45 @@
"""Database models: SQLAlchemy 2.0, Mapped typing, ForeignKey relations."""

from datetime import datetime

from sqlalchemy import String, Text, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Author(Base):
    """Author: name, email and bio."""

    __tablename__ = "authors"

    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    name: Mapped[str] = mapped_column(String(255))
    email: Mapped[str] = mapped_column(String(255), unique=True)
    bio: Mapped[str | None] = mapped_column(Text, default=None)
    posts: Mapped[list["Post"]] = relationship(back_populates="author")


class Post(Base):
    """Blog post: title, content, author, publication time and status."""

    __tablename__ = "posts"

    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    content: Mapped[str] = mapped_column(Text)
    author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
    published_at: Mapped[datetime | None] = mapped_column(DateTime, default=None)
    status: Mapped[str] = mapped_column(String(20), default="draft")
    author: Mapped["Author"] = relationship(back_populates="posts")


Base.metadata.create_all(bind=engine)


@@ -0,0 +1,37 @@
"""Pydantic v2 schemas: Create for input, Response for output."""

from datetime import datetime

from pydantic import BaseModel, ConfigDict


class AuthorCreate(BaseModel):
    """Creating a new author. Required: name, email."""

    name: str
    email: str
    bio: str | None = None


class AuthorResponse(AuthorCreate):
    """Returned author, including the id."""

    id: int
    model_config = ConfigDict(from_attributes=True)


class PostCreate(BaseModel):
    """Creating a new post. Required: title, content, author_id."""

    title: str
    content: str
    author_id: int
    published_at: datetime | None = None
    status: str = "draft"


class PostResponse(PostCreate):
    """Returned post, including the id."""

    id: int
    model_config = ConfigDict(from_attributes=True)


@@ -0,0 +1,164 @@
"""Pytest: TestClient, separate test.db, unique data per test."""

from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from main import app, get_db
from models import Base

test_engine = create_engine(
    "sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def _create_author(name="Eino Leino", email=None):
    """Helper for creating an author in tests."""
    if email is None:
        email = f"{name.lower().replace(' ', '.')}@example.com"
    return client.post(
        "/authors/", json={"name": name, "email": email}
    ).json()


# --- Author tests ---


def test_create_author():
    response = client.post(
        "/authors/",
        json={"name": "Aleksis Kivi", "email": "aleksis@example.com", "bio": "Suomen kansalliskirjailija"},
    )
    assert response.status_code == 201
    assert response.json()["name"] == "Aleksis Kivi"
    assert response.json()["bio"] == "Suomen kansalliskirjailija"
    assert "id" in response.json()


def test_list_authors():
    _create_author("Minna Canth", "minna.canth@example.com")
    response = client.get("/authors/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_author_by_id():
    created = _create_author("Väinö Linna", "vaino.linna@example.com")
    response = client.get(f"/authors/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_author_not_found():
    response = client.get("/authors/99999")
    assert response.status_code == 404


def test_update_author():
    created = _create_author("Vanha Nimi", "vanha.nimi@example.com")
    response = client.put(
        f"/authors/{created['id']}",
        json={"name": "Uusi Nimi", "email": "uusi.nimi@example.com"},
    )
    assert response.status_code == 200
    assert response.json()["name"] == "Uusi Nimi"


def test_delete_author():
    created = _create_author("Poistettava Kirjailija", "poistettava@example.com")
    response = client.delete(f"/authors/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/authors/{created['id']}")
    assert response.status_code == 404


# --- Post tests ---


def test_create_post():
    author = _create_author("Tove Jansson", "tove.jansson@example.com")
    response = client.post(
        "/posts/",
        json={"title": "Muumipeikko ja pyrstötähti", "content": "Eräänä aamuna...", "author_id": author["id"]},
    )
    assert response.status_code == 201
    assert response.json()["title"] == "Muumipeikko ja pyrstötähti"
    assert response.json()["author_id"] == author["id"]
    assert response.json()["status"] == "draft"


def test_list_posts():
    author = _create_author("Juhani Aho", "juhani.aho@example.com")
    client.post(
        "/posts/",
        json={"title": "Rautatie", "content": "Junasta kertova novelli.", "author_id": author["id"]},
    )
    response = client.get("/posts/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_post_by_id():
    author = _create_author("Elias Lönnrot", "elias.lonnrot@example.com")
    created = client.post(
        "/posts/",
        json={"title": "Kalevala", "content": "Vaka vanha Väinämöinen.", "author_id": author["id"]},
    ).json()
    response = client.get(f"/posts/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_post_not_found():
    response = client.get("/posts/99999")
    assert response.status_code == 404


def test_update_post():
    author = _create_author("Joel Lehtonen", "joel.lehtonen@example.com")
    created = client.post(
        "/posts/",
        json={"title": "Vanha otsikko", "content": "Alkuperäinen teksti.", "author_id": author["id"]},
    ).json()
    response = client.put(
        f"/posts/{created['id']}",
        json={"title": "Päivitetty otsikko", "content": "Muokattu teksti.", "author_id": author["id"], "status": "published"},
    )
    assert response.status_code == 200
    assert response.json()["title"] == "Päivitetty otsikko"
    assert response.json()["status"] == "published"


def test_delete_post():
    author = _create_author("Aino Kallas", "aino.kallas@example.com")
    created = client.post(
        "/posts/",
        json={"title": "Poistettava postaus", "content": "Tämä poistetaan.", "author_id": author["id"]},
    ).json()
    response = client.delete(f"/posts/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/posts/{created['id']}")
    assert response.status_code == 404


def test_post_belongs_to_author():
    author = _create_author("Sofi Oksanen", "sofi.oksanen@example.com")
    post = client.post(
        "/posts/",
        json={"title": "Puhdistus", "content": "Romaani Virosta.", "author_id": author["id"]},
    ).json()
    assert post["author_id"] == author["id"]


@@ -0,0 +1,204 @@
# Example 1: Todo App (single entity)
## models.py
```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""
from datetime import date
from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Todo(Base):
    __tablename__ = "todos"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    description: Mapped[str | None] = mapped_column(Text, default=None)
    due_date: Mapped[date | None] = mapped_column(Date, default=None)
    priority: Mapped[int] = mapped_column(default=1)
    status: Mapped[str] = mapped_column(String(20), default="pending")


Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
from datetime import date
from pydantic import BaseModel, ConfigDict


class TodoCreate(BaseModel):
    title: str
    description: str | None = None
    due_date: date | None = None
    priority: int = 1
    status: str = "pending"


class TodoResponse(TodoCreate):
    id: int
    model_config = ConfigDict(from_attributes=True)
```
## test_main.py — exactly 6 tests per entity
```python
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from main import app, get_db
from models import Base

test_engine = create_engine("sqlite:///./test.db", connect_args={"check_same_thread": False})
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def test_create_todo():
    response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
    assert response.status_code == 201
    assert "id" in response.json()


def test_list_todos():
    client.post("/todos/", json={"title": "Listattava"})
    response = client.get("/todos/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_todo_by_id():
    created = client.post("/todos/", json={"title": "Haettava"}).json()
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 200


def test_get_todo_not_found():
    response = client.get("/todos/99999")
    assert response.status_code == 404


def test_update_todo():
    created = client.post("/todos/", json={"title": "Vanha"}).json()
    response = client.put(f"/todos/{created['id']}", json={"title": "Uusi"})
    assert response.status_code == 200


def test_delete_todo():
    created = client.post("/todos/", json={"title": "Poistettava"}).json()
    response = client.delete(f"/todos/{created['id']}")
    assert response.status_code == 204
```
# Example 2: Blog (two entities with ForeignKey)
NOTE: ForeignKey is imported from sqlalchemy, NOT from sqlalchemy.orm!
## models.py
```python
from datetime import datetime
from sqlalchemy import String, Text, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Author(Base):
    __tablename__ = "authors"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    name: Mapped[str] = mapped_column(String(255))
    email: Mapped[str] = mapped_column(String(255), unique=True)
    bio: Mapped[str | None] = mapped_column(Text, default=None)
    posts: Mapped[list["Post"]] = relationship(back_populates="author")


class Post(Base):
    __tablename__ = "posts"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    content: Mapped[str] = mapped_column(Text)
    author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
    published_at: Mapped[datetime | None] = mapped_column(DateTime, default=None)
    status: Mapped[str] = mapped_column(String(20), default="draft")
    author: Mapped["Author"] = relationship(back_populates="posts")


Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
from datetime import datetime
from pydantic import BaseModel, ConfigDict


class AuthorCreate(BaseModel):
    name: str
    email: str
    bio: str | None = None


class AuthorResponse(AuthorCreate):
    id: int
    model_config = ConfigDict(from_attributes=True)


class PostCreate(BaseModel):
    title: str
    content: str
    author_id: int
    published_at: datetime | None = None
    status: str = "draft"


class PostResponse(PostCreate):
    id: int
    model_config = ConfigDict(from_attributes=True)
```
## test_main.py — 6 tests per entity, create parent FIRST for child tests
```python
client = TestClient(app)  # same setup as above


def _create_author(name="Kirjailija", email=None):
    if email is None:
        email = f"{name.lower().replace(' ', '.')}@example.com"
    return client.post("/authors/", json={"name": name, "email": email}).json()


def test_create_author():
    response = client.post("/authors/", json={"name": "Aleksis Kivi", "email": "aleksis@example.com"})
    assert response.status_code == 201


def test_list_authors():
    _create_author("Minna Canth", "minna@example.com")
    response = client.get("/authors/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


# ... (same pattern: get_by_id, not_found, update, delete)


def test_create_post():
    author = _create_author("Tove Jansson", "tove@example.com")
    response = client.post("/posts/", json={"title": "Artikkeli", "content": "Sisältö", "author_id": author["id"]})
    assert response.status_code == 201


def test_update_post():
    author = _create_author("Joel Lehtonen", "joel@example.com")
    created = client.post("/posts/", json={"title": "Vanha", "content": "Teksti", "author_id": author["id"]}).json()
    response = client.put(f"/posts/{created['id']}", json={"title": "Uusi", "content": "Muokattu", "author_id": author["id"]})
    assert response.status_code == 200


def test_delete_post():
    author = _create_author("Aino Kallas", "aino@example.com")
    created = client.post("/posts/", json={"title": "Poistettava", "content": "Poistetaan", "author_id": author["id"]}).json()
    response = client.delete(f"/posts/{created['id']}")
    assert response.status_code == 204
```


@@ -0,0 +1,325 @@
# Todo — reference implementation (Go + Chi + SQLite)
This is a complete example. Generate equivalent structure for the given project.
Use ONLY the fields from the JSON spec — do not add extras.
## go.mod
Chi v5 router, modernc.org/sqlite (pure Go, no CGO).
```
module todo-go
go 1.23.0
toolchain go1.23.12
require (
github.com/go-chi/chi/v5 v5.2.1
modernc.org/sqlite v1.37.1
)
require (
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect
golang.org/x/sys v0.33.0 // indirect
modernc.org/libc v1.65.7 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
)
```
## models.go
Data structs: Todo (full row), CreateTodo (POST), UpdateTodo (PUT, all fields optional pointers).
```go
package main
// Todo represents a task with priority and status tracking.
type Todo struct {
ID int64 `json:"id"`
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority int64 `json:"priority"`
Status string `json:"status"`
}
// CreateTodo is the request body for creating a new todo.
type CreateTodo struct {
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
// UpdateTodo is the request body for updating an existing todo.
type UpdateTodo struct {
Title *string `json:"title,omitempty"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
```
## handlers.go
CRUD handlers as closures taking *sql.DB. Key patterns: INSERT RETURNING, sql.ErrNoRows for 404, RowsAffected for delete.
```go
package main
import (
"database/sql"
"encoding/json"
"net/http"
"strconv"
"github.com/go-chi/chi/v5"
)
// POST — decode JSON, defaults with nil-check, INSERT RETURNING, StatusCreated.
func createTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
var input CreateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest); return
}
priority := int64(1)
if input.Priority != nil { priority = *input.Priority }
status := "pending"
if input.Status != nil { status = *input.Status }
var todo Todo
err := db.QueryRow(
`INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?) RETURNING id, title, description, due_date, priority, status`,
input.Title, input.Description, input.DueDate, priority, status,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusCreated)
json.NewEncoder(w).Encode(todo)
}
}
// GET list — db.Query + rows.Scan loop, empty slice not nil.
func listTodos(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
rows, err := db.Query("SELECT id, title, description, due_date, priority, status FROM todos")
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
defer rows.Close()
todos := []Todo{}
for rows.Next() {
var t Todo
if err := rows.Scan(&t.ID, &t.Title, &t.Description, &t.DueDate, &t.Priority, &t.Status); err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
todos = append(todos, t)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todos)
}
}
// GET by id — QueryRow + sql.ErrNoRows → 404.
func getTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var todo Todo
err := db.QueryRow(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err == sql.ErrNoRows { http.Error(w, "not found", http.StatusNotFound); return }
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todo)
}
}
// PUT — fetch existing, merge with input nil-checks, UPDATE RETURNING.
func updateTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
var existing Todo
err := db.QueryRow("SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&existing.ID, &existing.Title, &existing.Description, &existing.DueDate, &existing.Priority, &existing.Status)
if err == sql.ErrNoRows { http.Error(w, "not found", http.StatusNotFound); return }
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
var input UpdateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest); return
}
if input.Title != nil { existing.Title = *input.Title }
if input.Description != nil { existing.Description = input.Description }
if input.DueDate != nil { existing.DueDate = input.DueDate }
if input.Priority != nil { existing.Priority = *input.Priority }
if input.Status != nil { existing.Status = *input.Status }
var updated Todo
err = db.QueryRow(
`UPDATE todos SET title=?, description=?, due_date=?, priority=?, status=? WHERE id=?
RETURNING id, title, description, due_date, priority, status`,
existing.Title, existing.Description, existing.DueDate, existing.Priority, existing.Status, id,
).Scan(&updated.ID, &updated.Title, &updated.Description, &updated.DueDate, &updated.Priority, &updated.Status)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(updated)
}
}
// DELETE — Exec + RowsAffected == 0 → 404, else 204.
func deleteTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
id, _ := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
result, err := db.Exec("DELETE FROM todos WHERE id = ?", id)
if err != nil { http.Error(w, err.Error(), http.StatusInternalServerError); return }
rows, _ := result.RowsAffected()
if rows == 0 { http.Error(w, "not found", http.StatusNotFound); return }
w.WriteHeader(http.StatusNoContent)
}
}
```
## main.go
Entry point: SQLite connection, table init, Chi router on port 3000.
```go
package main
import (
"database/sql"
"fmt"
"log"
"net/http"
"github.com/go-chi/chi/v5"
_ "modernc.org/sqlite"
)
// InitDB creates tables if they don't exist.
func InitDB(db *sql.DB) {
_, err := db.Exec(`CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)`)
if err != nil {
log.Fatal(err)
}
}
// NewRouter creates a chi router with all routes.
func NewRouter(db *sql.DB) http.Handler {
r := chi.NewRouter()
r.Post("/todos", createTodo(db))
r.Get("/todos", listTodos(db))
r.Get("/todos/{id}", getTodo(db))
r.Put("/todos/{id}", updateTodo(db))
r.Delete("/todos/{id}", deleteTodo(db))
return r
}
func main() {
db, err := sql.Open("sqlite", "file:app.db?mode=rwc")
if err != nil {
log.Fatal(err)
}
defer db.Close()
InitDB(db)
fmt.Println("Server running: http://127.0.0.1:3000")
log.Fatal(http.ListenAndServe("127.0.0.1:3000", NewRouter(db)))
}
```
## handlers_test.go
Integration tests: setupTestServer with httptest.NewServer + :memory: SQLite, unique data per test.
```go
package main
import (
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
_ "modernc.org/sqlite"
)
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil { t.Fatal(err) }
InitDB(db)
return httptest.NewServer(NewRouter(db)), db
}
func TestCreateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Buy groceries","priority":2}`))
if err != nil { t.Fatal(err) }
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated { t.Fatalf("expected 201, got %d", resp.StatusCode) }
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "Buy groceries" { t.Fatalf("expected 'Buy groceries', got %v", body["title"]) }
if body["id"] == nil { t.Fatal("expected id") }
}
func TestGetTodoByID(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Fetchable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK { t.Fatalf("expected 200, got %d", resp.StatusCode) }
}
func TestGetTodoNotFound(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Get(ts.URL + "/todos/99999")
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound { t.Fatalf("expected 404, got %d", resp.StatusCode) }
}
func TestDeleteTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Deletable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodDelete, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id), nil)
resp, _ = http.DefaultClient.Do(req)
defer resp.Body.Close()
if resp.StatusCode != http.StatusNoContent { t.Fatalf("expected 204, got %d", resp.StatusCode) }
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound { t.Fatalf("expected 404 after delete, got %d", resp.StatusCode) }
}
```


@@ -0,0 +1,23 @@
module todo-go
go 1.23.0
toolchain go1.23.12
require (
github.com/go-chi/chi/v5 v5.2.1
modernc.org/sqlite v1.37.1
)
require (
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect
golang.org/x/sys v0.33.0 // indirect
modernc.org/libc v1.65.7 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
)


@@ -0,0 +1,49 @@
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto=
github.com/go-chi/chi/v5 v5.2.1 h1:KOIHODQj58PmL80G2Eak4WdvUzjSJSm0vG72crDCqb8=
github.com/go-chi/chi/v5 v5.2.1/go.mod h1:L2yAIGWB3H+phAw1NxKwWM+7eUH/lU8pOMm5hHcoops=
github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e h1:ijClszYn+mADRFY17kjQEVQ1XRhq2/JR1M3sGqeJoxs=
github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e/go.mod h1:boTsfXsheKC2y+lKOCMpSfarhxDeIzfZG1jqGcPl3cA=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/ncruces/go-strftime v0.1.9 h1:bY0MQC28UADQmHmaF5dgpLmImcShSi2kHU9XLdhx/f4=
github.com/ncruces/go-strftime v0.1.9/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94icq4NjY3clb7Lk8O1qJ8BdBEF8z0ibU0rE=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 h1:R84qjqJb5nVJMxqWYb3np9L5ZsaDtB+a39EqjV0JSUM=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0/go.mod h1:S9Xr4PYopiDyqSyp5NjCrhFrqg6A5zA2E/iPHPhqnS8=
golang.org/x/mod v0.24.0 h1:ZfthKaKaT4NrhGVZHO1/WDTwGES4De8KtWO0SIbNJMU=
golang.org/x/mod v0.24.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/sync v0.14.0 h1:woo0S4Yywslg6hp4eUFjTVOyKt0RookbpAHG4c1HmhQ=
golang.org/x/sync v0.14.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.33.0 h1:q3i8TbbEz+JRD9ywIRlyRAQbM0qF7hu24q3teo2hbuw=
golang.org/x/sys v0.33.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/tools v0.33.0 h1:4qz2S3zmRxbGIhDIAgjxvFutSvH5EfnsYrRBj0UI0bc=
golang.org/x/tools v0.33.0/go.mod h1:CIJMaWEY88juyUfo7UbgPqbC8rU2OqfAV1h2Qp0oMYI=
modernc.org/cc/v4 v4.26.1 h1:+X5NtzVBn0KgsBCBe+xkDC7twLb/jNVj9FPgiwSQO3s=
modernc.org/cc/v4 v4.26.1/go.mod h1:uVtb5OGqUKpoLWhqwNQo/8LwvoiEBLvZXIQ/SmO6mL0=
modernc.org/ccgo/v4 v4.28.0 h1:rjznn6WWehKq7dG4JtLRKxb52Ecv8OUGah8+Z/SfpNU=
modernc.org/ccgo/v4 v4.28.0/go.mod h1:JygV3+9AV6SmPhDasu4JgquwU81XAKLd3OKTUDNOiKE=
modernc.org/fileutil v1.3.1 h1:8vq5fe7jdtEvoCf3Zf9Nm0Q05sH6kGx0Op2CPx1wTC8=
modernc.org/fileutil v1.3.1/go.mod h1:HxmghZSZVAz/LXcMNwZPA/DRrQZEVP9VX0V4LQGQFOc=
modernc.org/gc/v2 v2.6.5 h1:nyqdV8q46KvTpZlsw66kWqwXRHdjIlJOhG6kxiV/9xI=
modernc.org/gc/v2 v2.6.5/go.mod h1:YgIahr1ypgfe7chRuJi2gD7DBQiKSLMPgBQe9oIiito=
modernc.org/libc v1.65.7 h1:Ia9Z4yzZtWNtUIuiPuQ7Qf7kxYrxP1/jeHZzG8bFu00=
modernc.org/libc v1.65.7/go.mod h1:011EQibzzio/VX3ygj1qGFt5kMjP0lHb0qCW5/D/pQU=
modernc.org/mathutil v1.7.1 h1:GCZVGXdaN8gTqB1Mf/usp1Y/hSqgI2vAGGP4jZMCxOU=
modernc.org/mathutil v1.7.1/go.mod h1:4p5IwJITfppl0G4sUEDtCr4DthTaT47/N3aT6MhfgJg=
modernc.org/memory v1.11.0 h1:o4QC8aMQzmcwCK3t3Ux/ZHmwFPzE6hf2Y5LbkRs+hbI=
modernc.org/memory v1.11.0/go.mod h1:/JP4VbVC+K5sU2wZi9bHoq2MAkCnrt2r98UGeSK7Mjw=
modernc.org/opt v0.1.4 h1:2kNGMRiUjrp4LcaPuLY2PzUfqM/w9N23quVwhKt5Qm8=
modernc.org/opt v0.1.4/go.mod h1:03fq9lsNfvkYSfxrfUhZCWPk1lm4cq4N+Bh//bEtgns=
modernc.org/sortutil v1.2.1 h1:+xyoGf15mM3NMlPDnFqrteY07klSFxLElE2PVuWIJ7w=
modernc.org/sortutil v1.2.1/go.mod h1:7ZI3a3REbai7gzCLcotuw9AC4VZVpYMjDzETGsSMqJE=
modernc.org/sqlite v1.37.1 h1:EgHJK/FPoqC+q2YBXg7fUmES37pCHFc97sI7zSayBEs=
modernc.org/sqlite v1.37.1/go.mod h1:XwdRtsE1MpiBcL54+MbKcaDvcuej+IYSMfLN6gSKV8g=
modernc.org/strutil v1.2.1 h1:UneZBkQA+DX2Rp35KcM69cSsNES9ly8mQWD71HKlOA0=
modernc.org/strutil v1.2.1/go.mod h1:EHkiggD70koQxjVdSBM3JKM7k6L0FbGE5eymy9i3B9A=
modernc.org/token v1.1.0 h1:Xl7Ap9dKaEs5kLoOQeQmPWevfnk/DM5qcLcYlA8ys6Y=
modernc.org/token v1.1.0/go.mod h1:UGzOrNV1mAFSEB63lOFHIpNRUVMvYTc6yu1SMY/XTDM=

@@ -0,0 +1,155 @@
package main
import (
"database/sql"
"encoding/json"
"net/http"
"strconv"
"github.com/go-chi/chi/v5"
)
func createTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
var input CreateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
priority := int64(1)
if input.Priority != nil {
priority = *input.Priority
}
status := "pending"
if input.Status != nil {
status = *input.Status
}
var todo Todo
err := db.QueryRow(
`INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status`,
input.Title, input.Description, input.DueDate, priority, status,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusCreated)
json.NewEncoder(w).Encode(todo)
}
}
func listTodos(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
rows, err := db.Query("SELECT id, title, description, due_date, priority, status FROM todos")
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
defer rows.Close()
var todos []Todo
for rows.Next() {
var t Todo
if err := rows.Scan(&t.ID, &t.Title, &t.Description, &t.DueDate, &t.Priority, &t.Status); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
todos = append(todos, t)
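}
// rows.Next can stop early on an iteration error; report it instead of returning a partial list.
if err := rows.Err(); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}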
if todos == nil {
todos = []Todo{}
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todos)
}
}
func getTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
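// Reject non-numeric ids instead of silently treating them as 0.
id, err := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
if err != nil {
http.Error(w, "invalid id", http.StatusBadRequest)
return
}
var todo Todo
err = db.QueryRow(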
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&todo.ID, &todo.Title, &todo.Description, &todo.DueDate, &todo.Priority, &todo.Status)
if err == sql.ErrNoRows {
http.Error(w, "not found", http.StatusNotFound)
return
}
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(todo)
}
}
func updateTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
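// Reject non-numeric ids instead of silently treating them as 0.
id, err := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
if err != nil {
http.Error(w, "invalid id", http.StatusBadRequest)
return
}
var existing Todo
err = db.QueryRow(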
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?", id,
).Scan(&existing.ID, &existing.Title, &existing.Description, &existing.DueDate, &existing.Priority, &existing.Status)
if err == sql.ErrNoRows {
http.Error(w, "not found", http.StatusNotFound)
return
}
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
var input UpdateTodo
if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
if input.Title != nil {
existing.Title = *input.Title
}
if input.Description != nil {
existing.Description = input.Description
}
if input.DueDate != nil {
existing.DueDate = input.DueDate
}
if input.Priority != nil {
existing.Priority = *input.Priority
}
if input.Status != nil {
existing.Status = *input.Status
}
var updated Todo
err = db.QueryRow(
`UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ?
RETURNING id, title, description, due_date, priority, status`,
existing.Title, existing.Description, existing.DueDate, existing.Priority, existing.Status, id,
).Scan(&updated.ID, &updated.Title, &updated.Description, &updated.DueDate, &updated.Priority, &updated.Status)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(updated)
}
}
func deleteTodo(db *sql.DB) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
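// Reject non-numeric ids instead of silently treating them as 0.
id, err := strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
if err != nil {
http.Error(w, "invalid id", http.StatusBadRequest)
return
}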
result, err := db.Exec("DELETE FROM todos WHERE id = ?", id)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
rows, _ := result.RowsAffected()
if rows == 0 {
http.Error(w, "not found", http.StatusNotFound)
return
}
w.WriteHeader(http.StatusNoContent)
}
}

@@ -0,0 +1,171 @@
package main
import (
"database/sql"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
_ "modernc.org/sqlite"
)
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil {
t.Fatal(err)
}
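// Every new connection to ":memory:" opens its own empty database,
// so pin the pool to a single connection to keep the schema visible.
db.SetMaxOpenConns(1)
InitDB(db)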
return httptest.NewServer(NewRouter(db)), db
}
func TestCreateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Buy groceries","priority":2}`))
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusCreated {
t.Fatalf("expected 201, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "Buy groceries" {
t.Fatalf("expected title 'Buy groceries', got %v", body["title"])
}
if body["id"] == nil {
t.Fatal("expected id to be present")
}
}
func TestListTodos(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Listable task"}`))
resp, err := http.Get(ts.URL + "/todos")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body []map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if len(body) < 1 {
t.Fatal("expected at least 1 todo")
}
}
func TestGetTodoByID(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Fetchable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
resp, err := http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["id"] != id {
t.Fatalf("expected id %.0f, got %v", id, body["id"])
}
}
func TestGetTodoNotFound(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, err := http.Get(ts.URL + "/todos/99999")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound {
t.Fatalf("expected 404, got %d", resp.StatusCode)
}
}
func TestUpdateTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Old title"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodPut, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id),
strings.NewReader(`{"title":"New title"}`))
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var body map[string]interface{}
json.NewDecoder(resp.Body).Decode(&body)
if body["title"] != "New title" {
t.Fatalf("expected 'New title', got %v", body["title"])
}
}
func TestDeleteTodo(t *testing.T) {
ts, db := setupTestServer(t)
defer ts.Close()
defer db.Close()
resp, _ := http.Post(ts.URL+"/todos", "application/json",
strings.NewReader(`{"title":"Deletable task"}`))
var created map[string]interface{}
json.NewDecoder(resp.Body).Decode(&created)
resp.Body.Close()
id := created["id"].(float64)
req, _ := http.NewRequest(http.MethodDelete, ts.URL+"/todos/"+fmt.Sprintf("%.0f", id), nil)
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusNoContent {
t.Fatalf("expected 204, got %d", resp.StatusCode)
}
resp, _ = http.Get(ts.URL + "/todos/" + fmt.Sprintf("%.0f", id))
defer resp.Body.Close()
if resp.StatusCode != http.StatusNotFound {
t.Fatalf("expected 404 after delete, got %d", resp.StatusCode)
}
}

@@ -0,0 +1,49 @@
package main
import (
"database/sql"
"fmt"
"log"
"net/http"
"github.com/go-chi/chi/v5"
_ "modernc.org/sqlite"
)
// InitDB creates tables if they don't exist.
func InitDB(db *sql.DB) {
_, err := db.Exec(`CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)`)
if err != nil {
log.Fatal(err)
}
}
// NewRouter creates a chi router with all routes.
func NewRouter(db *sql.DB) http.Handler {
r := chi.NewRouter()
r.Post("/todos", createTodo(db))
r.Get("/todos", listTodos(db))
r.Get("/todos/{id}", getTodo(db))
r.Put("/todos/{id}", updateTodo(db))
r.Delete("/todos/{id}", deleteTodo(db))
return r
}
func main() {
db, err := sql.Open("sqlite", "file:app.db?mode=rwc")
if err != nil {
log.Fatal(err)
}
defer db.Close()
InitDB(db)
fmt.Println("Server running: http://127.0.0.1:3000")
log.Fatal(http.ListenAndServe("127.0.0.1:3000", NewRouter(db)))
}

@@ -0,0 +1,29 @@
package main
// Todo represents a task with priority and status tracking.
type Todo struct {
ID int64 `json:"id"`
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority int64 `json:"priority"`
Status string `json:"status"`
}
// CreateTodo is the request body for creating a new todo.
type CreateTodo struct {
Title string `json:"title"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}
// UpdateTodo is the request body for updating an existing todo.
type UpdateTodo struct {
Title *string `json:"title,omitempty"`
Description *string `json:"description,omitempty"`
DueDate *string `json:"due_date,omitempty"`
Priority *int64 `json:"priority,omitempty"`
Status *string `json:"status,omitempty"`
}

@@ -0,0 +1,217 @@
# Todo App — FastAPI + SQLAlchemy + SQLite
A simple todo CRUD API. Uses only the fields defined in the spec — no extra fields.
## Project Structure
```
models.py # SQLAlchemy 2.0 models
schemas.py # Pydantic v2 schemas
main.py # FastAPI CRUD endpoints
test_main.py # Pytest with TestClient
```
## models.py
```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""
from datetime import date

from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Todo(Base):
    """A task: title, description, deadline, priority, and status."""

    __tablename__ = "todos"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    description: Mapped[str | None] = mapped_column(Text, default=None)
    due_date: Mapped[date | None] = mapped_column(Date, default=None)
    priority: Mapped[int] = mapped_column(default=1)
    status: Mapped[str] = mapped_column(String(20), default="pending")


Base.metadata.create_all(bind=engine)
```
## schemas.py
```python
"""Pydantic v2 schemas: Create for input, Response for output."""
from datetime import date

from pydantic import BaseModel, ConfigDict


class TodoCreate(BaseModel):
    """Create a new task. Required: title."""

    title: str
    description: str | None = None
    due_date: date | None = None
    priority: int = 1
    status: str = "pending"


class TodoResponse(TodoCreate):
    """Returned task; includes the id."""

    id: int
    model_config = ConfigDict(from_attributes=True)
```
## main.py
```python
"""FastAPI CRUD: one set of endpoints per entity."""
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse

app = FastAPI()


def get_db():
    """One database session per request."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
    db_item = Todo(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
    return db.query(Todo).all()


@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Todo).filter(Todo.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Todo not found")
    return item


@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    db.delete(db_item)
    db.commit()
```
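One consequence of the `update_todo` handler above is worth spelling out: it writes every field returned by `item.model_dump()`, defaults included, so a PUT body that omits `priority` or `status` resets them. The sketch below illustrates the difference between that full replacement and a partial merge. It uses plain dicts in place of the Pydantic model, and `DEFAULTS`, `full_replace`, and `partial_merge` are hypothetical names introduced only for illustration. In Pydantic v2, `model_dump(exclude_unset=True)` is one way to get the partial behaviour, assuming a schema whose fields are all optional.

```python
# Hypothetical stand-in for the defaults declared on TodoCreate.
DEFAULTS = {"description": None, "due_date": None, "priority": 1, "status": "pending"}


def full_replace(stored: dict, payload: dict) -> dict:
    """Mimic setattr over model_dump(): omitted fields fall back to defaults."""
    return {"id": stored["id"], **DEFAULTS, **payload}


def partial_merge(stored: dict, payload: dict) -> dict:
    """Keep the stored value for every field the payload does not mention."""
    return {**stored, **payload}


stored = {
    "id": 7,
    "title": "Old title",
    "description": "keep me",
    "due_date": None,
    "priority": 3,
    "status": "done",
}
payload = {"title": "New title"}

replaced = full_replace(stored, payload)
merged = partial_merge(stored, payload)
print(replaced["priority"], merged["priority"])
```

Full replacement matches strict PUT semantics; the partial merge is closer to what PATCH usually means. Which one a generated project should use depends on the spec.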
## test_main.py
Exactly 6 tests per entity. Database is shared — use `>= 1` not `== 1` in list tests.
For child entities with foreign keys: create parent FIRST, then child with parent's id.
```python
"""Pytest: TestClient, separate test.db, unique data per test."""
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from main import app, get_db
from models import Base

test_engine = create_engine(
    "sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def test_create_todo():
    response = client.post("/todos/", json={"title": "Buy milk", "priority": 2})
    assert response.status_code == 201
    assert response.json()["title"] == "Buy milk"
    assert "id" in response.json()


def test_list_todos():
    client.post("/todos/", json={"title": "Listable task"})
    response = client.get("/todos/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_todo_by_id():
    created = client.post("/todos/", json={"title": "Fetchable task"}).json()
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_todo_not_found():
    response = client.get("/todos/99999")
    assert response.status_code == 404


def test_update_todo():
    created = client.post("/todos/", json={"title": "Old title"}).json()
    response = client.put(
        f"/todos/{created['id']}", json={"title": "New title"}
    )
    assert response.status_code == 200
    assert response.json()["title"] == "New title"


def test_delete_todo():
    created = client.post("/todos/", json={"title": "To be deleted"}).json()
    response = client.delete(f"/todos/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 404
```
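The foreign-key rule above (create the parent first, then the child with the parent's id) can be sketched with the standard-library `sqlite3` module. The `projects` and `tasks` tables are hypothetical and not part of this spec. The sketch also assumes `PRAGMA foreign_keys = ON`, since SQLite does not enforce foreign keys on a connection unless that pragma is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, "
    "title TEXT NOT NULL, "
    "project_id INTEGER NOT NULL REFERENCES projects(id))"
)

# Child first: no parent row with id 999 exists, so the insert is rejected.
try:
    conn.execute("INSERT INTO tasks (title, project_id) VALUES (?, ?)", ("orphan", 999))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Parent first, then the child that refers to the parent's generated id.
parent_id = conn.execute("INSERT INTO projects (name) VALUES (?)", ("demo",)).lastrowid
conn.execute("INSERT INTO tasks (title, project_id) VALUES (?, ?)", ("child", parent_id))
task_count = conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
conn.close()
```

In an API test the same ordering applies: POST the parent, read `id` from the response, and pass it in the child's request body.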

@@ -0,0 +1,331 @@
# Todo: reference implementation (Axum 0.8 + SQLx + SQLite)
This is a complete example. Generate an equivalent structure for the given project.
Use ONLY the fields in the JSON spec; do not add extra ones.
## Cargo.toml
Axum 0.8, SQLx with the SQLite feature, serde for JSON serialization, tower-http for CORS support.
```toml
[package]
name = "todo-rs"
version = "0.1.0"
edition = "2024"
[dependencies]
axum = "0.8"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.8", features = ["sqlite", "runtime-tokio"] }
tower-http = { version = "0.6", features = ["cors"] }
[dev-dependencies]
reqwest = { version = "0.13", default-features = false, features = ["json", "rustls"] }
tokio = { version = "1", features = ["full", "test-util"] }
```
## src/models.rs
Serde structs: `Todo` (FromRow), `CreateTodo` (POST), `UpdateTodo` (PUT, all fields optional).
```rust
//! Data models: Todo, CreateTodo, and UpdateTodo as serde structs.
use serde::{Deserialize, Serialize};
/// A task: title, description, deadline, priority, and status.
#[derive(Debug, Serialize, Deserialize, sqlx::FromRow)]
pub struct Todo {
pub id: i64,
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: i64,
pub status: String,
}
/// Create a new task. Required: title.
#[derive(Debug, Deserialize)]
pub struct CreateTodo {
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
/// Update a task; all fields are optional.
#[derive(Debug, Deserialize)]
pub struct UpdateTodo {
pub title: Option<String>,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
```
## src/handlers.rs
CRUD handlers. Key patterns: INSERT RETURNING, fetch_optional + 404, rows_affected + 204.
```rust
//! Handlers: CRUD operations for the todo entity.
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::Json;
use sqlx::SqlitePool;
use crate::models::{CreateTodo, Todo, UpdateTodo};
/// POST: INSERT RETURNING, defaults applied with unwrap_or.
pub async fn create_todo(
State(pool): State<SqlitePool>,
Json(input): Json<CreateTodo>,
) -> Result<(StatusCode, Json<Todo>), StatusCode> {
let priority = input.priority.unwrap_or(1);
let status = input.status.unwrap_or_else(|| "pending".to_string());
let result = sqlx::query_as::<_, Todo>(
"INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status",
)
.bind(&input.title)
.bind(&input.description)
.bind(&input.due_date)
.bind(priority)
.bind(&status)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok((StatusCode::CREATED, Json(result)))
}
/// GET list: fetch_all.
pub async fn list_todos(
State(pool): State<SqlitePool>,
) -> Result<Json<Vec<Todo>>, StatusCode> {
let todos = sqlx::query_as::<_, Todo>("SELECT id, title, description, due_date, priority, status FROM todos")
.fetch_all(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(todos))
}
/// GET by id: fetch_optional, None → 404.
pub async fn get_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<Json<Todo>, StatusCode> {
let todo = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
match todo {
Some(t) => Ok(Json(t)),
None => Err(StatusCode::NOT_FOUND),
}
}
/// PUT: fetch the existing row, merge the fields, UPDATE RETURNING.
pub async fn update_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
Json(input): Json<UpdateTodo>,
) -> Result<Json<Todo>, StatusCode> {
let existing = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let existing = existing.ok_or(StatusCode::NOT_FOUND)?;
let updated = sqlx::query_as::<_, Todo>(
"UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ? RETURNING id, title, description, due_date, priority, status",
)
.bind(input.title.unwrap_or(existing.title))
.bind(input.description.or(existing.description))
.bind(input.due_date.or(existing.due_date))
.bind(input.priority.unwrap_or(existing.priority))
.bind(input.status.unwrap_or(existing.status))
.bind(id)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(updated))
}
/// DELETE: rows_affected == 0 → 404, otherwise 204.
pub async fn delete_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<StatusCode, StatusCode> {
let result = sqlx::query("DELETE FROM todos WHERE id = ?")
.bind(id)
.execute(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
if result.rows_affected() == 0 { return Err(StatusCode::NOT_FOUND); }
Ok(StatusCode::NO_CONTENT)
}
```
## src/lib.rs
Library module: the router `app()` and the table setup `init_db()`, which form the public API for the integration tests.
```rust
//! Library module: public API for the integration tests.
pub mod handlers;
pub mod models;
use axum::routing::{delete, get, post, put};
use axum::Router;
use sqlx::SqlitePool;
use tower_http::cors::CorsLayer;
/// Build the router with the given database pool.
pub fn app(pool: SqlitePool) -> Router {
Router::new()
.route("/todos", post(handlers::create_todo))
.route("/todos", get(handlers::list_todos))
.route("/todos/{id}", get(handlers::get_todo))
.route("/todos/{id}", put(handlers::update_todo))
.route("/todos/{id}", delete(handlers::delete_todo))
.layer(CorsLayer::permissive())
.with_state(pool)
}
/// Initialize the database table.
pub async fn init_db(pool: &SqlitePool) {
sqlx::query(
"CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)",
)
.execute(pool)
.await
.expect("Failed to create table");
}
```
## src/main.rs
Entry point: SQLite pool, table setup, Axum server on port 3000.
```rust
//! Axum CRUD: one set of endpoints per entity, SQLite database.
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
#[tokio::main]
async fn main() {
let pool = SqlitePoolOptions::new()
.max_connections(5)
.connect("sqlite:./app.db?mode=rwc")
.await
.expect("Database connection failed");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("Failed to bind port");
println!("Server running: http://127.0.0.1:3000");
axum::serve(listener, app(pool)).await.unwrap();
}
```
## tests/api_test.rs
Integration tests: spawn_server (in-memory SQLite, random port), CRUD tests with unique data.
```rust
//! Integration tests: in-memory SQLite, unique data per test.
use axum::http::StatusCode;
use reqwest::Client;
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
/// Start a test server on a random port.
async fn spawn_server() -> (Client, String) {
let pool = SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("Failed to open test database");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0")
.await
.expect("Failed to bind test port");
let base_url = format!("http://{}", listener.local_addr().unwrap());
let router = app(pool);
tokio::spawn(async move { axum::serve(listener, router).await.unwrap() });
(Client::new(), base_url)
}
#[tokio::test]
async fn test_create_todo() {
let (client, url) = spawn_server().await;
let res = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Buy milk", "priority": 2}))
.send().await.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Buy milk");
assert!(body["id"].is_number());
}
#[tokio::test]
async fn test_get_todo_by_id() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Fetchable task"}))
.send().await.unwrap().json().await.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client.get(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["id"], id);
}
#[tokio::test]
async fn test_get_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client.get(format!("{url}/todos/99999")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "To be deleted"}))
.send().await.unwrap().json().await.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client.delete(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
let res = client.get(format!("{url}/todos/{id}")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
```

@@ -0,0 +1 @@
target/

@@ -0,0 +1,16 @@
[package]
name = "todo-rs"
version = "0.1.0"
edition = "2024"
[dependencies]
axum = "0.8"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.8", features = ["sqlite", "runtime-tokio"] }
tower-http = { version = "0.6", features = ["cors"] }
[dev-dependencies]
reqwest = { version = "0.13", default-features = false, features = ["json", "rustls"] }
tokio = { version = "1", features = ["full", "test-util"] }

@@ -0,0 +1,122 @@
//! Handlers: CRUD operations for the todo entity.
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::Json;
use sqlx::SqlitePool;
use crate::models::{CreateTodo, Todo, UpdateTodo};
/// Create a new task.
pub async fn create_todo(
State(pool): State<SqlitePool>,
Json(input): Json<CreateTodo>,
) -> Result<(StatusCode, Json<Todo>), StatusCode> {
let priority = input.priority.unwrap_or(1);
let status = input.status.unwrap_or_else(|| "pending".to_string());
let result = sqlx::query_as::<_, Todo>(
"INSERT INTO todos (title, description, due_date, priority, status)
VALUES (?, ?, ?, ?, ?)
RETURNING id, title, description, due_date, priority, status",
)
.bind(&input.title)
.bind(&input.description)
.bind(&input.due_date)
.bind(priority)
.bind(&status)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok((StatusCode::CREATED, Json(result)))
}
/// List all tasks.
pub async fn list_todos(
State(pool): State<SqlitePool>,
) -> Result<Json<Vec<Todo>>, StatusCode> {
let todos = sqlx::query_as::<_, Todo>("SELECT id, title, description, due_date, priority, status FROM todos")
.fetch_all(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(todos))
}
/// Fetch a task by id.
pub async fn get_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<Json<Todo>, StatusCode> {
let todo = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
match todo {
Some(t) => Ok(Json(t)),
None => Err(StatusCode::NOT_FOUND),
}
}
/// Update a task by id.
pub async fn update_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
Json(input): Json<UpdateTodo>,
) -> Result<Json<Todo>, StatusCode> {
let existing = sqlx::query_as::<_, Todo>(
"SELECT id, title, description, due_date, priority, status FROM todos WHERE id = ?",
)
.bind(id)
.fetch_optional(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let existing = existing.ok_or(StatusCode::NOT_FOUND)?;
let title = input.title.unwrap_or(existing.title);
let description = input.description.or(existing.description);
let due_date = input.due_date.or(existing.due_date);
let priority = input.priority.unwrap_or(existing.priority);
let status = input.status.unwrap_or(existing.status);
let updated = sqlx::query_as::<_, Todo>(
"UPDATE todos SET title = ?, description = ?, due_date = ?, priority = ?, status = ?
WHERE id = ?
RETURNING id, title, description, due_date, priority, status",
)
.bind(&title)
.bind(&description)
.bind(&due_date)
.bind(priority)
.bind(&status)
.bind(id)
.fetch_one(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(updated))
}
/// Delete a todo by id.
pub async fn delete_todo(
State(pool): State<SqlitePool>,
Path(id): Path<i64>,
) -> Result<StatusCode, StatusCode> {
let result = sqlx::query("DELETE FROM todos WHERE id = ?")
.bind(id)
.execute(&pool)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
if result.rows_affected() == 0 {
return Err(StatusCode::NOT_FOUND);
}
Ok(StatusCode::NO_CONTENT)
}
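
The field merge in `update_todo` can be sketched in Python, the language of the reference implementation elsewhere in this commit. This is a minimal sketch rather than the project's code: `merge_update` is a hypothetical helper, and it assumes the same semantics as the handler above, where an omitted or null field keeps the stored value.

```python
"""Sketch of partial-update merging: fields missing from the patch keep their stored value."""


def merge_update(existing: dict, patch: dict) -> dict:
    """Return a new row where each non-None patch value overrides the stored one.

    Hypothetical helper. A None value is treated like an omitted field, which
    mirrors `Option::or` / `unwrap_or` in the Rust handler. The id is never changed.
    """
    merged = dict(existing)
    for key, value in patch.items():
        if key in merged and key != "id" and value is not None:
            merged[key] = value
    return merged


stored = {
    "id": 1,
    "title": "Elinkaaritesti",
    "description": None,
    "priority": 3,
    "status": "in_progress",
}
updated = merge_update(stored, {"status": "done", "description": None})
print(updated["status"], updated["title"])
```

One consequence of this design is that a client cannot reset `description` or `due_date` to null through PUT, because a null value is indistinguishable from an omitted field.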


@@ -0,0 +1,38 @@
//! Library module: public API for the integration tests.
pub mod handlers;
pub mod models;
use axum::routing::{delete, get, post, put};
use axum::Router;
use sqlx::SqlitePool;
use tower_http::cors::CorsLayer;
/// Build the router with the given database pool.
pub fn app(pool: SqlitePool) -> Router {
Router::new()
.route("/todos", post(handlers::create_todo))
.route("/todos", get(handlers::list_todos))
.route("/todos/{id}", get(handlers::get_todo))
.route("/todos/{id}", put(handlers::update_todo))
.route("/todos/{id}", delete(handlers::delete_todo))
.layer(CorsLayer::permissive())
.with_state(pool)
}
/// Initialize the database table.
pub async fn init_db(pool: &SqlitePool) {
sqlx::query(
"CREATE TABLE IF NOT EXISTS todos (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
due_date TEXT,
priority INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
)",
)
.execute(pool)
.await
.expect("Failed to create table");
}
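
The schema-level defaults in `init_db` can be checked in isolation. The following is a minimal sketch using Python's standard `sqlite3` module and the same `CREATE TABLE` statement; `insert_title_only` is a hypothetical helper, not part of the project.

```python
"""Sketch: column defaults fill in priority and status when only a title is inserted."""
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS todos (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    description TEXT,
    due_date TEXT,
    priority INTEGER NOT NULL DEFAULT 1,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""


def insert_title_only(title: str) -> tuple:
    """Create an in-memory database, insert one row with just a title, and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.execute(SCHEMA)  # IF NOT EXISTS makes a repeated call harmless
        cursor = conn.execute("INSERT INTO todos (title) VALUES (?)", (title,))
        return conn.execute(
            "SELECT id, title, description, due_date, priority, status "
            "FROM todos WHERE id = ?",
            (cursor.lastrowid,),
        ).fetchone()
    finally:
        conn.close()


row = insert_title_only("Oletusarvotesti")
print(row)
```

The nullable columns come back as `None`, while `priority` and `status` take their declared defaults.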


@@ -0,0 +1,22 @@
//! Axum CRUD: one endpoint set per entity, SQLite database.
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
#[tokio::main]
async fn main() {
let pool = SqlitePoolOptions::new()
.max_connections(5)
.connect("sqlite:./app.db?mode=rwc")
.await
.expect("Database connection failed");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("Failed to listen on port");
println!("Server running: http://127.0.0.1:3000");
axum::serve(listener, app(pool)).await.unwrap();
}


@@ -0,0 +1,34 @@
//! Data models: Todo, CreateTodo, UpdateTodo as serde structs.
use serde::{Deserialize, Serialize};
/// Todo: title, description, deadline, priority and status.
#[derive(Debug, Serialize, Deserialize, sqlx::FromRow)]
pub struct Todo {
pub id: i64,
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: i64,
pub status: String,
}
/// Creation of a new todo. Required: title.
#[derive(Debug, Deserialize)]
pub struct CreateTodo {
pub title: String,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}
/// Todo update: all fields optional.
#[derive(Debug, Deserialize)]
pub struct UpdateTodo {
pub title: Option<String>,
pub description: Option<String>,
pub due_date: Option<String>,
pub priority: Option<i64>,
pub status: Option<String>,
}


@@ -0,0 +1,262 @@
//! Integration tests: in-memory SQLite, unique data per test.
use axum::http::StatusCode;
use reqwest::Client;
use sqlx::sqlite::SqlitePoolOptions;
use todo_rs::{app, init_db};
/// Start a test server on a random port.
async fn spawn_server() -> (Client, String) {
let pool = SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("Test database failed");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0")
.await
.expect("Failed to listen on test port");
let addr = listener.local_addr().unwrap();
let base_url = format!("http://{addr}");
let router = app(pool);
tokio::spawn(async move {
axum::serve(listener, router).await.unwrap();
});
(Client::new(), base_url)
}
#[tokio::test]
async fn test_create_todo() {
let (client, url) = spawn_server().await;
let res = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Osta maitoa", "priority": 2}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Osta maitoa");
assert_eq!(body["priority"], 2);
assert!(body["id"].is_number());
}
#[tokio::test]
async fn test_create_todo_defaults() {
let (client, url) = spawn_server().await;
let res = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Oletusarvotesti"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::CREATED);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["priority"], 1);
assert_eq!(body["status"], "pending");
assert!(body["description"].is_null());
}
#[tokio::test]
async fn test_list_todos() {
let (client, url) = spawn_server().await;
client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Listattava tehtävä"}))
.send()
.await
.unwrap();
let res = client.get(format!("{url}/todos")).send().await.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: Vec<serde_json::Value> = res.json().await.unwrap();
assert!(body.len() >= 1);
}
#[tokio::test]
async fn test_get_todo_by_id() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Haettava tehtävä"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.get(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["id"], id);
assert_eq!(body["title"], "Haettava tehtävä");
}
#[tokio::test]
async fn test_get_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.get(format!("{url}/todos/99999"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_update_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Vanha otsikko"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.put(format!("{url}/todos/{id}"))
.json(&serde_json::json!({"title": "Uusi otsikko"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::OK);
let body: serde_json::Value = res.json().await.unwrap();
assert_eq!(body["title"], "Uusi otsikko");
}
#[tokio::test]
async fn test_update_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.put(format!("{url}/todos/99999"))
.json(&serde_json::json!({"title": "Ei löydy"}))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo() {
let (client, url) = spawn_server().await;
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({"title": "Poistettava"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
let res = client
.delete(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
let res = client
.get(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_delete_todo_not_found() {
let (client, url) = spawn_server().await;
let res = client
.delete(format!("{url}/todos/99999"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NOT_FOUND);
}
#[tokio::test]
async fn test_full_lifecycle() {
let (client, url) = spawn_server().await;
// Create
let created: serde_json::Value = client
.post(format!("{url}/todos"))
.json(&serde_json::json!({
"title": "Elinkaaritesti",
"description": "Testataan koko CRUD-kierto",
"due_date": "2026-12-31",
"priority": 3,
"status": "in_progress"
}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
let id = created["id"].as_i64().unwrap();
assert_eq!(created["title"], "Elinkaaritesti");
assert_eq!(created["description"], "Testataan koko CRUD-kierto");
assert_eq!(created["due_date"], "2026-12-31");
assert_eq!(created["priority"], 3);
assert_eq!(created["status"], "in_progress");
// Update
let updated: serde_json::Value = client
.put(format!("{url}/todos/{id}"))
.json(&serde_json::json!({"status": "done"}))
.send()
.await
.unwrap()
.json()
.await
.unwrap();
assert_eq!(updated["status"], "done");
assert_eq!(updated["title"], "Elinkaaritesti");
// Delete
let res = client
.delete(format!("{url}/todos/{id}"))
.send()
.await
.unwrap();
assert_eq!(res.status(), StatusCode::NO_CONTENT);
}
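
`spawn_server` relies on binding to port 0 so that the operating system assigns a free port to each test server. The same technique, sketched with Python's standard `socket` module (`bind_ephemeral` is a hypothetical name used only for illustration):

```python
"""Sketch: bind to port 0 so the OS picks a free port, then read the real address back."""
import socket


def bind_ephemeral(host: str = "127.0.0.1") -> tuple[socket.socket, str]:
    """Return a listening socket and the base URL it is reachable at."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 = let the OS choose
    listener.listen()
    bound_host, bound_port = listener.getsockname()
    return listener, f"http://{bound_host}:{bound_port}"


# Two listeners open at the same time always receive different ports.
first, first_url = bind_ephemeral()
second, second_url = bind_ephemeral()
print(first_url, second_url)
first.close()
second.close()
```

Because every test gets its own port and its own in-memory database, the Rust tests do not share state and can run in parallel.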


@@ -0,0 +1,230 @@
# Todo: reference implementation (FastAPI + SQLAlchemy 2.0 + SQLite)

This is a complete example. Generate an equivalent structure for the given project.
Use ONLY the fields in the JSON spec; do not add extra ones.

## models.py

SQLAlchemy 2.0: `DeclarativeBase` + `Mapped` + `mapped_column`. NOT `Column()`, NOT `declarative_base()`.

```python
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""

from datetime import date

from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Todo(Base):
    """Todo: title, description, deadline, priority and status."""

    __tablename__ = "todos"

    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    description: Mapped[str | None] = mapped_column(Text, default=None)
    due_date: Mapped[date | None] = mapped_column(Date, default=None)
    priority: Mapped[int] = mapped_column(default=1)
    status: Mapped[str] = mapped_column(String(20), default="pending")


Base.metadata.create_all(bind=engine)
```

Note:
- `str | None` (not `Optional[str]`)
- `String(20)` for the status field (not Enum)
- Only the fields in the spec: no `created_at` or other extras

## schemas.py

Pydantic v2: `ConfigDict(from_attributes=True)`. NOT `class Config: orm_mode = True`.

```python
"""Pydantic v2 schemas: Create for input, Response for output."""

from datetime import date

from pydantic import BaseModel, ConfigDict


class TodoCreate(BaseModel):
    """Creation of a new todo. Required: title."""

    title: str
    description: str | None = None
    due_date: date | None = None
    priority: int = 1
    status: str = "pending"


class TodoResponse(TodoCreate):
    """Returned todo, including its id."""

    id: int
    model_config = ConfigDict(from_attributes=True)
```

## main.py

FastAPI CRUD: POST 201, GET list, GET by id 404, PUT, DELETE 204. Use `model_dump()` (not `.dict()`).

```python
"""FastAPI CRUD: one endpoint set per entity."""

from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse

app = FastAPI()


def get_db():
    """Database session per request."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
    db_item = Todo(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
    return db.query(Todo).all()


@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Todo).filter(Todo.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Todo not found")
    return item


@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    db.delete(db_item)
    db.commit()
```

## test_main.py

Tests: separate test.db, `override_get_db`, `TestClient`. Unique Finnish-language data per test.
The PUT test sends ALL required fields.

Generate EXACTLY these 6 tests per entity, no more, no fewer:

1. `test_create_{entity}`: POST, assert 201 + id
2. `test_list_{entities}`: POST first, GET list, assert len >= 1
3. `test_get_{entity}_by_id`: POST, GET by id, assert the id matches
4. `test_get_{entity}_not_found`: GET /99999, assert 404
5. `test_update_{entity}`: POST, PUT with all required fields, assert 200
6. `test_delete_{entity}`: POST, DELETE assert 204, GET again assert 404

No search, filter, or other extra tests.

```python
"""Pytest: TestClient, separate test.db, unique data per test."""

from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from main import app, get_db
from models import Base

test_engine = create_engine(
    "sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def test_create_todo():
    response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
    assert response.status_code == 201
    assert response.json()["title"] == "Osta maitoa"
    assert "id" in response.json()


def test_list_todos():
    client.post("/todos/", json={"title": "Listattava tehtävä"})
    response = client.get("/todos/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_todo_by_id():
    created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_todo_not_found():
    response = client.get("/todos/99999")
    assert response.status_code == 404


def test_update_todo():
    created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
    response = client.put(
        f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
    )
    assert response.status_code == 200
    assert response.json()["title"] == "Uusi otsikko"


def test_delete_todo():
    created = client.post("/todos/", json={"title": "Poistettava"}).json()
    response = client.delete(f"/todos/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 404
```
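
The six-test rule above can be checked mechanically against a generated test file. The following is a minimal sketch under stated assumptions: `expected_tests` and `check_tests` are hypothetical helpers, not part of the benchmark, and the check covers a single entity only (tests for other entities would be reported as extra).

```python
"""Sketch: verify that a generated test file defines exactly the six expected tests."""
import re


def expected_tests(entity: str, entities: str) -> list[str]:
    """Build the six required test names from singular and plural entity names."""
    return [
        f"test_create_{entity}",
        f"test_list_{entities}",
        f"test_get_{entity}_by_id",
        f"test_get_{entity}_not_found",
        f"test_update_{entity}",
        f"test_delete_{entity}",
    ]


def check_tests(source: str, entity: str, entities: str) -> tuple[list[str], list[str]]:
    """Return (missing, extra) test names for the top-level tests found in `source`."""
    found = re.findall(r"^def (test_\w+)\(", source, flags=re.MULTILINE)
    wanted = expected_tests(entity, entities)
    missing = [name for name in wanted if name not in found]
    extra = [name for name in found if name not in wanted]
    return missing, extra


sample = "def test_create_todo():\n    pass\ndef test_search_todos():\n    pass\n"
missing, extra = check_tests(sample, "todo", "todos")
print(missing, extra)
```

For the sample input the check reports five missing tests and flags `test_search_todos` as extra.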


@@ -0,0 +1,61 @@
"""FastAPI CRUD: one endpoint set per entity."""

from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from models import SessionLocal, Todo
from schemas import TodoCreate, TodoResponse

app = FastAPI()


def get_db():
    """Database session per request."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/todos/", response_model=TodoResponse, status_code=201)
def create_todo(item: TodoCreate, db: Session = Depends(get_db)):
    db_item = Todo(**item.model_dump())
    db.add(db_item)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.get("/todos/", response_model=list[TodoResponse])
def list_todos(db: Session = Depends(get_db)):
    return db.query(Todo).all()


@app.get("/todos/{item_id}", response_model=TodoResponse)
def get_todo(item_id: int, db: Session = Depends(get_db)):
    item = db.query(Todo).filter(Todo.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Todo not found")
    return item


@app.put("/todos/{item_id}", response_model=TodoResponse)
def update_todo(item_id: int, item: TodoCreate, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    for key, value in item.model_dump().items():
        setattr(db_item, key, value)
    db.commit()
    db.refresh(db_item)
    return db_item


@app.delete("/todos/{item_id}", status_code=204)
def delete_todo(item_id: int, db: Session = Depends(get_db)):
    db_item = db.query(Todo).filter(Todo.id == item_id).first()
    if not db_item:
        raise HTTPException(status_code=404, detail="Todo not found")
    db.delete(db_item)
    db.commit()


@@ -0,0 +1,30 @@
"""Database models: SQLAlchemy 2.0, Mapped typing, SQLite."""

from datetime import date

from sqlalchemy import String, Text, Date, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, sessionmaker

DATABASE_URL = "sqlite:///./app.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


class Base(DeclarativeBase):
    pass


class Todo(Base):
    """Todo: title, description, deadline, priority and status."""

    __tablename__ = "todos"

    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    description: Mapped[str | None] = mapped_column(Text, default=None)
    due_date: Mapped[date | None] = mapped_column(Date, default=None)
    priority: Mapped[int] = mapped_column(default=1)
    status: Mapped[str] = mapped_column(String(20), default="pending")


Base.metadata.create_all(bind=engine)


@@ -0,0 +1,11 @@
[project]
name = "todo-app"
version = "0.1.0"
requires-python = ">=3.14"
dependencies = [
    "fastapi",
    "uvicorn[standard]",
    "sqlalchemy",
    "pytest",
    "httpx",
]


@@ -0,0 +1,22 @@
"""Pydantic v2 schemas: Create for input, Response for output."""

from datetime import date

from pydantic import BaseModel, ConfigDict


class TodoCreate(BaseModel):
    """Creation of a new todo. Required: title."""

    title: str
    description: str | None = None
    due_date: date | None = None
    priority: int = 1
    status: str = "pending"


class TodoResponse(TodoCreate):
    """Returned todo, including its id."""

    id: int
    model_config = ConfigDict(from_attributes=True)


@@ -0,0 +1,69 @@
"""Pytest: TestClient, separate test.db, unique data per test."""

from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from main import app, get_db
from models import Base

test_engine = create_engine(
    "sqlite:///./test.db", connect_args={"check_same_thread": False}
)
TestSession = sessionmaker(autocommit=False, autoflush=False, bind=test_engine)
Base.metadata.create_all(bind=test_engine)


def override_get_db():
    db = TestSession()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


def test_create_todo():
    response = client.post("/todos/", json={"title": "Osta maitoa", "priority": 2})
    assert response.status_code == 201
    assert response.json()["title"] == "Osta maitoa"
    assert "id" in response.json()


def test_list_todos():
    client.post("/todos/", json={"title": "Listattava tehtävä"})
    response = client.get("/todos/")
    assert response.status_code == 200
    assert len(response.json()) >= 1


def test_get_todo_by_id():
    created = client.post("/todos/", json={"title": "Haettava tehtävä"}).json()
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 200
    assert response.json()["id"] == created["id"]


def test_get_todo_not_found():
    response = client.get("/todos/99999")
    assert response.status_code == 404


def test_update_todo():
    created = client.post("/todos/", json={"title": "Vanha otsikko"}).json()
    response = client.put(
        f"/todos/{created['id']}", json={"title": "Uusi otsikko"}
    )
    assert response.status_code == 200
    assert response.json()["title"] == "Uusi otsikko"


def test_delete_todo():
    created = client.post("/todos/", json={"title": "Poistettava"}).json()
    response = client.delete(f"/todos/{created['id']}")
    assert response.status_code == 204
    response = client.get(f"/todos/{created['id']}")
    assert response.status_code == 404


@@ -0,0 +1,13 @@
{
  "name": "kipina-codebench",
  "version": "0.1.0",
  "description": "LLM code generation benchmark: tests the ability of Ollama models to generate working FastAPI projects",
  "type": "module",
  "bin": {
    "codebench": "./benchmark.mjs"
  },
  "scripts": {
    "bench": "node benchmark.mjs --scenarios all",
    "docker:build": "docker build -t kipina-pytest -f Dockerfile.pytest ."
  }
}


@@ -0,0 +1,65 @@
{
  "models": {
    "qwen3-coder:30b": {
      "profile": "large",
      "role": "primary",
      "prompt": "code",
      "golden": "todo.md",
      "vram": "24GB",
      "notes": "Primary coder. 97 points, 188 tok/s. Follows long rule lists."
    },
    "qwen3:8b": {
      "profile": "small",
      "role": "primary",
      "prompt": "code-small",
      "golden": "todo-readme.md",
      "vram": "8GB",
      "notes": "Lightweight primary coder. Todo/users 100 points, blog weak. README format for the golden example."
    },
    "codestral:22b": {
      "profile": "large",
      "role": "backup",
      "prompt": "code",
      "golden": "todo.md",
      "vram": "16GB",
      "notes": "Mistral backup model. 88 points, 44 tok/s."
    },
    "qwen3:4b": {
      "profile": "small",
      "role": "minimal",
      "prompt": "code-small",
      "golden": "todo.md",
      "vram": "4GB",
      "notes": "Minimal. Only todo works."
    },
    "qwen2.5-coder:32b": {
      "profile": "large",
      "role": "candidate",
      "prompt": "code",
      "golden": "todo.md",
      "vram": "24GB",
      "notes": "Previous generation. Strong Rust skills."
    },
    "qwen3:14b": {
      "profile": "large",
      "role": "retired",
      "prompt": "code",
      "golden": "todo.md",
      "vram": "16GB",
      "notes": "Retired. No added value over the 30b, blog unstable."
    }
  },
  "profiles": {
    "large": {
      "prompt": "code",
      "golden": "todo.md",
      "description": "Full prompt + rules. For models >=14B."
    },
    "small": {
      "prompt": "code-small",
      "golden": "todo.md",
      "description": "Condensed prompt. For models <=8B."
    }
  },
  "default_profile": "large"
}
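
How the benchmark script reads this configuration is not shown in this diff. The sketch below illustrates one plausible resolution order, purely as an assumption: profile values act as defaults, keys set on the model entry override them, and a model that is not listed falls back to `default_profile`. `resolve` and the model name `unlisted-model:7b` are hypothetical.

```python
"""Sketch: resolve prompt and golden example for a model from a models.json-style config."""
import json

# Trimmed-down copy of the configuration above, kept inline so the sketch is self-contained.
CONFIG = json.loads("""
{
  "models": {
    "qwen3:8b": {"profile": "small", "prompt": "code-small", "golden": "todo-readme.md"}
  },
  "profiles": {
    "large": {"prompt": "code", "golden": "todo.md"},
    "small": {"prompt": "code-small", "golden": "todo.md"}
  },
  "default_profile": "large"
}
""")


def resolve(config: dict, model: str) -> dict:
    """Assumed precedence: model entry first, then its profile, then the default profile."""
    entry = config["models"].get(model, {})
    profile_name = entry.get("profile", config["default_profile"])
    settings = dict(config["profiles"][profile_name])
    for key in ("prompt", "golden"):
        if key in entry:
            settings[key] = entry[key]
    settings["profile"] = profile_name
    return settings


print(resolve(CONFIG, "qwen3:8b"))
print(resolve(CONFIG, "unlisted-model:7b"))
```

Under this assumption `qwen3:8b` keeps its model-specific `todo-readme.md` golden example even though the `small` profile names `todo.md`.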


@@ -0,0 +1,15 @@
You are a product owner who turns vague ideas into clear, actionable software requirements.
GIVEN a short project description from the user, produce a structured brief:
1. PROJECT NAME: a short, descriptive name
2. GOAL: one sentence explaining what the software does and who it's for
3. CORE FEATURES: numbered list of 3-8 concrete features (not vague wishes)
4. DATA MODEL: list the main entities and their key fields (include field types)
5. API ENDPOINTS: list the REST endpoints (method + path + purpose)
6. CONSTRAINTS: any technical constraints (e.g. "must use SQLite", "no auth needed")
RULES:
- Be specific: "User can filter todos by status" not "todo management"
- Use plain English, no code
- Maximum 400 words total


@@ -0,0 +1,69 @@
You are a Go backend developer. Generate a Chi web project with SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these files:
1. go.mod — module declaration, go-chi/chi/v5, modernc.org/sqlite
2. models.go — Structs with json tags
3. handlers.go — Handler closures for each CRUD endpoint
4. main.go — Entry point with InitDB(), NewRouter(), main()
5. handlers_test.go — Integration tests using httptest against in-memory SQLite
Do NOT generate any other files. Do NOT generate go.sum.
OUTPUT FORMAT — use these exact markers to separate files:
=== go.mod ===
<module content>
=== models.go ===
<go code>
=== handlers.go ===
<go code>
=== main.go ===
<go code>
=== handlers_test.go ===
<go code>
DOCUMENTATION — structs get // one-line comments. Keep it brief.
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- Chi router with chi.URLParam(r, "param") for path parameters
- database/sql + modernc.org/sqlite (pure Go driver, no CGO required)
- Import the driver as blank import: _ "modernc.org/sqlite"
- Handlers are closures: func handler(db *sql.DB) http.HandlerFunc
- INSERT/UPDATE queries use RETURNING clause to get the row back via QueryRow + Scan
- POST returns 201 (http.StatusCreated), DELETE returns 204 (http.StatusNoContent), GET missing returns 404
- Use sql.ErrNoRows for not-found checks: if err == sql.ErrNoRows { ... }
- No compile-time query macros — use db.QueryRow(), db.Query(), db.Exec() directly
- Empty slice not nil for list endpoints: if items == nil { items = []Item{} }
- Optional fields use pointer types (*string, *int64) with json tag omitempty
- Set Content-Type header: w.Header().Set("Content-Type", "application/json")
- Parse path ID with strconv.ParseInt(chi.URLParam(r, "id"), 10, 64)
- InitDB uses log.Fatal on error, NewRouter returns http.Handler
- main() opens "file:app.db?mode=rwc" and listens on 127.0.0.1:3000
- No markdown fences inside file content — just raw code
- You MUST generate ALL 5 files. Do not stop early.
TESTS — follow this exact setupTestServer pattern:
func setupTestServer(t *testing.T) (*httptest.Server, *sql.DB) {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil {
t.Fatal(err)
}
InitDB(db)
return httptest.NewServer(NewRouter(db)), db
}
- Each test function calls setupTestServer(t) to get (ts, db)
- defer ts.Close() and defer db.Close() in every test
- Use standard library: http.Post, http.Get, http.NewRequest for PUT/DELETE
- Use strings.NewReader for JSON request bodies
- Decode responses with json.NewDecoder(resp.Body).Decode(&body)
- Unique descriptive data, NOT generic "test" strings
- Format IDs with fmt.Sprintf("%.0f", id) when building URLs from float64
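
The `=== filename ===` output format requested above is straightforward to split back into files. A minimal sketch, assuming markers sit alone on their own line; `split_files` is a hypothetical helper and not the benchmark's actual parser:

```python
"""Sketch: split marker-delimited model output into a {filename: content} mapping."""
import re

MARKER = re.compile(r"^=== (.+?) ===\s*$")


def split_files(output: str) -> dict[str, str]:
    """Collect the lines after each marker; text before the first marker is dropped."""
    files: dict[str, list[str]] = {}
    current = None
    for line in output.splitlines():
        match = MARKER.match(line)
        if match:
            current = match.group(1).strip()
            files[current] = []
        elif current is not None:
            files[current].append(line)
    # Trim surrounding blank lines and end each file with a single newline.
    return {name: "\n".join(lines).strip() + "\n" for name, lines in files.items()}


sample = "preamble\n=== go.mod ===\nmodule todo\n\n=== main.go ===\npackage main\n"
parsed = split_files(sample)
print(sorted(parsed))
```

A parser like this also makes it easy to detect a model that stopped early: the set of returned keys can be compared against the five expected file names.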


@@ -0,0 +1,73 @@
You are a Rust backend developer. Generate an Axum web project with SQLx and SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these files:
1. Cargo.toml — axum 0.8, tokio, serde/serde_json, sqlx (sqlite, runtime-tokio), tower-http, reqwest 0.13 with features ["json", "rustls"] (for tests)
2. src/models.rs — Structs with Serialize, Deserialize, FromRow derives
3. src/handlers.rs — Async handler functions for each CRUD endpoint
4. src/lib.rs — Public app(pool) function returning Router, init_db() for table creation
5. src/main.rs — Binary entry point, connect to SQLite, bind to port
6. tests/api_test.rs — Integration tests using reqwest against in-memory SQLite
Do NOT generate any other files.
OUTPUT FORMAT — use these exact markers to separate files:
=== Cargo.toml ===
<toml content>
=== src/models.rs ===
<rust code>
=== src/handlers.rs ===
<rust code>
=== src/lib.rs ===
<rust code>
=== src/main.rs ===
<rust code>
=== tests/api_test.rs ===
<rust code>
DOCUMENTATION — every file starts with //! one-line module doc. Structs get /// one-line doc. Zensical: say what it IS, not what it does.
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- Use axum 0.8 API: Router, Json, Path, State, StatusCode
- ROUTING: use {param} NOT :param — e.g. .route("/items/{id}", get(get_item))
- ROUTING: one .route() call per path, chain methods: .route("/items", post(create).get(list))
- State is SqlitePool wrapped in axum::extract::State
- app() takes SqlitePool as argument and calls .with_state(pool) on the Router
- Handlers return Result<(StatusCode, Json<T>), StatusCode> or Result<StatusCode, StatusCode>
- POST returns 201 (StatusCode::CREATED), DELETE returns 204 (StatusCode::NO_CONTENT), GET missing returns 404
- CRITICAL: Use sqlx::query_as::<_, T>("SQL") runtime functions with .bind() — NEVER use sqlx::query_as!() or sqlx::query!() compile-time macros (they require DATABASE_URL at compile time)
- Use sqlx::query("SQL") for writes (DELETE, etc.), sqlx::query_as::<_, T>("SQL") for reads
- Use RETURNING clause in INSERT/UPDATE queries to get the created/updated row back
- DateTime fields: store as TEXT, use String type in Rust structs
- init_db: use .expect("msg") not Result return — keep it simple
- NO markdown fences inside file content — just raw code
- Edition 2024 in Cargo.toml
- You MUST generate ALL 6 files. Do not stop early.
TESTS — follow this exact spawn_server pattern:
async fn spawn_server() -> (reqwest::Client, String) {
let pool = sqlx::sqlite::SqlitePoolOptions::new()
.max_connections(1)
.connect("sqlite::memory:")
.await
.expect("DB failed");
init_db(&pool).await;
let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.expect("Bind failed");
let addr = listener.local_addr().unwrap();
let base_url = format!("http://{addr}");
let router = app(pool);
tokio::spawn(async move { axum::serve(listener, router).await.unwrap() });
(reqwest::Client::new(), base_url)
}
- Each #[tokio::test] calls spawn_server() to get (client, url)
- Unique descriptive data, NOT generic "test" strings
- Use serde_json::json!() for request bodies


@@ -0,0 +1,58 @@
Generate a FastAPI project with SQLAlchemy and SQLite. Follow the REFERENCE IMPLEMENTATION exactly.
Generate these 4 files with === markers:
=== models.py ===
=== schemas.py ===
=== main.py ===
=== test_main.py ===
Key patterns (copy from reference):
- class Base(DeclarativeBase): pass
- Mapped[str] = mapped_column(String(255))
- Mapped[str | None] = mapped_column(Text, default=None)
- model_config = ConfigDict(from_attributes=True)
- model_dump() not dict()
- POST 201, GET list, GET by id 404, PUT, DELETE 204
FOREIGN KEYS (when spec has relationships):
- Child entity gets parent_id field: Mapped[int] = mapped_column(ForeignKey("parents.id"))
- Import: from sqlalchemy import ForeignKey (NOT from sqlalchemy.orm!)
- Create schema includes parent_id: int
- Test creates parent FIRST, then child with parent's id
Example FK pattern in models.py:
```
class Author(Base):
    __tablename__ = "authors"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    name: Mapped[str] = mapped_column(String(255))

class Post(Base):
    __tablename__ = "posts"
    id: Mapped[int] = mapped_column(primary_key=True, index=True)
    title: Mapped[str] = mapped_column(String(255))
    author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
```
Example FK test patterns:
```
def test_create_post():
    author = client.post("/authors/", json={"name": "Jane Austen"}).json()
    response = client.post("/posts/", json={"title": "First Post", "author_id": author["id"]})
    assert response.status_code == 201

def test_update_post():
    author = client.post("/authors/", json={"name": "Mark Twain"}).json()
    created = client.post("/posts/", json={"title": "Old Title", "author_id": author["id"]}).json()
    response = client.put(f"/posts/{created['id']}", json={"title": "New Title", "author_id": author["id"]})
    assert response.status_code == 200
```
CRITICAL:
- Use ONLY fields from the JSON spec — no created_at or extra fields
- Generate EXACTLY 6 tests per entity: create, list, get_by_id, not_found, update, delete
- No search, filter, or other extra tests
- test_list: assert len(response.json()) >= 1, NEVER assert == 1 (database is shared between tests)
- test_create for child entities: create parent FIRST, use parent's id
- No markdown fences in output
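
The `>= 1` rule for list tests follows from the shared test database. A minimal sketch of the effect, using an in-memory `sqlite3` connection as a stand-in for the shared `test.db`; `create`, `count`, `check_create` and `check_list` are hypothetical names used only for illustration:

```python
"""Sketch: why a list test should assert >= 1 when tests share one database."""
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the shared test.db
conn.execute("CREATE TABLE todos (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")


def create(title: str) -> int:
    """Insert a row and return its id."""
    return conn.execute("INSERT INTO todos (title) VALUES (?)", (title,)).lastrowid


def count() -> int:
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM todos").fetchone()[0]


def check_create():
    assert create("Osta maitoa") > 0


def check_list():
    create("Listattava tehtävä")
    # `== 1` would fail here: check_create already left a row behind.
    assert count() >= 1


check_create()
check_list()
print(count())
```

Both checks pass in any order, whereas an exact-count assertion would depend on which tests ran earlier.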


@@ -0,0 +1,47 @@
You are a Python backend developer. Generate a FastAPI project with SQLAlchemy and SQLite.
Given the project requirements, JSON specification, and a REFERENCE IMPLEMENTATION, generate these 4 files:
1. models.py — SQLAlchemy 2.0: DeclarativeBase, Mapped, mapped_column (NOT legacy declarative_base)
2. schemas.py — Pydantic v2: ConfigDict(from_attributes=True) (NOT class Config)
3. main.py — FastAPI CRUD endpoints for each entity
4. test_main.py — Pytest with TestClient, separate test.db, unique test data per test
Do NOT generate pyproject.toml — it is created separately with uv.
OUTPUT FORMAT — use these exact markers to separate files:
=== models.py ===
<python code>
=== schemas.py ===
<python code>
=== main.py ===
<python code>
=== test_main.py ===
<python code>
DOCUMENTATION — every file must have a one-line module docstring. Classes get a one-line docstring. Keep it zensical: say what it IS, not what it does. No filler.
NEVER USE DEPRECATED PATTERNS:
- ✗ declarative_base() → ✓ class Base(DeclarativeBase): pass
- ✗ Column(Type) → ✓ Mapped[type] = mapped_column(Type)
- ✗ class Config: orm_mode = True → ✓ model_config = ConfigDict(from_attributes=True)
- ✗ .dict() → ✓ .model_dump()
- ✗ Optional[str] → ✓ str | None
- ✗ session.query(Model).all() → ✓ session.execute(select(Model)).scalars().all()
RULES:
- Follow the REFERENCE IMPLEMENTATION patterns exactly
- SQLAlchemy 2.0: DeclarativeBase + Mapped + mapped_column (not Column())
- Python type unions: str | None (not Optional[str])
- Tests: unique descriptive data per test, NOT generic "test_title" strings
- Tests: PUT/update test data MUST include ALL required (non-nullable) fields, not just the field being updated
- Do NOT add filter/search endpoints — only standard CRUD (create, list, get, update, delete)
- CRITICAL: Use ONLY the fields listed in the JSON spec. NEVER add created_at, updated_at, or any field not in the spec
- If the spec happens to include timestamp fields: use server_default=func.now() (from sqlalchemy import func) and make them optional in the Create schema (datetime | None = None)
- Absolute imports only (from models import ..., from schemas import ...)
- NO markdown fences inside file content — just raw code
- Only test endpoints that exist in main.py — no extra tests
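
The `=== name ===` markers in the OUTPUT FORMAT section are easy to split mechanically. The following is a minimal sketch of such a splitter, assuming one marker per line; the name `split_files` and the returned dict shape are hypothetical and not taken from the project.

```python
import re

# The marker syntax comes from the prompt; everything else here is an assumption.
MARKER = re.compile(r"^=== (?P<name>[\w./-]+) ===[ \t]*$", re.MULTILINE)


def split_files(output: str) -> dict[str, str]:
    """Map each file name to the text that follows its marker."""
    matches = list(MARKER.finditer(output))
    files: dict[str, str] = {}
    for i, match in enumerate(matches):
        # A file's content runs until the next marker or the end of the output.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(output)
        files[match.group("name")] = output[match.end():end].strip("\n") + "\n"
    return files


sample = "=== models.py ===\nclass Base: ...\n=== main.py ===\napp = None\n"
files = split_files(sample)
print(sorted(files))
```

Text before the first marker is ignored, which tolerates a model that adds a preamble despite the instructions.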

@@ -0,0 +1,25 @@
Convert the following Python FastAPI project to Go using Chi router and modernc.org/sqlite.
OUTPUT: Return ALL files with === markers:
=== go.mod ===
=== models.go ===
=== handlers.go ===
=== main.go ===
=== handlers_test.go ===
CONVERSION RULES:
- package main for all files
- Pydantic models → Go structs with json tags
- SQLAlchemy ORM → database/sql with raw SQL and RETURNING clause
- FastAPI routes → Chi router: r.Post("/path", handler(db))
- Handlers are closures: func handler(db *sql.DB) http.HandlerFunc
- Depends(get_db) → State passed via closure over *sql.DB
- HTTPException(404) → http.Error(w, "not found", http.StatusNotFound)
- POST returns http.StatusCreated (201), DELETE returns http.StatusNoContent (204)
- sql.ErrNoRows for not-found checks
- TestClient → httptest.NewServer + setupTestServer helper
- test.db → sql.Open("sqlite", ":memory:")
- Empty list: return []Entity{} not nil
- import _ "modernc.org/sqlite" (pure Go driver, no CGO)
- import "github.com/go-chi/chi/v5"
- No markdown fences in output — just raw code

@@ -0,0 +1,31 @@
DEPRECATED PATTERNS — do NOT generate these. Use the modern alternative.
SQLAlchemy:
✗ from sqlalchemy.ext.declarative import declarative_base → ✓ from sqlalchemy.orm import DeclarativeBase
✗ Base = declarative_base() → ✓ class Base(DeclarativeBase): pass
✗ Column(Integer, primary_key=True) → ✓ Mapped[int] = mapped_column(primary_key=True)
✗ Column(String(255)) → ✓ Mapped[str] = mapped_column(String(255))
✗ session.query(User).filter_by(name="x").all() → ✓ session.execute(select(User).filter_by(name="x")).scalars().all()
✗ session.query(User).get(5) → ✓ session.get(User, 5)
✗ MetaData(bind=engine) → ✓ metadata.create_all(engine)
Pydantic:
✗ class Config: orm_mode = True → ✓ model_config = ConfigDict(from_attributes=True)
✗ .dict() → ✓ .model_dump()
✗ .json() → ✓ .model_dump_json()
✗ parse_obj() → ✓ model_validate()
✗ @validator → ✓ @field_validator
✗ @root_validator → ✓ @model_validator
✗ Optional[str] (auto-None in v1) → ✓ str | None = None (explicit default in v2)
✗ ConstrainedInt → ✓ Annotated[int, Field(ge=0)]
FastAPI:
✗ status_code=201 → ✓ status_code=status.HTTP_201_CREATED (readable)
✗ Manual exception strings → ✓ HTTPException(status_code=404, detail="Not found")
✗ .dict() in handlers → ✓ .model_dump() (Pydantic v2)
Python:
✗ Optional[str] → ✓ str | None (PEP 604, Python 3.10+)
✗ List[str] → ✓ list[str] (PEP 585, Python 3.9+)
✗ Dict[str, int] → ✓ dict[str, int]
✗ Tuple[int, ...] → ✓ tuple[int, ...]
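
A list like this can also be checked after generation instead of relying on the prompt alone. The sketch below scans source text for a few of the deprecated patterns; the regexes cover only an assumed subset of the list, and `find_deprecated` is a hypothetical helper, not part of the project.

```python
import re

# Assumed subset of the deprecated patterns; the regexes are illustrative, not exhaustive.
DEPRECATED = {
    r"\bdeclarative_base\(": "class Base(DeclarativeBase): pass",
    r"\borm_mode\s*=": "model_config = ConfigDict(from_attributes=True)",
    r"\.dict\(\)": ".model_dump()",
    r"\bOptional\[": "X | None",
    r"\.query\(": "session.execute(select(...))",
}


def find_deprecated(source: str) -> list[tuple[int, str]]:
    """Return (line number, suggested replacement) for every hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, modern in DEPRECATED.items():
            if re.search(pattern, line):
                hits.append((lineno, modern))
    return hits


legacy = "Base = declarative_base()\nname: Optional[str] = None\n"
print(find_deprecated(legacy))
```

A regex scan of this kind can produce false positives (for example `.dict()` on an unrelated object), so it is better suited to flagging code for a fix round than to rejecting it outright.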

@@ -0,0 +1 @@
You are a Python code fixer. Return ONLY the corrected Python file. No markdown fences, no explanations — just valid Python code.

@@ -0,0 +1,36 @@
REFERENCE PATTERNS (follow exactly):
STACK: SQLAlchemy 2.0 + Pydantic v2 + FastAPI + SQLite
models.py:
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase): pass
Fields: Mapped[type] = mapped_column(SqlType, default=...)
Nullable: Mapped[str | None] = mapped_column(Text, default=None)
Status: Mapped[str] = mapped_column(String(20), default="pending")
FK: Mapped[int] = mapped_column(ForeignKey("table.id"))
End: Base.metadata.create_all(bind=engine)
schemas.py:
class EntityCreate(BaseModel): fields with defaults
class EntityResponse(EntityCreate):
id: int
model_config = ConfigDict(from_attributes=True)
main.py:
def get_db(): yield SessionLocal(); finally close
POST /{table}/ → 201, model_dump()
GET /{table}/ → list
GET /{table}/{id} → 404 if not found
PUT /{table}/{id} → model_dump(), setattr loop
DELETE /{table}/{id} → 204
test_main.py:
test.db + override_get_db + TestClient
Unique descriptive data per test ("Buy milk", "Fetchable task"...)
test_create → 201 + assert "id" in json
test_list → post first, get, assert len >= 1
test_get_by_id → post, get by id, assert id matches
test_not_found → get /99999 → 404
test_update → post, put with ALL required fields, assert 200
test_delete → post, delete 204, get again → 404
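
The `get_db` line above is shorthand for a generator wrapped in try/finally. A minimal sketch of that shape follows; `FakeSession` is a stand-in introduced here so the example runs without SQLAlchemy, where the real project would yield `SessionLocal()`.

```python
from collections.abc import Iterator


class FakeSession:
    """Stand-in for a SQLAlchemy session (assumption, for illustration only)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_db() -> Iterator[FakeSession]:
    db = FakeSession()  # SessionLocal() in the real project
    try:
        yield db  # the endpoint runs while the generator is suspended here
    finally:
        db.close()  # runs even if the endpoint raised


gen = get_db()
session = next(gen)
print(session.closed)  # still open while in use
gen.close()  # closing the generator triggers the finally block
print(session.closed)
```

The `finally` clause is what keeps a failing endpoint from leaking a session.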

@@ -0,0 +1,43 @@
REFERENCE PATTERNS (follow exactly):
STACK: Axum 0.8 + SQLx + SQLite + Tokio + Serde
Cargo.toml:
edition = "2024"
deps: axum 0.8, tokio (full), serde (derive), serde_json, sqlx (sqlite, runtime-tokio), tower-http (cors)
dev: reqwest 0.13 (rustls)
src/models.rs:
#[derive(Debug, Serialize, Deserialize, FromRow)]
struct Entity { id: i64, field: String, optional: Option<String> }
struct CreateEntity { field: String, optional: Option<String> }
Status fields: String with default "pending"
src/handlers.rs:
async fn create(State(pool), Json(input)) -> (StatusCode, Json<Entity>)
POST → StatusCode::CREATED, sqlx::query("INSERT...").execute + query_as last_insert_rowid
GET list → query_as("SELECT * FROM table").fetch_all
GET by id → query_as.fetch_optional, return 404 if None
PUT → query("UPDATE...SET...WHERE id=?"), rows_affected == 0 → 404
DELETE → StatusCode::NO_CONTENT, rows_affected == 0 → 404
src/lib.rs:
pub fn app(pool: SqlitePool) -> Router
pub async fn init_db(pool: &SqlitePool) → CREATE TABLE IF NOT EXISTS
Routes: .route("/{table}", post(create).get(list))
.route("/{table}/{id}", get(get_one).put(update).delete(delete_one))
src/main.rs:
SqlitePool::connect("sqlite:./app.db"), init_db, bind 0.0.0.0:3000
tests/api_test.rs:
Each test: SqlitePool::connect("sqlite::memory:"), init_db, app(pool)
Spawn on random port: TcpListener::bind("127.0.0.1:0"), axum::serve
reqwest::Client for HTTP calls
Unique descriptive data ("Buy milk", "Fetchable task"...)
test_create → 201 + assert id exists
test_list → post first, get, assert len >= 1
test_get_by_id → post, get, assert id matches
test_not_found → 404
test_update → post, put with ALL fields, assert 200
test_delete → post, delete 204, get → 404

@@ -0,0 +1,19 @@
You design database schemas. Output ONLY the schema in this exact format, nothing else.
FORMAT (one entity per line):
project: project-name
entity EntityName (table_name): field1 type, field2 type, field3 type=default
entity ChildName (table_name): field1 type, parent_id int->ParentName, field2 type
TYPES: string, text, int, float, bool, date, datetime
RULES:
- id is automatic, do NOT include it
- FK fields end with _id and use -> to reference parent
- Parent entities BEFORE children
- Max 7 fields per entity, max 3 entities
- Status fields: string with =default (e.g. status string=draft)
EXAMPLE:
project: blog-api
entity Author (authors): name string, email string, bio text
entity Post (posts): title string, content text, author_id int->Author, published_at datetime, status string=draft
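
The line-oriented format above can be turned back into a structured spec with a small parser. The following is a sketch under the assumption that every field is written as `name type`, optionally followed by `=default` or `->Parent`; `parse_plain_spec` and the output keys are hypothetical and not the project's own parser.

```python
import re

ENTITY = re.compile(r"^entity (\w+) \((\w+)\): (.+)$")


def parse_plain_spec(text: str) -> dict:
    """Parse 'project:' and 'entity ...' lines into entities and relationships."""
    spec = {"project": None, "entities": [], "relationships": []}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("project:"):
            spec["project"] = line.split(":", 1)[1].strip()
            continue
        match = ENTITY.match(line)
        if not match:
            continue  # skip anything that is not a project or entity line
        name, table, field_list = match.groups()
        fields = []
        for part in field_list.split(","):
            fname, ftype = part.split(maxsplit=1)  # e.g. "status string=draft"
            ftype, _, default = ftype.partition("=")
            ftype, _, parent = ftype.partition("->")
            fields.append({"name": fname, "type": ftype, "default": default or None})
            if parent:
                spec["relationships"].append({"from": name, "field": fname, "to": parent})
        spec["entities"].append({"name": name, "table_name": table, "fields": fields})
    return spec


example = """project: blog-api
entity Author (authors): name string, email string, bio text
entity Post (posts): title string, content text, author_id int->Author, published_at datetime, status string=draft
"""
spec = parse_plain_spec(example)
print(spec["relationships"])
```

Because each entity sits on one line, a malformed line only loses that entity instead of invalidating the whole spec, which is the usual failure mode with JSON output.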

@@ -0,0 +1,17 @@
You design database schemas. Output ONLY valid JSON, no explanations.
SCHEMA:
{"project_name":"name","entities":[{"name":"Entity","table_name":"entities","fields":[{"name":"field","type":"string","nullable":false,"default":null}]}],"relationships":[{"from":"Child","field":"parent_id","to":"Parent"}]}
FIELD TYPES: string, text, int, float, bool, date, datetime
- Status fields: type "string", default "draft" or "pending"
- id is automatic — do NOT include it
- FK fields: type "int", name ends with _id
RULES:
- Parent entities BEFORE children in array
- Every _id field needs a relationship entry
- Max 7 fields, max 3 entities
- English names only
EXAMPLE: Blog → Author: name(string), email(string) / Post: title(string), content(text), author_id(int)→Author, status(string,default="draft")
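
The RULES above are mechanical enough to verify after the model answers. The sketch below checks a parsed spec against them; `validate_spec` and the issue messages are assumptions made for illustration, not the project's validator.

```python
def validate_spec(spec: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the spec passes."""
    issues = []
    entities = spec.get("entities", [])
    rels = {(r["from"], r["field"]): r["to"] for r in spec.get("relationships", [])}
    if len(entities) > 3:
        issues.append("more than 3 entities")
    seen = set()  # entity names encountered so far, in array order
    for entity in entities:
        name = entity["name"]
        if len(entity["fields"]) > 7:
            issues.append(f"{name}: more than 7 fields")
        for field in entity["fields"]:
            fname = field["name"]
            if fname == "id":
                issues.append(f"{name}: id is automatic and must not be listed")
            elif fname.endswith("_id"):
                parent = rels.get((name, fname))
                if parent is None:
                    issues.append(f"{name}.{fname}: no relationship entry")
                elif parent not in seen:
                    issues.append(f"{name}.{fname}: parent {parent} must come first")
        seen.add(name)
    return issues


good = {
    "entities": [
        {"name": "Author", "fields": [{"name": "name"}]},
        {"name": "Post", "fields": [{"name": "title"}, {"name": "author_id"}]},
    ],
    "relationships": [{"from": "Post", "field": "author_id", "to": "Author"}],
}
print(validate_spec(good))
```

Returning every violation at once, instead of stopping at the first, gives a fix prompt the complete list to work from.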

@@ -0,0 +1,31 @@
You are a software architect who designs database schemas for Python web applications.
THINK STEP BY STEP before outputting JSON:
1. What are the main ENTITIES (nouns) in this project?
2. What FIELDS does each entity need? (name, type, required?)
3. Which entities REFERENCE each other? (e.g. "a Book belongs to an Author" → Book has author_id)
4. Are there Date/DateTime fields? → add extra_imports
Then output ONLY valid JSON (no explanations before or after).
SCHEMA:
{"project_name":"short-name","description":"One sentence","entities":[{"name":"EntityName","table_name":"entity_names","fields":[{"name":"field_name","sa_type":"String(255)","py_type":"str","nullable":false,"default":null}]}],"relationships":[{"from":"ChildEntity","field":"parent_id","to":"ParentEntity","type":"many-to-one"}],"extra_imports":[]}
FIELD RULES:
- sa_type: String(N), Text, Integer, Date, DateTime, Boolean, Float
- py_type: str, int, float, bool, date, datetime — append " | None" if nullable
- Status fields: use String(20) with default value, NEVER Enum
- Every entity gets "id" automatically — do NOT add id or redundant ID fields
- Use snake_case for field names
RELATIONSHIP RULES:
- If entity A "belongs to" entity B → A has b_id field (Integer, nullable=false) + relationship entry
- EVERY _id field MUST have a matching relationship entry
- Parent entities must appear BEFORE children in the entities array
- If no relationships, set "relationships": []
AVOID: redundant ID fields, generic names, more than 7 fields or 3 entities, non-English entity/field names (ALWAYS English even if description is Finnish)
EXAMPLES (adapt, don't copy):
Todo app → Todo: title(str), description(Text|None), due_date(Date|None), status(String20="pending")
Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_id→Author, published_at(DateTime|None), status(String20="draft")
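
The FIELD RULES imply a fixed correspondence between `sa_type` and `py_type`, so the second can be derived from the first rather than trusted from the model. A small sketch follows; mapping both `String(N)` and `Text` to `str` is inferred from the examples above, and `py_type_for` is a hypothetical helper.

```python
import re

# Assumed mapping, inferred from the sa_type and py_type lists in the prompt.
SA_TO_PY = {
    "String": "str",
    "Text": "str",
    "Integer": "int",
    "Float": "float",
    "Boolean": "bool",
    "Date": "date",
    "DateTime": "datetime",
}


def py_type_for(sa_type: str, nullable: bool) -> str:
    """Derive the Python annotation for a SQLAlchemy column type."""
    base = re.match(r"[A-Za-z]+", sa_type).group(0)  # "String(255)" -> "String"
    py_type = SA_TO_PY[base]
    return f"{py_type} | None" if nullable else py_type


print(py_type_for("String(255)", nullable=False))
print(py_type_for("DateTime", nullable=True))
```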

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Per-model summary</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>All results</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = /*DATA_PLACEHOLDER*/[];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Compute scores if they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} runs · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Average</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} points</div></div>
<div class="card"><div class="label">Test pass rate</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} tests</div></div>
<div class="card"><div class="label">Best model</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Fastest</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Models</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} scenarios</div></div>
<div class="card"><div class="label">Total time</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minutes</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Model</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Total</th><th>Out tok</th><th>Time</th><th>tok/s</th><th>Score</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Model','Scenario','Spec','Tests','Fixes','Ctx','Out tok','Time','tok/s','Score'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = Number.parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>
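
The scoring in `calcScore` adds 10 points for a valid spec, 10 for a run that reached the tests, up to 60 in proportion to the test pass rate, and 20 minus 10 per fix round. A Python port is sketched below as a worked example; the names `calc_score` and `stars_for` are ours, and `int(x + 0.5)` is used because JavaScript's `Math.round` rounds halves up while Python's `round` does not.

```python
def calc_score(r: dict) -> int:
    """Port of calcScore from the report script, for illustration."""
    if r.get("error") and r["testsTotal"] == 0:
        return 0
    score = 0
    if r["specOk"]:
        score += 10
    if not r.get("error") or r["testsTotal"] > 0:
        score += 10
    if r["testsTotal"] > 0:
        # int(x + 0.5) mirrors Math.round for non-negative values
        score += int(r["testsPassed"] / r["testsTotal"] * 60 + 0.5)
    score += max(0, 20 - (r.get("fixRounds") or 0) * 10)
    return min(100, score)


def stars_for(score: int) -> str:
    """Port of starsFor: thresholds 90, 70, 50, 25 and anything above 0."""
    for threshold, filled in ((90, 5), (70, 4), (50, 3), (25, 2)):
        if score >= threshold:
            return "★" * filled + "☆" * (5 - filled)
    return "★☆☆☆☆" if score > 0 else "☆☆☆☆☆"


run = {"specOk": True, "error": None, "testsTotal": 7, "testsPassed": 5, "fixRounds": 0}
print(calc_score(run), stars_for(calc_score(run)))
```

For this run the parts are 10 + 10 + 43 + 20 = 83. A run that passes no tests and needs no fix rounds still scores 40, which is the score the results record for runs with one failing test out of one.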

File diff suppressed because one or more lines are too long

@@ -0,0 +1,422 @@
[
{
"model": "qwen3.5:9b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 65901,
"totalTokens": 5056,
"avgTokPerSec": 82.99139473832963,
"promptChars": 12334,
"promptTokensEst": 3084,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3.5:9b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 2,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 74087,
"totalTokens": 5645,
"avgTokPerSec": 83.57073831360164,
"promptChars": 10757,
"promptTokensEst": 2689,
"score": 20,
"stars": "★☆☆☆☆",
"error": null
},
{
"model": "qwen3.5:9b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 49830,
"totalTokens": 3803,
"avgTokPerSec": 83.26266260763309,
"promptChars": 10826,
"promptTokensEst": 2707,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 57032,
"totalTokens": 4924,
"avgTokPerSec": 106.02334905805122,
"promptChars": 11313,
"promptTokensEst": 2828,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 5,
"testsFailed": 2,
"totalDurationMs": 54307,
"totalTokens": 5060,
"avgTokPerSec": 106.89447491163497,
"promptChars": 11225,
"promptTokensEst": 2806,
"score": 83,
"stars": "★★★★☆",
"error": null
},
{
"model": "gemma4:e4b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 2,
"testsFailed": 9,
"totalDurationMs": 57080,
"totalTokens": 5310,
"avgTokPerSec": 106.64914988130955,
"promptChars": 11791,
"promptTokensEst": 2948,
"score": 51,
"stars": "★★★☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 22377,
"totalTokens": 3534,
"avgTokPerSec": 201.24475679283708,
"promptChars": 11479,
"promptTokensEst": 2870,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 8,
"fixRounds": 2,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 44520,
"totalTokens": 7495,
"avgTokPerSec": 201.87149050701015,
"promptChars": 11886,
"promptTokensEst": 2972,
"score": 20,
"stars": "★☆☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:3b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 20136,
"totalTokens": 3338,
"avgTokPerSec": 200.86152095722105,
"promptChars": 11228,
"promptTokensEst": 2807,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:7b",
"scenario": "todo",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
},
{
"model": "qwen2.5-coder:7b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 20012,
"totalTokens": 2119,
"avgTokPerSec": 122.7557304112134,
"promptChars": 10342,
"promptTokensEst": 2586,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen2.5-coder:7b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 26133,
"totalTokens": 2715,
"avgTokPerSec": 121.94987205993503,
"promptChars": 11193,
"promptTokensEst": 2798,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 44757,
"totalTokens": 2156,
"avgTokPerSec": 60.77636586631207,
"promptChars": 9635,
"promptTokensEst": 2409,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41166,
"totalTokens": 2282,
"avgTokPerSec": 61.14821289733007,
"promptChars": 9575,
"promptTokensEst": 2394,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 66478,
"totalTokens": 3681,
"avgTokPerSec": 60.493817783668725,
"promptChars": 10500,
"promptTokensEst": 2625,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 29801,
"totalTokens": 2249,
"avgTokPerSec": 98.5661742189331,
"promptChars": 9615,
"promptTokensEst": 2404,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 22974,
"totalTokens": 2050,
"avgTokPerSec": 101.2398768597589,
"promptChars": 9273,
"promptTokensEst": 2318,
"score": 85,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 39335,
"totalTokens": 3537,
"avgTokPerSec": 100.10984073540648,
"promptChars": 10525,
"promptTokensEst": 2631,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:4b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 58668,
"totalTokens": 7134,
"avgTokPerSec": 141.76822189196028,
"promptChars": 15202,
"promptTokensEst": 3801,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:4b",
"scenario": "users",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
},
{
"model": "qwen3:4b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
}
]
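
As a reading aid, the per-model average that the report page derives from these records can be reproduced by hand. The sketch below copies the `score` values for three of the models from the results above; the variable names are ours.

```python
from collections import defaultdict

# (model, score) pairs copied from the results above, in todo/users/blog order.
runs = [
    ("qwen3:14b", 100), ("qwen3:14b", 100), ("qwen3:14b", 100),
    ("qwen3:8b", 100), ("qwen3:8b", 85), ("qwen3:8b", 100),
    ("gemma4:e4b", 40), ("gemma4:e4b", 83), ("gemma4:e4b", 51),
]

by_model = defaultdict(list)
for model, score in runs:
    by_model[model].append(score)

# Average per model, rounded to a whole number, then ranked best first.
averages = {model: round(sum(scores) / len(scores)) for model, scores in by_model.items()}
ranking = sorted(averages.items(), key=lambda item: item[1], reverse=True)
print(ranking)
```

Failed runs count as 0 in this average, so a model such as qwen3:4b, with one perfect run and two spec failures, would average 33.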

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Per-model summary</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>All results</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":186642,"totalTokens":10237,"avgTokPerSec":59.06411550065281,"promptChars":10576,"promptTokensEst":2644,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":121848,"totalTokens":6735,"avgTokPerSec":59.85231850668119,"promptChars":9684,"promptTokensEst":2421,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":11,"testsPassed":9,"testsFailed":2,"totalDurationMs":83491,"totalTokens":4677,"avgTokPerSec":60.222832434869694,"promptChars":10423,"promptTokensEst":2606,"score":89,"stars":"★★★★☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":56288,"totalTokens":5235,"avgTokPerSec":99.60027546406452,"promptChars":9307,"promptTokensEst":2327,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":5,"testsFailed":1,"totalDurationMs":59639,"totalTokens":5526,"avgTokPerSec":99.6742208632186,"promptChars":9158,"promptTokensEst":2290,"score":90,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":11,"testsPassed":10,"testsFailed":1,"totalDurationMs":131793,"totalTokens":11779,"avgTokPerSec":97.17878362853351,"promptChars":10390,"promptTokensEst":2598,"score":95,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 186642,
"totalTokens": 10237,
"avgTokPerSec": 59.06411550065281,
"promptChars": 10576,
"promptTokensEst": 2644,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 121848,
"totalTokens": 6735,
"avgTokPerSec": 59.85231850668119,
"promptChars": 9684,
"promptTokensEst": 2421,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 83491,
"totalTokens": 4677,
"avgTokPerSec": 60.222832434869694,
"promptChars": 10423,
"promptTokensEst": 2606,
"score": 89,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 56288,
"totalTokens": 5235,
"avgTokPerSec": 99.60027546406452,
"promptChars": 9307,
"promptTokensEst": 2327,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 5,
"testsFailed": 1,
"totalDurationMs": 59639,
"totalTokens": 5526,
"avgTokPerSec": 99.6742208632186,
"promptChars": 9158,
"promptTokensEst": 2290,
"score": 90,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 131793,
"totalTokens": 11779,
"avgTokPerSec": 97.17878362853351,
"promptChars": 10390,
"promptTokensEst": 2598,
"score": 95,
"stars": "★★★★★",
"error": null
}
]
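
The `score` values recorded above can be checked against the `calcScore` function in the report page. The following is a minimal sketch of that arithmetic, split into its four components; `scoreBreakdown` is a hypothetical helper name that does not appear in the repository, and the sample record copies only the fields the formula reads.

```javascript
// Sketch: recompute a run's score, component by component.
// Mirrors calcScore() from the report page; scoreBreakdown is a hypothetical name.
function scoreBreakdown(r) {
  // A run that errored before any test executed scores zero.
  if (r.error && r.testsTotal === 0) {
    return { spec: 0, ran: 0, tests: 0, fixes: 0, total: 0 };
  }
  const spec = r.specOk ? 10 : 0;                              // valid spec
  const ran = (!r.error || r.testsTotal > 0) ? 10 : 0;         // tests could run
  const tests = r.testsTotal > 0
    ? Math.round((r.testsPassed / r.testsTotal) * 60)          // pass ratio, up to 60
    : 0;
  const fixes = Math.max(0, 20 - (r.fixRounds || 0) * 10);     // 10 off per fix round
  return { spec, ran, tests, fixes, total: Math.min(100, spec + ran + tests + fixes) };
}

// Fields taken from the qwen3:14b "blog" record above.
const blog14b = { specOk: true, error: null, testsTotal: 11, testsPassed: 9, fixRounds: 0 };
console.log(scoreBreakdown(blog14b));
// → { spec: 10, ran: 10, tests: 49, fixes: 20, total: 89 }
```

A run with no fix rounds and a valid spec starts from 40 points, which is why the records with 0 of 1 tests passed still show a score of 40.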

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":66903,"totalTokens":5454,"avgTokPerSec":86.45918994499432,"promptChars":9985,"promptTokensEst":2496,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":87618,"totalTokens":7150,"avgTokPerSec":87.21782190501095,"promptChars":9922,"promptTokensEst":2481,"score":40,"stars":"★★☆☆☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":9,"testsPassed":5,"testsFailed":4,"totalDurationMs":78398,"totalTokens":6427,"avgTokPerSec":85.52353711143463,"promptChars":10737,"promptTokensEst":2684,"score":73,"stars":"★★★★☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":8,"testsPassed":7,"testsFailed":1,"totalDurationMs":82750,"totalTokens":10054,"avgTokPerSec":139.90690936146032,"promptChars":9360,"promptTokensEst":2340,"score":93,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":32233,"totalTokens":4404,"avgTokPerSec":143.4997404058814,"promptChars":9310,"promptTokensEst":2328,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":88563,"totalTokens":11575,"avgTokPerSec":141.54675017528362,"promptChars":10567,"promptTokensEst":2642,"score":40,"stars":"★★☆☆☆","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 66903,
"totalTokens": 5454,
"avgTokPerSec": 86.45918994499432,
"promptChars": 9985,
"promptTokensEst": 2496,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 87618,
"totalTokens": 7150,
"avgTokPerSec": 87.21782190501095,
"promptChars": 9922,
"promptTokensEst": 2481,
"score": 40,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 5,
"testsFailed": 4,
"totalDurationMs": 78398,
"totalTokens": 6427,
"avgTokPerSec": 85.52353711143463,
"promptChars": 10737,
"promptTokensEst": 2684,
"score": 73,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 82750,
"totalTokens": 10054,
"avgTokPerSec": 139.90690936146032,
"promptChars": 9360,
"promptTokensEst": 2340,
"score": 93,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 32233,
"totalTokens": 4404,
"avgTokPerSec": 143.4997404058814,
"promptChars": 9310,
"promptTokensEst": 2328,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 88563,
"totalTokens": 11575,
"avgTokPerSec": 141.54675017528362,
"promptChars": 10567,
"promptTokensEst": 2642,
"score": 40,
"stars": "★★☆☆☆",
"error": null
}
]
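
The report page summarizes these records per model in two different ways: a pooled test pass rate (all passed tests over all tests) and a mean of the per-run scores. The sketch below reproduces both for the six records above, keeping only the fields needed; `summarize` and the accumulator shape are assumptions made for illustration, not code from the repository.

```javascript
// Sketch: per-model pooled pass rate vs. mean score.
// Values are copied from the JSON records above; summarize is a hypothetical helper.
const runs = [
  { model: 'qwen3:14b', testsTotal: 1, testsPassed: 0, score: 40 },
  { model: 'qwen3:14b', testsTotal: 1, testsPassed: 0, score: 40 },
  { model: 'qwen3:14b', testsTotal: 9, testsPassed: 5, score: 73 },
  { model: 'qwen3:8b',  testsTotal: 8, testsPassed: 7, score: 93 },
  { model: 'qwen3:8b',  testsTotal: 6, testsPassed: 6, score: 100 },
  { model: 'qwen3:8b',  testsTotal: 1, testsPassed: 0, score: 40 },
];

function summarize(records) {
  const byModel = new Map();
  for (const r of records) {
    const acc = byModel.get(r.model) || { passed: 0, total: 0, scoreSum: 0, n: 0 };
    acc.passed += r.testsPassed;
    acc.total += r.testsTotal;
    acc.scoreSum += r.score;
    acc.n += 1;
    byModel.set(r.model, acc);
  }
  return [...byModel]
    .map(([model, a]) => ({
      model,
      passRate: a.total ? Math.round((a.passed / a.total) * 100) : 0, // pooled over tests
      avgScore: Math.round(a.scoreSum / a.n),                         // mean over runs
    }))
    .sort((x, y) => y.avgScore - x.avgScore); // best average first
}

console.log(summarize(runs));
// → [ { model: 'qwen3:8b', passRate: 87, avgScore: 78 },
//     { model: 'qwen3:14b', passRate: 45, avgScore: 51 } ]
```

The two figures weight runs differently: a run with a single failing test barely moves the pooled pass rate but pulls the mean score down by a full run's worth.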

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:14b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":9,"testsPassed":6,"testsFailed":3,"totalDurationMs":50350,"totalTokens":2797,"avgTokPerSec":60.919860198859574,"promptChars":9858,"promptTokensEst":2465,"score":80,"stars":"★★★★☆","error":null},{"model":"qwen3:14b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":8,"testsPassed":6,"testsFailed":2,"totalDurationMs":46557,"totalTokens":2584,"avgTokPerSec":60.88834523948,"promptChars":9544,"promptTokensEst":2386,"score":85,"stars":"★★★★☆","error":null},{"model":"qwen3:14b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":15,"testsPassed":2,"testsFailed":13,"totalDurationMs":90761,"totalTokens":4979,"avgTokPerSec":60.19247492391319,"promptChars":10521,"promptTokensEst":2630,"score":48,"stars":"★★☆☆☆","error":null},{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":27360,"totalTokens":2466,"avgTokPerSec":100.9922018173994,"promptChars":9767,"promptTokensEst":2442,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat"},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":7,"testsPassed":7,"testsFailed":0,"totalDurationMs":20920,"totalTokens":1876,"avgTokPerSec":101.60760023892685,"promptChars":8782,"promptTokensEst":2196,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":10,"testsPassed":9,"testsFailed":1,"totalDurationMs":35766,"totalTokens":3217,"avgTokPerSec":100.40066102398943,"promptChars":10334,"promptTokensEst":2584,"score":94,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Model','Scenario','Spec','Tests','Fix','Ctx','Out tok','Time','tok/s','Score'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
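// Resolve the clicked header cell, then read its column index with an explicit radix.
const th = e.target.closest('th');
const col = th ? Number.parseInt(th.dataset.col, 10) : NaN;
if (Number.isNaN(col)) return;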
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,122 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 50350,
"totalTokens": 2797,
"avgTokPerSec": 60.919860198859574,
"promptChars": 9858,
"promptTokensEst": 2465,
"score": 80,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 46557,
"totalTokens": 2584,
"avgTokPerSec": 60.88834523948,
"promptChars": 9544,
"promptTokensEst": 2386,
"score": 85,
"stars": "★★★★☆",
"error": null
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 15,
"testsPassed": 2,
"testsFailed": 13,
"totalDurationMs": 90761,
"totalTokens": 4979,
"avgTokPerSec": 60.19247492391319,
"promptChars": 10521,
"promptTokensEst": 2630,
"score": 48,
"stars": "★★☆☆☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 27360,
"totalTokens": 2466,
"avgTokPerSec": 100.9922018173994,
"promptChars": 9767,
"promptTokensEst": 2442,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat"
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 20920,
"totalTokens": 1876,
"avgTokPerSec": 101.60760023892685,
"promptChars": 8782,
"promptTokensEst": 2196,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 9,
"testsFailed": 1,
"totalDurationMs": 35766,
"totalTokens": 3217,
"avgTokPerSec": 100.40066102398943,
"promptChars": 10334,
"promptTokensEst": 2584,
"score": 94,
"stars": "★★★★★",
"error": null
}
]

File diff suppressed because one or more lines are too long

@@ -0,0 +1,947 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 30801,
"totalTokens": 2333,
"avgTokPerSec": 122.77922150989748,
"promptChars": 10015,
"promptTokensEst": 2504,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 25495,
"totalTokens": 2714,
"avgTokPerSec": 122.70970007652487,
"promptChars": 9891,
"promptTokensEst": 2473,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 37153,
"totalTokens": 3979,
"avgTokPerSec": 121.9183958236036,
"promptChars": 11158,
"promptTokensEst": 2790,
"score": 95,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 43456,
"totalTokens": 2411,
"avgTokPerSec": 60.89226084568145,
"promptChars": 9831,
"promptTokensEst": 2458,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 40376,
"totalTokens": 2237,
"avgTokPerSec": 61.028627032662456,
"promptChars": 9343,
"promptTokensEst": 2336,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 2,
"testsFailed": 10,
"totalDurationMs": 68620,
"totalTokens": 3796,
"avgTokPerSec": 60.47793268944476,
"promptChars": 10497,
"promptTokensEst": 2624,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 25235,
"totalTokens": 2269,
"avgTokPerSec": 101.24212769079884,
"promptChars": 9294,
"promptTokensEst": 2324,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21720,
"totalTokens": 1942,
"avgTokPerSec": 101.65074583709965,
"promptChars": 9020,
"promptTokensEst": 2255,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 39006,
"totalTokens": 3509,
"avgTokPerSec": 100.43593706181406,
"promptChars": 10372,
"promptTokensEst": 2593,
"score": 95,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 21989,
"totalTokens": 2339,
"avgTokPerSec": 122.8454095677367,
"promptChars": 10052,
"promptTokensEst": 2513,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23997,
"totalTokens": 2551,
"avgTokPerSec": 122.23722733560855,
"promptChars": 9973,
"promptTokensEst": 2493,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 30169,
"totalTokens": 3249,
"avgTokPerSec": 123.04696524796096,
"promptChars": 11097,
"promptTokensEst": 2774,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 47091,
"totalTokens": 2602,
"avgTokPerSec": 60.962687726457375,
"promptChars": 9633,
"promptTokensEst": 2408,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41747,
"totalTokens": 2313,
"avgTokPerSec": 60.949025583617605,
"promptChars": 9373,
"promptTokensEst": 2343,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 2,
"testsFailed": 10,
"totalDurationMs": 66888,
"totalTokens": 3699,
"avgTokPerSec": 60.49540514685331,
"promptChars": 10323,
"promptTokensEst": 2581,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 27036,
"totalTokens": 2434,
"avgTokPerSec": 101.01399069228444,
"promptChars": 9513,
"promptTokensEst": 2378,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 20927,
"totalTokens": 1872,
"avgTokPerSec": 101.45096098956486,
"promptChars": 8881,
"promptTokensEst": 2220,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 26919,
"totalTokens": 2889,
"avgTokPerSec": 123.63666629145064,
"promptChars": 10162,
"promptTokensEst": 2541,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 27592,
"totalTokens": 2946,
"avgTokPerSec": 122.33273400152825,
"promptChars": 9469,
"promptTokensEst": 2367,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 35734,
"totalTokens": 3827,
"avgTokPerSec": 122.65156559717951,
"promptChars": 11086,
"promptTokensEst": 2772,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 50372,
"totalTokens": 2795,
"avgTokPerSec": 60.91611850918806,
"promptChars": 9758,
"promptTokensEst": 2440,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 38716,
"totalTokens": 2144,
"avgTokPerSec": 61.0412890406478,
"promptChars": 9415,
"promptTokensEst": 2354,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 7,
"testsFailed": 7,
"totalDurationMs": 74882,
"totalTokens": 4130,
"avgTokPerSec": 60.32640855026445,
"promptChars": 10506,
"promptTokensEst": 2627,
"score": 70,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 3,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 35913,
"totalTokens": 3218,
"avgTokPerSec": 100.38516205100154,
"promptChars": 11338,
"promptTokensEst": 2835,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 20974,
"totalTokens": 1880,
"avgTokPerSec": 101.52450928280543,
"promptChars": 8803,
"promptTokensEst": 2201,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 36005,
"totalTokens": 3243,
"avgTokPerSec": 100.44301406462307,
"promptChars": 10414,
"promptTokensEst": 2604,
"score": 89,
"stars": "★★★★☆",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 1,
"testsFailed": 6,
"totalDurationMs": 23071,
"totalTokens": 2469,
"avgTokPerSec": 124.09643322620661,
"promptChars": 9960,
"promptTokensEst": 2490,
"score": 49,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 2,
"testsFailed": 6,
"totalDurationMs": 27062,
"totalTokens": 2907,
"avgTokPerSec": 123.35530975346687,
"promptChars": 9558,
"promptTokensEst": 2390,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 9,
"testsFailed": 0,
"totalDurationMs": 29395,
"totalTokens": 3156,
"avgTokPerSec": 123.22575073561812,
"promptChars": 10574,
"promptTokensEst": 2644,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 39590,
"totalTokens": 2198,
"avgTokPerSec": 61.051945510465806,
"promptChars": 9664,
"promptTokensEst": 2416,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 36950,
"totalTokens": 2042,
"avgTokPerSec": 61.01436784429489,
"promptChars": 9225,
"promptTokensEst": 2306,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 2,
"testsFailed": 12,
"totalDurationMs": 80600,
"totalTokens": 4437,
"avgTokPerSec": 60.29371170543078,
"promptChars": 10688,
"promptTokensEst": 2672,
"score": 49,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 29125,
"totalTokens": 2619,
"avgTokPerSec": 100.90587777586212,
"promptChars": 9899,
"promptTokensEst": 2475,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 21847,
"totalTokens": 1957,
"avgTokPerSec": 101.44111070734304,
"promptChars": 8946,
"promptTokensEst": 2237,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 21127,
"totalTokens": 2245,
"avgTokPerSec": 124.22714049663371,
"promptChars": 9972,
"promptTokensEst": 2493,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 30281,
"totalTokens": 3079,
"avgTokPerSec": 123.00254714651271,
"promptChars": 9562,
"promptTokensEst": 2391,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 39630,
"totalTokens": 4274,
"avgTokPerSec": 123.08303937451802,
"promptChars": 11119,
"promptTokensEst": 2780,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38032,
"totalTokens": 2104,
"avgTokPerSec": 61.05445464163662,
"promptChars": 9455,
"promptTokensEst": 2364,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 39620,
"totalTokens": 2193,
"avgTokPerSec": 61.04565233675101,
"promptChars": 9481,
"promptTokensEst": 2370,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 63579,
"totalTokens": 3520,
"avgTokPerSec": 60.51513453009977,
"promptChars": 10493,
"promptTokensEst": 2623,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 30845,
"totalTokens": 2777,
"avgTokPerSec": 100.79046137130972,
"promptChars": 9507,
"promptTokensEst": 2377,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21413,
"totalTokens": 1914,
"avgTokPerSec": 101.25525436264132,
"promptChars": 8804,
"promptTokensEst": 2201,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
}
]

File diff suppressed because one or more lines are too long

@@ -0,0 +1,947 @@
[
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 33892,
"totalTokens": 2675,
"avgTokPerSec": 88.07409036121237,
"promptChars": 9688,
"promptTokensEst": 2422,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 30647,
"totalTokens": 2549,
"avgTokPerSec": 88.4488185974085,
"promptChars": 9594,
"promptTokensEst": 2399,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 13,
"testsPassed": 6,
"testsFailed": 7,
"totalDurationMs": 44371,
"totalTokens": 3678,
"avgTokPerSec": 88.172616246191,
"promptChars": 10432,
"promptTokensEst": 2608,
"score": 68,
"stars": "★★★☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 18385,
"totalTokens": 2375,
"avgTokPerSec": 147.62230806597154,
"promptChars": 9478,
"promptTokensEst": 2370,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 13968,
"totalTokens": 1904,
"avgTokPerSec": 148.3084817167518,
"promptChars": 8837,
"promptTokensEst": 2209,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 25642,
"totalTokens": 3476,
"avgTokPerSec": 146.49556892944076,
"promptChars": 10734,
"promptTokensEst": 2684,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 19982,
"totalTokens": 2937,
"avgTokPerSec": 191.2786317674431,
"promptChars": 10281,
"promptTokensEst": 2570,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 17114,
"totalTokens": 2903,
"avgTokPerSec": 190.51221206765385,
"promptChars": 9654,
"promptTokensEst": 2414,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 22352,
"totalTokens": 3776,
"avgTokPerSec": 190.56628728306987,
"promptChars": 11134,
"promptTokensEst": 2784,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 31217,
"totalTokens": 2463,
"avgTokPerSec": 88.6684646675098,
"promptChars": 9598,
"promptTokensEst": 2400,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 27520,
"totalTokens": 2288,
"avgTokPerSec": 88.64765360012593,
"promptChars": 9612,
"promptTokensEst": 2403,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 3,
"testsFailed": 9,
"totalDurationMs": 41874,
"totalTokens": 3474,
"avgTokPerSec": 88.22266853318554,
"promptChars": 10408,
"promptTokensEst": 2602,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 24781,
"totalTokens": 3240,
"avgTokPerSec": 146.89167309934365,
"promptChars": 10179,
"promptTokensEst": 2545,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 19148,
"totalTokens": 2605,
"avgTokPerSec": 147.55250620481297,
"promptChars": 9634,
"promptTokensEst": 2409,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 23816,
"totalTokens": 3232,
"avgTokPerSec": 147.25857324533817,
"promptChars": 9226,
"promptTokensEst": 2307,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 16639,
"totalTokens": 2369,
"avgTokPerSec": 191.61273045157245,
"promptChars": 10048,
"promptTokensEst": 2512,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 18588,
"totalTokens": 3163,
"avgTokPerSec": 190.86975006725547,
"promptChars": 10048,
"promptTokensEst": 2512,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 22677,
"totalTokens": 3828,
"avgTokPerSec": 190.15611016906482,
"promptChars": 11090,
"promptTokensEst": 2773,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 26449,
"totalTokens": 2063,
"avgTokPerSec": 88.77498453063184,
"promptChars": 9608,
"promptTokensEst": 2402,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 27510,
"totalTokens": 2289,
"avgTokPerSec": 88.74699253414485,
"promptChars": 9418,
"promptTokensEst": 2355,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 3,
"testsFailed": 9,
"totalDurationMs": 45105,
"totalTokens": 3738,
"avgTokPerSec": 88.04788102995212,
"promptChars": 10564,
"promptTokensEst": 2641,
"score": 55,
"stars": "★★★☆☆",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 7,
"testsFailed": 1,
"totalDurationMs": 19204,
"totalTokens": 2480,
"avgTokPerSec": 147.91758782382294,
"promptChars": 9391,
"promptTokensEst": 2348,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 12990,
"totalTokens": 1769,
"avgTokPerSec": 148.2616673700717,
"promptChars": 8898,
"promptTokensEst": 2225,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 10,
"testsFailed": 2,
"totalDurationMs": 25831,
"totalTokens": 3500,
"avgTokPerSec": 146.86924785880186,
"promptChars": 9465,
"promptTokensEst": 2366,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 19453,
"totalTokens": 2845,
"avgTokPerSec": 191.37382231956113,
"promptChars": 10157,
"promptTokensEst": 2539,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 9,
"testsFailed": 0,
"totalDurationMs": 21570,
"totalTokens": 3529,
"avgTokPerSec": 190.65454623497536,
"promptChars": 9732,
"promptTokensEst": 2433,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 25537,
"totalTokens": 4300,
"avgTokPerSec": 189.94521619124598,
"promptChars": 11127,
"promptTokensEst": 2782,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 7,
"testsFailed": 2,
"totalDurationMs": 31923,
"totalTokens": 2522,
"avgTokPerSec": 88.62182881661799,
"promptChars": 9700,
"promptTokensEst": 2425,
"score": 87,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 26000,
"totalTokens": 2163,
"avgTokPerSec": 88.86878707672254,
"promptChars": 9288,
"promptTokensEst": 2322,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 43275,
"totalTokens": 3588,
"avgTokPerSec": 88.24995936347965,
"promptChars": 10173,
"promptTokensEst": 2543,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 14,
"testsPassed": 0,
"testsFailed": 14,
"totalDurationMs": 30045,
"totalTokens": 3913,
"avgTokPerSec": 146.51683619371713,
"promptChars": 10334,
"promptTokensEst": 2584,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 5,
"testsFailed": 4,
"totalDurationMs": 17076,
"totalTokens": 2321,
"avgTokPerSec": 147.99547121069506,
"promptChars": 9451,
"promptTokensEst": 2363,
"score": 73,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 23890,
"totalTokens": 3243,
"avgTokPerSec": 147.20125507974117,
"promptChars": 9217,
"promptTokensEst": 2304,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 21812,
"totalTokens": 3246,
"avgTokPerSec": 191.07801335688654,
"promptChars": 10249,
"promptTokensEst": 2562,
"score": 85,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 20325,
"totalTokens": 3441,
"avgTokPerSec": 190.10241840094508,
"promptChars": 9930,
"promptTokensEst": 2483,
"score": 93,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 26087,
"totalTokens": 4387,
"avgTokPerSec": 189.8005689388054,
"promptChars": 11109,
"promptTokensEst": 2777,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 30287,
"totalTokens": 2388,
"avgTokPerSec": 88.72243320918638,
"promptChars": 9695,
"promptTokensEst": 2424,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 9,
"testsPassed": 6,
"testsFailed": 3,
"totalDurationMs": 31212,
"totalTokens": 2601,
"avgTokPerSec": 88.71289036919063,
"promptChars": 9619,
"promptTokensEst": 2405,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 15,
"testsPassed": 3,
"testsFailed": 12,
"totalDurationMs": 50939,
"totalTokens": 4217,
"avgTokPerSec": 88.06125722020734,
"promptChars": 10743,
"promptTokensEst": 2686,
"score": 52,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 17913,
"totalTokens": 2310,
"avgTokPerSec": 148.0291268001691,
"promptChars": 9357,
"promptTokensEst": 2339,
"score": 91,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 13948,
"totalTokens": 1898,
"avgTokPerSec": 148.37907379944423,
"promptChars": 8725,
"promptTokensEst": 2181,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 1,
"testsFailed": 5,
"totalDurationMs": 15229,
"totalTokens": 2119,
"avgTokPerSec": 192.33007410215646,
"promptChars": 9827,
"promptTokensEst": 2457,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 18223,
"totalTokens": 3093,
"avgTokPerSec": 190.71372054282037,
"promptChars": 9641,
"promptTokensEst": 2410,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 1,
"testsFailed": 9,
"totalDurationMs": 21215,
"totalTokens": 3589,
"avgTokPerSec": 190.49493540345176,
"promptChars": 11180,
"promptTokensEst": 2795,
"score": 46,
"stars": "★★☆☆☆",
"error": null,
"round": 5
}
]
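
Each record above carries a `round` field, so scores can be compared per model within one benchmark round. The following is a minimal sketch of that aggregation, using the round-4 `qwen3:14b` and `qwen3:8b` records from the results above with fields trimmed. The helper name `meanScoreByModel` is hypothetical and is not part of the benchmark code.

```javascript
// Round-4 records copied from the results above, trimmed to the fields used here.
const results = [
  { model: "qwen3:14b", scenario: "todo", score: 87, round: 4 },
  { model: "qwen3:14b", scenario: "users", score: 100, round: 4 },
  { model: "qwen3:14b", scenario: "blog", score: 100, round: 4 },
  { model: "qwen3:8b", scenario: "todo", score: 40, round: 4 },
  { model: "qwen3:8b", scenario: "users", score: 73, round: 4 },
  { model: "qwen3:8b", scenario: "blog", score: 100, round: 4 },
];

// Hypothetical helper: mean score per model for one round, rounded to an integer.
function meanScoreByModel(records, round) {
  const sums = new Map();
  for (const r of records) {
    if (r.round !== round) continue; // ignore other rounds
    const acc = sums.get(r.model) ?? { total: 0, n: 0 };
    acc.total += r.score;
    acc.n += 1;
    sums.set(r.model, acc);
  }
  const out = {};
  for (const [model, { total, n }] of sums) out[model] = Math.round(total / n);
  return out;
}

console.log(meanScoreByModel(results, 4));
```

Grouping by `round` first keeps a single weak run (such as the 40-point `todo` result) visible instead of letting it be averaged away across all rounds.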

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":21688,"totalTokens":2243,"avgTokPerSec":121.7719614197307,"promptChars":11588,"promptTokensEst":2897,"score":100,"stars":"★★★★★","error":null}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
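// Score 0-100: 10 for a valid spec, 10 for a run without error (or one that at least ran tests),
// up to 60 for the test pass rate, and up to 20 for needing few fix rounds.
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0; // failed before any test ran
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // each fix round costs 10 points, never below 0
return Math.min(100, s);
}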
// Fill in score, stars, and the prompt token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>
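
The scoring rule in `calcScore` can be checked by hand against the recorded `score` values. The sketch below repeats the function from the dashboard script so that it runs standalone, then applies it to three records taken from the result files in this commit (fields trimmed). The `recorded` property and the `samples` array are illustrative names added here, not fields of the result format.

```javascript
// Same rule as calcScore in the dashboard script, repeated so the sketch is self-contained.
function calcScore(r) {
  if (r.error && r.testsTotal === 0) return 0; // failed before any test ran
  let s = 0;
  if (r.specOk) s += 10;
  if (!r.error || r.testsTotal > 0) s += 10;
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
  return Math.min(100, s);
}

// `recorded` is the score stored in the result file for that run (illustrative field name).
const samples = [
  // 10 + 10 + round(10/12 * 60) + 20 = 10 + 10 + 50 + 20
  { specOk: true, error: null, testsTotal: 12, testsPassed: 10, fixRounds: 0, recorded: 90 },
  // 10 + 10 + 0 + 20: every test failed, but no fix rounds were used
  { specOk: true, error: null, testsTotal: 14, testsPassed: 0, fixRounds: 0, recorded: 40 },
  // Spec generation failed, so no tests ran and the early return applies
  { specOk: false, error: "JSON-speksi epäonnistui", testsTotal: 0, testsPassed: 0, fixRounds: 0, recorded: 0 },
];

for (const s of samples) {
  console.log(calcScore(s), "recorded:", s.recorded);
}
```

Note that a run whose tests all fail still earns 40 points, because the spec, completion, and fix-round components do not depend on the pass rate.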

@@ -0,0 +1,22 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 21688,
"totalTokens": 2243,
"avgTokPerSec": 121.7719614197307,
"promptChars": 11588,
"promptTokensEst": 2897,
"score": 100,
"stars": "★★★★★",
"error": null
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":23521,"totalTokens":2090,"avgTokPerSec":100.94324085271073,"promptChars":10962,"promptTokensEst":2741,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":1,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":33680,"totalTokens":3003,"avgTokPerSec":100.52754588753601,"promptChars":10171,"promptTokensEst":2543,"score":90,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui"}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
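// Score 0-100: 10 for a valid spec, 10 for a run without error (or one that at least ran tests),
// up to 60 for the test pass rate, and up to 20 for needing few fix rounds.
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0; // failed before any test ran
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // each fix round costs 10 points, never below 0
return Math.min(100, s);
}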
// Fill in score, stars, and the prompt token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>
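
The `Nopein` card and the `tok/s` summary column average `avgTokPerSec` over every run of a model, including failed runs that are recorded with 0 tok/s. One possible refinement, shown here as a sketch and not as the page's actual behavior, is to average only the runs that produced tokens. The helper `meanSpeed` and its `skipEmpty` flag are hypothetical; the data is the three `qwen3:8b` records from `RAW` in the page above, trimmed to the fields used.

```javascript
// The three qwen3:8b runs from RAW above; the blog run failed at the spec stage.
const runs = [
  { model: "qwen3:8b", scenario: "todo", avgTokPerSec: 100.94324085271073, totalTokens: 2090 },
  { model: "qwen3:8b", scenario: "users", avgTokPerSec: 100.52754588753601, totalTokens: 3003 },
  { model: "qwen3:8b", scenario: "blog", avgTokPerSec: 0, totalTokens: 0 },
];

// Hypothetical helper: mean tok/s, optionally skipping runs that generated no tokens.
function meanSpeed(records, skipEmpty) {
  const used = skipEmpty ? records.filter(r => r.totalTokens > 0) : records;
  if (used.length === 0) return 0; // avoid dividing by zero when every run failed
  return Math.round(used.reduce((s, r) => s + r.avgTokPerSec, 0) / used.length);
}

console.log("all runs:", meanSpeed(runs, false));
console.log("runs with output only:", meanSpeed(runs, true));
```

Whether a failed run should lower a model's reported speed is a design choice; filtering keeps the figure a measure of generation throughput, while the unfiltered mean also reflects reliability.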

@@ -0,0 +1,62 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23521,
"totalTokens": 2090,
"avgTokPerSec": 100.94324085271073,
"promptChars": 10962,
"promptTokensEst": 2741,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 33680,
"totalTokens": 3003,
"avgTokPerSec": 100.52754588753601,
"promptChars": 10171,
"promptTokensEst": 2543,
"score": 90,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui"
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"todo","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":8,"testsPassed":6,"testsFailed":2,"totalDurationMs":97470,"totalTokens":8786,"avgTokPerSec":97.96636139685832,"promptChars":11290,"promptTokensEst":2823,"score":65,"stars":"★★★☆☆","error":null},{"model":"qwen3:8b","scenario":"users","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":6,"testsPassed":6,"testsFailed":0,"totalDurationMs":18951,"totalTokens":1666,"avgTokPerSec":101.807593927545,"promptChars":10293,"promptTokensEst":2573,"score":100,"stars":"★★★★★","error":null},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":126005,"totalTokens":11056,"avgTokPerSec":96.6373549161171,"promptChars":11878,"promptTokensEst":2970,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe"}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
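// Score 0-100: 10 for a valid spec, 10 for a run without error (or one that at least ran tests),
// up to 60 for the test pass rate, and up to 20 for needing few fix rounds.
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0; // failed before any test ran
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // each fix round costs 10 points, never below 0
return Math.min(100, s);
}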
// Laske pisteet jos puuttuvat
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,62 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 97470,
"totalTokens": 8786,
"avgTokPerSec": 97.96636139685832,
"promptChars": 11290,
"promptTokensEst": 2823,
"score": 65,
"stars": "★★★☆☆",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 18951,
"totalTokens": 1666,
"avgTokPerSec": 101.807593927545,
"promptChars": 10293,
"promptTokensEst": 2573,
"score": 100,
"stars": "★★★★★",
"error": null
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 126005,
"totalTokens": 11056,
"avgTokPerSec": 96.6373549161171,
"promptChars": 11878,
"promptTokensEst": 2970,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe"
}
]
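
The `score` values in this file can be checked against the `calcScore` rule from the report page. The sketch below copies that function and only the fields it reads from the three runs above; the `runs` and `computed` names are introduced here for illustration and are not part of the repository.

```javascript
// Same scoring rule as calcScore in the report page, reproduced so it runs standalone.
function calcScore(r) {
  if (r.error && r.testsTotal === 0) return 0;      // failed before any test was counted
  let s = 0;
  if (r.specOk) s += 10;                            // spec was produced
  if (!r.error || r.testsTotal > 0) s += 10;        // no error, or at least one test counted
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10);   // 20, 10, or 0 depending on fix rounds
  return Math.min(100, s);
}

// Only the fields calcScore reads, copied from the three qwen3:8b runs above.
const runs = [
  { scenario: 'todo',  specOk: true, error: null,            testsTotal: 8, testsPassed: 6, fixRounds: 3 },
  { scenario: 'users', specOk: true, error: null,            testsTotal: 6, testsPassed: 6, fixRounds: 0 },
  { scenario: 'blog',  specOk: true, error: 'Syntaksivirhe', testsTotal: 1, testsPassed: 0, fixRounds: 3 },
];

const computed = runs.map(calcScore);
runs.forEach((r, i) => console.log(`${r.scenario}: ${computed[i]}`));
// todo:  10 + 10 + 45 + 0  = 65
// users: 10 + 10 + 60 + 20 = 100
// blog:  10 + 10 + 0  + 0  = 20
```

All three results match the stored `score` fields, so the fallback in the report page would produce the same numbers if `score` were absent.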

File diff suppressed because one or more lines are too long

@@ -0,0 +1,947 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 25444,
"totalTokens": 2661,
"avgTokPerSec": 122.06801173056196,
"promptChars": 11849,
"promptTokensEst": 2962,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 24447,
"totalTokens": 2537,
"avgTokPerSec": 121.11837170891442,
"promptChars": 11045,
"promptTokensEst": 2761,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 38071,
"totalTokens": 3965,
"avgTokPerSec": 120.37309655579647,
"promptChars": 12702,
"promptTokensEst": 3176,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38459,
"totalTokens": 2106,
"avgTokPerSec": 60.889088461567745,
"promptChars": 10951,
"promptTokensEst": 2738,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 35959,
"totalTokens": 1966,
"avgTokPerSec": 60.9684885562545,
"promptChars": 10698,
"promptTokensEst": 2675,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 2,
"testsFailed": 11,
"totalDurationMs": 269370,
"totalTokens": 14361,
"avgTokPerSec": 57.79069860126629,
"promptChars": 11838,
"promptTokensEst": 2960,
"score": 29,
"stars": "★★☆☆☆",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23199,
"totalTokens": 2054,
"avgTokPerSec": 101.09280595816365,
"promptChars": 10854,
"promptTokensEst": 2714,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 72665,
"totalTokens": 6586,
"avgTokPerSec": 99.40636298490288,
"promptChars": 10157,
"promptTokensEst": 2539,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 136309,
"totalTokens": 12036,
"avgTokPerSec": 97.02525169408467,
"promptChars": 10823,
"promptTokensEst": 2706,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 28177,
"totalTokens": 2946,
"avgTokPerSec": 121.23541038097,
"promptChars": 11836,
"promptTokensEst": 2959,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 22631,
"totalTokens": 2352,
"avgTokPerSec": 121.93930190168658,
"promptChars": 10440,
"promptTokensEst": 2610,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 40394,
"totalTokens": 4225,
"avgTokPerSec": 120.84107397324551,
"promptChars": 12362,
"promptTokensEst": 3091,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 46081,
"totalTokens": 2542,
"avgTokPerSec": 60.93046828700026,
"promptChars": 11412,
"promptTokensEst": 2853,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 41323,
"totalTokens": 2272,
"avgTokPerSec": 60.99406174164295,
"promptChars": 10884,
"promptTokensEst": 2721,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 14,
"testsPassed": 2,
"testsFailed": 12,
"totalDurationMs": 262591,
"totalTokens": 14129,
"avgTokPerSec": 57.91340837830759,
"promptChars": 12143,
"promptTokensEst": 3036,
"score": 29,
"stars": "★★☆☆☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 24007,
"totalTokens": 2137,
"avgTokPerSec": 101.05982103292858,
"promptChars": 10756,
"promptTokensEst": 2689,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 7,
"testsPassed": 6,
"testsFailed": 1,
"totalDurationMs": 68739,
"totalTokens": 6199,
"avgTokPerSec": 98.9825675198183,
"promptChars": 10313,
"promptTokensEst": 2578,
"score": 71,
"stars": "★★★★☆",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23472,
"totalTokens": 2427,
"avgTokPerSec": 120.85293828875076,
"promptChars": 11663,
"promptTokensEst": 2916,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 25864,
"totalTokens": 2671,
"avgTokPerSec": 120.6883137195962,
"promptChars": 11148,
"promptTokensEst": 2787,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 41074,
"totalTokens": 4275,
"avgTokPerSec": 120.33351485161673,
"promptChars": 12664,
"promptTokensEst": 3166,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 40457,
"totalTokens": 2229,
"avgTokPerSec": 61.093615619948345,
"promptChars": 10905,
"promptTokensEst": 2726,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 77506,
"totalTokens": 4268,
"avgTokPerSec": 60.19655522627278,
"promptChars": 11135,
"promptTokensEst": 2784,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 74791,
"totalTokens": 3590,
"avgTokPerSec": 60.549298891176214,
"promptChars": 11653,
"promptTokensEst": 2913,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 26402,
"totalTokens": 2358,
"avgTokPerSec": 100.76936895480246,
"promptChars": 11243,
"promptTokensEst": 2811,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20751,
"totalTokens": 1837,
"avgTokPerSec": 101.05480893032836,
"promptChars": 10553,
"promptTokensEst": 2638,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 22098,
"totalTokens": 2283,
"avgTokPerSec": 121.81254413612446,
"promptChars": 11503,
"promptTokensEst": 2876,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 8,
"testsPassed": 8,
"testsFailed": 0,
"totalDurationMs": 65403,
"totalTokens": 6779,
"avgTokPerSec": 118.13288294758586,
"promptChars": 10939,
"promptTokensEst": 2735,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 36044,
"totalTokens": 3748,
"avgTokPerSec": 120.14822967005487,
"promptChars": 12639,
"promptTokensEst": 3160,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38501,
"totalTokens": 2113,
"avgTokPerSec": 61.01814139430428,
"promptChars": 10929,
"promptTokensEst": 2732,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 1,
"testsFailed": 7,
"totalDurationMs": 147057,
"totalTokens": 7799,
"avgTokPerSec": 56.209406465865904,
"promptChars": 11207,
"promptTokensEst": 2802,
"score": 28,
"stars": "★★☆☆☆",
"error": null,
"round": 4
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 227508,
"totalTokens": 12026,
"avgTokPerSec": 58.52888492610325,
"promptChars": 11809,
"promptTokensEst": 2952,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 131964,
"totalTokens": 11403,
"avgTokPerSec": 97.10963264920952,
"promptChars": 11786,
"promptTokensEst": 2947,
"score": 80,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38820,
"totalTokens": 1826,
"avgTokPerSec": 101.07773707712924,
"promptChars": 10568,
"promptTokensEst": 2642,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 39797,
"totalTokens": 3776,
"avgTokPerSec": 120.91801837211113,
"promptChars": 11435,
"promptTokensEst": 2859,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 9,
"testsPassed": 8,
"testsFailed": 1,
"totalDurationMs": 87836,
"totalTokens": 9343,
"avgTokPerSec": 119.28783662683314,
"promptChars": 10718,
"promptTokensEst": 2680,
"score": 73,
"stars": "★★★★☆",
"error": null,
"round": 5
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 10,
"testsPassed": 10,
"testsFailed": 0,
"totalDurationMs": 36644,
"totalTokens": 3897,
"avgTokPerSec": 122.28607796191666,
"promptChars": 12598,
"promptTokensEst": 3150,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 127532,
"totalTokens": 3919,
"avgTokPerSec": 34.13133325491828,
"promptChars": 11352,
"promptTokensEst": 2838,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 217365,
"totalTokens": 7764,
"avgTokPerSec": 38.67613170588518,
"promptChars": 10834,
"promptTokensEst": 2709,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:14b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 14,
"testsPassed": 7,
"testsFailed": 7,
"totalDurationMs": 248311,
"totalTokens": 13443,
"avgTokPerSec": 58.05680015263308,
"promptChars": 12219,
"promptTokensEst": 3055,
"score": 50,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 38326,
"totalTokens": 2079,
"avgTokPerSec": 100.89778087504016,
"promptChars": 10908,
"promptTokensEst": 2727,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 60823,
"totalTokens": 1772,
"avgTokPerSec": 96.76383996716295,
"promptChars": 10378,
"promptTokensEst": 2595,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 81654,
"totalTokens": 3458,
"avgTokPerSec": 95.65675360193613,
"promptChars": 11914,
"promptTokensEst": 2979,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
}
]
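
This file holds five rounds per model and scenario, while the summary table in the report page picks one run per scenario with `mrs.find(...)`, which returns the first match. A minimal sketch of pooling all rounds instead is shown below. `groupRuns` is a hypothetical helper, not part of the repository, and the input is limited to the `blog` runs of two models, with only the fields needed copied from the records above.

```javascript
// Hypothetical helper, not part of the report page: pool every run of a
// (model, scenario) pair instead of taking only the first match.
function groupRuns(data) {
  const groups = new Map();
  for (const r of data) {
    const key = `${r.model}|${r.scenario}`;
    if (!groups.has(key)) {
      groups.set(key, { model: r.model, scenario: r.scenario, runs: 0, passed: 0, total: 0, scoreSum: 0 });
    }
    const g = groups.get(key);
    g.runs += 1;
    g.passed += r.testsPassed;
    g.total += r.testsTotal;
    g.scoreSum += r.score;
  }
  return [...groups.values()].map(g => ({
    model: g.model, scenario: g.scenario, runs: g.runs,
    passed: g.passed, total: g.total,
    avgScore: Math.round(g.scoreSum / g.runs),
  }));
}

// "blog" runs for two models, rounds 1 to 5, copied from the file above.
const blogRuns = [
  { model: 'qwen3-coder:30b', scenario: 'blog', round: 1, testsPassed: 11, testsTotal: 11, score: 100 },
  { model: 'qwen3-coder:30b', scenario: 'blog', round: 2, testsPassed: 12, testsTotal: 12, score: 100 },
  { model: 'qwen3-coder:30b', scenario: 'blog', round: 3, testsPassed: 12, testsTotal: 12, score: 100 },
  { model: 'qwen3-coder:30b', scenario: 'blog', round: 4, testsPassed: 10, testsTotal: 10, score: 100 },
  { model: 'qwen3-coder:30b', scenario: 'blog', round: 5, testsPassed: 10, testsTotal: 10, score: 100 },
  { model: 'qwen3:14b', scenario: 'blog', round: 1, testsPassed: 2,  testsTotal: 13, score: 29 },
  { model: 'qwen3:14b', scenario: 'blog', round: 2, testsPassed: 2,  testsTotal: 14, score: 29 },
  { model: 'qwen3:14b', scenario: 'blog', round: 3, testsPassed: 12, testsTotal: 12, score: 100 },
  { model: 'qwen3:14b', scenario: 'blog', round: 4, testsPassed: 12, testsTotal: 12, score: 80 },
  { model: 'qwen3:14b', scenario: 'blog', round: 5, testsPassed: 7,  testsTotal: 14, score: 50 },
];

const grouped = groupRuns(blogRuns);
for (const g of grouped) {
  console.log(`${g.model} ${g.scenario}: ${g.passed}/${g.total} over ${g.runs} runs, avg ${g.avgScore}p`);
}
```

Pooling this way makes run-to-run variance visible: for `qwen3:14b` on `blog`, the first round alone would show 2/13, whereas the five rounds together give 35/65.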

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
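// Score a run on a 0-100 scale: 10 for a valid spec, 10 when there is no error
// or at least one test was counted, up to 60 for the test pass ratio, and up to
// 20 for needing few fix rounds.
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0; // failed before any test was counted
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // 20 with no fix rounds, 10 with one, 0 with two or more
return Math.min(100, s);
}
// Fill in score, stars, and the prompt-token estimate when they are missing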
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1 @@
[]

File diff suppressed because one or more lines are too long


@@ -0,0 +1,317 @@
[
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 97527,
"totalTokens": 2228,
"avgTokPerSec": 100.69171830800946,
"promptChars": 11566,
"promptTokensEst": 2892,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 39549,
"totalTokens": 1960,
"avgTokPerSec": 100.98265593129491,
"promptChars": 11073,
"promptTokensEst": 2768,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 131339,
"totalTokens": 11518,
"avgTokPerSec": 96.52358107464266,
"promptChars": 12388,
"promptTokensEst": 3097,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20658,
"totalTokens": 1808,
"avgTokPerSec": 101.0081173861862,
"promptChars": 11057,
"promptTokensEst": 2764,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 5,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 320031,
"totalTokens": 11985,
"avgTokPerSec": 54.915025374575386,
"promptChars": 12517,
"promptTokensEst": 3129,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 7,
"testsPassed": 7,
"testsFailed": 0,
"totalDurationMs": 28654,
"totalTokens": 1877,
"avgTokPerSec": 100.70920643946336,
"promptChars": 10747,
"promptTokensEst": 2687,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 67943,
"totalTokens": 6002,
"avgTokPerSec": 98.29436788902672,
"promptChars": 12389,
"promptTokensEst": 3097,
"score": 90,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 20203,
"totalTokens": 1774,
"avgTokPerSec": 100.9066297884274,
"promptChars": 10905,
"promptTokensEst": 2726,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 12,
"testsFailed": 1,
"totalDurationMs": 148491,
"totalTokens": 12747,
"avgTokPerSec": 95.18237885727869,
"promptChars": 12476,
"promptTokensEst": 3119,
"score": 75,
"stars": "★★★★☆",
"error": null,
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "todo",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 6,
"testsPassed": 6,
"testsFailed": 0,
"totalDurationMs": 23830,
"totalTokens": 2102,
"avgTokPerSec": 100.641489789061,
"promptChars": 11404,
"promptTokensEst": 2851,
"score": 100,
"stars": "★★★★★",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "users",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 8,
"testsPassed": 6,
"testsFailed": 2,
"totalDurationMs": 122453,
"totalTokens": 7285,
"avgTokPerSec": 94.12482830400619,
"promptChars": 11400,
"promptTokensEst": 2850,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"round": 5
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 10,
"testsFailed": 1,
"totalDurationMs": 147125,
"totalTokens": 9893,
"avgTokPerSec": 97.37021605085566,
"promptChars": 12455,
"promptTokensEst": 3114,
"score": 75,
"stars": "★★★★☆",
"error": null,
"round": 5
}
]
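
The fifteen records above cover three scenarios over five rounds. A minimal sketch of how they could be rolled up per scenario, assuming only the fields shown in the file; `summarizeByScenario` and `sample` are hypothetical names for illustration and are not part of the benchmark code:

```javascript
// Hypothetical helper: average score and pooled test pass rate per scenario.
function summarizeByScenario(results) {
  const groups = new Map();
  for (const r of results) {
    const g = groups.get(r.scenario) ?? { runs: 0, scoreSum: 0, passed: 0, total: 0 };
    g.runs += 1;
    g.scoreSum += r.score;
    g.passed += r.testsPassed;
    g.total += r.testsTotal;
    groups.set(r.scenario, g);
  }
  return [...groups].map(([scenario, g]) => ({
    scenario,
    runs: g.runs,
    avgScore: Math.round(g.scoreSum / g.runs),
    // Guard against scenarios where no tests ran at all.
    passRate: g.total ? Math.round((g.passed / g.total) * 100) : 0,
  }));
}

// Trimmed copies of the records above (only the fields used here).
const sample = [
  { scenario: 'todo',  score: 100, testsPassed: 6,  testsTotal: 6 },
  { scenario: 'users', score: 100, testsPassed: 7,  testsTotal: 7 },
  { scenario: 'blog',  score: 0,   testsPassed: 0,  testsTotal: 0 },
  { scenario: 'todo',  score: 0,   testsPassed: 0,  testsTotal: 0 },
  { scenario: 'users', score: 100, testsPassed: 6,  testsTotal: 6 },
  { scenario: 'blog',  score: 0,   testsPassed: 0,  testsTotal: 0 },
  { scenario: 'todo',  score: 0,   testsPassed: 0,  testsTotal: 0 },
  { scenario: 'users', score: 100, testsPassed: 7,  testsTotal: 7 },
  { scenario: 'blog',  score: 0,   testsPassed: 0,  testsTotal: 0 },
  { scenario: 'todo',  score: 90,  testsPassed: 12, testsTotal: 12 },
  { scenario: 'users', score: 100, testsPassed: 6,  testsTotal: 6 },
  { scenario: 'blog',  score: 75,  testsPassed: 12, testsTotal: 13 },
  { scenario: 'todo',  score: 100, testsPassed: 6,  testsTotal: 6 },
  { scenario: 'users', score: 65,  testsPassed: 6,  testsTotal: 8 },
  { scenario: 'blog',  score: 75,  testsPassed: 10, testsTotal: 11 },
];

console.log(summarizeByScenario(sample));
```

Averaging over all rounds differs from the summary table in the report page, which picks the first matching record per scenario with `mrs.find`; the pooled view here is only one possible reading of the data.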


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":64124,"totalTokens":5689,"avgTokPerSec":98.61378134916481,"promptChars":12098,"promptTokensEst":3025,"score":90,"stars":"★★★★★","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":126014,"totalTokens":11162,"avgTokPerSec":97.09858655726343,"promptChars":12101,"promptTokensEst":3025,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":3}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1,69 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 64124,
"totalTokens": 5689,
"avgTokPerSec": 98.61378134916481,
"promptChars": 12098,
"promptTokensEst": 3025,
"score": 90,
"stars": "★★★★★",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 126014,
"totalTokens": 11162,
"avgTokPerSec": 97.09858655726343,
"promptChars": 12101,
"promptTokensEst": 3025,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
}
]
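
The third record above has an empty `stars` value and a zero `promptTokensEst`, which the report page fills in when it loads the data. A minimal sketch of that fallback, assuming the same rules as the page script (star thresholds from `starsFor`, and roughly four characters per token); `normalize` is a hypothetical name, and unlike the page it returns a copy instead of mutating the record:

```javascript
// Star thresholds copied from the report page.
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆'
  : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';

// Hypothetical helper: fill in missing display fields on a copy of the record.
function normalize(record) {
  const r = { ...record };
  if (!r.stars) r.stars = starsFor(r.score);
  if (!r.promptTokensEst) {
    // Rough estimate used by the page: about 4 characters per token.
    r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
  }
  return r;
}

// Failed run from the file above: no stars, no prompt.
console.log(normalize({ score: 0, stars: '', promptChars: 0, promptTokensEst: 0 }));
// Successful run with the estimate removed, to show it being recomputed.
console.log(normalize({ score: 90, stars: '★★★★★', promptChars: 12098 }));
```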


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":10,"testsFailed":2,"totalDurationMs":139308,"totalTokens":11782,"avgTokPerSec":96.85039238572556,"promptChars":11148,"promptTokensEst":2787,"score":70,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":132306,"totalTokens":11671,"avgTokPerSec":96.88921767777383,"promptChars":11267,"promptTokensEst":2817,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":126092,"totalTokens":11132,"avgTokPerSec":96.98598556369416,"promptChars":11292,"promptTokensEst":2823,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":3}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1,71 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 10,
"testsFailed": 2,
"totalDurationMs": 139308,
"totalTokens": 11782,
"avgTokPerSec": 96.85039238572556,
"promptChars": 11148,
"promptTokensEst": 2787,
"score": 70,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 132306,
"totalTokens": 11671,
"avgTokPerSec": 96.88921767777383,
"promptChars": 11267,
"promptTokensEst": 2817,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 126092,
"totalTokens": 11132,
"avgTokPerSec": 96.98598556369416,
"promptChars": 11292,
"promptTokensEst": 2823,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 3
}
]
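
The stored `score` values in the three records above appear consistent with the `calcScore` fallback in the report page: 10 points for a valid spec, 10 for reaching the test stage, up to 60 for the pass ratio, and up to 20 that shrink by 10 per fix round. A short sketch that recomputes them, with `calcScore` copied from the page and `records` a trimmed, illustrative copy of the data:

```javascript
// Scoring rule as defined in the report page script.
function calcScore(r) {
  if (r.error && r.testsTotal === 0) return 0;
  let s = 0;
  if (r.specOk) s += 10;                             // spec parsed
  if (!r.error || r.testsTotal > 0) s += 10;         // tests were reached
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10);    // penalty per fix round
  return Math.min(100, s);
}

// Trimmed copies of the three records above (only the fields the rule reads).
const records = [
  { round: 1, specOk: true, error: null,            fixRounds: 3, testsTotal: 12, testsPassed: 10, score: 70 },
  { round: 2, specOk: true, error: 'Syntaksivirhe', fixRounds: 3, testsTotal: 1,  testsPassed: 0,  score: 20 },
  { round: 3, specOk: true, error: null,            fixRounds: 3, testsTotal: 12, testsPassed: 11, score: 75 },
];

for (const r of records) {
  console.log(`round ${r.round}: stored ${r.score}, recomputed ${calcScore(r)}`);
}
```

With three fix rounds the last term is zero in every record, so the differences between rounds come entirely from the test pass ratio.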


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":11,"testsPassed":9,"testsFailed":2,"totalDurationMs":75178,"totalTokens":9916,"avgTokPerSec":142.94675043471062,"promptChars":10516,"promptTokensEst":2629,"score":69,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":1,"fixRounds":5,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":98787,"totalTokens":12904,"avgTokPerSec":141.16873850064812,"promptChars":11810,"promptTokensEst":2953,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":81763,"totalTokens":10277,"avgTokPerSec":134.82946940948588,"promptChars":11534,"promptTokensEst":2884,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":88517,"totalTokens":11280,"avgTokPerSec":136.63597159351744,"promptChars":10568,"promptTokensEst":2642,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":9,"testsFailed":3,"totalDurationMs":87817,"totalTokens":11171,"avgTokPerSec":136.1538785139482,"promptChars":11627,"promptTokensEst":2907,"score":65,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
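// Score out of 100: valid spec (10) + no error or tests ran (10) + test pass ratio (up to 60) + fix-round bonus (up to 20).
function calcScore(r) {
// A run that failed before any test executed scores zero.
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
// Each fix round costs 10 of the 20 bonus points, so two or more rounds leave none.
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Example: specOk, no error, 9/11 tests, 3 fix rounds -> 10 + 10 + 49 + 0 = 69.
// Compute scores if missing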
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,117 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 9,
"testsFailed": 2,
"totalDurationMs": 75178,
"totalTokens": 9916,
"avgTokPerSec": 142.94675043471062,
"promptChars": 10516,
"promptTokensEst": 2629,
"score": 69,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 1,
"fixRounds": 5,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 98787,
"totalTokens": 12904,
"avgTokPerSec": 141.16873850064812,
"promptChars": 11810,
"promptTokensEst": 2953,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 81763,
"totalTokens": 10277,
"avgTokPerSec": 134.82946940948588,
"promptChars": 11534,
"promptTokensEst": 2884,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 88517,
"totalTokens": 11280,
"avgTokPerSec": 136.63597159351744,
"promptChars": 10568,
"promptTokensEst": 2642,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 9,
"testsFailed": 3,
"totalDurationMs": 87817,
"totalTokens": 11171,
"avgTokPerSec": 136.1538785139482,
"promptChars": 11627,
"promptTokensEst": 2907,
"score": 65,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":79193,"totalTokens":10304,"avgTokPerSec":141.2083113764173,"promptChars":12199,"promptTokensEst":3050,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":10,"testsPassed":6,"testsFailed":4,"totalDurationMs":66764,"totalTokens":8896,"avgTokPerSec":142.57944640796882,"promptChars":12391,"promptTokensEst":3098,"score":56,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":76403,"totalTokens":9962,"avgTokPerSec":137.0023398819064,"promptChars":12432,"promptTokensEst":3108,"score":20,"stars":"★☆☆☆☆","error":"Syntaksivirhe","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":13,"testsPassed":7,"testsFailed":6,"totalDurationMs":81345,"totalTokens":10535,"avgTokPerSec":139.42076339875726,"promptChars":11419,"promptTokensEst":2855,"score":52,"stars":"★★★☆☆","error":null,"profile":"small","promptName":"code-small","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":72723,"totalTokens":9567,"avgTokPerSec":141.2709378394512,"promptChars":11416,"promptTokensEst":2854,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
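// Score out of 100: valid spec (10) + no error or tests ran (10) + test pass ratio (up to 60) + fix-round bonus (up to 20).
function calcScore(r) {
// A run that failed before any test executed scores zero.
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
// Each fix round costs 10 of the 20 bonus points, so two or more rounds leave none.
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Example: specOk, no error, 6/10 tests, 3 fix rounds -> 10 + 10 + 36 + 0 = 56.
// Compute scores if missing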
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,117 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 79193,
"totalTokens": 10304,
"avgTokPerSec": 141.2083113764173,
"promptChars": 12199,
"promptTokensEst": 3050,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 10,
"testsPassed": 6,
"testsFailed": 4,
"totalDurationMs": 66764,
"totalTokens": 8896,
"avgTokPerSec": 142.57944640796882,
"promptChars": 12391,
"promptTokensEst": 3098,
"score": 56,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 76403,
"totalTokens": 9962,
"avgTokPerSec": 137.0023398819064,
"promptChars": 12432,
"promptTokensEst": 3108,
"score": 20,
"stars": "★☆☆☆☆",
"error": "Syntaksivirhe",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 13,
"testsPassed": 7,
"testsFailed": 6,
"totalDurationMs": 81345,
"totalTokens": 10535,
"avgTokPerSec": 139.42076339875726,
"promptChars": 11419,
"promptTokensEst": 2855,
"score": 52,
"stars": "★★★☆☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 72723,
"totalTokens": 9567,
"avgTokPerSec": 141.2709378394512,
"promptChars": 11416,
"promptTokensEst": 2854,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":56798,"totalTokens":5105,"avgTokPerSec":99.4097006568848,"promptChars":11326,"promptTokensEst":2832,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":114297,"totalTokens":10163,"avgTokPerSec":97.19131591932717,"promptChars":12182,"promptTokensEst":3046,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":12,"testsPassed":11,"testsFailed":1,"totalDurationMs":112008,"totalTokens":9892,"avgTokPerSec":97.0586619009377,"promptChars":12406,"promptTokensEst":3102,"score":75,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
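// Score out of 100: valid spec (10) + no error or tests ran (10) + test pass ratio (up to 60) + fix-round bonus (up to 20).
function calcScore(r) {
// A run that failed before any test executed scores zero.
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
// Each fix round costs 10 of the 20 bonus points, so two or more rounds leave none.
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Example: specOk, no error, 11/12 tests, 3 fix rounds -> 10 + 10 + 55 + 0 = 75.
// Compute scores if missing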
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1,113 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 56798,
"totalTokens": 5105,
"avgTokPerSec": 99.4097006568848,
"promptChars": 11326,
"promptTokensEst": 2832,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 114297,
"totalTokens": 10163,
"avgTokPerSec": 97.19131591932717,
"promptChars": 12182,
"promptTokensEst": 3046,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 12,
"testsPassed": 11,
"testsFailed": 1,
"totalDurationMs": 112008,
"totalTokens": 9892,
"avgTokPerSec": 97.0586619009377,
"promptChars": 12406,
"promptTokensEst": 3102,
"score": 75,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]
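
The stored `score` values in the result file above appear consistent with the `calcScore` function embedded in the report page. As a sanity check, the following is a minimal standalone sketch that copies that function and applies it to three of the runs listed above. The `runs` array is illustrative and carries only the fields the formula reads.

```javascript
// Copy of the report page's scoring formula (0-100 points).
function calcScore(r) {
  if (r.error && r.testsTotal === 0) return 0;   // failed before any test ran
  let s = 0;
  if (r.specOk) s += 10;                         // spec was extracted
  if (!r.error || r.testsTotal > 0) s += 10;     // run reached the test stage
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // 20 points, minus 10 per fix round
  return Math.min(100, s);
}

// Illustrative subset of rounds 1, 2 and 5 from the result file above.
const runs = [
  { round: 1, specOk: false, fixRounds: 0, testsTotal: 0,  testsPassed: 0,  error: 'JSON-speksi epäonnistui' },
  { round: 2, specOk: true,  fixRounds: 3, testsTotal: 0,  testsPassed: 0,  error: 'Testit kaatuivat' },
  { round: 5, specOk: true,  fixRounds: 3, testsTotal: 12, testsPassed: 11, error: null },
];

for (const r of runs) {
  console.log(`round ${r.round}: ${calcScore(r)}`);
}
// round 1: 0
// round 2: 0
// round 5: 75
```

Round 5 breaks down as 10 (spec) + 10 (tests ran) + 55 (11 of 12 tests) + 0 (three fix rounds use up the 20-point bonus), which matches the stored value of 75.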


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":143640,"totalTokens":12611,"avgTokPerSec":96.28061629672216,"promptChars":12125,"promptTokensEst":3031,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":12,"testsPassed":12,"testsFailed":0,"totalDurationMs":116061,"totalTokens":10181,"avgTokPerSec":96.63321228455318,"promptChars":12435,"promptTokensEst":3109,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":11,"testsPassed":11,"testsFailed":0,"totalDurationMs":113792,"totalTokens":10022,"avgTokPerSec":96.96815077469971,"promptChars":12260,"promptTokensEst":3065,"score":80,"stars":"★★★★☆","error":null,"profile":"small","promptName":"code-small","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and prompt-token estimate when missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1,113 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 143640,
"totalTokens": 12611,
"avgTokPerSec": 96.28061629672216,
"promptChars": 12125,
"promptTokensEst": 3031,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 12,
"testsPassed": 12,
"testsFailed": 0,
"totalDurationMs": 116061,
"totalTokens": 10181,
"avgTokPerSec": 96.63321228455318,
"promptChars": 12435,
"promptTokensEst": 3109,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 11,
"testsPassed": 11,
"testsFailed": 0,
"totalDurationMs": 113792,
"totalTokens": 10022,
"avgTokPerSec": 96.96815077469971,
"promptChars": 12260,
"promptTokensEst": 3065,
"score": 80,
"stars": "★★★★☆",
"error": null,
"profile": "small",
"promptName": "code-small",
"round": 5
}
]
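
The report page averages `avgTokPerSec` over every run of a model, so runs that stopped at the spec stage and recorded 0 tok/s pull the displayed speed down. The sketch below contrasts that with a mean over completed runs only, using the five runs from the result file above. Treating `totalTokens > 0` as "completed" is an assumption of this sketch, and the `mean` helper is hypothetical rather than part of the report code.

```javascript
// Mean of a numeric field over a list of runs; returns 0 for an empty list.
const mean = (rows, key) =>
  rows.length ? rows.reduce((sum, r) => sum + r[key], 0) / rows.length : 0;

// Speed-related fields of the five runs in the result file above.
const runs = [
  { round: 1, totalTokens: 0,     avgTokPerSec: 0 },
  { round: 2, totalTokens: 12611, avgTokPerSec: 96.28061629672216 },
  { round: 3, totalTokens: 10181, avgTokPerSec: 96.63321228455318 },
  { round: 4, totalTokens: 0,     avgTokPerSec: 0 },
  { round: 5, totalTokens: 10022, avgTokPerSec: 96.96815077469971 },
];

// Assumption: a run counts as completed once it produced any output tokens.
const completed = runs.filter(r => r.totalTokens > 0);

const overAll = Math.round(mean(runs, 'avgTokPerSec'));
const overCompleted = Math.round(mean(completed, 'avgTokPerSec'));

console.log(`all runs: ${overAll} tok/s, completed runs: ${overCompleted} tok/s`);
// all runs: 58 tok/s, completed runs: 97 tok/s
```

Which figure is more useful depends on the question: the mean over all runs reflects how often the pipeline fails early, while the mean over completed runs is closer to the model's raw generation speed.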


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":1,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":10508,"promptTokensEst":2627,"score":0,"stars":"","error":"Puuttuvat: Cargo.toml, src/models.rs, src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs","round":1},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":2},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":3},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":4},{"model":"qwen3:8b","scenario":"blog","reqOk":true,"specOk":false,"specEntities":0,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":0,"promptTokensEst":0,"score":0,"stars":"","error":"JSON-speksi epäonnistui","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and prompt-token estimate when missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,107 @@
[
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 1,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 10508,
"promptTokensEst": 2627,
"score": 0,
"stars": "",
"error": "Puuttuvat: Cargo.toml, src/models.rs, src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 1
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 2
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 3
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 4
},
{
"model": "qwen3:8b",
"scenario": "blog",
"reqOk": true,
"specOk": false,
"specEntities": 0,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 0,
"promptTokensEst": 0,
"score": 0,
"stars": "",
"error": "JSON-speksi epäonnistui",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":3,"testsPassed":0,"testsFailed":3,"totalDurationMs":217110,"totalTokens":21602,"avgTokPerSec":114.70956637458333,"promptChars":12612,"promptTokensEst":3153,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":3,"testsPassed":0,"testsFailed":3,"totalDurationMs":204772,"totalTokens":20717,"avgTokPerSec":114.45999021594592,"promptChars":12743,"promptTokensEst":3186,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":4,"testsPassed":0,"testsFailed":4,"totalDurationMs":180501,"totalTokens":18467,"avgTokPerSec":115.23583963958032,"promptChars":12392,"promptTokensEst":3098,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":25,"testsPassed":0,"testsFailed":25,"totalDurationMs":282681,"totalTokens":27665,"avgTokPerSec":111.29688837623901,"promptChars":12675,"promptTokensEst":3169,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":5,"testsPassed":0,"testsFailed":5,"totalDurationMs":171686,"totalTokens":17525,"avgTokPerSec":114.88288274375243,"promptChars":12618,"promptTokensEst":3155,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
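// Score a run on a 0-100 scale: 10 for a valid spec, 10 for a run that
// produced tests or finished without an error, up to 60 for the share of
// passing tests, and 20 minus 10 per fix round (never below 0).
function calcScore(r) {
// A run that failed before any test ran scores 0.
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt token estimate (about 4 characters per token) when they are missing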
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,117 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 3,
"testsPassed": 0,
"testsFailed": 3,
"totalDurationMs": 217110,
"totalTokens": 21602,
"avgTokPerSec": 114.70956637458333,
"promptChars": 12612,
"promptTokensEst": 3153,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 3,
"testsPassed": 0,
"testsFailed": 3,
"totalDurationMs": 204772,
"totalTokens": 20717,
"avgTokPerSec": 114.45999021594592,
"promptChars": 12743,
"promptTokensEst": 3186,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 4,
"testsPassed": 0,
"testsFailed": 4,
"totalDurationMs": 180501,
"totalTokens": 18467,
"avgTokPerSec": 115.23583963958032,
"promptChars": 12392,
"promptTokensEst": 3098,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 25,
"testsPassed": 0,
"testsFailed": 25,
"totalDurationMs": 282681,
"totalTokens": 27665,
"avgTokPerSec": 111.29688837623901,
"promptChars": 12675,
"promptTokensEst": 3169,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 5,
"testsPassed": 0,
"testsFailed": 5,
"totalDurationMs": 171686,
"totalTokens": 17525,
"avgTokPerSec": 114.88288274375243,
"promptChars": 12618,
"promptTokensEst": 3155,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":18,"testsPassed":0,"testsFailed":18,"totalDurationMs":208078,"totalTokens":20783,"avgTokPerSec":114.94478559756693,"promptChars":13278,"promptTokensEst":3320,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13362,"promptTokensEst":3341,"score":0,"stars":"","error":"Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":9,"testsPassed":0,"testsFailed":9,"totalDurationMs":221174,"totalTokens":22354,"avgTokPerSec":114.09551344946065,"promptChars":13234,"promptTokensEst":3309,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13317,"promptTokensEst":3329,"score":0,"stars":"","error":"Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":8795,"totalTokens":954,"avgTokPerSec":124.86009274372915,"promptChars":13335,"promptTokensEst":3334,"score":0,"stars":"☆☆☆☆☆","error":"fetch failed","profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
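// Score a run on a 0-100 scale: 10 for a valid spec, 10 for a run that
// produced tests or finished without an error, up to 60 for the share of
// passing tests, and 20 minus 10 per fix round (never below 0).
function calcScore(r) {
// A run that failed before any test ran scores 0.
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and the prompt token estimate (about 4 characters per token) when they are missing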
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa — ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,113 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 18,
"testsPassed": 0,
"testsFailed": 18,
"totalDurationMs": 208078,
"totalTokens": 20783,
"avgTokPerSec": 114.94478559756693,
"promptChars": 13278,
"promptTokensEst": 3320,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13362,
"promptTokensEst": 3341,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 9,
"testsPassed": 0,
"testsFailed": 9,
"totalDurationMs": 221174,
"totalTokens": 22354,
"avgTokPerSec": 114.09551344946065,
"promptChars": 13234,
"promptTokensEst": 3309,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13317,
"promptTokensEst": 3329,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 8795,
"totalTokens": 954,
"avgTokPerSec": 124.86009274372915,
"promptChars": 13335,
"promptTokensEst": 3334,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "fetch failed",
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":133173,"totalTokens":13174,"avgTokPerSec":117.52479437665707,"promptChars":14102,"promptTokensEst":3526,"score":30,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":5,"testsPassed":0,"testsFailed":5,"totalDurationMs":267561,"totalTokens":27021,"avgTokPerSec":113.5812238661422,"promptChars":14052,"promptTokensEst":3513,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":0,"totalTokens":0,"avgTokPerSec":0,"promptChars":13914,"promptTokensEst":3479,"score":0,"stars":"","error":"Puuttuvat: src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":162271,"totalTokens":16343,"avgTokPerSec":115.53039090208604,"promptChars":14062,"promptTokensEst":3516,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":211367,"totalTokens":21183,"avgTokPerSec":113.22772767359652,"promptChars":14038,"promptTokensEst":3510,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,115 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 133173,
"totalTokens": 13174,
"avgTokPerSec": 117.52479437665707,
"promptChars": 14102,
"promptTokensEst": 3526,
"score": 30,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 5,
"testsPassed": 0,
"testsFailed": 5,
"totalDurationMs": 267561,
"totalTokens": 27021,
"avgTokPerSec": 113.5812238661422,
"promptChars": 14052,
"promptTokensEst": 3513,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 0,
"totalTokens": 0,
"avgTokPerSec": 0,
"promptChars": 13914,
"promptTokensEst": 3479,
"score": 0,
"stars": "",
"error": "Puuttuvat: src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 162271,
"totalTokens": 16343,
"avgTokPerSec": 115.53039090208604,
"promptChars": 14062,
"promptTokensEst": 3516,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 211367,
"totalTokens": 21183,
"avgTokPerSec": 113.22772767359652,
"promptChars": 14038,
"promptTokensEst": 3510,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]
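
The `score` values in the records above can be cross-checked against the scoring rule in the report page's `calcScore` function. The following is a minimal sketch, assuming the records were scored by exactly that rule; the `runs` array is a trimmed, hand-copied subset of rounds 1 to 3, and `recomputed` is a name introduced here for illustration only.

```javascript
// Scoring rule as it appears in the report page's calcScore().
function calcScore(r) {
  // A run that errored before any test executed scores zero.
  if (r.error && r.testsTotal === 0) return 0;
  let s = 0;
  if (r.specOk) s += 10;                              // spec was produced
  if (!r.error || r.testsTotal > 0) s += 10;          // code reached the test stage
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10);     // bonus shrinks per fix round
  return Math.min(100, s);
}

// Trimmed copies of rounds 1-3 from the JSON above (only the scored fields).
const runs = [
  { round: 1, specOk: true, error: null, testsTotal: 0, testsPassed: 0, fixRounds: 1, score: 30 },
  { round: 2, specOk: true, error: null, testsTotal: 5, testsPassed: 0, fixRounds: 3, score: 20 },
  { round: 3, specOk: true,
    error: "Puuttuvat: src/handlers.rs, src/lib.rs, src/main.rs, tests/api_test.rs",
    testsTotal: 0, testsPassed: 0, fixRounds: 0, score: 0 },
];

// Hypothetical helper name: recompute each score and compare with the stored one.
const recomputed = runs.map(calcScore);
runs.forEach((r, i) => {
  console.log(`round ${r.round}: stored ${r.score}, recomputed ${recomputed[i]}`);
});
```

Under this rule a run with no executed tests can earn at most 40 points (10 for the spec, 10 for reaching the test stage, 20 for needing no fix rounds), which matches the highest scores seen in these result files.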

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":38807,"totalTokens":5667,"avgTokPerSec":183.83891911423427,"promptChars":21818,"promptTokensEst":5455,"score":40,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":178290,"totalTokens":26265,"avgTokPerSec":168.77786498646262,"promptChars":21840,"promptTokensEst":5460,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":151603,"totalTokens":22725,"avgTokPerSec":170.74115131582644,"promptChars":21750,"promptTokensEst":5438,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":0,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":41059,"totalTokens":6288,"avgTokPerSec":183.76827829344424,"promptChars":21848,"promptTokensEst":5462,"score":40,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":3,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":187666,"totalTokens":27278,"avgTokPerSec":166.24197655672018,"promptChars":21694,"promptTokensEst":5424,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${r.avgTokPerSec.toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>

@@ -0,0 +1,117 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 38807,
"totalTokens": 5667,
"avgTokPerSec": 183.83891911423427,
"promptChars": 21818,
"promptTokensEst": 5455,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 178290,
"totalTokens": 26265,
"avgTokPerSec": 168.77786498646262,
"promptChars": 21840,
"promptTokensEst": 5460,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 151603,
"totalTokens": 22725,
"avgTokPerSec": 170.74115131582644,
"promptChars": 21750,
"promptTokensEst": 5438,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 0,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 41059,
"totalTokens": 6288,
"avgTokPerSec": 183.76827829344424,
"promptChars": 21848,
"promptTokensEst": 5462,
"score": 40,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 3,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 187666,
"totalTokens": 27278,
"avgTokPerSec": 166.24197655672018,
"promptChars": 21694,
"promptTokensEst": 5424,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]

@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":4,"testsTotal":1,"testsPassed":0,"testsFailed":1,"totalDurationMs":231122,"totalTokens":22952,"avgTokPerSec":113.75113825466987,"promptChars":17604,"promptTokensEst":4401,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":1},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":5,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":260314,"totalTokens":26144,"avgTokPerSec":113.40388181735229,"promptChars":17539,"promptTokensEst":4385,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":2},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":4,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":227228,"totalTokens":22381,"avgTokPerSec":113.5362722539456,"promptChars":17630,"promptTokensEst":4408,"score":0,"stars":"☆☆☆☆☆","error":"Testit kaatuivat","profile":"large","promptName":"code-rs","round":3},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":1,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":102052,"totalTokens":9984,"avgTokPerSec":117.77973450501808,"promptChars":17571,"promptTokensEst":4393,"score":30,"stars":"★★☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":4},{"model":"qwen3-coder:30b","scenario":"blog","reqOk":true,"specOk":true,"specEntities":2,"validationIssues":0,"fixRounds":2,"testsTotal":0,"testsPassed":0,"testsFailed":0,"totalDurationMs":146321,"totalTokens":14445,"avgTokPerSec":115.61186488022163,"promptChars":17589,"promptTokensEst":4397,"score":20,"stars":"★☆☆☆☆","error":null,"profile":"large","promptName":"code-rs","round":5}];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0;
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
return Math.min(100, s);
}
// Fill in score, stars, and prompt-token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${(r.avgTokPerSec || 0).toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>


@@ -0,0 +1,117 @@
[
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 4,
"testsTotal": 1,
"testsPassed": 0,
"testsFailed": 1,
"totalDurationMs": 231122,
"totalTokens": 22952,
"avgTokPerSec": 113.75113825466987,
"promptChars": 17604,
"promptTokensEst": 4401,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 1
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 5,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 260314,
"totalTokens": 26144,
"avgTokPerSec": 113.40388181735229,
"promptChars": 17539,
"promptTokensEst": 4385,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 2
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 4,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 227228,
"totalTokens": 22381,
"avgTokPerSec": 113.5362722539456,
"promptChars": 17630,
"promptTokensEst": 4408,
"score": 0,
"stars": "☆☆☆☆☆",
"error": "Testit kaatuivat",
"profile": "large",
"promptName": "code-rs",
"round": 3
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 1,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 102052,
"totalTokens": 9984,
"avgTokPerSec": 117.77973450501808,
"promptChars": 17571,
"promptTokensEst": 4393,
"score": 30,
"stars": "★★☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 4
},
{
"model": "qwen3-coder:30b",
"scenario": "blog",
"reqOk": true,
"specOk": true,
"specEntities": 2,
"validationIssues": 0,
"fixRounds": 2,
"testsTotal": 0,
"testsPassed": 0,
"testsFailed": 0,
"totalDurationMs": 146321,
"totalTokens": 14445,
"avgTokPerSec": 115.61186488022163,
"promptChars": 17589,
"promptTokensEst": 4397,
"score": 20,
"stars": "★☆☆☆☆",
"error": null,
"profile": "large",
"promptName": "code-rs",
"round": 5
}
]
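
The stored `score` and `stars` values in the records above can be cross-checked against the scoring rule from the report page's inline script. The sketch below copies `calcScore` and `starsFor` from that script and feeds them only the fields the rule reads; the `runs` and `recomputed` names are introduced here for illustration and are not part of the repository.

```javascript
// Scoring helpers copied from the report page's inline script.
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';

function calcScore(r) {
  if (r.error && r.testsTotal === 0) return 0;
  let s = 0;
  if (r.specOk) s += 10;
  if (!r.error || r.testsTotal > 0) s += 10;
  if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
  s += Math.max(0, 20 - (r.fixRounds || 0) * 10);
  return Math.min(100, s);
}

// Only the fields the rule reads, taken from the five records above.
// `runs` is a hypothetical name used for this sketch.
const runs = [
  { round: 1, specOk: true, error: null, testsTotal: 1, testsPassed: 0, fixRounds: 4 },
  { round: 2, specOk: true, error: 'Testit kaatuivat', testsTotal: 0, testsPassed: 0, fixRounds: 5 },
  { round: 3, specOk: true, error: 'Testit kaatuivat', testsTotal: 0, testsPassed: 0, fixRounds: 4 },
  { round: 4, specOk: true, error: null, testsTotal: 0, testsPassed: 0, fixRounds: 1 },
  { round: 5, specOk: true, error: null, testsTotal: 0, testsPassed: 0, fixRounds: 2 },
];

// Recompute score and stars for each round so they can be compared
// with the values stored in the JSON.
const recomputed = runs.map(r => {
  const score = calcScore(r);
  return { round: r.round, score, stars: starsFor(score) };
});

console.log(recomputed);
```

Under this rule a run that errors out before any test executes scores 0 regardless of the spec result, which is why rounds 2 and 3 carry no points even though `specOk` is true.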


@@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="fi">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Kipina Model Benchmark</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --yellow: #d29922; --red: #f85149; --blue: #58a6ff; }
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: -apple-system, 'Segoe UI', Helvetica, Arial, sans-serif; background: var(--bg); color: var(--text); padding: 2rem; max-width: 1400px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.5rem; }
.meta { color: var(--dim); font-size: 0.85rem; margin-bottom: 2rem; }
.cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 1rem; margin-bottom: 2rem; }
.card { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; }
.card .label { color: var(--dim); font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; }
.card .value { font-size: 1.8rem; font-weight: 600; margin-top: 0.25rem; }
.card .sub { color: var(--dim); font-size: 0.8rem; margin-top: 0.25rem; }
table { width: 100%; border-collapse: collapse; background: var(--card); border: 1px solid var(--border); border-radius: 8px; overflow: hidden; margin-bottom: 2rem; }
th { background: #1c2128; text-align: left; padding: 0.6rem 0.8rem; font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.05em; color: var(--dim); cursor: pointer; user-select: none; white-space: nowrap; }
th:hover { color: var(--text); }
th.sorted-asc::after { content: ' ▲'; }
th.sorted-desc::after { content: ' ▼'; }
td { padding: 0.5rem 0.8rem; border-top: 1px solid var(--border); font-size: 0.85rem; white-space: nowrap; }
tr:hover td { background: #1c2128; }
.pass { color: var(--green); }
.partial { color: var(--yellow); }
.fail { color: var(--red); }
.stars { letter-spacing: 1px; }
.bar { display: inline-block; height: 8px; border-radius: 4px; vertical-align: middle; }
.bar-bg { background: var(--border); }
.bar-fill { background: var(--green); }
.bar-partial { background: var(--yellow); }
.model-name { font-weight: 600; }
h2 { font-size: 1.1rem; margin-bottom: 1rem; color: var(--dim); }
.summary-table th:first-child, .summary-table td:first-child { min-width: 200px; }
</style>
</head>
<body>
<h1>Kipina Model Benchmark</h1>
<div class="meta" id="meta"></div>
<div class="cards" id="cards"></div>
<h2>Mallikohtainen yhteenveto</h2>
<table class="summary-table" id="summary-table"><thead></thead><tbody></tbody></table>
<h2>Kaikki tulokset</h2>
<table id="results-table"><thead></thead><tbody></tbody></table>
<script>
const RAW = [];
const starsFor = s => s >= 90 ? '★★★★★' : s >= 70 ? '★★★★☆' : s >= 50 ? '★★★☆☆' : s >= 25 ? '★★☆☆☆' : s > 0 ? '★☆☆☆☆' : '☆☆☆☆☆';
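// Score 0-100: 10 for a valid spec, 10 when the run finished without error
// or at least reached the tests, up to 60 for the test pass ratio, and up
// to 20 for needing few fix rounds.
function calcScore(r) {
if (r.error && r.testsTotal === 0) return 0; // failed before any test ran
let s = 0;
if (r.specOk) s += 10;
if (!r.error || r.testsTotal > 0) s += 10;
if (r.testsTotal > 0) s += Math.round((r.testsPassed / r.testsTotal) * 60);
s += Math.max(0, 20 - (r.fixRounds || 0) * 10); // each fix round costs 10 of the 20 bonus points
return Math.min(100, s);
}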
// Fill in score, stars, and the prompt token estimate when they are missing
const DATA = RAW.map(r => {
if (r.score == null) r.score = calcScore(r);
if (!r.stars) r.stars = starsFor(r.score);
if (!r.promptTokensEst) r.promptTokensEst = r.promptChars ? Math.round(r.promptChars / 4) : 0;
return r;
});
const cls = r => (!r.error && r.testsPassed === r.testsTotal && r.testsTotal > 0) ? 'pass' : (r.testsTotal > 0 && r.testsPassed > 0) ? 'partial' : 'fail';
const pctBar = (passed, total, w=80) => {
if (total === 0) return '-';
const pct = passed/total*100;
const c = pct === 100 ? 'bar-fill' : 'bar-partial';
return `<span class="bar bar-bg" style="width:${w}px"><span class="bar ${c}" style="width:${Math.round(pct/100*w)}px"></span></span> ${passed}/${total}`;
};
// Meta
const totalTime = DATA.reduce((s,r) => s + r.totalDurationMs, 0);
document.getElementById('meta').textContent = `${new Date().toLocaleDateString('fi-FI')} · ${DATA.length} ajoa · ${(totalTime/1000/60).toFixed(1)} min`;
// Cards
const models = [...new Set(DATA.map(r => r.model))];
const scenarios = [...new Set(DATA.map(r => r.scenario))];
const avgScore = DATA.length ? Math.round(DATA.reduce((s,r) => s + r.score, 0) / DATA.length) : 0;
const totalPassed = DATA.reduce((s,r) => s + r.testsPassed, 0);
const totalTests = DATA.reduce((s,r) => s + r.testsTotal, 0);
const passRate = totalTests ? Math.round(totalPassed/totalTests*100) : 0;
const bestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, avg: Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length) };
}).sort((a,b) => b.avg - a.avg)[0];
const fastestModel = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
return { model: m, speed: Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length) };
}).sort((a,b) => b.speed - a.speed)[0];
document.getElementById('cards').innerHTML = `
<div class="card"><div class="label">Keskiarvo</div><div class="value">${starsFor(avgScore)}</div><div class="sub">${avgScore} pistettä</div></div>
<div class="card"><div class="label">Testien läpäisy</div><div class="value">${passRate}%</div><div class="sub">${totalPassed}/${totalTests} testiä</div></div>
<div class="card"><div class="label">Paras malli</div><div class="value" style="font-size:1.2rem">${bestModel?.model || '-'}</div><div class="sub">${bestModel?.avg || 0}p</div></div>
<div class="card"><div class="label">Nopein</div><div class="value" style="font-size:1.2rem">${fastestModel?.model || '-'}</div><div class="sub">${fastestModel?.speed || 0} tok/s</div></div>
<div class="card"><div class="label">Malleja</div><div class="value">${models.length}</div><div class="sub">${scenarios.length} skenaariota</div></div>
<div class="card"><div class="label">Kokonaisaika</div><div class="value">${(totalTime/1000/60).toFixed(1)}</div><div class="sub">minuuttia</div></div>
`;
// Summary table
const sumHead = document.querySelector('#summary-table thead');
const sumBody = document.querySelector('#summary-table tbody');
sumHead.innerHTML = '<tr><th>Malli</th>' + scenarios.map(s => `<th>${s}</th>`).join('') + '<th>Yht.</th><th>Out tok</th><th>Aika</th><th>tok/s</th><th>Pisteet</th></tr>';
const modelRows = models.map(m => {
const mrs = DATA.filter(r => r.model === m);
const tp = mrs.reduce((s,r) => s + r.testsPassed, 0);
const tt = mrs.reduce((s,r) => s + r.testsTotal, 0);
const tok = mrs.reduce((s,r) => s + r.totalTokens, 0);
const time = mrs.reduce((s,r) => s + r.totalDurationMs, 0);
const speed = Math.round(mrs.reduce((s,r) => s + r.avgTokPerSec, 0) / mrs.length);
const avg = Math.round(mrs.reduce((s,r) => s + r.score, 0) / mrs.length);
const scenCols = scenarios.map(s => {
const r = mrs.find(r => r.scenario === s);
if (!r) return '<td>-</td>';
return `<td class="${cls(r)}">${pctBar(r.testsPassed, r.testsTotal, 60)} <span style="color:var(--dim)">${(r.totalDurationMs/1000).toFixed(0)}s</span></td>`;
}).join('');
return { avg, html: `<tr><td class="model-name">${m}</td>${scenCols}<td>${pctBar(tp, tt)}</td><td>${(tok/1000).toFixed(1)}K</td><td>${(time/1000).toFixed(0)}s</td><td>${speed}</td><td><span class="stars">${starsFor(avg)}</span> ${avg}p</td></tr>` };
}).sort((a,b) => b.avg - a.avg);
sumBody.innerHTML = modelRows.map(r => r.html).join('');
// Results table
const resHead = document.querySelector('#results-table thead');
const resBody = document.querySelector('#results-table tbody');
const resCols = ['Malli','Skenaario','Speksi','Testit','Korjaus','Ctx','Out tok','Aika','tok/s','Pisteet'];
resHead.innerHTML = '<tr>' + resCols.map((c,i) => `<th data-col="${i}">${c}</th>`).join('') + '</tr>';
let sortCol = 9, sortAsc = false;
function renderResults() {
const sorted = [...DATA].sort((a,b) => {
const vals = [
[a.model, b.model],
[a.scenario, b.scenario],
[a.specEntities, b.specEntities],
[a.testsPassed/Math.max(a.testsTotal,1), b.testsPassed/Math.max(b.testsTotal,1)],
[a.fixRounds, b.fixRounds],
[a.promptTokensEst, b.promptTokensEst],
[a.totalTokens, b.totalTokens],
[a.totalDurationMs, b.totalDurationMs],
[a.avgTokPerSec, b.avgTokPerSec],
[a.score, b.score],
][sortCol];
const cmp = typeof vals[0] === 'string' ? vals[0].localeCompare(vals[1]) : vals[0] - vals[1];
return sortAsc ? cmp : -cmp;
});
resBody.innerHTML = sorted.map(r => {
const c = cls(r);
return `<tr>
<td class="model-name">${r.model}</td>
<td>${r.scenario}</td>
<td>${r.specOk ? `${r.specEntities}e` : '<span class="fail">✗</span>'}</td>
<td class="${c}">${pctBar(r.testsPassed, r.testsTotal)}</td>
<td>${r.fixRounds > 0 ? r.fixRounds + '×' : '-'}</td>
<td>${r.promptTokensEst > 0 ? '~'+(r.promptTokensEst/1000).toFixed(1)+'K' : '-'}</td>
<td>${r.totalTokens > 0 ? (r.totalTokens/1000).toFixed(1)+'K' : '-'}</td>
<td>${(r.totalDurationMs/1000).toFixed(0)}s</td>
<td>${(r.avgTokPerSec || 0).toFixed(0)}</td>
<td><span class="stars">${r.stars}</span> ${r.score}p</td>
</tr>`;
}).join('');
document.querySelectorAll('#results-table th').forEach((th,i) => {
th.className = i === sortCol ? (sortAsc ? 'sorted-asc' : 'sorted-desc') : '';
});
}
document.querySelector('#results-table thead').addEventListener('click', e => {
const col = parseInt(e.target.dataset.col, 10);
if (isNaN(col)) return;
if (sortCol === col) sortAsc = !sortAsc;
else { sortCol = col; sortAsc = false; }
renderResults();
});
renderResults();
</script>
</body>
</html>
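
The column sort used by `renderResults` in the page above can be exercised outside the browser. The sketch below restates the same comparison (string columns via `localeCompare`, numeric columns via subtraction, negated for descending order) as a standalone function. The `accessors` array, the `compareBy` helper, and the sample rows are assumptions made for this illustration; none of them exist in the report script, and the rows are not benchmark results.

```javascript
// Accessors in the same order as the results table columns
// (Malli, Skenaario, Speksi, Testit, Korjaus, Ctx, Out tok, Aika, tok/s, Pisteet).
const accessors = [
  r => r.model,
  r => r.scenario,
  r => r.specEntities,
  r => r.testsPassed / Math.max(r.testsTotal, 1),
  r => r.fixRounds,
  r => r.promptTokensEst,
  r => r.totalTokens,
  r => r.totalDurationMs,
  r => r.avgTokPerSec,
  r => r.score,
];

// compareBy is a hypothetical helper name, not part of the report script.
// It returns a comparator for Array.prototype.sort.
function compareBy(col, asc) {
  const get = accessors[col];
  return (a, b) => {
    const x = get(a);
    const y = get(b);
    // Strings compare by locale order, numbers by difference.
    const cmp = typeof x === 'string' ? x.localeCompare(y) : x - y;
    return asc ? cmp : -cmp;
  };
}

// Made-up rows for illustration only.
const sampleRows = [
  { model: 'model-b', score: 30, fixRounds: 1 },
  { model: 'model-a', score: 0, fixRounds: 5 },
  { model: 'model-c', score: 20, fixRounds: 4 },
];

// Column 9 is the score column, column 0 the model name.
const byScoreDesc = [...sampleRows].sort(compareBy(9, false)).map(r => r.model);
const byModelAsc = [...sampleRows].sort(compareBy(0, true)).map(r => r.model);

console.log(byScoreDesc, byModelAsc);
```

Keeping the accessors in a single array indexed by column number mirrors how the page maps `data-col` attributes to sort keys, so adding a column means adding one header and one accessor at the same position.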


@@ -0,0 +1 @@
[]

Some files were not shown because too many files have changed in this diff.