Smart routing: capability=heavy prioritizes the node with the largest model
Hub:
- Parses the parameter count (in billions, B) of the largest model on
  each node from node_models (e.g. qwen3:32b → 32, qwen2.5-coder:7b → 7)
- Stores the result in node_max_param_b: HashMap<u64, u32>
- ChatCompletionRequest: new capability field ("heavy"/"light")
- Routing logic: capability=heavy → selects the node that has the
  largest model; default → native first, as before
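The hub-side steps above can be sketched roughly as follows. This is an illustration in JavaScript, not the hub's actual code, which is not part of this diff. The names parseParamB, maxParamB and pickNode, the node object shape, and the `native` flag are all assumptions made for the example, as is the choice to return 0 when a tag carries no size.

```javascript
// Sketch only: the names below (parseParamB, maxParamB, pickNode, the node
// shape and its `native` flag) are assumptions, not the hub's real API.

// Extract the parameter count in billions from a model tag like "qwen3:32b".
// Returns 0 when the tag carries no recognizable size.
function parseParamB(modelTag) {
  const match = /(\d+(?:\.\d+)?)b(?![a-z0-9])/i.exec(modelTag);
  return match ? Number(match[1]) : 0;
}

// Largest parameter count among a node's models (0 for an empty list).
function maxParamB(models) {
  return models.reduce((max, tag) => Math.max(max, parseParamB(tag)), 0);
}

// Choose a node: "heavy" picks the node with the largest model,
// anything else prefers a native node and falls back to the first one.
function pickNode(nodes, capability) {
  if (nodes.length === 0) return null;
  if (capability === 'heavy') {
    return nodes.reduce((best, node) =>
      maxParamB(node.models) > maxParamB(best.models) ? node : best);
  }
  return nodes.find((node) => node.native) ?? nodes[0];
}

const nodes = [
  { id: 1, native: true, models: ['qwen2.5-coder:7b'] },
  { id: 2, native: false, models: ['qwen3:32b', 'qwen2.5-coder:7b'] },
];
console.log(pickNode(nodes, 'heavy').id);   // 2
console.log(pickNode(nodes, undefined).id); // 1
```

Caching the per-node maximum, as node_max_param_b does, would avoid re-parsing the model tags on every request. The sketch recomputes it only to stay short.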
Frontend (pipeline):
- JSON spec generation: capability=heavy
- Code fixes in the QA fix loop: capability=heavy
- Observer/README review: capability=heavy
- Requirements (Client): default (light; a small model is sufficient)
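The per-step assignment above could be expressed as a small lookup, sketched here. STEP_CAPABILITY, its keys, and buildRequestBody are hypothetical names, and the model/messages fields are an assumed request shape. Only the capability values and the `|| undefined` pattern come from this commit.

```javascript
// Sketch only: STEP_CAPABILITY, its keys, and buildRequestBody are
// hypothetical names; the model/messages fields are an assumed request shape.
const STEP_CAPABILITY = {
  spec: 'heavy',           // JSON spec generation
  qaFix: 'heavy',          // code fixes in the QA fix loop
  observer: 'heavy',       // observer/README review
  requirements: undefined, // default routing, a small model is enough
};

function buildRequestBody(model, messages, step) {
  return {
    model,
    messages,
    // Falsy values collapse to undefined, which JSON.stringify omits.
    capability: STEP_CAPABILITY[step] || undefined,
  };
}

const heavy = JSON.stringify(buildRequestBody('some-model', [], 'spec'));
const light = JSON.stringify(buildRequestBody('some-model', [], 'requirements'));
console.log(heavy); // {"model":"some-model","messages":[],"capability":"heavy"}
console.log(light); // {"model":"some-model","messages":[]}
```

Because JSON.stringify drops properties whose value is undefined, a request without a capability is serialized exactly as it was before the field existed.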
This lets the Qwen3:32B model running on the A40 machine take the
heavy tasks while the browser's 0.5B model handles the light ones.
@@ -846,6 +846,7 @@ OUTPUT FORMAT:
 top_k: opts.topK ?? settings.topK ?? undefined,
 max_tokens: opts.maxTokens ?? settings.maxTokens ?? undefined,
 repeat_penalty: opts.repeatPenalty ?? settings.repeatPenalty ?? undefined,
+capability: opts.capability || undefined, // "heavy" → largest model
 };

 const res = await fetch('/api/v1/chat/completions', {
@@ -1422,7 +1423,7 @@ Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_
 highlightAgent('manager');
 explainStep('Arkkitehtuuri', `${mgr.name} analysoi vaatimukset ja tuottaa JSON-speksin: entiteetit, kentät, tyypit.`);

-const specRaw = await kpnRun(mgr.model, `${brief}\n\nOutput a JSON spec for this project.`, false, { ...mgr, prompt: SPEC_SYSTEM });
+const specRaw = await kpnRun(mgr.model, `${brief}\n\nOutput a JSON spec for this project.`, false, { ...mgr, prompt: SPEC_SYSTEM, capability: 'heavy' });
 const spec = specRaw ? extractJson(specRaw) : null;
 promptLog.push({ step: 1, agentKey: 'manager', agentName: mgr.name, model: mgr.model, label: 'JSON-speksi', systemPrompt: SPEC_SYSTEM, userPrompt: brief, response: specRaw || '' });

@@ -1493,7 +1494,7 @@ Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_
 explainStep(`Korjaus: ${fname}`, `${fixAgent.name} korjaa validoinnin löytämät ongelmat.`);

 const fixPrompt = `Fix the following issues in this Python file. Return ONLY the complete corrected file, no explanations.\n\nISSUES:\n${fIssues.join('\n')}\n\nCURRENT FILE (${fname}):\n\`\`\`python\n${files[fname]}\`\`\``;
-const fixResult = await kpnRun(fixAgent.model, fixPrompt, false, { ...fixAgent, prompt: 'You are a Python code fixer. Return ONLY the corrected Python file. No markdown fences, no explanations — just valid Python code.' });
+const fixResult = await kpnRun(fixAgent.model, fixPrompt, false, { ...fixAgent, prompt: 'You are a Python code fixer. Return ONLY the corrected Python file. No markdown fences, no explanations — just valid Python code.', capability: 'heavy' });

 if (fixResult) {
 // Strip markdown code fences if the LLM returns them
@@ -1557,7 +1558,7 @@ Blog → Author: name,email,bio(Text|None) / Post: title, content(Text), author_
 `## Architecture\nDescribe the project structure and design decisions.\n\n` +
 `## Risk Assessment\n| Severity | Issue |\n|----------|-------|\n| ... | ... |\n\n` +
 `Project code:\n${finalCode}`;
-const readme = await kpnRun(obs.model, obsPrompt, false, obs);
+const readme = await kpnRun(obs.model, obsPrompt, false, { ...obs, capability: 'heavy' });
 if (readme) {
 files['README.md'] = readme;
 // Store the report globally so that clicking the observer opens it