Manual Evaluation Report

Generated from the local RAG responses. Use Run Evaluation to regenerate this page from the model's current behavior. Enter scores (1–5) manually.

Generated at: 2026-04-27 16:13:40

RAG Parameters

Model: gpt-4.1
Temperature: 0.2
Context: single page text (per question)
Prompt: baseline + LARF + StudyBuddy system constraints

Auto Average

4.04
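
The report does not say how Auto Average is computed. As a minimal sketch, assuming it is the arithmetic mean of per-question automatic scores on the same 1–5 scale, rounded to two decimals, the calculation might look like the following. The function name `auto_average` and the input values in the usage note are hypothetical and are not taken from this report.

```python
from statistics import mean


def auto_average(scores):
    """Return the mean of per-question scores, rounded to two decimals.

    Assumption: every score lies on the 1-5 scale used in this report.
    """
    if not scores:
        raise ValueError("no scores to average")
    for score in scores:
        if not 1 <= score <= 5:
            raise ValueError(f"score outside the 1-5 scale: {score}")
    return round(mean(scores), 2)
```

For example, calling `auto_average([4, 5, 3])` on three illustrative scores averages them on the same scale; rejecting out-of-range values keeps a mistyped score from silently skewing the result.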

Run History

Timestamp	Mode	Model	Temp	Prompt	Auto Avg	Duration (s)	Status