Refactored LLM Evaluation to Tutoring System
- Day: 2025-05-13
- Time: 00:05 to 00:20
- Project: Teaching
- Workspace: WP 1: Strategic / Growth & Development
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: LLM, Jinja2, Python, Tutoring, Pedagogy
Description
Session Goal
The primary goal of this session was to refactor the existing LLM evaluation system to a more modular and tutoring-focused approach, enhancing the educational experience for computer science students.
Key Activities
- Streamlined LLM Evaluation Design: Implemented a modular design using Jinja2 templates and a Python evaluator class to improve maintainability and robustness.
- Modularization of evaluator.py: Proposed separating the instantiation of the system from student input, allowing for character customization and instruction reuse without hardcoding.
- Refined Evaluation Prompt Structure: Enhanced the evaluation prompt’s readability and alignment with ChatCompletion formats, ensuring clear separation between system instructions and user inputs.
- Transformation to Tutoring Focus: Adjusted the evaluation prompt to a tutoring focus, promoting active understanding and critical thinking in computer science students.
- Pedagogical Shift: Proposed a pedagogical shift from evaluator to tutor, emphasizing guidance and student support over direct correction.
Achievements
- Successfully designed a modular and reusable evaluation system.
- Developed a refined prompt structure that supports both evaluation and tutoring.
Pending Tasks
- Further testing and integration of the new tutoring-focused prompts with existing LLM tools to ensure compatibility and effectiveness.
Evidence
- source_file=2025-05-13.sessions.jsonl, line_number=1, event_count=0, session_id=2c183da0a9a50d7142e5561c53184540650217944658ad26efdf3e0e968b2d16
- event_ids: []