π 2025-05-13 β Session: Refactored LLM Evaluation to Tutoring System
π 00:05β00:20
π·οΈ Labels: LLM, Jinja2, Python, Tutoring, Pedagogy
π Project: Teaching
β Priority: MEDIUM
Session Goal
The primary goal of this session was to refactor the existing LLM evaluation system to a more modular and tutoring-focused approach, enhancing the educational experience for computer science students.
Key Activities
- Streamlined LLM Evaluation Design: Implemented a modular design using Jinja2 templates and a Python evaluator class to improve maintainability and robustness.
- Modularization of evaluator.py: Proposed separating the instantiation of the system from student input, allowing for character customization and instruction reuse without hardcoding.
- Refined Evaluation Prompt Structure: Enhanced the evaluation promptβs readability and alignment with ChatCompletion formats, ensuring clear separation between system instructions and user inputs.
- Transformation to Tutoring Focus: Adjusted the evaluation prompt to a tutoring focus, promoting active understanding and critical thinking in computer science students.
- Pedagogical Shift: Proposed a pedagogical shift from evaluator to tutor, emphasizing guidance and student support over direct correction.
Achievements
- Successfully designed a modular and reusable evaluation system.
- Developed a refined prompt structure that supports both evaluation and tutoring.
Pending Tasks
- Further testing and integration of the new tutoring-focused prompts with existing LLM tools to ensure compatibility and effectiveness.
