πŸ“… 2025-05-13 β€” Session: Refactored LLM Evaluation to Tutoring System

πŸ•’ 00:05–00:20
🏷️ Labels: LLM, Jinja2, Python, Tutoring, Pedagogy
πŸ“‚ Project: Teaching
⭐ Priority: MEDIUM

Session Goal

The primary goal of this session was to refactor the existing LLM evaluation system to a more modular and tutoring-focused approach, enhancing the educational experience for computer science students.

Key Activities

  • Streamlined LLM Evaluation Design: Implemented a modular design using Jinja2 templates and a Python evaluator class to improve maintainability and robustness.
  • Modularization of evaluator.py: Proposed separating the instantiation of the system from student input, allowing for character customization and instruction reuse without hardcoding.
  • Refined Evaluation Prompt Structure: Enhanced the evaluation prompt’s readability and alignment with ChatCompletion formats, ensuring clear separation between system instructions and user inputs.
  • Transformation to Tutoring Focus: Adjusted the evaluation prompt to a tutoring focus, promoting active understanding and critical thinking in computer science students.
  • Pedagogical Shift: Proposed a pedagogical shift from evaluator to tutor, emphasizing guidance and student support over direct correction.

Achievements

  • Successfully designed a modular and reusable evaluation system.
  • Developed a refined prompt structure that supports both evaluation and tutoring.

Pending Tasks

  • Further testing and integration of the new tutoring-focused prompts with existing LLM tools to ensure compatibility and effectiveness.