Refactored LLM Evaluation to Tutoring System

  • Day: 2025-05-13
  • Time: 00:05 to 00:20
  • Project: Teaching
  • Workspace: WP 1: Strategic / Growth & Development
  • Status: In Progress
  • Priority: MEDIUM
  • Assignee: Matías Nehuen Iglesias
  • Tags: LLM, Jinja2, Python, Tutoring, Pedagogy

Description

Session Goal

The primary goal of this session was to refactor the existing LLM evaluation system to a more modular and tutoring-focused approach, enhancing the educational experience for computer science students.

Key Activities

  • Streamlined LLM Evaluation Design: Implemented a modular design using Jinja2 templates and a Python evaluator class to improve maintainability and robustness.
  • Modularization of evaluator.py: Proposed separating the instantiation of the system from student input, allowing for character customization and instruction reuse without hardcoding.
  • Refined Evaluation Prompt Structure: Enhanced the evaluation prompt’s readability and alignment with ChatCompletion formats, ensuring clear separation between system instructions and user inputs.
  • Transformation to Tutoring Focus: Adjusted the evaluation prompt to a tutoring focus, promoting active understanding and critical thinking in computer science students.
  • Pedagogical Shift: Proposed a pedagogical shift from evaluator to tutor, emphasizing guidance and student support over direct correction.

Achievements

  • Successfully designed a modular and reusable evaluation system.
  • Developed a refined prompt structure that supports both evaluation and tutoring.

Pending Tasks

  • Further testing and integration of the new tutoring-focused prompts with existing LLM tools to ensure compatibility and effectiveness.

Evidence

  • source_file=2025-05-13.sessions.jsonl, line_number=1, event_count=0, session_id=2c183da0a9a50d7142e5561c53184540650217944658ad26efdf3e0e968b2d16
  • event_ids: []