📅 2024-10-04 – Session: AI Evaluation Workflow Implementation

🕒 22:45–23:55
🏷️ Labels: AI, Evaluation, OpenAI, Workflow, Automation, Schema
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal

The goal of this session was to design and implement an AI workflow for evaluating notebooks against predefined rubrics, which involved integrating the OpenAI API and ensuring proper error handling and schema validation.

Key Activities

  • Designed a framework for an AI evaluation workflow built around traffic-light evaluations and structured data storage.
  • Updated the AIEvaluator class to integrate with the OpenAI API, improving error handling and JSON result storage (see the second sketch after this list).
  • Developed a JSON schema for rubric evaluations that categorizes each result as 'green', 'yellow', or 'red' (the first sketch below illustrates the shape).
  • Implemented a Python function to extract specific consigna schemas from the overall rubric evaluation schema (see the third sketch below).
  • Resolved errors caused by invalid JSON schemas in OpenAI API calls by ensuring the structure met the API's validation requirements.
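
A minimal sketch of what the traffic-light rubric schema might look like; every field name here (consigna_id, criteria, rating, justification) is an assumption for illustration, not necessarily the structure produced in this session.

```python
# Hypothetical shape of the rubric evaluation schema; field names are assumptions.
RUBRIC_EVALUATION_SCHEMA = {
    "type": "object",
    "properties": {
        "consigna_id": {"type": "string"},
        "criteria": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    # Traffic-light rating restricted to the three allowed values.
                    "rating": {"type": "string", "enum": ["green", "yellow", "red"]},
                    "justification": {"type": "string"},
                },
                "required": ["name", "rating", "justification"],
                "additionalProperties": False,
            },
        },
    },
    "required": ["consigna_id", "criteria"],
    "additionalProperties": False,
}
```

OpenAI's strict Structured Outputs mode rejects schemas that omit "additionalProperties": false or leave a declared property out of "required", which is the kind of invalid-schema error the last activity refers to.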
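
A hedged sketch of how the AIEvaluator's OpenAI call, error handling, and JSON result storage could be wired together. Apart from the OpenAI SDK's own interfaces (chat.completions.create, response_format), the class and method names here are assumptions, not the session's actual implementation.

```python
import json
from pathlib import Path

from openai import OpenAI, OpenAIError


class AIEvaluator:
    def __init__(self, model: str = "gpt-4o-2024-08-06"):
        self.model = model
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def evaluate(self, notebook_text: str, schema: dict, out_path: Path) -> dict | None:
        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=[
                    {"role": "system",
                     "content": "Evaluate the notebook against the rubric."},
                    {"role": "user", "content": notebook_text},
                ],
                # Structured Outputs: constrain the reply to the rubric JSON schema.
                response_format={
                    "type": "json_schema",
                    "json_schema": {
                        "name": "rubric_evaluation",
                        "schema": schema,
                        "strict": True,
                    },
                },
            )
            result = json.loads(response.choices[0].message.content)
        except (OpenAIError, json.JSONDecodeError) as exc:
            # Surface API failures or malformed responses without crashing the workflow.
            print(f"Evaluation failed: {exc}")
            return None
        # Persist the structured result so later steps can aggregate it.
        out_path.write_text(json.dumps(result, indent=2, ensure_ascii=False))
        return result
```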
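
A hypothetical version of the consigna-extraction helper; it assumes the overall rubric schema nests one sub-schema per consigna under "properties", which may differ from the session's actual layout.

```python
def extract_consigna_schema(rubric_schema: dict, consigna_id: str) -> dict:
    """Return a standalone JSON schema for a single consigna, keeping strict-mode fields."""
    properties = rubric_schema.get("properties", {})
    if consigna_id not in properties:
        raise KeyError(f"Consigna '{consigna_id}' not found in the rubric schema")
    return {
        "type": "object",
        "properties": {consigna_id: properties[consigna_id]},
        "required": [consigna_id],
        "additionalProperties": False,
    }
```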

Achievements

  • Successfully designed and partially implemented an AI evaluation workflow.
  • Improved the AIEvaluator class for better API integration and error handling.
  • Created a robust JSON schema for rubric evaluations.

Pending Tasks

  • Complete the integration of the AI evaluation workflow with the OpenAI API.
  • Further test the error-handling mechanisms and schema validation.