Developed AI prompt and evaluation strategy for Onyx Hammer

  • Day: 2025-06-18
  • Time: 23:30 to 00:00
  • Project: Dev
  • Workspace: WP 2: Operational
  • Status: In Progress
  • Priority: MEDIUM
  • Assignee: Matías Nehuen Iglesias
  • Tags: AI, Prompt Engineering, Onyx Hammer, Rubric Design, Model Evaluation

Description

Session Goal

The session aimed to develop a comprehensive strategy for AI prompt engineering and evaluation, focused on the Onyx Hammer project and on probing AI robustness in the Physics domain.

Key Activities

  • Reviewed the structural breakdown of the Onyx Hammer project, including its mission, compensation model, and current status.
  • Outlined a strategy for prompt engineering and AI robustness evaluation, targeting weaknesses in AI models within the Physics domain.
  • Crafted a complex AI prompt on atmospheric entry dynamics that demands expert-level reasoning and modeling (see the modeling sketch after this list).
  • Analyzed quiz items from Step 3 of the Onyx Hammer framework to understand AI model performance and failure criteria.
  • Constructed a rubric for evaluating AI responses, emphasizing clarity, objectivity, and alignment with prompt demands.
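
For context on the kind of expert modeling the atmospheric-entry prompt demands, a minimal sketch of the classical Allen-Eggers ballistic-entry approximation follows. The log does not record the exact quantities the prompt covers, so the symbols here (entry speed V_E, flight-path angle γ, scale height H, ballistic coefficient β) are illustrative assumptions.

```latex
% Illustrative only: classical Allen-Eggers ballistic-entry model,
% not necessarily the formulation used in the session's prompt.
\begin{align}
\rho(h) &= \rho_0 \, e^{-h/H}
  && \text{exponential atmosphere with scale height } H \\
V(h)   &= V_E \exp\!\left(-\frac{\rho(h)\, H}{2\beta \sin\gamma}\right),
  \qquad \beta = \frac{m}{C_D A}
  && \text{velocity decay along a straight-line trajectory} \\
a_{\max} &= \frac{V_E^{2} \sin\gamma}{2\, e\, H}
  && \text{peak deceleration, independent of } \beta
\end{align}
```

A prompt built around results like these can probe whether a model reasons through the derivation rather than pattern-matching the final formula, in line with the robustness focus described above.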

Achievements

  • Developed a detailed strategy for prompt engineering and AI robustness evaluation.
  • Created an expert-level AI prompt for Physics modeling.
  • Analyzed and clarified quiz responses and failure criteria for AI models.
  • Constructed effective rubric criteria for AI evaluation (see the rubric sketch after this list).
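
Below is a minimal sketch of how rubric criteria like those described above might be encoded and applied; the criterion names, weights, and 0-4 grading scale are assumptions for illustration, not the rubric actually produced in the session.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    name: str          # what the criterion measures
    description: str   # guidance for the grader
    weight: float      # relative importance; weights sum to 1.0

# Assumed criteria mirroring the qualities named above (clarity,
# objectivity, alignment with prompt demands); weights are illustrative.
RUBRIC = [
    Criterion("clarity", "Reasoning is explicit and easy to follow.", 0.3),
    Criterion("objectivity", "Claims are justified, not merely asserted.", 0.3),
    Criterion("prompt_alignment", "Every demand in the prompt is addressed.", 0.4),
]

def score_response(grades: dict[str, int], max_grade: int = 4) -> float:
    """Combine per-criterion grades (0..max_grade) into a weighted 0-1 score."""
    return sum(c.weight * grades[c.name] / max_grade for c in RUBRIC)

# Example: clear (4) and well aligned (3) but weakly justified (2)
print(score_response({"clarity": 4, "objectivity": 2, "prompt_alignment": 3}))
# 0.3*1.0 + 0.3*0.5 + 0.4*0.75 = 0.75
```

A weighted linear combination keeps grading transparent: each criterion's contribution to the total is visible, which makes it easy to see where a response failed and to adjust weights later.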

Pending Tasks

  • Implement the developed strategy and prompts in real-world testing scenarios.
  • Monitor AI model performance using the constructed rubric and adjust criteria as needed (see the monitoring sketch after this list).
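
As a sketch of the monitoring task above, the snippet below flags responses whose weighted rubric score falls under a pass threshold so the failure rate can be tracked over time; the threshold and sample scores are assumed values, not data from the session.

```python
PASS_THRESHOLD = 0.7  # assumed cutoff, not taken from the session log

def failure_rate(scores: list[float], threshold: float = PASS_THRESHOLD) -> float:
    """Fraction of responses scoring below the pass threshold."""
    return sum(s < threshold for s in scores) / len(scores)

batch = [0.92, 0.75, 0.41, 0.68, 0.88]  # hypothetical weighted rubric scores
print(f"failure rate: {failure_rate(batch):.0%}")  # -> failure rate: 40%
```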

Evidence

  • source_file=2025-06-18.sessions.jsonl, line_number=1, event_count=0, session_id=e66b7ff4255dd1c7e68f5c460064d99629d607343090e825ce95a704f5e128f7
  • event_ids: []