📅 2025-06-18 — Session: Developed AI prompt and evaluation strategy for Onyx Hammer

🕒 23:30–00:00
🏷️ Labels: AI, Prompt Engineering, Onyx Hammer, Rubric Design, Model Evaluation
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal

The session focused on developing a comprehensive strategy for AI prompt engineering and evaluation, centered on the Onyx Hammer project and on probing AI robustness in Physics.

Key Activities

  • Reviewed the structural breakdown of the Onyx Hammer project, including its mission, compensation model, and current status.
  • Outlined a strategy for prompt engineering and AI robustness evaluation, targeting weaknesses in AI models within the Physics domain.
  • Crafted a complex AI prompt on atmospheric entry dynamics, designed to demand expert-level reasoning and quantitative modeling.
  • Analyzed quiz items from Step 3 of the Onyx Hammer framework to understand AI model performance and failure criteria.
  • Constructed a rubric for evaluating AI responses, emphasizing clarity, objectivity, and alignment with prompt demands.
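
The atmospheric entry prompt above targets exactly the kind of quantitative modeling an expert would show. As a rough illustration of that territory, here is a minimal ballistic-entry sketch (flat Earth, exponential atmosphere); every parameter value and the function name are illustrative assumptions, not material from the actual prompt:

```python
import math

# Illustrative constants (assumed, not from the session's prompt).
RHO0 = 1.225      # sea-level air density, kg/m^3
H_SCALE = 7200.0  # atmospheric density scale height, m
G = 9.81          # gravitational acceleration, m/s^2

def entry_trajectory(v0, h0, gamma_deg, beta, dt=0.1):
    """Integrate a simple ballistic entry and report peak deceleration.

    v0: entry speed (m/s); h0: entry altitude (m);
    gamma_deg: flight-path angle below horizontal (degrees);
    beta: ballistic coefficient m / (Cd * A), kg/m^2.
    Returns (peak deceleration in g, altitude of that peak in m).
    """
    gamma = math.radians(gamma_deg)
    v, h = v0, h0
    peak_a, peak_h = 0.0, h0
    while h > 0 and v > 0:
        rho = RHO0 * math.exp(-h / H_SCALE)       # exponential atmosphere
        drag_a = rho * v * v / (2.0 * beta)       # drag deceleration, m/s^2
        if drag_a > peak_a:
            peak_a, peak_h = drag_a, h
        # Gravity component along the (descending) velocity vector adds speed;
        # drag removes it. Simple forward-Euler step.
        v += (G * math.sin(gamma) - drag_a) * dt
        h -= v * math.sin(gamma) * dt
    return peak_a / G, peak_h
```

For a steep low-Earth-orbit-like entry (e.g. 7.8 km/s at 10 degrees), this reproduces the classic result that peak deceleration lands in the tens of g well above the surface, which is the sort of limit-checking reasoning a strong response should exhibit.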

Achievements

  • Developed a detailed strategy for prompt engineering and AI robustness evaluation.
  • Created an expert-level AI prompt for Physics modeling.
  • Analyzed and clarified quiz responses and failure criteria for AI models.
  • Constructed effective rubric criteria for AI evaluation.
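
The rubric criteria above (clarity, objectivity, alignment with prompt demands) can be operationalized as a weighted score. A hypothetical sketch follows; the criterion names, weights, and 0–4 scale are illustrative assumptions, not the actual Onyx Hammer rubric:

```python
# Hypothetical weighted rubric (names and weights are illustrative).
RUBRIC = {
    "physical_correctness": 0.4,    # equations and limiting cases correct
    "reasoning_transparency": 0.3,  # steps shown, assumptions stated
    "prompt_alignment": 0.2,        # answers what was actually asked
    "clarity": 0.1,                 # organized, unambiguous prose
}

def score_response(ratings):
    """Combine per-criterion ratings on a 0-4 scale into a 0-100 score."""
    missing = set(RUBRIC) - set(ratings)
    if missing:
        raise ValueError(f"unrated criteria: {sorted(missing)}")
    raw = sum(RUBRIC[c] * ratings[c] for c in RUBRIC)  # weighted, 0..4
    return round(100.0 * raw / 4.0, 1)
```

Keeping weights explicit like this makes the rubric easy to adjust during the monitoring phase noted under Pending Tasks, without rewriting the scoring logic.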

Pending Tasks

  • Implement the developed strategy and prompts in real-world testing scenarios.
  • Monitor AI model performance using the constructed rubric and adjust criteria as needed.