📅 2025-08-15 — Session: Conducted cost and performance analysis for AI embeddings

🕒 12:00–12:10
🏷️ Labels: AI Embeddings, Cost Analysis, GPU Performance, Pricing, Python
📂 Project: Business
⭐ Priority: MEDIUM

Session Goal

The session aimed to compare 2025 pricing and performance metrics for AI embedding models from OpenAI, Cohere, Google Vertex AI, and AWS Titan.

Key Activities

  1. Pricing Queries: Gathered search queries related to the pricing of AI embedding models from OpenAI, Cohere, Google, and AWS.
  2. Cost Estimation: Developed a Python script to estimate costs for processing abstracts using different AI models based on token consumption.
  3. Time Calculations: Created a function to calculate processing time for abstracts at various throughput rates.
  4. GPU Performance: Calculated the processing throughput of different GPU configurations.
  5. Cluster Cost Calculation: Implemented a function to compute costs for using GPU clusters based on hours, number of GPUs, and cost per GPU hour.
  6. Comprehensive Cost and Time Analysis: Analyzed costs and time estimates for embedding 40 million abstracts, providing recommendations for hosted APIs and self-hosted solutions.
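
The calculations in activities 2, 3, and 5 can be sketched as below. This is a minimal illustration, not the session's actual script: the function names are hypothetical, and every numeric input except the 40 million abstract count is a placeholder chosen for easy arithmetic, not a quoted provider price or a measured GPU throughput.

```python
def estimate_api_cost(num_abstracts, tokens_per_abstract, usd_per_million_tokens):
    """Hosted-API cost in USD, assuming billing per million input tokens."""
    total_tokens = num_abstracts * tokens_per_abstract
    return total_tokens / 1_000_000 * usd_per_million_tokens


def processing_hours(num_abstracts, abstracts_per_second):
    """Wall-clock hours to embed the corpus at a steady throughput."""
    if abstracts_per_second <= 0:
        raise ValueError("abstracts_per_second must be positive")
    return num_abstracts / abstracts_per_second / 3600


def cluster_cost(hours, num_gpus, usd_per_gpu_hour):
    """Self-hosted cost in USD: hours x number of GPUs x hourly rate per GPU."""
    return hours * num_gpus * usd_per_gpu_hour


if __name__ == "__main__":
    NUM_ABSTRACTS = 40_000_000         # corpus size from the session
    TOKENS_PER_ABSTRACT = 250          # assumption: average abstract length
    USD_PER_MILLION_TOKENS = 0.10      # placeholder, not a real provider quote
    ABSTRACTS_PER_SEC_PER_GPU = 500    # placeholder, not a measured figure
    NUM_GPUS = 4                       # placeholder cluster size
    USD_PER_GPU_HOUR = 2.00            # placeholder rental rate

    api_usd = estimate_api_cost(
        NUM_ABSTRACTS, TOKENS_PER_ABSTRACT, USD_PER_MILLION_TOKENS
    )
    # Assumes throughput scales linearly with the number of GPUs.
    hours = processing_hours(NUM_ABSTRACTS, ABSTRACTS_PER_SEC_PER_GPU * NUM_GPUS)
    gpu_usd = cluster_cost(hours, NUM_GPUS, USD_PER_GPU_HOUR)

    print(f"Hosted API: ${api_usd:,.2f}")
    print(f"Self-hosted: {hours:.2f} h on {NUM_GPUS} GPUs, ${gpu_usd:,.2f}")
```

Swapping in current provider rates and benchmarked throughput would turn the placeholders into a real comparison. The linear-scaling assumption for multi-GPU throughput is optimistic and is one of the things the pending validation against real-world data would need to check.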

Achievements

  • Completed a detailed cost and time analysis for embedding 40 million abstracts, including practical recommendations for implementation.

Pending Tasks

  • Further validation of cost models against real-world data to ensure accuracy and reliability.
  • Exploration of additional AI embedding providers for a more comprehensive comparison.