Designed Generalized Bill Ingestion System

  • Day: 2025-02-24
  • Time: 21:15 to 21:50
  • Project: Dev
  • Workspace: WP 2: Operational
  • Status: In Progress
  • Priority: MEDIUM
  • Assignee: Matías Nehuen Iglesias
  • Tags: Bill Ingestion, Data Parsing, Automation, Financial Reporting, PDF Processing

Description

Session Goal

The session aimed to design a generalized bill ingestion system that parses bills from PDFs, standardizes the extracted data, and stores it in a structured format for financial reporting.
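
A minimal sketch of the "standardize and store" half of that goal, assuming SQLite as the structured store. The table layout, the column names, the Argentine-style amount format ("12.345,67"), and the helpers to_cents, store, and total_by_issuer are all illustrative assumptions, not decisions recorded in the session. PDF text extraction is out of scope here.

```python
import sqlite3
from decimal import Decimal

# Hypothetical unified table; column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bills (
    issuer       TEXT    NOT NULL,
    doc_type     TEXT    NOT NULL,
    due_date     TEXT    NOT NULL,   -- ISO 8601 date string
    amount_cents INTEGER NOT NULL,   -- integer cents avoid float rounding
    source_file  TEXT    NOT NULL UNIQUE
)
"""


def to_cents(raw: str) -> int:
    """Convert an amount such as '12.345,67' to integer cents.

    Assumes '.' is the thousands separator and ',' the decimal separator.
    """
    normalized = raw.strip().replace(".", "").replace(",", ".")
    return int((Decimal(normalized) * 100).to_integral_value())


def store(conn, issuer, doc_type, due_date, raw_amount, source_file):
    """Insert one standardized bill; re-ingesting the same file is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO bills VALUES (?, ?, ?, ?, ?)",
        (issuer, doc_type, due_date, to_cents(raw_amount), source_file),
    )


def total_by_issuer(conn):
    """Return {issuer: total_cents}, the kind of query a report would run."""
    rows = conn.execute(
        "SELECT issuer, SUM(amount_cents) FROM bills "
        "GROUP BY issuer ORDER BY issuer"
    )
    return dict(rows.fetchall())
```

The UNIQUE constraint on source_file combined with INSERT OR IGNORE is one way to keep an automated pipeline idempotent when the same PDF is picked up twice.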

Key Activities

  • Developed a plan for a unified table format and custom parsers for different bill types.
  • Analyzed directory structure for organizing raw data and recommended improvements for an automated ingestion pipeline.
  • Explored the differences between the two AySA document types (CuponPago vs. Factura) and proposed a data structure for managing them.
  • Outlined the process for extracting and structuring data from credit card statements, including Visa and Mastercard.
  • Designed a general schema for managing financial documents, including bills and credit card transactions.
  • Conducted a deep dive into property tax bills, analyzing their structures and recommending dedicated parsers for them.
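
The combination of a unified table format with custom per-document parsers could be sketched as a small parser registry. This is an illustration under stated assumptions, not the design produced in the session: the BillRecord fields, the register decorator, the ingest function, and the "VENCIMIENTO:" / "TOTAL:" text layout are all hypothetical, and the parser receives text that has already been extracted from the PDF.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class BillRecord:
    """One row of the unified table (field names are illustrative)."""
    issuer: str        # e.g. "AySA"
    doc_type: str      # e.g. "Factura" or "CuponPago"
    due_date: date
    amount: Decimal
    currency: str = "ARS"


Parser = Callable[[str], BillRecord]

# Registry mapping (issuer, doc_type) to the parser that handles it.
PARSERS: Dict[Tuple[str, str], Parser] = {}


def register(issuer: str, doc_type: str):
    """Decorator that adds a parser function to the registry."""
    def decorator(func: Parser) -> Parser:
        PARSERS[(issuer, doc_type)] = func
        return func
    return decorator


@register("AySA", "Factura")
def parse_aysa_factura(text: str) -> BillRecord:
    # Hypothetical layout: one "KEY: value" pair per line.
    fields = dict(
        line.split(":", 1) for line in text.splitlines() if ":" in line
    )
    return BillRecord(
        issuer="AySA",
        doc_type="Factura",
        due_date=date.fromisoformat(fields["VENCIMIENTO"].strip()),
        amount=Decimal(fields["TOTAL"].strip()),
    )


def ingest(issuer: str, doc_type: str, text: str) -> BillRecord:
    """Dispatch extracted text to the matching parser."""
    try:
        parser = PARSERS[(issuer, doc_type)]
    except KeyError:
        raise ValueError(
            f"no parser registered for {issuer}/{doc_type}"
        ) from None
    return parser(text)
```

With this shape, supporting a new document type (a property tax bill, a card statement) means adding one decorated function, while everything downstream keeps consuming BillRecord.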

Achievements

  • Established a comprehensive framework for bill ingestion and financial data processing.
  • Clarified the differences and data structure needs for various financial documents.

Pending Tasks

  • Implement the proposed unified parsing approach for Visa and Mastercard statements.
  • Develop and test custom parsers for property tax bills and other unique document types.

Evidence

  • source_file=2025-02-24.sessions.jsonl, line_number=0, event_count=0, session_id=d37951e904384a13865f0fd0b1978b1d36d3bc48f09a15b0832897525878e210
  • event_ids: []