Designed Generalized Bill Ingestion System
- Day: 2025-02-24
- Time: 21:15 to 21:50
- Project: Dev
- Workspace: WP 2: Operational
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Bill Ingestion, Data Parsing, Automation, Financial Reporting, PDF Processing
Description
Session Goal
Design a generalized bill ingestion system that parses bills from PDFs, standardizes the extracted data, and stores it in a structured format for financial reporting.
Key Activities
- Developed a plan for a unified table format and custom parsers for different bill types.
- Analyzed the directory structure used to organize raw data and recommended improvements to support an automated ingestion pipeline.
- Explored the differences between the two AySA document types (CuponPago vs. Factura) and proposed a data structure for managing them.
- Outlined the process for extracting and structuring data from credit card statements, including Visa and Mastercard.
- Designed a general schema for managing financial documents, including bills and credit card transactions.
- Examined property tax bills in depth, analyzing their structure and recommending dedicated parsers for them.
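The unified table format and per-bill-type parsers described above could be sketched roughly as follows. This is a minimal illustration, not the design produced in the session: the BillRecord fields, the PARSERS registry, the "aysa_factura" key, the "Key: value" text layout, and the ARS currency default are all assumptions made for the example, and the text is assumed to have already been extracted from the PDF.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class BillRecord:
    """One row of the hypothetical unified table."""
    provider: str              # e.g. "AySA", "Visa"
    document_type: str         # e.g. "Factura", "CuponPago"
    due_date: Optional[date]
    amount: Decimal
    currency: str
    source_file: str


# A parser receives already-extracted PDF text plus the source path.
Parser = Callable[[str, str], BillRecord]
PARSERS: Dict[str, Parser] = {}


def register(bill_type: str) -> Callable[[Parser], Parser]:
    """Decorator that adds a parser to the registry under bill_type."""
    def decorator(func: Parser) -> Parser:
        PARSERS[bill_type] = func
        return func
    return decorator


@register("aysa_factura")
def parse_aysa_factura(text: str, source_file: str) -> BillRecord:
    # Assumed toy layout of "Key: value" lines; real AySA bills will differ.
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return BillRecord(
        provider="AySA",
        document_type="Factura",
        due_date=date.fromisoformat(fields["due"]),
        amount=Decimal(fields["total"]),
        currency="ARS",  # assumption for the example
        source_file=source_file,
    )


def ingest(bill_type: str, text: str, source_file: str) -> BillRecord:
    """Dispatch to the parser registered for bill_type."""
    try:
        parser = PARSERS[bill_type]
    except KeyError:
        raise ValueError(f"no parser registered for {bill_type!r}") from None
    return parser(text, source_file)
```

Under these assumptions, calling ingest("aysa_factura", "Due: 2025-03-10\nTotal: 12345.67", "example.pdf") returns a single BillRecord. Supporting another document type (CuponPago, a Visa or Mastercard statement, a property tax bill) would mean registering one more parser without changing the table format.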
Achievements
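- Established a design framework for bill ingestion and financial data processing, built on a unified table format and custom parsers per bill type.
- Clarified how the document types differ (AySA CuponPago vs. Factura, credit card statements, property tax bills) and what data structure each requires.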
Pending Tasks
- Implement the proposed unified parsing approach for Visa and Mastercard statements.
- Develop and test custom parsers for property tax bills and other unique document types.
Evidence
- source_file=2025-02-24.sessions.jsonl, line_number=0, event_count=0, session_id=d37951e904384a13865f0fd0b1978b1d36d3bc48f09a15b0832897525878e210
- event_ids: []