📅 2025-02-24 — Session: Designed Generalized Bill Ingestion System and Data Schema
🕒 21:15–21:50
🏷️ Labels: Bill Ingestion, Data Extraction, Financial Schema, Automation, Credit Card Processing, Property Tax
📂 Project: Dev
⭐ Priority: MEDIUM
Session Goal:
Design a generalized system for ingesting bills and extracting data from financial documents, and define a unified data schema for financial reporting.
Key Activities:
- Bill Ingestion System Design: Developed a plan for parsing bills from PDFs, standardizing the extracted data, and storing it in a structured format. The plan covers custom parsers for each bill type and automation of the ingestion process.
- Directory Structure Analysis: Analyzed the directory structure for organizing raw data, identified strengths and potential issues, and provided recommendations for an automated ingestion pipeline.
- Document Understanding: Distinguished the two AySA document types, CuponPago (payment coupon) and Factura (invoice), proposed a data structure for them, and outlined an ingestion plan.
- Credit Card Data Extraction: Discussed extracting data from credit card statements, including key fields and integration into a financial data processing pipeline.
- Unified Parsing Approach: Proposed a unified approach for parsing Visa and Mastercard statements.
- General Financial Schema Design: Outlined a schema for managing financial documents, including tables for bills, credit card transactions, and payments.
- Property Tax Bills Analysis: Analyzed sample property tax bills and suggested a parser design for them.
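The general financial schema described above (tables for bills, credit card transactions, and payments) could look roughly like the sketch below. It is only an illustration: the session notes name the three tables but not their columns, so every column name, the choice of SQLite, and the helper `outstanding_cents` are assumptions.

```python
import sqlite3

# Hypothetical columns: the session only fixed the three table names.
# Amounts are stored as integer cents to avoid floating-point rounding.
SCHEMA = """
CREATE TABLE bills (
    id            INTEGER PRIMARY KEY,
    provider      TEXT NOT NULL,      -- e.g. 'AySA'
    document_type TEXT NOT NULL,      -- e.g. 'Factura' or 'CuponPago'
    issue_date    TEXT,               -- ISO 8601 date string
    due_date      TEXT,
    amount_cents  INTEGER NOT NULL,
    source_file   TEXT                -- path of the raw PDF
);
CREATE TABLE credit_card_transactions (
    id               INTEGER PRIMARY KEY,
    card_network     TEXT NOT NULL,   -- 'Visa' or 'Mastercard'
    transaction_date TEXT NOT NULL,
    description      TEXT NOT NULL,
    amount_cents     INTEGER NOT NULL
);
CREATE TABLE payments (
    id           INTEGER PRIMARY KEY,
    bill_id      INTEGER REFERENCES bills(id),
    paid_on      TEXT NOT NULL,
    amount_cents INTEGER NOT NULL
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the three tables on the given connection."""
    conn.executescript(SCHEMA)


def outstanding_cents(conn: sqlite3.Connection, bill_id: int):
    """Return the bill amount minus all recorded payments, or None if the bill is unknown."""
    row = conn.execute(
        "SELECT b.amount_cents - COALESCE(SUM(p.amount_cents), 0) "
        "FROM bills b LEFT JOIN payments p ON p.bill_id = b.id "
        "WHERE b.id = ? GROUP BY b.id",
        (bill_id,),
    ).fetchone()
    return row[0] if row else None
```

Keeping payments in their own table, linked to bills by `bill_id`, allows partial payments to be recorded without modifying the bill row.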
Achievements:
- Developed a comprehensive plan for a generalized bill ingestion system.
- Created a unified data schema for financial documents.
- Proposed a unified parsing approach for Visa and Mastercard statements.
Pending Tasks:
- Implement the designed parsers and ingestion pipeline.
- Test the unified data schema with sample data.
- Develop a parser for property tax bills based on the analysis.
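As a possible starting point for the first pending task, the sketch below shows one way to organize per-bill-type parsers behind a single entry point. Everything in it is hypothetical: the names `BillRecord`, `register`, `PARSERS`, and `ingest`, the `aysa_factura` key, and especially the `Total:` line format, which is invented for the example and does not reflect the real AySA layout. It also assumes the PDF text has already been extracted to a string.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict


@dataclass
class BillRecord:
    """Standardized output shared by all parsers (hypothetical fields)."""
    provider: str
    document_type: str
    amount_cents: int


# Registry mapping a document-type key to its parser function.
PARSERS: Dict[str, Callable[[str], BillRecord]] = {}


def register(document_type: str):
    """Decorator that adds a parser to the registry under the given key."""
    def decorator(fn):
        PARSERS[document_type] = fn
        return fn
    return decorator


@register("aysa_factura")
def parse_aysa_factura(text: str) -> BillRecord:
    # The 'Total: 1234.56' pattern is an invented placeholder layout.
    match = re.search(r"Total:\s*([0-9]+(?:\.[0-9]{1,2})?)", text)
    if match is None:
        raise ValueError("no total found in document text")
    cents = int(Decimal(match.group(1)) * 100)
    return BillRecord(provider="AySA", document_type="Factura", amount_cents=cents)


def ingest(document_type: str, text: str) -> BillRecord:
    """Dispatch extracted text to the parser registered for its type."""
    parser = PARSERS.get(document_type)
    if parser is None:
        raise ValueError(f"no parser registered for {document_type!r}")
    return parser(text)
```

With this layout, adding the property tax parser would mean writing one more function decorated with `@register(...)`, leaving `ingest` unchanged.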