📅 2025-07-05 - Session: Developed and Refined CSV Data Processing Pipelines

🕒 23:45–00:00
🏷️ Labels: CSV Processing, Data Transformation, Python, Pandas, Banking
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal:

The session aimed to develop and refine data processing pipelines for Erste Bank CSV files and Banco Galicia Excel extracts, focusing on encoding issues, data cleaning, and transformation.

Key Activities:

  • Resolved a CSV file encoding error by changing the encoding to 'utf-16', providing a Python code snippet for implementation.
  • Outlined a complete pipeline for reading, cleaning, and exporting Erste Bank CSV files, ensuring proper handling of irregular fields.
  • Validated transaction data structures and suggested further analysis and automation steps.
  • Processed Galicia transaction data from Excel files, normalizing financial figures and standardizing dates.
  • Inspected DataFrame column names to diagnose issues with the expected 'Fecha' column.
  • Adjusted file loading to correctly set column names and process data without standard headers.
  • Reimported 'ace_tools' and displayed the corrected DataFrame of Galicia transactions.
  • Completed the transformation of the Galicia tables, with the option to either add the data to the CSV pipeline or review it first.
  • Provided a Python script to process Banco Galicia extracts, converting data into a standardized format.
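A minimal sketch of the encoding fix noted above, not the snippet from the session itself. It assumes the bank export starts with a byte-order mark, which is what lets the code tell UTF-16 apart from UTF-8. The helper names `detect_encoding` and `read_csv_rows` are hypothetical, and it uses only the standard library so it runs without pandas.

```python
import codecs
import csv
import io


def detect_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Pick an encoding from the byte-order mark; fall back to `default`."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # the utf-16 codec consumes the BOM itself
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # strips the UTF-8 BOM while decoding
    return default


def read_csv_rows(path, delimiter=","):
    """Read a CSV file into a list of rows, decoding by detected encoding."""
    with open(path, "rb") as fh:
        raw = fh.read()
    text = raw.decode(detect_encoding(raw))
    # newline="" leaves line endings untouched so csv can handle them.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
```

With pandas, the equivalent one-line fix is passing `encoding="utf-16"` to `pd.read_csv`. The BOM check only avoids hard-coding the encoding when some files arrive as UTF-8.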

Achievements:

  • Successfully developed pipelines for processing CSV and Excel files from Erste Bank and Banco Galicia.
  • Resolved encoding issues and standardized data formats for further analysis.

Pending Tasks:

  • Refactor the Python script into a reusable function for processing Banco Galicia extracts.
  • Further automate the transaction analysis and data validation processes.
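One possible shape for the refactor listed above, as a sketch under stated assumptions rather than the script from the session. It assumes Galicia amounts use the Argentine format (`1.234,56`) and dates are `dd/mm/yyyy`. Only the 'Fecha' column name comes from the session notes; `Descripcion`, `Importe`, and the function names are hypothetical placeholders.

```python
from datetime import datetime
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert '1.234,56' style text (assumed format) into a Decimal."""
    cleaned = text.strip().replace("$", "").replace(" ", "")
    # Drop thousands separators, then turn the decimal comma into a point.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


def parse_date(text: str) -> str:
    """Convert 'dd/mm/yyyy' (assumed format) into ISO 'yyyy-mm-dd'."""
    return datetime.strptime(text.strip(), "%d/%m/%Y").date().isoformat()


def normalize_rows(rows, columns=("Fecha", "Descripcion", "Importe")):
    """Attach column names to header-less rows and standardize the values."""
    records = []
    for row in rows:
        record = dict(zip(columns, row))  # column names are assumptions
        record["Fecha"] = parse_date(record["Fecha"])
        record["Importe"] = parse_amount(record["Importe"])
        records.append(record)
    return records
```

Passing the column names as a parameter mirrors the fix for extracts that lack a standard header row, and `Decimal` avoids float rounding on monetary values.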