Finalized dataset and extrapolation for forecasting

  • Day: 2024-09-28
  • Time: 16:05 to 16:45
  • Project: Dev
  • Workspace: WP 2: Operational
  • Status: In Progress
  • Priority: MEDIUM
  • Assignee: Matías Nehuen Iglesias
  • Tags: Data Analysis, Forecasting, Python, Pandas, Extrapolation

Description

Session Goal

The goal of the session was to finalize a dataset for model fitting: extend the DataFrame with forecasted values, keep each series continuous, and leave it ready for later analysis.

Key Activities

  • Developed Python scripts to extend a DataFrame (combined_df) to January 2025 using two approaches: plain linear extrapolation, and linear extrapolation combined with median month differences.
  • Implemented extrapolation functions to handle individual last valid dates for various time series columns.
  • Corrected the extrapolation method to ensure each series is handled independently, avoiding NaN issues.
  • Merged extrapolated values into the existing DataFrame, aligning them on the existing rows without creating new ones.
  • Automated the filling of NaN values in DataFrame columns using extrapolated data and cleaned up redundant columns.
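
The per-series extrapolation described above can be sketched roughly as follows. This is a minimal illustration, not the session's actual script: the function names (extrapolate_linear, extend_frame), the n_fit window, the month-start frequency, and the sample values are all assumptions. Each column is extended from its own last valid date, and the line is anchored at the last observed value so the series stays continuous.

```python
import numpy as np
import pandas as pd


def extrapolate_linear(series: pd.Series, n_fit: int = 6) -> pd.Series:
    """Fill the trailing NaNs of one series with a straight line.

    Hypothetical helper: the slope is fitted on the last `n_fit` valid
    points and the line starts at the last valid observation.
    """
    values = series.to_numpy(dtype=float)
    valid_pos = np.flatnonzero(~np.isnan(values))
    if len(valid_pos) < 2:
        # Not enough data to estimate a slope; return the series unchanged.
        return series.astype(float)

    fit_pos = valid_pos[-n_fit:]
    slope, _ = np.polyfit(fit_pos, values[fit_pos], deg=1)

    # Only positions after this series' own last valid point are filled.
    last_pos = valid_pos[-1]
    tail = np.arange(last_pos + 1, len(values))
    filled = values.copy()
    filled[tail] = values[last_pos] + slope * (tail - last_pos)
    return pd.Series(filled, index=series.index, name=series.name)


def extend_frame(df: pd.DataFrame, end: str, n_fit: int = 6) -> pd.DataFrame:
    """Extend a month-start indexed frame to `end` (assumed frequency: "MS")."""
    full_index = pd.date_range(df.index.min(), end, freq="MS")
    extended = df.reindex(full_index)
    # apply() works column by column, so each series is handled independently.
    return extended.apply(extrapolate_linear, n_fit=n_fit)


# Illustrative data only: column "b" ends two months earlier than "a".
idx = pd.date_range("2024-06-01", "2024-10-01", freq="MS")
combined_df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [10.0, 12.0, 14.0, np.nan, np.nan],
    },
    index=idx,
)
extended = extend_frame(combined_df, "2025-01-01")
print(extended)
```

Because the fill starts after each column's own last valid position, a column that ends early (like "b" here) is extended from its own end date rather than from the end of the frame. Gaps in the middle of a series are left untouched in this sketch.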

Achievements

  • Successfully extended the dataset to include forecasted values up to January 2025.
  • Handled each time series independently and corrected the extrapolation method; forecast accuracy still has to be confirmed (see Pending Tasks).
  • Improved data integrity by merging extrapolated values correctly and cleaning up the DataFrame.
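
One way to merge extrapolated values without adding rows is sketched below. It is only an illustration under assumptions: the helper name merge_extrapolated, the "_extrap" suffix, and the sample data are hypothetical, and both frames are assumed to share a unique date index. A left join keeps exactly the rows of the existing frame, existing values take priority over extrapolated ones, and the helper columns are dropped at the end.

```python
import numpy as np
import pandas as pd


def merge_extrapolated(
    df: pd.DataFrame, extrapolated: pd.DataFrame, suffix: str = "_extrap"
) -> pd.DataFrame:
    """Fill NaNs in `df` from `extrapolated`, keeping the rows of `df` only."""
    # Left join on the index: dates that exist only in `extrapolated` are ignored.
    merged = df.join(extrapolated.add_suffix(suffix), how="left")
    for col in extrapolated.columns:
        if col in df.columns:
            # Existing values win; only missing cells take the extrapolated value.
            merged[col] = merged[col].fillna(merged[col + suffix])
    # Remove the temporary helper columns.
    return merged.drop(columns=[col + suffix for col in extrapolated.columns])


# Illustrative data only.
idx = pd.date_range("2024-10-01", "2025-01-01", freq="MS")
combined_df = pd.DataFrame(
    {"a": [5.0, np.nan, np.nan, np.nan], "b": [1.0, 2.0, np.nan, np.nan]},
    index=idx,
)
extrap_df = pd.DataFrame(
    {"a": [6.0, 7.0, 8.0, 9.0], "b": [99.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2024-11-01", "2025-02-01", freq="MS"),
)

result = merge_extrapolated(combined_df, extrap_df)
print(result)
```

In this example the February 2025 row of extrap_df is not added to the result, and the observed value of "b" for November 2024 is kept instead of the extrapolated one.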

Pending Tasks

  • Validate the forecasted dataset through model fitting and performance evaluation to ensure its readiness for predictive analysis.

Evidence

  • source_file=2024-09-28.sessions.jsonl, line_number=0, event_count=0, session_id=6288251d61bfb632e6d29aa9f275b9fb9e08fee49e6c2987451d68cd44921081
  • event_ids: []