Finalized dataset and extrapolation for forecasting
- Day: 2024-09-28
- Time: 16:05 to 16:45
- Project: Dev
- Workspace: WP 2: Operational
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Data Analysis, Forecasting, Python, Pandas, Extrapolation
Description
Session Goal
The session aimed to finalize a dataset for model fitting by extending the DataFrame with forecasted values, ensuring continuity, and preparing it for future analysis.
Key Activities
- Developed Python scripts to extend a DataFrame (combined_df) to January 2025 using linear extrapolation and a combination of linear extrapolation with median month differences.
- Implemented extrapolation functions to handle individual last valid dates for various time series columns.
- Corrected the extrapolation method to ensure each series is handled independently, avoiding NaN issues.
- Merged extrapolated values into the existing DataFrame, ensuring proper alignment and avoiding new row creation.
- Automated the filling of NaN values in DataFrame columns using extrapolated data and cleaned up redundant columns.
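The per-series extrapolation described in the activities above could look roughly like the sketch below. The session's actual scripts are not recorded here, so everything in it is an assumption made for illustration: the function name extend_linear, the n_fit parameter, the month-start DatetimeIndex, and the choice to fit the slope on the last few valid points. It covers only the plain linear variant, not the median month-difference combination.

```python
import numpy as np
import pandas as pd


def extend_linear(df: pd.DataFrame, end: str, n_fit: int = 6) -> pd.DataFrame:
    """Extend df to `end`, extrapolating each column from its own last valid date.

    Assumes df has a sorted, unique, month-start DatetimeIndex (hypothetical setup).
    """
    # Reindex onto the full monthly range: no duplicate rows are created,
    # existing rows keep their positions, and new rows start as NaN.
    full_index = pd.date_range(df.index.min(), pd.Timestamp(end), freq="MS")
    out = df.reindex(full_index)

    for col in out.columns:
        series = out[col]
        last_valid = series.last_valid_index()
        if last_valid is None:
            continue  # column is entirely NaN, nothing to extrapolate from
        last_pos = full_index.get_loc(last_valid)

        # Fit a straight line to the last n_fit valid points of this column only.
        known = series.iloc[: last_pos + 1].dropna().tail(n_fit)
        if len(known) < 2:
            continue  # a slope needs at least two points
        x = full_index.get_indexer(known.index).astype(float)
        slope, _ = np.polyfit(x, known.to_numpy(dtype=float), 1)

        # Continue from the last observed value so the series has no jump.
        future_index = full_index[last_pos + 1 :]
        steps = np.arange(1, len(future_index) + 1, dtype=float)
        out.loc[future_index, col] = series.iloc[last_pos] + slope * steps

    return out
```

Because each column is anchored at its own last valid date, a series that ends earlier than the others is extrapolated over a longer horizon instead of inheriting NaN from a shared cutoff. Gaps before a column's last valid date are left untouched.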
Achievements
- Successfully extended the dataset to include forecasted values up to January 2025.
- Made the forecasts consistent by handling each time series independently and correcting the extrapolation method; accuracy is still to be confirmed through model fitting.
- Improved data integrity by merging extrapolated values correctly and cleaning up the DataFrame.
Pending Tasks
- Validate the forecasted dataset through model fitting and performance evaluation to ensure its readiness for predictive analysis.
Evidence
- source_file=2024-09-28.sessions.jsonl, line_number=0, event_count=0, session_id=6288251d61bfb632e6d29aa9f275b9fb9e08fee49e6c2987451d68cd44921081
- event_ids: []