📅 2024-09-11 — Session: Data Gathering and Processing Workflow Execution
🕒 17:00–19:00
🏷️ Labels: Data Gathering, Web Crawling, Data Processing, Investment, Optimization
📂 Project: Dev
⭐ Priority: MEDIUM
Session Goal
The goal of this session was to outline and execute workflows for data gathering and processing, focusing on value-based investment strategies.
Key Activities
- Data Gathering and Processing Workflow: Developed a comprehensive workflow for creating subsets of stores, crawling data, and processing it for value-based investment baskets.
- Precios Claros Scraping: Outlined a workflow for scraping and analyzing store data using the Precios Claros crawler.
- Store Selection: Implemented a Python approach to select the closest stores by group using DataFrame operations.
- Geospatial Data Handling: Fetched GeoJSON boundary files for Buenos Aires Province and CABA (the Autonomous City of Buenos Aires), and optimized geospatial data visualization using GeoPandas and Matplotlib.
- Scrapy Crawler Optimization: Optimized a Scrapy crawler for specific store IDs and analyzed execution logs to improve efficiency.
- Savings Opportunity Calculation: Calculated potential savings by comparing current prices to median prices, aiming to optimize a basket of goods.
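The store-selection step above can be sketched with pandas. The column names (`store_id`, `group`, `lat`, `lon`), the reference point, and the sample data are assumptions for illustration, not the session's actual schema:

```python
import math

import pandas as pd

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def closest_stores_by_group(stores: pd.DataFrame, ref_lat: float,
                            ref_lon: float, n: int = 1) -> pd.DataFrame:
    """Return the n stores nearest to (ref_lat, ref_lon) within each group."""
    stores = stores.copy()
    stores["dist_km"] = stores.apply(
        lambda row: haversine_km(ref_lat, ref_lon, row["lat"], row["lon"]), axis=1
    )
    # sort once by distance, then keep the first n rows of each group
    return stores.sort_values("dist_km").groupby("group").head(n)

# Hypothetical stores near the Buenos Aires Obelisco (-34.6037, -58.3816)
df = pd.DataFrame({
    "store_id": [1, 2, 3, 4],
    "group": ["A", "A", "B", "B"],
    "lat": [-34.60, -34.70, -34.61, -34.90],
    "lon": [-58.38, -58.40, -58.37, -58.50],
})
print(closest_stores_by_group(df, -34.6037, -58.3816, n=1)["store_id"].tolist())
# → [1, 3]
```

Sorting globally before `groupby(...).head(n)` avoids a per-group `apply`, which keeps the selection vectorized and fast on large store tables.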
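The session loaded the fetched GeoJSON files with GeoPandas; as a dependency-free illustration of the underlying format, here is a stdlib-only sketch of reading a FeatureCollection, with a toy feature standing in for the real CABA file:

```python
import json

# Toy FeatureCollection standing in for the real Buenos Aires / CABA GeoJSON
geojson = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Comuna 1"},
     "geometry": {"type": "Point", "coordinates": [-58.38, -34.60]}}
  ]
}
""")

def feature_names(collection: dict) -> list[str]:
    """Extract the 'name' property from every feature in a FeatureCollection."""
    return [f["properties"].get("name", "") for f in collection.get("features", [])]

print(feature_names(geojson))
# → ['Comuna 1']
```

With GeoPandas the equivalent is a one-liner (`geopandas.read_file(path)`), which also parses geometries into shapely objects ready for plotting with Matplotlib.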
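The savings-opportunity step compares each product's current price against its median across stores. A minimal pure-Python sketch of that comparison, with product names and prices invented for illustration:

```python
from statistics import median

def savings_opportunities(prices_by_product: dict[str, list[float]],
                          current: dict[str, float]) -> dict[str, float]:
    """For each product, how much the current price exceeds the cross-store median.

    Positive values indicate a potential saving by buying elsewhere;
    non-positive values mean the current price is already at or below
    the median.
    """
    return {
        product: current[product] - median(prices)
        for product, prices in prices_by_product.items()
        if product in current
    }

# Hypothetical price observations across stores
observed = {
    "milk_1l": [100.0, 110.0, 120.0],
    "rice_1kg": [200.0, 210.0, 250.0],
}
current_store = {"milk_1l": 125.0, "rice_1kg": 205.0}
print(savings_opportunities(observed, current_store))
# → {'milk_1l': 15.0, 'rice_1kg': -5.0}
```

Summing the positive entries over a whole basket gives the total saving available by re-sourcing overpriced items, which is the quantity the basket optimization targets.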
Achievements
- Established a structured approach for data gathering and processing.
- Enhanced scraping efficiency and data visualization techniques.
Pending Tasks
- Further optimization of data processing workflows.
- Implementation of savings opportunity strategies in investment decisions.