📅 2025-02-27 — Session: Optimized BigQuery and Backup Storage Management

🕒 16:20–17:45
🏷️ Labels: BigQuery, Backup Management, Data Optimization, Cloud Storage, Disk Cleanup
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal

Optimize data storage and management, with a focus on BigQuery costs and the cleanup of backup directories.

Key Activities

  • Compared storage costs and strategies for BigQuery and Google Cloud Storage (GCS), with the goal of reducing cost.
  • Applied optimization techniques for storing and querying large datasets in BigQuery, using Parquet files and external tables.
  • Developed a systematic approach for reorganizing and cleaning up stored data, including merging old backups and identifying redundant files.
  • Diagnosed and resolved disk space discrepancies in backup directories using command-line tools.
  • Executed a cleanup plan for the French_exporters project directory, removing duplicate files and reducing disk usage.
  • Used git filter-repo to clean up the Git history and keep the repository compact.
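
The redundant-file step above can be sketched in Python. This is a minimal illustration, not the commands actually run in the session: the function names (file_digest, find_duplicates, reclaimable_bytes) are hypothetical, and the approach (group by size, then by SHA-256 of the content) is an assumption about how duplicates were identified.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root by content; return groups of 2+ paths."""
    # First pass: bucket by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}


def reclaimable_bytes(duplicates):
    """Bytes freed if all but one copy in each group were removed."""
    return sum(os.path.getsize(p[0]) * (len(p) - 1) for p in duplicates.values())
```

Calling find_duplicates on a backup root and passing the result to reclaimable_bytes gives an estimate of the space a cleanup could recover. The sketch only reports duplicates and deletes nothing, so the groups can be reviewed before any file is removed.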

Achievements

  • Outlined strategies for cost-effective data management in BigQuery and GCS.
  • Completed a detailed plan for backup cleanup and optimization, expected to recover significant disk space.
  • Cleaned up the French_exporters project directory by removing unnecessary files and compressing datasets.

Pending Tasks

  • Investigate and implement further cost-saving measures in cloud storage.
  • Keep monitoring backup directories and clean them up as needed to keep storage use low.