📅 2023-01-04 — Session: Developed Python functions for data extraction and file handling

🕒 20:15–23:55
🏷️ Labels: Python, Data Extraction, File Handling, Optimization
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal: Develop and optimize Python functions for data extraction and file handling, focusing on GeoJSON files and on extracting the arguments of pd.read_csv and gpd.read_file calls into pandas DataFrames.

Key Activities:

  • Discussed Python code and potential issues related to file search and data extraction.
  • Implemented a Python function to search for strings within files and handle errors gracefully.
  • Developed a function to search for strings in text files and store the results in a pandas DataFrame.
  • Created a method to extract the arguments of pd.read_csv and gpd.read_file calls using regular expressions.
  • Added a boolean Series to the DataFrame to flag commented lines.
  • Provided instructions for selecting cells of a pandas DataFrame in Visual Studio Code.
  • Listed GeoJSON files in a directory using the os module.
  • Developed a script to calculate zonal statistics for raster data using GeoJSON files.
  • Suggested code optimization techniques for data processing, including using pathlib, list comprehensions, and groupby.
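
The file search, argument extraction, and comment detection above can be sketched roughly as follows. This is not the code written in the session, only a minimal illustration under stated assumptions: the function name find_read_calls, the row keys, and the "*.py" default are hypothetical, and the regular expression assumes each call fits on one line with no nested parentheses.

```python
import re
from pathlib import Path

# Matches pd.read_csv(...) or gpd.read_file(...) and captures the raw argument text.
# Assumption: the call sits on a single line and contains no nested parentheses.
READ_CALL = re.compile(r"\b(pd\.read_csv|gpd\.read_file)\(([^()]*)\)")


def find_read_calls(root, pattern="*.py"):
    """Return one dict per matching call found in files under root (hypothetical helper)."""
    rows = []
    for path in sorted(Path(root).rglob(pattern)):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as exc:
            # Skip unreadable files instead of aborting the whole search.
            print(f"Skipping {path}: {exc}")
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in READ_CALL.finditer(line):
                rows.append({
                    "file": str(path),
                    "line": lineno,
                    "call": match.group(1),
                    "args": match.group(2).strip(),
                    # True when the line is a comment after leading whitespace.
                    "commented": line.lstrip().startswith("#"),
                })
    return rows
```

The returned list can be passed to pandas.DataFrame(rows) when a DataFrame is wanted, in which case "commented" becomes a boolean column.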

Achievements:

  • Successfully implemented Python functions for file searching and data extraction.
  • Enhanced data processing scripts with optimization techniques.

Pending Tasks:

  • Further optimize the zonal statistics calculation script for larger datasets.
  • Explore additional file handling techniques using the glob module for recursive searches.
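
As a starting point for the glob task, a recursive GeoJSON listing might look like the sketch below. The function name list_geojson is hypothetical and the ".geojson" extension is an assumption about how the files are named.

```python
import glob
import os


def list_geojson(root):
    """Return sorted paths of all .geojson files under root, at any depth (hypothetical helper)."""
    # "**" only descends into subdirectories when recursive=True is passed.
    pattern = os.path.join(root, "**", "*.geojson")
    return sorted(glob.glob(pattern, recursive=True))
```

Compared with os.listdir, this also picks up files in nested folders, which may matter if the GeoJSON inputs are organised in subdirectories.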