📅 2023-01-04 — Session: Developed Python functions for data extraction and file handling
🕒 20:15–23:55
🏷️ Labels: Python, Data Extraction, File Handling, Optimization
📂 Project: Dev
⭐ Priority: MEDIUM
Session Goal: Develop and optimize Python functions for data extraction and file handling, focusing on GeoJSON files and on extracting arguments from pandas DataFrames.
Key Activities:
- Discussed Python code and potential issues related to file search and data extraction.
- Implemented a Python function to search for strings within files and handle errors gracefully.
- Developed a function to search for strings in text files and store results in a Pandas DataFrame.
- Created a method to extract arguments from `pd.read_csv` and `gpd.read_file` using regular expressions.
- Implemented a boolean series in a DataFrame to detect commented lines.
- Provided instructions for selecting cells in a Pandas DataFrame using Visual Studio Code.
- Listed GeoJSON files in a directory using the `os` module.
- Developed a script to calculate zonal statistics for raster data using GeoJSON files.
- Suggested code optimization techniques for data processing, including using `pathlib`, list comprehensions, and `groupby`.
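The session's code is not recorded in this log, so the following is only a minimal sketch of how the file search, argument extraction, and comment detection could fit together. The names `search_files` and `READ_CALL`, the record keys, and the `.py` default suffix are all assumptions made for illustration. It uses only the standard library; the returned list of dicts can be passed to `pandas.DataFrame(records)`, where the `commented` column becomes a boolean series.

```python
import re
from pathlib import Path

# Hypothetical pattern: captures the text between the parentheses of a
# pd.read_csv(...) or gpd.read_file(...) call that sits on a single line.
# The match is greedy, so chained calls such as pd.read_csv(...).head()
# would need a real parser (for example the ast module) instead.
READ_CALL = re.compile(r"\b(pd\.read_csv|gpd\.read_file)\((.*)\)")


def search_files(root, needle, suffix=".py"):
    """Return one record per line under `root` that contains `needle`."""
    records = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable files instead of aborting the search
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle not in line:
                continue
            match = READ_CALL.search(line)
            records.append({
                "file": str(path),
                "line_no": line_no,
                "line": line.strip(),
                "call": match.group(1) if match else None,
                "args": match.group(2) if match else None,
                # True when the first non-blank character is '#'
                "commented": line.lstrip().startswith("#"),
            })
    return records
```

Keeping the search as plain records rather than building the DataFrame row by row avoids repeated appends, and the same list can still be grouped or filtered once loaded into pandas.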
Achievements:
- Successfully implemented Python functions for file searching and data extraction.
- Enhanced data processing scripts with optimization techniques.
Pending Tasks:
- Further optimize the zonal statistics calculation script for larger datasets.
- Explore additional file handling techniques using the `glob` module for recursive searches.
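As a possible starting point for the recursive search task, here is a short sketch using `glob.glob` with `recursive=True`. The function name `list_geojson` is hypothetical, and the sketch assumes the files use a lowercase `.geojson` extension (on case-sensitive file systems, `.GEOJSON` would not match).

```python
import glob
import os


def list_geojson(root):
    """Return sorted paths of *.geojson files under `root`, at any depth."""
    # glob.escape protects special characters such as '[' in the root path;
    # with recursive=True, '**' matches zero or more nested directories,
    # so files directly inside `root` are included as well.
    pattern = os.path.join(glob.escape(root), "**", "*.geojson")
    return sorted(glob.glob(pattern, recursive=True))
```

The `pathlib` equivalent would be `sorted(Path(root).rglob("*.geojson"))`, which returns `Path` objects instead of strings.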