📅 2023-07-30 — Session: Optimized Disk Space and DataFrame Operations
🕒 09:00–10:45
🏷️ Labels: Disk Usage, Pandas, Data Manipulation, Linux, Error Handling
📂 Project: Dev
⭐ Priority: MEDIUM
Session Goal
The session aimed to optimize disk space on a Linux system and perform advanced data manipulation tasks using Python’s pandas library.
Key Activities
- Disk Space Management: Used the terminal commands `df` and `du` to check disk usage and identify large files in the `/home` partition. Removed unnecessary files with `rm` to free up space.
- DataFrame Manipulation: Executed several pandas operations, including filtering DataFrames from multiple CSV files, selecting top rows from grouped data, creating new columns based on conditions, and using `mask` for conditional value replacement.
- String Operations: Applied the `str.replace` and `str.strip` methods for precise string manipulation within DataFrames.
- Error Handling: Addressed the `SpecificationError` in pandas by using multiple `agg` statements for aggregation without nesting.
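
The DataFrame and string operations listed above can be sketched roughly as follows. This is a minimal illustration, not the session's actual code: the in-memory CSV contents, the column names (`name`, `team`, `score`), and the thresholds are all hypothetical stand-ins.

```python
import io

import pandas as pd

# Hypothetical in-memory CSVs standing in for the real files on disk.
csv_a = io.StringIO("name,team,score\n alice ,red,90\nbob,red,70\n")
csv_b = io.StringIO("name,team,score\ncara_x,blue,85\ndan,blue,40\nerin,blue,60\n")

# Filter each DataFrame as it is loaded, then combine the results.
frames = [pd.read_csv(f) for f in (csv_a, csv_b)]
combined = pd.concat(
    [frame[frame["score"] >= 50] for frame in frames],
    ignore_index=True,
)

# String cleanup: drop a literal suffix, then trim surrounding whitespace.
combined["name"] = (
    combined["name"].str.replace("_x", "", regex=False).str.strip()
)

# New column derived from a condition.
combined["is_high"] = combined["score"] >= 80

# mask replaces values where the condition is True and keeps the rest.
combined["capped"] = combined["score"].mask(combined["score"] > 85, 85)

# Top row per group: sort first, then take the head of each group.
top = combined.sort_values("score", ascending=False).groupby("team").head(1)
```

Sorting before `groupby(...).head(1)` is one of several ways to pick the top rows per group; `nlargest` inside each group would be a reasonable alternative.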
Achievements
- Successfully freed up disk space by removing large unnecessary files.
- Enhanced skills in pandas for data manipulation, including filtering, grouping, and string operations.
- Resolved the pandas `SpecificationError` by replacing nested aggregation with separate `agg` statements, so the grouped aggregations now run cleanly.
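
A rough sketch of the `SpecificationError` workaround, under the assumption that the error came from passing a nested dictionary to `agg`. The column names and the `total`/`average` labels are hypothetical; the session's real data is not shown in this log.

```python
import pandas as pd

# Hypothetical data; the real columns from the session are not recorded.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1, 2, 3, 4],
})

grouped = df.groupby("group")["value"]

# One flat agg statement per statistic, instead of a single nested spec.
total = grouped.agg("sum").rename("total")
average = grouped.agg("mean").rename("average")

# Stitch the separate results back into one summary table.
summary = pd.concat([total, average], axis=1)
```

Named aggregation, for example `df.groupby("group").agg(total=("value", "sum"))`, is another flat form that avoids nesting, though the log does not say whether it was used.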
Pending Tasks
- Further exploration of advanced pandas functions for more complex data manipulation scenarios.
- Continuous monitoring of disk space to prevent future storage issues.
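
For the disk-monitoring task, a small Python helper could complement the `df` and `du` checks. This is only a sketch using the standard library; the function names `usage_percent` and `largest_files` are made up for illustration, and the path and limit passed in are up to the caller.

```python
import os
import shutil


def usage_percent(path):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full_path), full_path))
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
    return sorted(sizes, reverse=True)[:limit]
```

Calling `usage_percent("/home")` and `largest_files("/home")` from a scheduled job would be one way to spot growth early; nothing is deleted by this sketch.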