Refactored Python Script for Jupyter I/O Extraction
- Day: 2023-02-14
- Time: 03:15 to 03:55
- Project: Dev
- Workspace: WP 2: Operational
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Python, Jupyter, Data Processing, Code Refactoring
Description
Session Goal
The primary aim of this session was to develop and refine a Python script that extracts input and output file paths from Jupyter notebooks. This involved analyzing the read and write calls that notebooks use for CSV files and GeoDataFrames.
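As a rough illustration of that goal, the sketch below scans the code cells of a notebook for read and write calls and collects the first string argument of each. It is not the session's actual script: the function name `extract_io_paths`, the regular expression, and the set of call names (`read_csv`, `to_csv`, `read_file`, `to_file`) are assumptions chosen for illustration.

```python
import json
import re

# Assumed call names (not taken from the session's script):
# pandas read_csv/to_csv and geopandas read_file/to_file.
READ_CALLS = ("read_csv", "read_file")

# Captures the call name and the first quoted string argument,
# e.g. pd.read_csv("data/in.csv") -> ("read_csv", '"', "data/in.csv").
CALL_PATTERN = re.compile(
    r"\b(read_csv|read_file|to_csv|to_file)\s*\(\s*[rRfF]?(['\"])(.+?)\2"
)


def extract_io_paths(notebook_json):
    """Return (inputs, outputs) path lists found in a notebook's code cells."""
    notebook = json.loads(notebook_json)
    inputs, outputs = [], []
    for cell in notebook.get("cells", []):
        # Markdown and raw cells are ignored.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The .ipynb format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for call, _quote, path in CALL_PATTERN.findall(source):
            if call in READ_CALLS:
                inputs.append(path)
            else:
                outputs.append(path)
    return inputs, outputs
```

The function takes the notebook's JSON text, so a file on disk would be read first, for example with `Path("analysis.ipynb").read_text(encoding="utf-8")` (the file name is hypothetical). Paths built from variables or f-string expressions are not resolved by this sketch.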
Key Activities
- Script Development: Started with a basic script that extracts file paths from Jupyter notebooks by scanning for read and write operations.
- Code Correction: Corrected a variable name typo from `files` to `file_inputs` to ensure the script functions correctly.
- Dataframe Creation: Implemented a method to create a dataframe that captures input and output file information using Python and pandas.
- Code Enhancement: Enhanced the script to check if lines with specific function calls are commented out.
- Code Refactoring: Refactored the code to modularize the file extraction functionality, improving readability and maintainability.
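The comment check and the modular structure described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the refactored script itself: the helper names (`is_commented_out`, `extract_from_line`, `build_records`), the record fields, and the matched call names are all hypothetical.

```python
import re

# Assumed call names; the session's script may match a different set.
READ_CALLS = {"read_csv", "read_file"}
CALL_PATTERN = re.compile(
    r"\b(read_csv|read_file|to_csv|to_file)\s*\(\s*(['\"])(.+?)\2"
)


def is_commented_out(line):
    """Return True when the first non-blank character of the line is '#'."""
    return line.lstrip().startswith("#")


def extract_from_line(line):
    """Return (kind, call, path) for an active I/O call, or None."""
    if is_commented_out(line):
        return None
    match = CALL_PATTERN.search(line)
    if match is None:
        return None
    call, _quote, path = match.groups()
    kind = "input" if call in READ_CALLS else "output"
    return kind, call, path


def build_records(notebook_name, lines):
    """Collect one dict per detected call, ready to load into a dataframe."""
    records = []
    for line in lines:
        found = extract_from_line(line)
        if found is not None:
            kind, call, path = found
            records.append(
                {"notebook": notebook_name, "kind": kind,
                 "function": call, "path": path}
            )
    return records
```

Passing the returned list to `pandas.DataFrame(records)` would give one row per detected call. The check only recognizes whole-line comments; a call that appears after a trailing `#` on an otherwise active line would still be matched.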
Achievements
- Developed a Python script that extracts input and output file paths from Jupyter notebooks and organizes them in a dataframe.
- Improved code quality and functionality through iterative enhancements and refactoring.
Pending Tasks
- Further testing and validation of the script on different types of Jupyter notebooks to ensure comprehensive functionality.
Evidence
- source_file=2023-02-14.sessions.jsonl, line_number=1, event_count=0, session_id=8d714ea4e30b9b481f0d7a39930b01289baeac81ff61362061f6d6fcc9700d14
- event_ids: []