Developed Regex-based Text Filtering Script
- Day: 2024-07-14
- Time: 00:50 to 01:35
- Project: Dev
- Workspace: WP 2: Operational
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Python, Regex, Data Processing, GCP, Automation
Description
Session Goal:
The session aimed to develop and refine a Python script for filtering and pattern matching on text files, specifically targeting the ‘int.int’ pattern.
Key Activities:
- File Handling: The user was prompted twice to upload the missing outline.txt file to proceed with processing.
- Script Development: A Python script was developed to filter lines from a text outline and identify lines containing a specific numeric pattern (‘int.int’). The script was revised to enhance its data processing capabilities, ensuring flexibility in pattern matching.
- Technical Exploration: Insights were gathered from the book ‘Mastering Data Engineering and Machine Learning on Google Cloud Platform,’ focusing on automation, job scheduling, and monitoring of ML solutions.
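The Script Development activity above can be illustrated with a minimal sketch. The session's actual script is not recorded in this log, so everything here is an assumption: ‘int.int’ is read as one or more digits, a literal dot, then one or more digits (for example "2.10"), and the names INT_DOT_INT, filter_lines, and filter_file are hypothetical. The pattern is passed as a parameter to reflect the flexibility in pattern matching mentioned above.

```python
import re
from pathlib import Path

# Assumption: 'int.int' means digits, a literal dot, then digits (e.g. "2.10").
# Note that a longer dotted run such as "1.2.3" also matches, via its "1.2" part.
INT_DOT_INT = re.compile(r"\b\d+\.\d+\b")


def filter_lines(lines, pattern=INT_DOT_INT):
    """Return the non-blank lines that contain at least one match for `pattern`.

    Trailing newlines are stripped; leading indentation is preserved.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if line.strip() and pattern.search(line)
    ]


def filter_file(path, pattern=INT_DOT_INT):
    """Apply filter_lines to a UTF-8 text file (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as handle:
        return filter_lines(handle, pattern)


if __name__ == "__main__":
    source = Path("outline.txt")  # the file still pending upload in this session
    if source.exists():
        for matched in filter_file(source):
            print(matched)
    else:
        print("outline.txt not found; upload it before running the filter.")
```

Passing a different compiled pattern, for example filter_lines(lines, re.compile(r"^\d+\.")), would reuse the same filtering logic for another numbering scheme without changing the function.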
Achievements:
- Developed a Python script that filters lines and matches regex patterns in text files; testing against outline.txt is still pending.
- Gained insights into automation and job scheduling on Google Cloud Platform.
Pending Tasks:
- Upload the outline.txt file to complete the processing and testing of the developed script.
Evidence
- source_file=2024-07-14.sessions.jsonl, line_number=0, event_count=0, session_id=1f06bfb90fe83b1adf34d222d03e2db2166f2253fe033ea783db68439fce30a0
- event_ids: []