Developed Regex-based Text Filtering Script

  • Day: 2024-07-14
  • Time: 00:50 to 01:35
  • Project: Dev
  • Workspace: WP 2: Operational
  • Status: In Progress
  • Priority: MEDIUM
  • Assignee: Matías Nehuen Iglesias
  • Tags: Python, Regex, Data Processing, GCP, Automation

Description

Session Goal:

The session aimed to develop and refine a Python script for filtering and pattern matching on text files, specifically targeting the ‘int.int’ pattern.

Key Activities:

  • File Handling: The user was prompted twice to upload the missing outline.txt file, which is required before processing can proceed.
  • Script Development: A Python script was developed to filter lines from a text outline and identify those containing the numeric pattern ‘int.int’ (an integer, a dot, and another integer). The script was later revised to improve its data processing and make the pattern matching more flexible.
  • Technical Exploration: Insights were gathered from the book ‘Mastering Data Engineering and Machine Learning on Google Cloud Platform,’ focusing on automation, job scheduling, and monitoring of ML solutions.
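
The filtering step described under Script Development could look roughly like the sketch below. It is not the session's actual script, which is not recorded in this log. It assumes that ‘int.int’ means one or more digits, a literal dot, and one or more digits, and that longer dotted numbers such as 2.3.1 should not count as a match. The names INT_DOT_INT and filter_lines are hypothetical.

```python
import re

# Assumption: 'int.int' = digits, a literal dot, digits (e.g. "2.3").
# The lookarounds reject longer dotted runs such as "2.3.1"; drop them
# if those lines should match as well.
INT_DOT_INT = re.compile(r"(?<![\d.])\d+\.\d+(?![\d.])")


def filter_lines(lines, pattern=INT_DOT_INT):
    """Return the non-blank lines that contain a match for `pattern`.

    `pattern` is a parameter so a different compiled regex can be
    passed in without changing the function.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if line.strip() and pattern.search(line)
    ]


# Made-up sample lines standing in for outline.txt, which was not
# available during the session.
sample = [
    "1 Introduction\n",
    "1.1 Background\n",
    "\n",
    "2.3 Job scheduling\n",
    "2.3.1 Monitoring\n",
    "Appendix\n",
]

matches = filter_lines(sample)
for line in matches:
    print(line)
```

Once outline.txt is available, the same function can take the open file object directly, for example filter_lines(open_file) inside a with open("outline.txt", encoding="utf-8") block, since it only iterates over lines.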

Achievements:

  • Developed a Python script that filters lines and matches regex patterns in text files; it has not yet been run against outline.txt.
  • Gained insights into automation and job scheduling on Google Cloud Platform.

Pending Tasks:

  • Upload the outline.txt file to complete the processing and testing of the developed script.

Evidence

  • source_file=2024-07-14.sessions.jsonl, line_number=0, event_count=0, session_id=1f06bfb90fe83b1adf34d222d03e2db2166f2253fe033ea783db68439fce30a0
  • event_ids: []