Developed Email Categorization System with NLP
- Day: 2024-10-05
- Time: 23:40 to 00:00
- Project: Dev
- Workspace: WP 2: Operational
- Status: In Progress
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Email Categorization, Machine Learning, NLP, Data Extraction, Feature Engineering
Description
Session Goal
The session aimed to develop a machine-learning email categorization system built on NLP techniques and classification algorithms.
Key Activities
- Planning: Outlined a framework for creating an email categorization system using manually labeled examples as training data.
- Classifier Optimization: Discussed strategies to improve classifier performance despite dataset limitations, such as adding more training examples and improving feature extraction.
- Data Extraction: Developed a workflow to extract subjects and content from HTML files, preparing the data for machine learning.
- Feature Engineering: Proposed combining email subject lines and content as features for a classification model.
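The data extraction and feature engineering steps above could be sketched roughly as follows. This is a minimal illustration rather than the workflow actually built in the session: it assumes the subject is stored in each file's `<title>` element and that the remaining visible text is the content, which may not match the real HTML files. The names `EmailHTMLParser`, `extract_email`, and `combine_features` are hypothetical.

```python
from html.parser import HTMLParser


class EmailHTMLParser(HTMLParser):
    """Collects <title> text as the subject and other visible text as content.

    Assumption: the subject lives in <title>; the real files may store it elsewhere.
    """

    SKIPPED_TAGS = {"script", "style"}  # non-visible text we do not want as content

    def __init__(self):
        super().__init__()
        self.subject_parts = []
        self.content_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.subject_parts.append(text)
        else:
            self.content_parts.append(text)


def extract_email(html_text):
    """Return a dict with the 'subject' and 'content' found in one HTML document."""
    parser = EmailHTMLParser()
    parser.feed(html_text)
    parser.close()
    return {
        "subject": " ".join(parser.subject_parts),
        "content": " ".join(parser.content_parts),
    }


def combine_features(email):
    """Join subject and content into one text string for a classifier."""
    return f"{email['subject']} {email['content']}".strip()
```

Once the directory path is confirmed, each file's text could be read and passed to `extract_email`, and the output of `combine_features` would serve as the input text for the classification model.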
Achievements
- Established a plan for the email categorization system leveraging NLP and machine learning.
- Developed a structured approach for data extraction from HTML files.
- Proposed combining subject lines and content as a feature engineering step to improve model input.
Pending Tasks
- Confirm the directory path of the HTML files, or upload the files needed for data extraction.
- Implement the proposed strategies for classifier optimization and feature engineering.
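As a possible starting point for the implementation task, the sketch below trains a small multinomial Naive Bayes classifier on manually labeled texts. The session log does not name a specific algorithm, so Naive Bayes is an assumption chosen for illustration, as are the tokenizer, the class name `NaiveBayesClassifier`, and the example labels `billing` and `meetings`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior for the label
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled examples; real training data would come from the
# manually labeled emails (subject and content combined into one string).
example_texts = [
    "Invoice due Please pay by Friday",
    "Payment receipt Your invoice was paid",
    "Team meeting Agenda for Monday standup",
    "Meeting rescheduled Standup moved to Tuesday",
]
example_labels = ["billing", "billing", "meetings", "meetings"]

classifier = NaiveBayesClassifier().fit(example_texts, example_labels)
print(classifier.predict("Invoice payment reminder"))
```

With so few examples the predictions are only illustrative. Adding more labeled emails and richer features, as discussed in the session, would be the natural next step.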
Evidence
- source_file=2024-10-05.sessions.jsonl, line_number=1, event_count=0, session_id=506412965551a043c30fe35676ae9a03ec7bdcaf6bb8cd3713e49e0723a296df
- event_ids: []