📅 2025-03-01 — Session: Enhanced Email NER with spaCy

🕒 03:40–04:00
🏷️ Labels: NER, spaCy, Email Analysis, Entity Extraction, Machine Learning
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal

Improve Named Entity Recognition (NER) for email analysis using spaCy.

Key Activities

  • Developed a Python function that extracts named entities from email bodies with spaCy.
  • Reviewed spaCy’s predefined entity types and how to restrict or extend them for specific use cases.
  • Analyzed the performance impact of extracting every available spaCy label and discussed best practices for limiting the label set.
  • Proposed ways to improve NER accuracy, including cleaning tokens before extraction and applying custom filters to the results.
  • Discussed more advanced options: transformer-based NER models from Hugging Face and training a custom NER model.
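
The function written in the session is not recorded in this log. The activities above can be sketched roughly as follows. This is a minimal illustration, not the session's actual code: the helper names (clean_email_body, extract_entities, load_pipeline), the cleaning rules (dropping ">" quoted lines and everything after a "--" signature line), and the choice of the en_core_web_sm model under spaCy v3 are all assumptions.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# Assumption: quoted reply lines start with ">" and a signature follows a "--" line.
_QUOTED_LINE = re.compile(r"^\s*>")
_SIGNATURE_DELIMITER = "--"


def clean_email_body(body: str) -> str:
    """Drop quoted replies and the signature, then collapse whitespace."""
    kept = []
    for line in body.splitlines():
        if line.strip() == _SIGNATURE_DELIMITER:
            break  # everything after the delimiter is treated as signature
        if _QUOTED_LINE.match(line):
            continue  # skip text quoted from earlier messages
        kept.append(line)
    return re.sub(r"\s+", " ", " ".join(kept)).strip()


def extract_entities(
    body: str,
    nlp: Callable,
    labels: Optional[Iterable[str]] = None,
    min_chars: int = 2,
) -> List[Tuple[str, str]]:
    """Return unique (text, label) pairs found in a cleaned email body.

    `nlp` is any callable returning an object with `.ents`, where each entity
    has `.text` and `.label_` (a loaded spaCy pipeline satisfies this).
    `labels` restricts the output to the given entity types; None keeps all.
    """
    doc = nlp(clean_email_body(body))
    wanted = set(labels) if labels is not None else None
    seen = set()
    results = []
    for ent in doc.ents:
        text = ent.text.strip()
        if len(text) < min_chars:
            continue  # custom filter: ignore very short spans
        if wanted is not None and ent.label_ not in wanted:
            continue  # custom filter: keep only requested labels
        key = (text.lower(), ent.label_)
        if key in seen:
            continue  # de-duplicate case-insensitively
        seen.add(key)
        results.append((text, ent.label_))
    return results


def load_pipeline(model_name: str = "en_core_web_sm"):
    """Load a spaCy pipeline, skipping components this sketch does not need.

    Assumes spaCy v3 with the named model installed; the import is lazy so
    the helpers above can be used and tested without spaCy.
    """
    import spacy

    return spacy.load(model_name, disable=["parser", "lemmatizer"])
```

Usage would look like extract_entities(email_text, load_pipeline(), labels={"PERSON", "ORG", "DATE"}). Note that filtering labels after the fact only trims the output; the model still predicts every label, so most of any speed gain would come from disabling unused pipeline components rather than from the label filter.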

Achievements

  • Implemented a working function for email NER using spaCy.
  • Gained insights into optimizing entity extraction and improving model accuracy.

Pending Tasks

  • Implement and test the proposed strategies for improving NER accuracy.
  • Evaluate advanced NER models for potential integration.
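
One possible way to compare the current pipeline against the proposed changes or an alternative model is to score each against a small hand-labeled set of emails. The sketch below is an assumption about how that could be done, not something decided in the session: score_entities is a hypothetical helper, entities are represented as (start_char, end_char, label) tuples, and only exact matches on both span and label count as correct.

```python
from typing import Dict, Iterable, Tuple

# Assumption: an entity is identified by its character span and its label.
Entity = Tuple[int, int, str]  # (start_char, end_char, label)


def score_entities(gold: Iterable[Entity], predicted: Iterable[Entity]) -> Dict[str, float]:
    """Compute exact-match precision, recall and F1 for one document."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

Running the same labeled emails through each candidate and comparing the three numbers gives a like-for-like basis for the integration decision. Exact matching is strict: a prediction with the right label but a slightly different span counts as both a false positive and a false negative.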