📅 2023-11-19 - Session: Advanced Text Analysis with NLP Techniques

🕒 21:15–22:25
🏷️ Labels: NLP, Python, Text Analysis, spaCy, NLTK, Docentes
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal:

The session aimed to improve text analysis of Cristina Fernández de Kirchner’s speeches using several Natural Language Processing (NLP) techniques, focusing on how the word ‘docentes’ (Spanish for ‘teachers’) is used.

Key Activities:

  • Implemented Named Entity Recognition (NER) using the spaCy library to identify named entities related to ‘docentes’.
  • Conducted co-occurrence analysis of the word ‘docentes’ in speeches, identifying associated themes and sentiments.
  • Developed a Python function for text processing and word frequency counting, including stopwords filtering and special character handling.
  • Corrected a Python script for co-occurrence analysis, addressing tokenization and stopword removal issues.
  • Implemented stemming in co-occurrence analysis using Python’s NLTK library, focusing on Spanish with the Snowball Stemmer.
  • Enhanced co-occurrence analysis by mapping stemmed tokens back to original words.
  • Explored dependency parsing using spaCy to analyze the grammatical structure of sentences involving ‘docentes’.
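
The co-occurrence steps listed above (tokenization, stopword filtering, stemming, and mapping stems back to original words) can be sketched roughly as follows. This is a minimal illustration, not the script written during the session: the function names, the window size, and the tiny stopword list are all assumptions. The stemmer is passed in as a plain callable so the sketch runs with the standard library alone; NLTK's `SnowballStemmer("spanish").stem` could be supplied in its place.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; the session's list is not shown).
STOPWORDS = {"a", "al", "con", "de", "del", "el", "en", "es", "la", "las",
             "los", "no", "para", "por", "que", "se", "su", "un", "una", "y"}


def tokenize(text):
    """Lowercase the text and keep only runs of letters (accented ones included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def cooccurrences(text, target="docentes", window=3, stem=None, stopwords=STOPWORDS):
    """Count words that appear within `window` tokens of `target`.

    `stem` is any callable mapping a word to its stem; identity if omitted.
    Counts are grouped by stem, then reported under the most frequent
    original word seen for that stem, so the output stays readable.
    """
    stem = stem or (lambda word: word)
    tokens = [t for t in tokenize(text) if t not in stopwords]
    target_stem = stem(target)

    counts = Counter()                     # stem -> co-occurrence count
    surface_forms = defaultdict(Counter)   # stem -> original words seen

    for i, token in enumerate(tokens):
        if stem(token) != target_stem:
            continue
        neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        for word in neighbours:
            word_stem = stem(word)
            if word_stem == target_stem:
                continue  # ignore other occurrences of the target itself
            counts[word_stem] += 1
            surface_forms[word_stem][word] += 1

    # Map each stem back to its most frequent original word.
    return {surface_forms[s].most_common(1)[0][0]: n for s, n in counts.items()}


if __name__ == "__main__":
    sample = "Los docentes reclaman salarios dignos. El salario de los docentes es bajo."
    # Hypothetical toy stemmer for the demo; swap in a real Spanish stemmer as needed.
    toy_stem = lambda w: w[:-1] if w.endswith("s") else w
    print(cooccurrences(sample, window=2, stem=toy_stem))
```

Passing the stemmer as a parameter keeps the counting logic independent of any particular library, and the `surface_forms` table is what allows stemmed counts to be reported under real words rather than truncated stems.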

Achievements:

  • Successfully implemented and corrected scripts for various NLP techniques, improving the analysis of ‘docentes’ in textual data.
  • Established a structured framework for analyzing text data using advanced NLP methods.

Pending Tasks:

  • Further refinement of scripts to improve accuracy and efficiency in text analysis.
  • Exploration of additional NLP techniques to enhance analysis capabilities.