2023-11-19 - Session: Advanced Text Analysis with NLP Techniques
21:15–22:25
Labels: NLP, Python, Text Analysis, Spacy, NLTK, Docentes
Project: Dev
Priority: MEDIUM
Session Goal:
The session aimed to enhance text analysis capabilities using various Natural Language Processing (NLP) techniques, focusing on the word “docentes” (Spanish for “teachers”) within Cristina Fernández de Kirchner’s speeches.
Key Activities:
- Implemented Named Entity Recognition (NER) using the spaCy library to identify named entities related to “docentes”.
- Conducted co-occurrence analysis of the word “docentes” in speeches, identifying associated themes and sentiments.
- Developed a Python function for text processing and word frequency counting, including stopwords filtering and special character handling.
- Corrected a Python script for co-occurrence analysis, addressing tokenization and stopword removal issues.
- Implemented stemming in co-occurrence analysis using Python’s NLTK library, focusing on Spanish with the Snowball Stemmer.
- Enhanced co-occurrence analysis by mapping stemmed tokens back to original words.
- Explored dependency parsing using spaCy to analyze the grammatical structure of sentences involving “docentes”.
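For reference, the co-occurrence step with stopword filtering and stem-to-word mapping could look roughly like the sketch below. It is not the script from the session: the function names, the tiny stopword set, the window size, and the sample sentence are all hypothetical, and the stemmer is passed in as a plain function so the block runs with the standard library only.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword set; a real run would use a fuller Spanish list.
STOPWORDS = {"a", "al", "de", "del", "el", "en", "la", "las", "los", "para", "por", "que", "y"}


def tokenize(text):
    """Lowercase the text and keep runs of letters, including accented Spanish letters."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


def cooccurrences(text, target="docentes", window=3, stopwords=STOPWORDS, stem=None):
    """Count content words found within `window` tokens of `target`.

    Stopwords are removed first, so the window spans content words only.
    Counts are grouped by stem, then reported under the most frequent
    original word seen for that stem.
    """
    stem = stem or (lambda word: word)  # identity when no stemmer is supplied
    tokens = [t for t in tokenize(text) if t not in stopwords]
    target_stem = stem(target)

    counts = Counter()                    # stem -> co-occurrence count
    surface_forms = defaultdict(Counter)  # stem -> original words seen

    for i, token in enumerate(tokens):
        if stem(token) != target_stem:
            continue
        start, end = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j == i:
                continue
            key = stem(tokens[j])
            counts[key] += 1
            surface_forms[key][tokens[j]] += 1

    # Map each stem back to its most frequent original word.
    return Counter({surface_forms[k].most_common(1)[0][0]: n for k, n in counts.items()})


sample = "Los docentes reclaman salarios. Los salarios de los docentes son bajos."
print(cooccurrences(sample, window=2).most_common(1))  # [('salarios', 3)]
```

To stem Spanish tokens as described above, one option is to pass NLTK's Snowball stemmer as the `stem` argument, for example `stem=SnowballStemmer("spanish").stem` after `from nltk.stem.snowball import SnowballStemmer`; this assumes NLTK is installed.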
Achievements:
- Successfully implemented and corrected scripts for various NLP techniques, improving the analysis of “docentes” in textual data.
- Established a structured framework for analyzing text data using advanced NLP methods.
Pending Tasks:
- Further refinement of scripts to improve accuracy and efficiency in text analysis.
- Exploration of additional NLP techniques to enhance analysis capabilities.