Developed Socioeconomic Embedding and Thesis Supervision Plan
- Day: 2025-02-27
- Time: 03:00 to 05:15
- Project: Teaching
- Workspace: WP 1: Strategic / Growth & Development
- Status: Completed
- Priority: MEDIUM
- Assignee: Matías Nehuen Iglesias
- Tags: Socioeconomic Embeddings, Thesis Supervision, Autoencoder, Feature Selection, Machine Learning
Description
Session Goal: Develop a plan for training machine learning models on socioeconomic data embeddings, and outline a structured supervision plan for thesis development.
Key Activities:
- Explored training methodologies for Random Forest models on autoencoder-generated embeddings, focusing on high-dimensional socioeconomic data.
- Developed an iterative approach for enhancing embeddings using census and survey data, refining features to improve model accuracy.
- Discussed the importance of feature addition order in autoencoders and strategies for optimal feature selection.
- Finalized a feature subset for socioeconomic embeddings to create a stable low-dimensional representation.
- Analyzed the dimensionality of datasets with mixed variable types and the resulting need for a higher-dimensional encoded representation.
- Outlined a thesis plan comparing feature selection and dimensionality reduction techniques (Random Forest importances, PCA, autoencoders) for income prediction.
- Incorporated time and spatial data into the thesis model, and evaluated the need for Fourier features in time series analysis.
- Developed a thesis supervision plan, detailing essential tasks for students to ensure data quality and effective modeling processes.
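As a rough illustration of the Fourier features mentioned in the activities above, the sketch below encodes a time index as sine/cosine pairs so that the end of a cycle sits next to its beginning. The function name `fourier_features`, the 12-month period, and the number of harmonics are assumptions made for this example; the session did not fix any of them.

```python
import math


def fourier_features(t, period, n_harmonics=2):
    """Return [sin, cos] pairs of 2*pi*k*t/period for k = 1..n_harmonics.

    Hypothetical helper for illustration only, not the session's code.
    """
    features = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# Assumed setup: month index 0..11 with a yearly period of 12.
monthly = [fourier_features(m, period=12) for m in range(12)]
```

Each row of `monthly` could be appended to the model inputs in place of a raw month number. Whether these features are needed at all was still an open question in the session.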
Achievements:
- Established a detailed framework for socioeconomic data embedding and analysis.
- Created a structured thesis supervision plan to guide students in their research efforts.
Pending Tasks:
- Explore frequency spectrum analysis of autoencoder embeddings to detect seasonal effects.
- Continue thesis development and refine the outlined methodologies.
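One possible starting point for the pending frequency spectrum analysis is sketched below: a naive discrete Fourier transform over a single embedding dimension tracked over time, reporting the period with the most power. The series here is synthetic and stands in for a real embedding dimension. The helper names `power_spectrum` and `dominant_period`, and the monthly sampling, are assumptions for illustration.

```python
import cmath
import math


def power_spectrum(series):
    """Naive DFT power for frequencies k = 1..n//2 (cycles per series length)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset
    powers = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        powers[k] = abs(coeff) ** 2 / n
    return powers


def dominant_period(series):
    """Return the period, in samples, of the strongest frequency component."""
    powers = power_spectrum(series)
    k_best = max(powers, key=powers.get)
    return len(series) / k_best


# Synthetic stand-in for one embedding dimension: 48 monthly values with a
# 12-month cycle plus a weaker 6-month cycle (assumed data, not real results).
synthetic = [
    math.sin(2 * math.pi * t / 12) + 0.3 * math.sin(2 * math.pi * t / 6)
    for t in range(48)
]
period = dominant_period(synthetic)
```

The naive transform costs O(n^2) and is only meant to keep the sketch dependency-free; for real data `numpy.fft.rfft` would be the usual choice. Trends in the series should be removed first, since they leak into the lowest frequencies.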
Evidence
- source_file=2025-02-27.sessions.jsonl, line_number=0, event_count=0, session_id=6cc43c0999031e3ae1c842c39959555052f6726f3e04bf45774d5f0389ca08a5
- event_ids: []