📅 2025-02-27 — Session: Developed Socioeconomic Embedding and Thesis Supervision Plan
🕒 03:00–05:15
🏷️ Labels: Socioeconomic Embeddings, Thesis Supervision, Autoencoder, Feature Selection, Machine Learning
📂 Project: Teaching
⭐ Priority: MEDIUM
Session Goal: Develop a plan for training machine learning models on socioeconomic data embeddings, and outline a structured supervision plan for thesis development.
Key Activities:
- Explored training methodologies for Random Forest models on autoencoder-generated embeddings, focusing on high-dimensional socioeconomic data.
- Developed an iterative approach for enhancing embeddings using census and survey data, refining features to improve model accuracy.
- Discussed the importance of feature addition order in autoencoders and strategies for optimal feature selection.
- Finalized a feature subset for socioeconomic embeddings to create a stable low-dimensional representation.
- Analyzed the dimensionality of datasets with mixed variable types and why their encoded representations need a higher dimensionality.
- Outlined a thesis plan comparing feature selection and dimensionality reduction techniques (Random Forest, PCA, autoencoders) for income prediction.
- Incorporated time and spatial data into the thesis model, and evaluated the need for Fourier features in time series analysis.
- Developed a thesis supervision plan, detailing essential tasks for students to ensure data quality and effective modeling processes.
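
A minimal sketch of the Fourier features mentioned above for the time series component, assuming a regularly spaced time index. The function name `fourier_features`, the 12-step period, and the number of harmonics are illustrative assumptions, not choices recorded in the session.

```python
import numpy as np


def fourier_features(t, period, n_harmonics=2):
    """Return sin/cos columns for each harmonic of `period`, evaluated at times `t`.

    Hypothetical helper: the output has 2 * n_harmonics columns, ordered
    sin(k=1), cos(k=1), sin(k=2), cos(k=2), ...
    """
    t = np.asarray(t, dtype=float)
    cols = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)


# Assumed example: a monthly index covering two years, with a yearly cycle.
months = np.arange(24)
X_time = fourier_features(months, period=12, n_harmonics=2)
```

These columns can be appended to the other model inputs so that time steps one full period apart receive identical seasonal features. Whether they are needed at all is the question the session left open.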
Achievements:
- Established a detailed framework for socioeconomic data embedding and analysis.
- Created a structured thesis supervision plan to guide students in their research efforts.
Pending Tasks:
- Further exploration of frequency spectrum analysis in autoencoder embeddings to detect seasonal effects.
- Continuation of thesis development and refinement of the outlined methodologies.
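
A possible starting point for the frequency spectrum task, assuming one embedding dimension is tracked over evenly spaced time steps. The helper `dominant_period` and the synthetic series are hypothetical stand-ins; no real embedding data is used here.

```python
import numpy as np


def dominant_period(series, sample_spacing=1.0):
    """Estimate the strongest periodic component of a 1-D series via the FFT.

    Hypothetical helper: removes the mean, computes the power spectrum,
    skips the zero-frequency bin, and returns the period of the peak.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=sample_spacing)
    k = 1 + np.argmax(power[1:])  # ignore the DC bin at index 0
    return 1.0 / freqs[k]


# Synthetic stand-in for one embedding dimension: a 12-step cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(120)
embedding_dim = np.sin(2.0 * np.pi * t / 12) + 0.3 * rng.standard_normal(t.size)
period = dominant_period(embedding_dim)
```

On this synthetic series the peak falls at the 12-step cycle that was injected. On real embeddings, a clear peak at a seasonal period in some dimensions would suggest that the autoencoder is carrying seasonal structure.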