📅 2025-02-27 — Session: Developed Socioeconomic Embedding and Thesis Supervision Plan

🕒 03:00–05:15
🏷️ Labels: Socioeconomic Embeddings, Thesis Supervision, Autoencoder, Feature Selection, Machine Learning
📂 Project: Teaching
⭐ Priority: MEDIUM

Session Goal: Develop a plan for training machine learning models on socioeconomic data embeddings, and outline a structured supervision plan for thesis development.

Key Activities:

  • Explored training methodologies for Random Forest models on autoencoder-generated embeddings, focusing on high-dimensional socioeconomic data.
  • Developed an iterative approach for enhancing embeddings using census and survey data, refining features to improve model accuracy.
  • Discussed the importance of feature addition order in autoencoders and strategies for optimal feature selection.
  • Finalized a feature subset for socioeconomic embeddings to create a stable low-dimensional representation.
  • Analyzed the dimensionality of datasets with mixed variable types and the need for higher dimensionality in encoded representations.
  • Outlined a thesis plan comparing feature selection and dimensionality reduction techniques (Random Forest, PCA, autoencoders) for income prediction.
  • Incorporated temporal and spatial data into the thesis model, and evaluated whether Fourier features are needed for the time series analysis.
  • Developed a thesis supervision plan, detailing essential tasks for students to ensure data quality and effective modeling processes.
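
Note on Fourier features: a minimal sketch of what encoding time with Fourier features could look like, assuming a monthly index and a yearly cycle (period = 12). The function name `fourier_features`, the period, and the number of harmonics are illustrative assumptions, not choices recorded in the session.

```python
import math

def fourier_features(t, period, n_harmonics=2):
    """Return [sin, cos] pairs of 2*pi*k*t/period for k = 1..n_harmonics."""
    feats = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats

# Hypothetical example: month index with an assumed yearly cycle (period = 12).
january = fourier_features(0, period=12)
next_january = fourier_features(12, period=12)

# Months one full period apart map to (numerically) the same feature vector,
# which is the point of the encoding: the model sees the cycle, not the raw index.
same_phase = all(abs(a - b) < 1e-9 for a, b in zip(january, next_january))
```

Whether these features help is an empirical question; one way to evaluate the need is to compare validation error with and without them.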

Achievements:

  • Established a detailed framework for socioeconomic data embedding and analysis.
  • Created a structured thesis supervision plan to guide students in their research efforts.

Pending Tasks:

  • Explore frequency spectrum analysis of autoencoder embeddings to detect seasonal effects.
  • Continue thesis development and refine the outlined methodologies.
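
Note on frequency spectrum analysis: one possible starting point, sketched under assumptions, is to take a single embedding dimension tracked over time, compute its discrete Fourier transform, and read off the period with the most power. The series below is synthetic (a stand-in for real embedding values), and `dominant_period` is a hypothetical helper written with the standard library only; a real analysis would more likely use an FFT routine from a numerical library.

```python
import cmath
import math

def dominant_period(series):
    """Return the period (in samples) of the strongest non-constant DFT component."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant (k = 0) component
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        # Naive O(n^2) DFT coefficient at frequency index k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return None if best_k is None else n / best_k

# Synthetic stand-in: 48 monthly values of one embedding dimension,
# a yearly cycle plus a small period-3 perturbation (both assumed for illustration).
series = [math.sin(2 * math.pi * i / 12) + 0.05 * (i % 3 - 1) for i in range(48)]
period = dominant_period(series)
```

A dominant period near 12 on monthly data would be consistent with a yearly seasonal effect in that dimension, though it would still need to be checked against the raw variables.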