📅 2023-03-07 — Session: Enhanced Web Scraping and Data Policy Planning
🕒 13:15–17:15
🏷️ Labels: Web Scraping, Python, Data Policy, Code Improvement
📂 Project: Dev
⭐ Priority: MEDIUM
Session Goal
The session aimed to enhance web scraping techniques using Python and to reflect on the importance of data policies in university collaborations.
Key Activities
- Discussed the need for clear data policies to ensure ethical and responsible data use in university collaborations, focusing on privacy, security, and ethics.
- Outlined steps for building web scrapers with Python, emphasizing key libraries and techniques.
- Improved web scraper code for concursos, enhancing modularity, variable naming, and error handling.
- Provided code improvement suggestions for DataFrame manipulation and Google search URL construction, focusing on readability and efficiency.
- Resolved an error with the itertuples() method in Pandas by including the index column.
- Developed a thesis data scraper using BeautifulSoup and requests, and improved error handling with try-except blocks.
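The itertuples() fix and the search URL construction noted above can be sketched together as follows. This is a minimal illustration, not the session's actual code: the column names (title, institution), the sample rows, and the helper build_search_url are all hypothetical, since the log does not record them. The assumption is that the original error came from unpacking each row without accounting for the index, which itertuples() returns as the first field by default.

```python
from urllib.parse import urlencode

import pandas as pd

GOOGLE_SEARCH_URL = "https://www.google.com/search"


def build_search_url(query: str) -> str:
    """Return a Google search URL with the query percent-encoded (hypothetical helper)."""
    return f"{GOOGLE_SEARCH_URL}?{urlencode({'q': query})}"


# Hypothetical sample data; the real column names are not recorded in the log.
df = pd.DataFrame(
    {
        "title": ["Concurso docente 2023", "Tesis de maestría"],
        "institution": ["Universidad A", "Universidad B"],
    }
)

# itertuples() yields the index as the first field by default (index=True),
# so the unpacking includes it alongside the two columns. Unpacking only
# (title, institution) here would raise a ValueError.
search_urls = {}
for idx, title, institution in df.itertuples():
    search_urls[idx] = build_search_url(f"{title} {institution}")

for idx, url in search_urls.items():
    print(idx, url)
```

If the index is not needed, passing index=False to itertuples() and unpacking only the columns is an equivalent alternative. Using urlencode rather than string concatenation keeps spaces and accented characters safely encoded in the query.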
Achievements
- Updated the web scraping scripts, improving code quality and error handling.
- Clarified the importance of data policies in academic collaborations.
Pending Tasks
- Further refine data scraping scripts for additional data sources.
- Develop comprehensive data policy guidelines for university collaborations.