2023-11-07 - Session: Installed and troubleshot JabRef and web scraping tools
22:05–23:40
Labels: JabRef, Web Scraping, Python, Selenium, Linux
Project: Dev
Priority: MEDIUM
Session Goal
The session aimed to install and troubleshoot the JabRef application on a Debian-based system and to explore web scraping techniques for academic papers, particularly focusing on Google Scholar.
Key Activities
- JabRef Installation: Installed JabRef from a .deb package and resolved command recognition issues by modifying the system PATH and creating symbolic links for easier access.
- Web Scraping Exploration: Developed Python scripts for extracting academic paper details from HTML using BeautifulSoup and regular expressions. Addressed challenges like pagination, legal considerations, and dynamic content loading.
- Troubleshooting: Resolved issues with ChromeDriver version mismatches and Selenium WebDriver options errors.
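
The HTML extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the script from the session: it uses the standard library's `html.parser` in place of BeautifulSoup so it runs without extra packages, and the markup, the class names (`paper`, `title`, `meta`), and the sample records are all hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup: the class names "paper", "title", and "meta" are
# assumptions for illustration, not the structure of any real site.
SAMPLE_HTML = """
<div class="paper">
  <h3 class="title">Deep Learning for Citation Analysis</h3>
  <div class="meta">A. Smith, B. Jones - Journal of Examples, 2019</div>
</div>
<div class="paper">
  <h3 class="title">Parsing HTML Safely</h3>
  <div class="meta">C. Lee - Example Conference, 2021</div>
</div>
"""

# Four-digit year starting with 19 or 20.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


class PaperParser(HTMLParser):
    """Collect the title and metadata text of each result block."""

    def __init__(self):
        super().__init__()
        self.papers = []    # one dict per result block
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "paper" in classes:
            self.papers.append({"title": "", "meta": ""})
        elif "title" in classes or "meta" in classes:
            self._field = "title" if "title" in classes else "meta"

    def handle_endtag(self, tag):
        # Flat markup is assumed, so any closing tag ends the current field.
        self._field = None

    def handle_data(self, data):
        if self._field and self.papers:
            self.papers[-1][self._field] += data.strip()


def extract_papers(html):
    """Return a list of {title, authors, year} dicts parsed from html."""
    parser = PaperParser()
    parser.feed(html)
    results = []
    for paper in parser.papers:
        match = YEAR_RE.search(paper["meta"])
        authors = paper["meta"].split(" - ")[0]
        results.append({
            "title": paper["title"],
            "authors": [name.strip() for name in authors.split(",")],
            "year": int(match.group(0)) if match else None,
        })
    return results


papers = extract_papers(SAMPLE_HTML)
```

Pagination would wrap `extract_papers` in a loop over result pages. Pages that load content dynamically would need a rendered page source (for example from Selenium) before parsing.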
Achievements
- Successfully installed JabRef and ensured command recognition.
- Developed a Python script for scraping academic paper details that handles pagination and HTML parsing.
- Resolved several Selenium-related issues, ensuring compatibility and proper configuration.
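
A ChromeDriver version mismatch usually means the driver's major version differs from the installed browser's. A small check like the one below can flag that before a Selenium run. It is a sketch under assumptions: the helper names are hypothetical, and the version strings are illustrative examples in the format printed by `google-chrome --version` and `chromedriver --version`.

```python
import re


def major_version(version_output):
    """Return the major version number found in a --version string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_output, driver_output):
    """True when Chrome and ChromeDriver share the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


# Illustrative strings only, not output captured during the session.
chrome = "Google Chrome 119.0.6045.105"
driver_ok = "ChromeDriver 119.0.6045.105 (abc123)"
driver_old = "ChromeDriver 114.0.5735.90 (abc123)"

print(versions_compatible(chrome, driver_ok))
print(versions_compatible(chrome, driver_old))
```

In a real script the two strings could come from `subprocess.run([...], capture_output=True, text=True).stdout` for each command.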
Pending Tasks
- Further refine web scraping scripts to comply with legal guidelines, especially Google Scholar's terms of service.