📅 2023-11-07 - Session: Installed and troubleshot JabRef and web scraping tools

🕒 22:05–23:40
🏷️ Labels: Jabref, Web Scraping, Python, Selenium, Linux
📂 Project: Dev
⭐ Priority: MEDIUM

Session Goal

The session aimed to install and troubleshoot the JabRef application on a Debian-based system and to explore web scraping techniques for academic papers, particularly focusing on Google Scholar.

Key Activities

  • JabRef Installation: Installed JabRef using a .deb package and resolved command recognition issues by modifying the system PATH and creating symbolic links for easier access.
  • Web Scraping Exploration: Developed Python scripts for extracting academic paper details from HTML using BeautifulSoup and regular expressions. Addressed challenges like pagination, legal considerations, and dynamic content loading.
  • Troubleshooting: Resolved issues with ChromeDriver version mismatches and Selenium WebDriver options errors.
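The regex and pagination steps above can be sketched roughly as follows. This is an illustration only, not the script written during the session. The byline layout ("authors - venue, year - source"), the `start` offset parameter, the page size of 10, and the `example.org` URL are all assumptions, and `parse_byline` and `page_urls` are hypothetical helper names.

```python
import re
from urllib.parse import urlencode

# Assumed byline layout: "<authors> - <venue>, <year> - <source>".
# Real result pages may differ; adjust the pattern to the HTML actually seen.
BYLINE_RE = re.compile(
    r"^(?P<authors>.+?)\s+-\s+(?P<venue>.*?),?\s*"
    r"(?P<year>(?:19|20)\d{2})\s+-\s+(?P<source>.+)$"
)


def parse_byline(byline):
    """Split a result byline into authors, venue, year and source.

    Returns None when the text does not follow the assumed layout.
    """
    match = BYLINE_RE.match(byline.strip())
    if match is None:
        return None
    authors = [a.strip() for a in match.group("authors").split(",") if a.strip()]
    return {
        "authors": authors,
        "venue": match.group("venue").strip(),
        "year": int(match.group("year")),
        "source": match.group("source").strip(),
    }


def page_urls(base_url, query, pages, page_size=10):
    """Build one URL per result page using an assumed `start` offset parameter."""
    return [
        f"{base_url}?{urlencode({'q': query, 'start': i * page_size})}"
        for i in range(pages)
    ]


if __name__ == "__main__":
    print(parse_byline("J Smith, A Lee - Journal of Testing, 2019 - example.org"))
    for url in page_urls("https://example.org/search", "graph neural networks", 2):
        print(url)
```

In a fuller script, the byline text would come from the HTML parser (BeautifulSoup in this session) rather than from a hard-coded string, and requests would be spaced out between pages.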

Achievements

  • Successfully installed JabRef and ensured command recognition.
  • Developed a robust Python script for web scraping academic papers, handling pagination and HTML parsing.
  • Resolved several Selenium-related issues, ensuring compatibility and proper configuration.
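A ChromeDriver mismatch typically means the browser and the driver report different major versions. The sketch below shows one way to compare them, assuming the version strings have already been captured (for example from the output of each tool's `--version` flag). The helper names and the sample version numbers are illustrative and were not recorded in this session.

```python
import re


def major_version(version_text):
    """Return the leading major version number found in a version string."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_compatible(chrome_text, driver_text):
    """True when Chrome and ChromeDriver report the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


if __name__ == "__main__":
    # Sample strings for illustration only.
    chrome = "Google Chrome 119.0.6045.105"
    driver = "ChromeDriver 114.0.5735.90"
    if not versions_compatible(chrome, driver):
        print(
            f"Mismatch: Chrome {major_version(chrome)} "
            f"vs ChromeDriver {major_version(driver)}"
        )
```

Running a check like this before starting a WebDriver session makes the failure easier to read than the error Selenium raises at session creation.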

Pending Tasks

  • Further refine web scraping scripts to comply with legal guidelines, especially concerning Google Scholar’s terms of service.