2024-07-12 - Session: Automated Google Search Script for Profile Links
19:50-20:10
Labels: Python, Web Scraping, Automation, Google Search, CSV
Project: Dev
Priority: MEDIUM
Session Goal
The primary goal of this session was to develop and refine a Python script that automates Google searches to retrieve profile links from a CSV file.
Key Activities
- Developed a Python script utilizing libraries such as requests, BeautifulSoup, and googlesearch-python to automate Google searches for profile links.
- Corrected a keyword argument in the google_search function, changing num to num_results to comply with the googlesearch-python library.
- Updated the google_search function by removing the pause argument and implementing sleep_interval for request delays.
- Reviewed Google search usage limits, including the risk of IP bans and legal issues, and explored mitigations such as official APIs, rate limiting, and proxies.
- Created a script to structure profile links in a DataFrame and save them to a CSV file.
- Set up a rotating proxy pool for web scraping to ensure ethical compliance and avoid IP bans.
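The search and CSV steps above can be sketched roughly as follows. This is not the session's actual script, only a minimal illustration under stated assumptions: the query format, the column names (name, profile_link), and the helper collect_profile_links are hypothetical. The only library call taken from the notes is googlesearch-python's search with the num_results and sleep_interval keywords.

```python
import pandas as pd


def google_search(query, num_results=5, sleep_interval=2):
    """Return result URLs for a query via googlesearch-python."""
    # Imported lazily so the rest of the sketch can be exercised offline.
    from googlesearch import search

    # num_results and sleep_interval are the keyword names the notes settled on
    # (replacing num and pause).
    return list(search(query, num_results=num_results, sleep_interval=sleep_interval))


def collect_profile_links(names, search_fn=google_search):
    """Build a DataFrame with one row per (name, link) pair.

    search_fn is injectable so a stub can stand in for live Google requests.
    """
    rows = []
    for name in names:
        # Hypothetical query format; the real script may build queries differently.
        for url in search_fn(f"{name} profile"):
            rows.append({"name": name, "profile_link": url})
    return pd.DataFrame(rows, columns=["name", "profile_link"])
```

A run might read names with pd.read_csv("names.csv")["name"] and write the result with collect_profile_links(names).to_csv("profile_links.csv", index=False). Both file names and the name column are placeholders.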
Achievements
- Successfully automated the process of retrieving and structuring profile links from a CSV file using Python.
- Corrected the google_search arguments (num to num_results, pause to sleep_interval) so the script matches the googlesearch-python API.
Pending Tasks
- Further testing of the proxy pool setup to ensure reliability and compliance with Google's terms of service.
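
As a starting point for that testing, a round-robin pool that skips proxies marked as failed could look like the sketch below. The notes do not describe how the pool was built, so the ProxyPool class, its method names, and the proxy URLs are all hypothetical.

```python
from itertools import cycle


class ProxyPool:
    """Round-robin proxy pool that skips proxies marked as failed."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("proxy pool needs at least one proxy")
        self._failed = set()
        self._cycle = cycle(self._proxies)

    def next_proxy(self):
        # Inspect each proxy at most once per call so the loop always terminates.
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._failed:
                return proxy
        raise RuntimeError("all proxies are marked as failed")

    def mark_failed(self, proxy):
        # Call this after a timeout or block so the proxy is skipped afterwards.
        self._failed.add(proxy)
```

Each value from next_proxy() could then be passed to requests through its proxies argument, for example proxies={"http": proxy, "https": proxy}, with mark_failed() called when a request through that proxy errors out.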
