M.I. Journal

Developed Headless Scraping Microservice with FastAPI

Jul 14, 2025 · 2 min read

📅 2025-07-14 — Session: Developed Headless Scraping Microservice with FastAPI

🕒 01:15–02:25
🏷️ Labels: Web Scraping, FastAPI, Playwright, Automation, Docker
📂 Project: Dev

Session Goal

The session aimed to develop a robust headless scraping microservice using FastAPI and Playwright, focusing on automation and scalability.

Key Activities

Addressed clipboard issues in headless Chrome environments using Selenium.
Developed strategies for job data extraction and handling JavaScript-heavy pages.
Explored alternatives for content copying in Streamlit apps and production-level DOM content extraction.
Planned and implemented a cloud-based headless browser solution for scalable web scraping.
Analyzed resource usage and scaling strategies for headless browsing systems.
Set up a FastAPI headless-browser scraper API and tested it against JavaScript-heavy pages.
Scaffolded and built a headless scraping microservice, including Dockerization steps.
Resolved DNS errors in Playwright and confirmed API functionality.
Developed curl commands for job listing scraping and handled cookie consent modals.
Investigated Spider API capabilities for dynamic content extraction.
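
As a rough illustration of the FastAPI scraper API described above, here is a minimal sketch. It is not the code from the session: the `/scrape` route, the `ScrapeRequest` fields, and the helper name `is_scrapable_url` are all assumptions made for this example. FastAPI and Playwright are imported inside `create_app()` so the URL check can be exercised without either package installed.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_scrapable_url(url: str) -> bool:
    """Accept only absolute http(s) URLs (hypothetical guard, not from the session)."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


def create_app():
    """Build the app lazily; requires fastapi, pydantic and playwright at call time."""
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel
    from playwright.async_api import async_playwright

    class ScrapeRequest(BaseModel):
        url: str
        wait_until: str = "networkidle"  # wait for JS-heavy pages to settle
        timeout_ms: int = 30_000

    app = FastAPI(title="headless-scraper")

    @app.post("/scrape")
    async def scrape(req: ScrapeRequest):
        if not is_scrapable_url(req.url):
            raise HTTPException(status_code=400, detail="URL must be absolute http(s)")
        # One browser per request keeps the sketch simple; a shared browser
        # started at app startup would be cheaper under load.
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            try:
                page = await browser.new_page()
                await page.goto(req.url, wait_until=req.wait_until, timeout=req.timeout_ms)
                title = await page.title()
                html = await page.content()
            finally:
                await browser.close()
        return {"url": req.url, "title": title, "html": html}

    return app
```

Assuming the file is saved as `scraper.py` (a hypothetical name), it could be served with `uvicorn scraper:create_app --factory` and called by POSTing a JSON body such as `{"url": "https://example.com"}` to `/scrape`, for instance with curl.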

Achievements

Successfully developed and tested a headless scraping microservice using FastAPI and Playwright.
Implemented solutions for common issues like DNS errors and cookie consent handling.
Explored and compared Spider API capabilities with Playwright for dynamic content scraping.
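
The notes do not record how the cookie consent modals were handled. One common approach, sketched below under that caveat, is to try a short list of likely "accept" buttons and move on if none appears. The selector list and the function name `dismiss_cookie_banner` are assumptions for illustration; `page` is expected to be a Playwright async `Page`.

```python
import asyncio

# Candidate selectors are guesses; real sites need their own entries.
CONSENT_SELECTORS = (
    "button#onetrust-accept-btn-handler",  # id commonly used by OneTrust banners
    "button:has-text('Accept all')",
    "button:has-text('Accept')",
    "button:has-text('I agree')",
)


async def dismiss_cookie_banner(page, selectors=CONSENT_SELECTORS, timeout_ms=2000):
    """Click the first consent button that responds; return its selector or None."""
    for selector in selectors:
        try:
            # .first avoids strict-mode errors when several buttons match.
            await page.locator(selector).first.click(timeout=timeout_ms)
            return selector
        except Exception:
            # With Playwright this is typically a TimeoutError; the broad
            # catch keeps the sketch free of a playwright import.
            continue
    return None
```

In a scraper this would run right after `page.goto(...)` and before reading the page content. In the worst case it waits `timeout_ms` once per selector, so the list is best kept short.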

Pending Tasks

Further optimization of resource usage and cost analysis for scaling headless browsing systems.
Continued investigation into Spider API’s advanced features for handling complex web interactions.
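
For the resource-usage task, one simple lever is capping how many pages are scraped at once, since each headless page costs memory and CPU. The sketch below shows that idea with an `asyncio.Semaphore`; the function name `run_bounded`, the `fetch` callback, and the default limit of 4 are assumptions, not measurements from this session.

```python
import asyncio


async def run_bounded(urls, fetch, max_concurrency=4):
    """Run fetch(url) for every URL, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:  # blocks while the cap is reached
            return await fetch(url)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(worker(u) for u in urls))
```

Here `fetch` would be whatever coroutine opens a page and returns its content. The right limit depends on the host's memory and the pages being loaded, so it is worth measuring before fixing a value.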
