M.I. Journal

Developed Headless Scraping Microservice with FastAPI

Jul 14, 2025 · 2 min read

Day: 2025-07-14
Time: 01:15 to 02:25
Project: Dev
Workspace: WP 2: Operational
Status: Completed
Priority: MEDIUM
Assignee: Matías Nehuen Iglesias
Tags: Web Scraping, FastAPI, Playwright, Automation, Docker

Description

Session Goal

The session aimed to develop a robust headless scraping microservice using FastAPI and Playwright, focusing on automation and scalability.

Key Activities

Addressed clipboard issues in headless Chrome environments using Selenium.
Developed strategies for job data extraction and handling JavaScript-heavy pages.
Explored alternatives for content copying in Streamlit apps and production-level DOM content extraction.
Planned and implemented a cloud-based headless browser solution for scalable web scraping.
Analyzed resource usage and scaling strategies for headless browsing systems.
Set up a FastAPI headless browser scraper API and tested with JavaScript-heavy pages.
Scaffolded and built a headless scraping microservice, including Dockerization steps.
Resolved DNS errors in Playwright and confirmed API functionality.
Developed curl commands for job listing scraping and handled cookie consent modals.
Investigated Spider API capabilities for dynamic content extraction.
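
The scraper API described in these activities could look roughly like the sketch below. It is a minimal illustration, not the service built in the session: the `/scrape` route, the `validate_url` and `render_page` helpers, the `create_app` factory, and the consent-button selectors are all assumptions. Playwright and FastAPI are imported inside the functions only so that the URL helper can be exercised without either package installed.

```python
from urllib.parse import urlparse

# Illustrative consent-button selectors (assumptions); real sites vary widely.
CONSENT_SELECTORS = [
    "#onetrust-accept-btn-handler",
    'button:has-text("Accept all")',
    'button:has-text("Accept")',
]


def validate_url(url: str) -> str:
    """Accept only absolute http(s) URLs; raise ValueError otherwise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"unsupported URL: {url!r}")
    return url


async def render_page(url: str, timeout_ms: int = 30_000) -> str:
    """Load a page in headless Chromium and return the rendered HTML."""
    # Lazy import: keeps this module importable without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            # "networkidle" gives JavaScript-heavy pages time to settle.
            await page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            # Click the first visible consent button, if any.
            for selector in CONSENT_SELECTORS:
                button = page.locator(selector).first
                if await button.is_visible():
                    await button.click()
                    break
            return await page.content()
        finally:
            await browser.close()


def create_app():
    """Build the FastAPI app (hypothetical factory)."""
    from fastapi import FastAPI, HTTPException

    app = FastAPI(title="Headless scraper (sketch)")

    @app.get("/scrape")
    async def scrape(url: str) -> dict:
        try:
            target = validate_url(url)
        except ValueError as exc:
            raise HTTPException(status_code=400, detail=str(exc))
        return {"url": target, "html": await render_page(target)}

    return app
```

Saved as `scraper_service.py` (a hypothetical file name), the app could be served with `uvicorn scraper_service:create_app --factory`. Launching a fresh browser per request keeps the sketch simple; a longer-lived browser instance would likely be cheaper under load.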

Achievements

Successfully developed and tested a headless scraping microservice using FastAPI and Playwright.
Implemented solutions for common issues like DNS errors and cookie consent handling.
Explored the Spider API and compared its capabilities with Playwright's for dynamic content scraping.
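
Testing a scraper API comes down to sending it a target URL, which the session did with curl. A rough Python equivalent is sketched below. The base URL `http://localhost:8000`, the `/scrape` path, the `url` query parameter, and the function names are assumptions, not details recorded in this entry.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "http://localhost:8000"  # assumed local address of the service


def build_scrape_request(target_url: str) -> Request:
    """Build a GET request asking the scraper to render target_url."""
    # urlencode percent-escapes the target so its own query string survives.
    query = urlencode({"url": target_url})
    return Request(
        f"{API_BASE}/scrape?{query}",
        headers={"Accept": "application/json"},
    )


def scrape(target_url: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urlopen(build_scrape_request(target_url), timeout=timeout) as response:
        return json.load(response)
```

With the service running, `scrape("https://example.com/jobs")` would return the decoded JSON payload. The generous timeout reflects that rendering a JavaScript-heavy page can take far longer than a plain HTTP fetch.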

Pending Tasks

Further optimization of resource usage and cost analysis for scaling headless browsing systems.
Continued investigation into Spider API’s advanced features for handling complex web interactions.
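
One common way to keep resource usage predictable when scaling headless browsing is to cap how many pages render at once. The sketch below illustrates that idea with an `asyncio.Semaphore`. It is not taken from the session, and `scrape_all`, `fetch`, and `max_concurrency` are hypothetical names; the `fetch` argument stands in for whatever coroutine renders a page.

```python
import asyncio
from typing import Awaitable, Callable


async def scrape_all(
    urls: list[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> dict[str, str]:
    """Run fetch over urls with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> tuple[str, str]:
        async with semaphore:  # waits here while the cap is reached
            return url, await fetch(url)

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    return dict(pairs)
```

Since each headless page holds memory for as long as it is open, the cap puts a rough ceiling on peak usage per worker, which makes cost estimates easier to reason about.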

Evidence

source_file=2025-07-14.sessions.jsonl, line_number=5, event_count=0, session_id=2f52a08a016cf29a2525e0e0e40f9f034f2ccc2f3c94b727ab736e7b2c3a0e77
event_ids: []
