Switch to Playwright to bypass Newegg bot detection, closes #2

Newegg now blocks requests-based scraping; replace with Playwright
using headless Chromium with mouse simulation to pass bot detection.
Also fix hardcoded build output path, use os.makedirs for nested dirs,
update category labels (HDD/SATA SSD/NVMe SSD), drop near-empty 2.5"
internal and laptop HDD categories, and fix invalid HTML in index
template (h2 inside table cells).
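
For illustration only, a minimal sketch of the kind of approach described
above: load a page in headless Chromium and move the mouse along a slightly
jittered path before reading the HTML. The names mouse_path and fetch_html,
the coordinates, and the step counts are hypothetical and not taken from
this commit, and nothing here is claimed to be sufficient to pass any
particular site's bot detection.

```python
import random


def mouse_path(start, end, steps=20, jitter=3.0, seed=None):
    """Return steps + 1 points from start to end.

    Interior points get a small random offset so the path is not a
    perfectly straight line; the endpoints are left exact.
    """
    rng = random.Random(seed)
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(steps + 1):
        t = i / steps
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        if 0 < i < steps:
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        points.append((x, y))
    return points


def fetch_html(url):
    """Load url in headless Chromium, wiggle the mouse, return the page HTML."""
    # Imported lazily so mouse_path() is usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            # Hypothetical coordinates; a real scraper would pick its own.
            for x, y in mouse_path((100, 100), (400, 300)):
                page.mouse.move(x, y)
            return page.content()
        finally:
            browser.close()
```

Calling fetch_html requires the playwright package and a Chromium build
installed via "playwright install chromium"; the returned HTML can then be
parsed with lxml as before.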

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Edward Betts 2026-04-03 15:06:49 +01:00
parent 55bb3697b6
commit 2dc799ecaa
3 changed files with 62 additions and 42 deletions

@@ -1,4 +1,4 @@
-requests
+playwright
 lxml
 jinja2
 daiquiri