Switch to Playwright to bypass Newegg bot detection, closes #2
Newegg now blocks requests-based scraping; replace it with Playwright using headless Chromium with mouse simulation to pass bot detection.

Also:
- fix hardcoded build output path
- use os.makedirs for nested dirs
- update category labels (HDD/SATA SSD/NVMe SSD)
- drop near-empty 2.5" internal and laptop HDD categories
- fix invalid HTML in index template (h2 inside table cells)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 55bb3697b6
commit 2dc799ecaa
3 changed files with 62 additions and 42 deletions
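The commit message mentions switching to os.makedirs for nested output directories. The changed build script is not shown on this page, so the following is only a minimal sketch of that idea; `write_page`, `out_dir`, and the example path are hypothetical names chosen for illustration.

```python
import os
import tempfile


def write_page(out_dir, rel_path, html):
    """Write html to out_dir/rel_path, creating missing parent directories."""
    target = os.path.join(out_dir, rel_path)
    parent = os.path.dirname(target)
    if parent:
        # exist_ok=True makes repeated builds safe when the directory exists.
        os.makedirs(parent, exist_ok=True)
    with open(target, "w", encoding="utf-8") as fh:
        fh.write(html)
    return target


if __name__ == "__main__":
    # Hypothetical nested layout; the real build paths are not shown here.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_page(tmp, os.path.join("us", "hdd", "index.html"), "<html></html>")
        print(os.path.isfile(path))
```

Unlike os.mkdir, os.makedirs creates every missing intermediate directory, which is what a nested output path needs.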
@@ -1,4 +1,4 @@
-requests
+playwright
 lxml
 jinja2
 daiquiri
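The dependency swap above replaces requests with playwright. The scraper changes themselves are not shown on this page, so the following is only a rough sketch of what fetching a page through headless Chromium with simulated mouse movement could look like using Playwright's sync API. `mouse_waypoints`, `fetch_html`, the coordinates, and the timings are all hypothetical, and nothing here is claimed to be sufficient to pass Newegg's bot detection.

```python
import random


def mouse_waypoints(start, end, steps, jitter=3.0, rng=None):
    """Return `steps` points from start to end with small random offsets.

    The final point is exactly `end`, so the pointer lands where intended.
    """
    rng = rng or random.Random()
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        if i < steps:  # leave the last point unjittered
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        points.append((x, y))
    return points


def fetch_html(url):
    """Load `url` in headless Chromium, move the mouse a little, return the HTML."""
    # Imported lazily so mouse_waypoints works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            # Hypothetical start and end coordinates, for illustration only.
            for x, y in mouse_waypoints((100, 100), (400, 300), steps=8):
                page.mouse.move(x, y)
                page.wait_for_timeout(50)  # brief pause between moves, in ms
            return page.content()
        finally:
            browser.close()
```

Running `fetch_html` requires the browser binary as well as the Python package, typically installed with `playwright install chromium` after `pip install playwright`. The returned HTML can then be handed to lxml, which stays in the dependency list.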