Commit graph

5 commits

Author SHA1 Message Date
d4308685f7 Fix Playwright timeout on slow servers
wait_until="networkidle" requires zero network activity for 500ms,
which times out on ad-heavy pages when running on a server. Switch to
domcontentloaded and wait explicitly for div.item-container instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:32:38 +01:00
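
The wait strategy this commit describes might look roughly like the sketch below. It is not the repository's actual code: the function name `fetch_listing_html`, the `timeout_ms` parameter, and the 30-second default are assumptions made for illustration. Only `domcontentloaded` and the `div.item-container` selector come from the commit message.

```python
# Selector named in the commit message.
ITEM_SELECTOR = "div.item-container"


def fetch_listing_html(url, timeout_ms=30_000):
    """Load a listing page and return its HTML once product tiles exist.

    Hypothetical helper: the name and timeout default are assumptions.
    """
    # Imported lazily so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Return as soon as the DOM is parsed instead of waiting for the
            # network to go quiet, which ad-heavy pages may never do.
            page.goto(url, wait_until="domcontentloaded", timeout=timeout_ms)
            # Wait explicitly for the element the scraper actually needs.
            page.wait_for_selector(ITEM_SELECTOR, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

Waiting on a specific selector ties the timeout to the content being scraped rather than to unrelated background requests.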
d161e82865 Record prices to SQLite database for historical tracking, closes #3
Stores item_number, title, size_gb, price, category, seen_at in
data/prices.db with (item_number, seen_at) as the primary key — one
row per item per day. Uses INSERT OR REPLACE so re-running for the
same day is safe. Called automatically from build().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:32:34 +01:00
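
A minimal sketch of the storage scheme described in this commit, assuming a table named `prices`, the column types shown, and an ISO date string for `seen_at`. None of those details are stated in the commit message, and `record_prices` is a hypothetical name.

```python
import sqlite3
from datetime import date

# Column names come from the commit message; table name and types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS prices (
    item_number TEXT NOT NULL,
    title       TEXT,
    size_gb     REAL,
    price       REAL,
    category    TEXT,
    seen_at     TEXT NOT NULL,
    PRIMARY KEY (item_number, seen_at)
)
"""


def record_prices(conn, items, seen_at=None):
    """Upsert one row per item for the given day (defaults to today)."""
    seen_at = seen_at or date.today().isoformat()
    conn.execute(SCHEMA)
    rows = [
        (i["item_number"], i["title"], i["size_gb"], i["price"], i["category"], seen_at)
        for i in items
    ]
    # INSERT OR REPLACE overwrites the existing row when the primary key
    # (item_number, seen_at) already exists, so a same-day re-run is safe.
    conn.executemany(
        "INSERT OR REPLACE INTO prices "
        "(item_number, title, size_gb, price, category, seen_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
```

Storing `seen_at` at day granularity is what makes the composite key behave as "one row per item per day"; a full timestamp would create a new row on every run.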
323a30d9e0 Read build_dir from ~/.config/newegg-hdd/config
Config file uses INI format:

  [newegg-hdd]
  build_dir = /path/to/output

Falls back to output/ in the script directory if the config file or
key is absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:23:02 +01:00
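
The lookup and fallback described in this commit could be sketched with the standard-library `configparser` as follows. The function name `read_build_dir` and its parameters are assumptions; the config path, section name, key, and `output/` fallback come from the commit message.

```python
import configparser
import os

DEFAULT_CONFIG_PATH = os.path.expanduser("~/.config/newegg-hdd/config")


def read_build_dir(script_dir, config_path=DEFAULT_CONFIG_PATH):
    """Return the configured build_dir, or <script_dir>/output if unset.

    Hypothetical helper: the caller passes the script's own directory.
    """
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist, so a missing config
    # file simply leaves the parser empty.
    parser.read(config_path)
    # fallback=None covers both a missing section and a missing key.
    build_dir = parser.get("newegg-hdd", "build_dir", fallback=None)
    if build_dir:
        return build_dir
    return os.path.join(script_dir, "output")
```

A caller would typically pass `os.path.dirname(os.path.abspath(__file__))` as `script_dir`.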
2dc799ecaa Switch to Playwright to bypass Newegg bot detection, closes #2
Newegg now blocks requests-based scraping; replace with Playwright
using headless Chromium with mouse simulation to pass bot detection.
Also fix hardcoded build output path, use os.makedirs for nested dirs,
update category labels (HDD/SATA SSD/NVMe SSD), drop near-empty 2.5"
internal and laptop HDD categories, and fix invalid HTML in index
template (h2 inside table cells).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:06:49 +01:00
55bb3697b6 Initial commit 2023-10-06 18:33:31 +01:00