Newegg now blocks requests-based scraping; replace with Playwright using headless Chromium with mouse simulation to pass bot detection. Also fix hardcoded build output path, use os.makedirs for nested dirs, update category labels (HDD/SATA SSD/NVMe SSD), drop near-empty 2.5" internal and laptop HDD categories, and fix invalid HTML in index template (h2 inside table cells).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
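The commit message mentions using os.makedirs for nested output directories, and the template links to `{{ cat.name }}/index.html` for each category. A minimal sketch of how such a page might be written to disk, assuming a hypothetical helper `write_category_page` and a hypothetical `build_dir` argument (neither name appears in the repository as shown here):

```python
import os
import tempfile


def write_category_page(build_dir: str, cat_name: str, html: str) -> str:
    """Write html to <build_dir>/<cat_name>/index.html and return the path.

    Hypothetical helper for illustration; the real build script may differ.
    """
    out_dir = os.path.join(build_dir, cat_name)
    # exist_ok=True makes repeated builds safe, and makedirs creates any
    # missing parent directories (os.mkdir would fail on nested paths).
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "index.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return path


if __name__ == "__main__":
    # Use a throwaway directory so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = os.path.join(tmp, "output", "site")  # nested, not yet created
        print(write_category_page(build_dir, "hdd", "<p>placeholder</p>"))
```

Passing the build directory in as a parameter, rather than hardcoding it, is one way to address the hardcoded output path the commit message refers to.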
36 lines
978 B
HTML
{% extends "page.html" %}
{% block title %}Price per TB{% endblock %}
{% block content %}
<h1>{{self.title()}}</h1>
<p>List of hard drives available for sale from <a href="http://newegg.com/">Newegg.com</a>, sorted by price per TB.</p>
<p>Built by <a href="http://edwardbetts.com">Edward Betts</a>.
Comments welcome: edward@4angle.com</p>
<p>Last updated: {{ today.strftime('%d %B %Y') }}.</p>
{% for cat in best %}
<h2>{{ cat.label }}</h2>
<table>
<tr>
<th align="right">Price<br>per TB</th>
<th align="right">Price</th>
<th align="right">Size</th>
<th align="left">Drive</th>
</tr>
{% for hdd in cat['items'][:16] %}
<tr>
<td align="right">${{ '%.2f' | format(hdd.price_per_tb) }}</td>
<td align="right">${{ hdd.price }}</td>
<td align="right">{{ hdd.size }}</td>
<td><a href="https://www.newegg.com/Product/Product.aspx?Item={{ hdd.number }}">{{ hdd.title }}</a></td>
</tr>
{% endfor %}
</table>
<p><a href="{{ cat.name }}/index.html">more</a></p>
{% endfor %}
{% endblock %}
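{#
The template above reads `best` as a list of categories, each with `name`, `label`, and `items`, and each item with `price_per_tb`, `price`, `size`, `number`, and `title`. The sketch below shows one way that context could be assembled so that `items` is already sorted by price per TB. The `Drive` class, its `size_gb` field, the `build_category` helper, and the sample listings are all assumptions made for illustration; the real scraper may structure its data differently.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Drive:
    """One listing; attribute names mirror what the template reads."""

    number: str      # retailer item number used in the product link
    title: str
    price: Decimal   # price in US dollars
    size_gb: int     # capacity in gigabytes (hypothetical field)

    @property
    def size(self) -> str:
        # Human-readable capacity, using decimal terabytes (1 TB = 1000 GB).
        return f"{self.size_gb / 1000:g}TB"

    @property
    def price_per_tb(self) -> float:
        return float(self.price) / (self.size_gb / 1000)


def build_category(name: str, label: str, drives: list) -> dict:
    """Return a category dict with its drives sorted cheapest per TB first."""
    ranked = sorted(drives, key=lambda d: d.price_per_tb)
    # A plain dict is used because the template indexes cat['items'];
    # on a dict, cat.items would resolve to the dict.items method instead.
    return {"name": name, "label": label, "items": ranked}


if __name__ == "__main__":
    # Made-up sample listings, not real prices.
    sample = [
        Drive("A", "Example 4TB drive", Decimal("80.00"), 4000),
        Drive("B", "Example 8TB drive", Decimal("120.00"), 8000),
    ]
    best = [build_category("hdd", "HDD", sample)]
    for hdd in best[0]["items"][:16]:
        print("%.2f" % hdd.price_per_tb, hdd.price, hdd.size, hdd.title)
```

Sorting before rendering keeps the template simple: it only has to slice the first 16 rows of each category.
#}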