Eurotunnel price checker
Overview
This is a personal tool designed to scrape and display Eurotunnel ticket prices. The tool consists of two primary Python scripts:
- check.py: A script that uses a headless browser to scrape ticket data and save it as HTML files.
- web_view.py: A Flask web application that parses the scraped HTML data and displays the available tickets and prices.
Requirements
- Python 3.x
- Flask
- lxml
- Playwright Python package
- pytz
Install them via pip:
pip install flask lxml playwright pytz
Playwright also needs its browser binaries, which can be downloaded with:
playwright install
How to use
Running the scraper (check.py)
- Update the outbound_date and return_date variables in the script to match your desired travel dates.
- Run the script manually or add it to your crontab for scheduled checks. The scraped HTML files are saved in the data directory.
For example, to run the script every day at 6:34 AM (this assumes check.py is executable and starts with a Python shebang line):
34 6 * * * ~/src/2023/eurotunnel-scrape/check.py
Running the web viewer (web_view.py)
- Run web_view.py to start the Flask web server.
- Access the web interface to view available tickets and prices.
To start the web server:
python web_view.py
You can then navigate to http://localhost:5000/ to see the ticket options.
Code structure
- check.py uses the playwright package to scrape the Eurotunnel website for ticket prices and saves the resulting HTML files.
- web_view.py reads these HTML files, extracts the relevant data using lxml, and displays it using a Flask web interface.
Data Classes and Functions
- Train: Data class representing a Eurotunnel train with departure time, arrival time, and price.
- get_filename(direction: str) -> tuple[datetime, str]: Function to find the most recent file corresponding to a given direction ('outbound' or 'return').
- get_tickets(filename: str) -> tuple[date, list[Train]]: Function to parse the HTML and get a list of available trains and prices.
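As a rough illustration of how the Train data class and get_filename could fit together, here is a minimal sketch. It is not the code from web_view.py: the field names and types on Train, the data_dir parameter, and the choice to read the scrape time from the file name are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, time
from decimal import Decimal
from pathlib import Path


@dataclass
class Train:
    """One Eurotunnel departure (field names and types are assumptions)."""

    dep: time  # departure time
    arr: time  # arrival time
    price: Decimal  # standard ticket price


def get_filename(direction: str, data_dir: str = "data") -> tuple[datetime, str]:
    """Return (scrape time, path) of the newest file for a direction.

    data_dir is a hypothetical parameter added to keep the sketch testable.
    """
    found = []
    for path in Path(data_dir).glob(f"*_{direction}.html"):
        # File names look like 2023-09-29_123456_outbound.html
        stamp = path.name.removesuffix(f"_{direction}.html")
        found.append((datetime.strptime(stamp, "%Y-%m-%d_%H%M%S"), str(path)))
    if not found:
        raise FileNotFoundError(f"no {direction} files in {data_dir}")
    # Tuples compare by scrape time first, so max() picks the latest scrape.
    return max(found)
```

Calling get_filename("outbound") would then return the timestamp and path of the latest outbound scrape, which could be passed on to get_tickets.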
Notes
- All prices are displayed for 'standard' tickets only.
- Time and price information is displayed only for trains within a specific time range (as defined in web_view.py).
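The time-range restriction could be sketched as a simple filter on departure time. The function name in_window, the Train fields, and the sample times and prices are made up for illustration; the actual range and logic are whatever web_view.py defines.

```python
from dataclasses import dataclass
from datetime import time
from decimal import Decimal


@dataclass
class Train:
    dep: time  # departure time (assumed field name)
    arr: time  # arrival time (assumed field name)
    price: Decimal  # standard ticket price (assumed field name)


def in_window(trains: list[Train], earliest: time, latest: time) -> list[Train]:
    """Keep only trains departing between earliest and latest, inclusive."""
    return [t for t in trains if earliest <= t.dep <= latest]


# Illustrative values only, not real Eurotunnel data.
sample = [
    Train(time(5, 20), time(6, 55), Decimal("89")),
    Train(time(9, 50), time(11, 25), Decimal("120")),
    Train(time(22, 20), time(23, 55), Decimal("75")),
]
daytime = in_window(sample, time(8, 0), time(20, 0))
```

Here daytime contains only the 09:50 departure, since the other two fall outside the 08:00 to 20:00 window.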
Data Storage
Scraped data is saved as HTML files in the data directory. The filenames include timestamps to indicate when the data was scraped. For example:
- 2023-09-29_123456_outbound.html: Outbound data scraped on September 29, 2023, at 12:34:56.
- 2023-10-06_234557_return.html: Return data scraped on October 6, 2023, at 23:45:57.
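A file name in this format can be produced from a timestamp and a direction with a single format string. The helper below is a sketch, not the code in check.py; the name scrape_filename and the validation of direction are assumptions.

```python
from datetime import datetime


def scrape_filename(when: datetime, direction: str) -> str:
    """Build a name such as 2023-09-29_123456_outbound.html."""
    if direction not in ("outbound", "return"):
        raise ValueError(f"unknown direction: {direction!r}")
    # Date, then time as HHMMSS without separators, then the direction.
    return f"{when:%Y-%m-%d_%H%M%S}_{direction}.html"


name = scrape_filename(datetime(2023, 9, 29, 12, 34, 56), "outbound")
```

Because the timestamp is zero-padded and ordered from year down to second, sorting these names alphabetically also sorts them chronologically within one direction.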
Author
This tool was created by Edward Betts.
Support and Contributions
This tool is provided as-is and may require maintenance or updates as Eurotunnel's website changes. If you encounter issues or have suggestions for improvements, feel free to open an issue or submit a pull request on the GitHub repository.
License
This tool is released under the MIT License.
Disclaimer
This tool is not affiliated with Eurotunnel and is meant for personal use only. Always refer to the official Eurotunnel website for the most accurate and up-to-date information.