Add README.md and LICENSE
This commit is contained in:
parent
64e7ea653e
commit
f5be17e979
21
LICENSE
Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Edward Betts <edward@4angle.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
85
README.md
Normal file
@@ -0,0 +1,85 @@
# Eurotunnel price checker

## Overview

This is a personal tool that scrapes and displays Eurotunnel ticket prices. It consists of two Python scripts:

1. `check.py`: A script that drives a headless browser to scrape ticket data and saves it as HTML files.
2. `web_view.py`: A Flask web application that parses the scraped HTML and displays the available tickets and prices.

## Requirements

- Python 3.x
- Flask
- lxml
- Playwright Python package
- pytz

Install them via pip:

```bash
pip install flask lxml playwright pytz
```

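As a quick sanity check after installing, the sketch below reports which of the packages listed above cannot be imported. The helper name `missing_modules` is hypothetical and not part of this repository, and the import names are assumed to match the pip names. Playwright also needs browser binaries, which its own `playwright install` command downloads; this check does not cover that.

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumed to match the pip names).
REQUIRED_MODULES = ["flask", "lxml", "playwright", "pytz"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```
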
## How to use

### Running the scraper (`check.py`)

1. Update the `outbound_date` and `return_date` variables in the script to match your desired travel dates.
2. Run the script manually or add it to your crontab for scheduled checks. The HTML files are saved in the `data` directory.

For example, to run the script every day at 6:34 AM:

```bash
34 6 * * * ~/src/2023/eurotunnel-scrape/check.py
```

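The scraping step could look roughly like the sketch below. It is not the code in `check.py`: the helper names `validate_dates` and `fetch_html` are hypothetical, the dates are placeholders, and the URL to load is left to the caller because the README does not state it. Only the Playwright sync API calls (`sync_playwright`, `chromium.launch`, `new_page`, `goto`, `content`) are real.

```python
from datetime import date


def validate_dates(outbound_date: date, return_date: date) -> None:
    """Raise ValueError if the return date is before the outbound date."""
    if return_date < outbound_date:
        raise ValueError("return_date must not be earlier than outbound_date")


def fetch_html(url: str) -> str:
    """Load a page in headless Chromium and return the rendered HTML."""
    # Imported here so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
    return html


# Placeholder travel dates; edit these to match your trip.
outbound_date = date(2023, 9, 29)
return_date = date(2023, 10, 6)
validate_dates(outbound_date, return_date)
```

Checking the dates before launching the browser means a typo fails immediately instead of after a slow page load.
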
### Running the web viewer (`web_view.py`)

1. Run `web_view.py` to start the Flask web server.
2. Access the web interface to view available tickets and prices.

To start the web server:

```bash
python web_view.py
```

You can then navigate to `http://localhost:5000/` to see the ticket options.

## Code structure

- `check.py` uses the Playwright package to scrape the Eurotunnel website for ticket prices and saves the resulting HTML files.
- `web_view.py` reads these HTML files, extracts the relevant data using lxml, and displays it using a Flask web interface.

### Data classes and functions

- `Train`: Data class representing a Eurotunnel train with departure time, arrival time, and price.
- `get_filename(direction: str) -> tuple[datetime, str]`: Function to find the most recent file corresponding to a given direction ('outbound' or 'return').
- `get_tickets(filename: str) -> tuple[date, list[Train]]`: Function to parse the HTML and get a list of available trains and prices.

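To make the descriptions above concrete, here is a minimal sketch of `Train` and `get_filename` using only the standard library. It is an illustration, not the code in `web_view.py`: the field names and types of `Train`, the extra `data_dir` parameter, and the filename pattern (taken from the examples in this README) are all assumptions. `get_tickets` is left out because it depends on the structure of the scraped HTML, which is not described here.

```python
import os
from dataclasses import dataclass
from datetime import datetime, time
from decimal import Decimal


@dataclass
class Train:
    """One train; the field names and types here are assumptions."""

    dep: time
    arr: time
    price: Decimal


def get_filename(direction: str, data_dir: str = "data") -> tuple[datetime, str]:
    """Return (scrape time, path) of the newest file for the given direction."""
    suffix = f"_{direction}.html"
    candidates = []
    for name in os.listdir(data_dir):
        if not name.endswith(suffix):
            continue
        stamp = name[: -len(suffix)]
        try:
            scraped = datetime.strptime(stamp, "%Y-%m-%d_%H%M%S")
        except ValueError:
            continue  # skip files whose names do not match the pattern
        candidates.append((scraped, os.path.join(data_dir, name)))
    if not candidates:
        raise FileNotFoundError(f"no {direction} files in {data_dir}")
    # Tuples compare by their first element, so max() picks the latest scrape.
    return max(candidates)
```

Parsing the timestamp rather than sorting raw filenames keeps stray files in the directory from being mistaken for the newest snapshot.
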
## Notes

- All prices are displayed for 'standard' tickets only.
- Times and prices are displayed only for trains that depart within a specific time range (as defined in `web_view.py`).

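The time-range filter mentioned above can be pictured as follows. The bounds `EARLIEST` and `LATEST`, the helper `in_range`, and the sample departure times are all made up for illustration; the actual range is whatever `web_view.py` defines.

```python
from datetime import time

# Hypothetical bounds; the real range is set in web_view.py.
EARLIEST = time(8, 0)
LATEST = time(14, 0)


def in_range(departure: time, earliest: time = EARLIEST, latest: time = LATEST) -> bool:
    """Return True if the departure falls inside the inclusive time window."""
    return earliest <= departure <= latest


# Sample departure times (invented for this example).
departures = [time(6, 20), time(9, 50), time(13, 20), time(18, 50)]
shown = [d for d in departures if in_range(d)]
```

With these sample values, only the 09:50 and 13:20 departures would be shown.
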
## Data storage

Scraped data is saved as HTML files in the `data` directory. The filenames include timestamps to indicate when the data was scraped. For example:

- `2023-09-29_123456_outbound.html`: Outbound data scraped on September 29, 2023, at 12:34:56.
- `2023-10-06_234556_return.html`: Return data scraped on October 6, 2023, at 23:45:56.

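A filename in this scheme can be produced from a `datetime` with a single format string. The sketch below assumes the pattern `YYYY-MM-DD_HHMMSS_<direction>.html` inferred from the examples above; the function name `snapshot_filename` is hypothetical.

```python
from datetime import datetime


def snapshot_filename(scraped_at: datetime, direction: str) -> str:
    """Build a name like 2023-09-29_123456_outbound.html."""
    if direction not in ("outbound", "return"):
        raise ValueError(f"unknown direction: {direction!r}")
    return f"{scraped_at:%Y-%m-%d_%H%M%S}_{direction}.html"


name = snapshot_filename(datetime(2023, 9, 29, 12, 34, 56), "outbound")
```

Because the timestamp is zero-padded and ordered from year down to second, sorting these names alphabetically also sorts them chronologically.
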
## Author

This tool was created by Edward Betts.

## Support and contributions

This tool is provided as-is and may require maintenance or updates as Eurotunnel's website changes. If you encounter issues or have suggestions for improvements, feel free to open an issue or submit a pull request on the GitHub repository.

## License

This tool is released under the [MIT License](LICENSE).

## Disclaimer

This tool is not affiliated with Eurotunnel and is meant for personal use only. Always refer to the official Eurotunnel website for the most accurate and up-to-date information.