# Flickr Mail

Tool lives here: <https://edwardbetts.com/flickr_mail/>

Flickr Mail is a Flask app that helps find Flickr photos for Wikipedia articles
and contact photographers to request Wikipedia-compatible licensing.

## What It Does

- Searches Flickr from a Wikipedia article title/URL
- Shows license status for each result (free vs non-free CC variants)
- Builds a ready-to-send Flickr message for non-free licenses
- Finds image-less articles in a Wikipedia category
- Shows recent Commons uploads that came from Flickr mail outreach

## Project Layout

- `main.py`: Flask app routes and core logic
- `templates/`: UI templates
- `download_sent_mail.py`: sync Flickr sent messages into DB
- `download_commons_contributions.py`: sync Commons contributions into DB
- `update_flickr_uploads.py`: derive `flickr_uploads` from contributions/sent mail
- `flickr_mail.db`: SQLite database

## Database Pipeline

The recent uploads section depends on a 3-step pipeline:

1. `./download_sent_mail.py` updates `sent_messages`
2. `./download_commons_contributions.py` updates `contributions`
3. `./update_flickr_uploads.py` builds/updates `flickr_uploads`

`main.py` only reads `flickr_uploads`; it does not populate it.

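As a rough illustration of the ordering constraint, the three steps could be driven from a small Python wrapper that stops at the first failure, so a later step never runs on stale input. The `run_pipeline` helper and its `run` parameter are hypothetical and not part of the repository; only the script names come from this README.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Script names as listed in this README, in the required order.
PIPELINE_STEPS = (
    "./download_sent_mail.py",
    "./download_commons_contributions.py",
    "./update_flickr_uploads.py",
)


def run_pipeline(
    steps: Sequence[str] = PIPELINE_STEPS,
    run: Optional[Callable[[str], int]] = None,
) -> List[str]:
    """Run each step in order and return the steps that completed.

    `run` takes a script path and returns its exit code. It defaults to
    executing the script with subprocess; it is injectable for testing.
    """
    if run is None:
        run = lambda script: subprocess.run([script]).returncode

    completed: List[str] = []
    for script in steps:
        if run(script) != 0:
            # Stop here: later steps depend on the tables this one updates.
            break
        completed.append(script)
    return completed
```

Calling `run_pipeline()` with no arguments would execute the real scripts from the current directory.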
## UploadWizard Detection

`update_flickr_uploads.py` supports both Commons UploadWizard comment styles:

- `User created page with UploadWizard` (older)
- `Uploaded a work by ... with UploadWizard` (newer)

It first tries to extract a Flickr URL directly from the contribution comment.
If absent, it falls back to Commons `extmetadata.Credit`.

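The comment-first, credit-second lookup described above can be sketched as follows. This is a minimal illustration, not the code in `update_flickr_uploads.py`: the `find_flickr_url` name and the regular expression are assumptions, and the pattern only covers photo page URLs of the form `flickr.com/photos/<user>/<id>`.

```python
import re
from typing import Optional

# Assumed URL shape: https://www.flickr.com/photos/<user>/<numeric id>
FLICKR_PHOTO_RE = re.compile(r"https?://(?:www\.)?flickr\.com/photos/[^/\s]+/\d+")


def find_flickr_url(comment: str, credit: Optional[str] = None) -> Optional[str]:
    """Return the first Flickr photo URL found in the comment, else in the credit.

    `credit` stands in for the Commons `extmetadata.Credit` value, which may
    contain HTML; the regex simply searches the raw text.
    """
    for text in (comment, credit):
        if not text:
            continue
        match = FLICKR_PHOTO_RE.search(text)
        if match:
            return match.group(0)
    return None
```

With the older comment style the comment carries no URL, so the result comes from `credit` when one is supplied and is `None` otherwise.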
## Local Run

Install dependencies (example):

```bash
pip install flask requests beautifulsoup4 sqlalchemy
```

Start the app:

```bash
python3 main.py
```

Then open:

- `http://localhost:5000/`

## Refresh Data

Run in this order:

```bash
./download_sent_mail.py
./download_commons_contributions.py
./update_flickr_uploads.py
```

Before running `./download_sent_mail.py`, create local auth config:

```bash
cp download_sent_mail.example.json download_sent_mail.local.json
```

Then edit `download_sent_mail.local.json` and set `cookies_str` to your full
Flickr `Cookie` header value.

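For orientation, a script could consume that file roughly as shown here. This is a sketch under assumptions: the `load_cookie_header` helper is hypothetical, and only the file name and the `cookies_str` key are taken from this README; any other keys in the example JSON are not modelled.

```python
import json
from pathlib import Path
from typing import Dict


def load_cookie_header(path: str = "download_sent_mail.local.json") -> Dict[str, str]:
    """Read the local config and return a headers dict carrying the Cookie value."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    cookies = str(config.get("cookies_str", "")).strip()
    if not cookies:
        # Fail early rather than sending an unauthenticated request.
        raise ValueError(f"cookies_str is missing or empty in {path}")
    return {"Cookie": cookies}
```

The returned dict can be passed as the `headers` argument of an HTTP client such as `requests.get`.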
## Interaction Logging

The app logs searches and message generation to the `interaction_log` table:

- `search_article`: when a user searches for a Wikipedia article title (page 1 only)
- `search_category`: when a user searches a Wikipedia category
- `generate_message`: when a non-free CC message is generated for a photo

Each row records the timestamp, interaction type, client IP (from
`X-Forwarded-For` if present), User-Agent, query, and (for message events)
the Flickr and Wikipedia URLs.

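To make the row shape concrete, here is a minimal sketch using the standard-library `sqlite3` module. It is not the app's actual model: the column names, the `client_ip` and `log_interaction` helpers, and the choice to take the first `X-Forwarded-For` entry are all assumptions. Only the table name, the three interaction types, and the list of recorded fields come from the description above.

```python
import sqlite3
import time
from typing import Mapping, Optional

# Column names are illustrative guesses at the fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interaction_log (
    id INTEGER PRIMARY KEY,
    timestamp INTEGER NOT NULL,
    interaction_type TEXT NOT NULL,
    ip_address TEXT,
    user_agent TEXT,
    query TEXT,
    flickr_url TEXT,
    wikipedia_url TEXT
)
"""


def client_ip(headers: Mapping[str, str], remote_addr: Optional[str]) -> Optional[str]:
    """Prefer the first X-Forwarded-For entry, else the socket address."""
    # A plain dict is case-sensitive; Flask's request.headers is not.
    first = headers.get("X-Forwarded-For", "").split(",")[0].strip()
    return first or remote_addr


def log_interaction(conn, interaction_type, headers, remote_addr,
                    query=None, flickr_url=None, wikipedia_url=None):
    """Insert one row; the URL fields stay NULL for search events."""
    conn.execute(
        "INSERT INTO interaction_log (timestamp, interaction_type, ip_address,"
        " user_agent, query, flickr_url, wikipedia_url)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (int(time.time()), interaction_type, client_ip(headers, remote_addr),
         headers.get("User-Agent"), query, flickr_url, wikipedia_url),
    )
    conn.commit()
```

Note that `X-Forwarded-For` is client-supplied unless a trusted proxy overwrites it, so the logged IP is best treated as a hint rather than an identity.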
## Notes

- `download_commons_contributions.py` uses an overlap window of known-only
  batches before stopping to avoid full-history scans while still catching
  shallow gaps.
- If a known Commons upload is missing from `flickr_uploads`, re-run the full
  3-step pipeline above.
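
One way to read the overlap-window rule in the first note: keep fetching batches, newest first, and stop only after several consecutive batches contain nothing new. The sketch below is an interpretation, not the script's code; `fetch_until_overlap`, the integer IDs, and the default of 3 batches are all hypothetical.

```python
from typing import Iterable, List, Set


def fetch_until_overlap(
    batches: Iterable[List[int]],
    known_ids: Set[int],
    overlap_batches: int = 3,
) -> List[int]:
    """Collect unseen IDs, stopping after `overlap_batches` known-only batches in a row."""
    new_ids: List[int] = []
    known_only_streak = 0
    for batch in batches:
        fresh = [item for item in batch if item not in known_ids]
        if fresh:
            new_ids.extend(fresh)
            known_only_streak = 0  # a gap was found, so restart the window
        else:
            known_only_streak += 1
            if known_only_streak >= overlap_batches:
                break  # deep enough into known history; skip the full scan
    return new_ids
```

A window of 1 would stop at the first fully known batch and miss any gap just behind it, which is the shallow-gap case the note refers to.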