# Flickr Mail
Tool lives here: <https://edwardbetts.com/flickr_mail/>

Flickr Mail is a Flask app that helps find Flickr photos for Wikipedia articles
and contact photographers to request Wikipedia-compatible licensing.
## What It Does
- Searches Flickr from a Wikipedia article title/URL
- Shows license status for each result (free vs non-free CC variants)
- Builds a ready-to-send Flickr message for non-free licenses
- Finds image-less articles in a Wikipedia category
- Shows recent Commons uploads that came from Flickr mail outreach
## Project Layout
- `main.py`: Flask app routes and core logic
- `templates/`: UI templates
- `download_sent_mail.py`: sync Flickr sent messages into DB
- `download_commons_contributions.py`: sync Commons contributions into DB
- `update_flickr_uploads.py`: derive `flickr_uploads` from contributions/sent mail
- `flickr_mail.db`: SQLite database
## Database Pipeline
The recent uploads section depends on a 3-step pipeline:
1. `./download_sent_mail.py` updates `sent_messages`
2. `./download_commons_contributions.py` updates `contributions`
3. `./update_flickr_uploads.py` builds/updates `flickr_uploads`

`main.py` only reads `flickr_uploads`; it does not populate it.
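
Because each step reads what the previous one wrote, it can help to stop at the first failure. The following is a minimal sketch of such a wrapper, not part of the project: `run_pipeline` and its `runner` parameter are hypothetical names, and only the three script paths come from the list above.

```python
import subprocess

# Script paths are taken from the pipeline list above; the order matters.
PIPELINE = [
    "./download_sent_mail.py",
    "./download_commons_contributions.py",
    "./update_flickr_uploads.py",
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order, stopping at the first failure."""
    for step in steps:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit status, so later steps never run on stale tables.
        runner([step], check=True)
```

Calling `run_pipeline()` from the project root would run all three scripts in sequence.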
## UploadWizard Detection
`update_flickr_uploads.py` supports both Commons UploadWizard comment styles:
- `User created page with UploadWizard` (older)
- `Uploaded a work by ... with UploadWizard` (newer)

It first tries to extract a Flickr URL directly from the contribution comment.
If the comment contains no Flickr URL, it falls back to the `Credit` field of
the Commons `extmetadata`.
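
The comment-then-credit fallback can be sketched as below. This is an illustration under assumptions, not the code in `update_flickr_uploads.py`: the regular expressions, the assumed shape of a Flickr photo URL, and the function names are all hypothetical.

```python
import re

# The two comment styles named above; the exact regex is an assumption.
UPLOADWIZARD_COMMENT = re.compile(
    r"User created page with UploadWizard"
    r"|Uploaded a work by .+ with UploadWizard"
)

# Assumed shape of a Flickr photo page URL: /photos/<user>/<numeric id>.
FLICKR_PHOTO_URL = re.compile(
    r"https?://(?:www\.)?flickr\.com/photos/[^/\s]+/\d+"
)


def is_uploadwizard_comment(comment):
    """True if the comment matches either UploadWizard style."""
    return UPLOADWIZARD_COMMENT.search(comment) is not None


def find_flickr_url(comment, credit=None):
    """Return the first Flickr photo URL in the comment, else in the credit."""
    for text in (comment, credit):
        if text:
            match = FLICKR_PHOTO_URL.search(text)
            if match:
                return match.group(0)
    return None
```

Here `credit` stands for the text of the `Credit` field, which would be fetched separately from the Commons API.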
## Local Run
Install dependencies (example):
```bash
pip install flask requests beautifulsoup4 sqlalchemy
```
Start the app:
```bash
python3 main.py
```
Then open:
- `http://localhost:5000/`
## Refresh Data
Run in this order:
```bash
./download_sent_mail.py
./download_commons_contributions.py
./update_flickr_uploads.py
```
Before running `./download_sent_mail.py`, create local auth config:
```bash
cp download_sent_mail.example.json download_sent_mail.local.json
```
Then edit `download_sent_mail.local.json` and set `cookies_str` to your full
Flickr `Cookie` header value.
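
For orientation, one way a script could turn that file into request headers is sketched here. It assumes the config is a JSON object with a top-level `cookies_str` key; `load_cookie_header` is a hypothetical helper, and how `download_sent_mail.py` actually consumes the file is not shown in this README.

```python
import json
from pathlib import Path


def load_cookie_header(path="download_sent_mail.local.json"):
    """Read cookies_str from the local config and return request headers."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    cookies = config.get("cookies_str", "").strip()
    if not cookies:
        # Fail early instead of sending an unauthenticated request.
        raise ValueError(f"{path}: cookies_str is empty")
    return {"Cookie": cookies}
```

The returned dict can be passed as `headers=` to `requests.get`. Since the value is a live session cookie, the local file is best kept out of version control.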
## Interaction Logging
The app logs searches and message generation to the `interaction_log` table:
- `search_article`: when a user searches for a Wikipedia article title (page 1 only)
- `search_category`: when a user searches a Wikipedia category
- `generate_message`: when a non-free CC message is generated for a photo

Each row records the timestamp, interaction type, client IP (from
`X-Forwarded-For` if present), User-Agent, query, and (for message events)
the Flickr and Wikipedia URLs.
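
A rough picture of such a row, using the standard-library `sqlite3` module, is sketched below. The column names, the choice to take the first `X-Forwarded-For` entry, and both helper functions are assumptions for illustration; the real schema is defined by the app's models.

```python
import sqlite3
from datetime import datetime, timezone

# Column names are illustrative, not the app's actual schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interaction_log (
    id INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    interaction_type TEXT NOT NULL,
    ip_address TEXT,
    user_agent TEXT,
    query TEXT,
    flickr_url TEXT,
    wikipedia_url TEXT
)
"""


def client_ip(headers, remote_addr):
    """Prefer the first X-Forwarded-For entry, else the socket address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


def log_interaction(conn, interaction_type, headers, remote_addr,
                    query=None, flickr_url=None, wikipedia_url=None):
    """Insert one interaction row with a UTC timestamp."""
    conn.execute(
        "INSERT INTO interaction_log (timestamp, interaction_type, ip_address,"
        " user_agent, query, flickr_url, wikipedia_url)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), interaction_type,
         client_ip(headers, remote_addr), headers.get("User-Agent"),
         query, flickr_url, wikipedia_url),
    )
    conn.commit()
```

In a Flask route, `headers` and `remote_addr` would correspond to `request.headers` and `request.remote_addr`.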
## Notes
- `download_commons_contributions.py` keeps fetching through a short overlap
window of batches that contain only already-known contributions before it
stops. This avoids full-history scans while still catching shallow gaps.
- If a known Commons upload is missing from `flickr_uploads`, re-run the full
3-step pipeline above.
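
The overlap-window idea from the first note can be sketched as follows. This is a guess at the general technique, not the script's actual logic: `fetch_new`, the `overlap_batches` parameter, and the rule that a batch with any new item resets the window are all assumptions.

```python
def fetch_new(batches, known_ids, overlap_batches=3):
    """Collect unseen IDs, stopping after a run of known-only batches.

    batches: iterable of lists of IDs, newest first.
    overlap_batches: hypothetical setting, the number of consecutive
    known-only batches to read before giving up.
    """
    new_items = []
    known_only_run = 0
    for batch in batches:
        fresh = [item for item in batch if item not in known_ids]
        if fresh:
            new_items.extend(fresh)
            known_only_run = 0  # a gap was found, so restart the window
        else:
            known_only_run += 1
            if known_only_run >= overlap_batches:
                break  # far enough into known history; stop scanning
    return new_items
```

With a window of two batches, a gap one batch below the newest known item is still picked up, while a gap deeper than the window is missed, which is when re-running the full pipeline becomes necessary.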