Update README and AGENTS with category search and license features

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Edward Betts 2026-02-07 10:26:56 +00:00
parent c5efd429ce
commit ac1b01ea68
2 changed files with 72 additions and 21 deletions

@@ -14,7 +14,9 @@ licensing.
- **templates/**: Jinja2 templates using Bootstrap 5 for styling
- `base.html`: Base template with Bootstrap CSS/JS
- `combined.html`: Main UI template for search, results, and message composition
- `message.jinja`: Template for the permission request message body
- `message.jinja`: Template for the permission request message body (with
alternate text for non-free CC licenses)
- `category.html`: Category search page with visited link styling
- `show_error.html`: Error display template
## Key Components
@@ -48,10 +50,22 @@ Represents a photo with:
### License Codes
Wikipedia-compatible licenses (can be used): 4 (CC BY), 5 (CC BY-SA), 7 (No
known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain).
Flickr uses numeric codes for licenses. Codes 1-6 are CC 2.0, codes 11-16 are
CC 4.0 equivalents.
Not compatible: 0 (All Rights Reserved), 1-3 (NC variants), 6 (ND).
Wikipedia-compatible (`FREE_LICENSES`): 4 (CC BY 2.0), 5 (CC BY-SA 2.0),
7 (No known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain),
14 (CC BY 4.0), 15 (CC BY-SA 4.0).
Non-free CC (`NONFREE_CC_LICENSES`): 1 (CC BY-NC-SA 2.0), 2 (CC BY-NC 2.0),
3 (CC BY-NC-ND 2.0), 6 (CC BY-ND 2.0), 11-13 (4.0 NC variants),
16 (CC BY-ND 4.0).
Not compatible: 0 (All Rights Reserved).
For free licenses, the message page shows an UploadWizard link instead of a
message. For non-free CC licenses, a tailored message explains which
restrictions (NC/ND) prevent Wikipedia use.
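The three license groups above can be sketched as two sets and a lookup. This is a minimal illustration, not the project's code: the set names `FREE_LICENSES` and `NONFREE_CC_LICENSES` come from the text, while `classify_license` and its return labels are hypothetical. Note that the 4.0 codes other than 16 are listed as unconfirmed under Potential Improvements.

```python
# Set names are taken from the documentation; contents mirror the lists above.
FREE_LICENSES = {4, 5, 7, 8, 9, 10, 14, 15}
NONFREE_CC_LICENSES = {1, 2, 3, 6, 11, 12, 13, 16}


def classify_license(code: int) -> str:
    """Hypothetical helper: map a Flickr license code to a handling group."""
    if code in FREE_LICENSES:
        return "free"  # show the UploadWizard link
    if code in NONFREE_CC_LICENSES:
        return "nonfree_cc"  # show the tailored NC/ND message
    return "incompatible"  # e.g. 0 (All Rights Reserved)
```

For example, `classify_license(5)` falls in the free group, while `classify_license(6)` (CC BY-ND 2.0) falls in the non-free CC group.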
### URL Validation (`is_valid_flickr_image_url`)
@@ -94,15 +108,40 @@ Run to find Flickr uploads from UploadWizard contributions that don't have
the Flickr URL in the edit comment. Queries Commons API for image metadata
and checks the Credit field for Flickr URLs.
### Category Search (`/category` route)
Finds Wikipedia articles in a category that don't have images.
**Key functions**:
- `parse_category_input()`: Accepts category name, `Category:` prefix, or full
Wikipedia URL
- `get_articles_without_images()`: Uses MediaWiki API with
`generator=categorymembers` and `prop=images` for efficient batch queries
- `has_content_image()`: Filters out non-content images (UI icons, logos) using
`NON_CONTENT_IMAGE_PATTERNS`
The `cat` URL parameter is preserved through search results and message pages
to allow back-navigation to the category.
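As a rough sketch of the three input forms `parse_category_input()` is described as accepting, the function below normalises each to a `Category:Name` title. The return format, the `None` result for unusable input, and the `/wiki/` path handling are assumptions made for illustration; the real function may differ.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def parse_category_input(text: str) -> Optional[str]:
    """Return a 'Category:Name' title, or None if no category can be extracted."""
    text = text.strip()
    if not text:
        return None
    if text.startswith(("http://", "https://")):
        # Full Wikipedia URL: take the page title from the /wiki/ path.
        path = unquote(urlparse(text).path)
        if not path.startswith("/wiki/"):
            return None
        text = path[len("/wiki/"):]
        if not text.startswith("Category:"):
            return None  # URL points at a non-category page
    text = text.replace("_", " ")
    if text.startswith("Category:"):
        text = text[len("Category:"):].strip()
    return f"Category:{text}" if text else None
```

With this sketch, `Living people`, `Category:Living_people`, and `https://en.wikipedia.org/wiki/Category:Living_people` all normalise to the same title.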
### Previous Message Detection (`get_previous_messages`)
Checks `sent_mail/messages_index.json` for previous messages to a Flickr user.
Matches by both display name and username (case-insensitive). Results shown as
an info alert on the message page.
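The case-insensitive match on display name and username could look roughly like the following. The structure of `messages_index.json` is not described here, so this sketch assumes a list of dicts with a `recipient` key; that key and the function name `find_previous_messages` are hypothetical.

```python
def find_previous_messages(index, display_name, username):
    """Return entries whose recipient matches either name, ignoring case.

    `index` is assumed to be a list of dicts with a "recipient" key; the
    real index format may differ.
    """
    wanted = {name.casefold() for name in (display_name, username) if name}
    return [
        entry
        for entry in index
        if str(entry.get("recipient", "")).casefold() in wanted
    ]
```

Passing both names means a message sent under either the display name or the username is still found.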
## Request Flow
1. User enters Wikipedia article title/URL → `start()` extracts article name
2. `search_flickr()` fetches and parses Flickr search results
3. Results displayed as clickable photo grid with license badges
4. User clicks photo → page reloads with `flickr` and `img` params
5. `flickr_usrename_to_nsid()` looks up the photographer's NSID
6. Message template rendered with photo details
7. User copies message and clicks link to Flickr's mail compose page
1. User enters Wikipedia article title/URL → `start()` extracts article name.
Alternatively, user searches by category via `/category` route.
2. `search_flickr()` fetches and parses Flickr search results.
Disambiguation suffixes like "(academic)" are removed for the search.
3. Results displayed as clickable photo grid with license badges.
4. User clicks photo → page reloads with `flickr`, `img`, `license`, and
`flickr_user` params.
5. If license is Wikipedia-compatible: show UploadWizard link.
6. Otherwise: `flickr_usrename_to_nsid()` looks up the user's NSID, previous
messages are checked, and the appropriate message template is rendered.
7. User copies message and clicks link to Flickr's mail compose page.
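The disambiguation handling in step 2 can be illustrated with a small regular expression. This is a sketch under the assumption that only a single trailing parenthetical is removed; `strip_disambiguation` is a hypothetical name, not necessarily what the project uses.

```python
import re


def strip_disambiguation(title: str) -> str:
    """Remove one trailing parenthetical, e.g. 'Name (academic)' -> 'Name'."""
    stripped = re.sub(r"\s*\([^()]*\)\s*$", "", title).strip()
    # Fall back to the original title if nothing would be left to search for.
    return stripped or title
```

Titles without a trailing parenthetical pass through unchanged, so the function is safe to apply to every search term.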
## Testing Changes
@@ -123,6 +162,8 @@ print(result.photos[0].title, result.photos[0].license_name)
## Potential Improvements
- Cache search results to reduce Flickr requests
- Add filtering by license type
- Add filtering by license type in search results
- Handle Flickr rate limiting/blocks more gracefully
- Add tests for the parsing logic
- Add pagination for category search (continue token is already returned)
- Confirm CC 4.0 license codes 11-15 (only 16 confirmed so far)

@@ -22,16 +22,23 @@ photographers on Flickr whose photos can be used to enhance Wikipedia articles.
- **Integrated Flickr search**: Enter a Wikipedia article title and see Flickr
photos directly in the interface - no need to visit Flickr's search page.
- **Photo grid with metadata**: Search results display as a grid of thumbnails
showing the photographer's name and license for each photo.
- **License highlighting**: Photos with Wikipedia-compatible licenses (CC BY,
CC BY-SA, CC0, Public Domain) are highlighted with a green badge.
showing the user's name and license for each photo.
- **License handling**: Photos with Wikipedia-compatible licenses (CC BY,
CC BY-SA, CC0, Public Domain) are highlighted with a green badge and link
directly to the Commons UploadWizard. Non-free CC licenses (NC/ND) show a
tailored message explaining Wikipedia's requirements. Supports both CC 2.0
and CC 4.0 license codes.
- **One-click message composition**: Click any photo to compose a permission
request message with the photo displayed alongside.
request message with the photo displayed alongside, showing the user's Flickr
profile and current license.
- **Previous message detection**: The message page checks sent mail history and
warns if you have previously contacted the user.
- **Category search**: Find Wikipedia articles without images in a given
category, with links to search Flickr for each article.
- **Pagination**: Browse through thousands of search results with page navigation.
- **Recent uploads showcase**: The home page displays recent Wikimedia Commons
uploads that were obtained via Flickr mail requests, with links to the
Wikipedia article and photographer's Flickr profile.
- Generate messages to request permission to use photos on Wikipedia.
Wikipedia article and user's Flickr profile.
- Handle exceptions gracefully and provide detailed error information.
## Usage
@@ -40,11 +47,14 @@ To use the tool, follow these steps:
1. Start the tool by running the script.
2. Access the tool through a web browser.
3. Enter the Wikipedia article title or URL.
3. Enter a Wikipedia article title or URL, or use "Find articles by category"
to discover articles that need images.
4. Browse the Flickr search results displayed in the interface.
5. Click on a photo to select it and compose a permission request message.
5. Click on a photo to select it. If the license is Wikipedia-compatible, you'll
be linked to the Commons UploadWizard. Otherwise, a message is composed to
request a license change.
6. Copy the subject and message, then click "Send message on Flickr" to contact
the photographer.
the user.
## Error Handling