Update README and AGENTS with category search and license features
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent c5efd429ce
commit ac1b01ea68
2 changed files with 72 additions and 21 deletions
65
AGENTS.md
@@ -14,7 +14,9 @@ licensing.
 - **templates/**: Jinja2 templates using Bootstrap 5 for styling
 - `base.html`: Base template with Bootstrap CSS/JS
 - `combined.html`: Main UI template for search, results, and message composition
-- `message.jinja`: Template for the permission request message body
+- `message.jinja`: Template for the permission request message body (with
+alternate text for non-free CC licenses)
+- `category.html`: Category search page with visited link styling
 - `show_error.html`: Error display template
 
 ## Key Components
@@ -48,10 +50,22 @@ Represents a photo with:
 
 ### License Codes
 
-Wikipedia-compatible licenses (can be used): 4 (CC BY), 5 (CC BY-SA), 7 (No
-known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain).
+Flickr uses numeric codes for licenses. Codes 1-6 are CC 2.0, codes 11-16 are
+CC 4.0 equivalents.
 
-Not compatible: 0 (All Rights Reserved), 1-3 (NC variants), 6 (ND).
+Wikipedia-compatible (`FREE_LICENSES`): 4 (CC BY 2.0), 5 (CC BY-SA 2.0),
+7 (No known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain),
+14 (CC BY 4.0), 15 (CC BY-SA 4.0).
+
+Non-free CC (`NONFREE_CC_LICENSES`): 1 (CC BY-NC-SA 2.0), 2 (CC BY-NC 2.0),
+3 (CC BY-NC-ND 2.0), 6 (CC BY-ND 2.0), 11-13 (4.0 NC variants),
+16 (CC BY-ND 4.0).
+
+Not compatible: 0 (All Rights Reserved).
+
+For free licenses, the message page shows an UploadWizard link instead of a
+message. For non-free CC licenses, a tailored message explains which
+restrictions (NC/ND) prevent Wikipedia use.
 
 ### URL Validation (`is_valid_flickr_image_url`)
 
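Note on the License Codes hunk: the three-way split it describes can be sketched as below. The set names and their members are taken from the added lines; the helper `classify_license` and its return labels are hypothetical and only illustrate the branching, and treating every remaining code like "All Rights Reserved" is an assumption. The same commit lists codes 11-15 as not yet confirmed.

```python
# Sets copied from the license list in the hunk above.
FREE_LICENSES = {4, 5, 7, 8, 9, 10, 14, 15}
NONFREE_CC_LICENSES = {1, 2, 3, 6, 11, 12, 13, 16}


def classify_license(code: int) -> str:
    """Hypothetical helper: decide what the message page should show."""
    if code in FREE_LICENSES:
        # Free license: show the UploadWizard link, no message needed.
        return "upload_wizard"
    if code in NONFREE_CC_LICENSES:
        # CC license with NC/ND restrictions: use the tailored message.
        return "nonfree_message"
    # Assumption: anything else (e.g. 0, All Rights Reserved) gets the
    # standard permission request message.
    return "standard_message"
```

Keeping the two sets disjoint means the order of the checks does not matter.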
@@ -94,15 +108,40 @@ Run to find Flickr uploads from UploadWizard contributions that don't have
 the Flickr URL in the edit comment. Queries Commons API for image metadata
 and checks the Credit field for Flickr URLs.
 
+### Category Search (`/category` route)
+
+Finds Wikipedia articles in a category that don't have images.
+
+**Key functions**:
+- `parse_category_input()`: Accepts category name, `Category:` prefix, or full
+Wikipedia URL
+- `get_articles_without_images()`: Uses MediaWiki API with
+`generator=categorymembers` and `prop=images` for efficient batch queries
+- `has_content_image()`: Filters out non-content images (UI icons, logos) using
+`NON_CONTENT_IMAGE_PATTERNS`
+
+The `cat` URL parameter is preserved through search results and message pages
+to allow back-navigation to the category.
+
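Note on `parse_category_input()`: the three accepted input forms could be normalised roughly as follows. This is a sketch, not the code from the commit; the return convention (bare category name, or `None` for unusable input) and the handling of non-`/wiki/` URLs are assumptions.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def parse_category_input(text: str) -> Optional[str]:
    """Sketch: return a bare category name, or None if nothing usable."""
    value = text.strip()
    if not value:
        return None
    if value.startswith(("http://", "https://")):
        # Full URL: take the page title from the /wiki/ path.
        path = urlparse(value).path
        if not path.startswith("/wiki/"):
            return None  # assumption: other URL shapes are rejected
        value = unquote(path[len("/wiki/"):]).replace("_", " ")
    if value.lower().startswith("category:"):
        # Drop the namespace prefix so all three forms agree.
        value = value[len("category:"):]
    value = value.strip()
    return value or None
```

For example, `Living people`, `Category:Living people`, and `https://en.wikipedia.org/wiki/Category:Living_people` would all yield `Living people` under these assumptions.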
+### Previous Message Detection (`get_previous_messages`)
+
+Checks `sent_mail/messages_index.json` for previous messages to a Flickr user.
+Matches by both display name and username (case-insensitive). Results shown as
+an info alert on the message page.
+
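Note on `get_previous_messages`: the case-insensitive match on display name and username might look like the sketch below. The layout of `sent_mail/messages_index.json` is not shown in this commit, so the list-of-dicts shape and the `recipient` key are hypothetical, as is the helper name `find_previous_messages`.

```python
from typing import Any, Dict, List, Optional


def find_previous_messages(
    index: List[Dict[str, Any]],
    display_name: Optional[str],
    username: Optional[str],
) -> List[Dict[str, Any]]:
    """Sketch: return index entries addressed to either name."""
    # casefold() gives a case-insensitive comparison key.
    wanted = {name.casefold() for name in (display_name, username) if name}
    return [
        entry
        for entry in index
        # "recipient" is an assumed field name, not confirmed by the source.
        if str(entry.get("recipient", "")).casefold() in wanted
    ]
```

In the app the `index` argument would presumably come from loading the JSON file; taking it as a parameter keeps the matching logic easy to test.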
 ## Request Flow
 
-1. User enters Wikipedia article title/URL → `start()` extracts article name
-2. `search_flickr()` fetches and parses Flickr search results
-3. Results displayed as clickable photo grid with license badges
-4. User clicks photo → page reloads with `flickr` and `img` params
-5. `flickr_usrename_to_nsid()` looks up the photographer's NSID
-6. Message template rendered with photo details
-7. User copies message and clicks link to Flickr's mail compose page
+1. User enters Wikipedia article title/URL → `start()` extracts article name.
+Alternatively, user searches by category via `/category` route.
+2. `search_flickr()` fetches and parses Flickr search results.
+Disambiguation suffixes like "(academic)" are removed for the search.
+3. Results displayed as clickable photo grid with license badges.
+4. User clicks photo → page reloads with `flickr`, `img`, `license`, and
+`flickr_user` params.
+5. If license is Wikipedia-compatible: show UploadWizard link.
+6. Otherwise: `flickr_usrename_to_nsid()` looks up the user's NSID, previous
+messages are checked, and the appropriate message template is rendered.
+7. User copies message and clicks link to Flickr's mail compose page.
 
 ## Testing Changes
 
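Note on step 2 of the new Request Flow: removing a disambiguation suffix such as "(academic)" before searching could be done with a small regular expression. The function name `strip_disambiguation` is hypothetical, and stripping only a single trailing parenthesised group is an assumption about the intended behaviour.

```python
import re

# One trailing "(...)" group, with optional surrounding whitespace.
_DISAMBIG_SUFFIX = re.compile(r"\s*\([^()]*\)\s*$")


def strip_disambiguation(title: str) -> str:
    """Sketch: drop a trailing parenthesised qualifier from an article title."""
    return _DISAMBIG_SUFFIX.sub("", title).strip()
```

Titles without a trailing parenthesised part pass through unchanged, and parentheses in the middle of a title are left alone.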
@@ -123,6 +162,8 @@ print(result.photos[0].title, result.photos[0].license_name)
 ## Potential Improvements
 
 - Cache search results to reduce Flickr requests
-- Add filtering by license type
+- Add filtering by license type in search results
 - Handle Flickr rate limiting/blocks more gracefully
 - Add tests for the parsing logic
+- Add pagination for category search (continue token is already returned)
+- Confirm CC 4.0 license codes 11-15 (only 16 confirmed so far)