diff --git a/AGENTS.md b/AGENTS.md
index 6fe3cca..60e1614 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -14,7 +14,9 @@ licensing.
 - **templates/**: Jinja2 templates using Bootstrap 5 for styling
   - `base.html`: Base template with Bootstrap CSS/JS
   - `combined.html`: Main UI template for search, results, and message composition
-  - `message.jinja`: Template for the permission request message body
+  - `message.jinja`: Template for the permission request message body (with
+    alternate text for non-free CC licenses)
+  - `category.html`: Category search page with visited link styling
   - `show_error.html`: Error display template
 
 ## Key Components
@@ -48,10 +50,22 @@ Represents a photo with:
 
 ### License Codes
 
-Wikipedia-compatible licenses (can be used): 4 (CC BY), 5 (CC BY-SA), 7 (No
-known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain).
+Flickr uses numeric codes for licenses. Codes 1-6 are CC 2.0, codes 11-16 are
+CC 4.0 equivalents.
 
-Not compatible: 0 (All Rights Reserved), 1-3 (NC variants), 6 (ND).
+Wikipedia-compatible (`FREE_LICENSES`): 4 (CC BY 2.0), 5 (CC BY-SA 2.0),
+7 (No known copyright), 8 (US Government), 9 (CC0), 10 (Public Domain),
+14 (CC BY 4.0), 15 (CC BY-SA 4.0).
+
+Non-free CC (`NONFREE_CC_LICENSES`): 1 (CC BY-NC-SA 2.0), 2 (CC BY-NC 2.0),
+3 (CC BY-NC-ND 2.0), 6 (CC BY-ND 2.0), 11-13 (4.0 NC variants),
+16 (CC BY-ND 4.0).
+
+Not compatible: 0 (All Rights Reserved).
+
+For free licenses, the message page shows an UploadWizard link instead of a
+message. For non-free CC licenses, a tailored message explains which
+restrictions (NC/ND) prevent Wikipedia use.
 
 ### URL Validation (`is_valid_flickr_image_url`)
 
@@ -94,15 +108,40 @@ Run to find Flickr uploads from UploadWizard contributions that don't have
 the Flickr URL in the edit comment. Queries Commons API for image metadata and
 checks the Credit field for Flickr URLs.
 
+### Category Search (`/category` route)
+
+Finds Wikipedia articles in a category that don't have images.
+
+**Key functions**:
+- `parse_category_input()`: Accepts category name, `Category:` prefix, or full
+  Wikipedia URL
+- `get_articles_without_images()`: Uses MediaWiki API with
+  `generator=categorymembers` and `prop=images` for efficient batch queries
+- `has_content_image()`: Filters out non-content images (UI icons, logos) using
+  `NON_CONTENT_IMAGE_PATTERNS`
+
+The `cat` URL parameter is preserved through search results and message pages
+to allow back-navigation to the category.
+
+### Previous Message Detection (`get_previous_messages`)
+
+Checks `sent_mail/messages_index.json` for previous messages to a Flickr user.
+Matches by both display name and username (case-insensitive). Results shown as
+an info alert on the message page.
+
 ## Request Flow
 
-1. User enters Wikipedia article title/URL → `start()` extracts article name
-2. `search_flickr()` fetches and parses Flickr search results
-3. Results displayed as clickable photo grid with license badges
-4. User clicks photo → page reloads with `flickr` and `img` params
-5. `flickr_usrename_to_nsid()` looks up the photographer's NSID
-6. Message template rendered with photo details
-7. User copies message and clicks link to Flickr's mail compose page
+1. User enters Wikipedia article title/URL → `start()` extracts article name.
+   Alternatively, user searches by category via `/category` route.
+2. `search_flickr()` fetches and parses Flickr search results.
+   Disambiguation suffixes like "(academic)" are removed for the search.
+3. Results displayed as clickable photo grid with license badges.
+4. User clicks photo → page reloads with `flickr`, `img`, `license`, and
+   `flickr_user` params.
+5. If license is Wikipedia-compatible: show UploadWizard link.
+6. Otherwise: `flickr_usrename_to_nsid()` looks up the user's NSID, previous
+   messages are checked, and the appropriate message template is rendered.
+7. User copies message and clicks link to Flickr's mail compose page.

 ## Testing Changes

@@ -123,6 +162,8 @@ print(result.photos[0].title, result.photos[0].license_name)
 ## Potential Improvements
 
 - Cache search results to reduce Flickr requests
-- Add filtering by license type
+- Add filtering by license type in search results
 - Handle Flickr rate limiting/blocks more gracefully
 - Add tests for the parsing logic
+- Add pagination for category search (continue token is already returned)
+- Confirm CC 4.0 license codes 11-15 (only 16 confirmed so far)
diff --git a/README.md b/README.md
index 1df6a66..848e89d 100644
--- a/README.md
+++ b/README.md
@@ -22,16 +22,23 @@ photographers on Flickr whose photos can be used to enhance Wikipedia articles.
 - **Integrated Flickr search**: Enter a Wikipedia article title and see Flickr
   photos directly in the interface - no need to visit Flickr's search page.
 - **Photo grid with metadata**: Search results display as a grid of thumbnails
-  showing the photographer's name and license for each photo.
-- **License highlighting**: Photos with Wikipedia-compatible licenses (CC BY,
-  CC BY-SA, CC0, Public Domain) are highlighted with a green badge.
+  showing the user's name and license for each photo.
+- **License handling**: Photos with Wikipedia-compatible licenses (CC BY,
+  CC BY-SA, CC0, Public Domain) are highlighted with a green badge and link
+  directly to the Commons UploadWizard. Non-free CC licenses (NC/ND) show a
+  tailored message explaining Wikipedia's requirements. Supports both CC 2.0
+  and CC 4.0 license codes.
 - **One-click message composition**: Click any photo to compose a permission
-  request message with the photo displayed alongside.
+  request message with the photo displayed alongside, showing the user's Flickr
+  profile and current license.
+- **Previous message detection**: The message page checks sent mail history and
+  warns if you have previously contacted the user.
+- **Category search**: Find Wikipedia articles without images in a given
+  category, with links to search Flickr for each article.
 - **Pagination**: Browse through thousands of search results with page navigation.
 - **Recent uploads showcase**: The home page displays recent Wikimedia Commons
   uploads that were obtained via Flickr mail requests, with links to the
-  Wikipedia article and photographer's Flickr profile.
-- Generate messages to request permission to use photos on Wikipedia.
+  Wikipedia article and user's Flickr profile.
 - Handle exceptions gracefully and provide detailed error information.
 
 ## Usage
@@ -40,11 +47,14 @@ To use the tool, follow these steps:
 
 1. Start the tool by running the script.
 2. Access the tool through a web browser.
-3. Enter the Wikipedia article title or URL.
+3. Enter a Wikipedia article title or URL, or use "Find articles by category"
+   to discover articles that need images.
 4. Browse the Flickr search results displayed in the interface.
-5. Click on a photo to select it and compose a permission request message.
+5. Click on a photo to select it. If the license is Wikipedia-compatible, you'll
+   be linked to the Commons UploadWizard. Otherwise, a message is composed to
+   request a license change.
 6. Copy the subject and message, then click "Send message on Flickr" to contact
-   the photographer.
+   the user.
 
 ## Error Handling