flickr-mail/AGENTS.md
Edward Betts 0062de8ede Add pagination for search results
- Add SearchResult dataclass with pagination metadata
- Update search_flickr() to accept page parameter
- Parse total results count from Flickr response
- Add Bootstrap pagination controls to template
- Display total result count in UI
- Update documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 17:03:30 +00:00

# Agent Guidelines for Flickr Mail
This document provides context for AI agents working on this codebase.
## Project Overview
Flickr Mail is a Flask web application that helps users find photos on Flickr
for Wikipedia articles and contact photographers to request Creative Commons
licensing.
## Architecture
- **main.py**: Single-file Flask application containing all routes and logic
- **templates/**: Jinja2 templates using Bootstrap 5 for styling
  - `base.html`: Base template with Bootstrap CSS/JS
  - `combined.html`: Main UI template for search, results, and message composition
  - `message.jinja`: Template for the permission request message body
  - `show_error.html`: Error display template
## Key Components
### Flickr Search (`search_flickr`, `parse_flickr_search_results`)
Searches Flickr by scraping the search results page. The page embeds JSON data
in a `modelExport` JavaScript variable, which contains the photo metadata.
- Uses browser-like headers (`BROWSER_HEADERS`) to avoid being blocked
- Parses embedded JSON by counting braces (not regex) to handle nested structures
- Accepts an optional `page` parameter for pagination (25 photos per page)
- Returns a `SearchResult` dataclass containing photos and pagination metadata
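
The brace-counting approach can be illustrated with a minimal sketch. The helper name `extract_json_object` and the sample HTML fragment are hypothetical, and the string handling is an assumption about what a robust version needs; the real code in `main.py` may differ.

```python
import json


def extract_json_object(text: str, marker: str) -> dict:
    """Return the first JSON object that follows `marker` in `text`."""
    # Find the opening brace of the object that comes after the marker.
    start = text.index("{", text.index(marker))
    depth = 0
    in_string = False
    escaped = False
    for pos in range(start, len(text)):
        ch = text[pos]
        if in_string:
            # Inside a string literal, braces do not change the nesting depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: parse exactly this slice.
                return json.loads(text[start : pos + 1])
    raise ValueError("unbalanced braces after marker")


# Hypothetical page fragment; the real Flickr payload is far larger.
sample_html = '<script>modelExport: {"main": {"title": "a } b", "n": {"x": 1}}},</script>'
data = extract_json_object(sample_html, "modelExport")
```

Counting depth rather than matching with a regex means arbitrarily nested objects are handled, and tracking string state keeps a stray `}` inside a photo title from ending the object early.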
### SearchResult Dataclass
Contains search results with pagination info:
- `photos`: List of `FlickrPhoto` instances
- `total_photos`: Total number of matching photos
- `current_page`: Current page number (1-indexed)
- `total_pages`: Total number of pages (capped at 160 due to Flickr's 4000 result limit)
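
As a rough sketch of how these fields fit together, the dataclass and page-count arithmetic might look like the following. The field types, the constant names, and the `page_count` helper are assumptions for illustration; only the field names, the page size of 25, and the 4000-result limit come from this document.

```python
import math
from dataclasses import dataclass

PHOTOS_PER_PAGE = 25  # page size stated above
MAX_RESULTS = 4000    # Flickr's result limit stated above


@dataclass
class SearchResult:
    photos: list        # FlickrPhoto instances
    total_photos: int   # total number of matching photos
    current_page: int   # 1-indexed
    total_pages: int    # capped by MAX_RESULTS


def page_count(total_photos: int) -> int:
    """Hypothetical helper: number of reachable pages for a result count."""
    reachable = min(total_photos, MAX_RESULTS)
    return math.ceil(reachable / PHOTOS_PER_PAGE)


result = SearchResult(
    photos=[],
    total_photos=5000,
    current_page=1,
    total_pages=page_count(5000),
)
```

With 25 photos per page, the 4000-result limit gives 4000 / 25 = 160 pages, which is where the cap comes from.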
### FlickrPhoto Dataclass
Represents a photo with:
- `id`, `title`, `path_alias`, `owner_nsid`, `username`, `realname`
- `license` (int): Flickr license code (0=ARR, 4=CC BY, 5=CC BY-SA, etc.)
- `thumb_url`, `medium_url`: Static image URLs
- `flickr_url` property: URL to photo page
- `license_name` property: Human-readable license name
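
A minimal sketch of such a dataclass follows. The field types, the `LICENSE_NAMES` table, the fallback from `path_alias` to `owner_nsid`, and the exact shape of the photo page URL are assumptions made for illustration, not taken from `main.py`.

```python
from dataclasses import dataclass

# Short names for Flickr license codes 0-10 (the table name is hypothetical).
LICENSE_NAMES = {
    0: "All Rights Reserved",
    1: "CC BY-NC-SA",
    2: "CC BY-NC",
    3: "CC BY-NC-ND",
    4: "CC BY",
    5: "CC BY-SA",
    6: "CC BY-ND",
    7: "No known copyright restrictions",
    8: "US Government",
    9: "CC0",
    10: "Public Domain",
}


@dataclass
class FlickrPhoto:
    id: str
    title: str
    path_alias: str
    owner_nsid: str
    username: str
    realname: str
    license: int
    thumb_url: str
    medium_url: str

    @property
    def flickr_url(self) -> str:
        # Assumed URL shape; falls back to the NSID when no path alias is set.
        owner = self.path_alias or self.owner_nsid
        return f"https://flickr.com/photos/{owner}/{self.id}/"

    @property
    def license_name(self) -> str:
        return LICENSE_NAMES.get(self.license, f"Unknown license ({self.license})")


# Made-up values, for demonstration only.
photo = FlickrPhoto(
    id="123",
    title="Clock tower",
    path_alias="example",
    owner_nsid="00000000@N00",
    username="example",
    realname="Example User",
    license=4,
    thumb_url="https://live.staticflickr.com/1/123_abc_q.jpg",
    medium_url="https://live.staticflickr.com/1/123_abc.jpg",
)
```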
### License Codes
Wikipedia-compatible licenses: 4 (CC BY), 5 (CC BY-SA), 7 (No known copyright
restrictions), 8 (US Government), 9 (CC0), 10 (Public Domain).

Not compatible: 0 (All Rights Reserved), 1-3 (NC variants), 6 (ND).
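
The split between compatible and incompatible codes can be expressed as a simple set lookup. This is a sketch: the names `WIKIPEDIA_COMPATIBLE` and `is_wikipedia_compatible` are hypothetical and may not match anything in `main.py`.

```python
# License codes listed above as usable on Wikipedia.
WIKIPEDIA_COMPATIBLE = frozenset({4, 5, 7, 8, 9, 10})


def is_wikipedia_compatible(license_code: int) -> bool:
    """Return True when the Flickr license code is Wikipedia-compatible."""
    return license_code in WIKIPEDIA_COMPATIBLE


# Codes 0-10 for which the photographer would need to be contacted.
needs_permission = [code for code in range(11) if not is_wikipedia_compatible(code)]
```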
### URL Validation (`is_valid_flickr_image_url`)
Validates that image URLs passed via query params are from legitimate Flickr
static image servers:
- `live.staticflickr.com`
- `farm*.staticflickr.com`
- `c1.staticflickr.com`, `c2.staticflickr.com`
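
One way to implement a host allowlist like this is sketched below. The function name `looks_like_flickr_image_url` is hypothetical, and two details are assumptions: that `farm*` means `farm` followed by digits, and that only `https` URLs are accepted.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"live.staticflickr.com", "c1.staticflickr.com", "c2.staticflickr.com"}
# Assumption: the farm* wildcard stands for "farm" followed by one or more digits.
FARM_HOST = re.compile(r"farm\d+\.staticflickr\.com")


def looks_like_flickr_image_url(url: str) -> bool:
    """Return True when the URL's host is a known Flickr static image server."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse rejects some malformed inputs, such as a broken IPv6 host.
        return False
    if parsed.scheme != "https":  # assumption: plain HTTP is rejected
        return False
    host = (parsed.hostname or "").lower()
    # Compare the whole hostname so lookalike domains do not slip through.
    return host in ALLOWED_HOSTS or FARM_HOST.fullmatch(host) is not None
```

Comparing the parsed hostname exactly, instead of searching the raw URL for a substring, rejects inputs such as `https://live.staticflickr.com.evil.example/x.jpg`.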
### NSID Lookup (`flickr_usrename_to_nsid`)
Converts a Flickr username/path alias to the NSID (internal user ID) needed
for the Flickr mail URL. Scrapes the user's profile page for embedded params.
## Request Flow
1. User enters Wikipedia article title/URL → `start()` extracts article name
2. `search_flickr()` fetches and parses Flickr search results
3. Results displayed as clickable photo grid with license badges
4. User clicks photo → page reloads with `flickr` and `img` params
5. `flickr_usrename_to_nsid()` looks up the photographer's NSID
6. Message template rendered with photo details
7. User copies message and clicks link to Flickr's mail compose page
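
Step 1 accepts either a title or a URL. A minimal sketch of that normalisation is shown here; the helper name `article_name_from_input` and the assumption that article URLs use a `/wiki/` path prefix are illustrative and may not match how `start()` does it.

```python
from urllib.parse import unquote, urlparse


def article_name_from_input(value: str) -> str:
    """Turn a Wikipedia URL or a plain title into a readable article name."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        path = urlparse(value).path
        # Assumption: article URLs have the form /wiki/<Title>.
        if path.startswith("/wiki/"):
            value = unquote(path[len("/wiki/"):])
    # Wikipedia URLs use underscores where titles have spaces.
    return value.replace("_", " ")
```

For example, both `"Big Ben"` and `"https://en.wikipedia.org/wiki/Big_Ben"` would yield `"Big Ben"` as the search term.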
## Testing Changes
Run the Flask app locally:
```bash
python3 main.py
```
Then visit http://localhost:5000/

Test search functionality:
```python
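from main import search_flickr

# Fetch the first page of results (this makes a live request to Flickr)
result = search_flickr("Big Ben", page=1)
print(f"{len(result.photos)} photos, {result.total_pages} pages")
if result.photos:  # guard against an empty result set
    print(result.photos[0].title, result.photos[0].license_name)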
```
## Potential Improvements
- Cache search results to reduce Flickr requests
- Add filtering by license type
- Handle Flickr rate limiting/blocks more gracefully
- Add tests for the parsing logic