Look for Restaurant: keys and extract name, location (address, city,
state, zip), phone, and brandColor for the wizard business info step.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Define basePath before Toast parsing block so image URLs can be
properly constructed for local file uploads.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extract items from visible HTML instead of just __OO_STATE__ JSON
- Parse headerText spans for item names, price spans for prices
- Extract images from Menu_files/ src attributes
- Fall back to simpler headerText matching if block parsing fails
- Also extract images from __OO_STATE__ and match to items by name
- Fixes issue where only 116 items extracted instead of 163+
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Skip Claude AI for Toast menus - parse the embedded JSON directly.
This extracts all items, categories, and images from the structured
__OO_STATE__ data, which is faster and more complete than AI extraction.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude returns imageUrl but code only checked for images/imageSrc.
Add handling for imageUrl field to properly match images to items.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- uploadSavedPage.cfm: sanitize extracted files (whitelist safe extensions,
delete symlinks) to protect against malicious content from infected sites
- analyzeMenuUrl.cfm: detect local temp URLs and read directly from disk,
bypassing Playwright for faster processing of saved pages
- saveWizard.cfm: delete temp folder immediately after wizard completes
instead of waiting for 1-hour auto-cleanup
- setup-wizard.html: track temp folder ID and pass to saveWizard for cleanup
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Check X-Forwarded-Proto header for HTTPS (reverse proxy)
- chmod extracted files to be world-readable for Playwright
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Security: Also added nginx rule on dev server to block CFM/PHP
execution in /temp/menu-import/ directory.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When scanning extracted ZIP content from /temp/menu-import/, read
images directly from the filesystem instead of re-downloading via HTTP.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
For Cloudflare-protected sites, users can now:
1. Save the page from their browser (Webpage, Complete)
2. ZIP the HTML and assets folder
3. Upload the ZIP in the wizard
4. Server extracts to temp folder, Playwright scans local copy
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace cfhttp with Playwright headless browser
- Capture images from network requests during page render
- No longer needs to fetch subpages (JS renders everything)
- Should capture subcategory items that load dynamically
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Backend now accepts either url or html content in request body
- Frontend adds HTML file upload option below URL input
- Useful when websites block the crawler (403 errors)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The PendingTaskCount subquery was only checking ClaimedByUserID = 0
but not CompletedOn IS NULL, causing completed-but-unclaimed tasks
to show up in the business selection screen count.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>