Commit graph

25 commits

Author SHA1 Message Date
John Mizerek
c5b678ac05 Fix basePath undefined error for local temp file parsing
Define basePath before Toast parsing block so image URLs can be
properly constructed for local file uploads.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:54:10 -08:00
John Mizerek
d8e6f619ac Parse Toast menu from visible HTML for complete item extraction
- Extract items from visible HTML instead of just __OO_STATE__ JSON
- Parse headerText spans for item names, price spans for prices
- Extract images from Menu_files/ src attributes
- Fall back to simpler headerText matching if block parsing fails
- Also extract images from __OO_STATE__ and match to items by name
- Fixes issue where only 116 items extracted instead of 163+

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:53:25 -08:00
John Mizerek
b5abbe43b4 Add direct Toast menu parsing via __OO_STATE__
Skip Claude AI for Toast menus - parse the embedded JSON directly.
This extracts all items, categories, and images from the structured
__OO_STATE__ data, which is faster and more complete than AI extraction.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:34:09 -08:00
John Mizerek
1b16dd8671 Fix imageUrl field handling in menu extraction
Claude returns imageUrl but code only checked for images/imageSrc.
Add handling for imageUrl field to properly match images to items.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:31:24 -08:00
John Mizerek
5cde8ce4fa ZIP upload: add file sanitization, direct file read, and temp cleanup
- uploadSavedPage.cfm: sanitize extracted files (whitelist safe extensions,
  delete symlinks) to protect against malicious content from infected sites
- analyzeMenuUrl.cfm: detect local temp URLs and read directly from disk,
  bypassing Playwright for faster processing of saved pages
- saveWizard.cfm: delete temp folder immediately after wizard completes
  instead of waiting for 1-hour auto-cleanup
- setup-wizard.html: track temp folder ID and pass to saveWizard for cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:02:41 -08:00
John Mizerek
06ca5462c2 Read images from disk for local ZIP uploads
When scanning extracted ZIP content from /temp/menu-import/, read
images directly from the filesystem instead of re-downloading via HTTP.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:07:33 -08:00
John Mizerek
1438267af6 Use wrapper script for Playwright to set browser path 2026-02-12 21:54:02 -08:00
John Mizerek
5c50ce2cf9 Use Playwright for JS-rendered menu scraping
- Replace cfhttp with Playwright headless browser
- Capture images from network requests during page render
- No longer needs to fetch subpages (JS renders everything)
- Should capture subcategory items that load dynamically

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 21:43:37 -08:00
John Mizerek
dbe05a8b12 Update prompt to extract imageUrl from item containers in HTML 2026-02-12 20:36:16 -08:00
John Mizerek
a1b557cdc7 Look for embedded JSON data in menu pages 2026-02-12 20:22:35 -08:00
John Mizerek
361e54c17a Add debug: Beverages HTML snippet to see subcategory structure 2026-02-12 20:06:07 -08:00
John Mizerek
794d2ceee5 Add debug for menuGroup/menuSection structure detection 2026-02-12 20:01:43 -08:00
John Mizerek
bed088d0ff Explicit subcategory rule: outer section = parent, inner sections = subcats 2026-02-12 19:56:29 -08:00
John Mizerek
2163bb3009 Explicit subcategory detection with HTML structure example 2026-02-12 19:55:00 -08:00
John Mizerek
99c2a6aa10 Add HTML snippet debug to see actual structure 2026-02-12 19:50:56 -08:00
John Mizerek
549f3cb31f Explicit Toast subcategory instructions: parent in category, subcat in subcategory 2026-02-12 19:44:10 -08:00
John Mizerek
436861970e Add h4 tag debug to find subcategory tags 2026-02-12 19:40:51 -08:00
John Mizerek
3e9f07df1a Simplify: categories as strings, subcategory on items 2026-02-12 19:37:04 -08:00
John Mizerek
dfb264eba6 Simplify image extraction to single imageUrl per item 2026-02-12 19:29:41 -08:00
John Mizerek
89adfbc92e Add JSON parse error handling with debug output 2026-02-12 19:29:00 -08:00
John Mizerek
ec59f05814 Restore working prompt, add subcategory support without breaking item extraction 2026-02-12 19:22:45 -08:00
John Mizerek
d8dacb198e Fix CFML hash escape in system prompt 2026-02-12 19:17:28 -08:00
John Mizerek
e372f67901 Improve Toast POS subcategory detection with explicit h3 search and debug output 2026-02-12 19:14:15 -08:00
John Mizerek
813628cecb Add HTML file upload option for menu import
- Backend now accepts either url or html content in request body
- Frontend adds HTML file upload option below URL input
- Useful when websites block the crawler (403 errors)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 17:13:32 -08:00
John Mizerek
f6518932db Add URL-based menu import to setup wizard 2026-02-12 16:43:37 -08:00