Commit graph

78 commits

Author SHA1 Message Date
John Mizerek
eec44011f4 Add more debug logging for title and OO_STATE extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:21:47 -08:00
John Mizerek
e8dfd0ba7d Add debug logging for OO_STATE keys and title tag fallback
- Log all top-level keys in __OO_STATE__ to diagnose why Restaurant
  key isn't being found
- Extract business name from HTML title tag as fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:19:32 -08:00
John Mizerek
5c49054e78 Extract business info from Toast __OO_STATE__ JSON
Look for Restaurant: keys and extract name, location (address, city,
state, zip), phone, and brandColor for the wizard business info step.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:12:24 -08:00
John Mizerek
c5b678ac05 Fix basePath undefined error for local temp file parsing
Define basePath before Toast parsing block so image URLs can be
properly constructed for local file uploads.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:54:10 -08:00
John Mizerek
d8e6f619ac Parse Toast menu from visible HTML for complete item extraction
- Extract items from visible HTML instead of just __OO_STATE__ JSON
- Parse headerText spans for item names, price spans for prices
- Extract images from Menu_files/ src attributes
- Fall back to simpler headerText matching if block parsing fails
- Also extract images from __OO_STATE__ and match to items by name
- Fixes issue where only 116 items extracted instead of 163+

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:53:25 -08:00
John Mizerek
b5abbe43b4 Add direct Toast menu parsing via __OO_STATE__
Skip Claude AI for Toast menus - parse the embedded JSON directly.
This extracts all items, categories, and images from the structured
__OO_STATE__ data, which is faster and more complete than AI extraction.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:34:09 -08:00
John Mizerek
1b16dd8671 Fix imageUrl field handling in menu extraction
Claude returns imageUrl but code only checked for images/imageSrc.
Add handling for imageUrl field to properly match images to items.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:31:24 -08:00
John Mizerek
5cde8ce4fa ZIP upload: add file sanitization, direct file read, and temp cleanup
- uploadSavedPage.cfm: sanitize extracted files (whitelist safe extensions,
  delete symlinks) to protect against malicious content from infected sites
- analyzeMenuUrl.cfm: detect local temp URLs and read directly from disk,
  bypassing Playwright for faster processing of saved pages
- saveWizard.cfm: delete temp folder immediately after wizard completes
  instead of waiting for 1-hour auto-cleanup
- setup-wizard.html: track temp folder ID and pass to saveWizard for cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:02:41 -08:00
John Mizerek
06ca5462c2 Read images from disk for local ZIP uploads
When scanning extracted ZIP content from /temp/menu-import/, read
images directly from the filesystem instead of re-downloading via HTTP.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:07:33 -08:00
John Mizerek
1438267af6 Use wrapper script for Playwright to set browser path
2026-02-12 21:54:02 -08:00
John Mizerek
5c50ce2cf9 Use Playwright for JS-rendered menu scraping
- Replace cfhttp with Playwright headless browser
- Capture images from network requests during page render
- No longer needs to fetch subpages (JS renders everything)
- Should capture subcategory items that load dynamically

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 21:43:37 -08:00
John Mizerek
dbe05a8b12 Update prompt to extract imageUrl from item containers in HTML
2026-02-12 20:36:16 -08:00
John Mizerek
a1b557cdc7 Look for embedded JSON data in menu pages
2026-02-12 20:22:35 -08:00
John Mizerek
361e54c17a Add debug: Beverages HTML snippet to see subcategory structure
2026-02-12 20:06:07 -08:00
John Mizerek
794d2ceee5 Add debug for menuGroup/menuSection structure detection
2026-02-12 20:01:43 -08:00
John Mizerek
bed088d0ff Explicit subcategory rule: outer section = parent, inner sections = subcats
2026-02-12 19:56:29 -08:00
John Mizerek
2163bb3009 Explicit subcategory detection with HTML structure example
2026-02-12 19:55:00 -08:00
John Mizerek
99c2a6aa10 Add HTML snippet debug to see actual structure
2026-02-12 19:50:56 -08:00
John Mizerek
549f3cb31f Explicit Toast subcategory instructions: parent in category, subcat in subcategory
2026-02-12 19:44:10 -08:00
John Mizerek
436861970e Add h4 tag debug to find subcategory tags
2026-02-12 19:40:51 -08:00
John Mizerek
3e9f07df1a Simplify: categories as strings, subcategory on items
2026-02-12 19:37:04 -08:00
John Mizerek
dfb264eba6 Simplify image extraction to single imageUrl per item
2026-02-12 19:29:41 -08:00
John Mizerek
89adfbc92e Add JSON parse error handling with debug output
2026-02-12 19:29:00 -08:00
John Mizerek
ec59f05814 Restore working prompt, add subcategory support without breaking item extraction
2026-02-12 19:22:45 -08:00
John Mizerek
d8dacb198e Fix CFML hash escape in system prompt
2026-02-12 19:17:28 -08:00
John Mizerek
e372f67901 Improve Toast POS subcategory detection with explicit h3 search and debug output
2026-02-12 19:14:15 -08:00
John Mizerek
813628cecb Add HTML file upload option for menu import
- Backend now accepts either url or html content in request body
- Frontend adds HTML file upload option below URL input
- Useful when websites block the crawler (403 errors)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 17:13:32 -08:00
John Mizerek
f6518932db Add URL-based menu import to setup wizard
2026-02-12 16:43:37 -08:00
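
Note on commit 5cde8ce4fa: its message says extracted ZIP files are sanitized by whitelisting safe extensions and deleting symlinks. The following is a minimal Python sketch of that idea only, not the code in uploadSavedPage.cfm (which is CFML and not shown in this log). The function name sanitize_extracted_dir and the contents of SAFE_EXTENSIONS are hypothetical; the commit does not list the real whitelist.

```python
import os

# Hypothetical whitelist: the commit message does not say which extensions are allowed.
SAFE_EXTENSIONS = {".html", ".htm", ".css", ".jpg", ".jpeg", ".png", ".gif", ".webp"}


def sanitize_extracted_dir(root: str) -> list[str]:
    """Delete symlinks and files whose extension is not whitelisted.

    Returns the sorted relative paths of everything that was removed.
    """
    removed = []
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, topdown=True, followlinks=False):
        # Symlinks that point at directories show up in dirnames.
        for name in list(dirnames):
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                os.unlink(full)          # removes the link, not its target
                dirnames.remove(name)    # make sure os.walk skips it
                removed.append(os.path.relpath(full, root))
        for name in filenames:
            full = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            if os.path.islink(full) or ext not in SAFE_EXTENSIONS:
                os.unlink(full)
                removed.append(os.path.relpath(full, root))
    return sorted(removed)
```

Called on an extracted folder, this would keep files such as index.html and Menu_files/photo.jpg while removing, for example, a stray .php file or a symlink pointing outside the folder. Checking the extension case-insensitively is a design choice of this sketch, not something the commit states.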