Commit graph

85 commits

Author SHA1 Message Date
John Mizerek
95dc4c49fc Strip address from business name when Toast embeds it in the name field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 11:12:58 -08:00
John Mizerek
e403e49487 Fix Toast OO_STATE: restaurant from ROOT_QUERY, prices from prices[]
- Restaurant info is in ROOT_QUERY.restaurantV2(...) keys, not
  Restaurant:* top-level keys (Apollo cache format)
- Prices are in item.prices array [4.50], not item.price scalar
- Added null checks for imageUrls (can be null, not missing)
- Fallback to title tag for business name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 11:11:44 -08:00
John Mizerek
a0d86d6e87 Add Toast __OO_STATE__ fast-path for URL-fetched menu pages
Instead of sending 450KB of HTML to Claude (which truncates to 100K
and only extracts ~60 items), parse the structured __OO_STATE__ data
directly on the server. This captures all menus, groups, items, prices,
and images from Toast pages - 169 items for Jus Family Cafe vs 60 before.

Falls back to Claude analysis if __OO_STATE__ parsing fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 11:08:07 -08:00
John Mizerek
ced4082993 Fix JSON parsing when Claude returns text preamble before menu JSON
The Claude API sometimes returns explanatory text before the JSON
response even when instructed to return only JSON. Added extraction
logic to find the first { character and strip any leading text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 10:30:21 -08:00
John Mizerek
9acf4aa511 Add server-side h2/h3 hierarchy detection for subcategory discovery
- Parse HTML heading structure to detect h2 parents with h3 subcategories
- Append detected hierarchy to Claude prompt as explicit hint
- Post-process Claude response to enforce hierarchy even if Claude returns flat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 22:36:36 -08:00
John Mizerek
495b03c76d Add subcategory detection to wizard URL analyzer and display
- analyzeMenuUrl.cfm: Detect subcategories from Toast subgroups and
  Claude API responses, preserve hierarchy with parentCategoryName
- setup-wizard.html: Display subcategories indented under parents
  throughout wizard flow (categories step, items review, summary, preview)
- menu-builder.html: Show subcategories nested in outline modal view

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 22:08:59 -08:00
John Mizerek
3ccc82c9f2 Add subcategory support (2 levels deep)
- saveCategory.cfm: Accept ParentCategoryID, enforce max 2-level nesting
- items.cfm: Include ParentCategoryID on virtual category rows for Android
- getForBuilder.cfm: Return ParentCategoryID in builder API response
- saveFromBuilder.cfm: Persist ParentCategoryID on save, track JS-to-DB id mapping
- saveWizard.cfm: Two-pass category creation (parents first, then subcategories)
- menu-builder.html: Parent category dropdown in properties, visual nesting in canvas,
  add subcategory button, renderItemCard() extracted for reuse

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 21:29:40 -08:00
John Mizerek
e02e124610 Increase max_tokens to 16384 for menu URL analysis
Large menus (20+ categories) were getting truncated JSON responses
at 8192 tokens, causing parse failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 16:17:13 -08:00
John Mizerek
b360284e56 Beacon delete fix, price extraction, tax rate lookup, add modifiers form 2026-02-14 19:17:48 -08:00
John Mizerek
3cd7bbb8b7 Fix tax rate lookup and add price extraction from __OO_STATE__
- Tax rate: Use Zippopotam (free, no key) to get state, then lookup
  from built-in state+local rate tables instead of API Ninjas
- Prices: Extract prices from Toast __OO_STATE__ MenuItem objects
  when visible HTML prices are missing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 12:05:24 -08:00
John Mizerek
76f089d1b9 Auto-populate tax rate from ZIP code using API Ninjas
When business info step loads with a ZIP code, automatically looks up
the combined sales tax rate and pre-fills the field. User can still
edit if needed. Field gets light green background to indicate auto-fill.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 11:52:56 -08:00
John Mizerek
26e5d92a03 Improve image analysis prompt - be more explicit about extracting all visible business info 2026-02-13 10:54:11 -08:00
John Mizerek
abf6965614 Image data overwrites HTML-extracted data (more reliable) 2026-02-13 10:53:48 -08:00
John Mizerek
1432d8e2b8 Use ## to escape hash in CFML string 2026-02-13 10:47:12 -08:00
John Mizerek
ba017348b0 Fix CFML syntax error - escape # in string 2026-02-13 10:46:32 -08:00
John Mizerek
aa447bd009 Fix extractDir path detection for ZIP scanning
- Extract UUID folder path from URL instead of using getDirectoryFromPath
- Old logic was broken: listLast on path ending with / returned empty string
- This caused the code to go up one level too far

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 10:32:18 -08:00
John Mizerek
f9bfbc8960 Analyze images in ZIP for business info
- Scan extracted ZIP for image files (jpg, png, gif, webp)
- Skip small files (<10KB, likely icons) and _files folder assets
- Send up to 3 images to Claude for business info extraction
- Merge extracted name, address, phone, hours, brandColor
- Only fills in fields not already found from HTML

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 10:16:18 -08:00
John Mizerek
cf34636879 Scan all HTML files in ZIP for business info
- Extract directory and scan all .htm/.html files recursively
- Look for business name in title tags (skip generic titles)
- Extract street addresses with regex patterns
- Extract phone numbers
- Check __OO_STATE__ in other pages for Restaurant data
- Merge found info into toastBusiness (first found wins)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 10:13:13 -08:00
John Mizerek
90ed78fa96 Fix: Extract categories from __OO_STATE__ groups
The __OO_STATE__ parsing was only extracting images, not the group names
as categories. Now extracts category names from menu.groups and maps
items to their proper categories.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 10:06:43 -08:00
John Mizerek
09e5807c94 Fix: Add default 'Menu' category when no categories found
Toast extraction was finding items but no h2.groupHeader categories,
leaving items ungrouped. showItemsStep() then rendered no checkboxes,
and confirmItems() filtered out all items (empty checkedIds set).

Now adds a default "Menu" category when items exist but categories is empty.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:35:50 -08:00
John Mizerek
b081e72347 Improve business info extraction from saved Toast pages
Added multiple fallback methods to extract business name:
1. Title tag with Toast-specific parsing
2. og:title and og:site_name meta tags
3. Header elements with restaurant/location classes
4. First h1 tag as last resort

Also added address and phone extraction from visible HTML.
Added summary logging of business info keys found.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:26:37 -08:00
John Mizerek
eec44011f4 Add more debug logging for title and OO_STATE extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:21:47 -08:00
John Mizerek
e8dfd0ba7d Add debug logging for OO_STATE keys and title tag fallback
- Log all top-level keys in __OO_STATE__ to diagnose why Restaurant
  key isn't being found
- Extract business name from HTML title tag as fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:19:32 -08:00
John Mizerek
5c49054e78 Extract business info from Toast __OO_STATE__ JSON
Look for Restaurant: keys and extract name, location (address, city,
state, zip), phone, and brandColor for the wizard business info step.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 09:12:24 -08:00
John Mizerek
c5b678ac05 Fix basePath undefined error for local temp file parsing
Define basePath before Toast parsing block so image URLs can be
properly constructed for local file uploads.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:54:10 -08:00
John Mizerek
d8e6f619ac Parse Toast menu from visible HTML for complete item extraction
- Extract items from visible HTML instead of just __OO_STATE__ JSON
- Parse headerText spans for item names, price spans for prices
- Extract images from Menu_files/ src attributes
- Fall back to simpler headerText matching if block parsing fails
- Also extract images from __OO_STATE__ and match to items by name
- Fixes issue where only 116 items extracted instead of 163+

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:53:25 -08:00
John Mizerek
b5abbe43b4 Add direct Toast menu parsing via __OO_STATE__
Skip Claude AI for Toast menus - parse the embedded JSON directly.
This extracts all items, categories, and images from the structured
__OO_STATE__ data, which is faster and more complete than AI extraction.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:34:09 -08:00
John Mizerek
1b16dd8671 Fix imageUrl field handling in menu extraction
Claude returns imageUrl but code only checked for images/imageSrc.
Add handling for imageUrl field to properly match images to items.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:31:24 -08:00
John Mizerek
5cde8ce4fa ZIP upload: add file sanitization, direct file read, and temp cleanup
- uploadSavedPage.cfm: sanitize extracted files (whitelist safe extensions,
  delete symlinks) to protect against malicious content from infected sites
- analyzeMenuUrl.cfm: detect local temp URLs and read directly from disk,
  bypassing Playwright for faster processing of saved pages
- saveWizard.cfm: delete temp folder immediately after wizard completes
  instead of waiting for 1-hour auto-cleanup
- setup-wizard.html: track temp folder ID and pass to saveWizard for cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 08:02:41 -08:00
John Mizerek
336aef8685 Fix HTTPS detection and file permissions for ZIP upload
- Check X-Forwarded-Proto header for HTTPS (reverse proxy)
- chmod extracted files to be world-readable for Playwright

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:43:28 -08:00
John Mizerek
ddaac523bf Add auto-cleanup of old temp extractions (>1 hour)
Security: Also added nginx rule on dev server to block CFM/PHP
execution in /temp/menu-import/ directory.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:40:10 -08:00
John Mizerek
093a3b8bce Fix struct-to-string comparison in uploadSavedPage.cfm
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:37:19 -08:00
John Mizerek
06ca5462c2 Read images from disk for local ZIP uploads
When scanning extracted ZIP content from /temp/menu-import/, read
images directly from the filesystem instead of re-downloading via HTTP.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:07:33 -08:00
John Mizerek
8aeca335fd Add ZIP upload for saved webpage import
For Cloudflare-protected sites, users can now:
1. Save the page from their browser (Webpage, Complete)
2. ZIP the HTML and assets folder
3. Upload the ZIP in the wizard
4. Server extracts to temp folder, Playwright scans local copy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-13 07:02:51 -08:00
John Mizerek
1438267af6 Use wrapper script for Playwright to set browser path 2026-02-12 21:54:02 -08:00
John Mizerek
5c50ce2cf9 Use Playwright for JS-rendered menu scraping
- Replace cfhttp with Playwright headless browser
- Capture images from network requests during page render
- No longer needs to fetch subpages (JS renders everything)
- Should capture subcategory items that load dynamically

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 21:43:37 -08:00
John Mizerek
22fc113461 Add NOW() for AddedOn in business INSERT
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 21:24:28 -08:00
John Mizerek
a32614be17 Restore CommunityMealType column in business INSERT (added column to dev DB)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 21:06:28 -08:00
John Mizerek
f8afbb57e9 Add bulk item image upload: accept all sizes, pick best, upload after save 2026-02-12 20:56:27 -08:00
John Mizerek
4471ddc92b Add automatic image downloading from URLs during menu import 2026-02-12 20:48:17 -08:00
John Mizerek
dbe05a8b12 Update prompt to extract imageUrl from item containers in HTML 2026-02-12 20:36:16 -08:00
John Mizerek
a1b557cdc7 Look for embedded JSON data in menu pages 2026-02-12 20:22:35 -08:00
John Mizerek
361e54c17a Add debug: Beverages HTML snippet to see subcategory structure 2026-02-12 20:06:07 -08:00
John Mizerek
794d2ceee5 Add debug for menuGroup/menuSection structure detection 2026-02-12 20:01:43 -08:00
John Mizerek
bed088d0ff Explicit subcategory rule: outer section = parent, inner sections = subcats 2026-02-12 19:56:29 -08:00
John Mizerek
2163bb3009 Explicit subcategory detection with HTML structure example 2026-02-12 19:55:00 -08:00
John Mizerek
99c2a6aa10 Add HTML snippet debug to see actual structure 2026-02-12 19:50:56 -08:00
John Mizerek
549f3cb31f Explicit Toast subcategory instructions: parent in category, subcat in subcategory 2026-02-12 19:44:10 -08:00
John Mizerek
436861970e Add h4 tag debug to find subcategory tags 2026-02-12 19:40:51 -08:00
John Mizerek
3e9f07df1a Simplify: categories as strings, subcategory on items 2026-02-12 19:37:04 -08:00