Improve Claude prompt to distinguish categories from items

Added explicit guidance that categories are broad section headings (5-15 typical) and items are individual products (30-150 typical). Prevents Claude from treating each menu item as its own category.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 0bb2707904
commit 84985d98d8
1 changed file with 1 addition and 1 deletion
@@ -1864,7 +1864,7 @@
 </cfif>
 
 <!--- System prompt for URL analysis --->
-<cfset systemPrompt = "You are an expert at extracting structured menu data from restaurant website HTML. Extract ALL menu data visible in the HTML. Return valid JSON with these keys: business (object with name, address, phone, hours, brandColor), categories (array), modifiers (array), items (array with name, description, price, category, modifiers array, and imageUrl). CATEGORIES FORMAT: Each entry in the categories array can be either a simple string (for flat categories) OR an object with 'name' and optional 'subcategories' array. Example: [""Appetizers"", {""name"": ""Drinks"", ""subcategories"": [""Hot Drinks"", ""Cold Drinks""]}, ""Desserts""]. SUBCATEGORY DETECTION: If a section header contains nested titled sections beneath it (sub-headers with their own items), the outer section is the PARENT and inner sections are SUBCATEGORIES. For items in subcategories, set their 'category' field to the SUBCATEGORY name (not the parent). CRITICAL FOR IMAGES: Each menu item in the HTML is typically in a container (div, li, article) that also contains an img tag. Extract the img src URL and include it as 'imageUrl' for that item. Look for img tags that are siblings or children within the same menu-item container. The image URL should be the full or relative src value from the img tag - NOT the alt text. CRITICAL: Extract EVERY menu item from ALL sources including embedded JSON (__NEXT_DATA__, window state, JSON-LD). For brandColor: suggest a vibrant hex (6 digits, no hash). For prices: numbers (e.g., 12.99). Return ONLY valid JSON.">
+<cfset systemPrompt = "You are an expert at extracting structured menu data from restaurant website HTML. Extract ALL menu data visible in the HTML. Return valid JSON with these keys: business (object with name, address, phone, hours, brandColor), categories (array), modifiers (array), items (array with name, description, price, category, modifiers array, and imageUrl). CATEGORIES vs ITEMS (CRITICAL): A CATEGORY is a broad section heading that groups multiple items (e.g., 'Appetizers', 'Tacos', 'Drinks', 'Desserts'). An ITEM is an individual food or drink product with a name, description, and price. Do NOT create a category for each individual item. A typical restaurant has 5-15 categories and 30-150 items. If you find yourself creating more categories than items, you are wrong - those are items, not categories. Each item must have a 'category' field set to the category it belongs to. CATEGORIES FORMAT: Each entry in the categories array can be either a simple string (for flat categories) OR an object with 'name' and optional 'subcategories' array. Example: [""Appetizers"", {""name"": ""Drinks"", ""subcategories"": [""Hot Drinks"", ""Cold Drinks""]}, ""Desserts""]. SUBCATEGORY DETECTION: If a section header contains nested titled sections beneath it (sub-headers with their own items), the outer section is the PARENT and inner sections are SUBCATEGORIES. For items in subcategories, set their 'category' field to the SUBCATEGORY name (not the parent). CRITICAL FOR IMAGES: Each menu item in the HTML is typically in a container (div, li, article) that also contains an img tag. Extract the img src URL and include it as 'imageUrl' for that item. Look for img tags that are siblings or children within the same menu-item container. The image URL should be the full or relative src value from the img tag - NOT the alt text. CRITICAL: Extract EVERY menu item from ALL sources including embedded JSON (__NEXT_DATA__, window state, JSON-LD). For brandColor: suggest a vibrant hex (6 digits, no hash). For prices: numbers (e.g., 12.99). Return ONLY valid JSON.">
 
 <!--- Build message content --->
 <cfset messagesContent = arrayNew(1)>
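
The new prompt text gives Claude a rule of thumb: more categories than items means items were misclassified. The same rule could also be checked after the response is parsed. The following is a minimal sketch only, not part of this commit: the function name `categoriesLookLikeItems` and the sample `menu` struct are hypothetical, and it assumes the response has already been turned into a struct (for example with `deserializeJSON`) that follows the categories and items format described in the prompt.

```cfml
<cfscript>
// Hypothetical helper (not part of this commit): returns true when the parsed
// response has more categories (including subcategories) than items, which the
// prompt describes as a sign that items were treated as categories.
function categoriesLookLikeItems(required struct menu) {
    var categoryCount = 0;
    for (var entry in menu.categories) {
        // Each entry is either a plain string or a struct with 'name'
        // and an optional 'subcategories' array.
        categoryCount++;
        if (isStruct(entry) && structKeyExists(entry, "subcategories")) {
            categoryCount += arrayLen(entry.subcategories);
        }
    }
    return categoryCount > arrayLen(menu.items);
}

// Assumed sample data, shaped like the JSON the prompt asks for.
menu = {
    "categories": ["Tacos", {"name": "Drinks", "subcategories": ["Hot Drinks", "Cold Drinks"]}],
    "items": [{"name": "Carne Asada Taco", "price": 4.5, "category": "Tacos"}]
};

// Four categories against one item, so the check flags this response.
writeOutput(categoriesLookLikeItems(menu));
</cfscript>
```

A check like this only catches the extreme case named in the prompt. Responses that fall outside the typical 5-15 categories or 30-150 items may still be valid for unusually small or large menus, so those ranges are better treated as guidance than as hard limits.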