Update prompt to extract imageUrl from item containers in HTML
This commit is contained in:
parent
112f343ecf
commit
dbe05a8b12
1 changed files with 1 additions and 1 deletions
|
|
@ -329,7 +329,7 @@
|
|||
<cfset arrayAppend(response.steps, "Found " & arrayLen(h3Texts) & " h3 and " & arrayLen(h4Texts) & " h4 tags")>
|
||||
|
||||
<!--- System prompt for URL analysis --->
|
||||
<cfset systemPrompt = "You are an expert at extracting structured menu data from restaurant website HTML. Extract ALL menu data visible in the HTML. Return valid JSON with these keys: business (object with name, address, phone, hours, brandColor), categories (array of category names), modifiers (array), items (array with name, description, price, category, subcategory, modifiers array, and imageUrl if found). CRITICAL: Extract EVERY menu item from ALL sources. IMPORTANT: Look for embedded JSON data in the HTML - this often contains the COMPLETE menu including items in collapsed/lazy-loaded sections. Check for __NEXT_DATA__, window state objects, JSON-LD schema, and data attributes. Combine items from both the visible HTML AND any embedded JSON. SUBCATEGORY RULE: If a section header has NO menu items directly below it but contains NESTED sections (each with their own header and items), then: the outer section is the PARENT CATEGORY, the inner sections are SUBCATEGORIES. For items in subcategories, set category to the PARENT name and subcategory to the inner section name. For brandColor: suggest a vibrant hex (6 digits, no hash). For prices: numbers (e.g., 12.99). Return ONLY valid JSON.">
|
||||
<cfset systemPrompt = "You are an expert at extracting structured menu data from restaurant website HTML. Extract ALL menu data visible in the HTML. Return valid JSON with these keys: business (object with name, address, phone, hours, brandColor), categories (array of category names), modifiers (array), items (array with name, description, price, category, subcategory, modifiers array, and imageUrl). CRITICAL FOR IMAGES: Each menu item in the HTML is typically in a container (div, li, article) that also contains an img tag. Extract the img src URL and include it as 'imageUrl' for that item. Look for img tags that are siblings or children within the same menu-item container. The image URL should be the full or relative src value from the img tag - NOT the alt text. CRITICAL: Extract EVERY menu item from ALL sources including embedded JSON (__NEXT_DATA__, window state, JSON-LD). SUBCATEGORY RULE: If a section header has NO menu items directly below it but contains NESTED sections, the outer section is the PARENT CATEGORY and inner sections are SUBCATEGORIES. For items in subcategories, set category to the PARENT name and subcategory to the inner section name. For brandColor: suggest a vibrant hex (6 digits, no hash). For prices: numbers (e.g., 12.99). Return ONLY valid JSON.">
|
||||
|
||||
<!--- Build message content --->
|
||||
<cfset messagesContent = arrayNew(1)>
|
||||
|
|
|
|||
Reference in a new issue