Quickstart
Get started with Helix Parse in under 2 minutes.
Prerequisites
Before you begin, you'll need:
- A Helix API key (see Authentication Guide)
- A command-line terminal or API client
Parse a URL
Send a URL to extract structured content from any web page.
curl -X POST https://api.feeds.onhelix.ai/parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.bbc.com/news/technology-67988517"
  }'
Response:
{
  "success": true,
  "data": {
    "jobId": "a3e2f8d1-7c4b-4e9a-b6d5-1f8a2c3e4d5b",
    "hasPrimaryContent": true,
    "consumability": {
      "isConsumable": true,
      "reason": "Page contains a full news article with a clear headline, body text, and publication metadata"
    },
    "primaryContent": {
      "title": "Apple Vision Pro: First look at the mixed reality headset",
      "description": "Apple's long-awaited mixed reality headset offers a glimpse into the future of spatial computing, but at a steep price point.",
      "author": "James Clayton",
      "publisher": "BBC News",
      "publishedAt": "2024-01-18T14:23:00.000Z",
      "updatedAt": null,
      "isSponsored": false,
      "isDigest": false,
      "accessRestrictionType": null,
      "text": {
        "simplifiedHtml": "<p>Apple has officially launched the Vision Pro, its first major new product category in nearly a decade.</p><p>The mixed reality headset, priced at $3,499, blends digital content with the physical world using what Apple calls \"spatial computing\".</p>"
      },
      "video": null,
      "primaryImage": {
        "url": "https://ichef.bbci.co.uk/news/1024/cpsprodpb/vivo/live/images/2024/1/18/vision-pro-hands-on.jpg",
        "caption": "The Apple Vision Pro headset on display at Apple's Cupertino headquarters",
        "credit": "Getty Images"
      },
      "originallyPublished": null
    },
    "scrape": {
      "httpStatus": 200
    }
  }
}
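The same request can also be sent from code. What follows is a minimal Python sketch using only the standard library. The endpoint, headers, and body mirror the curl example above; the helper names `build_parse_request` and `parse_url` and the 60-second timeout are illustrative assumptions, not part of any Helix SDK.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
PARSE_ENDPOINT = "https://api.feeds.onhelix.ai/parse"


def build_parse_request(api_key: str, url: str) -> urllib.request.Request:
    """Build the URL-mode POST request (hypothetical helper, not an SDK call)."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        PARSE_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_url(api_key: str, url: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON envelope.

    The timeout value is an assumption; adjust it to your own needs.
    """
    request = build_parse_request(api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid key, `parse_url("YOUR_API_KEY", "https://www.bbc.com/news/technology-67988517")` should return an envelope shaped like the response shown above.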
What happens behind the scenes:
- A headless browser loads and renders the page
- AI extracts the structured content, including title, author, body text, and images
- The page is assessed for consumability to determine if it contains meaningful standalone content
Parse Raw HTML
If you already have the HTML, you can pass it directly instead of a URL.
curl -X POST https://api.feeds.onhelix.ai/parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><head><title>Product Launch Announcement</title></head><body><article><h1>Acme Corp Launches the Nova Series</h1><p>By Sarah Chen | March 5, 2025</p><p>Acme Corporation today announced the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers. The Nova 9000 switch delivers 51.2 Tbps of throughput while reducing power consumption by 40% compared to previous generations.</p><p>\"Enterprises need infrastructure that keeps pace with the demands of large-scale AI workloads,\" said CTO David Park.</p></article></body></html>",
    "title": "Product Launch Announcement"
  }'
Response:
{
  "success": true,
  "data": {
    "jobId": "b7d1c9e4-5a3f-4b8d-9e2c-6f7a8b9c0d1e",
    "hasPrimaryContent": true,
    "consumability": {
      "isConsumable": true,
      "reason": "Page contains a complete article about a product launch with clear structure and attribution"
    },
    "primaryContent": {
      "title": "Acme Corp Launches the Nova Series",
      "description": "Acme Corporation announces the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers.",
      "author": "Sarah Chen",
      "publisher": null,
      "publishedAt": "2025-03-05T00:00:00.000Z",
      "updatedAt": null,
      "isSponsored": false,
      "isDigest": false,
      "accessRestrictionType": null,
      "text": {
        "simplifiedHtml": "<p>Acme Corporation today announced the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers. The Nova 9000 switch delivers 51.2 Tbps of throughput while reducing power consumption by 40% compared to previous generations.</p><p>\"Enterprises need infrastructure that keeps pace with the demands of large-scale AI workloads,\" said CTO David Park.</p>"
      },
      "video": null,
      "primaryImage": null,
      "originallyPublished": null
    }
  }
}
When to use this mode:
- You have pre-rendered content from your own pipeline
- You are working with server-side rendered pages you've already fetched
- You are integrating Parse into a custom scraping workflow
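When Parse is called from a pipeline like the ones listed above, serializing the request body in code avoids hand-escaping the quotes inside the HTML, which the curl example has to do manually. The sketch below is one way to do that in Python. The helper name `build_html_payload` is hypothetical, and treating `title` as optional is an assumption: the curl example sends it, but this page does not say whether it is required.

```python
import json
from typing import Optional


def build_html_payload(html: str, title: Optional[str] = None) -> str:
    """Serialize an HTML-mode request body (hypothetical helper).

    json.dumps escapes double quotes and other special characters in the
    HTML, so the markup can be passed in unmodified.
    """
    payload = {"html": html}
    if title is not None:
        # "title" appears in the curl example; optionality is assumed here.
        payload["title"] = title
    return json.dumps(payload)


# Shortened version of the HTML from the curl example.
html = (
    "<html><body><article><h1>Acme Corp Launches the Nova Series</h1>"
    '<p>"Enterprises need infrastructure," said CTO David Park.</p>'
    "</article></body></html>"
)
body = build_html_payload(html, title="Product Launch Announcement")
```

The resulting `body` string can be sent as the POST data with the same `Authorization` and `Content-Type` headers used in the curl example.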
What You Get Back
All responses use the standard envelope: { "success": true, "data": { ... } }
The data object contains:
- hasPrimaryContent: boolean indicating whether the page had extractable content
- consumability: assessment of whether the page has meaningful standalone content
- primaryContent: the extracted content (title, author, text, images, etc.)
- scrape: HTTP metadata from the scrape (URL mode only)
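A caller will typically check these fields before using the content. The following Python sketch shows one possible way to do so. The function name `extract_article` is hypothetical, and two behaviors are assumptions rather than documented rules: requiring both `hasPrimaryContent` and `isConsumable`, and treating an envelope without `"success": true` as having nothing to return.

```python
from typing import Optional


def extract_article(envelope: dict) -> Optional[dict]:
    """Pull a few fields out of a Parse response, or return None.

    Assumption: the caller only wants pages that have primary content
    and are marked consumable.
    """
    if not envelope.get("success"):
        return None
    data = envelope.get("data") or {}
    consumability = data.get("consumability") or {}
    if not data.get("hasPrimaryContent") or not consumability.get("isConsumable"):
        return None
    content = data.get("primaryContent") or {}
    # "scrape" is only present in URL mode, so it may be missing here.
    scrape = data.get("scrape") or {}
    return {
        "title": content.get("title"),
        "author": content.get("author"),
        "html": (content.get("text") or {}).get("simplifiedHtml"),
        "http_status": scrape.get("httpStatus"),
    }
```

For a response from HTML mode, `http_status` comes back as `None` because the `scrape` object is absent.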
See the Overview for a deeper understanding of how Parse works, or the API Reference for complete field documentation.
Next Steps
- Overview: deeper understanding of how Parse works
- API Reference: complete endpoint documentation