Quickstart

Get started with Helix Parse in under 2 minutes.

Prerequisites

Before you begin, you'll need:

A Helix API key (see Authentication Guide)
A command-line terminal or API client

Parse a URL

POST a page URL to the /parse endpoint to extract structured content from that page.

curl -X POST https://api.feeds.onhelix.ai/parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.bbc.com/news/technology-67988517"
  }'

Response:

{
  "success": true,
  "data": {
    "jobId": "a3e2f8d1-7c4b-4e9a-b6d5-1f8a2c3e4d5b",
    "hasPrimaryContent": true,
    "consumability": {
      "isConsumable": true,
      "reason": "Page contains a full news article with a clear headline, body text, and publication metadata"
    },
    "primaryContent": {
      "title": "Apple Vision Pro: First look at the mixed reality headset",
      "description": "Apple's long-awaited mixed reality headset offers a glimpse into the future of spatial computing, but at a steep price point.",
      "author": "James Clayton",
      "publisher": "BBC News",
      "publishedAt": "2024-01-18T14:23:00.000Z",
      "updatedAt": null,
      "isSponsored": false,
      "isDigest": false,
      "accessRestrictionType": null,
      "text": {
        "simplifiedHtml": "<p>Apple has officially launched the Vision Pro, its first major new product category in nearly a decade.</p><p>The mixed reality headset, priced at $3,499, blends digital content with the physical world using what Apple calls \"spatial computing\".</p>"
      },
      "video": null,
      "primaryImage": {
        "url": "https://ichef.bbci.co.uk/news/1024/cpsprodpb/vivo/live/images/2024/1/18/vision-pro-hands-on.jpg",
        "caption": "The Apple Vision Pro headset on display at Apple's Cupertino headquarters",
        "credit": "Getty Images"
      },
      "originallyPublished": null
    },
    "scrape": {
      "httpStatus": 200
    }
  }
}

What happens behind the scenes:

A headless browser loads and renders the page
AI extracts the structured content, including title, author, body text, and images
The page is assessed for consumability to determine if it contains meaningful standalone content
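
If you would rather call Parse from code than from curl, the request above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_parse_request` and `parse_url` and the 60-second timeout are assumptions made for illustration, while the endpoint, headers, and request body are copied from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
PARSE_ENDPOINT = "https://api.feeds.onhelix.ai/parse"


def build_parse_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that mirrors the curl example."""
    return urllib.request.Request(
        PARSE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_url(api_key: str, page_url: str, timeout: float = 60.0) -> dict:
    """Send a URL-mode request and return the decoded JSON envelope.

    The timeout is an assumed value, not a documented limit.
    """
    request = build_parse_request(api_key, {"url": page_url})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `parse_url("YOUR_API_KEY", "https://www.bbc.com/news/technology-67988517")` should return a dictionary shaped like the response shown above. Keeping request construction separate from sending makes it possible to inspect or log the request without touching the network.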

Parse Raw HTML

If you already have the HTML, you can pass it directly instead of a URL.

curl -X POST https://api.feeds.onhelix.ai/parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><head><title>Product Launch Announcement</title></head><body><article><h1>Acme Corp Launches the Nova Series</h1><p>By Sarah Chen | March 5, 2025</p><p>Acme Corporation today announced the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers. The Nova 9000 switch delivers 51.2 Tbps of throughput while reducing power consumption by 40% compared to previous generations.</p><p>\"Enterprises need infrastructure that keeps pace with the demands of large-scale AI workloads,\" said CTO David Park.</p></article></body></html>",
    "title": "Product Launch Announcement"
  }'

Response:

{
  "success": true,
  "data": {
    "jobId": "b7d1c9e4-5a3f-4b8d-9e2c-6f7a8b9c0d1e",
    "hasPrimaryContent": true,
    "consumability": {
      "isConsumable": true,
      "reason": "Page contains a complete article about a product launch with clear structure and attribution"
    },
    "primaryContent": {
      "title": "Acme Corp Launches the Nova Series",
      "description": "Acme Corporation announces the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers.",
      "author": "Sarah Chen",
      "publisher": null,
      "publishedAt": "2025-03-05T00:00:00.000Z",
      "updatedAt": null,
      "isSponsored": false,
      "isDigest": false,
      "accessRestrictionType": null,
      "text": {
        "simplifiedHtml": "<p>Acme Corporation today announced the Nova Series, a new line of enterprise networking hardware designed for AI-native data centers. The Nova 9000 switch delivers 51.2 Tbps of throughput while reducing power consumption by 40% compared to previous generations.</p><p>\"Enterprises need infrastructure that keeps pace with the demands of large-scale AI workloads,\" said CTO David Park.</p>"
      },
      "video": null,
      "primaryImage": null,
      "originallyPublished": null
    }
  }
}

When to use this mode:

You have pre-rendered content from your own pipeline
You are working with server-side rendered pages you've already fetched
You are integrating Parse into a custom scraping workflow
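
When the HTML comes from your own pipeline, building the request body in code avoids the manual quote escaping needed in the curl example. The sketch below is illustrative only: `build_html_payload` is a hypothetical helper, and treating `title` as optional is an assumption, since the example above always sends it.

```python
import json
from typing import Optional


def build_html_payload(html: str, title: Optional[str] = None) -> bytes:
    """Serialize a raw-HTML request body as UTF-8 encoded JSON.

    json.dumps escapes the double quotes inside the markup, so the HTML
    can be passed in exactly as it was fetched.
    """
    body = {"html": html}
    if title is not None:
        # "title" appears in the curl example; whether it is required
        # is not stated there, so it is treated as optional here.
        body["title"] = title
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    fetched = '<article><h1>Acme Corp Launches the Nova Series</h1><p>"Quoted," said the CTO.</p></article>'
    print(build_html_payload(fetched, title="Product Launch Announcement").decode("utf-8"))
```

The resulting bytes can be sent as the body of a POST to the same endpoint, with the same `Authorization` and `Content-Type` headers used in the curl example.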

What You Get Back

All responses use the standard envelope: { "success": true, "data": { ... } }

The data object contains:

hasPrimaryContent -- boolean indicating whether the page had extractable content
consumability -- assessment of whether the page has meaningful standalone content
primaryContent -- the extracted content (title, author, text, images, etc.)
scrape -- HTTP metadata from the scrape (URL mode only)
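
As a rough illustration of reading these fields, the sketch below reduces a decoded response to a few values. The function name `summarize_parse_result` and the choice to return `None` when `success` or `hasPrimaryContent` is false are assumptions. The shape of error responses is not shown in this guide, so the code reads every field defensively.

```python
from typing import Optional


def summarize_parse_result(envelope: dict) -> Optional[dict]:
    """Return a small summary of a Parse response, or None if nothing was extracted."""
    if not envelope.get("success"):
        return None
    data = envelope.get("data") or {}
    if not data.get("hasPrimaryContent"):
        return None

    content = data.get("primaryContent") or {}
    consumability = data.get("consumability") or {}
    scrape = data.get("scrape") or {}  # present in URL mode only

    return {
        "title": content.get("title"),
        "author": content.get("author"),
        "html": (content.get("text") or {}).get("simplifiedHtml"),
        "is_consumable": bool(consumability.get("isConsumable")),
        "http_status": scrape.get("httpStatus"),  # None in raw HTML mode
    }
```

Because `scrape` is only returned in URL mode, `http_status` is `None` for raw HTML requests, and callers should not treat its absence as a failure.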

See the Overview for a deeper understanding of how Parse works, or the API Reference for complete field documentation.

Next Steps

Overview: deeper understanding of how Parse works
API Reference: complete endpoint documentation
