Fact Checking

Verify factual claims in text content using AI-powered analysis with evidence from authoritative sources. Get automated fact-checking results with confidence scores, supporting evidence, and detailed explanations.

Overview

Helix Fact Checking analyzes text content to extract factual claims, verify each claim against reliable sources, and provide a comprehensive assessment with supporting evidence. The system automatically handles claim extraction, source gathering, evidence collection, and result aggregation.

Key benefits:

  • Automated claim extraction: AI identifies verifiable factual statements
  • Multi-source verification: Cross-references claims against multiple sources
  • Evidence-based results: Provides specific evidence for each verification
  • Confidence scoring: Quantifies certainty in verification results
  • Flexible sourcing: Use provided URLs or automatic source discovery
  • Webhook notifications: Get notified when fact-checks complete

Quick Example

# Submit text for fact-checking
curl -X POST https://api.feeds.onhelix.ai/fact-check \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The Moon landing occurred in 1969. Neil Armstrong was the first person to walk on the Moon.",
    "sourceUrls": [
      "https://www.nasa.gov/mission_pages/apollo/missions/apollo11.html"
    ]
  }'

# Check results
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get detailed claims and evidence
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}/claims?includeEvidence=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
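
The same request body can be assembled and validated client-side before submission. The sketch below is illustrative: the field names and the 100-URL cap come from this page, while the helper name and validation behavior are assumptions.

```python
import json

MAX_SOURCE_URLS = 100  # documented per-request limit

def build_fact_check_payload(text, source_urls=None):
    """Build the JSON body for POST /fact-check, mirroring the curl example."""
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    source_urls = source_urls or []
    if len(source_urls) > MAX_SOURCE_URLS:
        raise ValueError(f"at most {MAX_SOURCE_URLS} source URLs per request")
    return json.dumps({"text": text, "sourceUrls": source_urls})

body = build_fact_check_payload(
    "The Moon landing occurred in 1969.",
    ["https://www.nasa.gov/mission_pages/apollo/missions/apollo11.html"],
)
```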

See the Quickstart Guide for a complete walkthrough.

How Fact Checking Works

Processing Pipeline

When you submit a fact-check request, the system processes it through four phases:

Phase 1: Claim Extraction

The system analyzes your text to extract individual factual claims:

What gets extracted:

  • Factual statements that can be verified
  • Historical facts, statistics, measurements
  • Attributions and quotes
  • Scientific claims and data points

What gets filtered out:

  • Opinions and subjective statements
  • Future predictions and speculation
  • Hypotheticals and conditional statements
  • Questions and rhetorical statements

Iterative refinement: The extraction process runs up to 5 iterations, with each iteration finding claims not identified in previous passes. This ensures comprehensive extraction while avoiding duplicates.

Example:

Input text:
"The Eiffel Tower was completed in 1889 and stands 324 meters tall.
Many people think it's beautiful, but some consider it an eyesore."

Extracted claims:
1. "The Eiffel Tower was completed in 1889" (factual, verifiable)
2. "The Eiffel Tower stands 324 meters tall" (factual, verifiable)

Filtered out:
- "Many people think it's beautiful" (opinion)
- "Some consider it an eyesore" (subjective)

Phase 2: Source Preparation

The system prepares sources for verification:

Using provided URLs:

  • Source documents are created from your provided URLs
  • Each URL is validated and accessible content is extracted
  • Content from accessible pages is stored for verification

Access restrictions: Sources behind paywalls, login requirements, or other technical restrictions are flagged but don't cause the fact-check to fail; the remaining accessible sources are used for verification.

Automatic fallback: If all provided sources are inaccessible, the system can still complete verification using alternative sources (when available).

Phase 3: Source Crawling

Sources that need content extraction are processed:

What happens:

  • Web pages are crawled to extract readable content
  • Content is cleaned and structured for analysis
  • Metadata is captured (crawl time, access status)

Parallel processing: All sources are crawled simultaneously for efficiency.

Failure handling: Individual source failures don't halt the entire process. Verification proceeds with successful sources.
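
The parallel-crawl-with-graceful-failure pattern described above can be sketched with a thread pool. The fetch callable here is a stand-in for the real crawler; per-source failures are recorded rather than raised:

```python
from concurrent.futures import ThreadPoolExecutor

def crawl_all(urls, fetch, max_workers=8):
    """Crawl every source in parallel; an individual failure is recorded,
    not fatal, so verification can proceed with the successful sources."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = {"status": "ok", "content": future.result()}
            except Exception as exc:  # record the failure and keep going
                results[url] = {"status": "failed", "error": str(exc)}
    return results
```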

Phase 4: Claim Verification

Each claim is verified against all available sources:

Verification process:

  1. Evidence extraction: System identifies relevant passages from sources
  2. Relevance scoring: Determines how relevant evidence is to the claim (0.0 to 1.0)
  3. Evidence classification: Categorizes as supporting, contradicting, or neutral
  4. Confidence assessment: Calculates confidence in the evidence (0.0 to 1.0)

Parallel verification: All claims are verified simultaneously for speed.

Result determination:

  • Verified: Strong supporting evidence, minimal contradiction
  • Disputed: Strong contradicting evidence
  • Unverified: Insufficient evidence found
  • Mixed: Both supporting and contradicting evidence

Final Determination

After all claims are verified, the system calculates an overall assessment:

Determination types:

Determination          Description                         When Applied
---------------------  ----------------------------------  -------------------------------------
mostly_accurate        Most claims verified/supported      ≥70% of claims verified or supported
mostly_inaccurate      Most claims disputed/contradicted   ≥50% of claims disputed
mixed_results          Mixed support and contradiction     Significant evidence both ways
insufficient_evidence  Not enough evidence                 <50% of claims verified, low evidence

Confidence score: An accompanying score (0.0 to 1.0) indicates certainty in the determination, based on:

  • Strength of evidence
  • Number of claims verified
  • Consistency across sources
  • Source reliability
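
A rough sketch of how the determination thresholds above could be applied to aggregated claim counts. How verifiedClaims and supportedClaims overlap is not documented, so summing them for the "verified or supported" share is an assumption, as is the order the rules are checked in:

```python
def final_determination(stats):
    """Apply the determination threshold table to aggregated claim counts.

    Assumption: verifiedClaims and supportedClaims are summed for the
    '>= 70% verified or supported' test; their exact overlap is not
    documented, and the rule order here is illustrative.
    """
    total = stats["totalClaims"]
    if total == 0:
        return "insufficient_evidence"
    verified_or_supported = (stats["verifiedClaims"] + stats["supportedClaims"]) / total
    disputed = stats["disputedClaims"] / total
    if verified_or_supported >= 0.7:
        return "mostly_accurate"
    if disputed >= 0.5:
        return "mostly_inaccurate"
    if stats["mixedClaims"] > 0 or (disputed > 0 and verified_or_supported > 0):
        return "mixed_results"
    return "insufficient_evidence"
```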

Fact-Check Object

Each fact-check request creates a persistent object with the following data:

Field               Type                      Description
------------------  ------------------------  --------------------------------------
id                  string (UUID)             Unique fact-check identifier
text                string                    Original text that was fact-checked
sourceUrls          string[]                  Provided source URLs (max 100)
status              string                    Current status (see below)
finalDetermination  string | null             Overall assessment
finalConfidence     number | null             Confidence in determination (0.0-1.0)
processingTimeMs    number | null             Time taken to complete (milliseconds)
retryCount          number                    Number of retry attempts
error               string | null             Error message if failed
createdAt           string (ISO 8601)         Creation timestamp
updatedAt           string (ISO 8601)         Last update timestamp
completedAt         string | null (ISO 8601)  Completion timestamp
aggregatedStats     object | null             Statistics summary
metadata            object | null             Additional metadata

Status Values

Status      Description
----------  ---------------------------------
pending     Just created, not yet processing
processing  Currently being processed
completed   Successfully completed
failed      Processing failed (after retries)
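
Since pending and processing are non-terminal, a client can poll until the fact-check reaches completed or failed. The sketch below injects the status fetcher (in practice, a GET on /fact-check/{factCheckId}) so the loop is testable; note that webhooks are the more efficient option when available:

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}

def wait_for_fact_check(fetch_status, poll_interval=5.0, timeout=300.0, sleep=time.sleep):
    """Poll until the fact-check reaches a terminal status.

    fetch_status is any callable returning the current fact-check object;
    in practice it would GET /fact-check/{factCheckId}.
    """
    deadline = time.monotonic() + timeout
    while True:
        fact_check = fetch_status()
        if fact_check["status"] in TERMINAL_STATUSES:
            return fact_check
        if time.monotonic() >= deadline:
            raise TimeoutError("fact-check did not reach a terminal status in time")
        sleep(poll_interval)
```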

Aggregated Statistics

When completed, the aggregatedStats field contains:

{
  "totalClaims": 5,
  "verifiedClaims": 4,
  "supportedClaims": 3,
  "disputedClaims": 1,
  "unverifiedClaims": 1,
  "mixedClaims": 0,
  "averageConfidence": 0.85,
  "totalSources": 3,
  "successfulSources": 3,
  "restrictedSources": 0,
  "failedSources": 0
}

Claims

Each extracted claim includes:

Field               Type               Description
------------------  -----------------  ----------------------------------------------
id                  string (UUID)      Unique claim identifier
claimText           string             The factual claim statement
claimContext        string | null      Additional context
claimType           string             Type of claim (factual, statistical, etc.)
confidence          number             Extraction confidence (0.0-1.0)
textFragment        string | null      W3C text fragment for source linking
verificationResult  string             Verification outcome
evidenceConfidence  number | null      Confidence in verification (0.0-1.0)
evidenceBalance     number             Balance of support/contradiction (-1.0 to 1.0)
createdAt           string (ISO 8601)  Extraction timestamp
updatedAt           string (ISO 8601)  Last update timestamp

Verification Results

Result      Description                     Evidence Balance
----------  ------------------------------  ------------------------------
verified    Supported by strong evidence    Positive (0.6 to 1.0)
disputed    Contradicted by evidence        Negative (-1.0 to -0.6)
unverified  Insufficient evidence           Near zero (-0.2 to 0.2)
mixed       Both support and contradiction  -0.6 to 0.6, but with evidence
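
These bands can be expressed as a small classifier. Because the unverified and mixed ranges overlap, this sketch separates them with an explicit has_evidence flag, which is an assumption rather than documented behavior:

```python
def classify_result(evidence_balance, has_evidence):
    """Map an evidenceBalance in [-1.0, 1.0] to a verification result.

    The bands follow the documented ranges; resolving the overlap between
    'unverified' and 'mixed' via has_evidence is an assumption.
    """
    if evidence_balance >= 0.6:
        return "verified"
    if evidence_balance <= -0.6:
        return "disputed"
    if -0.2 <= evidence_balance <= 0.2 and not has_evidence:
        return "unverified"
    return "mixed"
```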

Evidence

Each piece of evidence includes:

Field             Type               Description
----------------  -----------------  -------------------------------------
id                string (UUID)      Unique evidence identifier
claimId           string (UUID)      Related claim
sourceDocumentId  string (UUID)      Source document
sourceUrl         string             Source URL
evidenceText      string             Relevant text passage
evidenceContext   string | null      Surrounding context
textFragment      string | null      W3C text fragment
extractedText     string | null      Actual extracted text
evidenceType      string             supporting, contradicting, or neutral
relevanceScore    number             Relevance to claim (0.0-1.0)
confidenceScore   number             Confidence in evidence (0.0-1.0)
explanation       string | null      Why this evidence matters
metadata          object | null      Additional data
createdAt         string (ISO 8601)  Discovery timestamp
updatedAt         string (ISO 8601)  Last update timestamp

Evidence Types

Type           Description                                        Impact
-------------  -------------------------------------------------  ---------------------------------
supporting     Evidence that supports the claim                   Increases verification confidence
contradicting  Evidence that contradicts the claim                Leads to disputed result
neutral        Evidence neither clearly supports nor contradicts  Minimal impact on verification
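
One plausible way to fold individual evidence items into a claim's evidenceBalance is a weighted average over evidence types. The weighting by relevanceScore × confidenceScore is an assumption of this sketch, not the documented aggregation:

```python
def evidence_balance(evidence_items):
    """Fold evidence items into a balance score in [-1.0, 1.0].

    Assumption: each item is weighted by relevanceScore * confidenceScore,
    and neutral evidence pulls the balance toward zero. The real
    aggregation may differ.
    """
    signs = {"supporting": 1.0, "contradicting": -1.0, "neutral": 0.0}
    weighted = 0.0
    total_weight = 0.0
    for item in evidence_items:
        weight = item["relevanceScore"] * item["confidenceScore"]
        weighted += signs[item["evidenceType"]] * weight
        total_weight += weight
    return weighted / total_weight if total_weight else 0.0
```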

Source Documents

Each source document tracks:

Field              Type                  Description
-----------------  --------------------  -------------------------------
id                 string (UUID)         Unique document identifier
factCheckId        string (UUID)         Parent fact-check
sourceUrl          string                Document URL
sitePageVersionId  string | null (UUID)  Internal page version reference
metadata           object | null         Crawl status and details
createdAt          string (ISO 8601)     Creation timestamp
updatedAt          string (ISO 8601)     Last update timestamp

Source Metadata

{
  "hasAccessRestriction": false,
  "source": "crawl",
  "crawledAt": "2024-01-15T10:30:20.000Z",
  "error": null,
  "failedAt": null
}

Access restrictions: Sources may be flagged with hasAccessRestriction: true if they're behind paywalls, require login, or have other access limitations. The system handles this gracefully and uses available sources.

Use Cases

Content Verification

Verify articles, blog posts, or social media content by submitting text along with reference URLs for fact-checking.

Editorial Assistance

Help editors verify factual claims before publication by checking draft content and flagging claims that need review based on confidence thresholds.

Research Validation

Validate research findings and citations:

# Verify research claims against academic sources
curl -X POST https://api.feeds.onhelix.ai/fact-check \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Recent studies show that regular exercise reduces the risk of heart disease by 30-40%.",
    "sourceUrls": [
      "https://www.heart.org/research/",
      "https://pubmed.ncbi.nlm.nih.gov/"
    ]
  }'

Automated Moderation

Flag potentially misleading content by automatically fact-checking user submissions and taking action based on accuracy determinations.

Processing Time

Typical processing times:

Text Length              Sources      Expected Time
-----------------------  -----------  --------------
Short (< 500 chars)      1-3 sources  30-60 seconds
Medium (500-2000 chars)  1-5 sources  1-2 minutes
Long (> 2000 chars)      5+ sources   2-5 minutes

Factors affecting time:

  • Number of factual claims extracted
  • Number of sources to process
  • Source accessibility and crawl time
  • Complexity of claims

Limits and Considerations

Request Limits

  • Maximum source URLs: 100 per request
  • Text length: No strict limit, but longer texts take longer to process

Processing Behavior

  • Automatic retries: System includes built-in retry logic for transient failures
  • Graceful degradation: Processing continues even if some sources fail
  • Access restrictions: Paywalled or login-required sources are handled gracefully

Best Practices

  • Provide quality sources: Include authoritative, accessible sources when possible
  • Use webhooks: More efficient than polling for results
  • Handle failures: Implement retry logic for failed fact-checks
  • Monitor confidence scores: Low confidence may warrant manual review
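
The last practice can be turned into a simple triage helper that flags claims for manual review. The threshold value and selection rules here are illustrative, not part of the API:

```python
REVIEW_CONFIDENCE_THRESHOLD = 0.6  # illustrative cutoff; tune for your workflow

def claims_needing_review(claims):
    """Flag claims that warrant manual review: anything not cleanly
    verified, or verified with low (or missing) evidence confidence."""
    return [
        claim for claim in claims
        if claim["verificationResult"] in ("disputed", "mixed", "unverified")
        or (claim.get("evidenceConfidence") or 0.0) < REVIEW_CONFIDENCE_THRESHOLD
    ]
```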

Retry Failed Fact-Checks

If a fact-check fails, you can retry it:

curl -X POST https://api.feeds.onhelix.ai/fact-check/{factCheckId}/retry \
  -H "Authorization: Bearer YOUR_API_KEY"

When to retry:

  • Transient network or service errors
  • Temporary source inaccessibility

Retry behavior:

  • Increments retry count
  • Resets status to pending
  • Starts fresh workflow
  • Maximum retries not enforced by API (you control retry logic)
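
Because the API leaves the retry cap to you, a client-side policy might look like this sketch. Both callables are injected stand-ins for GET /fact-check/{factCheckId} and POST /fact-check/{factCheckId}/retry:

```python
def retry_until_done(fetch, retry, max_retries=3):
    """Re-submit a failed fact-check until it leaves the failed state or
    the client-side cap is hit (the API itself enforces no maximum).

    fetch returns the current fact-check object; retry triggers the
    retry endpoint.
    """
    fact_check = fetch()
    while fact_check["status"] == "failed" and fact_check["retryCount"] < max_retries:
        retry()
        fact_check = fetch()
    return fact_check
```

Since a retry resets the status to pending, the returned object may still need polling (or a webhook) to reach a terminal state.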

Webhooks

Receive notifications when fact-checks complete or change status.

Available events:

  • fact_check.completed - Triggered when fact-check successfully completes
  • fact_check.failed - Triggered when fact-check processing fails
  • fact_check.status_changed - Triggered on status transitions during processing

When webhooks fire:

  • Fact-check completes successfully with results
  • Processing fails due to errors or invalid sources
  • Status changes during the verification workflow (pending → processing → completed/failed)

What you receive:

Each webhook includes fact-check details:

  • Fact-check ID and status
  • Original claim text
  • Verdict and confidence score (for completed checks)
  • Source information and relevance
  • Completion timestamp

Example webhook payload:

{
  "event": "fact_check.completed",
  "timestamp": "2025-11-08T12:34:56.789Z",
  "data": {
    "fact_check_id": "fc_1234567890",
    "status": "completed",
    "claim": "The original claim text",
    "verdict": "mostly_accurate",
    "confidence": 0.85,
    "completed_at": "2025-11-08T12:34:56.789Z"
  }
}
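
A receiving endpoint can dispatch on the event field. The handler below is a sketch against the example payload shape; verify the webhook's signature (see the webhooks documentation) before trusting any payload:

```python
def handle_webhook(payload):
    """Dispatch an incoming webhook on its event field."""
    event = payload.get("event")
    data = payload.get("data", {})
    if event == "fact_check.completed":
        return (f"fact-check {data['fact_check_id']} done: "
                f"{data['verdict']} (confidence {data['confidence']:.2f})")
    if event == "fact_check.failed":
        return f"fact-check {data['fact_check_id']} failed"
    if event == "fact_check.status_changed":
        return f"fact-check {data['fact_check_id']} is now {data['status']}"
    raise ValueError(f"unknown event: {event!r}")
```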

See the Fact-Checking Webhooks documentation for complete payload details, security verification, and setup instructions.

Next Steps