Fact Checking

Verify factual claims in text content using AI-powered analysis with evidence from authoritative sources. Get automated fact-checking results with confidence scores, supporting evidence, and detailed explanations.

Overview

Helix Fact Checking analyzes text content to extract factual claims, verify each claim against reliable sources, and provide a comprehensive assessment with supporting evidence. The system automatically handles claim extraction, source gathering, evidence collection, and result aggregation.

Key benefits:

  • Automated claim extraction: AI identifies verifiable factual statements
  • Multi-source verification: Cross-references claims against multiple sources
  • Evidence-based results: Provides specific evidence for each verification
  • Confidence scoring: Quantifies certainty in verification results
  • Flexible sourcing: Use provided URLs or automatic source discovery
  • Webhook notifications: Get notified when fact-checks complete

Quick Example

# Submit text for fact-checking
curl -X POST https://api.feeds.onhelix.ai/fact-check \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The Moon landing occurred in 1969. Neil Armstrong was the first person to walk on the Moon.",
    "sourceUrls": [
      "https://www.nasa.gov/mission_pages/apollo/missions/apollo11.html"
    ]
  }'

# Check results
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get detailed claims and evidence
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}/claims?includeEvidence=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
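
The same request body can be assembled and validated client-side before submission. The sketch below is illustrative: the field names and the 100-URL cap come from this page, while the helper name and validation behavior are assumptions.

```python
import json

MAX_SOURCE_URLS = 100  # documented per-request limit

def build_fact_check_payload(text, source_urls=None):
    """Build the JSON body for POST /fact-check, mirroring the curl example."""
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    source_urls = source_urls or []
    if len(source_urls) > MAX_SOURCE_URLS:
        raise ValueError(f"at most {MAX_SOURCE_URLS} source URLs per request")
    return json.dumps({"text": text, "sourceUrls": source_urls})

body = build_fact_check_payload(
    "The Moon landing occurred in 1969.",
    ["https://www.nasa.gov/mission_pages/apollo/missions/apollo11.html"],
)
```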

See the Quickstart Guide for a complete walkthrough.

How Fact Checking Works

Processing Pipeline

When you submit a fact-check request, the system processes it through four phases:

Phase 1: Claim Extraction

The system analyzes your text to extract individual factual claims:

What gets extracted:

  • Factual statements that can be verified
  • Historical facts, statistics, measurements
  • Attributions and quotes
  • Scientific claims and data points

What gets filtered out:

  • Opinions and subjective statements
  • Future predictions and speculation
  • Hypotheticals and conditional statements
  • Questions and rhetorical statements

Iterative refinement: The extraction process runs up to 5 iterations, with each iteration finding claims not identified in previous passes. This ensures comprehensive extraction while avoiding duplicates.

Example:

Input text:
"The Eiffel Tower was completed in 1889 and stands 324 meters tall.
Many people think it's beautiful, but some consider it an eyesore."

Extracted claims:
1. "The Eiffel Tower was completed in 1889" (factual, verifiable)
2. "The Eiffel Tower stands 324 meters tall" (factual, verifiable)

Filtered out:
- "Many people think it's beautiful" (opinion)
- "Some consider it an eyesore" (subjective)

Phase 2: Source Preparation

The system prepares sources for verification:

Using provided URLs:

  • Source documents are created from your provided URLs
  • Each URL is validated and accessible content is extracted
  • Content from accessible pages is stored for verification

Access restrictions: Sources behind paywalls, login requirements, or other technical restrictions are flagged but don't cause the fact-check to fail; the remaining accessible sources are used for verification.

Automatic fallback: If all provided sources are inaccessible, the system can still complete verification using alternative sources (when available).

Phase 3: Source Crawling

Sources that need content extraction are processed:

What happens:

  • Web pages are crawled to extract readable content
  • Content is cleaned and structured for analysis
  • Metadata is captured (crawl time, access status)

Parallel processing: All sources are crawled simultaneously for efficiency.

Failure handling: Individual source failures don't halt the entire process. Verification proceeds with successful sources.
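
The parallel-crawl-with-graceful-failure pattern described above can be sketched with a thread pool. The fetch callable here is a stand-in for the real crawler; per-source failures are recorded rather than raised:

```python
from concurrent.futures import ThreadPoolExecutor

def crawl_all(urls, fetch, max_workers=8):
    """Crawl every source in parallel; an individual failure is recorded,
    not fatal, so verification can proceed with the successful sources."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = {"status": "ok", "content": future.result()}
            except Exception as exc:  # record the failure and keep going
                results[url] = {"status": "failed", "error": str(exc)}
    return results
```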

Phase 4: Claim Verification

Each claim is verified against all available sources:

Verification process:

  1. Evidence extraction: System identifies relevant passages from sources
  2. Relevance scoring: Determines how relevant evidence is to the claim (0.0 to 1.0)
  3. Evidence classification: Categorizes as supporting, contradicting, or neutral
  4. Confidence assessment: Calculates confidence in the evidence (0.0 to 1.0)

Parallel verification: All claims are verified simultaneously for speed.

Result determination:

  • Verified: Strong supporting evidence, minimal contradiction
  • Disputed: Strong contradicting evidence
  • Unverified: Insufficient evidence found
  • Mixed: Both supporting and contradicting evidence

Final Determination

After all claims are verified, the system calculates an overall assessment:

Determination types:

Determination          Description                         When Applied
---------------------  ----------------------------------  -------------------------------------
mostly_accurate        Most claims verified/supported      ≥70% of claims verified or supported
mostly_inaccurate      Most claims disputed/contradicted   ≥50% of claims disputed
mixed_results          Mixed support and contradiction     Significant evidence both ways
insufficient_evidence  Not enough evidence                 <50% of claims verified, low evidence

Confidence score: An accompanying score (0.0 to 1.0) indicates certainty in the determination, based on:

  • Strength of evidence
  • Number of claims verified
  • Consistency across sources
  • Source reliability
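
A rough sketch of how the determination thresholds above could be applied to aggregated claim counts. How verifiedClaims and supportedClaims overlap is not documented, so summing them for the "verified or supported" share is an assumption, as is the order the rules are checked in:

```python
def final_determination(stats):
    """Apply the determination threshold table to aggregated claim counts.

    Assumption: verifiedClaims and supportedClaims are summed for the
    '>= 70% verified or supported' test; their exact overlap is not
    documented, and the rule order here is illustrative.
    """
    total = stats["totalClaims"]
    if total == 0:
        return "insufficient_evidence"
    verified_or_supported = (stats["verifiedClaims"] + stats["supportedClaims"]) / total
    disputed = stats["disputedClaims"] / total
    if verified_or_supported >= 0.7:
        return "mostly_accurate"
    if disputed >= 0.5:
        return "mostly_inaccurate"
    if stats["mixedClaims"] > 0 or (disputed > 0 and verified_or_supported > 0):
        return "mixed_results"
    return "insufficient_evidence"
```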

Fact-Check Object

Each fact-check request creates a persistent object with the following data:

Field               Type                      Description
------------------  ------------------------  --------------------------------------
id                  string (UUID)             Unique fact-check identifier
text                string                    Original text that was fact-checked
sourceUrls          string[]                  Provided source URLs (max 100)
status              string                    Current status (see below)
finalDetermination  string | null             Overall assessment
finalConfidence     number | null             Confidence in determination (0.0-1.0)
processingTimeMs    number | null             Time taken to complete (milliseconds)
retryCount          number                    Number of retry attempts
error               string | null             Error message if failed
createdAt           string (ISO 8601)         Creation timestamp
updatedAt           string (ISO 8601)         Last update timestamp
completedAt         string | null (ISO 8601)  Completion timestamp
aggregatedStats     object | null             Statistics summary
metadata            object | null             Additional metadata

Status Values

Status      Description
----------  ---------------------------------
pending     Just created, not yet processing
processing  Currently being processed
completed   Successfully completed
failed      Processing failed (after retries)
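
Since pending and processing are non-terminal, a client can poll until the fact-check reaches completed or failed. The sketch below injects the status fetcher (in practice, a GET on /fact-check/{factCheckId}) so the loop is testable; note that webhooks are the more efficient option when available:

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}

def wait_for_fact_check(fetch_status, poll_interval=5.0, timeout=300.0, sleep=time.sleep):
    """Poll until the fact-check reaches a terminal status.

    fetch_status is any callable returning the current fact-check object;
    in practice it would GET /fact-check/{factCheckId}.
    """
    deadline = time.monotonic() + timeout
    while True:
        fact_check = fetch_status()
        if fact_check["status"] in TERMINAL_STATUSES:
            return fact_check
        if time.monotonic() >= deadline:
            raise TimeoutError("fact-check did not reach a terminal status in time")
        sleep(poll_interval)
```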

Aggregated Statistics

When completed, the aggregatedStats field contains:

{
  "totalClaims": 5,
  "verifiedClaims": 4,
  "supportedClaims": 3,
  "disputedClaims": 1,
  "unverifiedClaims": 1,
  "mixedClaims": 0,
  "averageConfidence": 0.85,
  "totalSources": 3,
  "successfulSources": 3,
  "restrictedSources": 0,
  "failedSources": 0
}

Claims

Each extracted claim includes:

Field               Type               Description
------------------  -----------------  ----------------------------------------------
id                  string (UUID)      Unique claim identifier
claimText           string             The factual claim statement
claimContext        string | null      Additional context
claimType           string             Type of claim (factual, statistical, etc.)
confidence          number             Extraction confidence (0.0-1.0)
textFragment        string | null      W3C text fragment for source linking
verificationResult  string             Verification outcome
evidenceConfidence  number | null      Confidence in verification (0.0-1.0)
evidenceBalance     number             Balance of support/contradiction (-1.0 to 1.0)
createdAt           string (ISO 8601)  Extraction timestamp
updatedAt           string (ISO 8601)  Last update timestamp

Verification Results

Result      Description                     Evidence Balance
----------  ------------------------------  ------------------------------
verified    Supported by strong evidence    Positive (0.6 to 1.0)
disputed    Contradicted by evidence        Negative (-1.0 to -0.6)
unverified  Insufficient evidence           Near zero (-0.2 to 0.2)
mixed       Both support and contradiction  -0.6 to 0.6, but with evidence
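
These bands can be expressed as a small classifier. Because the unverified and mixed ranges overlap, this sketch separates them with an explicit has_evidence flag, which is an assumption rather than documented behavior:

```python
def classify_result(evidence_balance, has_evidence):
    """Map an evidenceBalance in [-1.0, 1.0] to a verification result.

    The bands follow the documented ranges; resolving the overlap between
    'unverified' and 'mixed' via has_evidence is an assumption.
    """
    if evidence_balance >= 0.6:
        return "verified"
    if evidence_balance <= -0.6:
        return "disputed"
    if -0.2 <= evidence_balance <= 0.2 and not has_evidence:
        return "unverified"
    return "mixed"
```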

Evidence

Each piece of evidence includes:

Field             Type               Description
----------------  -----------------  -------------------------------------
id                string (UUID)      Unique evidence identifier
claimId           string (UUID)      Related claim
sourceDocumentId  string (UUID)      Source document
sourceUrl         string             Source URL
evidenceText      string             Relevant text passage
evidenceContext   string | null      Surrounding context
textFragment      string | null      W3C text fragment
extractedText     string | null      Actual extracted text
evidenceType      string             supporting, contradicting, or neutral
relevanceScore    number             Relevance to claim (0.0-1.0)
confidenceScore   number             Confidence in evidence (0.0-1.0)
explanation       string | null      Why this evidence matters
metadata          object | null      Additional data
createdAt         string (ISO 8601)  Discovery timestamp
updatedAt         string (ISO 8601)  Last update timestamp

Evidence Types

Type           Description                                        Impact
-------------  -------------------------------------------------  ---------------------------------
supporting     Evidence that supports the claim                   Increases verification confidence
contradicting  Evidence that contradicts the claim                Leads to disputed result
neutral        Evidence neither clearly supports nor contradicts  Minimal impact on verification
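
One plausible way to fold individual evidence items into a claim's evidenceBalance is a weighted average over evidence types. The weighting by relevanceScore × confidenceScore is an assumption of this sketch, not the documented aggregation:

```python
def evidence_balance(evidence_items):
    """Fold evidence items into a balance score in [-1.0, 1.0].

    Assumption: each item is weighted by relevanceScore * confidenceScore,
    and neutral evidence pulls the balance toward zero. The real
    aggregation may differ.
    """
    signs = {"supporting": 1.0, "contradicting": -1.0, "neutral": 0.0}
    weighted = 0.0
    total_weight = 0.0
    for item in evidence_items:
        weight = item["relevanceScore"] * item["confidenceScore"]
        weighted += signs[item["evidenceType"]] * weight
        total_weight += weight
    return weighted / total_weight if total_weight else 0.0
```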

Source Documents

Each source document tracks:

Field              Type                  Description
-----------------  --------------------  -------------------------------
id                 string (UUID)         Unique document identifier
factCheckId        string (UUID)         Parent fact-check
sourceUrl          string                Document URL
sitePageVersionId  string | null (UUID)  Internal page version reference
metadata           object | null         Crawl status and details
createdAt          string (ISO 8601)     Creation timestamp
updatedAt          string (ISO 8601)     Last update timestamp

Source Metadata

{
  "hasAccessRestriction": false,
  "source": "crawl",
  "crawledAt": "2024-01-15T10:30:20.000Z",
  "error": null,
  "failedAt": null
}

Access restrictions: Sources may be flagged with hasAccessRestriction: true if they're behind paywalls, require login, or have other access limitations. The system handles this gracefully and uses available sources.

Use Cases

Content Verification

Verify articles, blog posts, or social media content by submitting text along with reference URLs for fact-checking.

Editorial Assistance

Help editors verify factual claims before publication by checking draft content and flagging claims that need review based on confidence thresholds.

Research Validation

Validate research findings and citations:

# Verify research claims against academic sources
curl -X POST https://api.feeds.onhelix.ai/fact-check \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Recent studies show that regular exercise reduces the risk of heart disease by 30-40%.",
    "sourceUrls": [
      "https://www.heart.org/research/",
      "https://pubmed.ncbi.nlm.nih.gov/"
    ]
  }'

Automated Moderation

Flag potentially misleading content by automatically fact-checking user submissions and taking action based on accuracy determinations.

Processing Time

Typical processing times:

Text Length              Sources      Expected Time
-----------------------  -----------  --------------
Short (< 500 chars)      1-3 sources  30-60 seconds
Medium (500-2000 chars)  1-5 sources  1-2 minutes
Long (> 2000 chars)      5+ sources   2-5 minutes

Factors affecting time:

  • Number of factual claims extracted
  • Number of sources to process
  • Source accessibility and crawl time
  • Complexity of claims

Limits and Considerations

Request Limits

  • Maximum source URLs: 100 per request
  • Text length: No strict limit, but longer texts take longer to process

Processing Behavior

  • Automatic retries: System includes built-in retry logic for transient failures
  • Graceful degradation: Processing continues even if some sources fail
  • Access restrictions: Paywalled or login-required sources are handled gracefully

Best Practices

  • Provide quality sources: Include authoritative, accessible sources when possible
  • Use webhooks: More efficient than polling for results
  • Handle failures: Implement retry logic for failed fact-checks
  • Monitor confidence scores: Low confidence may warrant manual review
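
The last practice can be turned into a simple triage helper that flags claims for manual review. The threshold value and selection rules here are illustrative, not part of the API:

```python
REVIEW_CONFIDENCE_THRESHOLD = 0.6  # illustrative cutoff; tune for your workflow

def claims_needing_review(claims):
    """Flag claims that warrant manual review: anything not cleanly
    verified, or verified with low (or missing) evidence confidence."""
    return [
        claim for claim in claims
        if claim["verificationResult"] in ("disputed", "mixed", "unverified")
        or (claim.get("evidenceConfidence") or 0.0) < REVIEW_CONFIDENCE_THRESHOLD
    ]
```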

Retry Failed Fact-Checks

If a fact-check fails, you can retry it:

curl -X POST https://api.feeds.onhelix.ai/fact-check/{factCheckId}/retry \
  -H "Authorization: Bearer YOUR_API_KEY"

When to retry:

  • Transient network or service errors
  • Temporary source inaccessibility

Retry behavior:

  • Increments retry count
  • Resets status to pending
  • Starts fresh workflow
  • Maximum retries not enforced by API (you control retry logic)
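
Because the API leaves the retry cap to you, a client-side policy might look like this sketch. Both callables are injected stand-ins for GET /fact-check/{factCheckId} and POST /fact-check/{factCheckId}/retry:

```python
def retry_until_done(fetch, retry, max_retries=3):
    """Re-submit a failed fact-check until it leaves the failed state or
    the client-side cap is hit (the API itself enforces no maximum).

    fetch returns the current fact-check object; retry triggers the
    retry endpoint.
    """
    fact_check = fetch()
    while fact_check["status"] == "failed" and fact_check["retryCount"] < max_retries:
        retry()
        fact_check = fetch()
    return fact_check
```

Since a retry resets the status to pending, the returned object may still need polling (or a webhook) to reach a terminal state.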

Webhooks

Receive notifications when fact-checks complete or change status.

Available events:

  • fact_check.completed - Triggered when fact-check successfully completes
  • fact_check.failed - Triggered when fact-check processing fails
  • fact_check.status_changed - Triggered on status transitions during processing

When webhooks fire:

  • Fact-check completes successfully with results
  • Processing fails due to errors or invalid sources
  • Status changes during the verification workflow (pending → processing → completed/failed)

What you receive:

Each webhook includes fact-check details:

  • Fact-check ID and status
  • Original claim text
  • Verdict and confidence score (for completed checks)
  • Source information and relevance
  • Completion timestamp

Example webhook payload:

{
  "event": "fact_check.completed",
  "timestamp": "2025-11-08T12:34:56.789Z",
  "data": {
    "fact_check_id": "fc_1234567890",
    "status": "completed",
    "claim": "The original claim text",
    "verdict": "mostly_accurate",
    "confidence": 0.85,
    "completed_at": "2025-11-08T12:34:56.789Z"
  }
}
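
A receiving endpoint can dispatch on the event field. The handler below is a sketch against the example payload shape; verify the webhook's signature (see the webhooks documentation) before trusting any payload:

```python
def handle_webhook(payload):
    """Dispatch an incoming webhook on its event field."""
    event = payload.get("event")
    data = payload.get("data", {})
    if event == "fact_check.completed":
        return (f"fact-check {data['fact_check_id']} done: "
                f"{data['verdict']} (confidence {data['confidence']:.2f})")
    if event == "fact_check.failed":
        return f"fact-check {data['fact_check_id']} failed"
    if event == "fact_check.status_changed":
        return f"fact-check {data['fact_check_id']} is now {data['status']}"
    raise ValueError(f"unknown event: {event!r}")
```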

See the Fact-Checking Webhooks documentation for complete payload details, security verification, and setup instructions.

Next Steps