Fact Checking
Verify factual claims in text content using AI-powered analysis with evidence from authoritative sources. Get automated fact-checking results with confidence scores, supporting evidence, and detailed explanations.
Overview
Helix Fact Checking analyzes text content to extract factual claims, verify each claim against reliable sources, and provide a comprehensive assessment with supporting evidence. The system automatically handles claim extraction, source gathering, evidence collection, and result aggregation.
Key benefits:
- Automated claim extraction: AI identifies verifiable factual statements
- Multi-source verification: Cross-references claims against multiple sources
- Evidence-based results: Provides specific evidence for each verification
- Confidence scoring: Quantifies certainty in verification results
- Flexible sourcing: Use provided URLs or automatic source discovery
- Webhook notifications: Get notified when fact-checks complete
Quick Example
# Submit text for fact-checking
curl -X POST https://api.feeds.onhelix.ai/fact-check \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "The Moon landing occurred in 1969. Neil Armstrong was the first person to walk on the Moon.",
"sourceUrls": [
"https://www.nasa.gov/mission_pages/apollo/missions/apollo11.html"
]
}'
# Check results
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}" \
-H "Authorization: Bearer YOUR_API_KEY"
# Get detailed claims and evidence
curl "https://api.feeds.onhelix.ai/fact-check/{factCheckId}/claims?includeEvidence=true" \
-H "Authorization: Bearer YOUR_API_KEY"
See the Quickstart Guide for a complete walkthrough.
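The submit-then-poll flow above can be sketched in Python. This is a minimal sketch, not an official client: the endpoint path comes from the examples above, while the response fields (`status`) follow the Fact-Check Object section of this document. The `fetch` callable is injected so you can wire in any HTTP library (and so the sketch is testable without a live API).

```python
import time
from typing import Callable

API_BASE = "https://api.feeds.onhelix.ai"

def poll_fact_check(fact_check_id: str,
                    fetch: Callable[[str], dict],
                    interval_s: float = 5.0,
                    timeout_s: float = 300.0) -> dict:
    """Poll GET /fact-check/{id} until the status is terminal.

    `fetch` performs the authenticated GET and returns the parsed JSON
    body; injecting it keeps this sketch independent of any HTTP library.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch(f"{API_BASE}/fact-check/{fact_check_id}")
        # "completed" and "failed" are the terminal statuses documented below.
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"fact-check {fact_check_id} did not finish in {timeout_s}s")
```

Webhooks (described later in this page) are preferable to polling in production; polling is mainly useful for scripts and quick experiments.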
How Fact Checking Works
Processing Pipeline
When you submit a fact-check request, the system processes it through four phases:
Phase 1: Claim Extraction
The system analyzes your text to extract individual factual claims:
What gets extracted:
- Factual statements that can be verified
- Historical facts, statistics, measurements
- Attributions and quotes
- Scientific claims and data points
What gets filtered out:
- Opinions and subjective statements
- Future predictions and speculation
- Hypotheticals and conditional statements
- Questions and rhetorical statements
Iterative refinement: The extraction process runs up to 5 iterations, with each iteration finding claims not identified in previous passes. This ensures comprehensive extraction while avoiding duplicates.
Example:
Input text:
"The Eiffel Tower was completed in 1889 and stands 324 meters tall.
Many people think it's beautiful, but some consider it an eyesore."
Extracted claims:
1. "The Eiffel Tower was completed in 1889" (factual, verifiable)
2. "The Eiffel Tower stands 324 meters tall" (factual, verifiable)
Filtered out:
- "Many people think it's beautiful" (opinion)
- "Some consider it an eyesore" (subjective)
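The iterative refinement described above (up to 5 passes, deduplicating across passes) can be sketched as a loop. The `extract` callable here is a hypothetical stand-in for the AI extraction step; the real system's interface is not documented, so only the loop-and-dedupe shape is meant to be illustrative.

```python
from typing import Callable, Iterable, List, Set

MAX_ITERATIONS = 5  # matches the refinement limit described above

def extract_all_claims(text: str,
                       extract: Callable[[str, Set[str]], Iterable[str]]) -> List[str]:
    """Run claim extraction iteratively, deduplicating across passes.

    `extract` stands in for the AI extraction step: given the text and the
    normalized claims already found, it returns candidate claims, some of
    which may repeat earlier finds.
    """
    claims: List[str] = []
    seen: Set[str] = set()
    for _ in range(MAX_ITERATIONS):
        new = [c for c in extract(text, seen)
               if c.strip().lower() not in seen]
        if not new:
            break  # no new claims found; extraction has converged
        for c in new:
            key = c.strip().lower()
            if key in seen:
                continue  # also dedupe within a single pass
            seen.add(key)
            claims.append(c)
    return claims
```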
Phase 2: Source Preparation
The system prepares sources for verification:
Using provided URLs:
- Source documents are created from your provided URLs
- Each URL is validated and accessible content is extracted
- Content from accessible pages is stored for verification
Access restrictions: Sources behind paywalls, login requirements, or with technical restrictions are flagged but don't cause the fact-check to fail. Other sources are used for verification.
Automatic fallback: If all provided sources are inaccessible, the system can still complete verification using alternative sources (when available).
Phase 3: Source Crawling
Sources that need content extraction are processed:
What happens:
- Web pages are crawled to extract readable content
- Content is cleaned and structured for analysis
- Metadata is captured (crawl time, access status)
Parallel processing: All sources are crawled simultaneously for efficiency.
Failure handling: Individual source failures don't halt the entire process. Verification proceeds with successful sources.
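The crawl behavior above (parallel fetches, individual failures tolerated) can be sketched with a thread pool. The `fetch` callable is a stand-in for the real crawler, which is not exposed by the API; the point is the failure-isolation pattern, not the crawling itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

def crawl_sources(urls: List[str],
                  fetch: Callable[[str], str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Crawl all sources in parallel; individual failures don't halt the run.

    Returns (successes, failures), each keyed by URL. `fetch` may raise
    on inaccessible pages; the error is recorded rather than propagated.
    """
    successes: Dict[str, str] = {}
    failures: Dict[str, str] = {}

    def crawl_one(url: str) -> None:
        try:
            successes[url] = fetch(url)
        except Exception as exc:  # a failed source is recorded, not fatal
            failures[url] = str(exc)

    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(crawl_one, urls))  # drain the iterator to run all tasks
    return successes, failures
```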
Phase 4: Claim Verification
Each claim is verified against all available sources:
Verification process:
- Evidence extraction: System identifies relevant passages from sources
- Relevance scoring: Determines how relevant evidence is to the claim (0.0 to 1.0)
- Evidence classification: Categorizes as supporting, contradicting, or neutral
- Confidence assessment: Calculates confidence in the evidence (0.0 to 1.0)
Parallel verification: All claims are verified simultaneously for speed.
Result determination:
- Verified: Strong supporting evidence, minimal contradiction
- Disputed: Strong contradicting evidence
- Unverified: Insufficient evidence found
- Mixed: Both supporting and contradicting evidence
Final Determination
After all claims are verified, the system calculates an overall assessment:
Determination types:
| Determination | Description | When Applied |
|---|---|---|
| mostly_accurate | Most claims verified/supported | ≥70% claims verified or supported |
| mostly_inaccurate | Most claims disputed/contradicted | ≥50% claims disputed |
| mixed_results | Mixed support and contradiction | Significant evidence both ways |
| insufficient_evidence | Not enough evidence | <50% claims verified, low evidence |
Confidence score: Accompanying score (0.0 to 1.0) indicates certainty in the determination based on:
- Strength of evidence
- Number of claims verified
- Consistency across sources
- Source reliability
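The determination thresholds in the table above can be expressed as a small decision function. This is a sketch of the documented rules only: the order of checks and the tie-breaking between `mixed_results` and `insufficient_evidence` are assumptions where the table leaves them open.

```python
def final_determination(total: int, verified: int, disputed: int, mixed: int) -> str:
    """Map claim counts to an overall determination.

    Thresholds mirror the determination table: >=70% verified/supported
    -> mostly_accurate, >=50% disputed -> mostly_inaccurate. Remaining
    tie-breaking is an assumption of this sketch.
    """
    if total == 0:
        return "insufficient_evidence"
    if verified / total >= 0.7:
        return "mostly_accurate"
    if disputed / total >= 0.5:
        return "mostly_inaccurate"
    if mixed > 0 or (verified and disputed):
        return "mixed_results"  # significant evidence both ways
    return "insufficient_evidence"
```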
Fact-Check Object
Each fact-check request creates a persistent object with the following data:
| Field | Type | Description |
|---|---|---|
| id | string (UUID) | Unique fact-check identifier |
| text | string | Original text that was fact-checked |
| sourceUrls | string[] | Provided source URLs (max 100) |
| status | string | Current status (see below) |
| finalDetermination | string or null | Overall assessment |
| finalConfidence | number or null | Confidence in determination (0.0-1.0) |
| processingTimeMs | number or null | Time taken to complete (milliseconds) |
| retryCount | number | Number of retry attempts |
| error | string or null | Error message if failed |
| createdAt | string (ISO 8601) | Creation timestamp |
| updatedAt | string (ISO 8601) | Last update timestamp |
| completedAt | string or null (ISO 8601) | Completion timestamp |
| aggregatedStats | object or null | Statistics summary |
| metadata | object or null | Additional metadata |
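A client can model the fields above with a small typed structure. This sketch covers only a subset of the fields; `from_json` ignores unknown keys so it keeps working if the API adds fields later.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional

@dataclass
class FactCheck:
    """Minimal typed view of the fact-check object fields listed above."""
    id: str
    text: str
    sourceUrls: List[str]
    status: str
    finalDetermination: Optional[str] = None
    finalConfidence: Optional[float] = None
    retryCount: int = 0

    @classmethod
    def from_json(cls, body: Dict[str, Any]) -> "FactCheck":
        # Drop keys this model doesn't know about, so extra API fields
        # (aggregatedStats, timestamps, ...) don't raise TypeError.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in body.items() if k in known})
```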
Status Values
| Status | Description |
|---|---|
| pending | Just created, not yet processing |
| processing | Currently being processed |
| completed | Successfully completed |
| failed | Processing failed (after retries) |
Aggregated Statistics
When completed, the aggregatedStats field contains:
{
"totalClaims": 5,
"verifiedClaims": 4,
"supportedClaims": 3,
"disputedClaims": 1,
"unverifiedClaims": 1,
"mixedClaims": 0,
"averageConfidence": 0.85,
"totalSources": 3,
"successfulSources": 3,
"restrictedSources": 0,
"failedSources": 0
}
Claims
Each extracted claim includes:
| Field | Type | Description |
|---|---|---|
| id | string (UUID) | Unique claim identifier |
| claimText | string | The factual claim statement |
| claimContext | string or null | Additional context |
| claimType | string | Type of claim (factual, statistical, etc.) |
| confidence | number | Extraction confidence (0.0-1.0) |
| textFragment | string or null | W3C text fragment for source linking |
| verificationResult | string | Verification outcome |
| evidenceConfidence | number or null | Confidence in verification (0.0-1.0) |
| evidenceBalance | number | Balance of support/contradiction (-1.0 to 1.0) |
| createdAt | string (ISO 8601) | Extraction timestamp |
| updatedAt | string (ISO 8601) | Last update timestamp |
Verification Results
| Result | Description | Evidence Balance |
|---|---|---|
| verified | Supported by strong evidence | Positive (0.6 to 1.0) |
| disputed | Contradicted by evidence | Negative (-1.0 to -0.6) |
| unverified | Insufficient evidence | Near zero (-0.2 to 0.2) |
| mixed | Both support and contradiction | Mixed (-0.6 to 0.6, but with evidence) |
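The evidenceBalance bands in the table above can be turned into a classifier. The verified/disputed cutoffs (±0.6) come straight from the table; using evidence presence to separate `mixed` from `unverified` in the middle band is an assumption based on the table's wording.

```python
def classify_result(evidence_balance: float, has_evidence: bool) -> str:
    """Map an evidenceBalance score (-1.0 to 1.0) to a verification result.

    Band edges follow the verification results table; the mixed-vs-
    unverified split on evidence presence is this sketch's assumption.
    """
    if evidence_balance >= 0.6:
        return "verified"
    if evidence_balance <= -0.6:
        return "disputed"
    if has_evidence:
        return "mixed"      # evidence exists but points both ways
    return "unverified"     # near-zero balance with insufficient evidence
```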
Evidence
Each piece of evidence includes:
| Field | Type | Description |
|---|---|---|
| id | string (UUID) | Unique evidence identifier |
| claimId | string (UUID) | Related claim |
| sourceDocumentId | string (UUID) | Source document |
| sourceUrl | string | Source URL |
| evidenceText | string | Relevant text passage |
| evidenceContext | string or null | Surrounding context |
| textFragment | string or null | W3C text fragment |
| extractedText | string or null | Actual extracted text |
| evidenceType | string | supporting, contradicting, or neutral |
| relevanceScore | number | Relevance to claim (0.0-1.0) |
| confidenceScore | number | Confidence in evidence (0.0-1.0) |
| explanation | string or null | Why this evidence matters |
| metadata | object or null | Additional data |
| createdAt | string (ISO 8601) | Discovery timestamp |
| updatedAt | string (ISO 8601) | Last update timestamp |
Evidence Types
| Type | Description | Impact |
|---|---|---|
| supporting | Evidence that supports the claim | Increases verification confidence |
| contradicting | Evidence that contradicts the claim | Leads to disputed result |
| neutral | Evidence neither clearly supports nor contradicts | Minimal impact on verification |
Source Documents
Each source document tracks:
| Field | Type | Description |
|---|---|---|
| id | string (UUID) | Unique document identifier |
| factCheckId | string (UUID) | Parent fact-check |
| sourceUrl | string | Document URL |
| sitePageVersionId | string or null (UUID) | Internal page version reference |
| metadata | object or null | Crawl status and details |
| createdAt | string (ISO 8601) | Creation timestamp |
| updatedAt | string (ISO 8601) | Last update timestamp |
Source Metadata
{
"hasAccessRestriction": false,
"source": "crawl",
"crawledAt": "2024-01-15T10:30:20.000Z",
"error": null,
"failedAt": null
}
Access restrictions: Sources may be flagged with hasAccessRestriction: true if they're behind paywalls, require login, or have other access limitations. The system handles this gracefully and uses available sources.
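A client can use the metadata shape above to separate usable sources from restricted or failed ones, mirroring the graceful handling described. This is a sketch over the documented `hasAccessRestriction` and `error` fields; the input is the list of source document objects returned by the API.

```python
from typing import Any, Dict, List

def usable_sources(docs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep source documents with no access restriction and no crawl error.

    Restricted or failed sources are skipped rather than treated as
    fatal, matching the behavior described above.
    """
    usable = []
    for doc in docs:
        meta = doc.get("metadata") or {}  # metadata may be null
        if meta.get("hasAccessRestriction") or meta.get("error"):
            continue
        usable.append(doc)
    return usable
```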
Use Cases
Content Verification
Verify articles, blog posts, or social media content by submitting text along with reference URLs for fact-checking.
Editorial Assistance
Help editors verify factual claims before publication by checking draft content and flagging claims that need review based on confidence thresholds.
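Flagging claims for editorial review can be sketched as a simple filter over the claim objects described earlier (using the `verificationResult` and `evidenceConfidence` fields). The 0.7 threshold is purely illustrative; tune it for your workflow.

```python
from typing import Any, Dict, List

REVIEW_THRESHOLD = 0.7  # illustrative cutoff, not an API value

def claims_needing_review(claims: List[Dict[str, Any]],
                          threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return claim texts whose verification warrants a human look.

    A claim is flagged if it was not verified, or if its verification
    confidence falls below the threshold.
    """
    flagged = []
    for claim in claims:
        confidence = claim.get("evidenceConfidence") or 0.0
        if claim.get("verificationResult") != "verified" or confidence < threshold:
            flagged.append(claim["claimText"])
    return flagged
```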
Research Validation
Validate research findings and citations:
# Verify research claims against academic sources
curl -X POST https://api.feeds.onhelix.ai/fact-check \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Recent studies show that regular exercise reduces the risk of heart disease by 30-40%.",
"sourceUrls": [
"https://www.heart.org/research/",
"https://pubmed.ncbi.nlm.nih.gov/"
]
}'
Automated Moderation
Flag potentially misleading content by automatically fact-checking user submissions and taking action based on accuracy determinations.
Processing Time
Typical processing times:
| Text Length | Sources | Expected Time |
|---|---|---|
| Short (< 500 chars) | 1-3 sources | 30-60 seconds |
| Medium (500-2000 chars) | 1-5 sources | 1-2 minutes |
| Long (> 2000 chars) | 5+ sources | 2-5 minutes |
Factors affecting time:
- Number of factual claims extracted
- Number of sources to process
- Source accessibility and crawl time
- Complexity of claims
Limits and Considerations
Request Limits
- Maximum source URLs: 100 per request
- Text length: No strict limit, but longer texts take longer to process
Processing Behavior
- Automatic retries: System includes built-in retry logic for transient failures
- Graceful degradation: Processing continues even if some sources fail
- Access restrictions: Paywalled or login-required sources are handled gracefully
Best Practices
- Provide quality sources: Include authoritative, accessible sources when possible
- Use webhooks: More efficient than polling for results
- Handle failures: Implement retry logic for failed fact-checks
- Monitor confidence scores: Low confidence may warrant manual review
Retry Failed Fact-Checks
If a fact-check fails, you can retry it:
curl -X POST https://api.feeds.onhelix.ai/fact-check/{factCheckId}/retry \
-H "Authorization: Bearer YOUR_API_KEY"
When to retry:
- Transient network or service errors
- Temporary source inaccessibility
Retry behavior:
- Increments retry count
- Resets status to pending
- Starts a fresh workflow
- Maximum retries are not enforced by the API (you control retry logic)
Webhooks
Receive notifications when fact-checks complete or change status.
Available events:
- fact_check.completed: Triggered when a fact-check successfully completes
- fact_check.failed: Triggered when fact-check processing fails
- fact_check.status_changed: Triggered on status transitions during processing
When webhooks send:
Webhooks fire when:
- Fact-check completes successfully with results
- Processing fails due to errors or invalid sources
- Status changes during the verification workflow (pending → processing → completed/failed)
What you receive:
Each webhook includes fact-check details:
- Fact-check ID and status
- Original claim text
- Verdict and confidence score (for completed checks)
- Source information and relevance
- Completion timestamp
Example webhook payload:
{
"event": "fact_check.completed",
"timestamp": "2025-11-08T12:34:56.789Z",
"data": {
"fact_check_id": "fc_1234567890",
"status": "completed",
"claim": "The original claim text",
"verdict": "mostly_accurate",
"confidence": 0.85,
"completed_at": "2025-11-08T12:34:56.789Z"
}
}
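A receiving endpoint can dispatch on the `event` field of the payload above. This sketch only parses and routes; the action labels it returns are illustrative, and real handlers should also verify the webhook signature as covered in the webhooks documentation.

```python
import json

def handle_webhook(raw_body: bytes) -> str:
    """Parse a fact-check webhook payload and dispatch on the event type.

    Field names follow the example payload above; return values are
    illustrative action labels, not API behavior.
    """
    payload = json.loads(raw_body)
    event = payload.get("event")
    data = payload.get("data", {})
    if event == "fact_check.completed":
        return (f"store result {data.get('fact_check_id')}: "
                f"{data.get('verdict')} ({data.get('confidence')})")
    if event == "fact_check.failed":
        return f"alert on failure {data.get('fact_check_id')}"
    # fact_check.status_changed and unknown events fall through
    return "ignore"
```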
See the Fact-Checking Webhooks documentation for complete payload details, security verification, and setup instructions.
Next Steps
- Quickstart Guide: Create your first fact-check in 5 minutes
- API Reference: Complete endpoint documentation
- Webhooks: Set up webhook notifications
- Authentication: API key best practices