This guide describes how to consume ExactEDI output in downstream systems, ETL pipelines, and analytics platforms.
Overview
ExactEDI provides three primary output formats for integration:
| Format | Command | Use Case |
|---|---|---|
| JSONL | parse | Segment-level streaming data for detailed analysis |
| Facts JSON | explain | PHI-safe transaction summaries for analytics and AI |
| SQLite | import | Persistent storage for historical analysis |
This guide focuses on JSONL and Facts JSON formats. For SQLite integration, see DATABASE_SCHEMA.md.
JSONL Output Format (parse command)
The parse command outputs line-delimited JSON (JSONL), with one JSON object per segment.
File Layout
Each line is a complete JSON object representing a single X12 segment. Files can be processed incrementally without loading the entire dataset into memory.
{"seg":"ISA","elem":[...],"pos":{...}}
{"seg":"GS","elem":[...],"pos":{...}}
{"seg":"ST","elem":[...],"pos":{...}}
...
{"seg":"SE","elem":[...],"pos":{...}}
{"seg":"GE","elem":[...],"pos":{...}}
{"seg":"IEA","elem":[...],"pos":{...}}
Record Structure
Each JSONL record contains three fields:
| Field | Type | Description |
|---|---|---|
| seg | string | Segment identifier (ISA, GS, ST, CLM, CLP, etc.) |
| elem | array | Array of element values (strings) in segment order |
| pos | object | Position metadata |
Position Metadata
The pos object provides location information for traceability:
| Field | Type | Description |
|---|---|---|
| idx | integer | Zero-indexed segment position in file |
| byte | integer | Byte offset of segment start |
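The byte offset lets a consumer jump from a JSONL record back to the raw segment in the source file. The following is a minimal sketch, not part of ExactEDI: `read_raw_segment` is a hypothetical helper, and it assumes the segment terminator is `~`. In a real file the terminator is the character that follows ISA16 and may differ.

```python
def read_raw_segment(x12_path, byte_offset, terminator=b"~", chunk_size=4096):
    """Return the raw bytes of the segment that starts at byte_offset.

    Assumes a single-byte terminator ends every segment; "~" is common
    but not guaranteed, so pass the real one if it differs.
    """
    with open(x12_path, "rb") as f:
        f.seek(byte_offset)
        buf = bytearray()
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return bytes(buf)  # Reached end of file without a terminator
            end = chunk.find(terminator)
            if end != -1:
                buf += chunk[:end]
                return bytes(buf)
            buf += chunk
```

Given a record with `"pos": {"byte": 2048, "idx": 42}`, calling `read_raw_segment("file.x12", 2048)` would return the original segment text for inspection or audit.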
Element Array
The elem array contains segment elements in their original order, preserving:
- Leading/trailing whitespace (for fixed-length fields like ISA)
- Empty elements (represented as empty strings)
- Component separators (embedded in element strings, not split)
Example ISA Segment:
{
  "seg": "ISA",
  "elem": [
    "00",
    "          ",
    "00",
    "          ",
    "ZZ",
    "PAYER835       ",
    "ZZ",
    "PROVIDER835    ",
    "230220",
    "1100",
    "^",
    "00501",
    "000000005",
    "0",
    "P",
    ":"
  ],
  "pos": {
    "byte": 0,
    "idx": 0
  }
}
Example CLM Segment (837P Claim):
{
  "seg": "CLM",
  "elem": [
    "CLM0001",
    "1500.00",
    "",
    "",
    "11:B:1",
    "Y",
    "A",
    "Y",
    "Y"
  ],
  "pos": {
    "byte": 2048,
    "idx": 42
  }
}
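Because composite elements such as CLM05 are left unsplit, consumers split them on the component separator. The sketch below assumes the separator is taken from ISA16 (the last entry of the ISA record's elem array, as in the ISA example above); `split_components` is a hypothetical helper name.

```python
def split_components(element, component_sep=":"):
    """Split a composite element such as CLM05 into its components."""
    return element.split(component_sep) if element else []

# Sample records shaped like the parse output shown above
isa = {"seg": "ISA", "elem": ["00"] * 15 + [":"]}
clm = {"seg": "CLM", "elem": ["CLM0001", "1500.00", "", "", "11:B:1"]}

component_sep = isa["elem"][15]  # ISA16
facility_code, qualifier, frequency = split_components(clm["elem"][4], component_sep)
print(facility_code, qualifier, frequency)
```

Reading the separator from the ISA record rather than hard-coding `:` keeps the logic correct for files that use a different delimiter.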
Streaming Characteristics
Memory Efficiency: JSONL output is streamed line-by-line. Files of any size can be processed with constant memory usage by reading one line at a time.
Deterministic Ordering: Segments appear in the exact order they occur in the source X12 file.
No Lookahead: Each segment is output immediately after parsing. No buffering or transaction grouping occurs.
Common Use Cases
Filter Specific Segments: Extract only CLM (claim header) segments for claim-level analysis.
Transaction Boundary Detection: Identify ST/SE pairs to group segments into transactions.
Element Extraction: Parse specific element positions for targeted data extraction (e.g., NPI from NM109).
Facts JSON Format (explain command)
The explain command outputs a single JSON object containing PHI-safe summaries of the entire file.
Schema Version
Current schema: 1.0.0
When present, the schema_version field in the output indicates the format version. Consumers should check this field, treat a missing field as 1.0.0, and reject or flag versions they do not recognize rather than parsing them blindly.
Root Object Structure
{
  "file": { /* file metadata */ },
  "counts": { /* envelope counts */ },
  "transactions": [ /* transaction summaries */ ],
  "validation": { /* error/warning summary */ }
}
File Metadata
{
  "file": {
    "filename": "enhanced_835.x12",
    "file_size": 663,
    "parse_timestamp": "2026-01-14T00:59:31Z"
  }
}
| Field | Type | Required | Description |
|---|---|---|---|
| filename | string | Yes | Original filename (no path) |
| file_size | integer | Yes | File size in bytes |
| parse_timestamp | string | Yes | ISO 8601 UTC timestamp when file was parsed |
Envelope Counts
{
  "counts": {
    "interchanges": 1,
    "groups": 1,
    "transactions": 1,
    "segments": 16
  }
}
| Field | Type | Required | Description |
|---|---|---|---|
| interchanges | integer | Yes | Number of ISA/IEA envelope pairs |
| groups | integer | Yes | Number of GS/GE envelope pairs |
| transactions | integer | Yes | Number of ST/SE transaction sets |
| segments | integer | Yes | Total segment count |
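One way to sanity-check a pipeline is to recompute these counts from the parse output and compare them with the explain output. The sketch below is illustrative only: `count_envelopes` and `count_mismatches` are hypothetical helpers, and it assumes that segments counts every segment including the envelope segments, which this guide does not state explicitly.

```python
import json
from collections import Counter

def count_envelopes(jsonl_lines):
    """Tally envelope and total segment counts from an iterable of JSONL lines."""
    tally = Counter()
    for line in jsonl_lines:
        if line.strip():
            tally[json.loads(line)["seg"]] += 1
    return {
        "interchanges": tally["ISA"],
        "groups": tally["GS"],
        "transactions": tally["ST"],
        "segments": sum(tally.values()),  # Assumption: envelopes are included
    }

def count_mismatches(jsonl_lines, facts):
    """Return {field: (observed, reported)} for every count that differs."""
    observed = count_envelopes(jsonl_lines)
    reported = facts["counts"]
    return {
        name: (value, reported.get(name))
        for name, value in observed.items()
        if reported.get(name) != value
    }
```

An empty result means the two outputs agree; any entry shows the recomputed value next to the reported one.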
Transaction Summaries
The transactions array contains one object per transaction set (ST/SE pair):
{
  "transactions": [
    {
      "type": "835",
      "claim_id": "CLM0001",
      "claim_status": "1",
      "group_control": "5",
      "interchange_control": "000000005",
      "transaction_control": "0005",
      "original_reference": "TRC0987654321",
      "payer_id": "MEGAPAYER123",
      "payer_name": "MEGA INSURANCE",
      "billing_provider_npi": "1112223333",
      "total_charge": 1250.0,
      "total_payment": 600.0,
      "patient_responsibility": 650.0,
      "service_line_count": 2,
      "procedure_codes": ["99213", "99214"]
    }
  ]
}
Common Transaction Fields
| Field | Type | Tx Types | Description |
|---|---|---|---|
| type | string | All | Transaction type: "837P", "837I", "837D", "835" |
| transaction_control | string | All | ST02/SE02 control number |
| group_control | string | All | GS06/GE02 control number |
| interchange_control | string | All | ISA13/IEA02 control number |
| claim_id | string | All | CLM01 (837) or CLP01 (835) |
| total_charge | number | All | Sum of billed amounts |
| total_payment | number | 835 | Sum of paid amounts (0.0 for 837) |
| patient_responsibility | number | 835 | Patient responsibility amount |
| service_line_count | integer | All | Number of service lines (SV1/SV2/SVC) |
837-Specific Fields
| Field | Type | Description |
|---|---|---|
| billing_provider_npi | string | NPI from 2010AA loop NM109 |
| payer_id | string | Payer identifier from 2010BB loop NM109 |
| payer_name | string | Payer name from 2010BB loop NM103 |
| place_of_service | string | CLM05-1 (837P) or CLM05 (837I) |
| diagnosis_codes | array[string] | ICD-10 codes from HI segments |
| procedure_codes | array[string] | CPT/HCPCS codes from SV1/SV2 segments |
| service_dates | array[string] | Date ranges from DTP segments (YYYYMMDD-YYYYMMDD) |
| original_reference | string | REF segment reference numbers |
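The service_dates strings usually need to become real dates before they can be filtered or joined. A minimal sketch, assuming the YYYYMMDD-YYYYMMDD layout from the table above; treating a value with no hyphen as a single-day range is an assumption, and `parse_service_range` is a hypothetical helper.

```python
from datetime import date, datetime

def parse_service_range(value):
    """Parse 'YYYYMMDD-YYYYMMDD' into a (start, end) pair of dates.

    A value with no hyphen is treated as a one-day range (assumption).
    """
    start_text, _, end_text = value.partition("-")
    start = datetime.strptime(start_text, "%Y%m%d").date()
    end = datetime.strptime(end_text, "%Y%m%d").date() if end_text else start
    return start, end

start, end = parse_service_range("20230220-20230222")
print((end - start).days + 1, "day(s) of service")
```

`strptime` raises ValueError on malformed input, so callers can catch that to flag unexpected formats instead of silently accepting them.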
835-Specific Fields
| Field | Type | Description |
|---|---|---|
| claim_status | string | CLP02 status code ("1" = processed as primary, "2" = processed as secondary, etc.) |
| original_reference | string | Patient control number or trace number |
| procedure_codes | array[string] | Procedure codes from SVC segments |
Validation Summary
{
  "validation": {
    "errors": 0,
    "warnings": 0
  }
}
| Field | Type | Required | Description |
|---|---|---|---|
| errors | integer | Yes | Count of validation errors |
| warnings | integer | Yes | Count of validation warnings |
When errors or warnings are present, a diagnostics array is added:
{
  "validation": {
    "errors": 1,
    "warnings": 0
  },
  "diagnostics": [
    "[ERROR] SEGMENT_COUNT_MISMATCH: SE01=11, actual=10 (segment #11, byte offset 486)"
  ]
}
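Diagnostics arrive as plain strings, so routing or counting them by code requires a little parsing. The sketch below assumes every entry follows the "[SEVERITY] CODE: message" layout of the single example above, which this guide does not guarantee; entries that do not match are returned unparsed. `parse_diagnostic` is a hypothetical helper.

```python
import re

# Assumed layout: "[SEVERITY] CODE: free-form message"
DIAGNOSTIC_RE = re.compile(r"^\[([A-Z]+)\]\s+([A-Z0-9_]+):\s*(.*)$")

def parse_diagnostic(text):
    """Split a diagnostic string into severity, code, and message."""
    match = DIAGNOSTIC_RE.match(text)
    if match is None:
        # Unknown layout: keep the raw text so nothing is lost
        return {"severity": None, "code": None, "message": text}
    severity, code, message = match.groups()
    return {"severity": severity, "code": code, "message": message}

sample = "[ERROR] SEGMENT_COUNT_MISMATCH: SE01=11, actual=10 (segment #11, byte offset 486)"
print(parse_diagnostic(sample)["code"])
```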
Optional Fields
The following fields appear only when relevant data is present:
- diagnosis_codes - Only for 837 transactions with HI segments
- procedure_codes - Only when SV1/SV2/SVC segments contain parseable codes
- service_dates - Only when DTP segments are present
- diagnostics - Only when errors or warnings exist
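Since these fields may be absent, and later schema versions may add new ones, a tolerant loader helps. The sketch below maps a transaction object onto a dataclass that supplies defaults for optional fields and drops keys it does not know. `TransactionSummary` is a hypothetical type covering only a subset of the documented fields.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional

@dataclass
class TransactionSummary:
    type: str
    transaction_control: str
    claim_id: Optional[str] = None
    total_charge: float = 0.0
    # Optional arrays default to empty lists when the key is absent
    diagnosis_codes: List[str] = field(default_factory=list)
    procedure_codes: List[str] = field(default_factory=list)
    service_dates: List[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw):
        """Build a summary, ignoring keys this class does not define."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})

tx = TransactionSummary.from_dict({
    "type": "835",
    "transaction_control": "0005",
    "procedure_codes": ["99213", "99214"],
    "some_future_field": True,  # Unknown key, silently ignored
})
print(tx.diagnosis_codes, tx.procedure_codes)
```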
PHI Safety
Facts JSON is designed to exclude direct patient identifiers:
Not Included:
- Patient names, dates of birth, addresses
- Medical record numbers (MRNs)
- Social Security Numbers
- Subscriber/member IDs
- Account numbers
- Free-text descriptions
Included (PHI-safe):
- Control numbers (ISA/GS/ST)
- Payer and provider NPIs
- Organization names (payers, providers)
- Service dates (not patient DOB)
- Diagnosis and procedure codes
- Monetary amounts
Versioning Considerations
The schema follows semantic versioning:
- Major version change (2.0.0): Breaking changes, field renames, structure changes
- Minor version change (1.1.0): New optional fields added
- Patch version change (1.0.1): Documentation updates, no schema changes
Backward Compatibility: Consumers should ignore unknown fields to remain compatible with minor and patch updates.
Version Detection:
Check the schema_version field (if present) or inspect the structure:
import json

# facts_json holds the explain output as a string;
# process_v1_facts is your own handler for the 1.x schema
facts = json.loads(facts_json)
schema_version = facts.get("schema_version", "1.0.0")
if schema_version.startswith("1."):
    # Compatible with 1.x schema
    process_v1_facts(facts)
else:
    # Unknown schema version
    raise ValueError(f"Unsupported schema version: {schema_version}")
Integration Examples
Python: Processing JSONL Output
Streaming Segment Parser:
import json
import sys

def process_jsonl(file_path):
    """Process JSONL output line-by-line with constant memory usage."""
    segment_counts = {}
    with open(file_path, 'r', encoding='utf-8') as f:
        for line_num, line in enumerate(f, 1):
            if not line.strip():
                continue  # Skip blank lines
            try:
                record = json.loads(line)
                seg_id = record['seg']
                segment_counts[seg_id] = segment_counts.get(seg_id, 0) + 1
                # Process specific segments
                if seg_id == 'CLM':
                    claim_id = record['elem'][0]       # CLM01
                    charge = float(record['elem'][1])  # CLM02
                    print(f"Claim {claim_id}: ${charge:.2f}")
                elif seg_id == 'CLP':
                    claim_id = record['elem'][0]        # CLP01
                    status = record['elem'][1]          # CLP02
                    payment = float(record['elem'][3])  # CLP04
                    print(f"Payment {claim_id}: ${payment:.2f} (status {status})")
            except json.JSONDecodeError as e:
                print(f"Error parsing line {line_num}: {e}", file=sys.stderr)
            except (KeyError, IndexError, ValueError) as e:
                print(f"Error processing line {line_num}: {e}", file=sys.stderr)
    print("\nSegment Counts:")
    for seg_id, count in sorted(segment_counts.items()):
        print(f"  {seg_id}: {count}")

# Usage
process_jsonl('claims.jsonl')
Extract Transactions:
import json

def extract_transactions(jsonl_path):
    """Group segments by transaction (ST/SE pairs)."""
    transactions = []
    current_tx = None
    with open(jsonl_path, 'r', encoding='utf-8') as f:
        for line in f:
            if not line.strip():
                continue  # Skip blank lines
            record = json.loads(line)
            seg_id = record['seg']
            if seg_id == 'ST':
                # Start new transaction
                current_tx = {
                    'type': record['elem'][0],     # ST01
                    'control': record['elem'][1],  # ST02
                    'segments': [record]
                }
            elif seg_id == 'SE':
                # End transaction
                if current_tx:
                    current_tx['segments'].append(record)
                    transactions.append(current_tx)
                    current_tx = None
            elif current_tx:
                # Add segment to current transaction
                current_tx['segments'].append(record)
    return transactions

# Usage
transactions = extract_transactions('claims.jsonl')
for tx in transactions:
    print(f"Transaction {tx['control']} (type {tx['type']}): {len(tx['segments'])} segments")
Filter and Transform:
import json

def extract_npis(jsonl_path, output_csv):
    """Extract provider NPIs from NM1 segments."""
    with open(jsonl_path, 'r', encoding='utf-8') as infile, \
         open(output_csv, 'w', encoding='utf-8') as outfile:
        outfile.write("segment_index,entity_type,npi\n")
        for line in infile:
            if not line.strip():
                continue  # Skip blank lines
            record = json.loads(line)
            if record['seg'] == 'NM1':
                elem = record['elem']
                # NM108 = ID code qualifier ('XX' = NPI), NM109 = ID code
                if len(elem) >= 9 and elem[7] == 'XX':
                    entity_type = elem[0]  # NM101
                    npi = elem[8]          # NM109
                    idx = record['pos']['idx']
                    outfile.write(f"{idx},{entity_type},{npi}\n")

# Usage
extract_npis('claims.jsonl', 'npis.csv')
Python: Processing Facts JSON
Basic Analysis:
import json

def analyze_facts(facts_path):
    """Analyze Facts JSON for summary statistics."""
    with open(facts_path, 'r', encoding='utf-8') as f:
        facts = json.load(f)
    print(f"File: {facts['file']['filename']}")
    print(f"Size: {facts['file']['file_size']:,} bytes")
    print(f"Parsed: {facts['file']['parse_timestamp']}")
    print()
    counts = facts['counts']
    print(f"Segments: {counts['segments']:,}")
    print(f"Transactions: {counts['transactions']}")
    print(f"Groups: {counts['groups']}")
    print(f"Interchanges: {counts['interchanges']}")
    print()
    validation = facts['validation']
    print(f"Errors: {validation['errors']}")
    print(f"Warnings: {validation['warnings']}")
    if 'diagnostics' in facts:
        print("\nDiagnostics:")
        for diag in facts['diagnostics']:
            print(f"  {diag}")
    print()
    # Analyze transactions
    total_charge = 0.0
    total_payment = 0.0
    for tx in facts.get('transactions', []):
        tx_type = tx['type']
        charge = tx.get('total_charge', 0.0)
        payment = tx.get('total_payment', 0.0)
        total_charge += charge
        total_payment += payment
        print(f"{tx_type} Transaction {tx['transaction_control']}:")
        print(f"  Claim ID: {tx.get('claim_id', 'N/A')}")
        print(f"  Charge: ${charge:,.2f}")
        print(f"  Payment: ${payment:,.2f}")
        if 'procedure_codes' in tx:
            print(f"  Procedures: {', '.join(tx['procedure_codes'])}")
        print()
    print(f"Totals: ${total_charge:,.2f} charged, ${total_payment:,.2f} paid")

# Usage
analyze_facts('facts.json')
ETL Pipeline Integration:
import json
import psycopg2

def load_facts_to_postgres(facts_path, db_conn):
    """Load Facts JSON into PostgreSQL data warehouse."""
    with open(facts_path, 'r', encoding='utf-8') as f:
        facts = json.load(f)
    # The connection context manager commits on success and rolls back on
    # error; the cursor context manager closes the cursor
    with db_conn, db_conn.cursor() as cursor:
        # Insert file record
        cursor.execute("""
            INSERT INTO edi_files (filename, file_size, parsed_at, segment_count, is_valid)
            VALUES (%s, %s, %s, %s, %s)
            RETURNING id
        """, (
            facts['file']['filename'],
            facts['file']['file_size'],
            facts['file']['parse_timestamp'],
            facts['counts']['segments'],
            facts['validation']['errors'] == 0
        ))
        file_id = cursor.fetchone()[0]
        # Insert transactions
        for tx in facts.get('transactions', []):
            cursor.execute("""
                INSERT INTO transactions (
                    file_id, tx_type, control_number, claim_id,
                    total_charge, total_payment, service_line_count
                )
                VALUES (%s, %s, %s, %s, %s, %s, %s)
            """, (
                file_id,
                tx['type'],
                tx['transaction_control'],
                tx.get('claim_id'),
                tx.get('total_charge', 0.0),
                tx.get('total_payment', 0.0),
                tx.get('service_line_count', 0)
            ))

# Usage
conn = psycopg2.connect("dbname=warehouse user=etl")
load_facts_to_postgres('facts.json', conn)
conn.close()
JavaScript/Node.js: Processing JSONL
Streaming Parser:
const fs = require('fs');
const readline = require('readline');

async function processJSONL(filePath) {
  const fileStream = fs.createReadStream(filePath);
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  const segmentCounts = {};
  let lineNum = 0;
  for await (const line of rl) {
    lineNum++;
    if (!line.trim()) continue; // Skip blank lines
    try {
      const record = JSON.parse(line);
      const segId = record.seg;
      segmentCounts[segId] = (segmentCounts[segId] || 0) + 1;
      // Process specific segments
      if (segId === 'CLM') {
        const claimId = record.elem[0];            // CLM01
        const charge = parseFloat(record.elem[1]); // CLM02
        console.log(`Claim ${claimId}: $${charge.toFixed(2)}`);
      }
    } catch (err) {
      console.error(`Error parsing line ${lineNum}: ${err.message}`);
    }
  }
  console.log('\nSegment Counts:');
  Object.entries(segmentCounts)
    .sort(([a], [b]) => a.localeCompare(b))
    .forEach(([seg, count]) => console.log(`  ${seg}: ${count}`));
}

// Usage
processJSONL('claims.jsonl').catch(console.error);
JavaScript/Node.js: Processing Facts JSON
Summary Report:
const fs = require('fs');

function analyzeFacts(factsPath) {
  const facts = JSON.parse(fs.readFileSync(factsPath, 'utf8'));
  console.log(`File: ${facts.file.filename}`);
  console.log(`Size: ${facts.file.file_size.toLocaleString()} bytes`);
  console.log(`Parsed: ${facts.file.parse_timestamp}\n`);
  console.log(`Segments: ${facts.counts.segments.toLocaleString()}`);
  console.log(`Transactions: ${facts.counts.transactions}`);
  console.log(`Validation: ${facts.validation.errors} errors, ${facts.validation.warnings} warnings\n`);
  // Aggregate by transaction type
  const byType = {};
  for (const tx of facts.transactions || []) {
    const type = tx.type;
    if (!byType[type]) {
      byType[type] = {
        count: 0,
        totalCharge: 0,
        totalPayment: 0
      };
    }
    byType[type].count++;
    byType[type].totalCharge += tx.total_charge || 0;
    byType[type].totalPayment += tx.total_payment || 0;
  }
  console.log('Transaction Summary:');
  for (const [type, stats] of Object.entries(byType)) {
    console.log(`  ${type}: ${stats.count} transactions`);
    console.log(`    Charges: $${stats.totalCharge.toLocaleString(undefined, {minimumFractionDigits: 2})}`);
    console.log(`    Payments: $${stats.totalPayment.toLocaleString(undefined, {minimumFractionDigits: 2})}`);
  }
}

// Usage
analyzeFacts('facts.json');
Command-Line Integration Patterns
Shell Pipeline Processing
Extract segment counts:
exactedi parse file.x12 | jq -r '.seg' | sort | uniq -c
Filter CLM segments:
exactedi parse file.x12 | jq 'select(.seg == "CLM")'
Convert JSONL to CSV:
exactedi parse file.x12 | \
jq -r '[.seg, .pos.idx, .pos.byte] | @csv' > segments.csv
Validate and extract facts in one pipeline:
exactedi validate file.x12 && \
exactedi explain file.x12 | jq '.transactions[] | {claim: .claim_id, charge: .total_charge}'
Batch Processing
Process directory of files:
#!/bin/bash
for file in /data/incoming/*.x12; do
  # Skip the literal pattern when the glob matches no files
  [ -e "$file" ] || continue
  base=$(basename "$file" .x12)
  # Parse to JSONL
  exactedi parse "$file" > "/data/parsed/${base}.jsonl"
  # Extract facts
  exactedi explain "$file" > "/data/facts/${base}.json"
  # Import to database
  exactedi import "$file" --db /data/analytics.db
done
Parallel processing with GNU parallel:
find /data/incoming -name "*.x12" | \
parallel -j 4 "exactedi explain {} > /data/facts/{/.}.json"
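The same fan-out can be driven from Python when GNU parallel is not available. This sketch assumes only the `exactedi explain FILE` invocation shown above, with output on stdout. `explain_file` and `explain_all` are hypothetical helper names, and the `command` parameter exists so the runner can be swapped out or tested. Threads are sufficient here because the work happens in the child processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def explain_file(x12_path, out_dir, command=("exactedi", "explain")):
    """Run the explain command on one file; save stdout as <stem>.json."""
    out_path = Path(out_dir) / (Path(x12_path).stem + ".json")
    with open(out_path, "wb") as out:
        result = subprocess.run(
            [*command, str(x12_path)],
            stdout=out,
            stderr=subprocess.PIPE,
        )
    return out_path, result.returncode

def explain_all(x12_paths, out_dir, workers=4, command=("exactedi", "explain")):
    """Process files concurrently; returns (output_path, exit_code) pairs."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: explain_file(p, out_dir, command), x12_paths))
```

A caller might run `explain_all(sorted(Path("/data/incoming").glob("*.x12")), "/data/facts")` and then inspect the returned exit codes to find files that failed.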
Error Handling
JSONL Parse Errors
ExactEDI outputs all parseable segments before encountering errors. Error messages are written to stderr, not stdout.
Example error handling:
# Separate stdout (JSONL) and stderr (errors)
exactedi parse file.x12 > segments.jsonl 2> parse_errors.log
# Check exit code
if [ $? -ne 0 ]; then
  echo "Parse failed. See parse_errors.log"
fi
Python error handling:
import json
import subprocess
import sys

# Note: this buffers the full output in memory; for very large files,
# redirect stdout to a file and stream it line by line instead
result = subprocess.run(
    ['exactedi', 'parse', 'file.x12'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)
if result.returncode == 0:
    for line in result.stdout.splitlines():
        record = json.loads(line)
        # Process record
else:
    print(f"Parse failed: {result.stderr}", file=sys.stderr)
    sys.exit(1)
Facts JSON Validation Errors
When validation errors exist, the Facts JSON still outputs successfully but includes error details:
import json

with open('facts.json', 'r', encoding='utf-8') as f:
    facts = json.load(f)
if facts['validation']['errors'] > 0:
    print("File has validation errors:")
    for diag in facts.get('diagnostics', []):
        print(f"  {diag}")
    # Decide whether to process anyway or reject
    if facts['validation']['errors'] > 10:
        raise ValueError("Too many errors, rejecting file")
Performance Considerations
Memory Usage
JSONL (parse):
- Memory usage: 5-10 MB regardless of file size
- Output is streamed line-by-line
- Suitable for multi-gigabyte files
Facts JSON (explain):
- Memory usage: Proportional to transaction count
- Entire JSON object built in memory
- For 1 GB files with 100,000+ transactions, expect 10-20 GB RAM usage
Recommendation:
Use parse for large files when only segment-level data is needed. Use explain for files under 500 MB or when transaction summaries are required.
Processing Throughput
Tested on AMD EPYC 7763 (single-threaded):
| File Size | Segments | parse Output Time | explain Output Time |
|---|---|---|---|
| 50 MB | 280,000 | 8.2 sec | 10.5 sec |
| 200 MB | 1,120,000 | 32.5 sec | 45.2 sec |
| 1 GB | 6,544,197 | 3m 24s | 4m 15s |
Optimization tips:
- Process files in parallel using multiple ExactEDI instances
- Use parse for filtering before heavy processing
- Consider splitting large files by interchange (ISA/IEA) before processing
API Contract Guarantees
JSONL Format Stability
The JSONL output format is stable:
- Fields seg, elem, and pos will always be present
- Field names and structure will not change
- New fields may be added in future versions (ignore unknown fields)
Facts JSON Schema Evolution
The Facts JSON schema follows semantic versioning:
- Minor version updates (1.x) add optional fields only
- Major version updates (2.0) may rename or restructure fields
- Consumers should check the schema_version field
Output Encoding
All output is UTF-8 encoded. Non-ASCII characters in X12 data are preserved as-is.
Decimal Precision
Monetary amounts in Facts JSON use IEEE 754 double-precision floating-point. For financial calculations requiring exact decimal arithmetic, convert each amount to a decimal type through its string form:
from decimal import Decimal
total_charge = Decimal(str(tx['total_charge']))
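The difference matters most when amounts are summed across many transactions. The sketch below totals a field with Decimal and rounds to cents. `sum_amounts` is a hypothetical helper, and half-up rounding is an assumption rather than a documented ExactEDI convention.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def sum_amounts(transactions, field="total_charge"):
    """Sum a monetary field exactly and round the total to cents."""
    total = sum(
        (Decimal(str(tx.get(field, 0))) for tx in transactions),
        Decimal("0"),
    )
    return total.quantize(CENT, rounding=ROUND_HALF_UP)

txs = [{"total_charge": 0.1}, {"total_charge": 0.1}, {"total_charge": 0.1}]
print(sum(tx["total_charge"] for tx in txs) == 0.3)  # False: float drift
print(sum_amounts(txs) == Decimal("0.30"))           # True: exact decimal sum
```

Going through `str()` matters: `Decimal(0.1)` would capture the binary approximation, while `Decimal(str(0.1))` captures the value as printed.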
Compliance and PHI Handling
HIPAA Considerations
JSONL Output:
- Contains raw X12 segment data
- May include PHI (patient names, SSNs, addresses in specific segments)
- Treat as PHI and apply appropriate safeguards
Facts JSON Output:
- Designed to be PHI-safe by excluding patient identifiers
- Contains transaction-level summaries: amounts, codes, control numbers, and payer/provider identifiers
- Still subject to organizational policies (some orgs treat all EDI-derived data as PHI)
Recommendation: Apply the same security controls to all ExactEDI output as you would to source X12 files until organizational data governance reviews Facts JSON for PHI content.
Audit Logging
For compliance, log ExactEDI invocations:
#!/bin/bash
LOG="/var/log/exactedi/audit.log"
echo "$(date -Iseconds) - User: $(whoami) - File: $1" >> "$LOG"
exactedi parse "$1"
Support
For integration questions:
- Professional tier: support@exactedi.com
- Enterprise tier: Dedicated Slack channel
- Documentation: https://docs.exactedi.com
Include sample input/output files and code snippets when reporting integration issues.