Debugging¶

Find NaN/infinity issues, compare model implementations, and analyze structural differences with the debugging module.

Note

During the current preview, set the following environment variables to ensure operation-level debug metadata is preserved and available to these tools:

export USE_LOCAL_COREAI=1
export ENABLE_DEBUG_INFO=1
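
When a shell is not convenient (a notebook, for example), the same variables can be set from Python. This is a minimal sketch that assumes the library reads them at import or run time, which the text above does not confirm, so they are set before any `coreai_torch` import.

```python
import os

# Assumption: these variables are read when coreai_torch is imported or run,
# so set them before the first coreai_torch import in the process.
os.environ["USE_LOCAL_COREAI"] = "1"
os.environ["ENABLE_DEBUG_INFO"] = "1"

# Collect the values for a quick sanity check.
debug_env = {k: os.environ[k] for k in ("USE_LOCAL_COREAI", "ENABLE_DEBUG_INFO")}
```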

Quick start¶

import torch

from coreai_torch.debugging.validator import create_validator_for_exported_program

# Find NaN/inf issues in PyTorch models
model = MyModel().eval()
exported = torch.export.export(model, args=(torch.randn(1, 10),))

validator = create_validator_for_exported_program(exported)
# `await` requires an async context (an `async def` function or an async-aware REPL)
result = await validator.check_for_nans(inputs=(torch.randn(1, 10),))

if result.failed_nodes:
    print(f"NaN detected at: {result.failed_nodes[0]}")
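
The examples on this page use `await` at the top level, which only works inside a coroutine or an async-aware REPL. The following is a minimal sketch of driving such a call from a plain script with `asyncio.run`. `fake_check_for_nans` is a hypothetical stand-in for the validator call and is not part of `coreai_torch`.

```python
import asyncio


async def fake_check_for_nans(inputs):
    """Hypothetical stand-in for an awaitable validator call.

    Returns a list of failing node names (made up for illustration).
    """
    await asyncio.sleep(0)  # yield to the event loop, as a real async call would
    return ["div_3"] if any(x == 0 for x in inputs) else []


async def main():
    # Any `await` from the examples belongs inside a coroutine like this one.
    failed_nodes = await fake_check_for_nans(inputs=(1.0, 0.0))
    return failed_nodes


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
failed = asyncio.run(main())
```

In a real script, the body of `main` would hold the validator calls shown above.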

Finding NaN/infinity issues¶

Use when: Your model produces NaN or infinity values and you need to find which operation caused the issue.

PyTorch models¶

from coreai_torch.debugging.validator import create_validator_for_exported_program

# Export your model (example_input must be a tuple of example tensors)
exported_program = torch.export.export(model, args=example_input)

# Create validator
validator = create_validator_for_exported_program(exported_program)

# Check for numerical issues
nan_result = await validator.check_for_nans(inputs=example_input)
inf_result = await validator.check_for_infs(inputs=example_input)

# Get first failing operation
if nan_result.failed_nodes:
    print(f"First NaN at: {nan_result.failed_nodes[0]}")
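
Running both checks can be informative because an infinity often appears before the NaN it eventually causes. The sketch below illustrates that ordering with plain Python floats; it is a toy model of a chain of operations, not how the validator is implemented, and `first_match` and the op names are made up for illustration.

```python
import math


def first_match(ops, x, predicate):
    """Run ops in order and return the name of the first op whose output satisfies predicate."""
    value = x
    for name, fn in ops:
        value = fn(value)
        if predicate(value):
            return name
    return None


ops = [
    ("scale", lambda v: v * 10.0),    # overflows to inf for very large inputs
    ("center", lambda v: v - v),      # inf - inf evaluates to NaN
    ("relu", lambda v: max(v, 0.0)),
]

first_inf = first_match(ops, 1e308, math.isinf)  # where the overflow starts
first_nan = first_match(ops, 1e308, math.isnan)  # where the NaN first shows up
```

Here the infinity check points at the overflowing op, one step earlier than the NaN check, which is usually closer to the root cause.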

Core AI programs¶

from coreai_torch.debugging.validator import create_validator_for_coreai_program

# Convert to Core AI
converter = TorchConverter().add_exported_program(exported_program)
coreai_program = converter.to_coreai()
coreai_program.optimize()

# Create validator
validator = await create_validator_for_coreai_program(coreai_program, "main")

# Check for issues
result = await validator.check_for_nans(inputs={"x": torch.randn(2, 4)})

Comparing model implementations¶

Use when: You need to verify that PyTorch and Core AI models produce the same outputs after conversion.

Cross-framework comparison¶

Compare PyTorch vs Core AI to verify conversion correctness:

from coreai_torch.debugging.comparator import create_comparator_for_programs

# Create comparator between PyTorch and Core AI
comparator = await create_comparator_for_programs(
    source_program=exported_program,
    target_program=coreai_program,
    target_entry_point="main"
)

# Compare outputs with tolerance
result = await comparator.compare_with_tolerance(
    inputs={"x": example_input},
    rtol=1e-5,
    atol=1e-8
)

# Check for differences
if result.failed_nodes:
    for source_op, target_op in result.failed_nodes:
        print(f"Mismatch: {source_op} vs {target_op}")
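
To make the roles of `rtol` and `atol` concrete, here is a minimal sketch of the element-wise rule used by `numpy.isclose` and `torch.allclose`. That `compare_with_tolerance` applies exactly this rule is an assumption; `within_tolerance` and `mismatches` are illustrative helpers, not part of the comparator API.

```python
def within_tolerance(actual, expected, rtol=1e-5, atol=1e-8):
    """NumPy/PyTorch-style closeness test: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


def mismatches(actual_values, expected_values, rtol=1e-5, atol=1e-8):
    """Return the indices where two equal-length sequences differ beyond tolerance."""
    return [
        i
        for i, (a, e) in enumerate(zip(actual_values, expected_values))
        if not within_tolerance(a, e, rtol=rtol, atol=atol)
    ]


# The first pair is within the relative tolerance; the second sits near zero,
# where only atol applies, so a 1e-7 difference counts as a mismatch.
bad = mismatches([1.000001, 1e-7], [1.0, 0.0])
```

Near zero the relative term vanishes, so `atol` alone decides the outcome; a larger `atol` may be appropriate for reduced-precision models.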

Core AI inspector¶

Use when: You need to examine intermediate values from specific operations in a deployed Core AI model.

Capture intermediate values from deployed Core AI models:

from pathlib import Path

import numpy as np

from coreai_torch.debugging.inspector import CoreAIInspector
from coreai.runtime import AIModel

# Load deployed Core AI model
asset_path = Path("my_model.aimodel")
ai_model = await AIModel.load(asset_path)

# Create inspector
inspector = CoreAIInspector(model=ai_model, function_name="main")

# Get operation IDs to inspect (from debug info)
coreai_op_ids = [1, 5, 10, 15]

# Capture intermediate values
results = await inspector.get_intermediates_for_ops(
    coreai_op_ids,
    inputs={"x": np.random.randn(2, 4).astype(np.float32)}
)

# Check results
for op_id, outputs in results.items():
    print(f"Op {op_id}: {len(outputs) if outputs else 0} outputs")

Structural graph analysis¶

Use when: You want to understand how model structure changes between different versions or after optimization passes.

Graph difference analysis¶

Analyze structural differences between model implementations using graph isomorphism:

from coreai_torch.debugging.graph_diff import (
    compute_exported_program_diff,
    compute_coreai_program_diff,
    write_diff
)

# Compare two PyTorch programs
source_program = torch.export.export(model_v1, example_input)
target_program = torch.export.export(model_v2, example_input)

diff = compute_exported_program_diff(source_program, target_program)

# Check structural compatibility
if diff.is_isomorphic:
    print("✓ Graphs have identical structure")
else:
    print(f"✗ Found {diff.summary.unmapped_source_node_count} structural differences")

    # Write detailed diff report to stdout
    write_diff(
        diff,
        diff.source_graph,
        diff.target_graph,
        max_items=20
    )
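
Before running a full structural diff, a quick comparison of operation counts can show which kinds of operations changed. The sketch below uses only the standard library and plain lists of op names (the names are made up). Matching counts are a necessary condition for isomorphism, not a sufficient one, so this does not replace the graph diff.

```python
from collections import Counter


def op_histogram_diff(source_ops, target_ops):
    """Compare how often each op type occurs in two graphs.

    Counter subtraction keeps only positive counts, so each side lists
    the ops it has in excess of the other.
    """
    src, tgt = Counter(source_ops), Counter(target_ops)
    return {
        "only_in_source": dict(src - tgt),
        "only_in_target": dict(tgt - src),
    }


source_ops = ["conv2d", "relu", "conv2d", "relu", "linear"]
target_ops = ["conv2d", "relu", "conv2d", "gelu", "linear"]
histogram_diff = op_histogram_diff(source_ops, target_ops)
```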

Performance profiling¶

Use when: You need to identify slow operations and performance bottlenecks in your Core AI model.

Profile operation timing in Core AI programs:

import sys

from coreai_torch.debugging.benchmarker import benchmark_coreai_program

# Run benchmark
result = await benchmark_coreai_program(
    coreai_program=coreai_program,
    inputs={"x": torch.randn(2, 4)},
    num_runs=50
)

# Show timing summary
result.write_summary(sys.stdout)

# Get module-level timing
module_timings = result.get_module_timings()
for name, module in module_timings.items():
    print(f"{name}: {module.aggregated_op_stats.average:.3f}ms avg")
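
For a rough end-to-end baseline to set beside the per-operation numbers, a plain wall-clock loop is often enough. This is a generic sketch using `time.perf_counter`; `time_callable` is a hypothetical helper, and its warmup handling and statistics are not taken from the benchmarker.

```python
import statistics
import time


def time_callable(fn, num_runs=50, warmup=5):
    """Time fn() over num_runs calls and return latency statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # discard warmup runs so one-time setup cost is excluded
    samples_ms = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "average": statistics.fmean(samples_ms),
        "median": statistics.median(samples_ms),
        "max": max(samples_ms),
    }


# Stand-in workload; in practice this would be a call that runs the model once.
stats = time_callable(lambda: sum(range(1000)), num_runs=10)
```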

Custom validation¶

Use when: You need to check for specific conditions beyond NaN/infinity (e.g., value ranges, specific patterns).

Create custom checks beyond NaN/infinity:

def check_large_values(outputs):
    """Return True if any output contains a value with magnitude above 1000."""
    return any(
        arr is not None and abs(arr).max() > 1000.0
        for arr in outputs
    )

# Use custom check
result = await validator.check(check_large_values, inputs=example_input)

Configuration¶

Search strategies¶

Choose how to search through operations:

from coreai_torch.debugging.search_strategy import LevelOrderStrategy

# Binary search (default - fastest for finding first issue)
strategy = LevelOrderStrategy.bisection(graph, batch_size=10)

# Top-down (systematic from inputs to outputs)
strategy = LevelOrderStrategy.top_down(graph)

# Adaptive (automatically selects best approach)
strategy = LevelOrderStrategy.auto(graph)
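
The idea behind bisection can be sketched on a linear chain of operations: if a NaN persists downstream once it appears, the first failing operation can be found in a logarithmic number of checks. The code below is a toy illustration under that assumption and is not the `LevelOrderStrategy` implementation; it recomputes each prefix, whereas a real tool would capture intermediate values.

```python
import math


def run_prefix(ops, x, end):
    """Apply ops[0..end] (inclusive) to x and return the result."""
    value = x
    for fn in ops[: end + 1]:
        value = fn(value)
    return value


def first_bad_index(ops, x):
    """Return the smallest i whose prefix output is NaN, or None if the final output is clean.

    Assumes that once a NaN appears, every later output is also NaN.
    """
    if not ops or not math.isnan(run_prefix(ops, x, len(ops) - 1)):
        return None
    lo, hi = 0, len(ops) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if math.isnan(run_prefix(ops, x, mid)):
            hi = mid        # failure is at mid or earlier
        else:
            lo = mid + 1    # failure is strictly after mid
    return lo


ops = [
    lambda v: v + 1.0,
    lambda v: v * 2.0,
    lambda v: v * math.inf - v * math.inf,  # inf - inf produces NaN
    lambda v: v + 1.0,
    lambda v: v * 0.5,
]
bad_index = first_bad_index(ops, 1.0)
```

A top-down scan would check every operation in order instead, which is simpler and does not rely on the persistence assumption.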

Batch size¶

# Control batch size for memory efficiency
strategy = LevelOrderStrategy.bisection(graph, batch_size=5)  # Smaller batches
strategy = LevelOrderStrategy.bisection(graph, batch_size=20)  # Larger batches
validator = create_validator_for_exported_program(exported)

Torch utilities¶

Use when: You need to save intermediate values to disk for later analysis or share debug data.

Saving intermediate values¶

Save all intermediate tensor values from PyTorch model execution:

from coreai_torch.debugging.torch_utils import save_intermediates, load_intermediates
from pathlib import Path

# Export your PyTorch model
exported_program = torch.export.export(model, args=example_input)

# Save intermediate values to disk
metadata_path = save_intermediates(
    program=exported_program,
    inputs=example_input,
    output_dir=Path("./debug_output")
)

print(f"Intermediates saved to: {metadata_path}")

Loading intermediate values¶

Load saved intermediate values for analysis:

# Load intermediate values from disk
debug_trace = load_intermediates(Path("./debug_output/main.aimodelintermediates"))

# Access saved values
print(f"Inputs: {list(debug_trace.inputs.keys())}")
print(f"Outputs: {list(debug_trace.outputs.keys())}")
print(f"Intermediates: {len(debug_trace.intermediates)} operations")

# Analyze specific intermediate values
for node_name, tensor in debug_trace.intermediates.items():
    print(f"{node_name}: shape {tensor.shape}, mean {tensor.mean():.3f}")

Custom value filtering¶

Filter which intermediate values to save:

def custom_filter(node, result):
    """Only save outputs of convolution, linear, and matmul operations"""
    return any(op in str(node.target).lower() for op in ["conv", "linear", "matmul"])

# Save only filtered operations
metadata_path = save_intermediates(
    program=exported_program,
    inputs=example_input,
    output_dir=Path("./debug_output"),
    node_filter=custom_filter
)
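
Filters like the one above can be built from a small factory so the keyword list is easy to change. This sketch assumes the `node` argument behaves like a `torch.fx.Node`, with `op` and `target` attributes, which the text above does not state explicitly. `make_target_filter` is a hypothetical helper, and `SimpleNamespace` objects stand in for real nodes so the example runs without PyTorch.

```python
from types import SimpleNamespace


def make_target_filter(keywords, ops=("call_function",)):
    """Build a node_filter(node, result) that keeps nodes whose target mentions a keyword."""
    lowered = tuple(k.lower() for k in keywords)

    def node_filter(node, result):
        # Skip placeholders, outputs, and other node kinds not listed in `ops`.
        if node.op not in ops:
            return False
        target = str(node.target).lower()
        return any(k in target for k in lowered)

    return node_filter


conv_or_linear = make_target_filter(["conv", "linear"])

# Stand-in nodes (assumption: real nodes expose the same two attributes).
conv_node = SimpleNamespace(op="call_function", target="aten.conv2d.default")
relu_node = SimpleNamespace(op="call_function", target="aten.relu.default")
input_node = SimpleNamespace(op="placeholder", target="conv_input")

kept = [conv_or_linear(n, None) for n in (conv_node, relu_node, input_node)]
```

The resulting callable has the same `(node, result)` signature as `custom_filter`, so it could be passed as `node_filter` in the same way.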

The debugging module provides tools for validating model correctness, analyzing structural changes, and identifying performance issues.
