# Debugging

Find NaN/infinity issues, compare model implementations, and analyze structural differences with the debugging module.

:::{note}
During the current preview, set the following environment variables to ensure
operation-level debug metadata is preserved and available to these tools:

```bash
export USE_LOCAL_COREAI=1
export ENABLE_DEBUG_INFO=1
```
:::

## Quick start

```python
from coreai_torch.debugging.validator import create_validator_for_exported_program

# Find NaN/inf issues in PyTorch models
model = MyModel().eval()
exported = torch.export.export(model, args=(torch.randn(1, 10),))

validator = create_validator_for_exported_program(exported)
result = await validator.check_for_nans(inputs=(torch.randn(1, 10),))

if result.failed_nodes:
    print(f"NaN detected at: {result.failed_nodes[0]}")
```

## Finding NaN/infinity issues

**Use when:** Your model produces NaN or infinity values and you need to find which operation caused the issue.

### PyTorch models

```python
from coreai_torch.debugging.validator import create_validator_for_exported_program

# Export your model
exported_program = torch.export.export(model, args=example_input)

# Create validator
validator = create_validator_for_exported_program(exported_program)

# Check for numerical issues
nan_result = await validator.check_for_nans(inputs=example_input)
inf_result = await validator.check_for_infs(inputs=example_input)

# Get first failing operation
if nan_result.failed_nodes:
    print(f"First NaN at: {nan_result.failed_nodes[0]}")
```

### Core AI programs

```python
from coreai_torch.debugging.validator import create_validator_for_coreai_program

# Convert to Core AI
converter = TorchConverter().add_exported_program(exported_program)
coreai_program = converter.to_coreai()
coreai_program.optimize()

# Create validator
validator = await create_validator_for_coreai_program(coreai_program, "main")

# Check for issues
result = await validator.check_for_nans(inputs={"x": torch.randn(2, 4)})
```

## Comparing model implementations

**Use when:** You need to verify that PyTorch and Core AI models produce the same outputs after conversion.

### Cross-framework comparison

Compare PyTorch vs Core AI to verify conversion correctness:

```python
from coreai_torch.debugging.comparator import create_comparator_for_programs

# Create comparator between PyTorch and Core AI
comparator = await create_comparator_for_programs(
    source_program=exported_program,
    target_program=coreai_program,
    target_entry_point="main"
)

# Compare outputs with tolerance
result = await comparator.compare_with_tolerance(
    inputs={"x": example_input},
    rtol=1e-5,
    atol=1e-8
)

# Check for differences
if result.failed_nodes:
    for source_op, target_op in result.failed_nodes:
        print(f"Mismatch: {source_op} vs {target_op}")
```


## Core AI inspector

**Use when:** You need to examine intermediate values from specific operations in a deployed Core AI model.

Capture intermediate values from deployed Core AI models:

```python
from coreai_torch.debugging.inspector import CoreAIInspector
from coreai.runtime import AIModel

# Load deployed Core AI model
asset_path = Path("my_model.aimodel")
ai_model = await AIModel.load(asset_path)

# Create inspector
inspector = CoreAIInspector(model=ai_model, function_name="main")

# Get operation IDs to inspect (from debug info)
coreai_op_ids = [1, 5, 10, 15]

# Capture intermediate values
results = await inspector.get_intermediates_for_ops(
    coreai_op_ids,
    inputs={"x": np.random.randn(2, 4).astype(np.float32)}
)

# Check results
for op_id, outputs in results.items():
    print(f"Op {op_id}: {len(outputs) if outputs else 0} outputs")
```

## Structural graph analysis

**Use when:** You want to understand how model structure changes between different versions or after optimization passes.

### Graph difference analysis

Analyze structural differences between model implementations using graph isomorphism:

```python
from coreai_torch.debugging.graph_diff import (
    compute_exported_program_diff,
    compute_coreai_program_diff,
    write_diff
)

# Compare two PyTorch programs
source_program = torch.export.export(model_v1, example_input)
target_program = torch.export.export(model_v2, example_input)

diff = compute_exported_program_diff(source_program, target_program)

# Check structural compatibility
if diff.is_isomorphic:
    print("✓ Graphs have identical structure")
else:
    print(f"✗ Found {diff.summary.unmapped_source_node_count} structural differences")

    # Write detailed diff report to stdout
    write_diff(
        diff,
        diff.source_graph,
        diff.target_graph,
        max_items=20
    )
```


## Performance profiling

**Use when:** You need to identify slow operations and performance bottlenecks in your Core AI model.

Profile operation timing in Core AI programs:

```python
from coreai_torch.debugging.benchmarker import benchmark_coreai_program

# Run benchmark
result = await benchmark_coreai_program(
    coreai_program=coreai_program,
    inputs={"x": torch.randn(2, 4)},
    num_runs=50
)

# Show timing summary
result.write_summary(sys.stdout)

# Get module-level timing
module_timings = result.get_module_timings()
for name, module in module_timings.items():
    print(f"{name}: {module.aggregated_op_stats.average:.3f}ms avg")
```

## Custom validation

**Use when:** You need to check for specific conditions beyond NaN/infinity (e.g., value ranges, specific patterns).

Create custom checks beyond NaN/infinity:

```python
def check_large_values(outputs):
    """Check if any output has values > threshold"""
    return any(
        abs(arr).max() > 1000.0 if arr is not None else False
        for arr in outputs
    )

# Use custom check
result = await validator.check(check_large_values, inputs=example_input)
```

## Configuration

### Search strategies

Choose how to search through operations:

```python
from coreai_torch.debugging.search_strategy import LevelOrderStrategy

# Binary search (default - fastest for finding first issue)
strategy = LevelOrderStrategy.bisection(graph, batch_size=10)

# Top-down (systematic from inputs to outputs)
strategy = LevelOrderStrategy.top_down(graph)

# Adaptive (automatically selects best approach)
strategy = LevelOrderStrategy.auto(graph)
```

### Batch size
```python
# Control batch size for memory efficiency
strategy = LevelOrderStrategy.bisection(graph, batch_size=5)  # Smaller batches
strategy = LevelOrderStrategy.bisection(graph, batch_size=20)  # Larger batches
validator = create_validator_for_exported_program(exported)
```

## Torch utilities

**Use when:** You need to save intermediate values to disk for later analysis or share debug data.

### Saving intermediate values

Save all intermediate tensor values from PyTorch model execution:

```python
from coreai_torch.debugging.torch_utils import save_intermediates, load_intermediates
from pathlib import Path

# Export your PyTorch model
exported_program = torch.export.export(model, args=example_input)

# Save intermediate values to disk
metadata_path = save_intermediates(
    program=exported_program,
    inputs=example_input,
    output_dir=Path("./debug_output")
)

print(f"Intermediates saved to: {metadata_path}")
```

### Loading intermediate values

Load saved intermediate values for analysis:

```python
# Load intermediate values from disk
debug_trace = load_intermediates(Path("./debug_output/main.aimodelintermediates"))

# Access saved values
print(f"Inputs: {list(debug_trace.inputs.keys())}")
print(f"Outputs: {list(debug_trace.outputs.keys())}")
print(f"Intermediates: {len(debug_trace.intermediates)} operations")

# Analyze specific intermediate values
for node_name, tensor in debug_trace.intermediates.items():
    print(f"{node_name}: shape {tensor.shape}, mean {tensor.mean():.3f}")
```

### Custom value filtering

Filter which intermediate values to save:

```python
def custom_filter(node, result):
    """Only save convolution and linear layer outputs"""
    return any(op in str(node.target).lower() for op in ["conv", "linear", "matmul"])

# Save only filtered operations
metadata_path = save_intermediates(
    program=exported_program,
    inputs=example_input,
    output_dir=Path("./debug_output"),
    node_filter=custom_filter
)
```

The debugging module provides tools for validating model correctness, analyzing structural changes, and identifying performance issues.

## See also

- {doc}`../guides/conversion-workflows` — understand the full pipeline from export to `DeployableProgram`.
- {doc}`supported-aten-ops` — check which ATen ops have built-in lowering rules if you hit a conversion error.
- {doc}`TorchConverter` — the main conversion class; `to_coreai()` produces the program you validate here.