# Issue: PBSS Snapshot Restore Failure — diskRoot Stuck at Genesis

**Status:** ✅ **RESOLVED** via [PR #363](https://github.com/XDCIndia/go-ethereum/pull/363)  
**Labels:** `bug`, `pbss`, `snapshot`, `state-management`, `critical`  
**Affected Schemes:** Path-Based State Storage (PBSS)  
**Severity:** High — Fleet disaster recovery broken for PBSS nodes

---

## Summary

PBSS (path-based state scheme) nodes fail to restore from cold snapshots when the original node was not gracefully shut down. The node restarts with `diskRoot` stuck at genesis, triggering a **"snapshot walk-back exceeded safe limit"** error and forcing a multi-million block rewind or full sync restart.

The fix in PR #363 removes the `XdcBulkSyncMode.Load()` gate that was preventing `triedb.Commit()` from being called during normal operation, ensuring `diskRoot` advances every 900 blocks (P1 checkpoint).

---

## Problem Description

### Observed Behavior

When restoring a PBSS node from a cold snapshot (tar archive of datadir):

```
ERROR snapshot walk-back exceeded safe limit without finding a checkpoint snapshot
  start=32,214,369 current=30,214,369 limit=2,000,000
WARN  Failed to load snapshot: head doesn't match snapshot
INFO  Rebuilding state snapshot
INFO  Trie missing, state snapshotting paused
```

The node:
1. Fails to find a valid checkpoint snapshot within the 2M block safety limit
2. Cannot reconstruct the trie from the journal (journal missing or stale)
3. Falls back to a slow, unreliable recovery path
4. Often requires a full sync restart from genesis

### Impact

- **Fleet snapshot restore is unreliable** for PBSS nodes
- **Disaster recovery broken** — cannot trust cold backups
- **Crash recovery fails** — nodes that lose power cannot restart cleanly
- **Cross-server migrations fail** — snapshot moved to new hardware = full resync

---

## Root Cause Analysis

### 1. `diskRoot` Never Advances During Normal Operation

In `core/blockchain.go`, the P1 checkpoint trie commit logic was gated by `XdcBulkSyncMode.Load()`:

```go
// BEFORE (broken)
if checkpoint > 0 && (checkpoint%c.config.XDPoS.XDPoSConfig.Epoch) == 0 && XdcBulkSyncMode.Load() {
    tstart := time.Now()
    c.blockchain.TrieDB().Commit(block.Root(), false)
    // ...
}
```

**What went wrong:**
- `XdcBulkSyncMode` is only `true` during initial bulk sync (downloading blocks from peers)
- During normal operation (importing new blocks as they arrive), `XdcBulkSyncMode` is `false`
- `triedb.Commit()` was never called during normal operation
- `diskRoot` (the on-disk trie root) remained at the genesis root forever

### 2. PathDB Journal Recovery Depends on `diskRoot`

In `triedb/pathdb/journal.go`:

```go
func (db *Database) loadLayers() layer {
    diskRoot := rawdb.ReadAccountTrieNode(db.diskdb, nil)
    // ... attempts to reconstruct layers from diskRoot
}
```

When the node restarts:
1. `loadLayers()` reads `diskRoot` from the database
2. If `diskRoot` is stale (genesis), the journal layers don't match the chain head
3. PathDB attempts a "snapshot walk-back" to find a matching checkpoint
4. Walk-back exceeds the 2M block safety limit → failure

### 3. Why HBSS Was Not Affected

Hash-based state scheme (HBSS) persists state via full trie commits during `statedb.Commit()`, which is called unconditionally. PBSS uses diff layers and only flushes to disk when `triedb.Commit()` is called or the buffer exceeds 64MB/128 layers.

---

## Solution (PR #363)

### Change

Remove the `XdcBulkSyncMode.Load()` gate from the P1 checkpoint trie commit logic:

```go
// AFTER (fixed)
if checkpoint > 0 && (checkpoint%c.config.XDPoS.XDPoSConfig.Epoch) == 0 {
    tstart := time.Now()
    c.blockchain.TrieDB().Commit(block.Root(), false)
    // ...
}
```

### Why This Is Safe

1. **Idempotent**: `triedb.Commit()` is safe to call multiple times — it only writes dirty nodes
2. **Every 900 blocks**: XDPoS epoch = 900 blocks, so commits are infrequent
3. **Aligns with consensus**: P1 checkpoint is a natural persistence point
4. **Required for XDC**: GP5's state root computation, skip-tx-execution, and XDPoS epochs make explicit commits necessary

### Files Changed

- `core/blockchain.go` — Removed `XdcBulkSyncMode.Load()` gate (~line 1924)

---

## Test Results

### Test 1: Same-Server HBSS Restore (xdc07)
| Metric | Value |
|--------|-------|
| Pre-snapshot block | 255,423 |
| Post-restore block | 255,423 ✅ |
| Walk-back error | None ✅ |

### Test 2: Same-Server PBSS Restore (xdc07)
| Metric | Value |
|--------|-------|
| Pre-snapshot block | 255,423 |
| Post-restore block | 255,423 ✅ |
| Journal loaded | `merkle.journal` ✅ |
| Walk-back error | None ✅ |

### Test 3: Cross-Server PBSS Restore (xdc03 → 183)
| Metric | Value |
|--------|-------|
| Starting block | 1,845,090 |
| Snapshot size | 655MB |
| Result | Node resumed successfully ✅ |

### Test 4: Clean Shutdown PBSS Restore (APO 183)
| Metric | Value |
|--------|-------|
| Pre-snapshot block | 4,184,400 |
| Post-restore block | 4,184,400 ✅ |
| Walk-back error | None ✅ |
| Node status | Healthy, running ✅ |

---

## Fleet Deployment Recommendations

### Immediate Actions

1. **Deploy v40+ image** to all PBSS fleet nodes:
   ```bash
   docker pull anilchinchawale/gp5-xdc:v40
   ```

2. **Restart PBSS nodes** to activate the fix (they will commit `diskRoot` at next P1 checkpoint)

3. **Verify diskRoot advancement** after ~900 blocks:
   ```bash
   # Check pathdb layer logs for diskRoot updates
   docker logs <container> | grep "diskRoot"
   ```

### Backup Strategy

- **HBSS nodes**: Continue existing snapshot process
- **PBSS nodes**: Safe to use cold snapshots with v40+
- **Graceful shutdown recommended** but not required — crash recovery now works

### Monitoring

Add alerts for:
```
# Prometheus/Grafana
- rate(snapshot_walk_back_errors_total[5m]) > 0
- pathdb_journal_load_failures > 0
```

---

## Related Issues

- Fixes #362 (PBSS diskRoot stuck at genesis)
- Related to #218 (crash-resume failures)
- Related to #196 (state snapshot corruption)

---

## Changelog

| Date | Change |
|------|--------|
| 2026-04-21 | Issue created, PR #363 merged |
| 2026-04-21 | v40 image built and tested |
| 2026-04-21 | HBSS/PBSS snapshot restore validated |
