Add comprehensive benchmark suite (io_bench_test.go):
- BenchmarkEndToEndRead/Write: full SCSI stack (512B to 256KB)
- BenchmarkEndToEndReadParallel/WriteParallel: concurrent IO
- BenchmarkFileBackingStoreRead/Write: isolated backing store
pprof-guided optimizations:
- Guard hot-path log.Debugf with log.GetLevel() check in scsi.go,
sbc.go, backingstore.go — eliminates 22% CPU overhead from logrus
Entry allocation even when debug logging is disabled
- Add FileBackingStore.ReadAt for zero-copy reads directly into
caller's buffer, bypassing Read()'s per-call make([]byte, tl)
- Use ReadAt via interface assertion in bsPerformCommand to read
directly into InSDBBuffer, eliminating allocation + copy
Results (256KB reads): +42% throughput, allocs reduced from 10 to 5
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>