The Problem with Data Loss
In 2006, I joined the InMage Systems team in Hyderabad, working on continuous disaster recovery platforms. At the time, most businesses understood that data loss was catastrophic, but few had systematic ways to prevent it. My work centred on two distinct philosophies of data protection — and understanding the difference between them changed how I think about resilience engineering.
File-Based Recovery
File-based recovery tracks changes at the file system level. When a file changes, the backup system detects that change, captures the new version, and stores it. This approach is:
- Granular: You can restore individual files without touching everything else
- Storage-efficient: Only changed files are captured after the initial backup
- Transparent: Easy for users and administrators to understand
The challenge is consistency. If an application is mid-write when a backup runs, you can end up with a partially-written file that's logically inconsistent. For databases and transactional systems, this is a serious problem.
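The change-detection loop at the heart of file-based backup can be sketched in a few lines. This is a minimal illustration, not InMage's implementation: it walks a source tree, compares each file's size and mtime against a manifest from the previous run, and copies only what changed. The function and file names are hypothetical.

```python
import json
import os
import shutil

def incremental_backup(src_dir: str, dest_dir: str, manifest_path: str) -> None:
    """Copy only files whose size or mtime changed since the last run."""
    try:
        with open(manifest_path) as f:
            manifest = json.load(f)  # {relative_path: [size, mtime]}
    except FileNotFoundError:
        manifest = {}  # first run: everything is "changed"

    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            st = os.stat(src)
            sig = [st.st_size, st.st_mtime]
            if manifest.get(rel) != sig:  # new or modified file
                dest = os.path.join(dest_dir, rel)
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                shutil.copy2(src, dest)  # copy2 preserves timestamps
                manifest[rel] = sig

    with open(manifest_path, "w") as f:
        json.dump(manifest, f)
```

Note the consistency gap described above: if a writer modifies the file between `os.stat` and `shutil.copy2`, the captured copy may be torn — exactly the problem that motivates block-level snapshots.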
Volume-Based Recovery
Volume-based (or block-level) recovery operates below the file system. It captures changes at the disk block level, regardless of what files those blocks belong to. This gives you:
- Point-in-time consistency: You can capture a crash-consistent snapshot of an entire volume at a single instant
- Speed: Block-level operations are faster than file enumeration at scale
- Database-friendliness: Transactional systems can be quiesced and snapshotted for application consistency
The tradeoff is granularity. Restoring a single accidentally-deleted file from a volume snapshot requires mounting the entire snapshot and extracting just that file — operationally clunky.
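The block-level idea reduces to comparing a volume image block by block and keeping only the blocks that differ. The sketch below operates on in-memory byte strings for illustration; a real system tracks dirty blocks in the volume driver rather than rescanning, and the 4 KiB block size is just an assumption.

```python
BLOCK_SIZE = 4096  # illustrative; real systems track dirty blocks in the driver

def changed_blocks(old_image: bytes, new_image: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_index: new_block_bytes} for every block that differs."""
    deltas = {}
    for i in range(0, max(len(old_image), len(new_image)), block_size):
        if old_image[i:i + block_size] != new_image[i:i + block_size]:
            deltas[i // block_size] = new_image[i:i + block_size]
    return deltas

def apply_blocks(base: bytes, deltas: dict[int, bytes],
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Replay captured blocks onto a base image to reach the later state."""
    out = bytearray(base)
    for idx, data in deltas.items():
        out[idx * block_size: idx * block_size + len(data)] = data
    return bytes(out)
```

The granularity tradeoff is visible here: `deltas` knows which blocks changed, but nothing about which files those blocks belong to — recovering one file means reconstructing the whole image and mounting it.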
What We Built at InMage (Later Acquired by Microsoft)
The platform I worked on combined both approaches. We used volume snapshots for crash-consistent point-in-time recovery (our RPO was sub-minute), while maintaining a file-level catalog for granular restore operations. The rsync-based replication layer I built handled delta synchronisation efficiently across the network, ensuring offsite copies stayed current without hammering bandwidth.
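The delta-synchronisation idea behind that replication layer can be sketched as follows. This is a simplification of the rsync algorithm, not the production code: real rsync slides a weak rolling checksum over every byte offset and confirms matches with a strong hash, whereas this sketch matches only at block-aligned offsets. All names here are hypothetical.

```python
import hashlib

def make_signature(old: bytes, block_size: int = 1024) -> dict[bytes, int]:
    """Receiver side: checksum each block of the copy it already has."""
    return {hashlib.md5(old[i:i + block_size]).digest(): i
            for i in range(0, len(old), block_size)}

def make_delta(new: bytes, signature: dict[bytes, int],
               block_size: int = 1024) -> list:
    """Sender side: emit ('copy', offset) for blocks the receiver already
    holds, and ('data', literal_bytes) for everything else."""
    ops, literal = [], bytearray()
    for i in range(0, len(new), block_size):
        block = new[i:i + block_size]
        offset = signature.get(hashlib.md5(block).digest())
        if offset is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", offset))  # no bytes cross the network
        else:
            literal += block
    if literal:
        ops.append(("data", bytes(literal)))
    return ops

def apply_delta(old: bytes, ops: list, block_size: int = 1024) -> bytes:
    """Receiver side: rebuild the new file from local blocks plus literals."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg:arg + block_size] if kind == "copy" else arg
    return bytes(out)
```

The bandwidth saving comes from the `('copy', offset)` operations: only the checksums and the genuinely new bytes travel over the wire, which is what keeps offsite replication from saturating the link.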
A key lesson: the right DR strategy depends on your RTO (Recovery Time Objective) and RPO (Recovery Point Objective). Know those numbers before you pick a technology.
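A back-of-envelope model makes the RPO question concrete. Under the simplifying assumption that worst-case data loss is the snapshot interval plus the lag shipping that snapshot offsite, a quick check looks like this (the function name is my own):

```python
def meets_rpo(snapshot_interval_s: float, replication_lag_s: float,
              rpo_s: float) -> bool:
    """Worst case: the disaster hits just before the next snapshot, and the
    last snapshot was still in flight to the offsite copy."""
    return snapshot_interval_s + replication_lag_s <= rpo_s

# Snapshots every 30 s, shipped offsite within 15 s -> up to 45 s of loss.
meets_rpo(30, 15, 60)  # meets a 1-minute RPO
meets_rpo(30, 15, 30)  # misses a 30-second RPO
```

Running the same arithmetic on your actual intervals and lags, before choosing tooling, is the point of the lesson above.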
Still Relevant in 2025
The fundamentals haven't changed, even as the tooling has. When I advise teams on backup strategy today — whether they're using AWS snapshots, PostgreSQL WAL archiving, or Kubernetes PVC snapshots — the same questions apply: What's your RPO? What granularity of restore do you need? Can you guarantee consistency?
Get those answers right first. Then pick your tools.