Offline Briefcase Sneaker Sync

Problem

I have 1.6 TB of “stuff” that I want to keep synced between home and off-site.

I’ve tried several ways of online syncing – CrashPlan, BtSync, rsync, etc. – and they always break down somehow.  I’m tired of relying on the internet to get it done.

I travel back and forth between these locations all the time.  How about a sneaker-sync?

  • Two desktop computers, each with 1.6 TB of stuff
  • One 128 GB USB drive to store the deltas
  • A program to run at either site that can figure out what to put on the USB drive to bring the two repositories in sync with each other.

Solution Proposal

Here are my thoughts on what it would take.  It can be given a pretty UI later.

  • USB flash drive of arbitrary size
  • SneakerSync.exe, .config
    • Relative path to where the files are kept.
    • Maximum size of the USB flash drive to use.
    • Percentage of that space to keep for deleted files.
    • Repository/db of tracking data.
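Here is a minimal sketch of how those settings might hang together, in Python purely for illustration. Every name in it (SneakerSyncConfig, share_relative_path, max_bytes, trash_percent, db_filename) is made up, and it assumes the percentage for deleted files is carved out of the maximum size.

```python
from dataclasses import dataclass


@dataclass
class SneakerSyncConfig:
    """Hypothetical settings object; field names are illustrative only."""
    share_relative_path: str            # relative path to where the files are kept
    max_bytes: int                      # maximum space to use on the USB drive
    trash_percent: int                  # percent of max_bytes kept for deleted files
    db_filename: str = "sneakersync.db" # assumed name for the tracking db

    def trash_budget(self) -> int:
        # Bytes reserved for the trash can (integer math, rounds down).
        return self.max_bytes * self.trash_percent // 100

    def transfer_budget(self) -> int:
        # Whatever is left over is available for files in transit.
        return self.max_bytes - self.trash_budget()


cfg = SneakerSyncConfig("Share", max_bytes=128 * 10**9, trash_percent=10)
print(cfg.trash_budget(), cfg.transfer_budget())
```

With a 128 GB stick and 10% kept for deleted files, that leaves 115.2 GB for the deltas themselves.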

Target computer #1:

  • F:\Share   <- has the tons of stuff

Target computer #2:

  • \\NetworkLocation\Share   <- also has the tons of stuff, to be kept in sync.

To make things simpler, I’m just going to call this the “share”. 

Command line Usage:

  • sneakersync.exe  f:\share   when at computer 1
  • sneakersync.exe \\networklocation\share when at computer 2

What it does:

  • Scans the share to see what the latest versions are on the current computer.
  • Determines which files need to be synced to other sites.
  • Determines which local files are out of date relative to other sites.
  • Updates local files when it knows that what it has on the stick is fresher.
  • Grabs any files that DEFINITELY need syncing to some other computer somewhere else.
  • Backfills other interesting files if there’s space available, probably in most-recently-updated order.
  • Doesn’t delete anything – saves copies on the USB stick, in a rotating buffer.
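The "grab what's needed, then backfill" part of that list could be sketched roughly like this. It is only an illustration: FileInfo and plan_briefcase are hypothetical names, "needs syncing" is assumed to mean "missing on the other site or newer here by modification time", and files that don't fit in the remaining space are simply skipped.

```python
from typing import Dict, List, NamedTuple


class FileInfo(NamedTuple):
    size: int     # bytes
    mtime: float  # last-modified timestamp


def plan_briefcase(local: Dict[str, FileInfo],
                   remote_known: Dict[str, FileInfo],
                   capacity: int) -> List[str]:
    """Pick the paths to copy onto the stick, within `capacity` bytes."""
    def newest_first(paths):
        return sorted(paths, key=lambda p: local[p].mtime, reverse=True)

    # Files the other site definitely needs: missing there, or newer here.
    needed = [p for p in local
              if p not in remote_known or local[p].mtime > remote_known[p].mtime]
    needed_set = set(needed)
    # Everything else is only a candidate for backfilling spare space.
    backfill = [p for p in local if p not in needed_set]

    chosen, used = [], 0
    for path in newest_first(needed) + newest_first(backfill):
        size = local[path].size
        if used + size <= capacity:  # skip anything that no longer fits
            chosen.append(path)
            used += size
    return chosen
```

Needed files always get first pick of the space; backfill only uses what's left, most recently updated first.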

Future Autorun usage:

The program autoruns, looks around for file systems it knows about and can update, and does the update.

To pull this off, what would it need

Define: “Logical Share” = the concept of the file system, regardless of how many machines it is spread across.  Think of it as “the dropbox”, for example, when using dropbox.

Define: “Physical Share” = the files on a disk on a computer.

We would need to track:

  • Logical file system contents, plus which computer has the “most recent” or authoritative version of a file.
  • Physical file system contents, as best known, for all physical file systems.
    • With each physical file system file, the determination that it needs to be updated (i.e., it’s stale and the logical share has a newer version)
  • What’s in the briefcase for transfer
  • What’s in the briefcase in the trash can.
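As a rough sketch of that tracking data, something like the following might do. All of it is hypothetical (Catalog, LogicalEntry, record_scan, stale_on are invented names), and it assumes "most recent" is decided purely by modification time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogicalEntry:
    mtime: float             # newest modification time seen anywhere
    authoritative_host: str  # computer holding that newest version


@dataclass
class Catalog:
    # Logical share: path -> newest known version.
    logical: Dict[str, LogicalEntry] = field(default_factory=dict)
    # Physical shares, as best known: host -> path -> mtime.
    physical: Dict[str, Dict[str, float]] = field(default_factory=dict)
    # Paths currently on the stick for transfer, and in its trash can.
    briefcase: List[str] = field(default_factory=list)
    trash: List[str] = field(default_factory=list)

    def record_scan(self, host: str, files: Dict[str, float]) -> None:
        """Store a scan of one physical share and update the logical view."""
        self.physical[host] = dict(files)
        for path, mtime in files.items():
            entry = self.logical.get(path)
            if entry is None or mtime > entry.mtime:
                self.logical[path] = LogicalEntry(mtime, host)

    def stale_on(self, host: str) -> List[str]:
        """Paths where this host is missing the file or has an older copy."""
        known = self.physical.get(host, {})
        return sorted(path for path, entry in self.logical.items()
                      if known.get(path, float("-inf")) < entry.mtime)
```

The "needs to be updated" flag isn't stored here; it falls out of comparing each physical share against the logical one.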

Side Effects / Interesting conditions.

There will be conflicts where the file has been modified on both computers.  I’d say: alert the user, take the most recent one, and stick both copies of the file into the trash can.
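That conflict rule could look something like the sketch below. The names (Version, Resolution, resolve) are invented, and it assumes a conflict means both copies changed since the last time the two sides were known to match. Alerting the user is left to the caller, and the sketch only covers the conflict case, so nothing goes to the trash when just one side changed.

```python
from typing import List, NamedTuple


class Version(NamedTuple):
    host: str
    mtime: float


class Resolution(NamedTuple):
    winner: Version       # copy that becomes the logical share's version
    trash: List[Version]  # copies to stash in the briefcase trash can
    conflict: bool        # True means the user should be alerted


def resolve(last_synced_mtime: float, a: Version, b: Version) -> Resolution:
    """Apply 'most recent wins, keep both copies' to one file."""
    a_changed = a.mtime > last_synced_mtime
    b_changed = b.mtime > last_synced_mtime
    winner = a if a.mtime >= b.mtime else b
    if a_changed and b_changed:
        # Modified on both computers: keep both copies in the trash can.
        return Resolution(winner, [a, b], True)
    return Resolution(winner, [], False)
```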

If you somehow lose all physical instances of a share, you should have the most recent N files in your briefcase.  Hopefully that’s enough to get you back to a backup.

When will I write this code

Maybe never.  It’s been in my head for a few years; I have some other stuff I’m working on right now.  Maybe somebody else is interested in writing this and monetizing it?

Or even better, maybe somebody else has already written this.