CND68RemoteFastSyncDesignSimple
Remote Fast Synchronization: simplified mode
Goals
Projects overview
| Project type | Example | Source Files Count | Total Files Count | Total Files Size | Build time |
|---|---|---|---|---|---|
| Small | CLucene | 50-300 | 100-500 | <= 10 MB | < 1 min |
| Medium | MySQL | 1,000-3,000 | 1,500-6,000 | <= 150 MB | 6-10 min |
| Large | ACE | 10,000-30,000 | 15,000-60,000 | <= 600 MB | 60-120 min |
Goals
On a fast network (for example, inside the SPBDC local network):
By a non-massive modification I mean a change to no more than 10% of the files.
| Project type | First (full) copy | Subsequent (non-massive modifications) |
|---|---|---|
| Small | 20 sec | 5 sec |
| Medium | 1 min | 10 sec |
| Large | 5 min | 45 sec |
On a slow network (for example, when accessing SWAN via VPN from home):
Synchronization should be no more than 10% slower than rsync running without a daemon.
Technologies considered
- RSync.
- JarSync (http://jarsync.sourceforge.net).
- Miscellaneous Java FTP clients.
- Java Remote File Access (http://freshmeat.net/projects/remotefileaccess).
- Unison (http://www.cis.upenn.edu/bcpierce/unison).
- Others. Unfortunately, I did not keep sufficiently accurate records.
Design specifications
The approach prototyped so far is as follows:
- Transport: files are copied via JSch
- All files of a project are zipped (in Java; no external tools are needed) into a single archive
- Once a file is transferred, its timestamp is remembered; on subsequent runs, only files with changed timestamps are copied
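The zip-and-timestamp part of the approach above can be sketched roughly as follows. This is a minimal illustration, not the prototype's actual code: the class `TimestampSync` and its methods are hypothetical names, timestamps are kept only in memory, and the JSch transfer of the resulting archive is left out.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: remembers the timestamp of every file already sent.
class TimestampSync {
    // relative path -> last-modified time (ms) recorded at the previous transfer
    private final Map<String, Long> sent = new HashMap<>();

    private static String key(Path root, Path file) {
        return root.relativize(file).toString().replace('\\', '/');
    }

    /** Returns the regular files under root that are new or have a changed timestamp. */
    List<Path> changedFiles(Path root) throws IOException {
        List<Path> all;
        try (Stream<Path> walk = Files.walk(root)) {
            all = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Path> changed = new ArrayList<>();
        for (Path p : all) {
            Long known = sent.get(key(root, p));
            long stamp = Files.getLastModifiedTime(p).toMillis();
            if (known == null || known != stamp) {
                changed.add(p);
            }
        }
        return changed;
    }

    /** Packs the given files into a single ZIP archive and remembers their timestamps. */
    void zipAndRemember(Path root, List<Path> files, OutputStream out) throws IOException {
        if (files.isEmpty()) {
            return; // nothing to send
        }
        Map<String, Long> stamps = new HashMap<>();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path p : files) {
                String name = key(root, p);
                stamps.put(name, Files.getLastModifiedTime(p).toMillis());
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(p, zip);
                zip.closeEntry();
            }
        }
        sent.putAll(stamps); // recorded only after the archive was written completely
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("project");
        Files.write(root.resolve("a.c"), "int a;".getBytes());
        Files.write(root.resolve("b.c"), "int b;".getBytes());
        Path archive = Files.createTempFile("sync", ".zip");

        TimestampSync sync = new TimestampSync();
        try (OutputStream out = Files.newOutputStream(archive)) {
            sync.zipAndRemember(root, sync.changedFiles(root), out);
        }
        System.out.println("changed after first copy: " + sync.changedFiles(root).size());

        // Simulate an edit by moving the timestamp of one file forward.
        Path a = root.resolve("a.c");
        long old = Files.getLastModifiedTime(a).toMillis();
        Files.setLastModifiedTime(a, FileTime.fromMillis(old + 10_000));
        System.out.println("changed after touching a.c: " + sync.changedFiles(root).size());
    }
}
```

In a real run the archive would be uploaded through JSch and unpacked on the remote host, and the timestamp map would have to be persisted between sessions. Both are outside the scope of this sketch.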
Here are the results.
On SWAN
| Project | 1st copy | 2nd copy, unchanged | 10% changed |
|---|---|---|---|
| small | 0:08 | 0:01 | 0:02 |
| mysql | 0:37 | 0:03 | 0:05 |
| opensolaris | 3:37 | 0:17 | 0:48 |
On a slow network (accessing SWAN from home via VPN)
| Project | 1st copy | 2nd copy, unchanged | 10% changed | rsync, 10% changed |
|---|---|---|---|---|
| small | 0:14 | 0:04 | 0:07 | n/a |
| mysql | 6:38 | 0:04 | 0:34 | 0:35 |
| opensolaris | 23:00 | 0:20 | 2:38 | n/a |
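The only row with an rsync measurement (mysql, 10% changed) can be checked against the slow-network goal of being no more than 10% slower than rsync. The snippet below is a small sketch of that arithmetic. The class and method names are hypothetical, and the two times are copied from the table above.

```java
// Hypothetical helper for comparing a measured time with the rsync baseline.
class SlowNetworkGoalCheck {
    /** Parses a time written as "m:ss" or "mm:ss" into seconds. */
    static int toSeconds(String minSec) {
        String[] parts = minSec.trim().split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** True if our time is at most 10% above the rsync time (integer arithmetic). */
    static boolean withinTenPercent(int oursSec, int rsyncSec) {
        return oursSec * 10 <= rsyncSec * 11;
    }

    public static void main(String[] args) {
        int ours = toSeconds("0:34");   // mysql, 10% changed, prototype
        int rsync = toSeconds("0:35");  // mysql, 10% changed, rsync
        System.out.println(ours + " s vs rsync " + rsync + " s, goal met: "
                + withinTenPercent(ours, rsync));
        // prints: 34 s vs rsync 35 s, goal met: true
    }
}
```

The other rows have no rsync figure ("n/a"), so the same check cannot be applied to them.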

