[owncloud-devel] GSoC Proposal for Large File Sync

Klaas Freitag freitag at owncloud.com
Wed Mar 23 10:37:52 GMT 2016


On 22.03.2016 15:35, Tomaz Canabrava wrote:
>
>
> On Tue, Mar 22, 2016 at 11:33 AM, Roeland Douma
> <rullzer at owncloud.com> wrote:
>
>     This is a bit of an oversimplification.
>     Just cutting up a file does not work. You need a rolling checksum;
>     otherwise you will be unable to detect data that has shifted within
>     a file.
>
>
>     Assume a file.
>
>     'aaaaabbbbbccccc'
>
>     Say your chunks are 5 bytes, and you modify the file locally to:
>
>     'adaaaabbbbbccccc'
>
>
>     With 'static' chunks, this forces you to re-upload all chunks,
>     because the inserted byte shifts every chunk boundary after it.
>     Changes like this are not uncommon.
>
>
>     But this is indeed exactly what zsync is.

Yes, that is true.
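
To make Roeland's example concrete, here is a minimal sketch in Python. It
is not zsync's actual code: the weak checksum follows the well-known
rsync-style rolling sum, SHA-1 stands in for whatever strong hash would
really be used, and all function names are made up for illustration. Unlike
real implementations, it does not skip ahead after a match.

```python
import hashlib

MOD = 1 << 16


def fixed_chunks(data: bytes, size: int) -> list[bytes]:
    """Cut data into consecutive chunks at fixed offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def weak_sum(block: bytes) -> tuple[int, int]:
    """rsync-style weak checksum, cheap to slide one byte at a time."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, size: int) -> tuple[int, int]:
    """Slide the window one byte to the right in constant time."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - size * out_byte + a) % MOD
    return a, b


def strong_sum(block: bytes) -> str:
    """Strong hash used to confirm a weak-checksum hit (SHA-1 as a stand-in)."""
    return hashlib.sha1(block).hexdigest()


def find_known_blocks(old: bytes, new: bytes, size: int) -> dict[int, int]:
    """Return {offset in new: index of the identical block in old}."""
    known: dict[tuple[int, int], list[tuple[int, str]]] = {}
    for idx, block in enumerate(fixed_chunks(old, size)):
        if len(block) == size:
            known.setdefault(weak_sum(block), []).append((idx, strong_sum(block)))

    matches: dict[int, int] = {}
    if len(new) < size:
        return matches
    a, b = weak_sum(new[:size])
    pos = 0
    while True:
        for idx, strong in known.get((a, b), ()):
            # Only pay for the strong hash when the weak checksum hits.
            if strong_sum(new[pos:pos + size]) == strong:
                matches[pos] = idx
                break
        if pos + size >= len(new):
            break
        a, b = roll(a, b, new[pos], new[pos + size], size)
        pos += 1
    return matches


old = b"aaaaabbbbbccccc"
new = b"adaaaabbbbbccccc"

# Static chunking: no chunk of the new file equals any chunk of the old one.
unchanged_static = set(fixed_chunks(old, 5)) & set(fixed_chunks(new, 5))
print(unchanged_static)  # set()

# Rolling checksum: 'bbbbb' and 'ccccc' are found at their shifted offsets.
matches = find_known_blocks(old, new, 5)
print(matches)  # {6: 1, 11: 2}
```

With static chunks nothing can be reused, while the rolling search still
finds two of the three old blocks, so only the bytes 'adaaaa' at the start
would have to be transferred.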

>     Basically you need to store the zsync file as metadata, because
>     calculating the checksum on the server is not really a scalable
>     solution.
>
> What if the file resulting from the checksum/chunk calculation is also
> uploaded to the server? That way we don't need the server to calculate
> it at all.

Yes, for files that are new (or changed) on the client, this is a good
idea: the server can store the list and reuse it as long as the file
does not change.

However, the server will need to recalculate the list when a file is
changed through the web interface or, in the case of external storage,
by a third party.
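
A rough sketch of that store-and-reuse idea, in Python for brevity. It
assumes the server can tell that a file changed by comparing something like
its ETag; `Manifest`, `build_manifest` and `ManifestStore` are hypothetical
names, and Adler-32 plus SHA-1 are only stand-ins for the real checksums.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class Manifest:
    etag: str          # file version the checksums were computed for
    block_size: int
    blocks: list       # [(weak checksum, strong checksum), ...]


def build_manifest(data: bytes, etag: str, block_size: int) -> Manifest:
    """Compute the per-block checksum list (normally done on the client)."""
    blocks = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        blocks.append((zlib.adler32(block), hashlib.sha1(block).hexdigest()))
    return Manifest(etag, block_size, blocks)


class ManifestStore:
    """Hypothetical server-side store for uploaded checksum lists."""

    def __init__(self, block_size: int = 5):
        self.block_size = block_size
        self._by_path = {}
        self.recalculations = 0

    def put(self, path: str, manifest: Manifest) -> None:
        # The client uploads the manifest together with the file.
        self._by_path[path] = manifest

    def get(self, path: str, current_etag: str, read_file) -> Manifest:
        cached = self._by_path.get(path)
        if cached is not None and cached.etag == current_etag:
            return cached  # file unchanged: reuse the stored list
        # File changed behind our back (web interface, external storage):
        # the server has to recalculate.
        self.recalculations += 1
        fresh = build_manifest(read_file(), current_etag, self.block_size)
        self._by_path[path] = fresh
        return fresh


store = ManifestStore()
data = b"aaaaabbbbbccccc"
store.put("f.txt", build_manifest(data, "etag-1", 5))

m1 = store.get("f.txt", "etag-1", lambda: data)    # reused, no server work
edited = b"adaaaabbbbbccccc"
m2 = store.get("f.txt", "etag-2", lambda: edited)  # recalculated on the server
```

The expensive path only runs when the stored version no longer matches the
file, which is exactly the web-interface and external-storage case.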

Since all systems must keep working entirely without any of this in
order to stay compatible with old servers, the latter case can be
neglected at the start.
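
As a sketch of what that fallback could look like on the client (Python
again; the capability key "zsync" under "dav" is purely an assumption for
illustration, not an existing server capability as far as this thread
establishes):

```python
def choose_upload_strategy(capabilities: dict, has_remote_manifest: bool) -> str:
    """Pick delta sync only when the server advertises it and a checksum
    list exists; otherwise fall back to a normal full upload."""
    dav = capabilities.get("dav", {})
    if dav.get("zsync") and has_remote_manifest:  # hypothetical capability key
        return "delta"
    return "full"


old_server = {"dav": {"chunking": "1.0"}}
new_server = {"dav": {"chunking": "1.0", "zsync": "1.0"}}

print(choose_upload_strategy(old_server, True))    # full
print(choose_upload_strategy(new_server, False))   # full
print(choose_upload_strategy(new_server, True))    # delta
```

A file without a stored list, for example one changed on external storage,
simply takes the full-upload path, so nothing breaks if that case is left
out at first.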

Keep on rocking & volume up,

Klaas




