
So, if I start with two devices holding lists, and I place the union of those lists on each device, what is that called?

Do you have a reference for the technical definition of sync that this fails to meet?



> So, if I start with two devices holding lists, and I place the union of those lists on each device, what is that called?

Not enough information. Naive set theory isn't adequate to cover file operations. Files have sizes and dates, but members of sets don't normally have these properties.

File synchronization isn't necessarily a simple and blind union of two sets. Some synchronization operations require that files be deleted, others require that files be restored or provided if absent. Which is true is left to the operator -- he must decide.

Consider this example. Two directory trees, A and B, contain the same number of files, but some files differ in size between the two trees. Does the synchronization operation copy files from B to A, or from A to B? The answer is that the operator must decide.
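To make the gap concrete, here is a minimal Python sketch. It assumes each tree is modeled as a dict mapping a relative path to a (size, mtime) pair; the `classify` helper and that representation are hypothetical, chosen only for illustration. It shows that a plain set comparison can report where two trees differ, but says nothing about which copy should win.

```python
from typing import Dict, List, Tuple

# Assumed model: relative path -> (size in bytes, modification time).
FileInfo = Tuple[int, float]


def classify(a: Dict[str, FileInfo], b: Dict[str, FileInfo]) -> Dict[str, List[str]]:
    """Report how two trees differ. This only describes the differences;
    it does not choose a direction for any copy or delete."""
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        # Present in both trees, but size or date does not match.
        "differ": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }


tree_a = {"a.txt": (10, 1.0), "b.txt": (20, 1.0)}
tree_b = {"b.txt": (25, 2.0), "c.txt": (5, 1.0)}
report = classify(tree_a, tree_b)
print(report)
```

Every entry in the report is a question for the operator: copy, overwrite, or delete, and in which direction.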

Rsync, a popular file synchronization utility, can delete files, or not, depending on what options the operator selects. It also must be told which is the source and which the destination.

> Do you have a reference for the technical definition of sync that this fails to meet?

http://en.wikipedia.org/wiki/File_synchronization

A quote: "File synchronization (or syncing) in computing is the process of ensuring that computer files in two or more locations are updated via certain rules. In one-way file synchronization, also called mirroring, updated files are copied from a 'source' location to one or more 'target' locations, but no files are copied back to the source location. In two-way file synchronization, updated files are copied in both directions, usually with the purpose of keeping the two locations identical to each other."

In the first case, the algorithm must be told which is the source and which the destination. The operator must decide. In the second, the algorithm must know how to proceed -- are newer files overwritten, or are older files overwritten? The operator must decide.

Most people don't understand this -- in all synchronization operations, the role of the operator is key to getting any desirable result.

In answer to your question, the definition I quoted doesn't allow for a simple union of two file trees, because there is no such thing -- the algorithm requires additional information, information provided by the operator.

In a two-way synchronization, if the algorithm encounters a file for which (a) a copy is missing on one system, (b) one copy is larger, or (c) one copy has a newer date, the algorithm cannot proceed unless and until the operator expresses a preference -- establishes the rules.
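A rough Python sketch of that point, under the same assumed model (each tree is a dict of relative path to (size, mtime)). The function `plan_two_way` and the rule names `newer_wins`, `older_wins`, `copy`, and `delete` are invented for this illustration and do not correspond to any real tool's options. The planner refuses to continue when it meets a case the operator has not covered with a rule.

```python
from typing import Dict, List, Optional, Tuple

FileInfo = Tuple[int, float]  # assumed model: (size in bytes, mtime)


def plan_two_way(
    a: Dict[str, FileInfo],
    b: Dict[str, FileInfo],
    on_conflict: Optional[str] = None,  # "newer_wins" or "older_wins"
    on_missing: Optional[str] = None,   # "copy" or "delete"
) -> List[Tuple[str, str]]:
    """Build a list of (action, path) pairs; raise if a rule is missing."""
    actions: List[Tuple[str, str]] = []
    for path in sorted(a.keys() | b.keys()):
        in_a, in_b = path in a, path in b
        if in_a and in_b:
            if a[path] == b[path]:
                continue  # identical size and date: nothing to do
            if on_conflict not in ("newer_wins", "older_wins"):
                raise ValueError(f"no rule for differing copies of {path!r}")
            if a[path][1] == b[path][1]:
                # Same date, different size: a date rule cannot settle it.
                raise ValueError(f"date rule cannot settle {path!r}")
            a_newer = a[path][1] > b[path][1]
            a_wins = a_newer if on_conflict == "newer_wins" else not a_newer
            actions.append(("copy A->B" if a_wins else "copy B->A", path))
        else:
            if on_missing not in ("copy", "delete"):
                raise ValueError(f"no rule for {path!r}, present on one side only")
            if on_missing == "copy":
                actions.append(("copy A->B" if in_a else "copy B->A", path))
            else:
                actions.append(("delete from A" if in_a else "delete from B", path))
    return actions


tree_a = {"x": (1, 1.0), "y": (2, 5.0)}
tree_b = {"x": (1, 1.0), "y": (3, 9.0), "z": (4, 2.0)}
print(plan_two_way(tree_a, tree_b, on_conflict="newer_wins", on_missing="copy"))
```

The same two trees produce different plans under different rules, and no plan at all when the rules are left out.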

In short, there is no automatic file synchronization -- all examples require the conscious participation of a decision-maker.

Let's say I have edited some files in a large, complex programming project. The outcome is successful. I want to synchronize two file trees to reflect the successful outcome, but with a minimum of file operations. Therefore I tell the synchronization program to copy newer and/or newly created files from B to A.

Next example. I have edited and changed a complex software project, but the changes are a disaster. I want to recover the original state, again with a minimum of file operations. So I tell the sync program to restore older files from A to B, and delete new files from B not present on A.

Neither of the above operations can proceed to a desirable outcome unless I give the sync program explicit rules.
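The two cases above can be sketched in Python as two different rule sets applied to the same pair of trees. This assumes A holds the original tree, B holds the edited one, and each tree is a dict of relative path to (size, mtime). The names `update_plan` and `restore_plan` are hypothetical, and treating a file missing from B as something to restore is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

FileInfo = Tuple[int, float]  # assumed model: (size in bytes, mtime)
Plan = List[Tuple[str, str]]  # (action, relative path)


def update_plan(src: Dict[str, FileInfo], dst: Dict[str, FileInfo]) -> Plan:
    """Case 1: copy files that are newly created in src or newer in src."""
    return [
        ("copy", p)
        for p in sorted(src)
        if p not in dst or src[p][1] > dst[p][1]
    ]


def restore_plan(src: Dict[str, FileInfo], dst: Dict[str, FileInfo]) -> Plan:
    """Case 2: make dst match src again, touching only what differs.
    Assumption: a file missing from dst is restored as well."""
    plan: Plan = [
        ("copy", p)
        for p in sorted(src)
        if p not in dst or src[p] != dst[p]
    ]
    # Files created during the failed edit exist only in dst: delete them.
    plan += [("delete", p) for p in sorted(dst.keys() - src.keys())]
    return plan


tree_a = {"main.c": (100, 1.0), "util.c": (50, 1.0)}                     # original
tree_b = {"main.c": (120, 2.0), "util.c": (50, 1.0), "new.c": (10, 2.0)}  # edited

keep_changes = update_plan(src=tree_b, dst=tree_a)     # success: B -> A
discard_changes = restore_plan(src=tree_a, dst=tree_b)  # disaster: A -> B
print(keep_changes)
print(discard_changes)
```

The input trees are identical in both calls. Only the operator's choice of rule and direction differs, and that choice decides whether `new.c` is copied or deleted.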

Conclusion: There is no such thing as a union of file systems that doesn't need to be told how to accomplish that end.



