The primary focus is, indeed, archiving (and replaying and sharing, with ease) states of data. Merging has been explored, but nothing concrete on it yet. Even something relatively simple, like merging code, often requires a human to resolve (sometimes with real effort). Imagine trying to do that with filesystem snapshots of database files.
We are exploring it. We have some thoughts on higher level understanding of data that might make it possible.
But definitely starting with the basics, as you said.
Would you mind sharing more details about how they are confusing? Always happy to take feedback. Feel free to comment here or on the community Slack, although a GitHub issue may be the best place. Whatever works... and much appreciated.
I have seen a growth of such "vcs-like" databases, but I think the preponderance remains SQL stores like MySQL/MSSQL/Postgres or NoSQL like Mongo/Cassandra/Redis/Couch/etc. For those - or anything that has its own model of storage or processing and, in the end, is backed by filesystem-type storage - dotmesh provides a really nice solution.
I haven't used Relaxo itself, but personally, I like the fact that independent groups are thinking of version control semantics for data. Tells me it is heading in a positive direction.
Relaxo used to be a couch query server (https://github.com/ioquatix/relaxo-query-server - not so useful any more) and ruby front end (https://github.com/ioquatix/relaxo-model - still useful). But I got frustrated with the direction of couchdb 2.x so I rewrote it to do everything in-process and use git as the document store. It organically grew from that.
Unless you are operating at scale, doing things in-process is vastly more convenient. Sending ruby code to the query server to perform map-reduce was a cumbersome process at best. It's easier just to write model code and have it work as expected.
Systems like Postgres are great when you have a single database and multiple front-end consumers, though. You'd need to put a front-end on top of Relaxo in order to gain the same benefits, but it would be pretty trivial to do so - it's just never been something that I've needed to implement. The API you'd actually want is one that interfaces directly with your Ruby model instances, rather than database tables and rows. I think there is room for improvement here - probably implementing a websocket API that exposes the raw git object model and then allowing consumers to work on top of that.
The architecture is super simple, I'd suggest that the first place to look is the source code.
There are really only two ways of accessing the underlying data store - a read-only dataset and a read/write changeset which can be committed.
It's purely a key-value storage at the core - a key being a path and a value being whatever you want.
On top of that you can build more complex things, e.g. https://github.com/ioquatix/relaxo-model which provides relational object storage and basic indexes (e.g. has one, has many, etc)
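The read-only dataset / committable changeset split described above can be sketched in a few lines of Ruby. This is a toy, in-memory illustration of the access model, not Relaxo's actual API - every name here is made up:

```ruby
# A toy path-keyed store with the two access modes described above:
# an immutable snapshot for reads, and a changeset whose writes only
# become visible once committed.
class ToyStore
  def initialize
    @data = {} # path (String) => value (whatever you want)
  end

  # Read-only access: a frozen snapshot of the current state.
  def current
    @data.dup.freeze
  end

  # Read/write access: yields a mutable changeset, swaps it in on commit.
  def commit
    changeset = @data.dup
    yield changeset
    @data = changeset
  end
end

store = ToyStore.new
store.commit { |cs| cs["users/alice"] = { "name" => "Alice" } }
snapshot = store.current
store.commit { |cs| cs["users/bob"] = { "name" => "Bob" } }

snapshot.key?("users/bob")       # false: a snapshot never sees later writes
store.current.key?("users/bob")  # true: a fresh read does
```

A git-backed version would make `commit` write a tree and advance a ref, but the two-mode access pattern is the same.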
This was a lot of fun. I don't care that it isn't hyper-practical, or whether a jQuery plugin for it may or may not exist. We aren't that serious about ourselves, are we?
Not that I am in love with "World Backup Day," but I think it is worth starting a conversation about why people do not back up.
Personally, I am surprised Microsoft or Apple has not bought out CrashPlan or Mozy (from EMC) or Carbonite and made it standard on their platform. Apple especially, since it already has recurring services it charges for.
Come to think of it, great concept for my next article. But I wanted the discussion.
Dunno, I always liked storing date/time as epoch. Every language under the sun seems to have a native method for working with it. Yeah, I need to deal with de/serialization, but it is a small price to pay, no?
Fair enough @pimlottc. But most of the time I am far more concerned with accurately capturing a moment in time than with making it instantly readable. I have also helped some companies that ran into serious datetime management issues when they worked with strings: some engineers assumed one timezone, others assumed another, and chaos ensued.
Epochs may not be instantly readable, but they do force everyone onto the same page.
In any case, I tend to view a timezone as a separate piece of data than the actual moment in time (but I know others have a different paradigm): datetime = accurate moment in time; timezone = the timezone for which this should be viewed, or was captured or, etc.
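That split - epoch for the instant, timezone as separate display metadata - can be sketched in Ruby (field names here are just illustrative):

```ruby
# Store the instant itself as epoch seconds, and the zone it should be
# viewed in as a separate field.
event = { at: 1_500_000_000, tz: "+09:00" }

instant = Time.at(event[:at]).utc       # the unambiguous moment in time
local   = instant.getlocal(event[:tz])  # the same moment, rendered for +09:00

instant.to_i == local.to_i              # true: zone changes display, not the instant
```

The epoch value pins down the moment; the zone only affects rendering, so no reader has to guess.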
So by epoch, obviously you mean the number of seconds since Jan 1 1970. Midnight - 00:00, right? Ah... but then... UTC? or TAI? There's a 35 second difference, after all. There's a school of thought that the UNIX epoch counts from 1970-01-01 00:00:10 TAI...
POSIX specifies the Epoch to be UTC and has since at least 2001. People may have other opinions on how it should be specified, but if you're going to follow POSIX as it exists, you're not left with a choice in the matter.
But POSIX also claims there are 86400 seconds in a day, which is not always true for UTC. There are two ways of dealing with that - POSIX says that the correct way is to just count seconds, but reset every UTC midnight to (number of days since 1970-1-1) * 86400, which means that when leap seconds occur some epoch numbers are ambiguous (or in a leap-second-deletion, are skipped). NTP ignores POSIX and says that the way to deal with this is to vary the length of a second during a day which contains a leap second.
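The POSIX rule above - just count 86400-second days and reset at each UTC midnight - is easy to check in Ruby; note that the leap second inserted at the end of 2016 leaves no trace in the count:

```ruby
require 'date'

# POSIX time: (days since 1970-01-01) * 86400 + seconds into the day,
# resetting at every UTC midnight, so leap seconds simply vanish.
days = (Date.new(2017, 1, 1) - Date.new(1970, 1, 1)).to_i  # => 17167

# A leap second was inserted at 2016-12-31 23:59:60 UTC, and yet:
days * 86400               # => 1483228800
Time.utc(2017, 1, 1).to_i  # => 1483228800, the same value
```

In other words, the timestamp arithmetic behaves as if 2016-12-31 had exactly 86400 seconds, which is precisely the ambiguity described above.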
And we're talking about JSON here, so isn't the ECMA-262 standard for dates more relevant than the POSIX standard? ECMAScript has some very fuzzy ideas about dates.
POSIX specifies that there are 86400 seconds in a day. POSIX is not making claims about reality, it is specifying its own reality. That's what standards do.
ECMA-262 isn't really relevant at all, since it's not the (or even an) authority on JSON. JSON was simply derived from it -- in an incompatible way at that. It's doubly irrelevant since you were talking about Unix, so that's what I was addressing.
By the way, your phrasing is odd/confusing. You're talking about "epochs" in a strange way. In Unix/POSIX land, there is one epoch, "The time zero hours, zero minutes, zero seconds, on January 1, 1970 Coordinated Universal Time (UTC).". Unix timestamps are derived from the epoch, they do not define it.
And JavaScript is milliseconds as well. I think either would be fine, as long as it is an agreed standard. Personally, I use ms, because most programming langs can instantly convert without any additional math. But it probably is wasteful. Then again, the extra factor of 1000 is only about 10 bits on each value...
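In Ruby, for instance, the seconds/milliseconds conversion is one multiply or divide either way:

```ruby
ms = 1_500_000_000_000            # a JavaScript-style epoch in milliseconds

t    = Time.at(ms / 1000.0).utc   # milliseconds -> Time (seconds-based)
back = (t.to_f * 1000).round      # Time -> epoch milliseconds again

back == ms                        # true: the round trip is exact for this value
```

(For values near the edge of Float precision you would want integer math, e.g. `Time.at(ms / 1000, ms % 1000, :millisecond)` on newer Rubies, but the divide-by-1000 idea is the same.)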
I love it. When I first started in IT (1994, don't ask...), I was working with London, Tokyo, San Fran, and Singapore, and everyone wrote dates differently. I just started writing YYYY-MM-DD everywhere, and all of the questions went away.
Looks like ISO-8601 to me! We've standardized on using this format (extended to include time where necessary) whenever our JSON objects include a date or date/time. We've also standardized on UTC. Since our system clocks are already synchronized that way, it's easy for us and we simply i18n/l10n them on entry and/or display.
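Most standard libraries can emit and parse that format directly; in Ruby, for example:

```ruby
require 'time'  # adds Time#iso8601 and Time.iso8601

t = Time.utc(2017, 3, 31, 12, 30, 0)

t.iso8601                     # => "2017-03-31T12:30:00Z"
Time.iso8601(t.iso8601) == t  # true: the format round-trips losslessly
```

The trailing "Z" makes the UTC convention explicit in the serialized JSON, so consumers don't have to assume a zone.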
It was, I just didn't know it at the time! I was simply looking for some way to write emails and spec docs in a way that everyone would have a common frame of reference with zero extra work.
Yeah, in storage (as above), I use epoch all the way through and convert as needed. But as the thread above shows, not everyone likes this path...