The fact that SQLite is a library (an embedded DB) means only one node can access the DB at a time. This would not be appropriate for many apps that require HA.
The same thing you do in a multi-reader, single-writer MySQL cluster, except with SQLite there's one less thing to go down, because the database is embedded in your process.
I’m not familiar with MySQL specifically, but people worried about HA have standby replicas. Enterprise DBs make this almost trivial to do and it’s very possible with PG extensions. I’m sure someone has built a system like that with SQLite, but it’s much less ideal than other database systems.
Exactly what is "less ideal" about it? SQLite has distinct advantages over n-tier databases: it's embedded, so you're not round-tripping to the network for fetches, which makes simple query patterns that are untenable for things like Postgres reasonable for SQLite. It's also, of course, faster (about as fast as you can possibly get) for some access patterns.
We’re not talking about latency though. The argument is around the HA story. Totally agree that if your app and DB are on the same machine and if HA isn’t a hard requirement, SQLite is probably what you should go to before PG.
However, if you can’t have any downtime, you’d have to build some bespoke solution to make sure you have standby replicas ready to go if there’s an issue on the primary instance. For a lot of enterprises, that’s a total nonstarter.
Just playing devil's advocate (I don't have much of a dog in the fight):
* SQLite's replication isn't built in; you have to use another library (Litestream, LiteFS, etc.) to achieve it. That in itself is an indication it's not inherently designed for such, and at a minimum it will require a lot more testing to ensure your HA/DR failover scenarios work the way you're envisioning. Perception matters.
* Litestream and LiteFS are in "beta mode" today: fairly DIY, mostly CLI and YAML configs, making them comparatively complex and error-prone to configure vs. most n-tier databases, which offer OOTB replication via GUI; with cloud PaaS databases it's usually just a checkbox.
* "No one ever got fired for choosing IBM" there are tons of blogs and walkthroughs of setting up HA/DR for all the major RDBMSes, it's a safe choice if you run into any issues, whereas you might find yourself deep in an obscure dev forum debugging Docker dependencies or networking protocols. It's a risk.
* Needless to say, commercial support also matters.
* On Litestream's own tips page ( https://litestream.io/tips/ ) there's a subheader called "Data Loss Window". If I were competing in any way with this solution, I think I'd bring that up pretty early.
Broadly, I think it'd be fine for a lightly written, mostly read OLTP system that's relatively stable. Anything else and I might get itchy about relying on just the WAL for replication.
SQLite embodies a great ethos of "no muss, no fuss", but replication, a lot of the time, is 100% fuss.
It seems like you keep missing the SQLite caveat — it’s the same process as your application. HA doesn’t make sense here in the context of SQLite by itself.
Here’s how I think about it, but I don’t do replication with SQLite often, so I might be wrong.
You really can’t compare SQLite to a Postgres or MySQL or Oracle server. With a database server you worry about replication/HA etc for the server. If the server is up, it can be queried, independent of the application.
With SQLite, you worry about replicating the data. If your application is running, it can access the data. You just need to make sure the data is consistent between nodes (if you use multiple nodes).
The data can be replicated, but if your SQLite node goes down, there’s no need to worry about the application because that means the application is also down. Similarly, if the application is up, your database is also up!
I mean this is our entire point. SQLite isn’t built for HA. You’re stuck replicating it yourself or using non-enterprise-ready solutions. It’s not SQLite’s fault, it’s just not the tool for the job if you need HA.
SQLite is one of the most reliable codebases in the industry. You're ignoring what the comment says. In an n-tier architecture, both your app server and your database server need to be HA, because they're servers. With SQLite, the app server is the database server. It's still HA; there's simply one fewer thing to HA-ify.
I don't know what "non-enterprise-ready" means, but it sounds like a selling point to me. The XML can come later.
When the app server is stateless, though, HA for the app server is usually a near-trivially easy problem to solve, and a whole lot of web devs are used to treating the database as a black box that's always up, so I get the fear and uncertainty of suddenly having the HA setup interfere with "their" domain.
Coming at it from a devops angle, I'm used to being the one to have to deal with the database HA myself, and then having one tier less becomes a lot more appealing, and not really more scary than ensuring any other database setup is resilient.
The comment I replied to upthread was also talking about n-tier replication features that weren't built in.
LiteFS isn't the only way to replicate a SQLite database.
Do you want Oracle servers? Because "Nobody got fired for choosing IBM" is how you get Oracle servers.
If you read just one additional sentence in to the Litestream "Data Loss Window" section, you'll find: "This is how many replication tools work including PostgreSQL’s log-shipping replication".
I don't know what you mean by "relying on just the WAL". The WAL is the database. What else would you replicate? How are you contrasting this against Postgres replication?
Sometimes I honestly think that the problem is that the SQLite ecosystem makes it easy to understand how things work, so people get scared, whereas other things come off as magic, and magic is comforting: you don't need to know how it works, you can just trust it.
It is so hard for me to imagine ever trusting magic, but I think you are very right that people seem to prefer it. I think this discrepancy explains so much of the conflict I get into in my career.
> LiteFS isn't the only way to replicate a SQLite database.
What are other ways to replicate that you're aware of?
Another approach I like is to use replicated block storage like GCP Regional Persistent Disks. This makes failover trivial. But then I'm not sure the replica can be used as a read-only copy.
> What are other ways to replicate that you're aware of?
Litestream (WAL -> object storage), rqlite (wrap in an HTTP layer + Raft + streaming changes between nodes; rqlite kinda defeats much of the purpose for me), dqlite (in-process server thread + Raft + streaming changes; dqlite obscures that there are server threads plus network communication involved, but it's there), litereplica (VFS extension streaming to a replica), Verneuil (VFS extension -> object storage), the SQLite Ceph VFS (VFS extension storing to RADOS)
Very different sets of tradeoffs. E.g. several of these require you to run a separate tool to do the replication. Several depend on object storage, either for the primary storage (the Ceph extension) or as the replication target / something to restore from, which may be fine if you already have a compatible object store. Some can use e.g. Consul to control which server is the writer.
> Another approach I like is to use a replicated block storage like GCP Regional Persistent Disks. This is making failover trivial. But then I'm not sure the replica can be used as a read-only copy.
Block storage that can guarantee in-order read/writes is fine. Network filesystems that can guarantee in-order writes and support proper locking can also allow writes (with the usual caveats) from multiple nodes. The problem here is that you really must be sure, and often - e.g. for NFS - it can appear to work just fine but be just broken enough you'll run into it at just the wrong moment.
Abstractions. Sometimes you need them. Different operating systems, CPU architectures, applications, clients, libraries, versions, hell different companies running proprietary apps, licensing, regulations... A database with a dumb network socket on a random host is fairly well abstracted.
As far as HA goes, that abstraction can then go toward allowing your apps and DB to be built for different reliability/durability/etc requirements, potentially saving money, gaining better performance or reliability, meeting weird compliance requirements, contractual requirements, vendor or software limitations, etc.
It's also easier to troubleshoot and maintain the database separate from the application in some circumstances. If SQLite is bundled with your application, it might be harder to deal with a problem affecting one or the other. In traditional organizations, you can hire people to manage databases and people to manage applications, rather than one person who has to deal with both, or even weirder, two people that would be working on one db-in-your-app.
Sometimes this is true, and sometimes it's what CORBA advocates said to get you to adopt object brokers. Sometimes your app benefits from an explicit networked database tier, and sometimes it doesn't. I'm fine if you think most of the time it does. I'm not fine if you think it rarely does: that's not true. There are big advantages to not having a separate database tier, and the industry is sleeping on them.
(To be clear: the company I work for sells a Postgres service, and we benefit from n-tier databases! I have strong opinions here because I think this is an interesting controversy.)
The point of disaggregating your replicated database is you can scale your app and db tiers independently. Typically you only want to have a handful of database replicas (e.g. to failover after your primary fails), especially if you want strongly consistent replication. But your stateless app tier may require many more servers.
The only real difference between SQLite and something like Postgres or MySQL is that Postgres and MySQL bundle a networking layer on top of their embedded DB engines.
If you are worried about high availability, chances are you too are building a networking layer on top of a database, so what do you need two networking layers for?
Do people build their own networking layers above SQLite (besides for fun)? If you’re doing that, then you need to build some kind of replication story. At that point it makes sense to just use PG or MySQL.
Unquestionably. Web apps and the like which are little more than networking frontends to a database are probably the most common type of software written these days.
Ah, we’re talking about different things. I took “network layer” to mean something like the ability to connect directly to the database over the network, not through some shim application.
Meaning something like rqlite[1]? The age of fat desktop clients all connecting back to the central SQL server is long behind us, so yeah there is probably little reason beyond fun for something like that these days, but where there is fun!
Most apps nowadays are HTTP APIs on top of a database, which means we have one networking layer – the HTTP API – on top of another networking layer – the database protocol. The idea behind using SQLite is to just skip the latter, which simplifies the stack significantly and reduces database latency, making things like N+1 queries a non-problem.
With a tool like LiteFS, it is now possible to get replication done by a sidecar process: https://fly.io/docs/litefs/
It's because people are using super slow languages which incur a lot of overhead in running the queries and formatting the results into human-readable form. As such, they need many more machines to run these web apps than they do to run the database itself.
Agreed, although we are having this discussion now exactly because those same problems are starting to get solved for SQLite in an open and polished way, so it is no longer a case of needing that second networking layer to integrate with such solutions.
You're writing this as if LiteFS and Postgres were basically the same thing, and the selection criteria just boils down to the Postgres pedigree and maybe all the Postgres-specific SQL features. But that's not the case at all. The difference between replicated SQLite and Postgres is that SQLite is in-core: it doesn't have to round-trip on the network to fetch data; it can burn through a large set of N+1 queries faster than Postgres can handle a single select. The difference is that SQLite is much faster.
You sacrifice things to get that speed (Postgres features, set-and-forget write concurrency). Nobody is saying there's no reason to use Postgres, or maybe even that Postgres is the right call most of the time. But the idea that SQLite is rarely appropriate for concurrent serverside applications? It's received wisdom and it's wrong.
Somebody across the thread actually suggested that WordPress was an example of the kind of application that SQLite wouldn't work for, that needed an n-tier database. (Leave aside the fact that WordPress doesn't support SQLite and instead has a longstanding MySQL dependency.) WordPress! WordPress is a perfect example of a concurrent serverside application that probably should almost exclusively use SQLite. As I said in a different comment: the whole movement towards static site generators is, in large part, a reaction to how bad n-tier databases are for a very large, popular class of concurrent serverside applications.
My second paragraph was specifically talking about the gigantic difference between operating a stateful application vs. a more typical stateless app + stateful DB. Most people aren’t deploying to Fly.io (yet!), so making Litestream work sounds pretty high risk to me :) combined with the lack of pedigree I led with, I just don’t see litestream as a good answer to “I need a HA database solution” for most people.
For most of the world (outside Silicon Valley), stateless is not typical. The typical setup is a stateful app + stateful DB run on stateful VMs.
For most enterprises, running a stateful app with SQLite is not only possible, it's easier than running stateless apps as their entire IT infrastructure is setup to support stateful apps and it's what their sysadmins know how to do.
I have tried to deploy stateless Docker container apps into a typical enterprise network and it's a nightmare. The people don't understand it and aren't interested in understanding it, none of the infrastructure supports it, and if you want to set up something like Kubernetes you end up in configuration hell, as none of the easy-to-deploy standard setups actually work out of the box in an enterprise network; everything ends up needing highly customized configuration to work.
I'm sorry, I don't understand any word of that sentence (for instance: we don't run containers?). We sponsor Litestream and LiteFS (in the sense that we pay Ben to work on it full time), but no part of Litestream is baked into or in any way coupled with our platform. If you can run Litestream on Fly.io, you can run it anywhere else.
You "transmogrify container images into Firecracker micro-VMs" - surely you knew what I meant?
And there is no denying that fly.io is a _particularly_ well-suited platform for running stateful SQLite apps (with or without Litestream), since fly makes running stateful applications easy. I think that's awesome.
I'm really confused why you seem to be trying to distance yourself from that, and why you are latching on so hard to my passing mention of fly. I wasn't making a dig, just pointing out that most people are stuck with (arguably worse) platforms that don't lend themselves to stateful, "in-core" databases.
My confusion is that there's nothing container-y about Litestream. It works anywhere SQLite does. I like our platform a lot, but I mean, Litestream works just as well on AWS.
If your application exceeds the capacity of a single server (e.g. a Java Web Service that is backed by a database), do you see SQLite as a useful solution for that case?
That seems like when you'd need a standalone database, no?
If you think the capacity will never exceed that one standalone database. If not, you'll need to plan for sharding anyway, in which case sqlite can work just fine.
SQLite has been around for a very long time, it is easily the worlds most deployed SQL database. Litestream is just a way to do live sync/replication to make HA easier.
There are certainly use cases where SQLite is not a good fit. There are likewise use-cases where PostgreSQL is not a good fit either.
One is not better than the other, it just depends on your particular needs for that particular project.
My point is, SQLite is a totally sane and reasonable storage/DB solution for many server side applications as well.
> SQLite requires the place you run your application to have durable storage, which is a huge departure from the status quo.
Having durable storage used to be the status quo for decades. It changed only recently with cloud providers (or their customers) pushing for stateless workloads because they are much easier to manage in a distributed system than stateful workloads.
I don’t see how SQLite replicated to other servers can be faster than any other relational DB replicated on those same servers, at least without taking more chances of losing data. The limit is most likely network latency anyway, isn’t it?
Other relational databases require network roundtrips to fetch information from the database once it's replicated. SQLite doesn't. It's a huge performance difference, to the point where it changes how you access your database; for instance, there's not much point in hunting down N+1 query patterns anymore.
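To make the round-trip point concrete, here's a minimal Python sketch (the schema and names are illustrative, not from the thread) of the kind of N+1 pattern that's painful when every query is a network round trip but cheap when SQLite runs in-process:

```python
import sqlite3

# Illustrative schema: authors and their posts.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'hello'), (2, 1, 'again'), (3, 2, 'hi');
""")

# One extra query per author row -- the classic N+1 pattern. Each call is
# an in-process function call, not a network round trip, so it stays cheap.
for author_id, name in con.execute("SELECT id, name FROM authors ORDER BY id"):
    titles = [t for (t,) in con.execute(
        "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,))]
    print(name, titles)
```

With a networked database, each of those inner queries would pay at least one round-trip time; here it's microseconds of library overhead.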
Right: for a SQLite-based app, you are mostly limited to one instance running at any given time (there is multi-node stuff built on top of SQLite by third parties, but that's a whole different can of worms).
The point being, you run 1 instance and you have litestream replicate to your backup node for HA purposes.
Now you're going to think: what about scaling?! How many apps actually need to scale beyond one node? Very few. If you run into scaling problems, that is when you deal with solving them, because scaling is unique to each application. But before you even think about scaling past one node, you just build the node bigger. Nodes can get pretty massive these days.
There's rqlite (https://github.com/rqlite/rqlite), which looks cool on the surface but... it's a layer on top of sqlite, at which point you should probably think long and hard whether it's still the right tool or you should switch to e.g. postgres.
To get HA, it is now possible to replicate the SQLite database using LiteFS, which is similar to how you would get HA with MySQL or PostgreSQL: https://fly.io/docs/litefs/
That's silly. SQLite works fine for all sorts of customer features, and, deployed carefully, is fine with concurrency (writes are serialized, per database, but SQLite makes it easy to use multiple databases). SQLite has this weird reputation, I think, because frameworks like Rails used it as their "test" database, and the industry has broadly slept on how capable SQLite is in serverside applications.
I think if you use a lot of database features, SQLite isn’t really powerful enough for a lot of use cases, IMO. Or at least that’s the impression I have.
Concurrent inserts. There are applications where you don't care about that, but that excludes a lot of web-apps, or things running with multiple processes.
If that changes, I'd agree with your point, but currently that's a big constraint.
Maybe I'm using SQLite wrong, but I have a website scraper. It's written in python and so uses processes for concurrency to avoid GIL issues.
On my laptop, 16 processes accessing the same SQLite db file can do something like 5,000 write transactions/second. That seems like plenty for many purposes?
A caveat: it's in WAL mode, with synchronous mode set to NORMAL. This means that it's not fsyncing the WAL on each individual transaction commit. Instead, they get synced in batches roughly every 200ms. If you pull the power plug on the server, it may lose some amount of writes that happened in that time. Terminating the process is fine.
Yeah, yeah, we're giving up some amount of durability. In practice, not very much (200ms of transactions) and only very rarely (when the server dies). That seems like a fine trade-off for many backend processes and many web apps. You could tune this, as well: for the transactions that you _really_ want durability on, run a checkpoint before rendering your response to the user.
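For reference, here's a minimal Python sketch of the configuration described above (WAL mode plus synchronous=NORMAL); the file name and schema are illustrative:

```python
import os
import sqlite3
import tempfile

# Illustrative file path; any on-disk database works (WAL mode doesn't
# apply to :memory: databases).
path = os.path.join(tempfile.mkdtemp(), "app.db")
con = sqlite3.connect(path)

con.execute("PRAGMA journal_mode=WAL")    # readers don't block the writer
con.execute("PRAGMA synchronous=NORMAL")  # don't fsync the WAL on every commit
con.execute("PRAGMA busy_timeout=5000")   # wait up to 5s instead of erroring

con.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, msg TEXT)")
with con:  # one write transaction
    con.execute("INSERT INTO events (msg) VALUES (?)", ("hello",))

# For a transaction you really want durable before replying to the user,
# force the WAL to be checkpointed and synced:
con.execute("PRAGMA wal_checkpoint(FULL)")
```

The checkpoint at the end is one way to implement the "run a checkpoint before rendering your response" idea for the writes you really care about.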
How exactly does that exclude a lot of web apps? Most web apps are very read heavy. And SQLite serializes writes to individual databases, not across all of them. It's trivial in SQLite to query across multiple databases.
Further, some form of write serialization is common to most database clusters: the most common cluster architecture is single write leader, multiple write replicas, which have the effect of serializing writes as well. Those clusters are all also taking advantage of the read-heaviness of applications.
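As a sketch of what querying across multiple databases looks like (the table names and file layout here are illustrative): since SQLite serializes writes per database file, you can split hot tables across files and still join across them from a single connection via ATTACH:

```python
import os
import sqlite3
import tempfile

# Illustrative layout: two separate database files, one per hot table,
# since SQLite serializes writes per database file, not globally.
tmp = tempfile.mkdtemp()
users_db = os.path.join(tmp, "users.db")
orders_db = os.path.join(tmp, "orders.db")

c = sqlite3.connect(users_db)
with c:
    c.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    c.execute("INSERT INTO users VALUES (1, 'alice')")
c.close()

c = sqlite3.connect(orders_db)
with c:
    c.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
    c.execute("INSERT INTO orders VALUES (1, 1, 9.99)")
c.close()

# A single connection can ATTACH both files and join across them.
con = sqlite3.connect(users_db)
con.execute("ATTACH DATABASE ? AS odb", (orders_db,))
row = con.execute("""
    SELECT u.name, o.total
    FROM users u JOIN odb.orders o ON o.user_id = u.id
""").fetchone()
print(row)
```

Writers on `users.db` and `orders.db` never contend with each other, which is the sense in which the serialization is per database, not global.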
I did an experiment of switching an app which uses Postgres to use SQLite. I ran into issues where I had a long-running transaction in one process doing writes and another process doing reads and writes. As soon as the second process tried to do any writes, I would often get the queries running into locks and aborting. Otherwise, it worked great for reads and sped up the app a lot, as there were lots of N+1 query issues.
This is also after playing with settings about WAL mode.
It would not be a viable solution to split these tables into separate databases to avoid the concurrency issue. Also, most of the issues involve writes to the same tables.
Is there honestly any way to get that working well with current SQLite or do we need to wait for hctree or begin concurrent to mature?
I would be concerned that to get this working in its current state would require a large refactoring of the app which would require a lot of effort and would introduce limitations that make coding more difficult.
Many web apps also write a lot. Are you suggesting to have individual databases per process and then read from all of them and aggregate the data you want in memory? That would sound too much like I'm building a database, which I don't want to do.
I'm not talking about clusters and replication, just a single database on a single server, like Wordpress for example. It can handle large amounts of concurrent users if you're using InnoDB (which has been the default since how long?) and nobody will even notice that there's anyone else because it's not locking the table (or the whole database) to write, and you don't have to wait or deal with frequent busy timeouts.
Most SQLite writes can be measured in microseconds. If your app is doing less than 1,000 writes a second you probably won't even notice that SQLite is serializing your writes rather than applying them concurrently.
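If you want to sanity-check that on your own hardware, a rough micro-benchmark might look like the sketch below. Note the caveat: using an in-memory database and one wrapping transaction isolates the per-statement cost and deliberately excludes fsync/commit cost, so on-disk numbers will differ:

```python
import sqlite3
import time

# Rough micro-benchmark of per-statement write cost. An in-memory DB and a
# single wrapping transaction exclude fsync overhead from the measurement.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")

n = 10_000
start = time.perf_counter()
with con:  # one transaction around all inserts
    for i in range(n):
        con.execute("INSERT INTO kv VALUES (?, ?)", (i, "x"))
elapsed = time.perf_counter() - start
print(f"{n} inserts in {elapsed:.3f}s ({n / elapsed:,.0f} per second)")
```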
I must've been doing something wrong then. I regularly hit long locks as soon as the DB grew, to the point that it was not usable for me with multiple processes writing to the same DB, because it was constantly locked by one.
Having one writer and multiple readers was fine, but even having a dozen writers that each update something a few times per second quickly made the DB the bottleneck. I tried different modes but saw no difference, so I just gave up on using it concurrently.
I spent a bunch of time wrestling with this last year (and into this one), and the general solution I settled on was to serialize writes through a single writer; multiple reader threads and multiple writer threads (or processes) is not a fun configuration to put SQLite in.
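A minimal sketch of that single-writer pattern (the queue protocol and names here are illustrative, not the exact solution described above): any thread may read on its own connection, but every write is funneled through one dedicated thread, so SQLite's write lock is never contended:

```python
import os
import queue
import sqlite3
import tempfile
import threading

# Illustrative setup: one database file, one dedicated writer thread.
db = os.path.join(tempfile.mkdtemp(), "app.db")
sqlite3.connect(db).execute(
    "CREATE TABLE IF NOT EXISTS log (id INTEGER PRIMARY KEY, msg TEXT)")

write_q: queue.Queue = queue.Queue()

def writer() -> None:
    con = sqlite3.connect(db)
    con.execute("PRAGMA journal_mode=WAL")
    while True:
        item = write_q.get()
        if item is None:  # shutdown sentinel
            break
        sql, params, done = item
        with con:  # commit each queued write
            con.execute(sql, params)
        done.set()  # tell the caller the write is committed
    con.close()

def submit_write(sql: str, params: tuple) -> None:
    done = threading.Event()
    write_q.put((sql, params, done))
    done.wait()  # block until the writer thread commits

t = threading.Thread(target=writer, daemon=True)
t.start()
submit_write("INSERT INTO log (msg) VALUES (?)", ("hello",))
write_q.put(None)
t.join()
```

Readers open their own connections and never touch the queue; only mutations pay the serialization cost, which is what you want given how read-heavy most apps are.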
There were 500MB-40G databases, probably ~1.5 updates per second at peak load?
I think "web app" is too broad a category to have this kind of discussion over. For a lot of web apps, 1.5 updates per second could be considered a reasonable limit. Many people work on web apps that do 100x that and don't consider themselves to be working at any particular level of scale. 1.5K updates/sec is not uncommon either, in different settings: high-traffic internal web apps can hit that within a single company, and the people working in those contexts don't consider that too special either, not to mention public-facing websites like GH, where many people start their careers without having seen anything different.
I'm not benchmarking SQLite, I'm relating the environment in which I was tuning SQLite concurrency, because I feel like it doesn't mean much to say "here's what worked for me" without at least some basic facts about what I was working on. SQLite itself can do tens of thousands of writes per second on commodity hardware without breaking a sweat.
There's more to most applications than interactions with the database. Once your web app exceeds the capacity of a single machine, is SQLite a good choice for state?
I think he’s saying to shard writes by tenant, or something else that makes sense for your app - so you can route reads to an individual shard. Same as you’d do to shard any other database, if a little more DIY.
Speaking of Wordpress though, that’s a pretty great example of a web app that typically handles almost exclusively reads. I’ll bet almost any Wordpress instance would work great on SQLite.
WordPress is the archetypical application that SQLite would kill for, but for historical reasons, WordPress has always wanted MySQL. It's funny, because there's a whole movement towards static sites that is, in part, a reaction to how bad the n-tier architecture behind WordPress is!
It's crazy that this is only happening as of 2022, right? They could have kept a whole generation of web sites on WordPress by eliminating the MySQL dependency, and gained a performance and deployability advantage in the process. Even multitenant WordPress would have come out great with SQLite. I'm waiting for someone closer to WordPress to tell me how wrong I am about this, because I have to be missing something.
I'm not sure that it'll bring a lot of sites back to WP or keep them on it, but it'll be great for testing. Much of it ends up being integration testing with WP: because of how filters and actions work, your code is rarely ever isolated. SQLite will make that much easier and faster.
It's also really not a big switch for the most part. I'm not aware of that many MySQL-specific things happening in WP. Things like fulltext indexes aren't in core, so it's really just using a different driver to get the basics working. I'm sure it would break a lot of plugins, but that's fine, they have platform requirements anyhow, adding "does work with sqlite" isn't a huge step and can likely be automated to a high degree (if the plugin never uses wpdb->query, ->get_results etc, it's compatible).
And to be clear, this would be a single instance of the WordPress php application that all traffic would route to, and it would access the SQLite database on that node?
If that node failed, then the php application and database would be restored to a new machine and traffic would resume?