Just recently I thought to myself "it has been a long time since I saw the Github Unicorn". As it turns out, I didn't really miss it at all.
Regarding the centralized nature of Github: it is the centralized communication that is a problem, not the ability to share code. I can easily send a patch to somebody on my team, but that doesn't help me review a PR, reply to comments, trigger a CI build, or initiate a deploy.
Code sharing is only a small part of what a team relies on Github for.
Yup. Half the comments in this thread are "oh you could do issues in git or something." No, when my PM asks me if bug 123 is fixed, I want a single source of truth. Whether that truth is GitHub, JIRA or something else, it's still a single point of failure.
Maybe the best way would be to put those wiki/issue/whatever-data in the repository itself and just build tools to throw at them. This could be github, gitlab or something local.
Back when hurricane Sandy took out my employer's upstream Mercurial server for several days straight, I was pretty chuffed to be able to tell my boss, "You know how we migrated to a new source control system a couple months ago? Well, thanks to that, we can keep working pretty much uninterrupted. We should double check that XXX is getting frequent offsite backups, though."
My current company uses a self-hosted option, so this doesn't affect me. But I can't help but think that this time it's different, and we'd be hosed. The git part would still work without too much hassle, but we are heavily dependent on a bunch of additional things that GitHub offers, such as the pull request interface. That's slightly worrisome, I suppose.
All that said, I want to steer clear of knee-jerk assuming, "We don't have this problem b/c we self-host." There's a sickening sense of not really being in control of your own fate when a cloud provider goes down, but, realistically, I wouldn't be the person in charge of getting one of our self-hosted services back up, either. What really matters is % downtime. My experience has been that, compared to many in-house IT departments, folks like GitHub are generally very good at keeping the lights on.
Is it just me or is the new GitHub Status page a joke? It used to show historical charts for uptime and latency for each of the different services, which was actually useful. Now it is just a daily messages list.
The Wayback Machine suggests it was 13th September 2017: [1] (from 17:50:53 GMT) has graphs, but [2] (from 20:06:33 GMT) is a 302 redirect to the messages page.
I saw that, too, and was fairly concerned. A smoothly running, quality focused engineering organization doesn't leave a status page un-updated for 3+ weeks...
Also, "reports of service unavailability"? I would expect monitoring tools to be screaming...
They had an outage happening a few days ago, and I happened to be looking at the status page. It appeared that days in the past were just showing the current status. It went orange, all days past went orange. I refreshed again, it went red, all days past went red.
Time allocated for this used to be a given. It's interesting how quickly we have moved to SaaS.
Though if your internal IT team had a VCS outage, it wouldn't be Hacker News news. Really it is just the scope of the outages (and commiseration) that has changed.
Yeah, merely an observation, not a criticism of SaaS.
However I do wonder sometimes, even with the resources and domain knowledge, if the super crazy scale these SaaS companies have to deal with tips the scales to be about the same reliability as a solid internal ops or IT team (who only have to worry about YOUR scale).
It does this without cluttering the repository with unneeded files, gives the possibility for having tree-like conversation (including merging conversations), referencing issues, linking to commits and so on.
It is even possible to host the issues in a different repository from the code (if that is wanted), to have one repository of issues for several projects, or to move issues from one repository to another.
There is only a CLI as of today. I hear that there is some effort on building a (view-only) web frontend for it, though I don't know much about its progress. Maybe asking the maintainer would be an idea.
What's also missing is a way to give users of the tool access to a repository where they can submit issues (which then could also be used by a web/gui frontend for the tool). This is not the domain of git-dit itself, but a solution needs to be found. One idea would be a publish repo (where everyone can push) which automatically does some sanity-verification on the issues and forwards them to the maintainer's repository... or something like that.
Also integration/mappers to/from other services (gitlab, github) are missing and so is mail->git-dit integration (posting issues from a mailing list automagically into the issues repository).
Also, https://github.com/vitiral/artifact/ is a really nice tool to do planning of an application or library inside a git repository. I am currently starting to use it in imag (https://imag-pim.org) and it is really wonderful. The author is currently reimplementing its core functionality to make it even more powerful.
Yeah I bet these guys will have this fixed in no time. If you manage your own instance you are bound to have a less entertaining landing page when it breaks.
I've worked on numerous enterprise git servers - they all inevitably go down for at least a half day every 6-9 months.
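To put that figure next to the "% downtime" and "nines" arguments elsewhere in the thread, here is a rough back-of-the-envelope calculation. It assumes "half a day" means 12 hours and uses an average month length; both are my assumptions, not numbers from the comment.

```python
# Average month length in hours (365.25 days / 12 months * 24 hours).
HOURS_PER_MONTH = 365.25 * 24 / 12


def availability(outage_hours: float, interval_months: float) -> float:
    """Fraction of time a service is up, given one outage per interval."""
    total_hours = interval_months * HOURS_PER_MONTH
    return 1.0 - outage_hours / total_hours


# Assumption: "half a day" is read as 12 hours of downtime.
for months in (6, 9):
    pct = availability(12, months)
    print(f"one 12h outage every {months} months: {pct:.3%} available")
```

Under those assumptions the result lands somewhere around 99.7% to 99.8%, i.e. between two and three nines.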
Is that better or worse than stripe who updates their status page saying something is wrong all of the time and then just changes it back without noting any problem in the history? I get alerts on them 4-5x/week and only maybe one of those winds up as a colored entry in the history.
...and sure enough, shortly after writing this it was 'down' for 10 minutes and came back up with the status page saying nothing about it.
While I kind of hate Atlassian products (bloated, horrible APIs), and Bitbucket has some serious UX problems (Find), it's still better than having your repos offline.
There's not much that can go wrong on a single instance with a local database. Have a failover in case the HW fails and you are good to go, and it's faster. The pricing is not great, but perhaps that's something GitLab can solve.
But being hip and being lean (even if it costs more) is more important I guess.
I think GP was referring to the fact that bitbucket.org-the-website is just the Atlassian-hosted installation of BitBucket-the-Atlassian-product, which you can also self-host and self-administrate.
It is payware, but reasonably affordable ($10/10 users, usually).
I've seen companies squeeze some (probably contract-breaching) huge numbers of users out of the very small plans of Atlassian software, as well, usually via insane multiple-people-using-same-account editing conventions.
> to the fact that bitbucket.org-the-website is just the Atlassian-hosted installation of BitBucket-the-Atlassian-product, which you can also self-host and self-administrate.
Nope. They're completely separate things.
The hosted one they bought, and does mercurial too.
The self hosted one is what used to be called stash, and is git only afaik.
Not a Bitbucket user, but this surprises me given the seemingly constant downtime issues that plagued Hipchat when I used it on a regular basis, but maybe it was just isolated to that one product.
See my reply to nik736's comment in this subthread: bitbucket.org is just Atlassian's copy of BitBucket, which you can buy from them and hose yourself.
EDIT: I meant "host", not "hose", but given some of my experience sysadminning Atlassian products, the sentence may still be correct.
> bitbucket.org is just Atlassian's copy of BitBucket, which you can buy from them and hose yourself.
This is incorrect. Bitbucket Cloud and Server are two completely separate codebases. On the other hand, GitHub Enterprise is basically a snapshot of the production GitHub application.
Interesting, thanks for the info. I had assumed BitBucket cloud was a selected distribution of mostly the same components as BitBucket Server, plus a few proprietary things they don't sell in the hosted version.
Self-hosted vs SaaS is not a binary decision, and I hate seeing it argued like that every time there's an outage.
What the hell happened to basic risk mitigation? Offsite backups, disaster procedures, etc aren't new, or unexpected. You'd think they should be the norm, if you're a working professional...
It’s been on Hacker News for two minutes. Your reply is from one minute ago. Perhaps you should check your SLA with GitHub, but presumably it doesn’t say that a 60-second delay is the same as the status page being effectively useless.
Also, it was updated by the time you posted your reply...
This is why it pays to host your own source repositories. It is kind of shocking that many people with the skills and means are too cheap to host their own. I personally could not risk github deleting my repositories or (in this case) going down for any length of time.
Though in an alternate universe where somebody's self-hosted server went down in flames (possibly literally), one could just as easily say:
This is why it pays to have your own source repositories in the cloud. It is kind of shocking that many people with the awareness and means are too cheap to pay for a GitHub private repo. I personally could not risk a careless sysadmin deleting my repositories or (in this case) going down for any length of time.
The obvious answer is to have both, but smooth synchronization isn't always easy or even available.
You seem to be assuming that your self-hosted git will be more bullet-proof than GitHub. It probably won't be. You'll probably have fewer nines than they do; it's their business, after all.
Even if your own solution had better uptime, you still haven't shown that it's worth it. GitHub is far more than just a git repo on a server.
The problem is that if you spin up a GitLab in DO or Linode, there is a chance that your VM will get lost due to hardware failure. You also need to manage backups and stuff.
I'm right there with you, but it doesn't make sense for small teams. Once you get past a point, it starts to make sense to self-host some stuff.
Is there a managed hosting provider that would host all this stuff for you? Like, say "I need GitLab, Jira, Active Directory, and X, Y, Z" and they come back with "$XXX/month for 10 users"?
I don't worry about Github deleting my stuff. Every single developer has a copy of all of the repos they work on. Delete it from Github and any one of us can just push it somewhere else. That's one of the great things about git.
Them going down is still a problem if you're using Github as part of your daily development or deployment process, of course.
Eh, self hosting stuff is a hassle and I'd be willing to bet most people are more concerned about the time/reliability than the cost.
I use gh pages for the static parts of my own site even though I have a web server (which I use for quickly sharing files and whatnot) because GitHub is less likely to leave things broken than I am.
You nailed it for the devs but you gotta' say it to people who would pay for it too.
Decentralized GitHub would synergistically leverage our existing cloud infrastructure to provide unprecedented collaboration that is open, robust, efficient, and focused.
"crypto" baiting is like the funniest trend ever in the Stock Market lately.
Kodak is riding that pony home ... (noting I live in Buffalo, just down the road from Kodak's home turf in Rochester and I'm a pretty avid photographer - http://www.instagram.com/crispyfotos/).
You got to ironically buzz to be cool today?
Pretend you didn't get it, so all those d* who didn't get it can lecture you.
We are deeply invested into the idea of the Seagullarity!
Actually that's the FIRST time I've ever heard of lattice. Is that a real deal? Not trying to feed into the hype here, but from a math perspective I get fascinated by concurrency
Yeah, a cryptocurrency called Rai currently uses it (maybe IOTA too?). Essentially, each account gets its own blockchain and they're connected together using a DAG data structure.
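For the curious, a toy sketch of that "one chain per account, linked into a DAG" idea might look like the following. This is only an illustration of the data structure as described above: the class names, fields and the send/receive pairing are my own simplification, and it leaves out signatures, consensus and everything else a real system needs.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Block:
    account: str
    previous: Optional[str]  # hash of the prior block on the same account's chain
    link: Optional[str]      # hash of a block on another chain (the cross-chain DAG edge)
    balance: int             # account balance after this block

    @property
    def hash(self) -> str:
        data = f"{self.account}|{self.previous}|{self.link}|{self.balance}"
        return hashlib.sha256(data.encode()).hexdigest()


@dataclass
class Lattice:
    # One independent chain (list of blocks) per account.
    chains: Dict[str, List[Block]] = field(default_factory=dict)

    def _append(self, account: str, link: Optional[str], balance: int) -> Block:
        chain = self.chains.setdefault(account, [])
        previous = chain[-1].hash if chain else None
        block = Block(account, previous, link, balance)
        chain.append(block)
        return block

    def balance(self, account: str) -> int:
        chain = self.chains.get(account, [])
        return chain[-1].balance if chain else 0

    def open_account(self, account: str, balance: int) -> Block:
        return self._append(account, None, balance)

    def transfer(self, sender: str, receiver: str, amount: int) -> None:
        if amount <= 0 or self.balance(sender) < amount:
            raise ValueError("invalid amount or insufficient balance")
        # A send block goes on the sender's chain...
        send = self._append(sender, None, self.balance(sender) - amount)
        # ...and a receive block on the receiver's chain points back at it.
        self._append(receiver, send.hash, self.balance(receiver) + amount)


lattice = Lattice()
lattice.open_account("alice", 100)
lattice.transfer("alice", "bob", 30)
print(lattice.balance("alice"), lattice.balance("bob"))
```

Each account's chain only ever grows by its own blocks; the `link` field is what ties the separate chains together into a graph.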
SIA coin is a similar concept. They even have video hosting/streaming on their development roadmap.
Decentralized storage, filesystems and social media seem to be the most valuable use cases for blockchains aside from the inherent value of cryptocurrency.
It's funny that there's already a Gitcoin project, however they do NOT have a token:
https://gitcoin.co/
The goal of the project is to incentivize FOSS development, similar to Bountysource, except without requiring participants to trust a central party. It's pretty cool!
The issue is that the features we use along with git, many of which github provides, are not decentralized. The true but tired argument that git will continue to work when github goes down totally ignores this issue.
Yes, git still works. But we don't just rely on the features git provides.
So this isn't really anything to do with Git then is it? So why joke we centralised Git when really we centralised a bunch of other things that are not really anything to do with Git?
It's also an issue with git itself if it goes down and you don't update your local clones ~daily (or, as others mentioned, don't have a system in place that would allow you to update somewhat locally).
You can still work with your colleagues by pushing and pulling your own repos without involving GitHub.
I think centralised CI is the real problem. I don't have the compute power in my home to run our full test suite, so I can't push with confidence without my CI cluster.
Issues and other GH infrastructure is arguably a bigger problem. That metadata is locked within the Github silo with no easy way to export it elsewhere.
This is indeed the crux of the problem. I've been thinking about this a lot (and I wouldn't be surprised if it exists already), we need a decentralised method of storing issues and other things inside our git repos.
https://github.com/neithernut/git-dit provides a distributed issue tracker inside git, without cluttering the repository with unneeded files and also gives the possibility for having tree-like conversation, referencing issues and so on.
Unfortunately, no non-cli frontend exists right now (feel free to build one, shouldn't be complicated). Also some convenience is still missing, but could easily be integrated.
What's also missing is a way to give users of the tool access to a repository where they can submit issues (which then could also be used by a web/gui frontend for the tool). This is not the domain of git-dit itself, but a solution needs to be found. One idea would be a publish repo (where everyone can push) which automatically does some sanity-verification on the issues and forwards them to the maintainer's repository... or something like that.
Also, https://github.com/vitiral/artifact/ is a really nice tool to do planning of an application or library inside a git repository. I am currently starting to use it in imag (https://imag-pim.org) and it is really wonderful. The author is currently reimplementing its core functionality to make it even more powerful.
If you think about it, there's no particular reasons why the metadata can't live in (and be tracked by) the repo itself.
Issues could live in /issues. Simple command-line (or GUI) tools could edit them. I'm thinking in particular of how password-store[0] makes tracking history in a git repo invisible: it Just Works™.
Discussions could live in /discussions, stored in something like RFC822 format. Again, simple CLI (or GUI, if you swing that way) tools could manipulate this easily.
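As a rough idea of how little tooling the RFC822-style approach needs, here is a sketch using only Python's standard `email` module. The file layout and the header names (`Title`, `Status`, `Reporter`) are made up for illustration; nothing here is an existing tool or convention.

```python
from email import message_from_string

# Hypothetical contents of a file such as issues/123.txt. The header names
# are invented for this sketch, not part of any standard.
ISSUE_TEXT = """\
Title: Login page returns 500 on empty password
Status: open
Reporter: alice@example.com

Steps to reproduce: submit the login form with the password field empty.
"""


def parse_issue(text: str) -> dict:
    """Split an RFC822-style issue into its header fields and free-text body."""
    msg = message_from_string(text)
    return {
        "title": msg["Title"],
        "status": msg["Status"],
        "reporter": msg["Reporter"],
        "body": msg.get_payload().strip(),
    }


def close_issue(text: str) -> str:
    """Return the issue text with its Status header set to 'closed'."""
    msg = message_from_string(text)
    msg.replace_header("Status", "closed")
    return msg.as_string()


issue = parse_issue(ISSUE_TEXT)
print(issue["title"], "->", issue["status"])
print(parse_issue(close_issue(ISSUE_TEXT))["status"])
```

Because each issue is a plain text file, closing one is just an ordinary commit, and the history of an issue is the git log of that file.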
A wiki can, again, live in the same repo.
PRs are a little different, since they really do need to live outside the repo. But what is a PR other than someone saying, 'hey, please pull my branch into yours'?
PRs and other things could also just live in a "shadow repo". Even if just by convention.
You have a `Product` repo and a `Product-meta` repo.
The biggest issue I have with using git as a truly decentralized system is remote management. Unless you want to be manually futzing with remotes on every single client and pushing/fetching from others correctly, you need some kind of central server.
I really think there is a hole here for a product that works with git underneath, but gives a nice easy way to manage all that complexity.
Also, the workflow on Github is one many people like, and it differs a bit if you have to use git "the old fashioned way". Not that it's hard or impossible, but it differs. I can't imagine explaining the GitHub-less workflow to my colleagues.
There are several attempts at tracking issues inside a repository. What we really need to sell the concept, though, I think is one that can reasonably sync with GitHub Issues. GitHub Issues are a reasonable front end for issue reporting for casual and non-technical users and if you can interoperate with them you don't have to reinvent that basic CMS.
Every now and then I sketch ideas on the subject, but haven't yet gotten someone to pay me to build it. ;)
GitLab CI is really sweet, because you deploy your own runners (workers) wherever you want. The downside is that the control side is not a standalone CI app but a part of GitLab (or I'm unaware of something).
Drone is very promising, but last time I checked the documentation had some holes in it. The website is "coming soon" and IIRC it's been like this for quite a while.
I'm unaware of any CIs that are usable with local repos. It would be neat to just run a local command that spawns a worker somewhere (locally or through a remote coordinator) and runs the tests on whatever I have in the working tree, just like it happens with centralized repo+CI combos.
Too often I find myself doing `git commit -m 'Fix that stupid typo in previous commit'`.
We're working on a 'CI only' mode so you can easily use GitLab CI without the rest of GitLab. This is already possible but now it requires some configuration.
We centralized a decentralized communication system too. (eMail)
Decentralization just doesn't work too well in practice for whatever reason. Everyone is behind a NAT/firewall, everyone has low computing power, it's hard to regulate, etc. This all leads to a centralized solution being easier.
I think the current best thing we have is centralized but open source and encrypted, which gets an "okay"/10 from me.
> Decentralization just doesn't work too well in practice for whatever reason.
Because it's inconvenient. Centralisation is convenient, it gives a single discovery and synchronisation point. Decentralisation makes discovery much more difficult, and requires adding separate synchronisation mechanisms. It generates friction and cognitive overhead.
Even more so for "side-services". Sure your VCS is nominally decentralised[0], but what about bug reports? Contributions? Notes & docs? There were distributed bug tracker efforts back in the early 10s but… they didn't really work IME, they were not convenient or practical.
[0] though even without a single giant point of failure, most projects would still have a single canonical master copy; a real mesh/distributed contribution system is very rare (Linux's tree of integrators/forks is probably the closest?) and none of the current VCSes makes mesh/point-to-point collaboration really convenient
That's kind of missing the point, everyone's git local clones are still there, I can still work on the code. Git's decentralisation is meant to make sure work doesn't stop altogether when the remote is down.
The major feature of git is that it's distributed, not that it's decentralized.
Git's got two big features over SVN:
1. Automatic, private, per-user branching. Git's even nice enough to keep the private branches out of the main repository, and lets you pretend to be the authoritative repository without creating a branch if you really want to. This is what clone/push/pull actually does, and it's what a distributed VCS really brings to the table. It lets every dev pretend to be the project manager when they're writing their own code.
2. A much improved merging model. The graph model of git is just much better than the linear model of SVN.
The second one is what people thought they wanted when they started using git. The first one is what they didn't know they wanted before they started using git.
Git gets around the problem of "Well, if we do #1, how do we know which repository is authoritative then?" by saying, "We're not solving that problem. This is an exercise for the users that's easily solved by file permissions." So by refusing to solve that (rather hard) problem, the VCS becomes internally decentralized. That doesn't mean you can't or shouldn't centrally manage your repositories or have an authoritative repository. It's just that git itself doesn't care about knowing which repository is authoritative.
Bingo. People think it's hard to self host. It's not. They have an omnipackage that you install. You run updates. You enable the automatic backups. Problem solved!
The VCS is still decentralized (you can share code with your neighbour or across the world), it's the administration (tickets, PRs, etc) that isn't.
I think it'd be fairly straightforward for github or a competitor to store those things in git as plain markdown files, either alongside the main source code, or (as it does with GH Pages) in a separate branch (that has nothing in common with the master branch but it's still in the same repo).
No way to build my npm dependencies.
So the day GitHub really crashes or loses some files, the package dependency system is dead.
That is a very exciting scenario... as exciting as if Google forgot to renew google.com and had no legal right to get it back.
I find it useful because I don't use github directly, but some of my software's dependencies do. It's a nice heads up, because I probably won't be checking otherwise.
so is it interesting for you to know that a service you don't use daily is down for a couple hours? this random github outage happens basically every other month, and is fixed in like 30 minutes. every time it's on HN frontpage.