Hacker News
Why Hacker News should use a different Deterministic Algorithm, Not a Random One (bayesianwitch.com)
137 points by yummyfajitas on Dec 2, 2013 | 46 comments


  We assert that a randomized algorithm is unnecessary, and the 
  problem can be solved deterministically (though of course random 
  numbers will be used for sampling).
At a certain point this distinction becomes just philosophical wankery. Make no mistake, the proposed deterministic algorithm is a random one in disguise. And that's fine. Over large sample sizes, this randomness becomes sufficiently deterministic. Which is what the original proposal suggested, and it's also why sometimes a random variable can be elegant, too.


The article's philosophical wankery was way too pragmatic.

Random vs. deterministic is a spectrum anyway. Given enough information about the state of the machine doing the random picking/number generation, all computer algorithms are deterministic.

Heck, given enough information about the state of the universe, everything is theoretically predictable; some predictions are just really hard.

Hopefully this achieves the needed philosophical wankery quota.


That's sort of what I was talking about. Sorry if that wasn't clear. Anyway, I'm with you on that. :)


The article describes an algorithm that randomly shows articles based on a computed distribution, and calls that "deterministic". That doesn't sound deterministic to me - is the article misusing the term, or am I misunderstanding "deterministic"? (genuine question)


The algorithm deterministically computes how many people should be shown article A and B and randomly assigns people to those different groups.

I suppose any algorithm involving sampling will be random in this manner, but the point is that randomness is not being introduced to work around flaws in the decision procedure.
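A small sketch of that split (the function name, group sizes, and seed below are illustrative, not from the article): the group counts are computed deterministically from the target fractions, and randomness enters only when assigning users to groups.

```python
import random

def assign_users(users, exposure_fractions, seed=0):
    """Deterministically size each group from the target fractions,
    then randomly shuffle users into those groups."""
    users = list(users)
    counts = [int(len(users) * f) for f in exposure_fractions]
    counts[-1] = len(users) - sum(counts[:-1])   # remainder to the last group
    random.Random(seed).shuffle(users)           # the only random step
    groups, start = [], 0
    for c in counts:
        groups.append(users[start:start + c])
        start += c
    return groups

# 100 users: 90% are shown article A, 10% article B
a_group, b_group = assign_users(range(100), [0.9, 0.1])
```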


I'm a big fan of adding noise to systems to make them more predictable (it was the topic of my thesis: http://tlb.org/#papers). But I believe it's not needed here, because the arrival times of /newest voters are already sufficiently noisy. You only need random noise of +/- 0.5 to get predictable results, and voter arrival patterns give more than that already.


imo the best argument for randomized algorithms is that you may want to show more "underexposed" articles without seeing all of them.

The issue with deterministic algos is that you basically have two options:

- Treat new things as guilty until proven innocent and don't push them to the front until they get votes. This is the current approach here.

- Push new things without "enough" votes to the front. This makes the front page almost all new items, invites spam, and generally defeats the purpose of a sorting algorithm.

The problem here is that you're necessarily combining both the act of "learning" article quality and the act of exploiting it, which necessarily makes at least one of those tasks suboptimal. One advantage of randomized algorithms is that you can add a third option:

- Show just enough underexposed articles to gather evidence (votes) without polluting the front page too much.


The flaw in your post is that deterministic does not mean showing everyone the same thing. You can deterministically decide how much evidence to gather about underexposed articles as well, which is what I advocated.

Though to be fair, a random number generator would be used to determine which users to expose the article to (just not how many of them).

Even if you fail to come up with a deterministic algorithm, you'll probably come up with a better random one simply by coming up with a better probability distribution to draw from.


> The flaw in your post is that deterministic does not mean showing everyone the same thing. Though to be fair, a random number generator would be used to determine which users to expose the article to (just not how many of them).

Well, to be fair, what you are describing is a random experiment in the statistical sense, i.e. something modeled by a probability space. I see what you're saying, but I would say your usage of the word is the nonstandard one, not chilldream's.


>Show just enough underexposed articles to gather evidence (votes) without polluting the front page too much.

That's pretty much exactly what he is proposing. It doesn't require randomized sorting, though. For example, you pick some item from the new feed that isn't getting a lot of views and shove it onto the front page for a small percentage of users, to get it more views and therefore more votes to see how good the article actually is.

The trick is to trade off between discovering new good content and just promoting what has already proven itself. There is an optimal tradeoff somehow.


>There is an optimal tradeoff somehow.

You are basically talking about Bandit Algorithms, which are a family of techniques to solve the problem of trading off exploration ('discovering new good content') and exploitation ('just promoting what has already proven itself').

http://en.wikipedia.org/wiki/Multi-armed_bandit

There are several bayesian approaches to bandit problems, which calculate the uncertainty in a bayesian way, and then use this uncertainty to make the tradeoff between explore and exploit (e.g. http://en.wikipedia.org/wiki/Thompson_sampling).

So that's a well understood problem.

For my money, the 'correct' way to solve the HN problem is with a bandit arm for each article, which you use to decide how much to display low-vote articles.

I think this bandit process should be run on top of a bayesian model that features not just the estimated quality of the article from its votes, but also a prior which it gets from the submitter's 'quality' (which is updated depending on how their articles do - this would also serve as a brake on spammers).

Then you just need a heuristic to age out articles over time.

I'm sure PG knows enough ML/stats people to do this sort of thing properly; the fact that it's not done means admins are probably happy enough with the status quo.

It's also worth looking at this article by the author of XKCD on voting on reddit, for an easy discussion of simple bayesian voting: http://blog.reddit.com/2009/10/reddits-new-comment-sorting-s...
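For concreteness, here's a minimal Thompson sampling sketch of the bandit idea above, assuming each article is a Bernoulli arm (each shown user upvotes with some unknown probability) and a Beta(1, 1) prior; the article names and upvote rates are made up.

```python
import random

def thompson_pick(stats, rng):
    """Pick the article whose sampled posterior upvote rate is highest.
    stats maps article -> [upvotes, non-upvotes]; prior is Beta(1, 1)."""
    samples = {a: rng.betavariate(1 + up, 1 + down)
               for a, (up, down) in stats.items()}
    return max(samples, key=samples.get)

rng = random.Random(0)
true_rates = {"A": 0.30, "B": 0.05}        # hypothetical upvote probabilities
stats = {a: [0, 0] for a in true_rates}
for _ in range(2000):
    pick = thompson_pick(stats, rng)
    if rng.random() < true_rates[pick]:    # simulate a user's reaction
        stats[pick][0] += 1                # upvote
    else:
        stats[pick][1] += 1                # shown but no upvote
```

After enough rounds the better article absorbs most of the impressions, while the worse one still gets sampled occasionally (the exploration part).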


Don't use heuristics. Age out articles by penalizing them in your objective function. It's far cleaner.


Well, it depends.

If you already know exactly the 'age-out' behaviour that you want to capture, then maybe the best thing to do is bolt that on after doing your statistical inference.

This keeps the 'complicated' bit of your system that uses the statistical models simpler.

Also, you'll know exactly what the heuristic does, and be able to predictably hack changes into it in a single place, without affecting your inference code.

If you instead put the aging-out into your objective function, you'll have a more elegant implementation, but the 'age-out' behaviour might surprise you.

Well-specified models are great, but heuristic hacks that force the system to behave how you want also have a place. In general, I would start out by only using the statistical modelling in places where you can't describe exactly what you want with a heuristic.

What the best approach actually is depends, and I'd need to be implementing the system to say more.


>Show just enough underexposed articles to gather evidence (votes) without polluting the front page too much

The problem with this is that people are biased. Their negative reaction to spam is far greater than their positive reaction to a hidden gem.


[deleted]


The first really strong argument for some randomization that I remember learning went like this:

1. Suppose you have a game which is a perfect indicator of skill. Whenever two people play, the one with more skill wins.

2. We can now give everyone an ordinal number representing their skill in the game. But ordinal numbers are a little unsatisfying. What we really want are cardinal numbers (the answer to the question "how good am I at this, anyway?").

3. We can get much closer to that goal by adding noise to the game. Now, instead of the better player winning 100% of the time, the better player will win with some probability (> 50%) related to his skill. Instead of being doomed to label three people as first, second, and third, we can say "A wins against C 75% of the time, and B wins against C 72% of the time; A is better than B, but the gap is not so large".
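A quick simulation of that dithering idea (the skill numbers and noise range here are arbitrary): each player's score is skill plus uniform noise of +/- 0.5, so the better player wins with a probability related to the skill gap rather than always.

```python
import random

def play(skill_a, skill_b, rng):
    """One noisy game: the higher skill-plus-noise score wins."""
    return skill_a + rng.uniform(-0.5, 0.5) > skill_b + rng.uniform(-0.5, 0.5)

rng = random.Random(42)
trials = 10_000
wins = sum(play(1.0, 0.8, rng) for _ in range(trials))
win_rate = wins / trials   # analytically 0.68 for a 0.2 skill gap with this noise
```

The observed win rate recovers a real-valued measure of the skill gap, which a noiseless game could never reveal.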


Randomization isn't necessary though. Just give one player a deterministic advantage and see how high the handicap has to be for the other player to win. It's essentially the same thing.


The idea is the same; a dithered game can theoretically assign any real-valued probability of winning while a handicap is usually discrete.


In addition to what we have, I wish each user could "simply" hide topics they don't care about, favourite others (and maybe even comments for good measure), and also have a view of the topics they replied to.

Yes, echo chamber, but also people talking about topics they know or want to know about with other people who do. Having an arbitrary and often rather short window of opportunity to discuss something sometimes really sucks.


After the last front page thread about HN ranking (https://news.ycombinator.com/item?id=6799854) I built http://news-yc.appspot.com, which shows an unpenalized version of the front page (and tries to calculate the applied penalties). My plan, time permitting, is to start clustering submissions by topic (probably using k-means or similar) as well as building a recommendation engine around submission comments ("users who commented on this submission also commented on…").


So... subreddits?


I was thinking a few more menu items in the otherwise unchanged top row of HN - not making categories to file stuff into. There's ask and jobs, the rest could be up to each individual user, without having any effect on anyone else.

Reddit doesn't allow one to fold away particular threads, or favourite others - for me there is a difference between upvoting something and filing it away for later reference/response... oh, and maybe also a link showing just the new comments of all the discussions you favourited. While I'm dreaming, even let us give our own tags to discussion threads.


Maybe you could do something with tags.


Interestingly, there's already an implicit HN 'subreddit' (ask) and a couple of implicit tags 'show hn, ask hn, ask pg.' Even the who's hiring threads could be argued to be a "board" in anything but name.


If you're interested in these types of algorithms, I'm writing a book / running an online course. You can sign up and read past content here: http://bandits.mynaweb.com/

Next section is coming very soon. Just gotta finish the last visualisation. (Interesting that both bandit algorithms and Angular + d3 feature on HN today.)

I haven't had the time to read either article closely (at a conference) but the main issue I'd be concerned with is non-stationary rewards. What is interesting today is not interesting tomorrow. A simple hack to get around this is to decay the prior over time. Other issues to consider include:

- the amount of data available. Even HN might not have enough readers to get sufficient data on each arm. Here you might use a similarity measure between arms (articles), such as tf-idf so votes for one article can inform the ranking of other articles.

- the effect of ranking on votes. The highest-ranked articles probably get more votes just by virtue of being at the top. There is some work on submodular bandits that might be relevant here, but this research is very new.
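The decaying-prior hack mentioned above can be sketched like this (the decay factor and vote counts are arbitrary): discount the accumulated Beta evidence each period, so old votes count less and stale articles become uncertain, and hence explorable, again.

```python
def decay_posterior(upvotes, downvotes, decay=0.9):
    """Exponentially discount accumulated Beta evidence, so the
    posterior drifts back toward the prior as the data ages."""
    return upvotes * decay, downvotes * decay

a, b = 80.0, 20.0              # evidence accumulated from past votes
for _ in range(10):            # ten periods pass with no new votes
    a, b = decay_posterior(a, b)

# The estimated upvote rate barely moves, but with far less effective
# evidence the posterior is much wider, so the bandit explores again.
mean = (1 + a) / (2 + a + b)
```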


Just curious, do you have some links on handling non-stationary rewards?

I've cooked up a method of my own which I believe is theoretically sound, but I imagine there must be methods out there that don't involve solving PDEs. (Or at least simpler closed form approximations to the PDEs.)

http://www.chrisstucchio.com/blog/2013/time_varying_conversi...


I would say that flags on an article should surely count (how much, I really am not sure). At least I would make them count.

And I believe that comments would also count a fair amount. An up-vote is a strong signal, but a comment is an even stronger signal.

I would (without telling anybody) even take the length of comments into account when weighting. I would at least try this and do an A/B test to see if this improves the quality of articles on the front page.

I would also use clicks on the link as an indicator, but one would have to be careful, as this would push link-bait articles to the top. So one should not put too much weight on clicks.


Actually, the recent articles on HN's ranking algorithm suggest that PG considers comments to be evidence against an article being a quality article. Presumably that's why he penalizes articles with lots of comments.


He believes that the number of comments correlates with how controversial something is and wants to select against controversial topics. Anything with greater than 40 comments gets an automatic penalty.

I think a better way of doing it would be to get a large number of articles which are known to be controversial or uncontroversial and then use features from them (not just the number of comments) and try to develop a good predictor of how controversial something is. It would probably be a lot more effective than just arbitrary penalties for articles with 40 comments.

Though I kind of object to this stuff being done automatically at all. There aren't so many articles on HN that human moderation is impossible, as cool as automation is.


> Anything with greater than 40 comments gets an automatic penalty.

That's not quite right. Anything with >= 40 comments AND more comments than upvotes gets the controversy penalty. Both conditions need to be true. An article with tons of comments will have NO controversy penalty as long as there are more votes than comments.

For most articles, there are more votes than comments, so commenting will not harm the article's rank. (I wrote the recent article analyzing HN ranking.)
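Based on the analysis described above, the rule comes down to a two-condition check. A sketch (the penalty multiplier 0.4 is hypothetical; only the two conditions come from the analysis):

```python
def controversy_penalty(votes, comments, penalty=0.4):
    """Apply the controversy penalty only when BOTH conditions hold:
    at least 40 comments AND strictly more comments than votes."""
    if comments >= 40 and comments > votes:
        return penalty   # the article's score is multiplied by this factor
    return 1.0
```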


It appears he wants to correlate the number of comments with their contextual value, though.

A better indicator of how trollish a thread is and how much that's getting in the way of the conversation (because those are two different things - users may be willing to put up with vehement discussion in one instance but not another) would be the ratio of downvoted comments to comments. I can even see measuring the length of comments and the depth of a thread, as a deep thread with lots of tiny comments might be an indication of sniping.

I realize I'm probably being incredibly naive and that the whole thing is complex with multiple factors interrelating, but I don't see how the ratio of upvotes to comments tells you anything useful if it doesn't take into account that people may comment without upvoting, and people might upvote bad comments. I make certain to upvote the article now with every comment, because I'm aware that not doing so, regardless of the content of my comment, is an implicit downvote.

I think it's also a mistake to correlate "controversial" with "unwanted" but that might just be a matter of semantics.


Agreed. The stories sunk include ones like the Anandtech review of the Surface Pro, and that site is quite reputed for being neutral. An excessive number of comments combined with not too many upvotes can easily sink articles which are otherwise just news, and this contributes to the filter bubble (funnily, there's an anti-filter-bubble post right near this article on the front page).

Read pg's comments in this thread. https://news.ycombinator.com/item?id=6596038


No, no. More comments than votes counts against the article. Articles which are punished by that are submissions that are more controversial than interesting.


Well, that is quite an interesting signal; I hadn't thought of it. Looking at the ratio between upvotes and comments really is an interesting take on the matter.

I would assume that PG analyzed a lot of articles to find a good ratio that penalizes the kinds of controversial topics he does not want to gain too much traction.

What I do not understand, though, is why controversial topics are penalized in the first place. But that is just me, someone who loves to argue to find optimal solutions (or to see that there are really only two belief systems at work).


Apparently it penalizes articles if number of comments > number of upvotes, so I doubt much analysis was involved - it's unlikely that it would result in such a simple ratio.


Yes, that came out in a thread where people were complaining that many positive/neutral posts about Microsoft/Nokia were being downranked and disappearing from the front page, while negative stories didn't usually have that problem, leading to a chilling effect on the stories submitted and commented on. People seem to have got riled up over it because they thought it was being flagged, but PG said it was because of the flamewar detector that penalizes articles with lots of comments but fewer upvotes, at least in that particular case.

The flamewar detector does its intended job with things like the gender discrimination stories but it looks like it throws out the baby with the bathwater and contributes to the echo chamber effect.

Here's a couple of Samsung stories that met the same fate. http://hnweird.tumblr.com/

It's really hard to tell (unless you're pg or HN staff) whether a story disappeared because of the flamewar detector, or flagging, or the domain greylist, or something else. I guess this is intentionally so.


Is contention with HN's ranking algorithm common? I've perused the "new" page a number of times, and generally interesting submissions (i.e. what HN finds interesting) DO make it to the front page.


> I've perused the "new" page a number of times, and generally interesting submissions (i.e. what HN finds interesting) DO make it to the front page.

Some do and some don't. Certainly, many interesting articles never make it to the front page, due to lack of exposure rather than some intrinsic failing of the article.

It raises an interesting question of priorities: is it more important that every interesting article spends some time on the front page, or that no uninteresting article spends any time on the front page?

In other words, given that there are lots of interesting things in the world, does it matter if a particular interesting thing never makes it to the front page, so long as the front page is full of interesting things?


Very many bad articles make it to the front page (sensationalist titles and/or link bait about Bitcoin); however, the flag option is so much more powerful than the upvote that it counteracts false positives immediately.


I agree that introducing randomness would probably make it more fair to new posts. One idea I had, rather than an algorithm based on Bayesian prior probability estimates, would be one based on a Kalman filter (or similar) confidence interval from the up/down vote history.

The filter would give a range of possible scores for the post -- ones with few votes would have a very broad range, ones with lots of votes would have a tight range. Then you order based on a random number in that range (discounted based on its age).
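A rough sketch of that "order by a random number in the range" idea, using a crude binomial confidence interval in place of a full Kalman filter (the interval formula, names, and post counts are illustrative): posts with few views get wide intervals and can randomly jump up the ranking; well-measured posts get tight intervals and stay put.

```python
import math
import random

def sampled_score(upvotes, views, rng, z=1.96):
    """Draw a random score from an approximate confidence interval:
    wide for posts with few views, tight for well-measured ones."""
    n = max(views, 1)
    p = upvotes / n
    # normal-approximation half-width, plus a crude 1/n term so
    # brand-new posts always get a wide interval
    half_width = z * math.sqrt(p * (1 - p) / n + 1.0 / n)
    return rng.uniform(p - half_width, p + half_width)

rng = random.Random(1)
posts = {"new_post": (2, 5), "proven_post": (300, 1000)}
ranked = sorted(posts, key=lambda k: sampled_score(*posts[k], rng), reverse=True)
```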


One of the problems with this approach, that I'm not seeing anyone mention in a cursory scan of the existing comments, is that it would make the front page different for different people. Other sites work that way (notably Reddit), but AFAIK, the HN front page is the same for everyone who views it at the same time (possibly modulo showdead; does that affect posts?). This is a useful property to preserve, and one that this proposal explicitly breaks.


I find it amusing that they propose a Bayesian approach since PG contributed a fair deal to spam detection a decade or so ago.


Quasi-random number sequences [1] are "space-filling". A typical problem of sampling from the uniform distribution is that parts of the space never get sampled due to sampling bias. Quasi-random sequences ensure an evenish distribution. This can speed up many sampling-based algorithms (e.g. MCMC) considerably.

I too believe random numbers are usually the wrong choice.

[1] http://www.mathworks.co.uk/help/stats/generating-quasi-rando...
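For illustration, here is the base-2 van der Corput sequence, the classic 1-D quasi-random sequence (the linked docs cover the Halton and Sobol generalizations to higher dimensions):

```python
def van_der_corput(n, base=2):
    """n-th element of the base-b van der Corput sequence: reverse
    the base-b digits of n across the radix point."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q

points = [van_der_corput(i) for i in range(1, 9)]
# Each new point lands in the largest remaining gap:
# 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, 1/16
```

Unlike i.i.d. uniform draws, no region of [0, 1) is left unsampled for long, which is exactly the "space-filling" property.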


Another (perhaps unpopular) metric to use would be whether links were clicked before 1) upvoting, 2) clicking the comments link, 3) commenting


Probability matching is hardly what I would call a deterministic algorithm. What the author likely means is that we should use a probabilistic algorithm rather than a random one.

Nitpicking aside, I disagree with the usage of the original quote:

> “It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought.” E.T. Jaynes

What randomized methods are good at is reducing bias. In our case, there are several sources of bias:

- The population that upvotes links on the 'new' page is different than the one that upvotes links on the frontpage.

- The probability of upvotes (which I assume is either the ratio of upvotes to impressions or the ratio of upvotes to clicks) is itself determined by several factors other than article quality: article type and headline quality (and there is another interesting debate about quality vs. popularity, since sensationalized headlines are likely to get more clicks), to cite the main culprits, but there are many others.

I have three problems with this article:

1. The proposed strategy answers the question of variance but not bias, which is where the problem differs from a simple multi-armed bandit problem. Now we could build sophisticated models that try to take all these into account and get an unbiased estimator of quality, but it's likely to be a pursuit that's not worth the payoff.

2. Adding random noise to the rankings is likely to be much more efficient. The author here suggests computing the posterior probability of an article being better than another for each pair of articles, which seems rather impractical.

3. Most importantly: is overall quality of the frontpage really the ultimate optimization objective? This solution doesn't solve the problem of the N+1th article receiving much fewer votes than the Nth, where N is the number of articles shown on the frontpage, while being arguably equally deserving. I would argue that the problem of minimizing the probability of a good article getting few pageviews is actually the more important one, because we have more good articles than slots on the frontpage. A randomized algorithm would be much better equipped for this purpose.

Of course, the other thing is that for bragging purposes "We made the frontpage of HN" sounds much better than "We made the frontpage of HN for 60% of the users". But that's a different problem altogether...

Edit: writing this comment made me think of alternative solutions. An easy-to-implement one would be to add 5 or so slots at the bottom of the frontpage that would contain random picks from the 2nd or 3rd page in a rotating manner. That would give good articles which weren't fortunate enough to get good visibility a chance at floating up to the top, without all the implications of changing the ranking algorithm itself. In other words, add fuzziness to the threshold instead of the ranking.
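That rotating-slot idea can be sketched directly (the slot count and page size are from the comment above; the function name and seed-based rotation are one hypothetical way to implement it):

```python
import random

def front_page(ranked, rotation_seed, page_size=30, extra_slots=5):
    """The top page_size stories, plus a few rotating picks
    appended at the bottom from the 2nd and 3rd pages."""
    pool = ranked[page_size:3 * page_size]           # pages 2 and 3
    picks = random.Random(rotation_seed).sample(pool, extra_slots)
    return ranked[:page_size] + picks                # extra slots at the bottom

page = front_page(list(range(100)), rotation_seed=0)
```

Changing the seed each rotation period cycles different underexposed stories through the bottom slots without touching the ranking itself.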


I have little dispute with anything you've written.

However, one place I'd argue with you strongly is that handling beliefs about the world should be part of your model. For example, you mention different upvote probabilities on the new and front page - I'd simply model that with a new parameter beta (possibly related to alpha in some way).

It's far from clear to me how randomization such as what Dan Luu proposed actually addresses this at all.

I'll also discuss one other point: "The author here suggests computing the posterior probability of an article being better than another for each pair of articles, which seems rather impractical."

It probably is. However, suppose we went through the effort of computing posteriors, only to discover that we get stuck at the stage of computing pairwise probabilities. There is still a very good chance we can draw samples from the posteriors.

Even though our deterministic algorithm has failed, we still get a much better random algorithm out of the exercise. Given the posteriors on P(good article), we can draw samples from them and employ Thompson Sampling to sort the articles. (I.e., draw a sample from each posterior and then order the articles by increasing sampled posterior.)
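The sorting step described above is short in code (the per-article Beta parameters are made up; roughly upvotes+1 and non-upvotes+1):

```python
import random

def thompson_order(posteriors, rng):
    """Draw one sample from each article's Beta posterior and
    order the articles by sampled quality, best first."""
    draws = {a: rng.betavariate(alpha, beta)
             for a, (alpha, beta) in posteriors.items()}
    return sorted(draws, key=draws.get, reverse=True)

rng = random.Random(7)
posteriors = {"A": (51, 11), "B": (4, 3), "C": (11, 41)}  # hypothetical counts
order = thompson_order(posteriors, rng)
```

Articles with wide posteriors (few votes) occasionally sample high and get exposure; articles with tight posteriors sort near their true quality.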


bayesianwitch.com is those fucks that caused this: https://news.ycombinator.com/item?id=6780997 and in particular is run by this asshole: http://www.chrisstucchio.com/blog/2013/human_checksum.html



