This is the failure of Google. I stopped using Google about six months ago and switched to DuckDuckGo, and one of the main reasons I stopped was that the quality had gotten so low.
Hell, Google's quality is so low that Bing is actually running ads right now featuring blind taste tests in which people preferred Bing. Of course that isn't scientific at all, but my point is: no big scandal has erupted over how wrong it is. It's entirely plausible for Bing to do this, because everyone realizes Google has sunk to the point where Microsoft can plausibly compete with it!
PageRank was really cutting edge, but that was ten years ago, and it is still their primary mechanism. It's been gamed, but they seem uninterested in moving to more sophisticated mechanisms (they use them, but the influence of the better methods seems to be too low). Meanwhile, they've used their bully pulpit to push the web toward conserving page juice, which has backfired: actual links to authoritative and useful sites now rank lower than spam links, making the system easier to game. (When Wikipedia is putting nofollow on relevant outbound links to the very pages it is quoting or citing, things are fundamentally broken. No site on the web has a more favored ranking position than Wikipedia, not to mention hand curation of pages. You can't even correct errors there without having them reverted by some know-nothing whose sole accomplishment is rising in the ranks of Wikipedia editors, so it's not like they need this to prevent spam.)
This means that when the site Google unquestionably considers the most authoritative cites a page it considers authoritative, Google gives that link no credibility. Meanwhile, let me build a web of sites generating text that passes grammar parsers as "good English" but exists only to spam keywords and link to one another, and I can rank for those terms up near Wikipedia. (This is essentially what TechCrunch is doing, except they have humans write the low-quality text instead of a computer.)
I disagree with a number of your points. I've noticed a huge reduction in the amount of spammy content I see in google results (over the last year, maybe), to the point that I actually make "productName reviews" searches again.
Your focus on PageRank is at least somewhat off base, considering that it's only one factor in ranking, as someone notes below, and it's a signal that pretty much all search engines use as well ("The Bing ranking algorithm analyzes many factors, including but not limited to: ... the number, relevance, and authoritative quality of websites that link to your webpages").
There is something to be said for nofollow links being a symptom of something broken, but, OTOH, PageRank is still a good indicator of what people out on the web find to be useful and relevant content. It lets you find popular content, cluster it by subject, and so on: essentially crowd-sourcing (a portion of) relevancy via something people do anyway. Gaming was inevitable, and nofollow is really more of a way to disincentivize spammers. The fact that with nofollow you still get spammers (chasing human eyeballs instead of crawlers) demonstrates that the motivation is always there. If a search engine trusts Wikipedia's outbound links, it doesn't have to obey nofollow in any case, but you still have the situation where everyone will have their own favorite "impartial" external links to add, not to mention people with a vested interest in the subject.
The possibility you forget in your "Microsoft can plausibly compete with [Google]" point (leaving aside the fact that most people are just ignoring it) is that Bing has improved, and that has nothing to do with Google breaking anything.
DDG works great for general searches, but for anything technical I can plan on doing the DDG search and then following up with "g!". I have DDG set as my default search engine and am getting tired of the double searches, but I want DDG to work better.
PageRank is no longer Google's primary mechanism. In fact, there is no single primary mechanism. Google has stated that the ranking you see when you search is the product of over 200 contributing factors.
I think the 'personalization' they have been doing is the primary culprit.
When I search for technical things or things related to the news, I feel like I can actually see the 'tint'. It's as if Google adds certain keywords to my searches, or reduces them to a much smaller subset.
When I search for something, sometimes it's because I want to find something I saw a while ago, but sometimes it's to get a new perspective. When you search for something and see the same opinion across the first ten results, you can tell how skewed it is.
Now I have to manually add 'criticism' or 'failure' to certain searches, or 'success' even. It's just weird.
In the Bing It On Challenge from Bing, I picked Google 5 out of 5. Then I posted that on Facebook/Google+, and more and more friends started posting saying they felt the exact same way AND that they too were getting better results from Google than Bing.
Yes, it is not a scientific study, but Bing's marketing isn't entirely scientific about what "nearly 2 to 1" means either. "Nearly 2" can be 1.5. Hell, ceil(1.1) is 2.
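For what it's worth, the rounding point checks out in any language's standard math library; a trivial Python check (this says nothing about Bing's actual data, which they haven't published):

```python
import math

# Any true ratio of 1.5 or more can honestly be rounded to 2,
# and a ceiling turns even 1.1 into 2.
nearly_two = round(1.5)       # rounds to 2
ceiling_two = math.ceil(1.1)  # also 2
print(nearly_two, ceiling_two)
```

So "nearly 2 to 1" is compatible with anything from a modest edge to a blowout, depending on how the marketing copy rounds.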
Google's search results have been improving over time; for my technical searches (specifically related to programming), no other search engine even comes close to getting me the results I want. I do have Google's Web History turned on, which most likely lets Google tailor its results to what I'm most likely after.
This is the failure of Google. I stopped using Google about six months ago and switched to DuckDuckGo, and one of the main reasons I stopped was that the quality had gotten so low.
There is a difference between quality and freshness. I agree some SERPs on DDG look higher quality, but when you dig down into the results you find out why: they are all safe choices. They could be pages from 2005, or pages that were once authoritative but now lack topicality and news.
Hell, Google's quality is so low that Bing is actually running ads right now featuring blind taste tests in which people preferred Bing
This is more marketing than research.
PageRank was really cutting edge, but that was ten years ago, and it is still their primary mechanism
It is one of over 200 factors. Also, there is internal PageRank and the world-visible PageRank. Besides, Google has been doing a lot with author rank and mentions.
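For readers who haven't seen it, classic PageRank (the published algorithm, not whatever Google actually runs today) is just a damped power iteration over the link graph. A toy sketch:

```python
# Toy PageRank: damped power iteration over a tiny link graph.
# This follows the classic Brin/Page formulation; Google's
# production system blends it with hundreds of other signals.

def pagerank(links, damping=0.85, iters=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:  # dangling page: spread its rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:         # split this page's rank among its outlinks
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" ends up highest: it is linked from both "a" and "b".
```

Note the damping factor (0.85 in the original paper): it is what makes hoarded "page juice" leak away rather than accumulate forever.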
It's been gamed, but they seem uninterested in moving to more sophisticated mechanisms (they use them, but the influence of the better methods seems to be too low)
Latent semantic indexing, query-deserves-diversity, query-deserves-freshness, detecting spam by following the links in spam emails, etc. There is no shortage of sophisticated methods.
meanwhile, they've used their bully pulpit to push the web toward conserving page juice, which has backfired: actual links to authoritative and useful sites now rank lower than spam links, making the system easier to game
PageRank hoarding is an old and crummy idea. Google's webmaster guidelines even say hoarding PageRank is a bad idea, as it reeks of manipulation. There is also a decay factor.
Even when I joined the company in 2000, Google was doing more sophisticated link computation than you would observe from the classic PageRank papers. If you believe that Google stopped innovating in link analysis, that's a flawed assumption.
Spam links are hell-banned by manual and algorithmic review.
(When Wikipedia is putting nofollow on relevant outbound links to the very pages it is quoting or citing, things are fundamentally broken-
Spammers still exist. Spammers try to game healthy systems. Search engines are not broken just because Wikipedia tries to combat spammers... And links are not all there is: Googlebot still follows nofollowed Twitter links, and mentions (words without links) still count as a popularity vote for the things or people they name.
no site on the web has a more favored ranking position than Wikipedia. Not to mention hand curation of pages. You can't even correct errors there without having them reverted by some know-nothing whose sole accomplishment is rising in the ranks of Wikipedia editors, so it's not like they need this to prevent spam.)
It is about deterring the addition of spammy external sources. If Wikipedia links were dofollow, many more sources would be added, not because they are good sources but because adding them would pay off as marketing. Wikipedia is a shining example of a site that gets lots of inbound links and mentions, with great content and top-notch internal linking.
This means that when the site Google unquestionably considers the most authoritative cites a page it considers authoritative, Google gives that link no credibility.
If all Wikipedia cites were worthless, no one would gain an unfair advantage by gaming Wikipedia. But Wikipedia cites are not worthless. If you Google company A and company B and only company A appears on Wikipedia, what do you think of the quality difference between them? If company A has 10,000 search results and company B has 1,000, what does that say about the reach (social proof) of company A? Also, beyond the mention algorithm, check out entity detection: http://www.seobythesea.com/2012/01/named-entity-detection-in... Finally: not only who links to you counts toward your quality/popularity, but also who you link to. Pages get rewarded for linking to quality resources.
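The "who you link to also counts" idea is the core of Kleinberg's HITS algorithm (hubs and authorities). Whether Google uses anything like it is speculation, but a minimal sketch shows the mechanic: a page earns hub score by linking out to good authorities, and authority score by being linked from good hubs.

```python
# Minimal HITS (hubs & authorities) sketch, after Kleinberg.
# Purely illustrative -- not a claim about Google's ranking.

def hits(links, iters=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {q for outs in links.values() for q in outs}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        # Authority score: sum of hub scores of pages linking in.
        auth = {p: sum(hub[q] for q in links if p in links.get(q, []))
                for p in pages}
        norm = sum(auth.values()) or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of authority scores of pages linked out to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        norm = sum(hub.values()) or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth

graph = {"wiki": ["src1", "src2"], "blog": ["src1"], "src1": [], "src2": []}
hub, auth = hits(graph)
# "wiki" is the strongest hub (it links to both sources);
# "src1" is the top authority (linked from both hubs).
```

The page names here are made up for the example; the point is only that outbound links to good resources can raise a page's own standing.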
But let me create a web of sites that construct text that passes grammar parsers as "good English" but whose purpose is to spam keywords and link to each other, and I can rank for those terms up close to Wikipedia. (This is essentially what TechCrunch is doing, only they have humans write the low-quality text instead of a computer.)
Reading level and quality of journalism on TechCrunch aside: the article talked of once-great blogs that cling to their earned reputation while producing spammy content. Firstly, they are playing with fire. Google or their users could say "enough is enough," and they lose their reputation or get hit with Panda; then they are just another spammy, low-quality blog ranking somewhere around #1024. Secondly, blogs like TechCrunch have a big company and money behind them. They organize offline events, get mentioned in newspapers, and are often the starting point of an online discussion about a start-up or SF drama. They employ well-known writers. All things being equal, it would be bad for Google to rank TechCrunch below a single-author amateur blog that started last month. Even absent high quality, TechCrunch is relevant and popular.
It's broken, and Google broke it.
It is how it is. Use it to your advantage. Keep adding new fresh content and enjoy your pageviews. I know Bing isn't sending them my way...