These places on the "independent web" are sadly getting harder and harder to find via search engines.
It's an unfortunate reflection of the "mainstream web" that I feel compelled to remark on the use of simple HTML and CSS, only marred by a single Google Analytics script. There's not a lot of content here, but there's also not a lot of bloat either.
Not really related at all to my search, but I'm very glad to have found it. A beautiful-looking site indeed. This is the kind of stuff I remember doing back in the 90s: you'd search for something, then something more random would show up, but it would often be interesting to check out.
Loved this.. Landed on someone's site that promoted "Game design for websites"...
"OK, you've created your web site, registered it, then found that there are 341 other web sites in your search engine category. How do you distinguish your web site from all of the rest? How do you encourage visitors to remember and revisit your web site? Add a games area!"
Holy hell this is incredible. I've wanted exactly this to exist for years and now it does. Does Wiby still crawl the web? How can we give a project like this wings? It restores the true Internet.
It seems that you have to submit pages to Wiby, and the URL and all its subpages will then be indexed if they follow the rules. You can talk about it to others and you can donate; that’s probably all this project needs.
Rules are written on their submit page [1]. They prefer non-commercial, content-focused websites with lightweight HTML. The submit page also suggests they don't do link crawling, but only allow submission of individual URLs.
I love the idea, but it seems quite hacker-oriented (for lack of a better term). Most searches will return something UNIX- or hardware-related, even for a query like "table tennis".
That's true, but on the other hand, I find myself steadily finding them anyway (many on HN). This site is obviously not for mass consumption by the general public, so it's OK that it isn't highly ranked by Google. Google is for the average user. Sites like this are discovered by their target audience and spread largely by word of mouth.
> This site is obviously not for mass consumption by the general public, so it's OK that it isn't highly ranked by Google. Google is for the average user. Sites like this are discovered by their target audience and spread largely by word of mouth.
You find some really good in-depth content for niche areas on these sorts of sites (personal HTML websites, neocities.org, etc.). But if you forget the URL, it's usually very hard to find them again on Google, even using specific keywords. Too much SEO spam clogging up the results. I suspect not having Google Analytics scripts on the pages may also contribute.
One that I can remember off the top of my head was a blog with a pink-ish background with content on making quills and ink from plants, among other stuff. One post specifically had an easy ink recipe that used walnuts iirc. Google will show you tons of big SEO'd Wordpress blogs with ads/affiliate links but not the old school genuine site.
Another one was the blog of a day trader (a real one, not a guy trying to sell day trading courses). Impossible to find. site:wordpress.com helps filter out some of the SEO spam but I think it must have been hosted on a custom domain. It definitely looked like a standard Wordpress website when I read it a long time ago.
I think that reflects a change in the way the web is used.
Anecdotally, I clicked on the monkey.org link, poked around for thirty seconds, found
- an FAQ that didn't answer any questions I had
- a list of users (no particular reason to click on any, so I didn't)
- a list of domains (no particular reason to click on any, so I didn't)
... and no understanding of what "monkey.org" was or is. I've learned more reading the comments on this HN post than I did from the website.
This isn't to say monkey.org should change anything about what it's doing; it seems to be what it wants to be, and that's great. But a search engine isn't going to rank that highly, because it generally answers nobody's questions about anything.
Sometimes I try to use DuckDuckGo more for this, to exit Google's filter bubble. I have at times thought the DDG search results are worse, but maybe it's just that I've grown used to being served a tailor-made web, and the actual problem here is that I have grown sloppy and need to work on my search keywords from a "universal web" point of view...
Heh we need to bring back site rings. Those wacky footers where you would link up to a few other sites you liked. I found a lot of cool sites with those back in the late 90s.
Blog rolls were a thing too. Seems like tech blogs (mostly) all got eaten by Stack Overflow.
This is perhaps an unpopular opinion, but why do you think sites such as these deserve any sort of high place on a search engine? The goal of a search engine is to present results which are -useful- to a user. These "independent web" sites are a neat relic at best and completely useless to the everyday user at worst.
Compared to the tens of millions of competing -useful- sites, they are markedly less deserving of a spot. In fact, it would be unfair to award one to them purely because they are "independent".
If Google were a book shop, you'd only see 10 books, and usually the same ones. I guess that drive for more unique content comes from an older view of the net that was more eccentric.
A bookstore's a bad analogy. Maybe a fashion store.
If you're a heavy consumer of the net you probably hit the edges of your algorithmic fish bowl pretty regularly. I know I do and it's work to get out.
Actually, I'd so much like to see different things more often that I'd consider turning ads back on if such a service could be provided.
On the Amazon-CEO news yesterday, someone posted an interview with JB from 1997 [0]. In the first 2-3 minutes he explained exactly that (but for books). There are X million book titles, and they keep a tiny fraction of those in stock for 'next day delivery'. Kinda what Google does (or many search engines). Usually you find the 'most popular' answers on the first page. But when I have a tech problem, I have usually tried the 'first 5 pages' of solutions before I even try to duck it (DDG) for a solution (dear wikihow, yes I did restart my PC!).
I assume HN has the type of readers that already KNOW the first 10 pages of solutions, and thus need that tiny level of detail that one gets after digging deep (stackexchange, forums, etc.)
Question for anyone who knows: does Google's search engine give preference to websites that display Google ads, or is it agnostic? I assume Google's crawlers can 'read' whether a website is using X, Y, or Z advertiser's ads. (I haven't used Google for many, many years, and I usually block any Google search and ads related IPs and URLs, and ads in general.)
The most accurate (sacrificing probability) answer is "Google's ranking algorithm is proprietary; nobody can independently verify whether they up-rank sites that display Google ads by looking at the source code." Attempts to figure that out from the outside via black-box testing are complicated by the simple ubiquity of Google ads; there's a lot of confounding signal that would have to be tuned out (even if one observes correlation between top search results and Google ads, one has to control for the probability that any random site has Google ads on it, or that ads pay for the SEO and information collection that tends to result in a site that ranks highly in Google results, or even the probability that a site without ads, having found itself high in the rankings because it's delivering data that Google concludes is valuable to users, wouldn't respond to that observation by putting Google ads on their site to capture money otherwise left on the table).
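To make the base-rate point concrete, here's a rough Python sketch of the first control step only: compare how common Google ads are among top results with how common they are on a random sample of sites. The `Site` type, the function names, and all the data are made up for illustration; nothing here reflects how Google actually ranks anything, and a lift above 1 still wouldn't separate out the other confounders (ads funding SEO, sites adding ads after they rank well).

```python
from dataclasses import dataclass


@dataclass
class Site:
    url: str              # hypothetical example URL
    has_google_ads: bool  # whether an observer detected Google ads on the page


def ad_rate(sites):
    """Fraction of sites in the sample that carry Google ads."""
    if not sites:
        raise ValueError("empty sample")
    return sum(s.has_google_ads for s in sites) / len(sites)


def ad_lift(top_results, random_sample):
    """Ad prevalence among top results divided by the base rate.

    A value near 1.0 means top results carry ads about as often as
    any random site does, so the raw correlation tells us nothing.
    """
    base = ad_rate(random_sample)
    if base == 0:
        raise ValueError("base rate is zero; lift is undefined")
    return ad_rate(top_results) / base


# Invented observations: 3 of 4 top results show Google ads...
top_results = [
    Site("a.example", True),
    Site("b.example", True),
    Site("c.example", True),
    Site("d.example", False),
]
# ...but so do 6 of 8 randomly sampled sites.
random_sample = [
    Site("r1.example", True),
    Site("r2.example", True),
    Site("r3.example", True),
    Site("r4.example", True),
    Site("r5.example", True),
    Site("r6.example", True),
    Site("r7.example", False),
    Site("r8.example", False),
]

print(ad_rate(top_results))                 # share of top results with ads
print(ad_lift(top_results, random_sample))  # compare against the base rate
```

In this invented sample, "75% of top results have Google ads" sounds damning until you notice the base rate is the same, which is exactly the ubiquity problem.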
The Pareto principle suggests that the hypothetical Google bookstore, being maximally useful to the querying individual, should be showing some ten books for any given query if those books successfully answer 80% of the questions.
A randomizing carousel is a different tool, which Google is not. Nor, to my knowledge, is DDG or Bing... They show a different set of ten books, but the same optimizing goals apply for their use cases.
Is anyone doing something like a search engine that intentionally surfaces rarer data (i.e. makes the tradeoff intentionally of "less likely to satisfy your query, but seen by fewer people")? I'm not aware of one.
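For what it's worth, the tradeoff itself is easy to sketch. The Python below is a toy, not a description of any existing engine: `relevance`, `views`, and `rarity_weight` are all hypothetical inputs, and subtracting a log-scaled popularity penalty is just one arbitrary way to express "less likely to satisfy your query, but seen by fewer people".

```python
import math


def rarity_score(relevance, views, rarity_weight):
    """Relevance minus a penalty that grows with how widely seen a page is.

    rarity_weight = 0 gives plain relevance ranking; larger values
    push heavily viewed pages further down.
    """
    return relevance - rarity_weight * math.log10(1 + views)


def rank(results, rarity_weight=0.0):
    """Sort result dicts (keys: url, relevance, views) by adjusted score."""
    return sorted(
        results,
        key=lambda r: rarity_score(r["relevance"], r["views"], rarity_weight),
        reverse=True,
    )


# Invented results for a single query.
results = [
    {"url": "popular.example", "relevance": 0.9, "views": 999_999},
    {"url": "obscure.example", "relevance": 0.6, "views": 99},
]

conventional = [r["url"] for r in rank(results)]
rarity_first = [r["url"] for r in rank(results, rarity_weight=0.5)]
print(conventional)
print(rarity_first)
```

With the weight at zero the popular page wins on relevance alone; with the weight at 0.5 the obscure page comes first even though it is a worse match, which is the deliberate sacrifice being asked about.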
What search terms should a site like this show up for? I poked around and all I found was short generic bios and links to social media and photo sites. I don't understand what searches should return a site like this, other than the person's name and some other identifying info (to disambiguate from the other 10 million Pauls).
One reason: they are often genuine, unlike most of the SEO-stuffed web with its meaningless keywords, trying to sell you something, show you a bunch of ads, get you subscribed, spam you, etc.