In my experience, every single one of these results carries the `html` filetype as part of its URL. This is likely a consequence of the user-agent-based switcheroo technique they use to fool Google.
Just blanket block the lot with the following uBlock Origin filter:
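A cosmetic filter along these lines would do it. This is a hedged sketch, not the original poster's exact rule: the `div.g` selector assumes Google's current result-container markup, which changes periodically, and `:has()` is a uBlock Origin procedural cosmetic operator.

```
! Hide Google result blocks whose link points at a .it domain.
! div.g is an assumption about Google's result markup.
www.google.com##div.g:has(a[href*=".it/"])
```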
Blanket banning a whole TLD is stupid. It's one thing to block something obscure like .su, but .it? It's just too big, and arguably unwise if you're in Europe, where having to connect to Italian websites or services isn't a remote possibility.
yeah, right, unless you're american, why should you care about .com domains?
¯\_(ツ)_/¯
the problem is not .it domains; it's clearly stated in the linked post:
A large number of spam pages are indexed when searching by our product name.
It’s very similar to Japanese Keyword hack, but the difference is that our site is not hacked
so it's definitely an indexing issue: those .it domains are being indexed for the Japanese keyword hack for some reason. It's not that .it domains are particularly spammy per se.
Your "solution" would filter the vast minority of the abusers at the cost of banning an entire TLD, not much different than turning off the internet connection entirely.
Most of the spam on the internet comes from .com domains though, even more so because registering a .com domain is much easier than getting a .it one.
> Your "solution" would filter the vast minority of the abusers at the cost of banning an entire TLD, not much different than turning off the internet connection entirely.
Again, we’re talking about client-side filtering. The original comment about blocking .it domains was talking about a uBlock Origin rule. No one’s talking about blocking .it domains from the web.
Yes, as an American, I could block all .it domains on my end and my web experience likely wouldn’t change at all. I rarely, if ever, need to visit .it domains. So maybe I will.
This visually hides those HTML elements on Google Search, and only for me. There is no networking involved, so Italian TLDs are still reachable.
This is a personal solution to an extremely disruptive and long standing problem, and only affects those who choose to employ it. It's not hurting anyone.
Nah. I've been reading the docs on Spatialite (the spatial extension for SQLite) at http://www.gaia-gis.it/ the last couple days. It has both a "spam" TLD and a design from 1998.
In addition to this, if one runs unbound as the DNS resolver on their home router (and blocks DoH), one could add
local-zone: "it" always_nxdomain
to NXDOMAIN all requests for the .it TLD and protect non-browser devices too. I use this method to stay off sanctioned countries' TLDs and to remove the cheap/free spammy domains and TLDs that often contain more malware than anything useful.
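For reference, the directive above sits under unbound's `server:` clause; a minimal fragment might look like the following (the extra TLDs beyond .it are purely illustrative assumptions, not a recommendation):

```
server:
    # Return NXDOMAIN for every name under these TLDs.
    local-zone: "it" always_nxdomain
    # Hypothetical additions for other TLDs one considers spammy:
    local-zone: "top" always_nxdomain
    local-zone: "xyz" always_nxdomain
```

Reload the config (e.g. `unbound-control reload`) and every device using the router's resolver is covered, browser or not.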
Browsers and other programs can use the User-Agent[1] header to send along a bit of information about themselves with each request.
This and other information is then used to filter out various types of visitor.
In this case, requests claiming to be a Google Search crawler will receive a boring page with lots of text that it can index and use as search results.
Most browsers' devtools let you change your user-agent string, and a listing of the ones used by Google crawlers is publicly available. Not saying that you should, but you could check this out for yourself... entirely at your own risk of course :)
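The cloaking trick described above is easy to picture in code. This is a hedged sketch of the general technique, not the spammers' actual implementation: the token list and page contents are illustrative assumptions.

```python
# Sketch of user-agent "cloaking": serve an indexable page to crawlers,
# the real payload to everyone else. Tokens and strings are illustrative.
CRAWLER_TOKENS = ("Googlebot", "Googlebot-Image", "Googlebot-News")

def response_for(user_agent: str) -> str:
    """Pick which page to serve based on the claimed User-Agent header."""
    if any(token in user_agent for token in CRAWLER_TOKENS):
        # Crawlers get a plain, keyword-heavy page that ends up in the index.
        return "indexable keyword page"
    # Everyone else gets the actual spam/redirect.
    return "spam payload"

print(response_for("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(response_for("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0"))
```

Since the server only sees what the client *claims* to be, spoofing the header in devtools or with `curl -A` is enough to see the crawler-facing page, which is exactly the check suggested above.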
Google ain't going to fix itself ;)