
In my experience, every single one of these results carries the `.html` filetype as part of its URL. This is likely a consequence of the user-agent-based switcheroo technique they use to fool Google.

Just blanket block the lot with the following uBlock Origin filter:

    google.*##.g:has(a[href*=".it"][href$=".html"])
Google ain't going to fix itself ;)


Blanket banning a whole TLD is stupid. Blocking some obscure TLD like ".su" is one thing, but .it? It's just too big, and arguably unwise if you are in Europe, where having to connect to Italian websites or services isn't a remote possibility.


This merely hides Google search results in my browser.

No network connections are blocked...


Yes, you hid all Italian Google search results - arguably not an ideal solution.


I'm sure there are plenty of non-spam html pages based in Italy too


Considering the crowd here, that trade-off seemed too obvious to mention.


cool!

now s/\.it/every TLD/ and you solved domain spam forever.

/s

You might not know that 99.99% of .it URLs ending in .html are completely legit, including some official government ones.


Since uBlock is run on the client, unless you’re Italian or interested in Italian sites it doesn’t really seem like much of an issue.

I could block all .it sites on my network and I’d likely never even notice.


yeah, right, unless you're american, why should you care about .com domains?

  ¯\_(ツ)_/¯

The problem is not .it domains; it's clearly stated in the linked post:

> A large number of spam pages are indexed when searching by our product name. It’s very similar to Japanese Keyword hack, but the difference is that our site is not hacked

So it's definitely an indexing issue: those .it domains are being indexed for the Japanese keyword hack for some reason. It's not that .it domains are particularly spammy per se.

Your "solution" would filter the vast minority of the abusers at the cost of banning an entire TLD, not much different than turning off the internet connection entirely.

Most of the spam on the internet comes from .com domains anyway, even more so because registering a .com domain is much easier than getting a .it one.

Are you willing to ban .com too?


> Your "solution" would filter the vast minority of the abusers at the cost of banning an entire TLD, not much different than turning off the internet connection entirely.

Again, we’re talking about client-side filtering. The original comment about blocking .it domains was talking about a uBlock Origin rule. No one’s talking about blocking .it domains from the web.

Yes, as an American, I could block all .it domains on my end and my web experience likely wouldn’t change at all. I rarely, if ever, need to visit .it domains. So maybe I will.


This visually hides HTML elements on Google Search, and only for me. There is no networking involved, so Italian TLDs are still reachable.

This is a personal solution to an extremely disruptive and long-standing problem, and it only affects those who choose to employ it. It's not hurting anyone.


.com implies spam - it's commercial, so let's go ahead. If it's not .org I'm not playing. /s


And yet, here you are, and not on ycombinator.org? ;-)


Nah. I've been reading the docs on Spatialite (the spatial extension for SQLite) at http://www.gaia-gis.it/ the last couple days. It has both a "spam" TLD and a design from 1998.


But not many of the official government ones.


"Official government" in Italy also means cities, towns, hospitals, universities, public schools, etc.

There are 8 thousand towns in Italy, each with its own .it website.


In addition to this, if one runs unbound as the DNS resolver on their home router and blocks DoH, then one can add

    local-zone: "it" always_nxdomain
to NXDOMAIN all requests for the .it TLD and protect non-browser devices. I use this method to stay off sanctioned countries' TLDs and to remove the cheap/free spammy domains and TLDs that often contain more malware than anything useful.
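For context, here's a minimal sketch of where that rule lives in unbound.conf (the second TLD is just an illustrative example of repeating the pattern):

    server:
        # return NXDOMAIN for every name under .it
        local-zone: "it" always_nxdomain
        # the same pattern works for any other TLD
        local-zone: "su" always_nxdomain

After a `unbound-control reload`, queries for any name under those TLDs will fail to resolve.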


What’s this useragent switcheroo?


Browsers and other programs can use the User-Agent[1] header to send along a bit of information about themselves with each request.

This and other information is then used to filter out various types of visitor.

In this case, requests claiming to be a Google Search crawler will receive a boring page with lots of text that it can index and use as search results.

Most browsers' devtools let you change your user-agent string, and a listing of the ones used by Google crawlers is publicly available. Not saying that you should, but you could check this out for yourself... entirely at your own risk of course :)

https://en.wikipedia.org/wiki/User_agent

https://developers.google.com/search/docs/advanced/crawling/...
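Mechanically, the check is just an HTTP request with a spoofed User-Agent header. A minimal standard-library Python sketch, assuming a placeholder target URL and one of Google's publicly documented crawler user-agent strings:

```python
import urllib.request

# One of Google's publicly documented Googlebot user-agent strings
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

# Placeholder URL: swap in the page you suspect of cloaking
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": GOOGLEBOT_UA},
)

# urllib normalizes header names to "Capitalized" form internally
print(req.get_header("User-agent"))
```

To actually test a site, fetch the same URL once with this request and once with your browser's normal user agent; a cloaking site serves different content to each.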


Or use Brave Search, which, honestly, in my experience is much better.



