
Seeing this has made me realize that I have one HUGE bug with native Postgres full text search.

On my blog, I have a lot of articles about the game series Borderlands. If you type "Borderlands" into the search box, it will find them, but if you type "border lands", it won't. Same with "starcraft" and "star craft", etc.

It looks like I will have to implement trigrams on top of full text search to fix this:

https://www.postgresql.org/docs/current/static/pgtrgm.html
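To see why trigrams would catch "border lands", here is a rough Python sketch of the idea behind pg_trgm's similarity score. It is only an approximation for illustration, not the extension's actual code: the function names `trigrams` and `similarity` are made up here, and the word splitting is assumed to be ASCII-only. The padding (two spaces before each word, one after) and the shared-over-total ratio follow what the pg_trgm docs describe.

```python
import re


def trigrams(text: str) -> set[str]:
    """Approximate pg_trgm: lowercase, split into alphanumeric words,
    pad each word with two leading spaces and one trailing space,
    then collect every 3-character window."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by total distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(similarity("Borderlands", "border lands"))
    print(similarity("starcraft", "star craft"))
```

Under these assumptions, "Borderlands" and "border lands" share 10 of 15 distinct trigrams, comfortably above pg_trgm's default similarity threshold of 0.3, even though the two queries have no whole word in common.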



Take a look at [0]. I think you can solve your problem with a custom text search dictionary or possibly a thesaurus dictionary. They can substitute or canonicalize documents or search terms and look pretty easy to set up.

[0] https://www.postgresql.org/docs/11/static/textsearch-diction...


The downside with dictionaries is that they need to be continually updated depending on the underlying content. Trigrams seem to be more versatile and hands-off, and they can work with misspelled search terms.


Elasticsearch won’t do that out of the box either. How would you solve this in the general case without causing false positives?


I'm not sure. For now, I'm more concerned about false negatives than (too many) false positives.



