I’m working on this as well, but with a different set of content types. I want my personal search to primarily support second-brain functionality for creative work, so I’m focusing on indexing:
- podcasts
- self-authored internet comments (Reddit, Hacker News)
- books
- articles
- code
- music
- lectures
Started it at http://rememex.org