ryeguy_24's comments

How many proprietary use cases truly need pre-training or even fine-tuning, as opposed to a RAG approach? And at what point does it make sense to pre-train or fine-tune? Curious.

I'm thinking stuff like this:

https://denverite.com/2026/03/12/ai-recycling-facility-comme...

You could take a model like the one referenced in the article, retool it with Forge for, oh I don't know, compost, and use it to flag batches that contain too much paper, for instance.

These kinds of applications would work across industries, basically anywhere you have a documented process and can stand to have automated oversight.


You can fine-tune small specialized models that are very fast and cheap to run, e.g. to react to logs, handle tool use, or encode domain knowledge, possibly removing network LLM calls altogether.

RAG basically gives the LLM a bunch of documents to search through for the answer. What it doesn't do is make the model itself any better. Pre-training and fine-tuning improve the LLM's ability to reason about your task.
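To make the distinction concrete, here is a minimal sketch of the retrieval half of RAG. The "embedding" is just a toy bag-of-words counter standing in for a real embedding model, and the documents and query are made up for illustration; the point is that the model never changes, only the context stuffed into the prompt does.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical knowledge base and query.
docs = [
    "compost batches are rejected when paper content exceeds ten percent",
    "the recycling line sorts plastics by resin code",
]
query = "how much paper is allowed in a compost batch"

# Retrieve the most similar document and prepend it to the prompt.
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
prompt = f"Context: {best}\n\nQuestion: {query}"
```

The LLM's weights are untouched; it just answers with extra context. Fine-tuning, by contrast, changes the weights so the model itself handles the task better.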

RAG is dead

Using tools and skills to retrieve data or files is anything but dead.

I think people just mean "using vector databases to enable RAG".

Even that doesn't make sense. Why would you not build a vector database to complement your RAG engine?

For coding use cases you may want a way to search for symbols themselves or do a plain text exact match for the name of a symbol to find the relevant documents to include. There is more to searching than building a basic similarity search.
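As a sketch of what a non-vector search path can look like: an exact, whole-word symbol match over source files needs nothing beyond a regex. The in-memory "repo" and file names below are hypothetical, purely for illustration.

```python
import re

# Hypothetical in-memory "repo": path -> source text.
repo = {
    "billing.py": "def compute_invoice(total):\n    return total * 1.2\n",
    "report.py": "from billing import compute_invoice\n",
}

def find_symbol(symbol):
    # Exact whole-word symbol search; no embeddings involved.
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    return [path for path, src in repo.items() if pattern.search(src)]
```

Here `find_symbol("compute_invoice")` hits both files, while a partial token like `"invoice"` hits neither, exactly the precision a similarity search cannot guarantee.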

Sorry, but who mentioned coding as a use case? My comment was general, not specific to coding, and I don't understand where you got the idea that I'm arguing similarity search is a substitute for symbol search, or that symbol search is inferior to similarity search. Please don't put words in my mouth. My question was genuine and made no presumptions.

Even with the coding use-case you would still likely want to build a similarity search engine because searching through plain symbols isn't enough to build a contextual understanding of higher-level concepts in the code.


I mentioned coding as a use case in my comment you replied to. You were asking for an example for when one wouldn't use vector search and I provided one. I did not say similarity search would be a substitute. I said that for the coding case you do not need it.

>you would still likely want to build a similarity search engine

In practice, tools like Claude Code, Codex, Gemini, Kimi Code, etc. are getting away with searching for code via grep/find and understanding code by loading a sufficient amount of it into the context window. That is sufficient to understand higher-level concepts in the code. Maintaining a vector database on top of this is not free and adds extra complexity.


In your reply you said "There is more to searching than building a basic similarity search", which assumed and implied all kinds of things, and which was completely unnecessary.

> In practice tools like Claude Code, Codex, Gemini, Kimi Code, etc are getting away with searching for code with grep / find and understanding code by loading a sufficient amount of code into the context window

"Getting away" is the formulation I would use as well. "Sufficient amount", OTOH, is arguable and subjective. What suffices in one usage example does not in another, so the perception of how sufficient it really is depends on the usage patterns, e.g. the type and size of the codebases and the actual queries asked.

The crux of the problem is what amount and what parts of the codebase do you want to load into the context while not blowing up the context and while still maintaining the capability of the model to be able to reason about the codebase correctly.

And I find it hard to argue that building the vector database would not help exactly in that problem.
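The context-budget problem described above can be sketched as a greedy packing step: given scored chunks from any retriever (vector, grep, or otherwise), take the highest-scoring ones until a rough token budget is spent. The whitespace-based token estimate and the example chunks are simplifying assumptions.

```python
def pack_context(chunks, budget_tokens):
    # chunks: list of (score, text) pairs from any retriever.
    # Greedily pack the highest-scoring chunks under a crude token budget.
    packed, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())  # crude whitespace-based token estimate
        if used + cost <= budget_tokens:
            packed.append(text)
            used += cost
    return packed

# Toy example: the low-scoring chunk is dropped once the budget runs out.
packed = pack_context([(0.9, "a b c"), (0.8, "i j"), (0.5, "d e f g h")],
                      budget_tokens=6)
```

Whether the scores come from a vector index or from grep hit counts, the budgeting logic is the same; the vector database's contribution is only in producing better scores.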


In what, X's hype circles? Embeddings are used in production constantly.

And yet your blog says you think NFTs are alive. Curious.

But seriously, RAG/retrieval is thriving. It'll be part of the mix alongside long context, reranking, and tool-based context assembly for the foreseeable future.


I don't think RAG is dead, and I don't think NFTs have any use and think that they are completely dead.

But the OP's blog is more about ZK than about NFTs, and crypto is the only place funding work on ZK. It's kind of a devil's bargain, but I've taken crypto money to work on privacy preserving tech before and would again.


The issue I had with RAG when I tried building our own internal chat/knowledge bot was pulling in the relevant knowledge before sending it to the LLM. Domain questions like "What is Cat Block B?" are common and, for a human within our org, provide all the context needed to answer. But vectorizing that and then finding matching knowledge produced so many false positives. I tried to circumvent that by adding custom weighting based on keywords and source (Confluence, Teams, Email), but it just seemed unreliable. This was probably a year ago and, admittedly, I was diving in head first without truly understanding RAG end to end.
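For the curious, the kind of custom weighting described above might look something like this sketch: blend the vector similarity with a keyword-hit bonus and a per-source multiplier. The weights, source names, and example texts are all hypothetical, chosen only to illustrate the shape of the hack and why it is hard to tune reliably.

```python
def hybrid_score(vector_sim, text, source, query_terms):
    # Hypothetical blend: cosine similarity plus a keyword bonus,
    # scaled by a hand-picked per-source trust weight.
    keyword_hits = sum(term.lower() in text.lower() for term in query_terms)
    source_weight = {"confluence": 1.2, "teams": 1.0, "email": 0.8}.get(source, 1.0)
    return (vector_sim + 0.1 * keyword_hits) * source_weight
```

Every constant here (0.1, 1.2, 0.8) is a guess that has to be re-tuned whenever the corpus shifts, which is exactly the unreliability described.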

Being able to just train a model on all of our domain knowledge would, I imagine, produce much better results.


I have no interest in anything crypto, but they are making a proposal about NFTs tied to AI (LLMs and verifiable machine learning) so they can make ownership decisions.

So it'd be alive in the making decisions sense, not in a "the technology is thriving" sense.


Not OP, but...

> Of course you would have to set a temperature of 0 to prevent abuse from the operator, and also assume that an operator has access to the pre-prompt

Doesn't the fact that LLMs are still non-deterministic at temperature 0 render all of this moot? And why was I compelled to read a random blog post on the unsolved issue of validating natural language? It's SQL injection except without a predetermined syntax to validate against, and thus an NP problem we've yet to solve.


BTW this is what an ad hominem is, when you can’t find a flaw in an argument you find something else unrelated to attack

Just after that extremely gentle poke about a grift that died many years ago, you'll be pleased to see that I address the very silly claim about RAG in a straightforward, ad rem way.

Wait, what do NFTs have to do with RAG?

Nothing, I think they're just pointing out a seeming lack of awareness of what really is or isn't dead.

They were doing an ad hominem; that's what it's called.

[flagged]


Have you read the post?

Is it??

I still use AirPods for listening, but if I'm ever taking a call, I always use EarPods (USB-C). The microphone quality is multiple times better, and that's important to me, especially for work. It only took hearing other people on AirPods a few times for my impression to be tainted. It just seems unprofessional now because of how bad it sounds.

Mostly the same story. Tinkered for hours with Windows 3.1 floppy disks. Reinstalled OSes all the time because I'd break stuff or just want a fresh slate. I loved pushing the boundaries. In my 30s I slowed down with the tinkering because of life (kids, work). I thought I'd lost the ability to tinker. But recently, at 42, I bought a MacBook for the sole purpose of tinkering on the couch at the end of the day (basically, after being on a computer the whole day, I didn't want to be in the office anymore). And slowly, it's coming back. I'm playing with new things, learning about neural networks, learning about software-defined radio, installing tons of random libraries and tools to test things out. It's coming back. Keep pushing on it and hopefully it returns for you too!

Has anyone tried to use a laptop at night? It’s pretty hard without lit keys. Maybe this one has some super reflective letters so that the screen lights them up.


Skilled computer usage includes learning to type without looking at the keyboard


Just turn on a light?


I have light fixtures at home.


What expense app are you building? I really want an app that helps me categorize transactions for budgeting purposes. Any recommendations?


My advice is a little different. It's: make your boss's life insanely easy. Similar in nature to the post, but with a slightly different optimization function. Don't over-communicate; communicate just the right amount. Anticipate questions. Don't create any friction for them and be really helpful. Some of my people will anticipate things and be proactive. I love that, and I constantly push to get them promoted.


I've adopted this mindset recently and it really does work. That being said, I feel it turns me into a bit of a "yes man". I wish there were more room for my authentic personality.


When I read the publication (the ACM magazine), I swear the content sometimes feels LLM-generated. Does anyone else get that impression? In general, I'm not very impressed with the content (I'm used to WIRED, btw).


The way I think of it (I might be wrong): basically, a model that has sensors similar to a human's (eyes, ears) and action-oriented outputs with some objective function (a goal to optimize against). I think autopilot is the closest thing to a world model, in that it has eyes, it has the ability to interact with the world (go in different directions), and it sees the response.


This is a very smart idea. I couldn't turn my Ring Alarm off even though I was on the same Wi-Fi connection as the system. In retrospect, it would be quite smart to switch over to a local network.


Does anyone have this mystical report?


