tyushk's comments | Hacker News

quack3.exe again in a way. If it's been done for years on GPU shaders, then why not CPU code?

While highly specific optimisations might give a tiny advantage, the main boost here is vector code, which would work on any processor supporting the instructions. They could have checked the vendor bits and used those to gate the optimisation on any capable CPU, but instead they limited it to a small subset of programs and CPUs. It trips the "PR above all else must have highest score" sense.

Intel BOT seems to be patches for specific binaries (hence why they didn't see a difference for Geekbench 6.7), unlike BOLT/Propeller which are for arbitrary programs. The second image from their help page [1] showcases this.

[1] https://www.intel.com/content/www/us/en/support/articles/000...


Applying targeted binary patches shouldn't take 40 seconds... unless that's also a fake "so it looks like it's working really hard" delay.

I might be thinking of a different project then...

I swore Intel had their own PLO tool, but I can only find https://github.com/clearlinux/distribution/issues/2996.


Found it. It was https://www.phoronix.com/news/Intel-Thin-Layout-Optimizer.

It was open source, but has since been deprecated.


See also: Nominative determinism in hospital medicine, by orthopaedic surgeons Limb, Limb, Limb and Limb

https://publishing.rcseng.ac.uk/doi/10.1308/147363515X141345...


Data tagging? 20k tok/s is at the point where I'd consider running an LLM on data from a column of a database, and these <=100 token problems provide the least chance of hallucination or stupidity.


The code blocks are unreadable when the user's system reports dark mode and dark mode is toggled on for the web page.

Cool writeup. Have you had to do any other weird shenanigans with getting FFI between Rust and Clojure other than needing to use CStrings?


In Rust, wouldn't implementing BitOr for Fn/FnOnce/FnMut violate the orphan rule?


I'm envisioning that in Rust (and Python), the operator overload would be on a class/struct. It would be the macro/decorator (the same one that adds logging) which would turn the function definition into an object that implements Fn.


I have done exactly that as an exercise in what you can do with Python: overload |, and write a decorator that you can use on any function to return an instance of a callable class that calls that function and overloads |.

Whether it is a good idea to use it is another matter (it does not feel Pythonic), but it is easy to implement.
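For the curious, a minimal sketch of that exercise (all names here are illustrative):

```python
class Pipeable:
    """Decorator: wraps a function so instances can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __or__(self, other):
        # `self | other` builds a new Pipeable that applies self, then other.
        return Pipeable(lambda *args, **kwargs: other(self(*args, **kwargs)))


@Pipeable
def double(x):
    return x * 2

@Pipeable
def increment(x):
    return x + 1

print((double | increment)(5))  # double(5) = 10, then increment(10) = 11
```

Since `__or__` returns another Pipeable, pipelines of any length compose left to right.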


Somehow this reads like model CoT.


I don't think local inference, as browsers stand today, will take off, simply because of the lead time (downloading the model), but a new web API for LLMs could change that: some standard API to communicate with the user's preferred model, abstracting over local inference (like what Chrome does with Gemini Nano?) and remote inference (LM Studio, or calling out to a provider). That way, every site that wants a language model just has to ask the browser for it, and sites would share weights on disk.


It sounds good, but I'm not sure that in practice sites will want to "let go" of control this way, knowing that some random model can be used. Usually sites with chatbots want a lot of control over the model behaviour, and spend a lot of time working on how it answers, be it through context control, guardrails or fine tuning and base model selection. Unless everyone standardizes on a single awesome model that everyone agrees is the best for everything, which I don't see happening any time soon, I think this idea is DOA.

Now I could imagine such an API allowing a site to request a model from Hugging Face, for example, and caching it long term that way, much like LM Studio does. But doing this because some external resource requested it, versus you doing it purposefully, has major security implications, not to mention it doesn't really get around the lead-time problem you mention whenever a new model is requested.


> "Open source" to me is sharing the input required [...]

I don't disagree with your sentiment, I am also more interested in human-written projects, but I'm curious about how this works. Would a new sorting network not be open source if found by a closed source searching program, like AlphaDev? Would code written with a closed source LSP (e.g. Pylance) not be open source even if openly licenced? Would a program written in a closed source language like Mojo then be closed source, no matter what the author licences it under? The line between input and tool seems arbitrary at best, and I don't see what freedoms are being restricted by only releasing the generated code.


The line is blurry, for sure. Code generated by a closed-source compiler (or LSP) is still "your" code. Maybe the difference is whether humans can reproduce and learn from the process? With traditional code, you can read the commit history and understand the author's thinking. With AI-generated code, that context is lost unless explicitly shared. Food for thought.


I don't think your ultimatum holds. Even assuming LLMs are capable of learning beyond their training data, that just leads back to the purpose of practice in education. Even if you provide a full, unambiguous language spec to a model, and the model were capable of intelligently understanding it, should you expect its performance in your new language to match the petabytes of Python "practice" a model comes with?


Further to this, you can trivially observe two further LLM weaknesses: 1. that LLMs are bad at weird syntax even with a complete description. E.g. writing StandardML and similar languages, or any esolangs. 2. Even with lots of training data, LLMs cannot generalise their output to a shape that doesn’t resemble their training. E.g. ask the LLM to write any nontrivial assembler code like an OS bootstrap.

LLMs aren’t a “superior intelligence” because every abstract concept they “learn” is done so emergently. They understand programming concepts within the scope of languages and tasks that easily map back to those things, and due to finite quantisation they can’t generalise those concepts from first principles. I.e. it can map python to programming concepts, but it can’t map programming concepts to an esoteric language with any amount of reliability. Try doing some prompting and this becomes agonisingly apparent!


Would this be similar to how Rust handles async? The compiler creates a state machine representing every await point and in-scope variables at that point. Resuming the function passes that state machine into another function that matches on the state and continues the async function, returning either another state or a final value.
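As a rough analogue (Python for brevity; names are illustrative, and the real compiler output is more involved): each await point becomes a state, in-scope variables live on the object, and polling matches on the state and continues from there:

```python
# Hand-rolled analogue of the state machine a compiler generates for
# something like: async fn add() { let a = fetch_a().await; fetch_b().await + a }

PENDING = object()  # sentinel meaning "not ready yet, poll again"


class AsyncAdd:
    def __init__(self, fetch_a, fetch_b):
        self.state = "Start"
        self.fetch_a = fetch_a
        self.fetch_b = fetch_b
        self.a = None  # in-scope variable preserved across the await point

    def poll(self):
        if self.state == "Start":
            result = self.fetch_a()
            if result is PENDING:
                return PENDING  # suspend: state and `a` survive until the next poll
            self.a = result
            self.state = "AwaitingB"
        if self.state == "AwaitingB":
            result = self.fetch_b()
            if result is PENDING:
                return PENDING
            self.state = "Done"
            return self.a + result  # final value
        raise RuntimeError("polled after completion")
```

A driver (the "executor") just calls poll() repeatedly until it gets something other than PENDING.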


It's only related insofar as it involves separate storage for the data. I'm thinking of functions that run to completion, not functions that yield and resume, but maybe it wouldn't be hard to do coroutines by storing the continuation pointer in the state struct.

