A lot of nonsense in this article, but I'll try to address his points:
(1) There are several royalty free implementations, ranging from the minimal (PicoRV32) to widely-used embedded (Rocket) to reasonably high-end (BOOMv*). There are also proprietary implementations, and this is how it should be.
(2) RISC-V has defined various platform specifications which means in practice ISA fragmentation is not going to be a thing. The platform will define the minimum set of extensions, and the rest will be probed at runtime (which is exactly the way x86 works).
(3) About economics - the author ought to read some Clayton Christensen, or maybe take a look at how companies like Intel, AMD and Arm came to prominence.
(4) Yes, open spec != open implementation, but see point (1).
(5) RISC-V has been designed by experts in the field and adopted by some very large hardware companies with experience and who ship in very large volumes (millions and billions of chips). It may well not be perfect, but it doesn't need to be in order to be successful (see also x86).
High-end means building and optimizing a microarchitecture for a specific node and process technology. Typically it costs $100-200 million to do, and the result is no longer high-end after about 3 years.
It has some comparisons so you can judge for yourself. Of course, since this is a project driven by only 2-3 grad students it isn't completely fleshed out. However, you would assume that if the ISA just wasn't suitable for high performance implementations that a project like BOOM would have uncovered that by now.
Right, every so often BOOM's Github page gets reposted as if it was something new, and some of the leadership of RISC-V will talk about this architectural feature or that architectural feature that might support a high performance RISC-V chip, but the proof of a high-performance chip is a high-performance chip.
(The experience of Intel's Itanium, IBM's Cell processor and many others shows that it's not enough to have a few good ideas but you have to have ZERO bad ideas that slow you down to get a high performance design.)
You have to have zero bad ideas that set a ceiling on performance... i.e., you have to clear ALL the bottlenecks out of the way.
Just doing something about the one potential bottleneck that you feel like doing something about doesn't necessarily get you a gain in performance at all.
The idea of scheduling parallelism in the compiler (VLIW) doesn't work for mainstream workloads because the time it takes for data to come back from the DRAM is highly variable.
A super-scalar processor can keep running some instructions at the hardware level while others block, waiting for data to get back from DRAM.
A VLIW processor packs N instructions together (say N=3) and if one of them is blocked by DRAM, they all block. (If one of them is blocked by Optane they all block for a very long time...)
It looks obvious in retrospect but it's amazing how most of the RISC workstation vendors missed it and put themselves out of business by getting on the Itanic train.
(VLIW is successful for DSP and GPU, but that's because workloads like that can have completely predictable fetches)
I don't know what the problem w/ Cell was exactly, but it was the same in that it couldn't pull data from DRAM fast enough to keep the silicon busy.
With Itanium, they had advanced loads to decouple the pipeline from unknown memory latencies.
Also Itanium was a heavily superscalar design; I think you meant out of order.
Cell just plain didn't really let you directly address main RAM, so you never saw unknown DRAM accesses stalling the cores. Access to the local memory was always single cycle.
In neither case were unknown RAM latencies an issue with the design.
That's probably the main bear argument concerning RISC-V--at least as it applies to the West. ARM has a big ecosystem, it licenses at reasonable rates, so if you want to design a chip, cost-savings associated with RISC-V are in the noise. (And it's at least a bit unclear what other benefits associated with open source software come into play.)
RISC-V proponents said that Arm's licence rates might be reasonable, but the negotiation time is (or was, back when RISC-V was just a concept) not reasonable at all.
And time to market is quite important in this area.
At some point if it catches on, being open and royalty free, I should be able to call Global Foundries, TSMC, or Samsung and say "I want 300,000 of your RISC-V chips in 64 bit with X list of extensions built on your Y nm process".
Your 3rd point also jumped out at me as I read the post. Seemed like a classic case of someone arguing that a coming disruptor was too low end to be a competitive threat.
My understanding was that RISC-V was created as a teaching tool at Berkeley so that students and researchers could have a completely open and modern architecture to study. I assume part of the reason it's a (very) simple core with a lot of small extensions is to encourage incremental development by students. Not needing multiplication right away is a feature here.
I don't think there was ever a commercial goal in mind.
There are other reasons. Let's say you're building a Larrabee-style GPU. You want a simple core with a large set of SIMD units (along with some custom extensions for the hardcoded parts of the pipeline). Adding a multiplier and divider to your core would kill its usability, either by becoming non-standard or by wasting huge amounts of die area and time.
Multiplications and divisions happen out of band on most chips (that is, on the side) because they take so many clock cycles to complete. All the extra synchronization takes a lot of extra work.
The RISCV ISA is seeing an uptick in the embedded space where it can be very efficient in core size and thus cost. Here too, requiring those units would hurt the market. Meanwhile, it can reasonably be assumed that any desktop-grade CPU will have those extensions just like it’s assumed that desktop x86 chips all have SSE.
Speaking of that, there are over twenty x86 extensions released since 2000, eight or so in just the last 5 years. Despite this, life goes on, because the tools that make handling this easy have existed for decades.