Immersion cooling is getting big. At the last Supercomputing conference I probably saw at least a dozen vendors of immersion cooling equipment. My datacenter has one cluster with liquid cooling caps over the sockets, and two immersed clusters. The latter two have basins of various degrees of sophistication under them for when they do spring a leak.
Not quite. Every modern BLAS is (likely) based on Kazushige Goto's implementation, and he was indeed at TACC for a while. But BLIS, probably the best open-source implementation, is from UT Austin and not connected to TACC.
You use a lot of scare quotes. Do you have any suggestions for how things could be different? You need batch jobs because the scheduler has to wait for resources to become available. It's kinda like Tetris in processor/time space. (In fact, that's my personal "proof" that workload scheduling is NP-complete: it's isomorphic to Tetris.)
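For fun, the Tetris analogy can be made literal with a toy first-fit scheduler: jobs are rectangles (nodes wide, time steps tall) packed onto the cluster "board". This is my own illustrative sketch, not any real scheduler's algorithm or API:

```python
def first_fit(jobs, total_nodes):
    """Place each (nodes, duration) job at the earliest start time
    where enough nodes are free for its whole duration."""
    busy = {}       # busy[t] = number of nodes in use at time step t
    schedule = []   # chosen start time for each job, in input order
    for nodes, duration in jobs:
        t = 0
        # Slide the rectangle down in time until it fits.
        while not all(busy.get(t + d, 0) + nodes <= total_nodes
                      for d in range(duration)):
            t += 1
        # Mark those node-hours as occupied.
        for d in range(duration):
            busy[t + d] = busy.get(t + d, 0) + nodes
        schedule.append(t)
    return schedule

# Three jobs on a 4-node cluster: a 1-node job slots into the gap
# left beside the 3-node job, just like a well-placed Tetris piece.
print(first_fit([(3, 2), (2, 2), (1, 2)], 4))  # -> [0, 2, 0]
```

Real schedulers do far more (priorities, fairshare, backfill), but the packing problem underneath is exactly this.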
And what's wrong with shell scripts? They're a lingua franca, generally accepted across scientific disciplines, cluster vendors, workload managers, .... Considering the complexity of some setups (copy data to node-local file systems, run multiple programs, post-process results, ...) I don't see how you could set things up other than in some scripting language. And then Unix shell scripts are not the worst idea.
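To illustrate, here is a sketch of a typical Slurm job script doing that stage/run/post-process dance. All program names and paths are invented:

```shell
#!/bin/bash
#SBATCH --job-name=sim
#SBATCH --nodes=4
#SBATCH --time=02:00:00

# Stage input data to a node-local file system (paths illustrative):
mkdir -p /tmp/$SLURM_JOB_ID
cp $HOME/input.dat /tmp/$SLURM_JOB_ID/

# Run the parallel program across the allocated nodes:
srun ./simulate /tmp/$SLURM_JOB_ID/input.dat

# Post-process and copy results back to shared storage:
./summarize /tmp/$SLURM_JOB_ID/output.dat > $HOME/results/summary.txt
```

Try expressing that in a static config format and you end up reinventing a scripting language anyway.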
Debugging failures: yeah. There are too many levels where something can go wrong, and it can be a pain to debug. Still, your average cluster processes a few million jobs in its lifetime. If more than a microscopic fraction of those failed, computing centers would need far more personnel than they have.
When used as configuration? Here are some things that are wrong:
* Configuration forced into a single line makes long options inconvenient to write (for example, if you want Slurm with Pyxis and you need to specify the image name, it will most likely not fit on the screen).
* Oh, and since we're mentioning Pyxis: its image names have a pound sign in them, and now you also need to figure out how to escape it, because for some reason, used literally, it breaks the comment parser.
* No syntax highlighting (because it's all comments).
* No way to create more complex configuration: no types other than strings, no variables, no collections of things.
* No way to reuse configuration (you have to copy it from one job file to another). I honestly don't even know what happens if you try to source a job configuration file from another job configuration.
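Concretely, the single-line and pound-sign complaints look like this (Slurm with the Pyxis plugin assumed; the image name is invented):

```shell
#!/bin/bash
# Slurm reads its configuration out of these magic comments:
#SBATCH --nodes=2
#SBATCH --time=01:00:00
# A container image reference cannot be wrapped across lines, and the
# '#' that Pyxis-style image names contain collides with the comment
# syntax, so you get to discover the escaping rules the hard way:
#SBATCH --container-image=registry.example.com#team/project/some-very-long-image:v1.2.3-cuda12.1
srun ./my_program
```

Strings-in-comments is the whole type system.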
All in all, it's really hard to imagine a worse configuration format. This sounds like a solution from some sort of code-golfing competition where the goal was to make it as bad as possible while still retaining some shreds of functionality.
No. Please don't. The book still gets updated regularly, so any copy that is not straight from the repository will quickly go out of date. (I first published this book 6 years ago. You can find pdf copies out on the intertubes that are 200 pages shorter than the most recent version.)
Also, if you link straight to the pdf you don't get to see links to my other books.
Or links to places where you can get a paper copy. Which actually earns me a couple of pennies.
So please: don't make your own link to the pdf file. Don't.
Basics of what? High performance computing? I'd say those are numerical analysis topics, and there are tons of books for that. Unless you can make a case that there are high performance aspects to root finding, I'm not going to include it. (You should have said FFT. That has a very funky interaction with caches and the TLB that absolutely necessitates its inclusion.)
Contact me if you want to discuss the outline of a short section with me. My reason for not adding the hyperbolic case was that it didn't seem to add much computationally to the discussion.
1. Do you know how much time it takes to keep a 600 page book up to the minute?
2. But yeah. I'm going to roll the tutorials into a volume of their own.