Actually, out-of-order instruction pipelines are limited by this problem too: the more instructions you allow in flight, the more stages the pipeline needs, and the more stages there are, the more physical distance information must travel on the chip. So lengthening the instruction pipeline runs into fundamental limits.
Also, the number of transistors on the path that switches the right bits out of memory (even cache) is getting comparable to the length of a pipeline section.
Hence the move to multicore.
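
To put a rough number on the physical-distance point, here is a back-of-envelope sketch of how far a signal could travel in one clock cycle if it moved at the speed of light in vacuum. The function name, the `velocity_fraction` parameter, and the clock rates are illustrative assumptions, not measurements of any real chip.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum (exact by definition)


def max_distance_per_cycle_mm(clock_hz, velocity_fraction=1.0):
    """Upper bound, in mm, on how far a signal can travel in one clock cycle.

    velocity_fraction is a hypothetical knob: the assumed signal speed
    as a fraction of the vacuum speed of light.
    """
    cycle_s = 1.0 / clock_hz
    return C_VACUUM_M_PER_S * velocity_fraction * cycle_s * 1000.0


if __name__ == "__main__":
    # Example clock rates chosen for illustration only.
    for ghz in (1, 3, 5):
        bound = max_distance_per_cycle_mm(ghz * 1e9)
        print(f"{ghz} GHz: at most {bound:.1f} mm per cycle at light speed")
```

This is only an upper bound. Signals on real on-chip wires propagate well below light speed, and part of each cycle is spent in logic rather than in transit, so the usable distance per cycle is much shorter than what this prints.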