It's really just a bandwidth issue, and VFX studios deal with it all the time on their renderfarms. Textures are the main problem: prod/archviz work like the Ikea stuff is generally really clean and doesn't have THAT many textures, whereas in VFX everything's dirty and very detailed, so you're generally pulling in >300GB of textures per medium-level scene.
At least in VFX, everything's generally done lazily, so you only read textures as and when you need them, if they're not already cached. There's a bit of overhead to doing this (locking if you use a global cache, or duplicated memory if you use per-thread caches, which are faster as there's no locking), but it solves the problem very nicely. On top of that, the textures are mipmapped, so for things like diffuse rays you only need to pull in the very low-res approximations of the image instead of reading, say, 8K images and point-sampling them, which helps a lot too...
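To make the lazy loading and mipmap idea concrete, here's a rough Python sketch of the global-cache flavour. It's not how any particular renderer does it: `LazyTextureCache`, `mip_level`, `fake_loader`, the file name and the footprint numbers are all made up for illustration, and a real renderer would read tiles rather than whole mip levels and wouldn't hold the lock while hitting disk.

```python
import math
import threading
from collections import OrderedDict


class LazyTextureCache:
    """Shared LRU cache keyed by (texture path, mip level). Illustrative only."""

    def __init__(self, loader, max_bytes):
        self._loader = loader            # hypothetical: (path, level) -> bytes
        self._max_bytes = max_bytes
        self._used = 0
        self._entries = OrderedDict()    # oldest entry first
        self._lock = threading.Lock()    # the locking cost of a global cache
        self.loads = 0                   # how many times we actually read data

    def get(self, path, level):
        key = (path, level)
        with self._lock:
            data = self._entries.get(key)
            if data is not None:
                self._entries.move_to_end(key)   # mark as recently used
                return data
            # Cache miss: only now do we pay for the read.
            # (Simplification: the lock is held during the load.)
            data = self._loader(path, level)
            self.loads += 1
            self._entries[key] = data
            self._used += len(data)
            # Evict least recently used entries until we fit the budget again.
            while self._used > self._max_bytes and len(self._entries) > 1:
                _, evicted = self._entries.popitem(last=False)
                self._used -= len(evicted)
            return data


def mip_level(base_res, footprint_texels):
    """Pick a mip level whose texel size roughly matches the ray footprint."""
    max_level = int(math.log2(base_res))
    if footprint_texels <= 1.0:
        return 0
    return min(max_level, int(math.log2(footprint_texels)))


def fake_loader(path, level, base_res=1024):
    """Stand-in for a disk read: one byte per texel at the requested level."""
    res = max(1, base_res >> level)
    return bytes(res * res)


cache = LazyTextureCache(fake_loader, max_bytes=2 * 1024 * 1024)
sharp = mip_level(1024, 1.0)      # narrow camera-ray footprint: full res
blurry = mip_level(1024, 256.0)   # wide diffuse-ray footprint: tiny mip
first = cache.get("wall_diffuse.tx", blurry)
again = cache.get("wall_diffuse.tx", blurry)   # served from cache, no reload
```

The per-thread variant would just give each render thread its own instance without the lock, trading duplicated memory for no contention.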