That video is super informative, thanks for posting it. I now realize I hadn't fully appreciated Kurosawa and I need to go back and watch his movies again.
Just FYI, Kurosawa's favorite filmmaker was Ozu Yasujiro, whose visual style is vastly simpler (using almost no pans or zooms, etc). Wim Wenders' 1985 documentary Tokyo-Ga is an excellent introduction (the first six minutes are here):
Our brains don't "explode" at movie cuts because there have been years of evolution in the techniques that make a cut appear fluid, essentially keeping a continuity of the movement and position of the main subject of the action.
The photographer and the editor take particular pains to ensure that.
E.g. if you "cross the line" between two actors, you'll see a "jump" because the actors will swap positions.
If you cut in a movement, the next shot will have to show something continuing the movement, or the cut is done when the object stops or is hidden behind something.
That makes the cuts a fluid motion even if the whole scene changes. If you break the rules, even if the scene is the same, you'll feel a "jump" between the shots and it will feel unnatural.
I don't think that's what the author meant at all. I think the author meant any sudden change in the visual field, e.g. cutting to the next scene, an outside shot, or whatever. It's an interesting perspective, but probably has a simple answer: when you dart your eyes, you already lose a fraction of a second and have to re-process everything. It's not hard for the brain to put things back together or interpret a totally new scene. In fact it probably only becomes tiring if it's at a strobe-like frequency, and extended for several seconds (dozens of cuts, several per second, sustained over multiple seconds.) That's something that we really don't experience, and which really does become tiring or disorienting as a visual effect.
And once the filmmaker has figured out the rules, they understand that there's precedent for having those kinds of jumps. If a real-life scene is moving quickly and I shut my eyes for a second (or if I am moving and shut my eyes for a second, then open them in a different location), my brain can't 'explode' at that or evolution has failed. Our brain interpolates.
And luckily we are intelligent enough to both understand that a series of pictures displayed on a screen is a representation of a real-life scene and also intelligent enough to understand that we don't have to be moving for it to make sense. Unfortunately, understanding those things still takes quite a bit of subconscious brain power anyway, which is why some movies can be exhausting to watch.
Moreover, continuity errors nowadays are extremely distracting, at least if you know what to look for. Some things, like a chess board completely disappearing for a scene, pop out to me: https://www.youtube.com/watch?v=ZeMM4hCdDyk
It's the analog hole. If I upload The Simpsons to YouTube, it's taken down immediately. If I upload a video recording of a TV showing The Simpsons, it needs to be taken down by hand.
I also sometimes take still pictures of my computer monitor because there's no good way to take a screenshot of a PC and send it in a picture message to someone without uploading to Imgur or Dropbox or something, which takes time compared to just snapping a pic.
What I would fully support is having TVs with good enough screens that you can't tell the difference between recording them and recording a real-life scene.
But there are literally millions of Simpsons clips on YouTube. Nobody minds as long as they're very short clips.
It's not the look of the screen that bugs me, it's the inability of the person to hold the phone still and stop talking. I suspect that what's supposed to be happening in this scene is that Ace Ventura has swept all the pieces off the chessboard in an excess of accusatory zeal, but I can't tell if there's any sound of chess pieces hitting the floor because the guy keeps talking and waving the phone around.
Yes, people in general are terrible when they're recorded. It's not as bad in real life, in the moment, but watching it on a video is awful. I recorded a fantastic event once that was completely mindblowing. Could not watch the video because everything that happened seemed so awful. Our minds are machines of context and in-the-moment stimulus that has us performing actions that don't hold up to repeated or out-of-context viewing. That's why actors are so highly trained and respected, because even pretending to be yourself is impossible for most people. Recordings are just so unnatural that we have to get ourselves into a different mindset before hitting the red button.
Not only that, but we also have to make intelligent inferences from multiple points of observation (a memory of an aggressive animal and the aftermath, combined with a present visualization of a different but similar aggressive animal, and knowledge of one's offspring nearby). All of these things can be visualized in the mind as a sequence of environmental, first person perspective imagery, with emotional associations and connected knowledge, that comes in the form of feelings, and constructed symbolic representations of reality.
We put things together almost instantaneously without even understanding how we fill in the gaps. The same mechanism occurs when one jumps to conclusions, or understands how one line of a mathematical proof implies that the next line can now be read. It's all absurd and has the potential to be brain exploding when you really sit and try to think about what makes you think that gap is totally filled, but it's also not, because of memories and feelings like 'obvious', and also because it's a big part of science to question those gaps.
Brains are weird though, even with all the explanation and science and repetition and predictability in scientific knowledge, it is still 'awe' invoking.
Imagine running through a varied landscape, through and around trees and hills, while looking around every which way to maintain situational awareness. That's a far more challenging task than keeping track of movie cuts, but one that humans do quite well.
Add to that sleeping and waking up, orientating yourself. Even dreaming seems to constantly provide new situations, almost like movie edit cuts, that the mind processes and tests itself with.
I've noticed in recent years that I really cannot parse out quick cuts in movies anymore. For example, I have a very hard time following action sequences in the Marvel movies, and the Transformers movies were just a jumble to me. I mostly don't even bother with trying to watch movies anymore.
These action sequences being a jumbled up mess is a feature, I think. It allows the action on-screen to be visually sloppy and imprecise (thus easier to make), and the audience makes up for it (subconsciously) with their imagination filling in the imperfections with something even better than what they could put on film.
At least, that's what I tell myself. Perhaps we're both just getting old.
You're exactly right. Not only is the ASL (average shot length) decreasing over time, but flurries of quick shots are great for hiding action that doesn't actually happen. It definitely can be used as a crutch. Not always, it can accelerate the feeling of action, but it often feels lazy, and a disservice to the viewer.
e.g. Citizen Kane had an ASL of 12 seconds. Transformers: Dark of the Moon? 3.4 seconds! http://www.cinemetrics.lv/database.php Average ASL in 2006 was actually 2.9s.
Comparing to Citizen Kane seems unfair as it's not an action movie. So here are some other action movies. (NB: There are multiple entries for some of these but I'm not listing them all).
Die Hard: 4.5
Die Hard 2: 3.1
Die Hard 3: 1.8
Armageddon: 2.2 -- at least he's been consistently on the quick side
Total Recall: 3.7 -- original, 1990
Robocop: 3.7 -- original
Haywire: 6
And I keep trying to search for movies they don't have in the database. So certainly for action movies (1980+), at least for some popular ones, the ASL isn't out of the ordinary.
But then I searched for Alfred Hitchcock and got some interesting results. This database has many duplicate entries and some wildly different values. The Skin Game had 3 entries at about the same time (17.1, 17.3, 17.4). However, Rear Window has an entry at 3.7 and another at 8.8, and several at 10.5. I suppose I need to read about their methodology to understand why there's so much variance.
It's crowd sourced. They have software that returns timestamps when you click a button and you use it when watching a movie. It's definitely error prone and the data should be treated as if you got it from Mechanical Turk.
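To make the arithmetic concrete, here's a rough sketch of how an ASL falls out of click timestamps like that. This is not the Cinemetrics code; the function names and the sample timestamps are made up for illustration, and it assumes the first shot starts at 0 seconds.

```python
from statistics import mean


def shot_lengths(cut_times, film_end):
    """Return the length of each shot in seconds.

    cut_times: timestamps (seconds) where a viewer clicked to mark a cut.
    film_end:  running time in seconds (assumption: the first shot starts at 0).
    """
    boundaries = [0.0] + sorted(cut_times) + [film_end]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


def average_shot_length(cut_times, film_end):
    """ASL = mean of the individual shot lengths."""
    return mean(shot_lengths(cut_times, film_end))


# Hypothetical 30-second clip, marked by two different viewers.
careful_viewer = [4, 9, 12, 20, 26]   # caught every cut
sloppy_viewer = [4, 12, 26]           # missed two of the five cuts

asl_careful = average_shot_length(careful_viewer, 30)
asl_sloppy = average_shot_length(sloppy_viewer, 30)
```

Every missed click merges two shots into one, so the sloppy viewer's ASL comes out longer than the careful viewer's for the same footage. That alone could account for some of the spread between duplicate entries, though I haven't checked how the database actually handles it.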
There's so much data in movies that is really hard to get automatically. We should really be digitizing and cataloging old films if we're able. It's hard to find films in the public domain but this is what we should be doing with them: 4k scanning and digitizing. Perform a rough automatic restoration on all frames and audio then put it all online, every frame in 4k, and accept pull requests. During this process all shots and scenes should be accounted for. They are timed with actors and locations marked. Subtitles should have crucial metadata such as the actor speaking the lines. Music and musicians should be identified.
That's also part of the reason why films are still shot at 24fps even with advanced cameras. You have fast-paced action scenes at 60fps and it doesn't feel as good as 24fps because you have more information. You take away information and add motion blur and it feels more "film-like" and more frantic.
Here's a nice blog post on the fights in JOHN WICK, which are quite fluid. From the viewpoint of a prose writer, but a good take on the visual techniques at work.
I think it's more of a style to avoid actually doing choreography of fights and so on. Cut cut cut and zoom in on random things to give the illusion of action instead of having to do the real work.
Well, we're all getting older, sure, but movies really have gotten less coherent in their action scenes. As a test of your "I'm getting old" hypothesis I once stepped through a confusing fight scene in a modern movie at 25% speed through mplayer, and I still could not work out what was happening coherently. Backgrounds are blurs, cuts are these little tiny slices, it's all sound and fury signifying little.
Compare with, say, Terminator 2. I can't quite derive the exact layout of the factory in the final act, but generally speaking I know who's who, where they are, how close they are to each other, etc., all the basics I really ought to know if I'm to be shown tension instead of cinematically simply told there is tension.
Though the Marvel movies have generally been sensible, if fast-paced, in my opinion, so who knows.
I hate more recent action movies (e.g. Bond movies) for that reason. I can't figure out what's going on. I miss the more sedate pace of 1980s and 1990s action movies. Go back and watch Die Hard or The Fugitive sometime.
Those movies in particular are notorious for being a mess in that regard. It's not just you.
Look for action movies where the actors are able to do their own stunts, particularly when they are famed for it or are even athletes. Those movies tend to cut less, unless the director decides to introduce lots of cuts for stylistic reasons.
I noticed the same in the last Avengers and Superman movies. After a while I had no idea what was going on. I am not sure if the action scenes are just some cool looking scenes added to arouse as much emotion as possible without making sense.
Having the sound precede the visual is called a J cut; having the sound (from the previous scene) follow the visual change is called an L cut. This is from the visual appearance of the editing software, where video is in the top half of the sequence editor and audio in the bottom. The shape of the protruding audio clip relative to the vertical video cut loosely resembles the letters J and L.
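If it helps to see the definition as logic, here's a tiny sketch that labels a transition from where the audio switches relative to the picture cut. The function name, the "straight cut" label for the no-offset case, and the tolerance value are my own assumptions, not something taken from any editing software.

```python
def classify_cut(video_cut, audio_cut, tolerance=0.04):
    """Label a transition by when the audio switches relative to the picture.

    video_cut: time (seconds) at which the picture changes to the next scene.
    audio_cut: time (seconds) at which the sound changes to the next scene.
    tolerance: offsets this small count as no offset (assumed value,
               a bit under one frame at 24 fps).
    """
    offset = audio_cut - video_cut
    if abs(offset) <= tolerance:
        return "straight cut"
    # Audio switches first: the next scene's sound precedes its picture.
    # Audio switches later: the previous scene's sound trails over the new picture.
    return "J cut" if offset < 0 else "L cut"


early_sound = classify_cut(video_cut=10.0, audio_cut=8.5)     # "J cut"
trailing_sound = classify_cut(video_cut=10.0, audio_cut=11.5)  # "L cut"
```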
Cool, I never knew there was a name for that. Watching Lost, I would always know a flashback was coming up when I started to hear the ambient noise of the next scene. (Or maybe there was a special flashback sound effect? Some kind of swoosh? Having trouble remembering...)
The technique when talking about screenwriting or film analysis is a prelap. It's the opposite of a bridge, in which the sound from a previous scene carries into the next.
Ostensibly, this is a reflexive design element, though these things can have subtle connotative/implied ideas. Sometimes the prelap audio matches both pre- and post-cut scenes, and helps you to think about the content in a different way. Or, it may just be an interesting way to cut your content. :-)
It also compensates for the fact that we have a very near field of high-res vision, that is constantly pointed in a different direction, by still keeping the big picture in mind.
The article mentions an experiment where they showed short movies with local actors to people in a village in Turkey who had never seen movies or television. They found that people processed the cuts with no issues.
Yeah, it's likely that the state of the visual cortex suffers a hard reset on the cut, but we integrate our experience at a higher level (semantic, not pure visual). It's the higher level that dictates our overall response. It might be one of the reasons why we possess intelligence and awareness.
I've wondered something similar about movies: why do they even make sense to our social brains?
Throughout history, you'd never be in a position where there would be two people having a confidential conversation in which you weren't an active part. How does the monkey brain react to that? The closest thing might be e.g. two village elders discussing your fate in front of you, such that you know you can't participate.
But then, if that's how your brain frames it, why would you enjoy that at all? It would throw off all the indicators of you being low status.
>Throughout history, you'd never be in a position where there would be two people having a confidential conversation in which you weren't an active part.
It's part of how you learned to speak, but the full process still requires your interaction, which the movie doesn't allow. In any case, after that developmental stage, it's a very atypical situation.
I think the fact that the movie represents a part of your visual field makes it understandable -- the surrounding setting is stable. If everything around you changed instantly, your brain would have a hard time adjusting. Might be interesting when more sophisticated VR is available.
One thing I noticed while watching the movie Birdman was the long shots for the scenes instead of multiple short takes. I felt uneasy the whole movie because of this, maybe my brain got used to the short takes.
yes, or dart our eyes. I think this is the answer.
But the author still has an interesting perspective, as obviously to an extent we immerse ourselves in what is shown on screen, even though we always know we're outside of it. (It's on a screen at a fixed length from us.) That is something evolution never made us do.
We already have a name for this in film theory; it's called suture, our ability to mentally 'stitch things together'. The reason people typically ignore drastic changes in continuity (eg the example of different color scarves etc.) is that we don't process the whole scene at once; even in fairly static shots we have regions of interest - if it's two people talking to each other without much movement, then you'll look at their eyes - and in shots involving motion, wherever in the frame the motion ends in one shot will be where you'll focus your attention following a cut.
Commercials exploit this all the time. Next time you watch TV, squint and/or turn off the sound to break your connection to the semantic content of the advert, and try to just look at it as a series of random shapes and colors. Frequently, the most dynamic shot in the commercial is followed by a relatively static shot of the product, with the product situated at wherever the greatest movement in the frame was (typically lower left or right third in live action).
This is especially true for detergent and personal care products that need to be picked out from crowded shelves at the supermarket. It's also why locally produced commercials look cheap - there's usually no sense of movement within the frame, just a series of images that don't connect up visually and so leave only a random fragmentary impression in the mind.
Startup owners, this is often also a problem for you: it's really easy to make a video these days and everyone wants to tap into the feel of those Apple commercials, but it requires a lot more than a cute acoustic guitar track and footage of people looking happy! When you opt for a video presentation rather than text copy on your landing page, you're asking the visitor to process a lot more information a lot more rapidly. If you don't have a cohesive visual as well as semantic narrative, then your video can end up making the same bad impression as the stereotypical 'geocities' web page does for static presentation. Put another way, if your shots don't link up or your sound isn't good, then you will never make it past people's visual cortex.
Your semantic context rides into people's brains on the back of the visual and audible context you create for it. You can watch your product video endlessly and feel good about it, because it's an expression of your semantic map. People who don't have that semantic map already in place (ie everyone else) won't be able to put it together if you just give them the pieces in video form but don't show how they cohere. You may find it illuminating to test your product video by playing it on a large TV to small children or even pets. Animals will watch TV if it seems like something is happening; if you can't hold the attention of a dog or cat, then I guarantee you that whatever it is you're trying to explain to your human viewers isn't getting through to most of them.
Edit: another video https://vimeo.com/channels/everyframeapainting/113439313