That’s not exactly true: there doesn’t seem to be an upper bound (that we can reach) on storage capacity in the brain [0]. Instead, the brain actively distills knowledge that doesn’t need to be memorized verbatim down to its essential components, precisely to achieve this “generalized intuition and understanding” and avoid overfitting.
> That’s not exactly true [...] Instead, the brain actually works to actively distill knowledge that doesn’t need to be memorized verbatim into its essential components
...but that's exactly what OP said, no?
I remember attending an ML presentation where the speaker shared a quote I can't find anymore (speaking of memory and generalization :)), which said something like: "To learn is to forget"
If we memorized everything perfectly, we would not learn anything: instead of remembering the concept of a "chair", you would remember thousands of separate instances of things you've seen that share a certain combination of colors, shapes, etc.
It's the fact that we forget certain details (small differences between all these chairs) that makes us learn what a "chair" is.
Likewise, if you remembered every single word in a book, you would not understand its meaning; understanding its meaning = being able to "summarize" (compress) this long list of words into something more essential: storyline, characters, feelings, etc.
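The "forgetting details to learn the concept" idea above can be made concrete with a toy prototype-learning sketch (this is an illustration of lossy compression, not a claim about how the brain actually stores concepts; the feature encoding is invented):

```python
# Toy "prototype learning" sketch: many seen chairs are collapsed into a
# single averaged concept vector, and the individual instances are then
# free to be forgotten. Feature names are made up for illustration.

def learn_concept(instances):
    """Lossily compress many instances into one prototype (their mean)."""
    n = len(instances)
    dims = len(instances[0])
    return [sum(inst[d] for inst in instances) / n for d in range(dims)]

# Each chair: [height_cm, seat_count, has_backrest (0/1)]
chairs = [[95, 1, 1], [102, 1, 1], [88, 1, 0], [110, 1, 1]]
chair_prototype = learn_concept(chairs)
# The prototype captures "what chairs are like" in far less storage
# than the raw list of every chair ever seen.
```

The point of the sketch: the averaged vector is strictly less information than the instance list, and that loss is exactly what makes it a general "chair" rather than four specific chairs.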
There’s a story by Jorge Luis Borges called “Funes the Memorious” about a man who remembers everything but can’t generalize. There’s a line about him not knowing whether a dog glimpsed from the side at noon is the same dog as the one seen from the back at 12:01, or something like that. Swirls of smoke from a cigarette are memorized forever. He mostly sits in a dark room.
Long ago, I was introduced to the theory of Mappers and Packers[1], which are polar opposites in the ways that people can learn things. Mappers (like me) have a mental model of the universe which represents facts and knowledge as puzzle pieces that have to fit together into a coherent whole. Any inconsistencies in the fit between those pieces drive us nuts. When we encounter a new set of facts, we have a background process that tries to make them fit. Then all the new connections arise over time as we realize new ways we can combine old facts.
On the other extreme are packers. They have optimized for packing facts in bulk, with little regard for how those facts fit together. If you give this type of person a set of instructions that requires wider knowledge of how things fit, they will get lost, get frustrated, and/or need support. If you anticipate this and spend a bit of extra time showing how to handle all of the possible contingencies (and give them a document covering this), they're good, and will be quite happy with your support.
I think that mappers take more time figuring out the model, compressing the facts to save space and to increase their general applicability.
Not precisely. We don’t know if verbatim capacity is limited (and it doesn’t seem to be) but the brain operates in a space-efficient manner all the same. So there isn’t necessarily a causative relationship between “memory capacity” and “means of storage”.
> Likewise, if you remembered every single word in a book, you would not understand its meaning
I understand your meaning, but I want to clarify for the sake of the discussion that, unlike with ML, the human brain can both memorize verbatim and understand the meaning, because there is no mechanism for memorizing something without processing it (i.e. pure storage). The first pass(es) are stripped to their essentials, but subsequent passes provide the ability to memorize the same input verbatim.
Memorization is storing data. Generalization is developing the heuristics by which you compress stored data. To distill knowledge is to apply heuristics to lossily-compress a large amount of data to a much smaller amount of data from which you nevertheless can recover enough information to be useful in the future.
I did not mean to imply that compression implies generalization; if anything, the reverse. Compression is the act of cutting; generalization is the whetstone by which you sharpen the blade, and the blade is the compression heuristic. A more general heuristic is to compression what a sharper blade is to cutting.
I've thought about this a lot in the context of the desire people seem to have to achieve human immortality, or at least indefinite lifespans. If SciAm is correct here and the upper bound is a quadrillion bytes, we may never hit it given the bound on possible human experiences, but someone who lived long enough eventually would. After a hundred million years of life, or whatever the real number is, you'd either lose the ability to form new memories or have to overwrite old ones to do so.
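A back-of-envelope version of that capacity argument (the quadrillion-byte figure is from the SciAm estimate cited at [0]; the bytes-retained-per-day rate is an invented assumption, and the real timescale swings wildly with it):

```python
# Rough arithmetic on when an immortal would exhaust a quadrillion-byte
# memory. CAPACITY_BYTES follows the SciAm estimate cited in the thread;
# RETAINED_PER_DAY is a made-up assumption for illustration only.

CAPACITY_BYTES = 1e15        # ~1 quadrillion bytes
RETAINED_PER_DAY = 1e8       # assume ~100 MB of experience kept per day

days_to_fill = CAPACITY_BYTES / RETAINED_PER_DAY   # 1e7 days
years_to_fill = days_to_fill / 365                 # on the order of 10^4 years
```

Under this (arbitrary) retention rate you hit the ceiling after tens of thousands of years, not hundreds of millions; the structure of the argument survives either way, only the timescale moves.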
Aside from having to eventually experience the death of all stars and light and the decay of most of the universe's baryonic matter and then face an eternity of darkness with nothing to touch, it's yet another reason I don't think immortality (as opposed to just a very long lifespan) is actually desirable.
I imagine there would be technology or techniques letting you choose which memories to compress, and countless other options like extra storage you could instantly access, so I don't see any of these as real arguments against becoming immortal. If I have to choose between being dead and memoryless or losing some of my memories but still being alive, why would I choose being dead and memoryless?
And when losing memories, you would first just discard some details, as you do now anyway; only later would you start compressing centuries into rough ideas of what happened, with just the details lacking a bit.
I don't see it being a problem at all. And if something really does happen to the Universe, sure, I can die then, but why would I want to die before that?
I want to know what happens, what gets discovered, what becomes of humanity, how far we get in understanding what is going on in this place, and why we are here. Imagine dying without even knowing why you were here.
My naive assumption would be that it would be a fairly gradual process. You'd just always have a sliding window of the last N years of memories, with the older ones being progressively more fuzzy and unreliable.
You're obviously hand-waving away Alzheimer's and dementia. Humans don't know exactly how the brain works. The computational storage figure is just an estimate framed in terms of what we do understand: a von Neumann computer storing 1s and 0s. In every psychological test conducted on the human mind, there is clearly a limit.
As best as I’ve been able to research, it’s still under active exploration and there are hypotheses but no real answers. I believe research has basically been circling around the recent understanding that in addition to being part of how the brain is wired, it is also an active, deliberate (if unconscious) mechanism that takes place in the background and is run “at a higher priority” during sleep (sort of like an indexing daemon running at low priority during waking hours then getting the bulk of system resources devoted to it during idle).
There are also studies showing that “data” in the brain isn’t stored read-only, and that the process of accessing a memory involves remapping the neurons (which is how false memories are possible). So my take is: if you access a memory or datum sequentially, start to finish, each time, the brain knows it is to be stored verbatim for as-is retrieval; but if you access snapshots of it, or actively seek to and replay a certain part while trying to relate that memory to a process or a new task, the brain rewires the neural pathways accordingly. This implies there is an unconscious part that takes place globally, plus an active, modifying process where how we use a stored memory affects how it is stored and indexed (so data isn’t accessed via simple fields but rather via complex properties or getters, in programming parlance).
I guess the key difference from how machine learning works (and I believe an integral part of AGI, if it is even possible) is that inference is constant, even when you’re only “looking up” data and you don’t know the right answer (i.e. not training stage). The brain recognizes how the new query differs from queries it has been trained on and can modify its own records to take into account the new data. For example, let’s say you’re trying to classify animals into groups and you’ve “been trained” on a dataset that doesn’t include monotremes or marsupials. The first time you come across a platypus in the wild (with its mammaries but no nipples, warm-blooded but lays eggs, and a single duct for waste and reproduction) you wouldn’t just mistakenly classify it as a bird or mammal - you would actively trigger a (delayed/background) reclassification of all your existing inferences to account for this new phenomenon, even though you don’t know what the answer to the platypus classification question is.
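The platypus scenario can be sketched as a nearest-centroid classifier that refuses to force a label when the input is too far from everything it knows, and instead flags it for later reclassification (a toy illustration; the feature encoding and threshold are invented, and real continual-learning systems are far more involved):

```python
# Toy sketch of "inference that can trigger retraining": a nearest-centroid
# classifier that, on seeing an input too far from every known class,
# flags it as novel instead of forcing a bad label.

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify(x, centroids, threshold):
    label, d = min(((lbl, distance(x, c)) for lbl, c in centroids.items()),
                   key=lambda p: p[1])
    if d > threshold:
        return "NOVEL"  # platypus case: schedule a background re-clustering
    return label

# Features: [lays_eggs, has_fur, has_mammaries] -- invented encoding
centroids = {"bird": [1, 0, 0], "mammal": [0, 1, 1]}
platypus = [1, 1, 1]  # eggs AND fur AND mammaries
print(classify(platypus, centroids, threshold=0.9))  # -> NOVEL
```

The "NOVEL" branch is the interesting part: a typical trained model would just emit whichever wrong label is nearest, whereas the brain, as described above, notices the mismatch and revises its categories.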
imo, it amounts to revisiting concepts once more general principles are found (and needed). For instance, you learn the alphabet, and it's hard: the order is tricky, the sounds are tricky, etc. But eventually it gets distilled into a pattern. You still have to count from A to remember what letter 6 is, until you encounter that problem many times and the brain creates a 6=F mapping. I think of it in economic terms: when the brain realizes it's cheaper to create a generalization, it does so on the fly, and that generalization takes over the task.
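The 6=F example maps neatly onto memoization: pay the slow, sequential cost once, then cache the direct mapping (a toy sketch; the function and cache are mine, not anything from the thread):

```python
# Toy sketch of the "6 = F" generalization: the first lookup walks the
# alphabet from A (the slow, sequential strategy); the answer is then
# cached, so later lookups are a direct mapping.

import string

_cache = {}

def letter_at(n):
    """1-indexed: letter_at(6) -> 'F'."""
    if n not in _cache:
        # Slow path: count up from A, like reciting the alphabet.
        for i, ch in enumerate(string.ascii_uppercase, start=1):
            if i == n:
                _cache[n] = ch  # generalization: remember the mapping
                break
    return _cache[n]

print(letter_at(6))  # -> F (slow the first time, cached afterwards)
```

In Python you'd normally reach for `functools.lru_cache` instead of a hand-rolled dict; the explicit version just makes the "create the mapping when it pays off" step visible.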
Sometimes it's almost like creating a specialist shard to take over the task. Driving is hard at first, with very high task overload and lots to pay attention to. With practice, a little automated part of yourself takes care of those tasks while your main general intelligence can do whatever it likes, even as the "driver" deals with seriously difficult tasks.
Is there a “realistic upper bound” in things that should be memorized verbatim?
The ancient Greeks probably memorized the Iliad and other poems (rhyme and metre might work as a substitute for data compression, in this case), and many medieval preachers apparently memorized the whole Bible…
Maybe somewhere between sleep and normal waking idle, because there is actually quite a bit going on during sleep. There has been quite a bit of research, though, on higher "clocked-up" states consuming far more energy, such as grandmasters playing a chess tournament.
[0]: https://www.scientificamerican.com/article/new-estimate-boos...