Let me start with an embarrassing admission. In all the months I have been turning over in my mind the relationship between memory and structured data, it occurred to me for the first time only the other day that the word "memory" has a very specific meaning in the field of computing, and one so closely connected to data as to be almost synonymous. More precisely, memory is a measure of the amount of room for data, and data is everything that populates memory.
So we have - in barely the space of a generation - grown so used to this elision that it has become invisible (at least to me). But on closer examination it is really peculiar. Memory - of the traditional human variety - is slippery and organic, as any legal expert will testify. It inhabits a dimension of vast, multifaceted complexity, where the storage of facts is tightly wound together with the senses and the emotions, so that the first wintry day or the famous bite of a madeleine can unlock access to recall of events and feelings otherwise completely out of reach to the rational mind. Computer memory, on the other hand, is binary (like all else computerish). The data is either there or it isn't. At its most coy, it might hide in a disk partition that is invisible to a standard search routine, but it isn't going to wait for the appearance of a lost chord to show its face. And when corrupted, it gets corrupted in straight lines and blocks, like this picture.

When human memory gets corrupted (as it does all the time), it does so in the way the mind dreams - in a way that is probably irreducible to verbal explanation or digital representation.
So what? Words evolve, and it's natural that technologies that transform our understanding of culture and human experience will call forth new vocabulary and forge new meanings out of old words. Memory is just one of a long list (icon, avatar, bug, application, folder, browse, to cite a few random examples) of words that have been transformed, revived, debased or ennobled in the crucible of new technology. But the significance of this change is what it tells us about the human experience of memory, our habits and expectations of it, and how they have been changed by computers. What are the implications?
1. "The data is either there or it isn't." We have grown used to trusting the fidelity of computer memory over our own, and so we entrust our information to it with great confidence. To some extent we have always done this with documentary technology, ever since we moved from the oral to the written. But we don't expect computer memory to yellow, rot or burn. We know that computers break and get lost or stolen - but we also know that we can hedge against this by backing up.
2. Because we trust the relative robustness and fidelity of computer memory, we have become more blasé about the challenges of filtering and retrieval. We breathe a subconscious sigh of relief and hand over the heavy lifting of our own personal knowledge economies to the machines. We're in danger of losing sight of the fact that memory is not the same thing as remembering. It is still up to us to know where and how to look (crafting the right search term, navigating our folder structures). Even if the data is there, we still need to know it's there in order for it to be useful.
3. What is true for private memory is doubly so for the public kind - the inherited knowledge of our societies and cultures. The Internet is our new global memory, a collective consciousness that throws up wonders like Wikipedia or the Khan Academy, each testifying to the inspiring power of visionaries and networks, but also to the commodification of mere knowledge.
I've blogged before about the treacherousness of these first two points (and Justin has picked up the related problem of the many ways not to know). But let's drill down into some of the problems around that last point. True, we are building up an astonishingly comprehensive and reliable databank that is increasingly accessible any time and anywhere ("martini knowledge" if you like, to adapt Ashley Highfield's unfortunate coinage). And we are also finding all kinds of useful ways through it, via algorithmic or social search. Our digital remembering increasingly blurs the private with the public, as we outsource the storage and filtering of our memories to friends and acquaintances on social networks. Not only the storage but also the processing of knowledge and memory is being digitised apace.
But there's the rub: what seems to be, and is sometimes said to be, our global brain has nowhere near the sophistication and power of our apparently puny individual brains to connect and analyse. We are tempted to trust to the cloud to do our knowing for us so we can get on with the more glamorous business of commentary, discussion and mashup. But if our brains don't have the facts they can't process them. Having access to a computer that knows what happened in Afghanistan over a century ago does not give you the vital historical perspective that simply knowing those stories yourself gives you. And that matters because it is still human beings, with their onboard knowing and processing, that make the decisions.
I haven't yet figured out what all this tells me - all comments gratefully received. But here are a couple of tentative, and hopefully complementary, conclusions:
1. We need to be a bit careful in how we re-evaluate what it means to learn. Knowing facts feels like a lower order of intellectual activity than processing them. Having pretty much ubiquitous access to calculators (on our computers and phones) means that you will never again need to do long division. It's tempting to leap by analogy to the idea that having access to the "global brain" means that you will never again need to commit a fact to your own memory. Why spend valuable time and mental energy on memorising when they could be spent on analysis? But learning how to think about things is still nowhere near a substitute for knowing them, as individuals. More than that - you need to know things in order to form the context to think about other things.
2. That said, we needn't retreat from the enterprise of making the machines help us think. This is where semantic technologies such as DBpedia should help supplement the current emphasis on folksonomy, term extraction and facial recognition. For example, tagging stories of events present and past in Afghanistan with "adventure" and "failure", situating those concepts in intellectual and emotional contexts that connect them with similar and opposite concepts, and making such connections readily available as links wherever stories appear, should act as a valuable supplement to the vital but hazardous and selective business of human remembering, retrieval and reflection. (For starters, I still very much like what Paul Rissen had to say a year ago about the potential for applying linked data to the production of news, and I'm waiting to see who's going to pick this up and run with it.)
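To make that second point a little more concrete, here is a toy sketch in Python of what concept tagging and linking might look like. Everything in it is an assumption made up for illustration: the story identifiers, the tags beyond "adventure" and "failure", the similar/opposite relations and the function names are all hypothetical, and a real system would draw its concepts from something like DBpedia rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical concept map: each concept points to similar and opposite concepts.
CONCEPTS = {
    "adventure": {"similar": ["expedition"], "opposite": ["caution"]},
    "failure": {"similar": ["retreat"], "opposite": ["success"]},
}

# Hypothetical stories, each hand-tagged with concepts (invented for this sketch).
STORIES = {
    "afghanistan-past": ["adventure", "failure", "retreat"],
    "afghanistan-present": ["adventure", "caution"],
    "unrelated-story": ["expedition", "success"],
}


def build_index(stories):
    """Map each tag to the set of story ids that carry it."""
    index = defaultdict(set)
    for story_id, tags in stories.items():
        for tag in tags:
            index[tag].add(story_id)
    return index


def related_stories(story_id, stories, concepts):
    """Group other stories by how their tags relate to this story's tags."""
    index = build_index(stories)
    links = {"same": set(), "similar": set(), "opposite": set()}
    for tag in stories[story_id]:
        # Stories sharing exactly this tag.
        links["same"] |= index.get(tag, set())
        # Stories tagged with a concept declared similar or opposite to it.
        relations = concepts.get(tag, {})
        for kind in ("similar", "opposite"):
            for other in relations.get(kind, []):
                links[kind] |= index.get(other, set())
    # A story should not link to itself.
    for kind in links:
        links[kind].discard(story_id)
    return {kind: sorted(ids) for kind, ids in links.items()}


if __name__ == "__main__":
    print(related_stories("afghanistan-past", STORIES, CONCEPTS))
```

The point of the sketch is only that the links come from the relationships between concepts, not from matching words in the text. The hard part, choosing the concepts and their emotional and intellectual contexts, is exactly the bit that stays human.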
So I suppose I am arguing for a more examined relationship between memory and data. The precise conclusions are less important than the act of paying attention to our assumptions as they evolve, and keeping an eye out for unintended consequences.