Your Documents Are Talking.  Are You Listening?

Most people have tried to open a document with a .pdf extension only to have the application say "Cannot open document, does not begin with pdf."  That happens because applications keep information about a document within the document itself, and they check that information rather than trusting the file extension.  Such metadata can reveal a lot about you, how you work, and even your clients or friends.
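To make the error message concrete: a PDF file normally begins with the marker "%PDF-" followed by a version number, and a viewer looks for that marker before it does anything else.  The sketch below is only an illustration of that check, not how any particular viewer is implemented; the function name looks_like_pdf is made up for this example, and some viewers are more lenient about where the marker may appear.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes start with the PDF header marker.

    Simplifying assumption: the marker must be the very first bytes.
    """
    return data.startswith(b"%PDF-")


def file_looks_like_pdf(path: str) -> bool:
    """Read only the first few bytes of a file and test them."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read(5))
```

A Word document renamed to end in .pdf fails this test, which is why the extension alone does not fool the application.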

There are actually two kinds of metadata: that kept by the operating system and that kept by applications such as Word and WordPerfect.  The system metadata includes the original author and various dates.  You can usually see this type of metadata by right-clicking the document and looking at its properties.
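The same system-level details can be read from a script.  The following is a minimal sketch using Python's standard library; the function name file_system_metadata is invented for this example, and it reports only the size and dates that the file system exposes.  An author name, where the Properties dialog shows one, is not available through this call.

```python
import os
from datetime import datetime, timezone


def file_system_metadata(path: str) -> dict:
    """Collect the size and timestamps the operating system keeps for a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        # Timestamps are converted to UTC so the output does not depend
        # on the local time zone of the machine running the script.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }
```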

Microsoft Word, in versions prior to 2007, keeps information about the last 10 times a document was saved, including the document name and directory path.  Suppose you use a boilerplate document and modify it for each client.  If you then save it with the client name as part of the file name, the metadata will record that name and reveal the past clients for whom you have used this boilerplate.

It is surprisingly easy to see the metadata in a Word document.  From inside Word, choose Open and, for the file type, choose "Recover Text From Any File" in the pull-down menu at the bottom of the window.  All the metadata will appear at the bottom of the document.  To get rid of Word metadata, convert the document to the .pdf format.  Alternatively, save it in the .rtf format, reopen it in Word, and save it in the .doc format.  With either technique, the system metadata will be retained, but the Word-specific metadata will be gone.
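For readers who would rather look without opening Word, the sketch below pulls runs of readable text out of any file's raw bytes.  It is a rough approximation of the idea behind "Recover Text From Any File," not a reproduction of what Word does; the function name printable_runs and the minimum run length of 6 characters are arbitrary choices for this example.  It checks for plain single-byte text and for the two-byte (UTF-16) text that older .doc files often contain.

```python
import re


def printable_runs(data: bytes, min_len: int = 6) -> list:
    """Return runs of printable characters found in raw bytes."""
    # Runs of ordinary single-byte printable characters.
    ascii_pattern = rb"[\x20-\x7e]{%d,}" % min_len
    # Runs of the same characters stored two bytes each (UTF-16, little-endian).
    utf16_pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_len

    found = [run.decode("ascii") for run in re.findall(ascii_pattern, data)]
    found += [run.decode("utf-16-le") for run in re.findall(utf16_pattern, data)]
    return found


def printable_runs_in_file(path: str, min_len: int = 6) -> list:
    """Convenience wrapper that reads a whole file and scans it."""
    with open(path, "rb") as handle:
        return printable_runs(handle.read(), min_len)
```

Running printable_runs_in_file on an old .doc file and scanning the output for folder paths or client names is a quick way to see what a recipient could find.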

If you are emailing Word documents to colleagues and clients, take care!