by Lynda Bourne
The world of business is moving toward storing and exchanging documentation in electronic formats—and the transition is swift. While this process has its advantages, my team and I have been working on a major report based on a data set of more than 250,000 records, and the project has highlighted some problems. Namely, as it becomes easier to preserve every iteration of a document, finding useful information becomes harder.
There are two basic types of document storage and retrieval systems with a couple of nuances:
- Systems that rely on taglines or document characteristics for sorting and searching (e.g., document titles, taglines in emails, dates, authors, senders and receivers)
- Systems that allow the full content of most document types to be searched (think Google)
If your organization isn’t using one or more of these systems, it soon will be! You’ll probably find that they solve many problems typically found in paper-based systems, but they also introduce a new suite of issues. Here are some of the ways in which these systems fall short, and ways to overcome these challenges:
Establishing one source of the truth. As people become more used to the system, they begin to rely on it. And if something isn’t uploaded, stored or created in the tool, it ceases to exist. You cannot rely on people remembering to do the right thing, and if someone is doing something unethical, they will try to evade the system. The solution lies in system design and automation, backed by the discipline and processes needed to make sure the document retrieval system contains all of the documents.
Creating one document, one record. Send an email to 10 other people in the organization and you immediately have 11 copies of the same document scattered across various email accounts. (And this is before “reply all” and email trails start to build.) Your document management system needs to be smart enough to recognize identical copies of the same document and archive the 10 duplicates. However, when someone changes the email (maybe by forwarding it), you have a new document, and the process gets more complex if there are attachments. Here, the solution is a system that can manage families of documents.
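To make the idea of recognizing identical copies concrete, here is a minimal sketch in Python. It assumes that "identical" means the same subject and body after trimming whitespace, and it compares copies by hashing that content. The function names, the dictionary fields and the sample messages are all hypothetical, invented for illustration. A real system would also need to handle attachments, and it would need to link related messages into families, which this sketch does not attempt.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(subject: str, body: str) -> str:
    """Hash the trimmed subject and body so identical copies share one key."""
    normalized = f"{subject.strip()}\n{body.strip()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_copies(messages):
    """Map each content fingerprint to the mailboxes that hold that content."""
    groups = defaultdict(list)
    for msg in messages:
        key = content_fingerprint(msg["subject"], msg["body"])
        groups[key].append(msg["mailbox"])
    return dict(groups)


# Hypothetical data: one email sent to 10 people sits in 11 mailboxes.
mailboxes = ["sender"] + [f"recipient{i}" for i in range(1, 11)]
messages = [
    {"mailbox": m, "subject": "Status report", "body": "All on track."}
    for m in mailboxes
]

# A forward with added text changes the content, so it is a new document.
messages.append(
    {"mailbox": "recipient3", "subject": "Fwd: Status report",
     "body": "FYI.\n\nAll on track."}
)

groups = group_copies(messages)
for fingerprint, holders in groups.items():
    print(fingerprint[:12], len(holders), "copies")
```

In this sketch the 11 identical copies collapse into one record, while the forwarded message becomes a second record of its own.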
Finding what you need, easily. This is the biggest challenge with massive archives of documents (and was central to our work over the last few months). How do you find information? A search based on document contents may seem like the best option, but if you Google “PMI PMP exam change,” you get 891,000 results. And it’s Google’s systems that decide which pages to show you and in what order. That means if you’re looking for something specific, you may have to dig through a sea of hyperlinks and page titles. This gets even more difficult if you want to check whether something did not get documented. A null result may mean the alleged document does not exist, or it may mean your search terms did not quite match the wording used in the document.
Developing systems that balance providing information that you need against burying you under masses of content requires the wisdom of Solomon. Artificial intelligence can help if the search is routine, but for an important ad hoc search you are probably on your own. One way to help focus searches is by structuring the information, using folders or codes. The problems are minimizing misplaced information and persuading everyone to use the system. Again, system design is central to developing processes that work.
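As a rough illustration of how structure can focus a search, the Python sketch below narrows a keyword search to a single project code before matching on content. The `Document` class, the `project_code` field, the `search` function and the sample archive are all hypothetical, not features of any particular document management product.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    project_code: str  # hypothetical coding scheme assigned at filing time
    text: str


def search(documents, keyword, project_code=None):
    """Return titles whose text contains the keyword, optionally within one code."""
    keyword = keyword.lower()
    return [
        d.title
        for d in documents
        if (project_code is None or d.project_code == project_code)
        and keyword in d.text.lower()
    ]


# Hypothetical archive for illustration only.
archive = [
    Document("Exam change notice", "PMP-2024", "The exam outline will change next year."),
    Document("Cafeteria menu", "ADMIN", "Menu change: soup on Fridays."),
    Document("Budget review", "PMP-2024", "Budget approved without amendment."),
]

broad = search(archive, "change")                            # content only
focused = search(archive, "change", project_code="PMP-2024")  # code, then content
print(broad)
print(focused)
```

The focused search is only as good as the filing. A document stored under the wrong code simply disappears from the result, which is why minimizing misplaced information matters so much.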
The concept of a paperless project has been around for a while now and electronic document management systems are becoming increasingly common. The challenge that remains is scaling this concept up to the enterprise level and developing tools that can quickly provide you with the information you need from a pool of several million documents.
How do you store documents and make the information in them easy to find?