Data is exploding. The variety of data being created by workers inside and outside the workplace, and the velocity at which that data is being shared, make corporate compliance officers sleep with one eye open, because uncontrolled data equals unknown risk, and the unknown is scary. Think about it: in addition to the terabytes of data lurking in companies’ disparate systems, organizations today are creating new content that is expected to drive 60 percent growth in enterprise data stores (Worldwide Big Data Technology and Services 2012-2015 Forecast, Mar 2012, IDC).
Most corporate compliance officers focus on the latter, because newly created data is the shiny object grabbing attention. However, equal focus needs to be placed on legacy data (sometimes known as dark data), which is often unknown and unmanaged and may be out of compliance with internal or external requirements. Many organizations today deal with information sprawl by throwing more storage at the problem, by accepting the risk as a cost of doing business, or by simply ignoring it. None of these approaches adequately protects the organization. In fact, 31 percent of organizations report that poor electronic recordkeeping is causing problems with regulators and auditors (Information Governance: Records, Risks, and Retention in the Litigation Age, AIIM 2013). Further, an individual data breach costs organizations an average of $5.5 million (2011 Cost of Data Breach Study, Ponemon Institute 2011). There are also countless examples of fines, sanctions or adverse inference decisions triggered by data being accidentally lost or mishandled.
To get a handle on dark data, it is important first to understand what it is. Dark data can take many forms, including both structured data (machine-created information that typically fits in rows and columns) and unstructured data (human-generated information that is much more difficult to search). It can also come in many formats and reside in many places, which makes it harder to access. It can accumulate simply because of our reliance on cheap storage or because of special circumstances such as mergers and acquisitions (M&A). In virtually all cases, legacy data poses legal, regulatory and internal risk if it isn’t managed effectively.