Dark data is on the rise, posing an increasing risk to businesses that handle large volumes of information. So what is the solution to inactive yet potentially harmful data?
Dark data is best described as information a company gathers that falls outside its day-to-day operations. It is also described as data that people neglect to delete “because it might come in handy one day”. The term “dark data” can also be applied to information that has become inaccessible because the device it was stored on is now obsolete.
Examples of data that is often left dark include server log files that show website visitor behaviour, customer call detail records that indicate consumer feedback, and mobile geolocation data that reveals traffic patterns useful for business planning. If harnessed correctly, this abundance of seemingly superfluous information can be used to drive revenue. However, bringing old data to light isn’t without risk.
A lot of dark data is not visible to data administrators, because it exists primarily in personal files whose content is managed directly by individuals rather than by any corporate application. This raises the question of quality. Rehashing old information for research or publication purposes, especially if it was gleaned from the internet, is potentially problematic because the source of the information is very often unknown, which makes it impossible to verify.
Additionally, when dark data is brought back to light, its original meaning may be lost in translation, leading to possible compliance issues if it is later published. And what is to stop private or confidential information being unwittingly copied into spreadsheets? That raises the compliance issue once again. The problem is that dark data is very often created using logic that was only ever designed to be understood by its creator, so a simple misunderstanding a long way down the line could be very damaging.
It’s gradually being recognised that dark data is a potential source of great peril as well as value, so much so that financial regulators are reportedly becoming more aware of, and concerned about, the risk inherent in spreadsheet models. Sooner rather than later, regulators and those responsible for Enterprise Information Management (EIM) will have no choice but to confront the prevalence of sleeping data which is best left to lie.