What is Dark Data? And what are the risks?

The term Dark Data is rather vague. Oddly enough, that vagueness hints at what it is all about: the forgotten documents and information stored in all sorts of places in the organisation that nobody remembers exist, let alone where they are or what can be done with them.

What is Dark Data anyway?

The definition of dark data is as follows:

Dark Data is a term for structured (data in databases) or unstructured (data in document form) information that people do not know exists and/or whose location they no longer remember.

For many, it is synonymous with information that was created or used once. Think, for example, of a .zip file that is unpacked and used, after which nobody gives it another thought. The same goes for documents stored in a place that is then forgotten. Even content that has been actively used can turn into Dark Data the moment priorities within an organisation change. In many cases such information is quite literally left where it is and then forgotten. What is worse, we often see information that cannot be found quickly enough being recreated. This only increases the risk and the quantity of Dark Data, not to mention the cost of recreating information that already exists.

How common is dark data?

The short and obvious answer is: a lot and often. The duplication of pre-existing information in particular is the biggest source of dark data. The situation is often described with the metaphor of the 'dark data iceberg': only a small part of all information is findable and well organised, while the vast majority of the data is neither visible nor findable.
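Because duplication is named here as the biggest source of dark data, a first inventory can start by looking for files with identical content. The following is a minimal sketch in Python, assuming the documents live on a file share that can be walked as a directory tree. The function names sha256_of and find_duplicates are illustrative and not part of any product. The sketch only catches byte-for-byte copies; a document that was retyped or slightly edited will not be detected this way.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    # Only hashes shared by more than one file point to duplicates.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling find_duplicates(Path("/some/share")) (the path is a placeholder) returns one list per group of identical files, which gives a rough impression of how much of the stored volume is redundant.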

[Figure: How much dark data exists within organisations?]

Concerns about dark data - what are the risks?

Those large volumes of unused and forgotten information can grow just as fast as the information that does matter, while any overview of it is lost completely. This is a major concern for many companies. As a result, the cost of eDiscovery (the legal search of digital data) is also skyrocketing. Every organisation that considers or anticipates potential claims has to deal with this. Dark Data also has a major impact on data analysis. Not only is there a lot of noise in the data to be searched, the analysis itself also takes longer. The results are further distorted by outdated information that is included regardless.

Moreover, that mountain of dark data poses a number of business risks:

  1. Security risks: Dark data may contain sensitive information, such as personally identifiable information (PII), financial data or trade secrets. If this data is not properly managed or secured, it can become the target of hackers or malicious attacks, leading to data theft, reputational damage or legal consequences.
  2. Privacy and compliance risks: Organisations must comply with laws and regulations related to data privacy, such as the General Data Protection Regulation (GDPR) in the European Union. Dark data may contain data subject to these regulations. If organisations do not have control over their dark data and fail to comply with legal requirements, they risk fines and legal sanctions.
  3. Operational inefficiencies: Unstructured dark data can hamper an organisation's operational efficiency. It takes time and resources to sift through large amounts of unorganised data, analyse it and gain valuable insights. The lack of actionable data can also lead to decision-making processes based on incomplete information.
  4. Missed opportunities: Dark data can contain valuable insights and opportunities that go untapped. Due to the lack of awareness or technological capabilities to analyse this data effectively, organisations can potentially miss out on valuable insights, new product opportunities, efficiencies or competitive advantages.

The solution: assigning metadata

How does metadata work?

Before Dark Data can be brought into the light, we need to identify it. This can be done manually, but a smarter approach is a platform that automatically classifies documents based on their content.
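To give an impression of what classifying on content means, here is a deliberately simple sketch in Python. It assumes the text of a document has already been extracted, and it uses a hypothetical set of keyword rules (RULES) that we made up for illustration. Real classification platforms rely on far richer signals than keyword counts, so this should be read as an illustration of the idea, not as how any particular product works.

```python
# Hypothetical keyword rules, invented for this example.
RULES = {
    "quotation": ("quotation", "quote", "offer valid until"),
    "invoice": ("invoice", "amount due", "vat"),
    "contract": ("agreement", "hereby", "termination"),
}


def classify(text: str) -> str:
    """Return the label whose keywords occur most often in the text."""
    lowered = text.lower()
    # Crude substring counting: good enough for a sketch, not for production.
    scores = {
        label: sum(lowered.count(keyword) for keyword in keywords)
        for label, keywords in RULES.items()
    }
    best = max(scores, key=scores.get)
    # A document that matches no keyword at all stays unclassified.
    return best if scores[best] > 0 else "unclassified"
```

The returned label can then be stored as metadata with the document, so that it no longer depends on someone remembering what the file was.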

Next, a good ECM (enterprise content management) system can organise documents quickly and intelligently based on metadata. Or you can decide to dispose of a document for good if it really no longer holds any value.

By assigning 'attributes' or 'tags' to content, an intelligent ECM system can directly identify pieces of information and relate them to each other.

This applies to both unstructured and structured data. Example: a quotation (unstructured data) can be linked by metadata to the CRM account for Customer A (structured data). The quotation then becomes visible in the CRM system the moment someone starts working on Customer A. In this way, metadata gives Dark Data its value back: all information related to Customer A is displayed in full in the CRM.
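The Customer A example can be sketched in a few lines of Python. All names here (Document, CrmAccount, the metadata key crm_account_id and the identifier ACC-001) are assumptions made for illustration; an actual ECM or CRM system will have its own data model.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """An unstructured item, described by free-form metadata tags."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class CrmAccount:
    """A structured record, as it might exist in a CRM system."""
    account_id: str
    name: str


def documents_for(account: CrmAccount, documents: list[Document]) -> list[Document]:
    """Return every document whose metadata points at the given account."""
    return [
        doc for doc in documents
        if doc.metadata.get("crm_account_id") == account.account_id
    ]


customer_a = CrmAccount(account_id="ACC-001", name="Customer A")
archive = [
    Document("quotation.pdf", {"crm_account_id": "ACC-001", "type": "quotation"}),
    Document("old_notes.docx"),  # no metadata: this one stays in the dark
]
linked = documents_for(customer_a, archive)
```

In this sketch only quotation.pdf ends up in linked. The untagged document is still stored somewhere, but nothing connects it to Customer A, which is exactly how information turns into Dark Data.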

Learning to see in the dark

When organisations learn to look at Dark Data in this way and can actually do something with it, the ultimate value of the available information becomes much greater and risks are avoided.

Want to know how to combat dark data in your organisation? Schedule a personal online consultation with one of our Dutch experts.

 
