In the information age, data is crucial. From an information security point of view, data is what everybody is after. Data loss can hurt an organization both financially and in reputation. Most of the time people share their data knowingly, but sometimes we reveal critical information in the form of metadata without realizing it, and this data can play a major part in a cyber attack.
Metadata: Simple data can be described as raw values that need to be processed to generate information and derive knowledge. Metadata is commonly described as ‘data about data’; however, this definition is incomplete and does not cover all properties of metadata. A better definition, given by Wikipedia (http://en.wikipedia.org/wiki/Metadata), is as follows.
Metadata (metacontent) is defined as data providing information about one or more aspects of the data, such as:
- Means of creation of the data
- Purpose of the data
- Time and date of creation
- Creator or author of data
- Location on a computer network where the data was created
- Standards used
Metadata has been used for various purposes, from cataloging archives and data virtualization to SEO (Search Engine Optimization) for websites. All of this metadata is published intentionally by the owner to make information easier to manage, whereas here we are going to talk about metadata that users publish without (most of the time) being aware of it.
We can extract metadata for a given domain using a tool called FOCA.
FOCA: FOCA means ‘seal’ in Spanish. FOCA, or Fingerprinting Organizations with Collected Archives, is a Windows-based tool with a GUI that uses search engines to discover files on a target website and then extracts metadata from them. There is also an online version of the application, which can be found at http://www.informatica64.com/foca/.
Figure 1. FOCA extracts metadata from a word document
The user simply needs to enter a project name and the domain to be searched. FOCA queries different search engines to build a list of files available on that domain. Once the list is built, the user can download the file(s) so that metadata can be extracted from them, as shown in Figure 1. The extracted metadata can reveal sensitive information such as the operating system in use, the specific application used to create the file, the name of the machine, etc. This information can help an attacker tailor attacks against the infrastructure.
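To make the extraction step concrete, the following is a minimal sketch of reading the properties stored inside a Word (.docx) file using only the Python standard library. It is not how FOCA is implemented; it only illustrates the kind of information such a file carries. It relies on the fact that a .docx file is a ZIP package whose document properties live in docProps/core.xml and docProps/app.xml. The function name extract_ooxml_metadata is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Parts of an OOXML (.docx/.xlsx/.pptx) package that hold document properties.
METADATA_PARTS = ("docProps/core.xml", "docProps/app.xml")


def extract_ooxml_metadata(source):
    """Return {property_name: value} for a .docx-style file.

    `source` is a file path or a binary file-like object.
    Note: for untrusted files, a hardened XML parser such as defusedxml
    is a safer choice than the standard library parser used here.
    """
    metadata = {}
    with zipfile.ZipFile(source) as package:
        available = set(package.namelist())
        for part in METADATA_PARTS:
            if part not in available:
                continue
            root = ET.fromstring(package.read(part))
            for element in root:
                # Tags look like "{namespace}creator"; keep only the local name.
                name = element.tag.rsplit("}", 1)[-1]
                value = (element.text or "").strip()
                if value:
                    metadata[name] = value
    return metadata
```

Calling extract_ooxml_metadata("report.docx") on a typical document (the file name is only an example) usually returns keys such as creator, lastModifiedBy, created, Application and Company, which is the same class of information described above.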
Policies and procedures need to be developed for sanitizing documents before they are hosted online. Strong policies and mitigation methods such as Data Loss Prevention (DLP) tools (MetaShieldProtector, OpenDLP, etc.), if employed properly, can help prevent such data loss and help the organization implement defense in depth.
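As a rough illustration of what a sanitization step might look like, the sketch below rewrites a .docx-style package with the contents of its property parts removed. It is an assumption-laden example, not a description of how MetaShieldProtector or any DLP product works, and it does not touch other places where information can leak (custom properties, comments, tracked changes, embedded objects). The function name strip_ooxml_properties is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Package parts whose child elements carry author, timestamps, application, etc.
METADATA_PARTS = {"docProps/core.xml", "docProps/app.xml"}


def strip_ooxml_properties(data: bytes) -> bytes:
    """Return a copy of an OOXML package with its property values removed."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            content = src.read(item.filename)
            if item.filename in METADATA_PARTS:
                root = ET.fromstring(content)
                # Keep the (now empty) root element, drop every property.
                for child in list(root):
                    root.remove(child)
                content = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(item.filename, content)
    return output.getvalue()
```

A publishing workflow could run such a step on every document before upload and then re-check the result with an extraction tool to confirm that nothing sensitive remains.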