After a decade of programming around normalized, structured data, the era of big data has moved the focus back to unstructured data and further divided it into subcategories. The industry offers different techniques for dealing with each kind of data structure.

Different Data Structures:

  1. Structured Data
  2. Semi-Structured Data
  3. Quasi-Structured Data
  4. Unstructured Data

Structured Data

Structured data is mainly found in traditional database designs and is composed of various data types that store text, images, media, etc. Other structured data sources include OLAP, CSV, DBMS, etc.
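As a minimal sketch of what "structured" means in practice, the snippet below defines a table with typed columns using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: every column has a declared type, which is what
# makes the data "structured". BLOB can hold binary data such as images.
conn.execute(
    """
    CREATE TABLE customers (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        signup_date TEXT,
        avatar      BLOB
    )
    """
)

# Illustrative rows only.
conn.executemany(
    "INSERT INTO customers (name, signup_date, avatar) VALUES (?, ?, ?)",
    [
        ("Alice", "2023-01-15", b"\x89PNG"),
        ("Bob", "2023-02-01", None),
    ],
)

# Because the structure is known up front, querying is straightforward.
rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(rows)
```

Because the schema is fixed before any data arrives, no parsing step is needed to locate a field; that is the main contrast with the other categories.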

Semi-Structured Data

Semi-structured data includes text files with a defined pattern that enables parsing, such as XML data files that are self-describing and defined by an XML schema.
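To illustrate how a self-describing format enables parsing, here is a small sketch using Python's standard xml.etree.ElementTree module. The sample document and its element names (orders, order, customer, total) are assumptions made up for this example, and no schema validation is performed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag and attribute names are illustrative.
SAMPLE_XML = """<orders>
  <order id="1001">
    <customer>Alice</customer>
    <total currency="USD">25.50</total>
  </order>
  <order id="1002">
    <customer>Bob</customer>
    <total currency="EUR">12.00</total>
  </order>
</orders>"""


def parse_orders(xml_text):
    """Return one dict per <order> element, using the tags as field names."""
    root = ET.fromstring(xml_text)
    orders = []
    for order in root.findall("order"):
        total = order.find("total")
        orders.append(
            {
                "id": order.get("id"),
                "customer": order.findtext("customer"),
                "total": float(total.text),
                "currency": total.get("currency"),
            }
        )
    return orders


orders = parse_orders(SAMPLE_XML)
for record in orders:
    print(record)
```

The tags themselves carry the field names, so the parser needs no external description of the layout, only knowledge of which tags to look for.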

Quasi-Structured Data

Quasi-structured data includes textual data with erratic formats that can be formatted with tools, such as web clickstream data. This data can be obtained from logs, which makes web server logs a good example of quasi-structured data: the logs are parsed and mined to discover usage patterns and to uncover relationships and areas of interest on a website or group of sites.
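The parse-then-mine idea can be sketched roughly as follows. The example assumes the logs follow the Common Log Format, which real servers may or may not use, and the sample lines, the regular expression, and the function names are all hypothetical. Lines that do not match are skipped, reflecting the erratic nature of this kind of data.

```python
import re
from collections import Counter

# Hypothetical log lines, assumed to be in Common Log Format.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /products HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326',
    "garbled line that does not match the pattern",
]

# Named groups turn each matching line into a record with known fields.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for a matching line, or None otherwise."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_hits(lines):
    """Count requests per path, ignoring lines that cannot be parsed."""
    hits = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            hits[record["path"]] += 1
    return hits


hits = page_hits(SAMPLE_LOG)
print(hits.most_common())
```

Counting hits per path is only the simplest form of usage mining; the same parsed records could be grouped by host or time to look for the relationships mentioned above.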

Unstructured Data

Unstructured data has no inherent structure and is available as text, PDF, images, videos, etc.
