Fractal Analytics Blog

Data types used in analytics for CPG industry

By Moiz Shujaee
March 31, 2015

Big data is a powerful way to focus your strategy, and applying analytics to it yields a complete view of the shopper. Modern marketers in the Consumer Packaged Goods (CPG) industry need to understand the complete shopping cycle so that they can customize the experience before the shopper even enters the store. Analyzing and acting on real-time data can be a major differentiator for the business.

A huge amount of data is generated at various points in a product's supply chain: when it is shipped from the factory, when it goes on the shelf, and when it reaches the shopper. Shipment data does not tell you what price the shopper paid, when the purchase was made, or which in-store promotions stimulated the sale. For this, you need retail-level sales information, also called “consumption data.” As a CPG manufacturer, you have a wide range of data types available from which to extract important insights.

Data can be extracted at various phases in a product's journey from manufacturer to consumer, and the data sources are distinguished by the phase they cover. The main sources of data are:

  • Shipment – A record of shipments taken from the warehouse and delivered to distributors or customers (manufacturer to distributor is direct; distributor to customer is indirect)
  • Scantrack – Most often referred to as Point of Sale (POS) data; it is generally provided by a third-party vendor that collects the data at the retailer
  • Survey – This data is collected by field staff from consumers, or from respondents, as is typical in primary research. It can be both qualitative and quantitative
  • Panel – This data is collected through registered shoppers who track their purchases, and is also known as household (HH) panel data
  • Social Listening – This data is collected from social media channels, for example Facebook likes and comments, Twitter retweets, and YouTube views
  • Digital Data – This data comes from web clicks and views, and from tracking keywords in e-mails

Because of their usability and wide range of applications in analytics, the most important of these data types are Scantrack and HH panel data.


Scantrack Data

Scantrack is Point of Sale (POS) data measured from the items scanned at checkout. It captures the sales of a store and can be aggregated across stores, retailers and regions. Scantrack databases have four dimensions: Market, Products, Facts and Periods.

1. Market

Markets, as the name suggests, define the distinct places where the products are sold. Broadly, these could be regions, such as parts of a country, or retail environments and channels, such as different store types. They can be further divided into four types:

  • Scantrack Markets – These refer to large store chains that capture POS data electronically and provide it to data collection agencies
  • Retail Account Reports (RAR) – These are markets in a specific geography, which include all stores in the region. The geography is defined by the data collection agency.
  • Census Trading Areas (CTA) – These are regions defined by the retailer chains; each is a geography that includes all stores belonging to that specific retailer
  • Competitor Markets – These are regions in which retailers include their competitors, in addition to their own data.

2. Products

Products and their value figures are arranged in a hierarchy. The database structure generally runs from the broadest level to the most granular:

  • Category
  • Manufacturer
  • Brand
  • Variant
  • Size
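As a rough illustration of how such a hierarchy supports aggregation, the sketch below rolls value sales up to any level. The row layout, the product names and the `roll_up` helper are all hypothetical and not taken from any vendor's database format.

```python
from collections import defaultdict

# Hierarchy levels, broadest first, as listed above.
LEVELS = ("category", "manufacturer", "brand", "variant", "size")

# Hypothetical rows: one entry per level, followed by value sales.
rows = [
    ("Shampoo", "Acme", "Silk", "Herbal", "200ml", 120.0),
    ("Shampoo", "Acme", "Silk", "Herbal", "400ml", 80.0),
    ("Shampoo", "Acme", "Silk", "Classic", "200ml", 50.0),
    ("Shampoo", "Bolt", "Pure", "Classic", "200ml", 150.0),
]


def roll_up(rows, level):
    """Sum value sales for every node at the given hierarchy level."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        # The key is the path from the category down to the requested level.
        totals[row[:depth]] += row[-1]
    return dict(totals)


brand_totals = roll_up(rows, "brand")
category_totals = roll_up(rows, "category")
```

Keying each total by its full path keeps two brands with the same name under different manufacturers from being merged.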

3. Facts

  • The facts include total volume sales and total value sales, each of which can be split into promotional and non-promotional sales
  • The facts also include distribution measures: numeric distribution (the percentage of stores carrying the product) and weighted distribution, expressed as a percentage of ACV (All Commodity Volume), along with price per unit or per pack
  • CPG companies can also define additional facts to be captured, such as sales in local currency and the currency conversion rates at the time of data collection
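The difference between numeric and weighted distribution is easiest to see with numbers. The sketch below assumes a hypothetical list of stores, each with an ACV figure and a flag for whether it stocks the item; the field names and values are illustrative only.

```python
# Hypothetical store universe: ACV is each store's total sales across all products.
stores = [
    {"store": "S1", "acv": 500.0, "stocks_item": True},
    {"store": "S2", "acv": 300.0, "stocks_item": False},
    {"store": "S3", "acv": 200.0, "stocks_item": True},
    {"store": "S4", "acv": 1000.0, "stocks_item": False},
]


def numeric_distribution(stores):
    """Percentage of stores that carry the item."""
    stocking = sum(1 for s in stores if s["stocks_item"])
    return 100.0 * stocking / len(stores)


def weighted_distribution(stores):
    """Percentage of total ACV accounted for by stores that carry the item."""
    total_acv = sum(s["acv"] for s in stores)
    stocking_acv = sum(s["acv"] for s in stores if s["stocks_item"])
    return 100.0 * stocking_acv / total_acv


def promo_share(promo_sales, non_promo_sales):
    """Percentage of total sales that were made on promotion."""
    return 100.0 * promo_sales / (promo_sales + non_promo_sales)


numeric_pct = numeric_distribution(stores)    # 2 of 4 stores
weighted_pct = weighted_distribution(stores)  # 700 of 2000 ACV
```

Here the item is in half of the stores but reaches only 35% of ACV, because the largest store does not carry it.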

4. Periods

This dimension covers the time periods for which data is collected and ordered. The data can be made available at a weekly, monthly or yearly level, or for any pre-defined aggregate period.
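A pre-defined aggregate period can be as simple as a fixed block of consecutive weeks. The following sketch assumes weekly sales arrive as an ordered list and groups them into four-week periods; the `aggregate_periods` name and the four-week default are assumptions made for illustration.

```python
def aggregate_periods(weekly_sales, weeks_per_period=4):
    """Sum consecutive weeks into fixed-length periods.

    A trailing block with fewer weeks than weeks_per_period is kept as a
    partial period rather than dropped.
    """
    if weeks_per_period < 1:
        raise ValueError("weeks_per_period must be at least 1")
    return [
        sum(weekly_sales[i:i + weeks_per_period])
        for i in range(0, len(weekly_sales), weeks_per_period)
    ]


weekly = [10, 20, 30, 40, 50, 60, 70, 80]
four_week_periods = aggregate_periods(weekly)
```

Calendar months need more care than this, since a week can straddle two months and the split rule has to be agreed with the data provider.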

Panel Data

Household panel data is recorded after each of a panelist's shopping trips. The panelists record their purchases, including the location or store of the purchase and any promotional deal in the store. The information is then tracked by demographic group and collated into the total panel data.

The important measures used for household panel data are:

  • Penetration – The percentage of households purchasing a specific product at least once during the time period within the area defined
  • Buying Rate – The average volume of product purchased per buying household during a time period
  • Purchase Frequency – The average number of times each household purchases the item
  • Purchase Size – The average number of products purchased by a shopper in a single shopping trip
  • Loyalty – The percentage of a buyer's total category needs that are satisfied by a particular brand
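The measures above can be sketched from raw trip records. The example below is a minimal illustration with hypothetical data: each record is one purchase of one brand on one trip, buying rate is taken as the average volume per buying household, and loyalty is approximated as the brand's share of the category units bought by that brand's buyers.

```python
PANEL_SIZE = 5  # hypothetical: households in the panel, including non-buyers

# Hypothetical trip records: (household_id, brand, units)
trips = [
    ("H1", "Silk", 2),
    ("H1", "Silk", 1),
    ("H1", "Pure", 1),
    ("H2", "Silk", 3),
    ("H3", "Pure", 2),
]


def panel_measures(trips, brand, panel_size):
    """Compute the five panel measures for one brand."""
    brand_trips = [t for t in trips if t[1] == brand]
    buyers = {t[0] for t in brand_trips}
    if not buyers:
        raise ValueError(f"no household bought {brand!r}")
    brand_units = sum(t[2] for t in brand_trips)
    # All category units bought by the households that bought this brand.
    category_units = sum(t[2] for t in trips if t[0] in buyers)
    return {
        "penetration_pct": 100.0 * len(buyers) / panel_size,
        "buying_rate": brand_units / len(buyers),
        "purchase_frequency": len(brand_trips) / len(buyers),
        "purchase_size": brand_units / len(brand_trips),
        "loyalty_pct": 100.0 * brand_units / category_units,
    }


silk = panel_measures(trips, "Silk", PANEL_SIZE)
```

Under these definitions, buying rate equals purchase frequency multiplied by purchase size, which is a useful consistency check on real panel extracts.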

Analysis based on the above data types can have a significant impact when it is mapped to the correct data source. Properly formulated analysis and insights help a firm adjust its market strategy and emerge as a leader.


About Moiz:

Moiz Shujaee is a Computer Science Engineer and a post graduate in Business Administration with specialization in marketing. He is working as an Analyst with Fractal in the categories domain, analyzing data for a major CPG manufacturer.
