Thursday, August 7, 2014

Profiling process for Big Data and its three dimensions

No Big Data project will realize its benefits unless it is driven by strategic endeavors, organizational support and key technical resources. No project should begin before its stakeholders and its success criteria, which should be measurable, have been identified. Big Data projects should grow incrementally. The use cases and decisions that benefit from Big Data attributes should be catalogued. Because the results are incremental, each increment and its metric should be assessed by the control and governance mechanism to support decision-making. The impact of each decision should be compared against the metric system to achieve the use-case-driven SMART (specific, measurable, actionable, realistic and time-bound) goal.

Big Data is making organizations look at their data differently. Data is no longer only structured data. Unstructured formats such as images and videos are also part of the data and are helping businesses grow. Different perspectives give the data different meanings, so there is a need to create different data profiles in a realizable manner. There are three main profiles, Customer profiles, Product profiles and Service profiles, which together give organizations a holistic view of their data and its meaning.

Customer Profiles:

Before the dawn of Big Data, the traditional transactional system was the only major source of data about customer interactions. In this data-centric approach, day-to-day transactions and payment information were the key data used to ‘understand’ the customer. Such interaction is just a slice of the customer’s entire buying behavior. Big Data gives an opportunity to profile customers using the 4I approach: Interaction, Information, Inference and Intelligence.

Interaction: Earlier, organizations used to look for ways to interact with customers. Such interaction gives an idea of what the customer thinks about the organization, its products and its service. Now business is observing a paradigm shift in the idea of interaction itself. These days, organizations want to understand how customers interact not only with them but with the outside world as well. There are two main interaction categories: Internal Interaction and External Interaction.

Internal Interaction:
All customer interactions triggered or controlled by the organization can be called Internal Interactions. Online portals, service desks, call centers and surveys are some examples of this type. Organizations are fully aware of these sources and have complete control over the data collected through these interactions. The main data repositories are traditional relational databases and data warehouses. Metadata such as log files is also available to IT operations or service operations and can provide some information about service quality attributes.
External Interaction:
All customer interactions that are outside the organization’s control can be called External Interactions. Social networking, social CRM and e-commerce are some examples. Customers post publicly about multiple products, their experiences and their opinions, and in doing so indirectly advertise their buying behavior. One cannot really control this influx, but one can use the data for cross-selling, competitor analysis, revising market segmentation and so on. New channels such as mobile technologies, the Internet of Things and wearable technologies are fuelling this data outburst heavily. Organizations can capitalize on these external interactions by using Big Data analytics.

Information: Data gathered through internal and external customer interactions can be integrated to create an ‘information base’ for the organization. New data models and architectures should be used to make this ‘mainstream’ data. Traditional database structures should be altered to accommodate this newly defined customer-centric data. Earlier, a customer was represented as an entity with its transactions and product-related attributes. With this new external data, customers will have multi-dimensional profiles with digitized attributes such as payment modes, devices used, locations travelled and social interaction index, which will help organizations understand consumers’ buying behavior much better. The customer can be profiled in the following different ways.

Inference: Different analytics tools can be used to identify correlations between customer profiles. These profiles hold heterogeneous information, so statistical tools should be used to identify correlation and regression relationships in such data sets. The findings help to draw inferences, and these inferences should be tested against the data over a pre-defined time period.
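
As a rough illustration of the correlation and regression step, here is a minimal Python sketch using only the standard library. The two attributes (a social interaction index and monthly spend) and all the numbers are made-up assumptions, not data from any real system; in practice a statistics package would be used on far larger data sets.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / sqrt(var_x * var_y)


def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("cannot fit a line when x is constant")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Hypothetical profile attributes for five customers (illustrative values only).
social_index = [1, 2, 3, 4, 5]
monthly_spend = [110, 135, 150, 180, 205]

r = pearson(social_index, monthly_spend)
slope, intercept = linear_fit(social_index, monthly_spend)
```

A coefficient close to +1 or -1 suggests a strong linear relationship worth testing over the pre-defined time period; a value near 0 suggests there is none.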

Intelligence: At this final stage, dependencies between customer profiles can be identified using data patterns and mining techniques. The profiles should be updated regularly to observe trends. Trend analysis helps to forecast customer interactions and to take the necessary actions in time. This helps organizations deliver an enhanced customer experience.
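
One very simple way to turn regularly updated profile data into a forecast is a moving average. The sketch below is only an assumption about how such a trend step might look; the monthly interaction counts are invented and real forecasting would normally use richer models.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or len(history) < window:
        raise ValueError("window must be >= 1 and no longer than the history")
    recent = history[-window:]
    return sum(recent) / window


# Hypothetical monthly counts of customer interactions (illustrative values only).
monthly_interactions = [120, 132, 128, 141, 150, 158]
next_month = moving_average_forecast(monthly_interactions, window=3)
```

Re-running the forecast every time the profile is refreshed shows whether interactions are trending up or down.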

Product Profiles:

Product Profiles capture product-related information and group it into relevant categories. Companies have started to look at product performance beyond turnover and market share. Big Data is certainly going to be helpful in profiling products to provide more insights. Product profiling may differ from domain to domain. The following four product profiles give an idea of the profiling activity and the information required. These profiles are always mutually inclusive: the interaction between them helps to build better analytics systems.

Performance Profile: This profile tries to capture product performance through various indices such as market share, turnover and churn rate. This gives the product team and the sales team a clear idea of the product’s performance in different geographies and customer groups. This is the most traditional form of profiling, and almost all organizations do it.

Loyalty Profile: Customer loyalty can be identified through this profile. In-house transactional data and data from social media can help to derive the loyalty profile. Each product can have a loyalty index and a degree of loyalty. The loyalty index can be defined as the extent to which a customer or group of customers tends to buy the same product repeatedly. The degree of loyalty can be defined as the number of times a customer or group of customers has chosen a particular product. Together, these two numbers give the loyalty map of a product. The loyalty index explains the spread and the degree explains the depth. This loyalty map provides important insights into the product’s behavior in the market.
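
The two numbers can be made concrete in many ways. The Python sketch below picks one possible reading, and both formulas are assumptions rather than a standard definition: the loyalty index is taken as the share of a product’s buyers who bought it more than once (spread), and the degree of loyalty as the average number of purchases per buyer (depth). The purchase records are invented.

```python
from collections import Counter


def loyalty_map(purchases, product):
    """Return an assumed loyalty index and degree of loyalty for one product.

    purchases: iterable of (customer_id, product_name) tuples.
    """
    per_customer = Counter(c for c, p in purchases if p == product)
    if not per_customer:
        return {"loyalty_index": 0.0, "degree_of_loyalty": 0.0}
    buyers = len(per_customer)
    repeat_buyers = sum(1 for n in per_customer.values() if n > 1)
    return {
        # Spread: fraction of buyers who came back for the same product.
        "loyalty_index": repeat_buyers / buyers,
        # Depth: average number of times a buyer chose the product.
        "degree_of_loyalty": sum(per_customer.values()) / buyers,
    }


# Hypothetical transactions (illustrative values only).
purchases = [
    ("anna", "tea"), ("anna", "tea"), ("anna", "tea"),
    ("ben", "tea"),
    ("cara", "coffee"),
]
tea_loyalty = loyalty_map(purchases, "tea")
```

With these sample records, half of the tea buyers are repeat buyers and each tea buyer purchased it twice on average.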

Sentiment Profile: Sentiment analysis provides the sentiment profile for a particular product. Social media is the major source of these sentiments. Sentiments can have positive, neutral or negative polarity. Going forward, business keywords can be defined and grouped together by polarity. Keyword identification is usually domain specific, but these keywords help to explain not only the polarity but also the drivers or attributes behind it. For example, for a pay-TV company, scheduling, decoder and content could be the business keywords associated with sentiments. Big Data can help here by ingesting data feeds from social media.
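
To show how business keywords can be tied to polarity, here is a deliberately naive Python sketch for the pay-TV example. The positive and negative word lists and the sample posts are assumptions made up for illustration; a real system would use a proper sentiment model and a live social media feed.

```python
import re
from collections import Counter, defaultdict

# Assumed, tiny word lists; real lexicons are far larger and domain specific.
POSITIVE = {"great", "love", "good", "reliable"}
NEGATIVE = {"bad", "broken", "slow", "hate"}
BUSINESS_KEYWORDS = {"scheduling", "decoder", "content"}


def polarity(tokens):
    """Classify a post by counting positive and negative words."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_profile(posts):
    """Count polarities per business keyword mentioned in each post."""
    profile = defaultdict(Counter)
    for post in posts:
        tokens = re.findall(r"[a-z]+", post.lower())
        label = polarity(tokens)
        for keyword in BUSINESS_KEYWORDS.intersection(tokens):
            profile[keyword][label] += 1
    return {keyword: dict(counts) for keyword, counts in profile.items()}


# Hypothetical social media posts (illustrative only).
posts = [
    "The decoder is broken again",
    "Love the new content",
    "Scheduling changed today",
]
profile = sentiment_profile(posts)
```

The result links each keyword to the polarity of the posts that mention it, which is what lets the keyword act as a driver for that polarity.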

Affinity Profile: Affinity profiles define the level of affinity a particular product has with other products or product categories, not only those from the same organization. Big Data can help in a big way to identify such affinities, internal as well as external. For example, for a bank, affinity analysis can provide an ‘affinity index’ between credit cards and loans based on internal data as well as social media. Organizations need to define product categories to understand the affinities between them. These categories can be homogeneous as well as heterogeneous. The bank’s case above is a homogeneous example, whereas a personal loan and a car (or any automobile) form a heterogeneous affinity. This affinity exercise helps in cross-selling and could be considered a major growth opportunity.
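
An ‘affinity index’ can be computed in several ways. As one assumption, the Python sketch below uses the Jaccard similarity of the two customer sets: customers holding both products divided by customers holding either. The customer sets for the bank example are invented.

```python
def affinity_index(customers_a, customers_b):
    """Jaccard similarity between the customer sets of two products (0.0 to 1.0)."""
    a, b = set(customers_a), set(customers_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical holdings for a bank (illustrative values only).
holdings = {
    "credit_card": {"c1", "c2", "c3"},
    "loan": {"c2", "c3", "c4"},
}
card_loan_affinity = affinity_index(holdings["credit_card"], holdings["loan"])
```

The same function works for heterogeneous pairs, for example customers with a personal loan and customers who mention a car purchase on social media, as long as the two sources can be matched to the same customer identifiers.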

Service Profiles:
All organizations today are becoming customer-centric and are trying to provide better and better service propositions. Technology is helping them in a big way to reach more and more customers in the shortest possible time. This volume and speed are increasing day by day, and technology is doing a marvelous job of supporting all sorts of business requirements. System performance is a key factor in achieving this objective, and it has become necessary to measure it quantitatively and qualitatively. All these systems generate a huge amount of data apart from the transactions. The following important channels generate this ‘service data’.

All these channels are supported by technology and generate a huge amount of service data, called metadata (data about data). Servers and data centers are monitored to ensure that they achieve the required level of performance. Big Data analytics can reveal important business information such as fraud patterns, performance patterns and load analysis, which can help to adjust operational strategies. Examples of such data types and the different techniques are given below.

Big Data Themes

Organizations evolve through the data stages as a continuous journey. A Big Data approach is helpful at each stage because it is a continuous approach to achieving competitive advantage; it is not a ‘one-time’ solution. The following themes will help organizations to leverage Big Data. Each one belongs to at least one dimension of Big Data.

Transparency: Even today, some organizations hold a significant amount of information that is not digitized. It is stored in papers, files, reports, tapes etc. Digitizing it provides a huge opportunity to capture this information and make it available in the mainstream data flows. Some forms of information, such as processes and standards, are not even captured. All such ‘missing’ information is a missed opportunity for making growth strategies in the long run.

Generation: Advanced instrumentation and embedded technologies are making every possible physical ‘thing’ intelligent. Together these things form the ‘Internet of Things’ and are becoming more and more communicable and traceable. These internet objects interact with each other and with the outside world to generate lots and lots of data. Advanced sensors and embedded devices, such as heart rate monitors, touch sensors and advanced weather forecasting systems, are now able to gather unimaginable amounts of information in huge chunks. This is all new information that organizations never thought about two decades ago, and today it is making data really Big.

Surfacing: There is ‘Big Data’ available outside organizations, about those organizations, which is currently out of their control. This is an excellent opportunity to understand what is being said about them over ‘social media’. Also, a huge amount of unstructured data resides on servers in the form of logs, which can reveal service performance and anomalies.

Integration: Companies have started to ask this question: “How can data ‘in silos’ be integrated to identify any correlation between data sets and, eventually, between different business functions?” Huge transactional data and such loosely controlled data can be integrated by using advanced data architectures. Organizations need to create customer profiles with this integrated data in order to identify customers’ interactions with the external world and open up cross-selling opportunities.

Discovery: Huge datasets are worth examining. Advanced data mining algorithms and data science techniques can be used to scan through the data and identify patterns. Relevant business information can be discovered by studying these patterns and can then be used to take the necessary measures.

Consumption: Data becomes available today at a very fast pace. Every minute adds huge amounts of data to this ‘data-net’, so its accessibility becomes equally important. The CXOs of today’s corporations want such information on their screens as the data is being generated. Fast processing and dynamic reporting are important factors in data analytics today.


Big Data Dimensions



The big story in data analytics and information management in 2011-12 was Big Data, and in 2014 the trend is accelerating. It is about managing huge amounts of novel and varied sources of information. One can picture this as a huge data-net that is growing at its fastest-ever pace, leaving you clueless about which data to consider and which not. Data is now woven into every sector and function of the global economy, like other essential factors of production such as hard assets and human capital. This ‘digitized data’ has become the business driver for almost every business function in today’s world economy. The use of Big Data, large pools of data that can be brought together and analyzed to discern patterns and make better decisions, will become the basis of competition and growth for individual firms, enhancing productivity and creating significant value for the world economy by reducing waste and increasing the quality of products and services.

The three important dimensions of data, Volume, Variety and Velocity, are making it really ‘Big’. The use of Big Data is becoming a crucial way for companies to outperform their competitors. This makes data relevance an utmost requirement and brings in another dimension, ‘Veracity’, which eventually decides the accuracy of the data. In most industries, established competitors and new entrants alike will leverage data-driven strategies to innovate, transform and generate value. Big Data will help to create new growth opportunities and entirely new categories of business processes, such as designing data requirements and aggregating and analyzing industry data. These processes will be used to ingest large information flows that carry data about products and services, buyers and suppliers, and consumer preferences and intent.
Different data mean different things to different organizations. There are four data stages, which correspond to four analytics stages, as below.

1. Information Data Stage: This is the basic data stage, where data is recognized at the information level. The business value of the outcome at this stage is low, but it is the simplest form of data, providing only numbers, facts etc. This is the starting phase of every organization’s Big Data journey. This stage mainly addresses the ‘What’. Descriptive analysis tools are used at this stage to provide dashboards, reports etc.

2. Knowledge Data Stage: This data stage is a step ahead of information. At this stage, organizations try to identify relationships between different types of information. The diagnostic techniques at this stage use search-based or query-based dynamic reporting tools. It mainly focuses on the ‘Why’.

3. Intelligence Data Stage: This data stage tries to find patterns in the ‘knowledge’ that organizations have. Advanced algorithms and statistical techniques are used to mine the data and identify typical patterns. This is used in predictive analysis to forecast ‘what will happen’.

4. Wisdom Data Stage: This is the ultimate data stage, where organizations can use their information, knowledge and intelligence to become market leaders by setting industry best practices. Prescriptive analysis is used to formulate the strategy and achieve the business objectives.