Thursday, August 7, 2014

Profiling process for Big Data and its three dimensions

No Big Data project will realize its benefits unless it is driven by strategic goals, organizational support and key technical resources. No project should begin before its stakeholders and its measurable success criteria have been identified. Big Data projects should grow incrementally. The use cases and decisions that benefit from Big Data attributes should be catalogued. Because the results arrive in increments, each increment and its metric should be assessed by the control and governance mechanism to support decision making. The impact of each decision should be compared against the metric system to achieve the use-case-driven SMART (specific, measurable, actionable, realistic and time-bound) goal.

Big Data is making organizations look at their data differently. Data is no longer only structured data. Unstructured formats such as images and videos are also data, and they are helping businesses grow. Different perspectives give the data different meanings, so there is a need to create different data profiles in a realizable manner. There are three main profiles: Customer profiles, Product profiles and Service profiles. Together they give organizations a holistic view of their data and its meaning.

Customer Profiles:

Before the dawn of Big Data, the traditional transactional system was the only major source of data about customer interactions. In this data-centric approach, day-to-day transactions and payment information were the key data used to ‘understand’ the customer. Such interaction is just a slice of the customer’s entire buying behavior. Big Data gives an opportunity to profile customers using the 4I approach: Interaction, Information, Inference and Intelligence.

Interaction: Earlier, organizations looked for ways to interact with customers. This interaction gave an idea of what the customer thinks about the organization, its products and its services. Now business is observing a paradigm shift in the idea of interaction itself. These days, organizations want to understand how customers interact not only with them but with the outside world as well. There are two main interaction categories: Internal interaction and External interaction.

Internal Interaction:
All customer interactions triggered or controlled by the organization can be called Internal Interactions. Online portals, service desks, call centers and surveys are some examples of this type. Organizations are fully aware of these sources and have complete control over the data collected through them. The main data repositories are traditional relational databases and data warehouses. In addition, metadata such as log files is available from IT operations or service operations and can provide some information about service quality attributes.
External Interaction:
All customer interactions that are outside the control of the organization can be called External Interactions. Social networking, social CRM and e-commerce are some examples. Customers post publicly about multiple products, their experiences and their opinions, and in doing so indirectly advertise their buying behavior. One cannot really control this influx, but one can use the data for cross-selling, competitor analysis, revising market segmentation and so on. New channels such as mobile technologies, the Internet of Things and wearable technologies are fuelling this data outburst heavily. Organizations can capitalize on these external interactions by using Big Data analytics.

Information: Data gathered through internal and external customer interactions can be integrated to create an ‘information base’ for the organization. New data models and architectures should be used to make this ‘mainstream’ data. Traditional database structures should be altered to accommodate this newly defined customer-centric data. Earlier, a customer was represented as an entity with its transactions and product-related attributes. With this new external data, customers will have multi-dimensional profiles built from digitized attributes such as payment modes, devices used, locations travelled and social interaction index, which will help organizations understand consumers’ buying behavior much better. The customer can be profiled in the following different ways.

Inference: Different analytics tools can be used to identify the correlation between customer profiles. These profiles hold heterogeneous information, and statistical tools should be used to identify correlation and regression in such data sets. The findings help to draw inferences, and these inferences should be tested against the data over a pre-defined time period.

Intelligence: At this final stage, dependencies between customer profiles can be identified using data patterns and mining techniques. These profiles should be updated regularly to observe trends. Trend analysis helps to forecast customer interactions and to take the necessary actions in time. This helps organizations deliver an enhanced customer experience.

Product Profiles:

Product profiles capture product-related information and group it into relevant categories. Companies have started to look at product performance beyond turnover and market share, and Big Data is certainly going to help in profiling products to provide more insights. Product profiling may differ from domain to domain. The following four product profiles give an idea of the profiling activity and the information required. These profiles are always mutually inclusive, and the interaction between them helps to build better analytics systems.

Performance Profile: This profile tries to capture product performance through various indices such as market share, turnover and churn rate. It gives the product team and the sales team a clear idea of how the product performs in different geographies and customer groups. This is the most traditional form of profiling, and almost all organizations do it.

Loyalty Profile: Customer loyalty can be identified through this profile. In-house transactional data and data from social media can help to derive the loyalty profile. Each product can have a loyalty index and a degree of loyalty. The loyalty index can be defined as the extent to which a customer or group of customers tends to buy the same product repeatedly. The degree of loyalty can be defined as the number of times a customer or group of customers has chosen a particular product. Together, these two numbers give us the loyalty map of a product. The loyalty index explains the spread and the degree explains the depth. This loyalty map provides important insights into the product’s behavior in the market.
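
As a rough illustration of the loyalty map, the sketch below derives both numbers from a list of purchase records. The record layout, the function name `loyalty_map` and the exact formulas are assumptions made for this example only: the loyalty index is taken as the share of a product's buyers who bought it more than once (the spread), and the degree of loyalty as the average number of purchases per repeat buyer (the depth).

```python
from collections import Counter


def loyalty_map(purchases, product):
    """Return (loyalty_index, degree_of_loyalty) for one product.

    purchases: iterable of (customer_id, product_id) tuples (hypothetical layout).
    """
    # Count how many times each customer bought this product.
    counts = Counter(cust for cust, prod in purchases if prod == product)
    if not counts:
        return 0.0, 0.0
    repeat = [n for n in counts.values() if n > 1]
    # Spread (assumed definition): share of buyers who came back for the product.
    index = len(repeat) / len(counts)
    # Depth (assumed definition): average purchases among those repeat buyers.
    degree = sum(repeat) / len(repeat) if repeat else 0.0
    return index, degree


purchases = [
    ("c1", "card"), ("c1", "card"), ("c1", "card"),
    ("c2", "card"),
    ("c3", "card"), ("c3", "card"),
    ("c4", "loan"),
]
index, degree = loyalty_map(purchases, "card")
print(f"loyalty index {index:.2f}, degree of loyalty {degree:.1f}")
```

Here two of the three card buyers are repeat buyers, and between them they bought the card five times. Other definitions of spread and depth would work just as well; the point is that the two numbers are read together.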

Sentiment Profile: Sentiment analysis provides the sentiment profile for a particular product. Social media is the major source of these sentiments. Sentiments can have positive, neutral or negative polarity. Going forward, business keywords can be defined and grouped together to form a particular polarity. Keyword identification is usually domain specific, but these keywords help to explain not only the polarity but also the drivers or attributes behind that polarity. For example, for a pay-TV company, scheduling, decoder and content could be the business keywords associated with sentiments. Big Data can help here by ingesting the data feeds from social media.
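
The keyword idea can be sketched in a few lines. Everything below is illustrative: the tiny polarity word lists, the function name `sentiment_profile` and the sample posts are assumptions for this example, and a real system would use a proper domain-specific lexicon or a trained model instead of simple word matching.

```python
from collections import defaultdict

# Business keywords for a pay-TV company, taken from the example in the text.
BUSINESS_KEYWORDS = {"scheduling", "decoder", "content"}
# Tiny illustrative polarity lexicon (an assumption, not a real lexicon).
POSITIVE = {"great", "love", "good"}
NEGATIVE = {"bad", "broken", "poor"}


def sentiment_profile(posts):
    """Map each business keyword to counts of positive/neutral/negative posts."""
    profile = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0})
    for post in posts:
        words = set(post.lower().split())
        # Polarity of the whole post: positive hits minus negative hits.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        # Attribute that polarity to every business keyword the post mentions.
        for keyword in words & BUSINESS_KEYWORDS:
            profile[keyword][polarity] += 1
    return dict(profile)


posts = [
    "love the new content",
    "decoder is broken again",
    "scheduling changed this week",
    "poor content and bad scheduling",
]
profile = sentiment_profile(posts)
for keyword in sorted(profile):
    print(keyword, profile[keyword])
```

The output groups polarity by keyword, which is what lets the business see the driver behind a sentiment and not just the sentiment itself.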

Affinity Profile: Affinity profiles define the level of affinity a particular product has with other products or product categories, including those from other organizations. Big Data can help in a big way to identify such affinities, internal as well as external. For example, for a bank, affinity analysis can provide an ‘affinity index’ between credit cards and loans based on internal data as well as social media. Organizations need to define product categories to understand the affinities between them. These categories can be homogeneous as well as heterogeneous. The bank’s case above is a homogeneous example, whereas a personal loan and a car or any other automobile form a heterogeneous affinity. This affinity exercise helps in cross-selling and can be considered a major growth opportunity.
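
The text does not say how an affinity index is calculated, so the sketch below simply assumes one common choice: the Jaccard overlap between the sets of customers holding each product. The `holdings` layout, the function name `affinity_index` and the sample data are all hypothetical.

```python
def affinity_index(holdings, product_a, product_b):
    """Jaccard overlap between the customers of two products (0.0 to 1.0).

    holdings: dict mapping customer_id -> set of product names (hypothetical layout).
    """
    owners_a = {c for c, prods in holdings.items() if product_a in prods}
    owners_b = {c for c, prods in holdings.items() if product_b in prods}
    union = owners_a | owners_b
    if not union:
        return 0.0  # neither product has any customers
    return len(owners_a & owners_b) / len(union)


holdings = {
    "c1": {"credit card", "personal loan"},
    "c2": {"credit card"},
    "c3": {"credit card", "personal loan", "car"},
    "c4": {"personal loan", "car"},
}
homogeneous = affinity_index(holdings, "credit card", "personal loan")
heterogeneous = affinity_index(holdings, "personal loan", "car")
print(f"card/loan {homogeneous:.2f}, loan/car {heterogeneous:.2f}")
```

The same function covers the homogeneous case (two bank products) and the heterogeneous case (a loan and a car), provided the external data can be matched to the same customers.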

Service Profiles:
All organizations today are becoming customer centric and are trying to provide better and better service propositions. Technology is helping them in a big way to reach more and more customers in the shortest possible time. Volume and speed are increasing day by day, and technology is doing a marvelous job of supporting all sorts of business requirements. System performance is a key factor in achieving this objective, and it has become necessary to measure it quantitatively and qualitatively. All these systems generate a huge amount of data apart from the transactions. The following important channels generate this ‘service data’.

All these channels are supported by technology and generate a huge amount of service data, known as metadata (data about data). Servers and data centers are monitored to ensure that they achieve the required level of performance. Big Data analytics can reveal important business information such as fraud patterns, performance patterns and load analysis, which can help to adjust operational strategies. Examples of such data types and the different techniques are given below.

Big Data Themes

Organizations evolve through the data stages as a continuous journey. A Big Data approach is helpful at each stage because it is a continuous approach to achieving competitive advantage, not a ‘one time’ solution. The following themes will help organizations to leverage Big Data. Each one belongs to at least one dimension of Big Data.

Transparency: Even today, some organizations hold a significant amount of information that is not digitized. It is stored in papers, files, reports, tapes and so on. Some information, such as processes and standards, is not captured at all. Digitizing it provides a huge opportunity to capture this information and make it available in the mainstream data flows. All such ‘missing’ information is a missed opportunity for building growth strategies in the long run.

Generation: Advanced instrumentation and embedded technologies are making every possible physical ‘thing’ intelligent. These things form the ‘Internet of Things’ and are becoming more and more communicable and traceable. These internet objects interact with each other and with the outside world to generate lots and lots of data. Advanced sensors and embedded devices such as heart rate monitors, touch sensors and advanced weather forecasting systems can now gather unimaginable amounts of information. This is all new information that organizations never thought about two decades ago, and today it is what makes data really Big.

Surfacing: There is ‘Big Data’ about organizations available outside those organizations and currently out of their control. This is an excellent opportunity to understand what is being said about them on social media. There is also a huge amount of unstructured data residing on servers in the form of logs, which can reveal service performance and anomalies.

Integration: Companies have started to ask this question: “How can data ‘in silos’ be integrated to identify any correlation between the silos, and eventually between different business functions?” Huge transactional data and such loosely controlled data can be integrated by using advanced data architectures. Organizations need to create customer profiles from this integrated data in order to identify customers’ interactions with the external world and open up cross-selling opportunities.

Discovery: Huge datasets are worth examining. Advanced data mining algorithms and data science techniques can be used to scan through the data and identify patterns. Relevant business information can be discovered by studying these patterns and can be used to take the necessary measures.

Consumption: Data is becoming available at a very fast pace. Every minute adds a huge amount of data to this ‘data-net’. Hence its accessibility becomes equally important. The CXOs of today’s corporations want such information on their screens as the data is being generated. Fast processing and dynamic reporting are important factors in data analytics today.


Big Data Dimensions



The big story in data analytics and information management in 2011-12 was Big Data, and in 2014 the trend is accelerating. It is about managing huge amounts of novel and varied sources of information. One can perceive this as a huge data-net which is growing at its fastest ever pace, leaving you clueless about which data to consider and which to ignore. Data is now woven into every sector and function of the global economy, like other essential factors of production such as hard assets and human capital. This ‘digitized data’ has become the business driver for almost every business function in today’s world economy. The use of Big Data, large pools of data that can be brought together and analyzed to discern patterns and make better decisions, will become the basis of competition and growth for individual firms, enhancing productivity and creating significant value for the world economy by reducing waste and increasing the quality of products and services.

The three important dimensions of data, Volume, Variety and Velocity, are making it really ‘Big’. The use of Big Data is becoming a crucial way for companies to outperform their competitors. This makes data relevancy an utmost requirement, and it brings in another dimension, ‘Veracity’, which ultimately decides the accuracy of the data. In most industries, established competitors and new entrants alike will leverage data-driven strategies to innovate, transform and generate value. Big Data will help to create new growth opportunities and entirely new categories of business processes, such as designing data requirements and aggregating and analyzing industry data. These processes will be used to ingest the large information flows that carry data about products and services, buyers and suppliers, and consumer preferences and intent.
Different data means different things to different organizations. There are four data stages, which correspond to four analytics stages, as below.

1. Information Data Stage: This is the basic data stage, where data is recognized at the information level. The business value of the outcome at this stage is limited, but it is the simplest form of data, providing only numbers, facts and so on. It is the starting phase of every organization’s Big Data journey and mainly addresses the ‘What’ part. Descriptive analysis tools are used in this stage to provide dashboards, reports etc.

2. Knowledge Data Stage: This data stage is a step ahead of information. In this stage organizations try to identify relationships between different types of information. The diagnostic techniques in this stage use search-based or query-based dynamic reporting tools. It mainly focuses on the ‘Why’ part.

3. Intelligence Data Stage: This data stage tries to find patterns in the ‘knowledge’ that organizations have. Advanced algorithms and statistical techniques are used to mine the data and identify typical patterns. This is used in predictive analysis to forecast ‘what will happen’.

4. Wisdom Data Stage: This is the ultimate data stage, where organizations can make use of their information, knowledge and intelligence to become market leaders by setting industry best practices. Prescriptive analysis is used to formulate the strategy and achieve the business objectives.


Tuesday, November 19, 2013

Pseudo-societies and virtual relationships

I have always wondered about the paradoxical illusion of 'being social' that networking sites throw upon us. Gone are the days when people were social 'in person'. Today is an age of virtual reality, and many social networking companies are making us realize it. Today profiles make friendships and continue virtual relationships by being connected 'digitally'. Strong bonds of digital relationships are being manufactured through likes, comments and sharing things over the internet. We want to get noticed, and this virtual, pseudo-social nature of networking is the fastest way to do it. Are we really being social? My view is a big NO.

Today's world is a fast-paced one. Here no one is allowed to stop or rest for a while. We are incessantly busy doing something or the other, and that is how one becomes a successful person. Most of us are performing an act as per the script written for us. Some do it as required, some do it exceptionally well. The day starts with yesterday's work and ends with thoughts about tomorrow's. Where is the time for ourselves, our friends, our families, our passions? The time slots for all these have shrunk, and one peculiar thing I have noticed is that our definition of family is shrinking too. We may have more than 500 friends on Facebook, but our family ends at me, my wife/husband, my kids and that's it. True relationships ask for commitment, compromise, understanding and sacrifice. These are all reciprocal virtues: if you possess them, you experience them in return. Are we running away from these? Are we being too selfish?
Unfortunately, the answer is yes. We don't need these virtues to make friends over the internet. Your new friends are just one click away. And that is the easiest way to make friends and relationships, isn't it?

There is one more important aspect to it: the need for association, the need to be noticed, the need to be appreciated. With such granular families and such a minuscule sphere of influence, we need somebody, some forum, to recognize us so that we feel associated with some form of society. We want people to know where we went, what we cooked and what good things are happening in our lives, so that they extend their acknowledgement and wishes to us. This gives us oxygen to breathe in tomorrow's polluted environments. And why not? These are the only good things happening, and we want to share them with somebody. That is how the number of likes on your recipe increases, and that is how you come to like some good snaps of your friend on Facebook.

One thing is true: the likes of Facebook have kept the sense of relationships alive over distances. You can get any sort of update (either natural or manufactured) about anybody you wish over the internet. The internet has bridged the distances between continents, but it has reduced our ability to make human bonds. This is an era of personalities. The pseudo-social networks are about personalities, but we still need human connection to get to know the true characters.

Monday, November 18, 2013

Festival SRT200

After the SRT200 festival, all the brand machines will start gearing up for a new festival which nobody knows about at this moment and for which the GOD is yet to be identified. India is such a beautiful and holy country that I am sure the next reincarnation of GOD is on its way. It's just a matter of revelation by the media, politics and corporate offices. SRT has worked his magic, which has enthralled us for over two decades, and now 'they' have sensed that it's time to retire this GOD. It could also be the intuitive feeling of the GOD himself that he is falling to the position of a mere human, which is quite reasonable. But the fact is, SRT has finally ended his international career. The speculation from critics, which over the past 3-4 years had grown into a fierce, fire-spitting, dragon-like creature, has become a reality, and GOD has granted them their wish in a magnificent way.

Comparing Sachin with GOD is not just about his cricketing abilities. He is an outstanding sportsperson, no doubt about it. But does that alone make him GOD? Absolutely not. If you think about it, far deeper and more intriguing notions are associated with this GOD phenomenon, specifically in the Indian context. In the country of festivals and GODs, people are always looking for opportunities to celebrate, with a worthy reason associated with the event. A savior, an endower, is always raised above the human level and treated as an angel or GOD who will answer prayers and fulfill wishes. Sachin has done so many remarkable things to serve our nation in the world of cricket that such reverence is called for. He acted as the winning king of a battlefield of sport that is agnostic of religion, language and caste. He has always been looked up to as an idol, on and off the field. He established his legacy on the field by talking fiercely with his bat only, where many other 'humanlike' cricketers of his caliber indulged in an abusive game. He has never hurt anybody off the field either. I feel there is a strong connection between the growing middle and upper-middle economic classes and the artistic, middle-class roots of Sachin's family background. His GOD-like behavior is revered by the masses because the conduct that SRT maintained throughout his career is dear to the hearts of millions. For centuries our parents have taught us to behave like this, and Sachin managed to personify the exact lessons of 'middle-class ideology'. He was neither involved in any big controversy nor did he try to end any, such as match fixing. He was never known as an abusive figure on the field; probably he could not use bad words, and so he let the bat do the business. This was not crafted or scripted but was an outcome of the upbringing that he had in 'Sahitya Sahavas'.
Innocence and simplicity are the two virtues that he has kept with him all along, and these two have played a major role in earning an individual GOD-like stature and veneration. As a testimony, the speech that he gave on his retirement day was a class act which touched millions of souls.


There is an observed tendency among Indians to look for a GOD-like figure who can be looked up to in hard times. We need somebody as a pacemaker. We need somebody who will act as a savior for us. When many things go wrong, we tend to find one such Sachin, one such Amitabh, one such Bhimsen Joshi who acts as an analgesic for the bruises of our daily life. Does that mean the roots of this GOD-making tendency lie in our inability to fight our problems? Does that mean we always need a 'Maseha' to get us out of trouble? GODs and festivals are always good. They give us strength and the ability to believe in ourselves. But we need not submit ourselves to them.


I am sure there will be a next 'GOD in the making' somewhere in some industry, with scripts by the media, politics and corporates, and those will be executed flawlessly with the same euphoria that we witnessed at the SRT200 festival.

Wednesday, July 17, 2013

5 things to avoid while being Agile!

Ron had been quite disturbed ever since he got his music system. “It’s useless to me!” he yelled furiously at the salesman over the phone. “How can you do that? What can I do with this system without the power cable and the user manual? It’s of no value to me. This box is useless without its accessories.” The salesman kept his calm and replied very humbly, “Sir, don’t worry. We know that. We will deliver the power cable after two weeks and the user manual after three weeks.” Ron was shocked to hear that. So it was not a mistake but a planned delivery. He had no option now but to wait until his requirement was completely done. One thing is clear: he will not buy any product from this company in future. Watch out for a few words like ‘value’, ‘useless’ and ‘completely done’. Is it about money, commitment, quality or requirement? No. The company failed to understand the ‘value’ that the product offers to the customer. They understood the requirement and they delivered it on time, but it is still useless. It is not completely done. The ‘Business Value Delivered’ was zero. I know this is a hypothetical case, but just imagine what will happen if things like this happen in business IT!

Agile methodology focuses on ‘Business Value Delivered’. Instead of calculating and sizing the requirement in terms of time and money, agile makes you think in terms of the business value that we are going to offer the customer by delivering the product. The customer is more interested in ‘value delivered’ than in detailed project plans. In most software project management assignments we try to plan and showcase the value invested in terms of time and money, and nobody really thinks about the ‘business value’ delivered. One thing IT needs to understand is that ‘software projects are run for business and not for IT’. The WBS (Work Breakdown Structure), schedule and variances speak about the effectiveness of the project management but not about business value. These things are important, but they are not the ‘only’ important things. Agile methodology helps us realize the business value offered and the business value expected. The smaller the gap between the two, the more successful the delivery. Remember, the software delivery process is not going to change. The “Analysis-Design-Development-Testing-Support/Maintenance” cycle will remain as it is till eternity. Agile makes a difference by packaging it together as a value that the business finds useful. There is no point in conducting agile ceremonies just for the sake of it. Most of the time managers tend to work the traditional way under an agile framework, which may in fact hamper the ‘agility’ of the process.
I have listed 5 such phenomena which one should try to avoid while doing agile project management. I call them the 5 demons. They threaten to become part of your framework, structure and process, and you might not even realize that they are growing bigger and bigger until they eat your system. They mark their presence in each part of the scrum life cycle, from pre-plan, plan and execute to check and improve, and may leave the deliverables in an unhealthy state. You don’t want them, right?

1. Break and Make
2. Unplanned Sprint Backlog
3. Zero Assessment
4. Partial Completion
5. Hours Calibration

Break and Make

As we all know, agile works on user stories and ‘features’, which essentially are the requirements of the product. While drafting user stories one must ask this question: ‘Does this provide any value to the customer?’ Adding a report to a dashboard is definitely of some business value to the customer, but preparing a test case document is not going to provide any kind of business value. It is one of the tasks that a story may have. Most of the time, we end up breaking the user story into its tasks and making each individual task a separate user story for the sprint. This creates a picture of a very busy product backlog with a very effective sprint velocity. But if we look closely, these are tasks with story points, not stories at all. It is like cutting an apple into 5 pieces and booking them as 5 apples, when together they still make just one apple. This produces a false sprint backlog and a false burn-down of the product backlog, and it takes your project away from agile. These tasks are important, but they are essentially tasks and not requirements. We should not assign story points to tasks; instead, an estimate in hours should be made for them. Agile does not stop at stories. We need an estimated timeline for each task of a story. A detailed FBS (Feature Breakdown Structure) should be prepared with the tasks required for each feature (story). Don’t get confused, and don’t confuse your product owner, between stories and tasks.

Unplanned Sprint Backlog

When a sprint is started, the scrum master should be aware of the team’s capacity, its sprint velocity and the items in the product backlog. The product backlog is the source of the stories for a particular sprint. The team should be able to pick stories from the product backlog into its sprint backlog and plan them for the sprint. If the sprint velocity for a team is 75 story points per sprint (three weeks), then the sprint should be planned to complete user stories worth 75 story points. If this is not planned, we cannot measure the effectiveness of the sprint in terms of points planned vs. points delivered. There can be challenges such as intermediate injections, impediments, re-planning and reprioritization, but the comparison will really tell you how your team is getting ‘burnt’. These causes should be highlighted with a proper RCA (Root Cause Analysis), and the product owner should be made aware of them. This will help you plan future sprints. If planning is not done and stories are taken into the sprint backlog as and when they are injected, we cannot baseline the team velocity at any point in time and we will not be able to provide any sort of forecast. This will not help the project and the team to be ‘agile’. Though agile does not focus on metrics such as schedule variance and effort variance, the team needs to make sure that ‘we deliver what we committed, as planned’.
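
To make the planned vs. delivered comparison concrete, here is a minimal sketch using the 75-point velocity from the example. The function names, the backlog contents and the simple rule of skipping any story that no longer fits are assumptions for illustration, not a prescribed scrum practice.

```python
def plan_sprint(backlog, velocity):
    """Pick stories in priority order until the velocity budget is used up.

    backlog: list of (story_name, story_points), highest priority first.
    Stories that do not fit the remaining budget are skipped (assumed rule).
    """
    planned, remaining = [], velocity
    for name, points in backlog:
        if points <= remaining:
            planned.append((name, points))
            remaining -= points
    return planned


def sprint_effectiveness(planned, delivered_names):
    """Return (planned_points, delivered_points, delivered/planned ratio)."""
    planned_points = sum(points for _, points in planned)
    delivered_points = sum(points for name, points in planned if name in delivered_names)
    ratio = delivered_points / planned_points if planned_points else 0.0
    return planned_points, delivered_points, ratio


# Hypothetical product backlog, highest priority first.
backlog = [("report", 30), ("login", 25), ("export", 40), ("search", 20)]
planned = plan_sprint(backlog, velocity=75)
result = sprint_effectiveness(planned, delivered_names={"report", "search"})
print(planned)
print("planned %d, delivered %d, ratio %.2f" % result)
```

With a planned baseline like this, a shortfall in delivered points becomes something the team can take into an RCA; without it there is no number to compare against.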

Zero Assessment

Story points are not really related to hours. When we estimate story points, we consider three things: uncertainty, complexity and effort. All three are subject to change as we progress through the sprint. It is always advisable to spend some time re-estimating the story points, which is the basic step of assessment. This helps the team to give realistic estimates, which in turn helps to stabilize the team velocity. If a story is estimated at 80 points because of a lack of clarity, the re-planning/assessment should be done once the scope becomes certain and known to the team. If it is continued as estimated, we do not do justice to the business value of an 80-point story and we produce a false impression of value delivery. Assessments help the team to do a causal analysis of the variance by which it misses or beats the sprint velocity. Each ‘miss’ or ‘beat’ calls for a velocity revision. A false ‘beat’ can lead to future misses, and any ‘miss’ will not help the team to be a continuously improving team. Assessment can be done at the time of the retrospective, where we can assess our initial estimates against the work the team has done. This will always help in future sprints, making the team agile and continuously improving.

Partial Completion 

Imagine a situation where you end up in a meeting without any evidence of the work that you did, the hours that you spent and the output that you produced. I hate such situations. In agile, if you don’t prepare your stories properly, you are going to end up facing one of them. A story should be a ‘unit requirement’, or a feature which cannot be broken down further. Each story should ideally produce some outcome which has a value worth its estimation (story points). If stories are not drafted properly and the team starts working on them, it creates a problem when we have to remove, change or postpone the requirement. A scrum master may then end up doing one of the following: closing partial points (e.g. 5 out of a 13-point story) and removing, changing or postponing the requirement; closing the story without addressing the change; removing it and leaving the work done unnoticed; or creating a story for the task completion and accommodating that in the sprint backlog. Any of these produces a false impression of the velocity, and the product owner and scrum master will never get to know the team’s true capacity and velocity. Such forced finishes lead to velocity illusions from sprint to sprint.

Hours Calibration

The story point estimate of a user story is essentially a unit of measurement of the business value that the team delivers by implementing the story. The hours taken to complete it are one of the factors that influence the business value, but not the only one. This is where we sometimes go totally wrong: we map story points to hours and start calibrating them on an hours scale. If we want to map them, there is no point in having story points at all. Business requirements differ in nature, and fulfilling them demands different sets of skills. The teams who implement these requirements will change and will not stay the same throughout. The people who form the teams have different skill levels, and again these do not stay the same. In such a changing environment, an hours estimate is not good enough to estimate the business value of a particular requirement and its implementation. To take a real-life example, lifting a 25 kg box to a height of 100 meters and cutting a wooden log are two totally different things. The time required for each depends on the tools available and on the skill level and experience of the person who actually does it. The value delivered by each outcome is also significant in its own way. We just cannot compare these on a time scale alone. Hence story estimation should not be done by comparing one requirement to another in terms of hours. Each story should be treated as a separate story, and this can be done effectively if we understand the business value that we deliver.

If you succeed in keeping these demons at bay, I am sure your voyage will be agile enough to carry your passengers safely to the destination, even through environmental dynamics and disturbances!

Remainder

Have you ever thought about remainders? Nothing great in it, but I thought of sharing it.
It's about negative number division.
What is 14/3? It's 4.667, with 4 as quotient and 2 as remainder. Hence 14/3 = 4 plus 2/3.
Now what is -14/3 or 14/-3? It should be -4.667, but there are some interesting facts.
What could be the quotient for (-14/3) ... -4? And the remainder ... -2?
I don't think so.

If we go back to the basics: when a number is not completely divisible by the divisor, we take the greatest integer that is completely divisible by the divisor but less than the dividend, and subtract it from the dividend. The result of the subtraction is nothing but the remainder. In the above example, if the quotient is -4, then -12 should be less than -14. But it's NOT. Also, the remainder of a division cannot be negative. It is the part which 'remains' (a negative cannot remain....huhhhh..!!)

So... continuing with the same example, what if the quotient is -5..? Let's see.
3*-5 = -15, which is less than -14. And the remainder is 1...Hurrrrrayyyyyy...!
So -14/3 = -5 plus 1/3, which comes to -4.667
Also, for 14/(-3), the quotient is -4 and the remainder is 2
Hence 14/(-3) = -4 remainder 2 = -4 plus 2/(-3) = -4.667
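
The rule above, where the remainder is never negative, is commonly known as Euclidean division. Here is a small Python sketch of it. The helper name `euclid_divmod` is made up for this example. Note that Python's built-in `divmod` floors the quotient, so the remainder takes the sign of the divisor and `divmod(14, -3)` returns `(-5, -1)`; the helper adjusts that case.

```python
def euclid_divmod(dividend, divisor):
    """Return (quotient, remainder) with 0 <= remainder < abs(divisor)."""
    quotient, remainder = divmod(dividend, divisor)  # Python floors the quotient
    if remainder < 0:  # only possible when the divisor is negative
        quotient += 1
        remainder -= divisor
    return quotient, remainder


for a, b in [(14, 3), (-14, 3), (14, -3)]:
    q, r = euclid_divmod(a, b)
    # The identity dividend = divisor * quotient + remainder always holds.
    assert a == b * q + r
    print(f"{a} / {b}: quotient {q}, remainder {r}")
```

This prints quotient 4 remainder 2, quotient -5 remainder 1 and quotient -4 remainder 2, matching the three cases worked out by hand.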

This is called Aditya's theory of remainder..... :)))))) (Good Joke)