Abstract
When new approaches are applied to old problems, it is not surprising that mistakes are made. A business should understand how big data analysis works before embarking on such a project, which has been described as an art as much as a technical and scientific discipline. Data mining cannot be automated; it requires an analyst with good critical thinking skills who can take the customer’s perspective in order to tease out insights that are useful to marketing, formulate better questions and target the appropriate segment. Three mistakes that beginners often make are assuming that aggregate analysis of the data will provide the needed information, using disorganized and unstructured data, and ignoring suspicious findings in the results.

Data mining is a major trend in marketing research, but misunderstandings and mistakes remain common as businesses venture into this area. Three mistakes that beginners often make are assuming that aggregate analysis of the data will provide the needed information, using disorganized and unstructured data, and ignoring suspicious findings in the results. By understanding these mistakes, particularly when first entering into data mining research, a business can avoid problems and use big data to grow revenue and profitability by concentrating on its target customer segments.


The mistake: assuming aggregate analysis of data will be sufficient
As described by Nisbet, Miner and Elder (2009), statistical analysis alone is insufficient for drawing insights from big data. Aggregate data can summarize the common characteristics of current or potential customers, but summaries alone cannot provide the predictive value needed to capitalize fully on big data.

One example of the use of big data that was reported in the mainstream media is the case of a pregnant teen. The case made headlines because the retailer Target was able to deduce that she was pregnant before her father knew (Hill, 2012). Such insights do not come from repeatedly analyzing summary data, but from careful analysis of detailed records drawn from multiple sources. Summary statistics by themselves will not reveal which customers or potential customers are expecting a baby. By analyzing the behavior of customers who have also registered items in a baby shower registry, however, it is possible to begin building a profile of behaviors that predict that a customer is expecting a child. In the Target example, purchases of certain items, such as large cotton swabs and face cloths, were used as indicators (Hill, 2012).

Looking at the details of such customers yields more insight for building profiles of this market segment than summaries can. The details also lead to better questions that big data can answer, covering not just how to identify people who belong to a target segment but also how to gain their attention.
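The difference between a summary and a profile can be illustrated with a small sketch. The example below is an assumption-laden illustration, not the method Target used: the function `item_lift`, the customer identifiers and the baskets are all hypothetical. It compares how often each item is bought by customers known to have a baby registry with how often it is bought by everyone else. Items with a ratio well above 1 are candidates for a predictive profile.

```python
from collections import Counter


def item_lift(baskets, registry_ids):
    """Ratio of each item's purchase rate among registry customers
    to its purchase rate among all other customers."""
    in_segment = Counter()
    out_segment = Counter()
    n_in = n_out = 0
    for customer_id, items in baskets.items():
        # Count each item once per customer, however often it was bought.
        if customer_id in registry_ids:
            n_in += 1
            in_segment.update(set(items))
        else:
            n_out += 1
            out_segment.update(set(items))

    lift = {}
    for item in in_segment:
        rate_in = in_segment[item] / n_in
        rate_out = out_segment[item] / n_out if n_out else 0.0
        # Items never bought outside the segment get infinite lift.
        lift[item] = rate_in / rate_out if rate_out else float("inf")
    return lift


# Hypothetical purchase histories keyed by customer id.
baskets = {
    "c1": ["face cloths", "bread"],
    "c2": ["face cloths", "cotton swabs"],
    "c3": ["bread"],
    "c4": ["bread", "face cloths"],
    "c5": ["bread"],
    "c6": ["milk"],
}
registry_ids = {"c1", "c2"}  # customers with a baby registry (assumed known)

lift = item_lift(baskets, registry_ids)
for item, value in sorted(lift.items(), key=lambda kv: -kv[1]):
    print(item, value)
```

An overall average, such as the share of all customers who buy bread, would hide the fact that face cloths are bought far more often inside the registry segment than outside it. With real data, such ratios would need enough customers in each group to be trustworthy.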

The mistake: disorganized and unstructured data
When the data used for mining are sloppy and disorganized, the findings will not be as useful as when an organized and structured approach is taken by design. This requires deciding in advance what data is being sought and ensuring that there is an automated means of capturing that information. When data is haphazard, incomplete or based on default assumptions, it will not truly reflect the customers, and this will lead to faulty customer profiles for a particular segment.

Sloppy and unstructured data can also mean using the wrong data sources or entries to make determinations. For example, using the current customer email list to identify new potential sources of customers is a mistake. The list is structured, but it is not organized around the question that needs an answer, because it describes only people who are already customers. A better way to approach the question would be to use the IP addresses of site visitors who have not made purchases to determine where those individuals are located. Perhaps it turns out that there is significant traffic from a country to which the business does not currently ship. Such information would allow the business to decide whether to add shipping support, given the indications of strong interest from potential customers in that country.
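A minimal sketch of the shipping example follows. It assumes that each visit has already been resolved from an IP address to a country code by some upstream geolocation step, which is not shown. The function `unserved_demand`, the record fields and the sample data are hypothetical.

```python
from collections import Counter


def unserved_demand(visits, shipping_countries, min_visits=2):
    """Count non-purchasing visits per country, keeping only countries
    that are not currently supported for shipping."""
    counts = Counter(
        visit["country"]
        for visit in visits
        if not visit["purchased"] and visit["country"] not in shipping_countries
    )
    # Most-visited unsupported countries first; drop those below the threshold.
    return [(country, n) for country, n in counts.most_common() if n >= min_visits]


# Hypothetical visit log; "country" is assumed to come from IP geolocation.
visits = [
    {"country": "US", "purchased": True},
    {"country": "BR", "purchased": False},
    {"country": "BR", "purchased": False},
    {"country": "BR", "purchased": False},
    {"country": "US", "purchased": False},
    {"country": "NZ", "purchased": False},
]

result = unserved_demand(visits, shipping_countries={"US", "CA"})
print(result)
```

The data here is organized around the question being asked: visitors who did not buy, grouped by where they are. The `min_visits` threshold is an arbitrary illustration; a real cutoff would depend on traffic volume.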

The mistake: ignoring suspicious findings
Unexpected or suspicious findings are interesting for two reasons. The first is that they may hold the key to new insights into a previously unrecognized customer segment. The second is that such findings may provide insight regarding a systematic error or problem in the data capture or collection which should be corrected (Nisbet et al., 2009).

Suppose a survey shows that a considerable proportion of respondents always choose the first possible response on a series of multiple choice questions. If that data is used without first investigating why this might be, an entire marketing campaign might be designed around a customer segment that does not in fact exist. It may be that the survey defaults to the first response for each question, so that when respondents abandon the survey or skip a question, the first option is falsely recorded as their answer. In that case the whole data set has been contaminated by a systematic problem.
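A check for this pattern can be sketched in a few lines. The example assumes that each respondent's answers are stored as a list of option indexes, with 0 meaning the first option. The function `first_option_share` and the sample responses are hypothetical.

```python
def first_option_share(responses, first_option=0):
    """Fraction of respondents who chose the first option on every question."""
    if not responses:
        return 0.0
    flagged = sum(
        1
        for answers in responses
        if answers and all(a == first_option for a in answers)
    )
    return flagged / len(responses)


# Hypothetical survey results: one list of option indexes per respondent.
responses = [
    [0, 0, 0, 0],
    [1, 2, 0, 3],
    [0, 0, 0, 0],
    [2, 1, 1, 0],
]

share = first_option_share(responses)
print(share)
```

A high share does not by itself prove that the data is contaminated. It signals that the survey's default settings and completion logs should be examined before the responses are used to define a segment.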

Conclusion
When new approaches are applied to old problems, it is not surprising that mistakes are made. A business should understand how big data analysis works before embarking on such a project, which has been described as an art as much as a technical and scientific discipline. Data mining cannot be automated; it requires an analyst with good critical thinking skills who can take the customer’s perspective in order to tease out insights that are useful to marketing, formulate better questions and target the appropriate segment. By keeping in mind the three mistakes described, businesses that are testing and developing their data mining capacity can avoid common blunders on the path to successfully exploiting the power of big data.

    References
  • Hill, K. (2012). How Target figured out a teen girl was pregnant before her father did. Forbes, February 16, 2012.
  • Nisbet, R., Miner, G., & Elder IV, J. (2009). Handbook of statistical analysis and data mining applications. Academic Press.