MINING COMPETITORS FROM LARGE UNSTRUCTURED DATASETS
ABSTRACT
In any competitive business, success is based on the ability to make an item more appealing to customers than the competition. A number of questions arise in the context of this task: how do we formalize and quantify the competitiveness between two items? Who are the main competitors of a given item? What are the features of an item that most affect its competitiveness? Despite the impact and relevance of this problem to many domains, only a limited amount of work has been devoted toward an effective solution. In this paper, we present a formal definition of the competitiveness between two items, based on the market segments that they can both cover. Our evaluation of competitiveness utilizes customer reviews, an abundant source of information that is available in a wide range of domains. We present efficient methods for evaluating competitiveness in large review datasets and address the natural problem of finding the top-k competitors of a given item. Finally, we evaluate the quality of our results and the scalability of our approach using multiple datasets from different domains.
EXISTING SYSTEM:
This paper builds on and significantly extends our preliminary work on the evaluation of competitiveness. To the best of our knowledge, our work is the first to address the evaluation of competitiveness via the analysis of large unstructured datasets, without the need for direct comparative evidence. Nonetheless, our work has ties to previous work from various domains.
Managerial Competitor Identification: The management literature is rich with works that focus on how managers can manually identify competitors. Some of these works model competitor identification as a mental categorization process in which managers develop mental representations of competitors and use them to classify candidate firms. Other manual categorization methods are based on market- and resource-based similarities between a firm and candidate competitors. Finally, managerial competitor identification has also been presented as a sensemaking process in which competitors are identified based on their potential to threaten an organization's identity.
PROPOSED SYSTEM:
This example illustrates the ideal scenario, in which we have access to the complete set of customers in a given market, as well as to specific market segments and their requirements. In practice, however, such information is not available. In order to overcome this, we describe a method for computing all the segments in a given market based on mining large review datasets. This method allows us to operationalize our definition of competitiveness and address the problem of finding the top-k competitors of an item in any given market. As we show in our work, this problem presents significant computational challenges, especially in the presence of large datasets with hundreds or thousands of items, such as those that are often found in mainstream domains. We address these challenges via a highly scalable framework for top-k computation, including an efficient evaluation algorithm and an appropriate index.
Our work makes the following contributions:
• A formal definition of the competitiveness between two items, based on their appeal to the various customer segments in their market. Our approach overcomes the reliance of previous work on scarce comparative evidence mined from text.
• A formal methodology for the identification of the different types of customers in a given market, as well as for the estimation of the percentage of customers that belong to each type.
• A highly scalable framework for finding the top-k competitors of a given item in very large datasets.
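The ideas above (segments mined from reviews, competitiveness based on the segments two items can both cover, and top-k retrieval) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions and not the evaluation algorithm or index of the proposed framework: features are assumed to be binary, each review is assumed to have already been reduced to the set of features it mentions, and each distinct feature set is treated as one customer type. The function names (estimate_segments, competitiveness, top_k_competitors) and the toy data are hypothetical.

```python
from collections import Counter
import heapq


def estimate_segments(reviews):
    """Assumption: each review is a set of mentioned features, and each
    distinct feature set stands for one customer type (segment).
    Returns {frozenset(features): share of reviews of that type}."""
    counts = Counter(frozenset(r) for r in reviews if r)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {segment: n / total for segment, n in counts.items()}


def competitiveness(features_a, features_b, segments):
    """Sum the shares of all segments that BOTH items can cover.
    Assumption: an item covers a segment if it offers every feature
    that the segment requires (binary features only)."""
    return sum(
        share
        for segment, share in segments.items()
        if segment <= features_a and segment <= features_b
    )


def top_k_competitors(target, items, segments, k):
    """Brute-force top-k: score every other item against the target.
    A scalable framework would prune candidates instead of scanning all."""
    scored = (
        (competitiveness(items[target], features, segments), name)
        for name, features in items.items()
        if name != target
    )
    return heapq.nlargest(k, scored)


# Hypothetical toy data: four reviews and four hotels.
reviews = [{"wifi"}, {"wifi", "pool"}, {"wifi"}, {"parking"}]
items = {
    "A": {"wifi", "pool"},
    "B": {"wifi", "pool", "parking"},
    "C": {"wifi"},
    "D": {"parking"},
}

segments = estimate_segments(reviews)
result = top_k_competitors("A", items, segments, k=2)
print(result)  # [(0.75, 'B'), (0.5, 'C')]
```

In this toy market, item B competes with A for the "wifi" and "wifi + pool" customers (75% of reviews), item C only for the "wifi" customers (50%), and item D for none. The brute-force scan is linear in the number of items per query, which is the cost that an efficient evaluation algorithm and index are meant to avoid.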
CONCLUSION
We presented a formal definition of competitiveness between two items, which we validated both quantitatively and qualitatively. Our formalization is applicable across domains, overcoming the shortcomings of previous approaches. We consider a number of factors that have been largely overlooked in the past, such as the position of the items in the multi-dimensional feature space and the preferences and opinions of the users. Our work introduces an end-to-end methodology for mining such information from large datasets of customer reviews. Based on our competitiveness definition, we addressed the computationally challenging problem of finding the top-k competitors of a given item. The proposed framework is efficient and applicable to domains with very large populations of items. The efficiency of our methodology was verified via an experimental evaluation on real datasets from different domains. Our experiments also revealed that only a small number of reviews is sufficient to confidently estimate the different types of users in a given market, as well as the number of users that belong to each type.