MINING COMPETITORS FROM LARGE UNSTRUCTURED DATASETS

ABSTRACT

In any competitive business, success is based on the ability to make an item more appealing to customers than the competition. A number of questions arise in the context of this task: how do we formalize and quantify the competitiveness between two items? Who are the main competitors of a given item? What are the features of an item that most affect its competitiveness? Despite the impact and relevance of this problem to many domains, only a limited amount of work has been devoted toward an effective solution. In this paper, we present a formal definition of the competitiveness between two items, based on the market segments that they can both cover. Our evaluation of competitiveness utilizes customer reviews, an abundant source of information that is available in a wide range of domains. We present efficient methods for evaluating competitiveness in large review datasets and address the natural problem of finding the top-k competitors of a given item. Finally, we evaluate the quality of our results and the scalability of our approach using multiple datasets from different domains.
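To make the idea of competitiveness "based on the market segments that they can both cover" concrete, here is a minimal sketch. It assumes binary features, represents an item as the set of features it offers and a segment as the set of features its customers require, and weights each segment by its share of the customer base. These representations, the function names, and the hotel data are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, FrozenSet

# Assumed representation: an item is the set of features it offers,
# and a market segment is the set of features its customers require.
Item = FrozenSet[str]
Segment = FrozenSet[str]


def covers(item: Item, segment: Segment) -> bool:
    """An item covers a segment if it offers every required feature."""
    return segment <= item


def competitiveness(a: Item, b: Item, segments: Dict[Segment, float]) -> float:
    """Total share of customers whose segment both items can cover."""
    return sum(share for seg, share in segments.items()
               if covers(a, seg) and covers(b, seg))


# Hypothetical market: three segments with their customer shares.
segments = {
    frozenset({"wifi"}): 0.5,
    frozenset({"wifi", "pool"}): 0.25,
    frozenset({"pool", "gym"}): 0.25,
}
hotel_a = frozenset({"wifi", "pool"})
hotel_b = frozenset({"wifi", "pool", "gym"})
hotel_c = frozenset({"wifi"})

print(competitiveness(hotel_a, hotel_b, segments))  # both cover the first two segments
print(competitiveness(hotel_a, hotel_c, segments))  # both cover only the first segment
```

Under these assumptions, two items compete more strongly the larger the share of customers that either of them could serve.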

EXISTING SYSTEM:

This paper builds on and significantly extends our preliminary work on the evaluation of competitiveness. To the best of our knowledge, our work is the first to address the evaluation of competitiveness via the analysis of large unstructured datasets, without the need for direct comparative evidence. Nonetheless, our work has ties to previous work from various domains.

Managerial Competitor Identification: The management literature is rich with works that focus on how managers can manually identify competitors. Some of these works model competitor identification as a mental categorization process in which managers develop mental representations of competitors and use them to classify candidate firms. Other manual categorization methods are based on market- and resource-based similarities between a firm and candidate competitors. Finally, managerial competitor identification has also been presented as a sensemaking process in which competitors are identified based on their potential to threaten an organization's identity.

PROPOSED SYSTEM:

This example illustrates the ideal scenario, in which we have access to the complete set of customers in a given market, as well as to specific market segments and their requirements. In practice, however, such information is not available. In order to overcome this, we describe a method for computing all the segments in a given market based on mining large review datasets. This method allows us to operationalize our definition of competitiveness and address the problem of finding the top-k competitors of an item in any given market. As we show in our work, this problem presents significant computational challenges, especially in the presence of large datasets with hundreds or thousands of items, such as those that are often found in mainstream domains. We address these challenges via a highly scalable framework for top-k computation, including an efficient evaluation algorithm and an appropriate index.

Our work makes the following contributions:

• A formal definition of the competitiveness between two items, based on their appeal to the various customer segments in their market. Our approach overcomes the reliance of previous work on scarce comparative evidence mined from text.
• A formal methodology for the identification of the different types of customers in a given market, as well as for the estimation of the percentage of customers that belong to each type.
• A highly scalable framework for finding the top-k competitors of a given item in very large datasets.
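The two steps described above, estimating customer types from reviews and then ranking competitors, can be sketched as follows. This is a simplified illustration under stated assumptions: each review is assumed to have already been reduced (by some upstream text-mining step not shown here) to the set of features it mentions, each distinct set is treated as one customer type, and the top-k search is an exhaustive baseline that scores every item. It is not the scalable evaluation algorithm or index that the paper proposes, and all names and data are hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Tuple

FeatureSet = FrozenSet[str]


def estimate_segments(review_features: Iterable[FeatureSet]) -> Dict[FeatureSet, float]:
    """Treat each distinct set of features mentioned in a review as one
    customer type and estimate its share by relative frequency."""
    counts = Counter(fs for fs in review_features if fs)  # skip empty reviews
    total = sum(counts.values())
    return {seg: n / total for seg, n in counts.items()}


def top_k_competitors(target: str, items: Dict[str, FeatureSet],
                      segments: Dict[FeatureSet, float],
                      k: int) -> List[Tuple[str, float]]:
    """Exhaustive baseline: score every other item against the target
    and keep the k highest-scoring ones."""
    def score(other: str) -> float:
        # Share of customers whose required features both items offer.
        return sum(share for seg, share in segments.items()
                   if seg <= items[target] and seg <= items[other])

    scored = ((name, score(name)) for name in items if name != target)
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical input: feature sets already extracted from four reviews.
reviews = [
    frozenset({"wifi"}),
    frozenset({"wifi"}),
    frozenset({"wifi", "pool"}),
    frozenset({"pool", "gym"}),
]
items = {
    "A": frozenset({"wifi", "pool"}),
    "B": frozenset({"wifi", "pool", "gym"}),
    "C": frozenset({"wifi"}),
    "D": frozenset({"gym"}),
}

segments = estimate_segments(reviews)
print(top_k_competitors("A", items, segments, k=2))
```

The baseline evaluates every item against every segment, which is the cost that grows quickly on datasets with thousands of items and motivates a dedicated index.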

CONCLUSION

We presented a formal definition of competitiveness between two items, which we validated both quantitatively and qualitatively. Our formalization is applicable across domains, overcoming the shortcomings of previous approaches. We consider a number of factors that have been largely overlooked in the past, such as the position of the items in the multi-dimensional feature space and the preferences and opinions of the users. Our work introduces an end-to-end methodology for mining such information from large datasets of customer reviews. Based on our competitiveness definition, we addressed the computationally challenging problem of finding the top-k competitors of a given item. The proposed framework is efficient and applicable to domains with very large populations of items. The efficiency of our methodology was verified via an experimental evaluation on real datasets from different domains. Our experiments also revealed that a small number of reviews is sufficient to confidently estimate the different types of users in a given market, as well as the number of users that belong to each type.
