Abstract
Clustering is one of the main concepts in pattern recognition and an important knowledge discovery tool in modern machine learning. Identifying clusters of high-performance companies matters not only to investors but also to creditors, stockholders, and other stakeholders. Hence, clustering firms is an important issue for the Tehran Stock Exchange (TSE). To this end, we use financial statement data from three TSE industries for the year 2012. After selecting profitability criteria (attributes) and prioritizing them using the Analytic Hierarchy Process (AHP), we apply the k-means clustering algorithm to group these companies. To obtain the optimal number of clusters, several validity measures are presented. The resulting clusters of TSE companies can be used to improve planning and to support more comprehensive decision making about these companies.
Keywords: Clustering, AHP, K-means algorithm, Validity.
Introduction
The purpose of any investor is to seek desirable investment opportunities that maximize profit. Nowadays, investing in companies' stocks requires financial knowledge, selection of profitable stocks, and efficient use of capital. At the same time, classifying complex phenomena by their characteristics is an important human activity. Clustering, or cluster analysis, is the main method of classification (Sharma, 1996), and it has applications in many sciences. Mirkin (1996) defined it as "a mathematical technique designed for revealing classification structures in the data collected in the real world phenomena" (Nanda et al., 2010). Clustering divides a data set into classes by maximizing intra-class similarity and minimizing inter-class similarity. That is, clusters are formed so that similar objects are grouped together and very different objects fall into different clusters (Babu et al., 2012). Cluster analysis is not the ultimate goal of research; rather, it is a starting point for further work. In this paper, after determining the criteria and prioritizing them with the Analytic Hierarchy Process (AHP), we apply the well-known K-means clustering technique together with several validity indexes to obtain the optimal number of clusters.
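To make the procedure concrete, the following is a minimal sketch of K-means combined with one validity index for choosing the number of clusters. It rests on several assumptions that do not come from the paper: the data are a small hypothetical set of two-attribute points rather than TSE financial ratios, the mean silhouette width stands in for the validity measures (which are not named in this section), the initial centers are chosen by a deterministic farthest-first rule, and the AHP weighting step is omitted.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from its nearest
    already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's K-means on a list of equal-length tuples."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


def silhouette(points, labels):
    """Mean silhouette width; higher values indicate tighter,
    better-separated clusters. Singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)[0]) for k in k_values}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: nine "companies" described by two attributes.
points = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.90, 0.80), (0.85, 0.90), (0.95, 0.85),
    (0.10, 0.90), (0.20, 0.85), (0.15, 0.95),
]

best_k, scores = choose_k(points, range(2, 6))
print(best_k, scores)
```

The silhouette is only one of many possible validity indexes; any other index can be swapped into `choose_k` without changing the rest of the sketch.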
The rest of this paper is organized as follows. Section 2 reviews the relevant literature. Section 3 explains the research methodology. Section 4 presents the main findings. Finally, in Section 5 conclusion...