ABSTRACT
KNOWLEDGE DISCOVERY IN DATABASES (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and the mechanisms for retrieving potential knowledge from data collections. Related issues include data collection, database design, the description of entries in the database using the most appropriate representation, and data quality. This article is an introductory overview of knowledge discovery in databases. The rationale and environment of its development and applications are discussed. Issues related to database design and collection are reviewed.
INTRODUCTION
Development of techniques to investigate databases, or the contents of databases, is of significant interest. As data storage space becomes less expensive, data collection as a tool has become more accessible and more widely used. Organizations are literally stockpiling data in warehouses for future investigation. Research is being done to ascertain whether there are patterns, not just within databases but within documents and disciplines, that contribute to knowledge retrieval.
Every discipline has borders that expand and contract with the practical and intellectual adventurism of its members. As the collective knowledge base has grown, it has become apparent that aspects of one field cross into many other fields. The evolution of information technology also provides a bridge across disciplines, in both its theories and its applications. Knowledge discovery in databases (KDD) is another manifestation of the expansion of investigative tools across fields of interest and applications.
Many disciplines contribute to the undertaking of KDD. Some are more cognizant than others of the many factors involved with data collection. This article is an overview of knowledge discovery in databases. It discusses recurring concerns, raised from different perspectives, about the collection, classification, and quality of data related to applications of KDD.
DATABASES AND KNOWLEDGE DISCOVERY
Dramatic improvements in information technology have encouraged the massive collection and storage of data in all areas from commerce to research. From operational databases where personnel data are kept, to transactional systems that track sales, inventory, and patron data, to full-text document databases and more, databases are growing in size, number, and application. The enormous increase in databases of all sizes and designs is evidence of our ability to collect data, but it also creates the necessity for better methods to access and analyze data. Human capacity to handle the data available in these...