Evaluate Various Techniques of Data Warehouse and Data Mining with Web Based Tool

M. Aarthi
J. Priyadharshini

Abstract

Every enterprise must operate proficiently and productively to survive in the market and increase its profitability. This challenge grows more complicated as information technology advances and the volume and complexity of information increase. Today, the success of an enterprise is not just the result of effort by its resources; it also depends on the ability to mine data from stored information. Data warehousing is a collection of decision-support procedures for integrating and managing large, varied data efficiently and systematically. Data mining supports organizations by helping them scrutinize their data more effectively to obtain valuable information that can inform intelligent, strategic decision making. Data mining comprises several techniques and mathematical algorithms that are used to mine large data sets to improve organizational performance and strategic decision making. Clustering is a powerful and widely accepted data mining method that partitions large data sets into groups of similar objects and gives the end user a sophisticated view of the database. This study discusses the basic concept of clustering, its meaning, and its applications, especially in business for market segmentation and target market selection. The technique is useful in marketing and sales, for example to send a promotion for a product or service to the right target group. Association is a well-known data mining technique in which a pattern is inferred from the relationship between items in the same business transaction. It is also referred to as the relation technique. Large enterprises depend on this technique to research customers' buying preferences. For instance, when tracking buying behavior, a retailer might find that customers always buy sambar onion when they buy dal, and therefore suggest onion the next time a customer buys dal.
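As a minimal sketch of the association technique described above, the following Python snippet measures how strongly the rule "if dal, then onion" holds over a small set of transactions. The transactions, the extra item names (rice, milk), and the function names are hypothetical and made up for illustration. Support and confidence are the standard measures for association rules; the study itself does not state which measures it uses.

```python
# Hypothetical transactions, invented for illustration only.
TRANSACTIONS = [
    {"dal", "onion", "rice"},
    {"dal", "onion"},
    {"rice", "milk"},
    {"dal", "onion", "milk"},
    {"onion", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never fires, so there is no evidence for it
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


if __name__ == "__main__":
    print("support(dal, onion):", support({"dal", "onion"}, TRANSACTIONS))
    print("confidence(dal -> onion):", confidence({"dal"}, {"onion"}, TRANSACTIONS))
```

A retailer would typically keep only rules whose support and confidence exceed chosen thresholds before using them for suggestions.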
Classification differs from the techniques above in that it is based on machine learning and uses mathematical techniques such as linear programming, decision trees, and neural networks. In classification, enterprises try to build a tool that can learn how to classify data items into groups. For instance, a company can define a classification task such as “given all records of employees who offered to resign from the company, predict the number of individuals who are likely to resign from the company in future.” In such a scenario, the company can divide the employee records into two groups, namely “separate” and “retain”, and use its data mining software to assign employees to these groups. Fuzzy logic closely resembles human reasoning in its handling of imperfect information and can be used as a flexible tool to soften the boundaries in classification, which suits real problems better. The present study discusses the meaning of fuzzy logic, its applications, and its different features. A tool is to be built to test data mining algorithms and the algorithm behind each model; as a sample, it applies a clustering method to select training data from the large database, reducing complexity and computation time. The k-nearest neighbor method can be used in many applications, from general to specific, to find requested data within a huge data set.
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node represents a class label. The topmost node in the tree is the root node. Within the decision tree, we start with a simple question that has multiple answers. Each answer leads to a further question that helps classify or identify the data, so that it can be categorized or a prediction can be made based on each answer. Regression analysis is the data mining method of identifying and analyzing the relationship between variables. It is used to estimate the likelihood of a specific variable, given the presence of other variables. Outlier detection refers to the observation of data items in a data set that do not match an expected pattern or expected behavior. This technique can be used in a variety of domains, such as intrusion detection, fraud detection, and fault detection. Outlier detection is also called outlier analysis or outlier mining. The sequential patterns technique helps to discover similar patterns or trends in transaction data over a definite period.
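The decision tree structure described above (a root node, internal nodes that test an attribute, branches for each outcome, and leaf nodes that carry a class label) can be sketched in Python as follows. This is an illustrative, hand-built tree, not the tool or model from the study: the attribute names `offered_to_resign` and `job_satisfaction`, their values, and the class and function names are assumptions. Only the two class labels, "separate" and "retain", come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class label stored at a leaf node


@dataclass
class Node:
    attribute: str  # attribute tested at this internal node
    # One branch per possible outcome of the test.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, record):
    """Walk from the root to a leaf, following the branch that matches
    the record's value for each tested attribute."""
    node = tree
    while isinstance(node, Node):
        node = node.branches[record[node.attribute]]
    return node.label


# Hypothetical hand-built tree; attributes and values are assumptions.
TREE = Node("offered_to_resign", {
    "yes": Leaf("separate"),
    "no": Node("job_satisfaction", {
        "low": Leaf("separate"),
        "medium": Leaf("retain"),
        "high": Leaf("retain"),
    }),
})

if __name__ == "__main__":
    employee = {"offered_to_resign": "no", "job_satisfaction": "high"}
    print(classify(TREE, employee))
```

In practice the tree would be learned from training records rather than written by hand; the traversal from root to leaf stays the same.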
