Tutorial 1

ch1 Questions


What is data?

Data are basic facts, such as names, numbers, or characters, that come in different forms (for example, text or images).


What is a database?

A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS). Together, the data and the DBMS, along with the applications that are associated with them, are referred to as a database system, often shortened to just database.


What is information?

Information is data that has been processed or organized so that it carries more meaning.


What is knowledge?

Knowledge is processed or organized data (information) that has been given value by uncovering the relationships within it, leading to deeper understanding.


What is Pattern Recognition?

Pattern recognition is the process of recognizing patterns using a machine (computer). It can be viewed from several aspects.


What is data mining?

Data mining is the process of sorting through large data sets to identify patterns and relationships that can help solve business problems through data analysis. Data mining techniques and tools enable enterprises to predict future trends and make more-informed business decisions.

Data mining is a process or technology for mining data in order to build models.

Data mining (DM) is a natural evolution of database technology. It is in great demand and has wide applications (business, medical, manufacturing, etc.).

Definition: the extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data. An alternative definition: the exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns.


What is Knowledge Discovery in Databases (KDD)?

Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. This widely used data mining technique is a process that includes data preparation and selection, data cleansing, incorporating prior knowledge on data sets and interpreting accurate solutions from the observed results.


What is the Knowledge Discovery in Databases (KDD) process?

A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation.
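As a rough illustration, the KDD steps listed above can be sketched as a chain of small Python functions. Everything here is an assumption made for the example: the purchase records are made up, the function names are hypothetical, and the "data mining" step is only a toy count of quantities per item, not a real mining algorithm.

```python
from collections import Counter

# Two hypothetical sources of purchase records (made-up data for illustration).
store_a = [
    {"customer": "c1", "item": "Bread", "qty": 2},
    {"customer": "c2", "item": "milk", "qty": None},  # missing quantity
    {"customer": "c3", "item": "bread", "qty": 1},
]
store_b = [
    {"customer": "c4", "item": "Milk", "qty": 3},
    {"customer": "c5", "item": "bread", "qty": 1},
]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Transformation: normalize item names to lower case."""
    return [{**r, "item": r["item"].lower()} for r in records]

def mine(records):
    """Data mining (toy version): total quantity sold per item."""
    totals = Counter()
    for r in records:
        totals[r["item"]] += r["qty"]
    return totals

def evaluate(patterns, min_total):
    """Pattern evaluation: keep only items whose total reaches a threshold."""
    return {item: total for item, total in patterns.items() if total >= min_total}

def present(patterns):
    """Knowledge presentation: format the surviving patterns as text lines."""
    return [f"{item}: {total} units" for item, total in sorted(patterns.items())]

data = integrate(clean(store_a), clean(store_b))
data = transform(select(data, ["item", "qty"]))
patterns = mine(data)
interesting = evaluate(patterns, min_total=4)
report = present(interesting)
print(report)  # ['bread: 4 units']
```

In practice each step is far more involved, and the steps are often repeated rather than run once in a straight line.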


What is big data?

Big data is data that exhibits the "V" characteristics: volume, variety, velocity, and value.


Briefly explain the relationship between data mining and big data.

Data mining may be applied to data that has big data characteristics (volume, variety, velocity, value) in order to uncover the knowledge and insights it contains.


What is a data warehouse?

A data warehouse is a central repository for integrated data from multiple sources. It stores historical and current data that can be used for data mining purposes.


Why Data Mining?

Today the availability of data has grown massively, from terabytes to yottabytes. Data is generated everywhere, for example in social media, society, and e-commerce. Data mining is needed to uncover the knowledge and insights in this data.


What are the motivations for data mining?

The main motivation is the growth of data in both commercial and scientific databases, driven by advances in data generation and collection technologies.


Commercial Viewpoint

Lots of data is being collected and warehoused

Computers have become cheaper and more powerful


Scientific Viewpoint

Data is collected and stored at enormous speeds

Data mining helps scientists with the automated analysis of massive datasets



What is big data mining?

Big data mining refers to the collection of data mining or extraction techniques that are performed on large volumes of data, that is, on big data.


What is data science?

Data science is an umbrella term that encompasses all of the techniques and tools used during the process of data mining and analytics.


What are Data mining tasks?

Classification, clustering, association, and outlier detection.


What are Data mining techniques?

Decision trees, k-means, and Apriori.
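To make one of these techniques concrete, here is a minimal sketch of k-means on one-dimensional values. It is only an illustration under stated assumptions: the ages are made up, the function name `kmeans_1d` is hypothetical, and the starting centroids are chosen by hand rather than at random.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Cluster 1-D values with a basic k-means loop.

    `centroids` holds the initial cluster centres; the caller chooses them.
    """
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids, clusters

ages = [18, 20, 22, 40, 42, 44]  # made-up values
centroids, clusters = kmeans_1d(ages, centroids=[18, 20])
print(centroids)  # [20.0, 42.0]
print(clusters)   # [[18, 20, 22], [40, 42, 44]]
```

Even with a poor starting guess (both centroids in the younger group), the loop separates the two groups after a few iterations. On real data the result can depend on the starting centroids.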


What are Major issues in data mining?

Scalability, high dimensionality, and heterogeneous and complex data.



What are the kinds of databases?

Relational databases, data warehouses, transactional databases, advanced database systems, flat files, and the World Wide Web (WWW).



What are the kinds of knowledge?

Categorizing data (classification), finding relationships (association), grouping similar data (clustering), and making predictions (prediction).


What techniques are used in data mining?

Machine learning, pattern recognition, neural networks, Naïve Bayes, k-nearest neighbour, rough sets, and statistics.


Which application areas have adopted data mining?

Finance, marketing, medicine, stock markets, and telecommunications.


What are data mining tasks?

Tasks include classification, clustering, association rules, prediction, sequential analysis, deviation analysis, similarity analysis, and trend analysis.
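As a small illustration of the association rules task, the sketch below computes the two measures commonly used to judge a rule, support and confidence, on a handful of made-up shopping baskets. The transactions and function names are assumptions for the example only.

```python
# Made-up shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of the combined itemset divided by support of the antecedent."""
    combined = support(antecedent | consequent, transactions)
    return combined / support(antecedent, transactions)

# Rule under test: "customers who buy bread also buy milk".
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support)               # 0.5
print(round(rule_confidence, 2))  # 0.67
```

Bread and milk appear together in 2 of the 4 baskets (support 0.5), and 2 of the 3 baskets that contain bread also contain milk (confidence about 0.67).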


What are data mining techniques?

Techniques include decision trees, association rules, k-means, neural networks, Naïve Bayes, k-nearest neighbour, and statistical methods.


What is a model?


What is a data mining model?

A data-mining model is structurally composed of a number of data-mining columns and a data-mining algorithm.


What is a training set?

A training set is the part of the data set that is used to build the model.


What is a test set?

A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with training set used to build the model and test set used to validate it.
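The split described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the data set is made up, the 75/25 split ratio and the function names are choices made for the example, and the "model" is a deliberately trivial one that always predicts the most common class in the training set.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and split it into (training set, test set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def build_model(training_set):
    """Toy model: always predict the most common class in the training set."""
    labels = [label for _, label in training_set]
    return max(set(labels), key=labels.count)

def accuracy(model, test_set):
    """Fraction of test rows whose class matches the model's prediction."""
    correct = sum(1 for _, label in test_set if label == model)
    return correct / len(test_set)

# Made-up data set: (attribute value, class label) pairs.
data = [(i, "yes" if i % 4 else "no") for i in range(20)]

training_set, test_set = train_test_split(data)
model = build_model(training_set)        # built from the training set only
score = accuracy(model, test_set)        # measured on the unseen test set
print(len(training_set), len(test_set))  # 15 5
print(model, score)
```

The key point is that the test rows play no part in building the model, so the accuracy measured on them estimates how the model behaves on data it has not seen.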



What are attributes?


What is a class?


What is Euclidean Distance?


What is regression?


What is a linear or nonlinear model of dependency?


What is deviation analysis?


What is an outlier?


What is categorical data?


What is continuous data?



What does it mean to learn a classifier?
