Data Mining and Data Warehousing multiple choice questions with answers (PDF) for IT academic and competitive exam preparation. Before jumping to the MCQs, let's briefly review some related terms.
Definition of Data Mining:
Data mining is the systematic process of extracting hidden patterns, correlations, and knowledge from vast and diverse datasets, utilizing techniques from fields such as statistics, machine learning, and database management.
It involves transforming raw data into meaningful information to support informed decision-making and gain a deeper understanding of complex phenomena.
What is the data mining process?
The data mining process consists of data collection, preprocessing, exploration, modeling, evaluation, and deployment of results, with the aim of extracting meaningful patterns and knowledge.
What are the applications of data mining?
Data mining is applied for customer segmentation, fraud detection, recommendation systems, predictive analytics, image recognition, sentiment analysis, and more.
Data Warehouse Defined as:
A data warehouse is a centralised storage facility for collecting and preserving past and present data from multiple sources. It provides a structured data analysis and reporting environment, supporting business intelligence initiatives.
Definition of Data Warehouse: A data warehouse is a repository that gathers structured and unstructured data from diverse sources and offers a fundamental platform for conducting data analysis and generating reports.
Warehouse Database:
A warehouse database, or data warehouse, is a specialized database designed for analytical processing. It stores historical data and supports complex queries for reporting and analysis.
Data Warehouse Explained as:
A data warehouse acts as a data hub, integrating data from different departments and systems. It transforms raw data into meaningful insights, aiding stakeholders in understanding business performance and trends.
A Data Warehouse serves as a dedicated, consolidated storage facility that houses large quantities of structured and unstructured data gathered from diverse sources within a company.
Unlike operational databases, which are designed for transactional tasks, a Data Warehouse is optimized for analytical processing, enabling businesses to glean valuable insights and make informed decisions.
The primary objective of a Data Warehouse is to provide a unified and consistent platform where data from disparate systems and departments can be integrated, organized, and made easily accessible for querying and reporting.
This structured storage facilitates complex data analysis, trend identification, and pattern recognition that are crucial for strategic planning and business intelligence.
In essence, a Data Warehouse acts as a treasure trove of historical and current data, serving as the foundation for data-driven decision-making. It empowers organizations to extract actionable insights, uncover hidden correlations, and generate comprehensive reports that drive business growth and innovation.
Through the careful design, integration, and management of data, a Data Warehouse becomes a vital asset in the modern digital landscape, enabling businesses to turn raw information into meaningful knowledge.
Top 70 Data Mining and Data Warehousing multiple choice questions with answers
1. OLTP stands for ___.
Ans. Online Transaction Processing
2. OLTP handles day to day business transactions (true/false)
Ans. True
3. Updates on the Data Warehouse are allowed (true/false)
Ans. False
4. Data Warehouse is a database that is designed for facilitating ___ and ___.
Ans. Query and Analysis
5. Data Warehouse is defined as subject-oriented, integrated, time-variant and ___.
Ans. Non-Volatile
6. Data Warehouse contains only aggregated data and individual transactions (true/false)
Ans. True
7. List the types of the data warehouse.
Ans. Real-time, federated and distributed
8. ___ data Warehouse will allow changes in the information to be monitored and recorded over time.
Ans. time-variant
9. The Data Warehouse functions as ___ and an Executive Information System (EIS).
Ans. DSS
10. Data about data is called ___.
Ans. Metadata
11. Data Warehouse contains data for ___ purpose.
Ans. Analysis
12. Data Warehouse is a storehouse of ___ data.
Ans. Historical
13. In most organizations, two groups of people are key to the success of the project, ___ and ___.
Ans. Senior Management and Working Management
14. OLTP systems are designed for ___.
Ans. Real-time business operations
15. Data Warehouses do not require real-time validation (True / False)
Ans. True
16. In most organizations, two groups of people are key to the success of the project, ___ and ___.
Ans. Senior Management and Working Management
17. In Data Warehouse, the requirements are gathered subject area wise. (True / False)
Ans. True
18. The 3 major functions that need to be performed to get the data ready for the Data Warehouse are extraction, transformation and ___.
Ans. Loading
19. ___ and ___ of data take place on a large scale in the data staging area.
Ans. Sorting and Merging
20. Knowledge discovery is called ___.
Ans. Data Mining
21. The main purpose of E-R modelling is
a. To remove redundancy
b. To improve analysis for decision-making
c. To record historical data
d. None
Ans. a
22. E-R modelling and Dimensional modelling are the same (True / False)
Ans. False
23. A Dimension is an entity or subject area, which can group the data (True / False)
Ans. True
24. Dimensional model consists of ___ and ___ tables.
Ans. Dimensions and fact tables
25. ___ is often used in dimensional modelling.
Ans. Text data
26. Fact tables usually consist of ___ to ___ relationships.
Ans. Many to many
27. Dimensional model can be implemented with the following databases,
a. Relational database
b. MDDB
c. Flat files
d. Excel data files
e. None
Ans. a
28. Customer name change in the dimensional model comes under ___.
Ans. Slowly-changing-dimension
29. The most popular model for the data warehouse is ___.
Ans. Multidimensional model
30. Which of the following schema supports the normalization in dimensional modelling?
a. Star Schema
b. Snow-Flake schema
c. Fact-Constellation
Ans. b
31. Each dimension table is in ___ relationship with the central fact table.
Ans. One-to-many
32. Dimensional table and a fact table can be connected with the following database keys:
a. Foreign key
b. Surrogate key
c. Candidate key
Ans. a
33. For sales analysis units sold is a ___ kind of measure.
Ans. Additive numeric measure
34. OLAP tools are data accessing and discovery tools (True / False)
Ans. True
35. In Data Warehouse a system with multiple architectures is called ___
Ans. Federated Data Warehouse architecture
36. Data marts are,
a) Department level
b) Limited in size
c) Read-only
d) All the above
Ans. d
37. Data Warehouse functions are a Decision support system and ___.
Ans. EIS
38. Data extraction, ___ and ___ encompass the areas of data acquisition and data storage.
Ans. Transformation and Loading
39. Populating all the Data Warehouse tables for the very first time is called ___.
Ans. Initial Load
40. Which of the following are open source ETL tools?
a) SAS Data Integrator
b) Ascential DataStage
c) Cognos Decision Stream
d) Microsoft DTS
e) Clover
Ans. Clover
41. Average daily balance is a ___ attribute.
Ans. Derived attribute
42. OLAP stands for ___
Ans. Online analytical processing
43. OLAP tools enable the user to access the data in Data Warehouse in an interactive manner (True / False)
Ans. True
44. ERP and CRM are ___ kinds of systems.
Ans. OLTP
45. Data cube contains ___ and ___.
Ans. Dimensions and Facts
46. A dimensional table contains hierarchies (True / False)
Ans. True
47. Which of the following are the intermediate servers that stand in between a relational back-end server and client front-end tools?
a. ROLAP
b. MOLAP
c. HOLAP
d. All the above
Ans. d
48. The advantage of using a data cube is that it allows fast indexing to precomputed summarized data. (True / False)
Ans. True
49. In a Data Warehouse, linking a single record to all of its duplicate records in the source systems is called ___.
Ans. De-duplication
50. Sorting the data in the given source file is a transformation (True / False).
Ans. True
51. OLTP is the abbreviation for ___
Ans. Online transaction processing
52. Query response time is ___ kind of metadata.
Ans. Operational metadata
53. Key hierarchies and key performance indicators are ___ kind of Metadata.
Ans. Business metadata
54. Storing, data mapping and transformation from source systems to the data warehouse fall into:
a. Technical metadata
b. Operational metadata
c. Business metadata
Ans. a
55. According to Ralph Kimball, Back-room metadata guides:
a. Extraction
b. Cleaning
c. Loading processes
d. All the above
Ans. d
56. One tool that can allow data warehouse managers to deal with metadata is called___.
Ans. Repository
57. Access rights, protocols are ___ metadata.
Ans. Administrative metadata
58. Data about data is called ___.
Ans. Metadata
59. Information can be converted into knowledge about ___ patterns and future trends.
Ans. Historical
60. Data about data is called ___.
Ans. Metadata
61. The ___ software gives the user the opportunity to look at the data from a variety of different dimensions.
Ans. Multidimensional Analysis
62. ___ Optimization techniques are based on the concepts of genetic combination, mutation, and natural selection.
Ans. Genetic algorithms
63. Based on the overall requirements of business intelligence, the ___ layer is required to extract, cleanse and transform data into load files for the information warehouse.
Ans. Data integration
64. Data Mining is not a business solution; it is just a technology. (True/False)
Ans. True
65. ___ is used to refer to systems and technologies that provide the business with the means for decision-makers to extract personalized meaningful information about their business and industry.
Ans. Business Intelligence
66. OLAP supports ___ user access and multiple queries.
Ans. Multiple
67. Statistics techniques are incorporated into Data mining methods. (True/False).
Ans. True
68. The Apriori algorithm operates using the ___ method
a. Bottom-up search method
b. Breadth-first search method
c. None of the above
d. Both a & b
Ans. D
69. A bi-directional search takes advantage of ___ process
a. Bottom-up process
b. Top-down process
c. None
d. Both a & b
Ans. D
70. The Pincer-Search algorithm has an advantage over the Apriori algorithm when the largest frequent itemset is long. (True/False)
Ans. True
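Questions 68 to 70 refer to the Apriori algorithm. As a rough illustration of why it is described as a bottom-up, breadth-first search, here is a minimal Python sketch. The function name `apriori`, the sample baskets and the support threshold are assumptions made for this example, not part of any particular library. The search starts from single items and grows candidates one level at a time, keeping only itemsets that appear in at least `min_support` transactions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate of size k.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-item
        # subsets were frequent at the previous level.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = apriori(baskets, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Because every level is fully counted before the next one is generated, the number of passes over the data grows with the length of the longest frequent itemset, which is the situation question 70 alludes to.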
FAQs related to Data Mining and Data Warehousing
Ques 1: What does a Data Warehouse allow an organization to achieve?
Answer: A data warehouse allows organizations to achieve streamlined data storage, efficient data analysis, improved decision-making, and enhanced business intelligence capabilities.
Ques 2: What is the difference between Data Warehousing and Data Mining?
Answer: Data warehousing and data mining go hand in hand. While data warehousing provides a robust infrastructure for data storage, data mining leverages advanced techniques to extract valuable insights and patterns from the stored data.
Here’s a clear explanation of the difference between Data Warehousing and Data Mining:
Data Warehousing vs. Data Mining:
- Data Warehousing:
  - Definition: Data Warehousing involves gathering, retaining, and organizing both structured and unstructured data from diverse sources in a centralized storage location.
  - Purpose: The primary purpose of Data Warehousing is to provide a consolidated and organized storage environment for historical and current data.
  - Focus: Data Warehousing focuses on efficient data storage, retrieval, and integration to support reporting, analysis, and business intelligence.
  - Components: It involves the creation of a centralized database, data transformation through ETL processes, and designing schemas for optimal querying.
  - Usage: Data Warehousing is used for generating reports, dashboards, and visualizations, making historical comparisons, and supporting strategic decision-making.
- Data Mining:
  - Definition: Data Mining is the process of analyzing large datasets to discover patterns, correlations, and insights, and so extract valuable knowledge.
  - Purpose: The primary purpose of Data Mining is to uncover hidden information and relationships within data that might not be immediately apparent.
  - Focus: Data Mining focuses on analyzing data to identify trends, patterns, and anomalies, often using algorithms and statistical techniques.
  - Techniques: It involves various techniques such as clustering, classification, regression, and association rule mining.
  - Usage: Data Mining is used to predict future trends, segment data, make recommendations, and gain deeper insights for decision-making.
In essence, Data Warehousing is about efficiently storing and organizing data for easy access and analysis, while Data Mining is the process of extracting meaningful insights and knowledge from the stored data. Data Warehousing provides the foundation and infrastructure for Data Mining by creating a structured environment for data analysis.
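To make the ETL (extract, transform, load) step mentioned above a little more concrete, here is a minimal sketch using only the Python standard library. The CSV content, the `sales` table and the function names are all hypothetical and chosen purely for illustration. A real warehouse load would involve far more validation, but the three stages are the same.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (assumption for illustration).
RAW_CSV = """order_id,region,amount
1, north ,100.50
2,South,
3,NORTH,49.50
4,south,20.00
"""


def extract(text):
    """Extract: read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount and normalise region names."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # incomplete record, skipped in this sketch
        clean.append((int(row["order_id"]), row["region"].strip().lower(), float(amount)))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
load(conn, transform(extract(RAW_CSV)))

# Once the data is loaded, analysis or mining can run against the clean table.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
print(totals)
```

The point of the sketch is the separation of concerns: the warehouse side cleans and stores the data once, and any later analysis queries the consistent result rather than the raw sources.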
Ques 3: In a Data warehouse, what is a dimension?
Answer: In the realm of a data warehouse, a dimension refers to a categorical attribute or characteristic that provides context and additional information to the measures or metrics being analyzed. Dimensions help categorize and organize data, allowing users to slice, dice, and drill down into the data for deeper insights.
Dimensions typically describe the “who,” “what,” “where,” “when,” and “how” aspects of the data. For instance, in a sales data warehouse, dimensions might include attributes like “product,” “time,” “location,” “customer,” and “salesperson.” These dimensions provide valuable context to the sales metrics, allowing users to analyze sales performance across different products, time periods, locations, customers, and salespeople.
Dimensions are often used in conjunction with fact tables, which contain the numerical measures or metrics being analyzed, such as sales revenue or quantity sold. By associating dimensions with facts, data warehouses enable multidimensional analysis, allowing users to explore data from various perspectives and uncover meaningful patterns and trends.
In summary, a dimension in a data warehouse serves as a vital component that categorizes and adds context to the measures being analyzed. It allows users to navigate and explore data from different angles, enhancing the depth and breadth of insights gained from the data.
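The relationship between dimensions and a fact table can be sketched with a tiny star schema. The example below uses Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_location`, `fact_sales`), the sample rows and the helper functions are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);
-- The fact table holds the numeric measures plus a foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES
    (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'), (3, 'Desk', 'Furniture');
INSERT INTO dim_location VALUES (1, 'Delhi'), (2, 'Mumbai');
INSERT INTO fact_sales VALUES
    (1, 1, 2, 2000.0), (2, 1, 5, 2500.0), (3, 2, 1, 300.0), (1, 2, 1, 1000.0);
""")


def revenue_by_category(conn):
    """Aggregate the revenue measure along the product dimension."""
    query = """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query).fetchall()


def revenue_by_category_in(conn, city):
    """Slice: the same aggregation restricted to one value of the location dimension."""
    query = """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_location AS l ON l.location_id = f.location_id
        WHERE l.city = ?
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query, (city,)).fetchall()


print(revenue_by_category(conn))
print(revenue_by_category_in(conn, "Delhi"))
```

Each dimension row can be referenced by many fact rows, which is the one-to-many relationship between a dimension table and the central fact table.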
Ques 4: What are the data mining techniques?
Answer: Data mining techniques include clustering, classification, regression, association rule mining, anomaly detection, and neural networks, among others.
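As one small example of these techniques, clustering can be sketched with a one-dimensional k-means loop in plain Python. The function name `kmeans_1d`, the sample spending figures and the starting centroids are assumptions for illustration. Production work would normally rely on an established library rather than a hand-written loop.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group numbers around centroids by repeating assign and update steps."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer, for illustration only.
spend = [10, 12, 11, 50, 52, 49]
centroids, clusters = kmeans_1d(spend, centroids=[10, 50])
print(centroids)
print(clusters)
```

Here the two resulting groups correspond to low and high spenders, which is the kind of customer segmentation mentioned earlier as a data mining application.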
Ques 5: How do you define data mining in DBMS (database management system)?
Answer: In a DBMS, data mining refers to the process of extracting valuable information from the stored data to discover patterns and relationships, enhancing decision-making.
Ques 6: What does data mining involve in the context of DBMS?
Answer: In a DBMS, data mining involves querying, analyzing, and visualizing data to uncover hidden insights and generate actionable knowledge.
Conclusion
Thanks for visiting. If you liked this post on Data Mining and Data Warehousing multiple choice questions with answers, please share it on social media. You can also leave your queries in the comments.