CS614-Data Warehousing Quiz MCQS #Objective #Questions #FinalTerm
1. Which of the following is NOT one of the variants of Nested-loop join?
- Naïve nested-loop join
- Indexed nested-loop join
- Temporary index nested-loop join
- Binary index nested-loop join ✔
2. ___ lists each term in the collection only once and then shows a list of all the documents that contain the given term
- Inverted index ✔
- Bitmap index
- Cluster index
- Join index
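A minimal sketch of the idea behind an inverted index, assuming a tiny hypothetical document collection and simple whitespace tokenization (real systems also handle stemming, stop words and term positions):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)   # each term is stored only once
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical three-document collection, for illustration only.
docs = {
    1: "data warehouse design",
    2: "data mining techniques",
    3: "warehouse indexing techniques",
}
inverted = build_inverted_index(docs)
print(inverted["data"])         # ids of documents containing "data"
print(inverted["techniques"])   # ids of documents containing "techniques"
```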
3. The optimizer uses a hash join to join two tables if they are joined using an equijoin and
- Outer table has less number of rows
- Inner table has less number of rows
- Cardinality of tables is equal
- Large amount of data needs to be joined ✔
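A rough sketch of how an in-memory hash join evaluates an equijoin, assuming both tables are plain lists of dicts; the table and column names are hypothetical, and a real optimizer also handles inputs that do not fit in memory:

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Equijoin two lists of dicts on `key` using an in-memory hash table."""
    # Build phase: hash one input (ideally the smaller one) on the join key.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[key]].append(row)
    # Probe phase: one hash lookup per row of the other input.
    result = []
    for row in probe_rows:
        for match in buckets.get(row[key], []):
            result.append({**match, **row})
    return result

# Hypothetical sample tables.
customers = [{"cust_id": 1, "name": "Ali"}, {"cust_id": 2, "name": "Sara"}]
sales = [
    {"cust_id": 1, "amount": 100},
    {"cust_id": 2, "amount": 250},
    {"cust_id": 1, "amount": 75},
]
joined = hash_join(customers, sales, "cust_id")
for row in joined:
    print(row)
```

Each input is read once, so the cost grows with the sum of the table sizes rather than their product, which is why the technique suits large joins.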
4. In the nested-loop join case, if there are ‘M’ rows in the outer table and ‘N’ rows in the inner table, the time complexity is
- O (M log N)
- O (log MN)
- O (MN) ✔
- O (M + N)
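The O(MN) cost can be seen by counting comparisons in a naive nested-loop join. This is an illustrative sketch with hypothetical rows, not a description of any particular database engine:

```python
def naive_nested_loop_join(outer, inner, key):
    """Compare every outer row with every inner row and count the comparisons."""
    comparisons = 0
    result = []
    for o in outer:          # M iterations
        for i in inner:      # N iterations for each outer row
            comparisons += 1
            if o[key] == i[key]:
                result.append((o, i))
    return result, comparisons

outer = [{"k": n} for n in range(4)]        # M = 4
inner = [{"k": n % 2} for n in range(6)]    # N = 6
matches, comparisons = naive_nested_loop_join(outer, inner, "k")
print(comparisons == len(outer) * len(inner))
```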
5. Goal driven approach of data warehouse development was result of ___ work
- Bill Inmon
- Ralph Kimball ✔
- Bonhnlein and Ulbrich-vom
- Westerman
6. In the ___ phase of a fundamental data warehouse life cycle model, a working model of the data warehouse is deployed for a selective set of users
- Design
- Prototype ✔
- Deployment
- Operation
7. One of the drawbacks of waterfall model is that:
- Customers cannot review the product during development ✔
- It does not work when the resources are limited
- It does not define the project timeline/schedule
- All of the given
8. Implementation of a data warehouse requires ___ activities
- Highly integrated
- Loosely integrated
- Tightly decoupled
- None ✔
9. The ___ phase of fundamental data warehouse life cycle model includes data warehouse daily maintenance activities
- Deployment
- Operation ✔
- Enhancement
- Maintenance
10. As per Bill Inmon, a data warehouse, in contrast with classical applications, is
- Data driven ✔
- Resource driven
- Requirement driven
- Time sensitive
11. Which of the following activities executes in parallel with all other activities in Kimball’s DWH development approach?
- Requirement elicitation
- Project Planning
- Project management ✔
- Deployment
12. Which of the following activity/activities is/are part of the project planning phase in Kimball’s DWH development approach?
- Obtain resources
- Establish the preliminary scope and justification
- Assess organization’s readiness for a data warehouse initiative
- All of the given ✔
13. Improper documentation results in problem(s) such as:
- Maintenance issue
- New developers unable to configure already existing code
- Lot of time required for enhancing the code
- All of the given ✔
14. Which of the following is NOT one of the possible pitfalls in DWH Life Cycle & Development?
- Not having multiple servers
- Low priority for OLAP Cube Construction
- Improper documentation
- None ✔
15. Kimball’s iterative data warehouse development approach drew on decades of experience to develop the ___
- Business dimensional lifecycle ✔
- Data Warehouse dimension
- Business definition lifecycle
- OLAP Dimension
16. For a smooth DWH implementation we must be a technologist
- True
- False ✔
17. During the application specification activity, we also must give consideration to the organization of the applications
- True ✔
- False
18. The most recent attack is the ___ attack on the cotton crop during 2003-04, resulting in a loss of nearly 0.5 million bales
- Boll Worm ✔
- Purple Worm
- Blue Worm
- Cotton Worm
19. We must try to find the one access tool that will handle all the needs of our users
- True
- False ✔
20. As opposed to the outcome of classification, estimation deals with a ___ valued outcome
- Discrete
- Isolated
- Continuous ✔
- Distinct
21. To identify the ___ required we need to perform data profiling
- Degree of transformation ✔
- Complexity
- Cost
- Time
22. ___ in agriculture extension is that pest population beyond which the benefit of spraying outweighs its cost
- Profit Threshold Level
- Economic Threshold Level ✔
- Medicine Threshold Level
- None
23. People that design and build the data warehouse must be capable of working across the organization at all levels
- True ✔
- False
24. The ___ is only a small part in realizing the true business value buried within the mountain of data collected and stored within an organization’s business systems and operational databases
- Independence on technology
- Dependence on technology ✔
- None
25. Data Transformation Services (DTS) provide a set of ___ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity
- Tools ✔
- Documentations
- Guidelines
- Graphs
26. The goal of ___ is to look at as few blocks as possible to find the matching records
- Indexing ✔
- Partitioning
- Joining
- None
27. A dense index, if it fits into memory, costs only ___ disk I/O access to locate a record by a given key
- One ✔
- Two
- log n
- n
28. The key idea behind ___ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution
- Pipeline Parallelism ✔
- Overlapped parallelism
- Massive parallelism
- Distributed parallelism
29. Non-uniform distribution, when data is distributed across the processors, is called ___
- Skew in partition ✔
- Pipeline distribution
- Distributed distribution
- Uncontrolled distribution
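A small sketch of partition skew, assuming modulo partitioning across four hypothetical processors. The `skew_ratio` measure (largest partition divided by the average) is an illustrative choice, not a standard definition from the course:

```python
from collections import Counter

def partition_counts(keys, n_partitions):
    """Count how many rows land on each processor under modulo partitioning."""
    counts = Counter(k % n_partitions for k in keys)
    return [counts.get(p, 0) for p in range(n_partitions)]

def skew_ratio(counts):
    """Largest partition divided by the average partition size (1.0 means even)."""
    return max(counts) / (sum(counts) / len(counts))

uniform_keys = list(range(1000))
skewed_keys = [0] * 700 + list(range(300))   # one hot key dominates

uniform_counts = partition_counts(uniform_keys, 4)
skewed_counts = partition_counts(skewed_keys, 4)
print(uniform_counts, skew_ratio(uniform_counts))
print(skewed_counts, skew_ratio(skewed_counts))
```

With skew, the overloaded processor finishes last, so the whole parallel step runs at the speed of the largest partition.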
30. The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The smaller the portion of the program that must be executed ___, the greater the scalability of the computation
- Sequentially ✔
- In parallel
- Distributed
- None
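The relationship in this question is usually quantified with Amdahl's law (the question itself does not name it, so treat that link as an assumption). A short sketch showing how the achievable speedup on 64 hypothetical processors shrinks as the sequential fraction grows:

```python
def amdahl_speedup(sequential_fraction, processors):
    """Upper bound on speedup when a fixed fraction of the work stays serial."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)

for s in (0.5, 0.1, 0.01):
    # A smaller sequential fraction gives a larger speedup bound.
    print(s, round(amdahl_speedup(s, 64), 2))
```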
31. Data mining is a/an ___ approach, where browsing through data using data mining techniques may reveal something that might be of interest to the user as information that was unknown previously
- Exploratory ✔
- Non exploratory
- Computer science
- None
32. Data mining evolved as a mechanism to cater for the limitations of ___ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources etc.
- OLTP ✔
- OLAP
- DSS
- DWH
33. ___ is the technique in which existing heterogeneous segments are reshuffled and relocated into homogeneous segments.
- Clustering ✔
- Aggregation
- Segmentation
- Partitioning
34. To measure or quantify similarity or dissimilarity, different techniques are available. Which of the following options names the available techniques?
- Pearson correlation is the only technique
- Euclidean distance is the only technique
- Both Pearson correlation and Euclidean distance ✔
- None
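Both measures can be computed in a few lines. This is a plain-Python sketch on two hypothetical vectors; it assumes equal-length inputs with non-zero variance:

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length vectors (dissimilarity)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Linear correlation in [-1, 1] between two equal-length vectors (similarity)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]   # same shape as a, different scale
print(euclidean_distance(a, b))
print(pearson_correlation(a, b))
```

The two vectors are far apart by Euclidean distance yet perfectly correlated, which shows why the choice of measure matters for clustering.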
35. For a given data set, to get a global view in un-supervised learning we use
- One-way clustering ✔
- Bi-clustering
- Pearson correlation
- Euclidean distance
36. In a DWH project, it is ensured that the ___ environment is similar to the production environment
- Designing
- Development ✔
- Analysis
- Implementation
37. For a DWH project, the key requirements are ___ and product experience
- Tools
- Industry ✔
- Software
- None
38. Pipeline parallelism focuses on increasing throughput of task execution, NOT on ___ sub-task execution time
- Increasing
- Decreasing ✔
- Maintaining
- None
39. Many data warehouse project teams waste enormous amounts of time searching in vain for a ___
- Silver bullet ✔
- Golden bullet
- Suitable hardware
- Compatible product
40. Focusing on data warehouse delivery only often ends up in ___
- Rebuilding ✔
- Success
- Good stable product
- None
41. Pakistan is one of the five major ___ countries in the world
- Cotton-growing ✔
- Rice-growing
- Weapon producing
- Wheat production
42. ___ is a process which involves gathering information about columns through the execution of certain queries, with the intention of identifying erroneous records
- Data profiling ✔
- Data Anomaly Detection
- Record Duplicate Detection
- None
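A minimal sketch of column profiling, assuming the column has already been read into a Python list. In practice the same statistics would come from SQL queries such as COUNT, COUNT(DISTINCT ...), MIN and MAX; the `ages` column below is hypothetical:

```python
def profile_column(values):
    """Summarise one column: row count, nulls, distinct values, min and max."""
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }

# Hypothetical age column with missing values and one suspicious entry.
ages = [25, 31, None, 25, 240, 42, None]
profile = profile_column(ages)
print(profile)
```

An out-of-range maximum or a high null count in the profile points to records that need cleansing.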
43. ___ contributes to an under-utilization of valuable and expensive historical data, and inevitably results in a limited capability to provide decision support and analysis
- The lack of data integration and standardization ✔
- Data stored in heterogeneous sources
- Missing data
- OLAP
44. DTS allows us to connect through any data source or destination that is supported by ___
- OLE DB ✔
- OLAP
- OLTP
- Data Warehouse
45. Execution can be completed successfully or it may be stopped due to some error. In case of successful completion of execution all the transactions will be ___
- committed to the database ✔
- rolled back
- completed
- executed
46. If an error occurs, execution will be terminated abnormally and all transactions will be rolled back. In this case, when we access the database we will find it in the state it was in before the ___
- Execution of package ✔
- Creation of package
- Connection of package
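The commit-or-roll-back behaviour described in the two questions above can be illustrated with Python's built-in `sqlite3` module. This is only a stand-in for a DTS package (which runs on SQL Server); the `run_package` function and the `sales` table are hypothetical:

```python
import sqlite3

def run_package(conn, rows):
    """Load all rows as one unit of work: commit on success, roll back on error."""
    try:
        for row in rows:
            conn.execute("INSERT INTO sales (id, amount) VALUES (?, ?)", row)
        conn.commit()
        return True
    except sqlite3.Error:
        conn.rollback()   # database returns to its state before the package ran
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
conn.commit()

ok = run_package(conn, [(1, 10.0), (2, 20.0)])
failed = run_package(conn, [(3, 30.0), (1, 99.0)])   # duplicate id 1 raises an error
count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(ok, failed, count)
```

The row with id 3 from the failed run is not kept, because the whole package is rolled back together.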
47. To judge effectiveness we perform data profiling twice
- One before extraction and other after extraction
- One before transformation and the other after transformation ✔
- One before loading and the other after loading
48. ___, if it fits into memory, costs only one disk I/O access to locate a record by a given key
- A dense index ✔
- A sparse index
- An inverted index
- None
49. The purpose of the House of Quality technique is to reduce ___ types of risk
- Two ✔
- Three
- Four
- All
50. NUMA stands for ___
- Non-uniform Memory Access ✔
- Non-updatable Memory Architecture
- New Universal Memory Architecture
51. There are many variants of the traditional nested-loop join. If the index is built as part of the query plan and subsequently dropped, it is called
- Naïve nested-loop
- Index nested loop
- Temporary index nested-loop join ✔
- None
52. Investing years in architecture and forgetting the primary purpose of solving business problems results in an inefficient application. This is an example of the ___ mistake
- Extreme Technology Design
- Extreme Architecture Design
- None ✔
53. Classification consists of examining the properties of a newly presented observation and assigning it to a predefined ___
- Object
- Container
- Subject
- Class ✔
54. During business hours, most ___ systems should probably not use parallel execution
- OLAP
- DSS
- Data Mining
- OLTP ✔
55. In contrast to statistics, data mining is ___ driven
- Assumption
- Knowledge ✔
- Human
- Database
56. The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The ___ the portion of the program that must be executed sequentially, the greater the scalability of the computation.
- Larger
- Smaller ✔
- Unambiguous
- Superior
57. If every key in the data file is represented in the index file, then the index is a
- Dense index ✔
- Sparse index
- Inverted index
- None
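A toy sketch contrasting a dense index with a sparse one, assuming a hypothetical sorted data file split into blocks of three keys. The lookups return the block number that would have to be read from disk:

```python
import bisect

# Hypothetical data file: sorted keys stored in blocks of three records.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry for every key in the data file.
dense = {key: b for b, block in enumerate(blocks) for key in block}

# Sparse index: one entry per block, holding that block's first key.
sparse_keys = [block[0] for block in blocks]

def lookup_dense(key):
    """Return the block holding `key`, or None, using only the index."""
    return dense.get(key)

def lookup_sparse(key):
    """Find the candidate block from the index, then search inside that block."""
    pos = bisect.bisect_right(sparse_keys, key) - 1
    if pos < 0 or key not in blocks[pos]:
        return None
    return pos

print(len(dense), len(sparse_keys))   # index sizes: every key vs. one per block
print(lookup_dense(50), lookup_sparse(50))
```

The dense index answers "is the key present?" without touching the data file, at the price of a larger index.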
58. An optimized structure which is built primarily for retrieval, with update being only a secondary consideration is
- OLTP
- OLAP
- DSS
- Inverted index ✔
59. ___, if too big and does not fit into memory, will be expensive when used to find a record by a given key
- An inverted index
- A sparse index
- A dense index ✔
- None
60. Which of the following is not an activity of a Data Quality Analysis Project?
- Define
- Measure
- Analyze
- Compression ✔
61. Data mining uses ___ algorithms to discover patterns and regularities in data
- Mathematical
- Computational
- Statistical ✔
- None
62. In contrast to data mining, statistics is ___ driven
- Assumption ✔
- Knowledge
- Discovery
- Database
63. There are many variants of the traditional nested-loop join. When the entire table is scanned it is called
- Index nested-loop join
- Naïve nested-loop join ✔
- Temporary index nested-loop join
- None
64. Which of the following is not a technique of Data Mining?
- Estimation
- Prediction
- Clustering
- Normalization ✔
65. There are many variants of the traditional nested-loop join. If an index is exploited, it is called ___
- Naïve nested loop join
- Index nested loop join ✔
- Temporary index nested loop join
- None
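The indexed variants named in the nested-loop questions differ from the naive one only in how the inner table is searched. A hedged sketch, using a Python dict as a stand-in for a database index (the function name and sample rows are hypothetical):

```python
def index_nested_loop_join(outer, inner, key, index=None):
    """Join by probing an index on the inner table for each outer row.

    If no index is supplied, a temporary one is built for this call only,
    which mimics the temporary index nested-loop variant.
    """
    if index is None:
        index = {}
        for row in inner:
            index.setdefault(row[key], []).append(row)
    # One index probe per outer row instead of a full scan of the inner table.
    return [(o, i) for o in outer for i in index.get(o[key], [])]

employees = [
    {"dept": 1, "emp": "A"},
    {"dept": 2, "emp": "B"},
    {"dept": 3, "emp": "C"},
]
departments = [{"dept": 1, "name": "Sales"}, {"dept": 2, "name": "IT"}]

# Temporary index: built inside the call and discarded afterwards.
pairs_temp = index_nested_loop_join(employees, departments, "dept")

# Existing index: built once up front and reused across joins.
dept_index = {d["dept"]: [d] for d in departments}
pairs_indexed = index_nested_loop_join(employees, departments, "dept", dept_index)
print(len(pairs_temp), pairs_temp == pairs_indexed)
```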