CS614 Data Warehousing Quiz MCQs: Objective Questions, Midterm
1. Data warehouse stores ___
- Operational data
- Historical data ✔
- Meta data
- Log files data
2. The ___ dimension represents data correctness
- Free-of-error ✔
- Completeness
- Consistency
- Correctness
3. Which of the following is not a Data Quality Validation Technique?
- Referential Integrity
- Using Data Quality Rules
- Data Histogramming
- Indexes ✔
4. Which of the following is an example of Non-Additive Facts?
- Quantity sold
- Total Sale in Rs.
- Discount Percentage ✔
- Count of orders in a store
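To see why a discount percentage is non-additive, consider the rough sketch below. The rows, column layout, and function names are hypothetical and only illustrate the idea: quantities can be summed directly, while an overall percentage has to be recomputed from additive parts.

```python
# Hypothetical fact rows: (store, quantity_sold, gross_amount_rs, discount_amount_rs)
SALES = [
    ("A", 10, 1000.0, 100.0),   # 10% discount
    ("B", 90, 9000.0, 1800.0),  # 20% discount
]

def total_quantity(rows):
    """Quantity sold is additive: a plain sum across rows is meaningful."""
    return sum(r[1] for r in rows)

def naive_avg_discount_pct(rows):
    """Averaging per-row percentages ignores how much each row sold."""
    pcts = [100.0 * r[3] / r[2] for r in rows]
    return sum(pcts) / len(pcts)

def overall_discount_pct(rows):
    """Recompute the percentage from the additive amounts instead."""
    gross = sum(r[2] for r in rows)
    discount = sum(r[3] for r in rows)
    return 100.0 * discount / gross

print(total_quantity(SALES))          # additive fact
print(naive_avg_discount_pct(SALES))  # misleading figure
print(overall_discount_pct(SALES))    # weighted, correct figure
```

The two percentage figures differ because store B accounts for most of the sales, which the simple average does not reflect.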
5. Which is the most complex type of transformation in the following?
- Many-to-many element transformation ✔
- One-to-one scalar transformation
- One-to-many element transformation
- All of the given
6. Serious ___ involves decomposing and reassembling the data
- Data cleansing ✔
- Data transformation
- Data loading
- Data Extraction
7. ___ is the degree of utility and value the data has to support the enterprise processes that enable accomplishing enterprise objectives
- Intrinsic Data Quality
- Realistic Data Quality ✔
- Strong Data Quality
- Weak Data Quality
8. In a decision support system ease of use is achieved by:
- Normalization
- Denormalization ✔
- Drill up
- Drill down
9. Assume a company with a multi-million row customer table i.e. n rows. Checking for Referential Integrity (RI), using a smart technique with some kind of tree data structure would require ___ time
- O(log n) ✔
- O(n)
- O(1)
- None
10. Which of the following is NOT an example of a typical grain?
- Individual Transactions
- Daily aggregates
- Monthly aggregates
- Normalized attributes ✔
11. Most DWH implementations today do not use ___ enforced by the database, but as TQM methods improve overall data quality and database optimizers
- Consistency Integrity
- Referential Integrity ✔
- Attribute domain
- Using Data Quality Rules
12. Suppose in system A, the possible values of “Gender” attribute were “Male” & “Female”, however in data warehouse, the values stored were “M” for male and “F” for female. The above scenario is an example of:
- One-to-one scalar transformation ✔
- One-to-many element transformation
- Many-to-one element transformation
- Many-to-many element transformation
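The gender example in question 12 can be sketched as a simple lookup: one source value maps to exactly one target value. The dictionary and function name below are illustrative assumptions, not part of any particular ETL tool.

```python
# Hypothetical one-to-one scalar mapping from source values to warehouse codes.
GENDER_CODES = {"Male": "M", "Female": "F"}

def encode_gender(source_value):
    """Map a single source value to a single target code."""
    if source_value not in GENDER_CODES:
        # Unknown values are rejected rather than guessed.
        raise ValueError(f"unexpected gender value: {source_value!r}")
    return GENDER_CODES[source_value]

print([encode_gender(v) for v in ["Male", "Female", "Female"]])
```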
13. Development of data warehouse is hard because data sources are usually ___
- Structured and homogeneous
- Unstructured and heterogeneous ✔
- Structured and heterogeneous
- Unstructured and homogeneous
14. A/an ___ is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension
- Junk dimension ✔
- Slowly changing dimension
- Multi-valued dimension
- Simple dimensions
15. ROLAP provides access to information via a relational database using
- ANSI standard SQL ✔
- Proprietary file format
- Comma Separated Values
- All of the given
16. The typical availability of OLTP system is 24/7, while that of data warehouse is ___
- 6/12 ✔
- 7/12
- 1/24
- Twice a week
17. In ___ nested-loop join of quadratic time complexity does not hurt the performance
- Typical OLTP environments ✔
- Data warehouse
- DSS
- OLAP
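A minimal sketch of a nested-loop join shows where the quadratic cost comes from: every row of one table is compared with every row of the other. The table contents and the comparison counter are assumptions added for illustration. With the small tables typical of OLTP the count stays low, while for large warehouse tables it grows as the product of the two table sizes.

```python
def nested_loop_join(left, right, key):
    """Join two lists of dicts on `key`, counting row comparisons."""
    matches = []
    comparisons = 0
    for l_row in left:            # outer loop over the first table
        for r_row in right:       # inner loop over the second table
            comparisons += 1
            if l_row[key] == r_row[key]:
                matches.append({**l_row, **r_row})
    return matches, comparisons

# Hypothetical sample tables.
customers = [{"cust_id": 1, "name": "Ali"}, {"cust_id": 2, "name": "Sara"}]
orders = [
    {"cust_id": 1, "order_no": 10},
    {"cust_id": 2, "order_no": 11},
    {"cust_id": 1, "order_no": 12},
]

rows, cost = nested_loop_join(customers, orders, "cust_id")
print(len(rows), cost)
```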
18. If actual data structure does not conform to documented formats then it is called:
- Syntactically dirty data ✔
- Semantically dirty data
- Coverage anomaly
- Extraction issue
19. Which of the following is not a CUBE operation?
- ANSI SQL ✔
- Roll up
- Drill down
- Pivoting
20. The data has to be checked, cleansed and transformed into a ___ format to allow easy and fast access
- unified ✔
- predicted
- qualified
- proactive
21. Which of the following is not a characteristic of data quality?
- Reliability ✔
- Uniqueness
- Accessibility
- Consistency
22. The extent to which data is in appropriate languages, symbols, and units, and the definitions are clear is known as ___
- Interpretability ✔
- Uniqueness
- Accessibility
- Consistency
23. When there are multiple sources for the same data element, we need to prioritize the source systems on a per-element basis. This process is called
- Ranking ✔
- Prioritization
- Element Selection
- Measurement Selection
24. In OLTP environments, the size of tables is relatively ___
- Large
- Fixed
- Moderate
- Small ✔
25. Change Data Capture (CDC) can be a challenging task because
- Aggregates don’t change in real time
- Transformation of extracted data is difficult
- Identifying the recently modified data may be difficult ✔
- Source systems may not support extraction of changed aggregates
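One common way to identify recently modified data is to compare a modification timestamp against the time of the last extract. The sketch below assumes every source row carries a `last_modified` field, which is a hypothetical column name; many source systems have no such column, which is part of why CDC can be difficult.

```python
from datetime import datetime

def changed_since(rows, last_extract):
    """Return rows modified after the previous extract time."""
    return [r for r in rows if r["last_modified"] > last_extract]

# Hypothetical source rows.
source_rows = [
    {"id": 1, "last_modified": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "last_modified": datetime(2024, 1, 3, 14, 30)},
    {"id": 3, "last_modified": datetime(2024, 1, 5, 8, 15)},
]

last_extract = datetime(2024, 1, 2, 0, 0)
print([r["id"] for r in changed_since(source_rows, last_extract)])
```

This approach only detects inserts and updates. Deleted rows leave no timestamp behind and would need a different technique.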
26. ___ is the extent to which data is regarded as true and credible
- Believability ✔
- Completeness
- Accessibility
- Consistency
27. The relation R will be in 2nd Normal Form if
- It is in 1NF and each cell contains single value
- It is in 1NF and each non-key attribute is dependent upon the entire primary key ✔
- It is in 1NF and each non-key attribute is dependent upon a single column of a composite primary key
- It is in 1NF and Primary key is composite
28. ___ is the degree to which data accurately reflects the real-world object that the data represents
- Intrinsic Data Quality ✔
- Realistic Data Quality
- Strong Data Quality
- Weak Data Quality
29. Web scraping is a process of applying ___ techniques to the web
- Screen scraping ✔
- Data scraping
- Text scraping
- Meta scraping
30. In which class of aggregates can the AVERAGE function be placed?
- Algebraic ✔
- Distributive
- Associative
- Holistic
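AVERAGE is usually classed as algebraic because it can be computed from a fixed number of distributive sub-aggregates, here SUM and COUNT. The sketch below uses hypothetical helper names to show partial results from two partitions being merged and then finalized.

```python
def partial(values):
    """Distributive sub-aggregates for one partition: (sum, count)."""
    return (sum(values), len(values))

def merge(a, b):
    """Combine two partial results; both components simply add."""
    return (a[0] + b[0], a[1] + b[1])

def finalize(p):
    """Derive the average from the merged sum and count."""
    total, count = p
    return total / count

part1 = partial([1, 2, 3])
part2 = partial([4, 5])
print(finalize(merge(part1, part2)))

# Averaging the two partition averages gives a different number,
# because the partitions have different sizes.
print((finalize(part1) + finalize(part2)) / 2)
```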
31. Which of the following is not an “Orr’s Law of Data Quality”?
- “Data that is not used cannot be correct!”
- “Data quality is a function of its use, not its collection”
- “Data will be no better than its most stringent use!”
- “Data duplication can be harmful for the organization!” ✔
32. The ___ operator proves useful in more complex metrics applicable to the dimensions of timeliness and accessibility
- Max ✔
- Min
- Min or Max
- None
33. Which of the following is not a Data Quality Validation Technique?
- Consistency Integrity ✔
- Referential Integrity
- Attribute Domain
- Using Data Quality Rules
34. Assume a company with a multi-million row customer table i.e. n rows. Checking for Referential Integrity (RI) using a naïve approach would take ___ time.
- O(n) ✔
- O(1)
- O(log n)
- None
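Questions 9 and 34 contrast two ways of checking whether a foreign key exists in a parent table. The sketch below is only an illustration under stated assumptions: a sorted list with binary search stands in for the tree data structure, and all function names are hypothetical. The naive lookup scans up to n keys, while the sorted lookup needs about log n steps.

```python
from bisect import bisect_left

def exists_naive(parent_keys, key):
    """Linear scan: up to n comparisons per lookup."""
    for k in parent_keys:
        if k == key:
            return True
    return False

def exists_sorted(sorted_parent_keys, key):
    """Binary search on sorted keys: about log2(n) comparisons per lookup."""
    i = bisect_left(sorted_parent_keys, key)
    return i < len(sorted_parent_keys) and sorted_parent_keys[i] == key

def find_orphans(child_fks, parent_keys):
    """Return foreign-key values with no matching parent key."""
    sorted_keys = sorted(parent_keys)
    return [fk for fk in child_fks if not exists_sorted(sorted_keys, fk)]

# Hypothetical customer keys and order foreign keys.
customer_ids = [5, 1, 9, 3, 7]
order_customer_ids = [1, 2, 7, 10]
print(find_orphans(order_customer_ids, customer_ids))
```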
35. ___ breaks a table into multiple tables based upon common column values
- Horizontal splitting ✔
- Vertical splitting
- Both
- None of these
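Horizontal splitting can be pictured as grouping rows by the value of a chosen column, so that each distinct value gets its own smaller table with the same columns. The rows and the `region` column below are hypothetical.

```python
from collections import defaultdict

def split_horizontally(rows, column):
    """Partition rows into separate tables keyed by the value of `column`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[column]].append(row)
    return dict(parts)

# Hypothetical sales rows.
sales_rows = [
    {"region": "North", "amount": 100},
    {"region": "South", "amount": 250},
    {"region": "North", "amount": 75},
]

tables = split_horizontally(sales_rows, "region")
print(sorted(tables))
print(len(tables["North"]), len(tables["South"]))
```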
36. Companies collect and record their own operational data, but at the same time they also use reference data obtained from ___ sources such as codes, prices etc.
- Operational
- None
- Internal
- External ✔
37. Ad-hoc access means running queries that are already known
- True
- False ✔
38. Relational databases allow you to navigate the data in ___ that is appropriate, using the primary key and foreign key structure within the data model.
- Only one direction
- Any direction ✔
- Two directions
- None
39. DSS queries do not involve a primary key
- True ✔
- False
40. The need to synchronize data upon update is called
- Data Manipulation
- Data Replication
- Data Coherency ✔
- Data imitation
41. Taken jointly, the extract programs or naturally evolving systems formed a spider web, also known as
- Distributed Systems Architecture
- Legacy Systems Architecture ✔
- Online Systems Architecture
- Intranet Systems Architecture
42. A node of a B-tree is stored in a memory block, and traversing a B-tree involves ___ page faults
- O(n)
- O(n^2)
- O(n log n)
- O(log n) ✔
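A rough way to see the logarithmic bound is to count how many levels a tree needs before it can address n keys. The sketch below is a simplification that assumes every node is full and has the same fanout; the function name and the sample numbers are illustrative only. Each level visited corresponds to at most one block read, so the level count approximates the number of page faults for one lookup.

```python
def btree_levels(n_keys, fanout):
    """Smallest number of levels such that fanout ** levels >= n_keys."""
    levels = 1
    capacity = fanout
    while capacity < n_keys:
        capacity *= fanout   # each extra level multiplies reach by the fanout
        levels += 1
    return levels

# Hypothetical table sizes with an assumed fanout of 100 keys per node.
for n in (100, 10_000, 1_000_000):
    print(n, btree_levels(n, 100))
```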
43. Which statement is true for De-Normalization?
- Redundant data is a performance liability at query time, but is a performance benefit at update time
- Redundant data is a performance liability at both query time and update time
- Redundant data is a performance benefit at both query time and update time
- Redundant data is a performance benefit at query time, but is a performance liability at update time ✔
44. De-normalization normally speeds up
- Data Retrieval ✔
- Data modification
- Development cycle
- Data replication
45. In horizontal splitting, we split a relation into multiple tables on the basis of
- Common column values ✔
- Common row values
- Different index values
- Value resulted by ad-hoc query
46. For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This gives a total view of the organization from the:
- Owner’s perspective
- Customer’s perspective ✔
- Decision Maker’s perspective
- Employee’s Perspective
47. A data warehouse may include
- Legacy systems ✔
- Only internal data sources
- Privacy restrictions
- Small data mart
48. Multidimensional databases typically use proprietary ___ format to store pre-summarized cube structures
- File ✔
- Application
- Aggregate
- Database
49. All data is ___ of something real
- I. An abstraction
- II. A representation
Which of the following options is true?
- I only ✔
- II only
- Both I and II
- None