Hortonworks Hadoop 2.0 Certification exam for Pig and Hive Developer Sample Questions:
1. In Hadoop 2.0, which TWO of the following processes work together to provide automatic failover of the NameNode? Choose 2 answers
A) JournalNode
B) QuorumManager
C) ZooKeeper
D) ZKFailoverController
2. You have the following key-value pairs as output from your Map task:
(the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1)
How many keys will be passed to the Reducer's reduce method?
A) Four
B) One
C) Six
D) Five
E) Two
F) Three
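The shuffle/sort phase groups map output by key before it reaches the reducer, so the reduce method is invoked once per distinct key. A minimal Python sketch of that grouping (an illustration of the concept, not actual Hadoop code):

```python
from collections import defaultdict

# The six map-output pairs from the question
map_output = [("the", 1), ("fox", 1), ("faster", 1),
              ("than", 1), ("the", 1), ("dog", 1)]

# Shuffle/sort: group all values under their key
grouped = defaultdict(list)
for key, value in map_output:
    grouped[key].append(value)

# One reduce() call per distinct key -> five keys
print(len(grouped))      # 5
print(grouped["the"])    # [1, 1] -- duplicates collapse into one key
```

Six pairs arrive, but "the" appears twice, so only five distinct keys reach reduce.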
3. You've written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers, which is a potential bottleneck. A custom implementation of which interface is most likely to reduce the amount of intermediate data transferred across the network?
A) Partitioner
B) WritableComparable
C) InputFormat
D) OutputFormat
E) Writable
F) Combiner
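A Combiner performs local aggregation on each mapper's output before the shuffle, so pairs sharing a key are collapsed into one pair per key per mapper and far less data crosses the network. A hedged Python sketch of the effect (illustrative only; in Hadoop a Combiner is written like a Reducer and set via `job.setCombinerClass`):

```python
from collections import Counter

# Intermediate pairs emitted by one mapper
mapper_output = [("the", 1), ("fox", 1), ("the", 1),
                 ("the", 1), ("dog", 1)]

def combine(pairs):
    """Map-side aggregation: sum values per key locally,
    so fewer pairs are shipped to the reducers."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return sorted(counts.items())

combined = combine(mapper_output)
print(combined)                                 # [('dog', 1), ('fox', 1), ('the', 3)]
print(len(mapper_output), "->", len(combined))  # 5 -> 3 pairs transferred
```

For word count the combiner logic is identical to the reducer's, which is why skewed, repetitive keys benefit so much from it.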
4. Examine the following Hive statements:
Assuming the statements above execute successfully, which one of the following statements is true?
A) The contents of File1 are parsed as comma-delimited rows and stored in a database
B) The file named File1 is moved to /user/joe/x/
C) Hive reformats File1 into a structure that Hive can access and moves it into /user/joe/x/
D) The contents of File1 are parsed as comma-delimited rows and loaded into /user/joe/x/
5. Which of the following tools was designed to import data from a relational database into HDFS?
A) HCatalog
B) Sqoop
C) Ambari
D) Flume
Solutions:
Question # 1 Answer: C,D | Question # 2 Answer: D | Question # 3 Answer: F | Question # 4 Answer: B | Question # 5 Answer: B |