CCA-500 Exam - Cloudera Certified Administrator for Apache Hadoop (CCAH)

certleader.com


Online CCA-500 free questions and answers of New Version:

NEW QUESTION 1
Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

  • A. Ingest with Hadoop streaming
  • B. Ingest using Hive’s LOAD DATA command
  • C. Ingest with sqoop import
  • D. Ingest with Pig’s LOAD command
  • E. Ingest using the HDFS put command

Answer: C

NEW QUESTION 2
Which process instantiates user code, and executes map and reduce tasks on a cluster running MapReduce v2 (MRv2) on YARN?

  • A. NodeManager
  • B. ApplicationMaster
  • C. TaskTracker
  • D. JobTracker
  • E. NameNode
  • F. DataNode
  • G. ResourceManager

Answer: A

NEW QUESTION 3
A slave node in your cluster has four 2 TB hard drives installed (4 x 2TB). The DataNode is configured to store HDFS blocks on all disks. You set the value of the dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?

  • A. 25GB on each hard drive may not be used to store HDFS blocks
  • B. 100GB on each hard drive may not be used to store HDFS blocks
  • C. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node
  • D. A maximum of 100 GB on each hard drive may be used to store HDFS blocks

Answer: B
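
The arithmetic behind this answer can be sketched as follows. This is a minimal illustration, assuming the reservation applies per volume (as answer B states), decimal units (1 TB = 1000 GB), and no filesystem overhead; the function name hdfs_usable_gb is hypothetical.

```python
# Sketch: space left for HDFS blocks when dfs.datanode.du.reserved
# is applied to each volume (disk) separately.
# Assumptions: decimal units, no filesystem overhead.

def hdfs_usable_gb(disks: int, capacity_gb: int, reserved_gb: int) -> tuple[int, int]:
    """Return (usable GB per disk, usable GB per node)."""
    per_disk = capacity_gb - reserved_gb
    return per_disk, per_disk * disks

per_disk, per_node = hdfs_usable_gb(disks=4, capacity_gb=2000, reserved_gb=100)
print(f"usable per disk: {per_disk} GB, usable per node: {per_node} GB")
print(f"total reserved on the node: {4 * 100} GB")
```

Under these assumptions the node keeps 400 GB in total out of HDFS use, not 100 GB.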

NEW QUESTION 4
You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128MB for all new files written to the cluster after upgrade. What should you do?

  • A. You cannot enforce this, since client code can always override this value
  • B. Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final
  • C. Set dfs.block.size to 128 M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode
  • D. Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final
  • E. Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode

Answer: C
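
The two spellings of the block size in the options describe the same quantity. A minimal sketch of the conversion, assuming binary megabytes (1 MB = 1024 x 1024 bytes); the helper name mb_to_bytes is hypothetical.

```python
# Sketch: convert a block size in megabytes to the byte value
# that appears in the options (assumes 1 MB = 1024 * 1024 bytes).

def mb_to_bytes(megabytes: int) -> int:
    """Return the number of bytes in the given number of binary megabytes."""
    return megabytes * 1024 * 1024

block_size_bytes = mb_to_bytes(128)
print(block_size_bytes)
```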

NEW QUESTION 5
Which YARN daemon or service monitors a Container’s per-application resource usage (e.g., memory, CPU)?

  • A. ApplicationMaster
  • B. NodeManager
  • C. ApplicationManagerService
  • D. ResourceManager

Answer: A

NEW QUESTION 6
You have recently converted your Hadoop cluster from a MapReduce 1 (MRv1) architecture to a MapReduce 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the number of map and reduce tasks (resource allocation) when they run jobs. A developer wants to know how to specify the number of reduce tasks when a specific job runs. Which method should you tell that developer to implement?

  • A. MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of “tasks” into memory and virtual cores, thus eliminating the need for a developer to specify the number of reduce tasks, and indeed preventing the developer from specifying the number of reduce tasks.
  • B. In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB. Thus, they should specify the amount of memory resource they need by executing –D mapreduce-reduces.memory-mb-2048
  • C. In YARN, the ApplicationMaster is responsible for requesting the resource required for a specific launch. Thus, executing –D yarn.applicationmaster.reduce.tasks=2 will specify that the ApplicationMaster launch two task containers on the worker nodes.
  • D. Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1) and MapReduce version 2 (MRv2) on YARN. Thus, executing –D mapreduce.job.reduces=2 will specify reduce tasks.
  • E. In YARN, resource allocation is a function of virtual cores specified by the ApplicationManager making requests to the NodeManager where a reduce task is handled by a single container (and thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the NodeManager by executing –p yarn.nodemanager.cpu-vcores=2

Answer: D

NEW QUESTION 7
You observed that the number of spilled records from Map tasks far exceeds the number of map output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?

  • A. For a 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O
  • B. Increase the io.sort.mb to 1GB
  • C. Decrease the io.sort.mb value to 0
  • D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close to equals) the number of map output records.

Answer: D

NEW QUESTION 8
You are migrating a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) on YARN. You want to maintain your MRv1 TaskTracker slot capacities when you migrate. What should you do?

  • A. Configure yarn.applicationmaster.resource.memory-mb and yarn.applicationmaster.resource.cpu-vcores so that ApplicationMaster container allocations match the capacity you require.
  • B. You don’t need to configure or balance these properties in YARN as YARN dynamically balances resource management capabilities on your cluster
  • C. Configure mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum in yarn-site.xml to match your cluster’s capacity set by the yarn-scheduler.minimum-allocation
  • D. Configure yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores to match the capacity you require under YARN for each NodeManager

Answer: D

NEW QUESTION 9
Assume you have a file named foo.txt in your local directory. You issue the following three commands:
hadoop fs -mkdir input
hadoop fs -put foo.txt input/foo.txt
hadoop fs -put foo.txt input
What happens when you issue the third command?

  • A. The write succeeds, overwriting foo.txt in HDFS with no warning
  • B. The file is uploaded and stored as a plain file named input
  • C. You get a warning that foo.txt is being overwritten
  • D. You get an error message telling you that foo.txt already exists, and asking you if you would like to overwrite it.
  • E. You get an error message telling you that foo.txt already exists. The file is not written to HDFS
  • F. You get an error message telling you that input is not a directory
  • G. The write silently fails

Answer: CE

NEW QUESTION 10
Your cluster’s mapred-site.xml includes the following parameters:
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
<name>mapreduce.reduce.memory.mb</name>
<value>8192</value>
And your cluster’s yarn-site.xml includes the following parameter:
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
What is the maximum amount of virtual memory allocated for each map task before YARN will kill its Container?

  • A. 4 GB
  • B. 17.2 GB
  • C. 8.9 GB
  • D. 8.2 GB
  • E. 24.6 GB

Answer: D
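
The limit follows from multiplying the container's physical memory by yarn.nodemanager.vmem-pmem-ratio. A minimal sketch of that formula using the values from the question; the function name vmem_limit_mb is hypothetical, and the GiB conversion assumes 1 GiB = 1024 MB.

```python
# Sketch: virtual memory ceiling for a YARN container.
# limit = physical container memory * vmem-pmem ratio

def vmem_limit_mb(container_mb: int, vmem_pmem_ratio: float) -> float:
    """Return the virtual memory limit in MB for a container."""
    return container_mb * vmem_pmem_ratio

map_limit_mb = vmem_limit_mb(4096, 2.1)      # mapreduce.map.memory.mb
reduce_limit_mb = vmem_limit_mb(8192, 2.1)   # mapreduce.reduce.memory.mb

print(f"map:    {map_limit_mb:.1f} MB ({map_limit_mb / 1024:.1f} GiB)")
print(f"reduce: {reduce_limit_mb:.1f} MB ({reduce_limit_mb / 1024:.1f} GiB)")
```

For the map task this gives roughly 8.4 GiB. None of the listed options matches that figure exactly, so the sketch illustrates the formula rather than confirming the answer key.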

NEW QUESTION 11
Which two features does Kerberos security add to a Hadoop cluster? (Choose two)

  • A. User authentication on all remote procedure calls (RPCs)
  • B. Encryption for data during transfer between the Mappers and Reducers
  • C. Encryption for data on disk (“at rest”)
  • D. Authentication for user access to the cluster against a central server
  • E. Root access to the cluster for users hdfs and mapred but non-root access for clients

Answer: AD

NEW QUESTION 12
You are running a Hadoop cluster with all monitoring facilities properly configured. Which scenario will go undetected?

  • A. HDFS is almost full
  • B. The NameNode goes down
  • C. A DataNode is disconnected from the cluster
  • D. Map or reduce tasks that are stuck in an infinite loop
  • E. MapReduce jobs are causing excessive memory swaps

Answer: B

NEW QUESTION 13
What does CDH packaging do on install to facilitate Kerberos security setup?

  • A. Automatically configures permissions for log files at $MAPRED_LOG_DIR/userlogs
  • B. Creates users for hdfs and mapreduce to facilitate role assignment
  • C. Creates directories for temp, hdfs, and mapreduce with the correct permissions
  • D. Creates a set of pre-configured Kerberos keytab files and their permissions
  • E. Creates and configures your kdc with default cluster values

Answer: B

NEW QUESTION 14
Your cluster has the following characteristics:
✑ A rack aware topology is configured and on
✑ Replication is set to 3
✑ Cluster block size is set to 64MB
Which describes the file read process when a client application connects into the cluster and requests a 50MB file?

  • A. The client queries the NameNode for the locations of the block, and reads all three copies. The first copy to complete transfer to the client is the one the client reads as part of Hadoop’s speculative execution framework.
  • B. The client queries the NameNode for the locations of the block, and reads from the first location in the list it receives.
  • C. The client queries the NameNode for the locations of the block, and reads from a random location in the list it receives to eliminate network I/O loads by balancing which nodes it retrieves data from at any given time.
  • D. The client queries the NameNode, which retrieves the block from the nearest DataNode to the client and then passes that block back to the client.

Answer: B

NEW QUESTION 15
Each node in your Hadoop cluster, running YARN, has 64GB memory and 24 cores. Your yarn-site.xml has the following configuration:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>32768</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>12</value>
</property>
You want YARN to launch no more than 16 containers per node. What should you do?

  • A. Modify yarn-site.xml with the following property:<name>yarn.scheduler.minimum-allocation-mb</name><value>2048</value>
  • B. Modify yarn-site.xml with the following property:<name>yarn.scheduler.minimum-allocation-mb</name><value>4096</value>
  • C. Modify yarn-site.xml with the following property:<name>yarn.nodemanager.resource.cpu-vcores</name>
  • D. No action is needed: YARN’s dynamic resource allocation automatically optimizes the node memory and cores

Answer: A
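
The container cap comes from dividing the memory a NodeManager offers by the smallest allocation the scheduler will grant. A minimal sketch, assuming memory is the only limiting resource and every container requests exactly the minimum allocation; the function name max_containers is hypothetical.

```python
# Sketch: upper bound on containers per node when memory is the limit.
# Assumption: each container is granted at least
# yarn.scheduler.minimum-allocation-mb, and vcores are not the bottleneck.

def max_containers(node_memory_mb: int, min_allocation_mb: int) -> int:
    """Return the largest number of minimum-size containers that fit."""
    return node_memory_mb // min_allocation_mb

NODE_MEMORY_MB = 32768  # yarn.nodemanager.resource.memory-mb from the question

for min_alloc in (2048, 4096):
    print(min_alloc, "->", max_containers(NODE_MEMORY_MB, min_alloc), "containers")
```

With a 2048 MB minimum the bound is 16 containers; with 4096 MB it drops to 8.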

NEW QUESTION 16
You have just run a MapReduce job to filter user messages to only those of a selected geographical region. The output for this job is in a directory named westUsers, located just below your home directory in HDFS. Which command gathers these into a single file on your local file system?

  • A. hadoop fs -getmerge -R westUsers.txt
  • B. hadoop fs -getmerge westUsers westUsers.txt
  • C. hadoop fs -cp westUsers/* westUsers.txt
  • D. hadoop fs -get westUsers westUsers.txt

Answer: B

NEW QUESTION 17
You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from which clients submit jobs. What do you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway machine?

  • A. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster, and the impala shell on your gateway machine
  • B. Install the impalad daemon, the statestored daemon, the catalogd daemon, and the impala shell on your gateway machine
  • C. Install the impalad daemon and the impala shell on your gateway machine, and the statestored daemon and catalogd daemon on one of the nodes in the cluster
  • D. Install the impalad daemon on each machine in the cluster, the statestored daemon and catalogd daemon on one machine in the cluster, and the impala shell on your gateway machine
  • E. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster and on the gateway node

Answer: D

NEW QUESTION 18
Why should you run the HDFS balancer periodically? (Choose three)

  • A. To ensure that there is capacity in HDFS for additional data
  • B. To ensure that all blocks in the cluster are 128MB in size
  • C. To help HDFS deliver consistent performance under heavy loads
  • D. To ensure that there is consistent disk utilization across the DataNodes
  • E. To improve data locality for MapReduce

Answer: CDE

Explanation: http://www.quora.com/Apache-Hadoop/It-is-recommended-that-you-run-the-HDFS-balancer-periodically-Why-Choose-3
