AWS-Certified-Big-Data-Specialty Exam - Amazon AWS Certified Big Data - Speciality

certleader.com

It is faster and easier to pass the Amazon AWS-Certified-Big-Data-Specialty exam by using practical Amazon AWS Certified Big Data - Speciality questions and answers. Get immediate access to the updated AWS-Certified-Big-Data-Specialty exam, find the same core-area AWS-Certified-Big-Data-Specialty questions with professionally verified answers, and pass your exam with a high score.

Check AWS-Certified-Big-Data-Specialty free dumps before getting the full version:

NEW QUESTION 1
A company generates a large number of files each month and needs to use AWS Import/Export to move these files into Amazon S3 storage. To satisfy the auditors, the company needs to keep a record of which files were imported into Amazon S3.
What is a low-cost way to create a unique log for each import job?

  • A. Use the same log file prefix in the import/export manifest files to create a versioned log file in Amazon S3 for all imports
  • B. Use the log file prefix in the import/export manifest file to create a unique log file in Amazon S3 for each import
  • C. Use the log file checksum in the import/export manifest file to create a log file in Amazon S3 for each import
  • D. Use a script to iterate over files in Amazon S3 to generate a log after each import/export job

Answer: B
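
Explanation:
The mechanism behind answer B is the logPrefix field of the Import/Export manifest. The sketch below is hedged: logBucket and logPrefix are real manifest fields, but the other values are illustrative, a complete manifest needs more fields (such as a return address), and the legacy importexport client only works where the now-retired disk service was offered.

```python
import boto3

# Illustrative manifest fragment: a unique logPrefix per job gives each
# import its own log object in Amazon S3 instead of a shared, versioned one.
manifest = """\
manifestVersion: 2.0
deviceId: TEST-DEVICE-01
bucket: example-import-bucket
logBucket: example-import-bucket
logPrefix: import-logs/2016-01-job/
"""

client = boto3.client("importexport")  # legacy AWS Import/Export (disk) API
# ValidateOnly=True checks the manifest without creating a real job.
response = client.create_job(JobType="Import", Manifest=manifest, ValidateOnly=True)
print(response)
```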

NEW QUESTION 2
An Amazon EMR cluster using EMRFS has access to megabytes of data on Amazon S3, originating from multiple unique data sources. The customer needs to query common fields across some of the data sets to perform interactive joins and then display results quickly.
Which technology is most appropriate to enable this capability?

  • A. Presto
  • B. MicroStrategy
  • C. Pig
  • D. R Studio

Answer: A

NEW QUESTION 3
You have a large number of web servers in an Auto Scaling group behind a load balancer. On an hourly basis, you want to filter and process the logs to collect data on unique visitors, and then put that data in a durable data store in order to run reports. Web servers in the Auto Scaling group are constantly launching and terminating based on your scaling policies, but you do not want to lose any of the log data from these servers during a stop/termination initiated by a user or by Auto Scaling. What two approaches will meet these requirements? Choose 2 answers

  • A. Install an Amazon CloudWatch Logs agent on every web server during the bootstrap process. Create a CloudWatch log group and define metric filters to create custom metrics that track unique visitors from the streaming web server logs. Create a scheduled task on an Amazon EC2 instance that runs every hour to generate a new report based on the CloudWatch custom metrics
  • B. On the web servers, create a scheduled task that executes a script to rotate and transmit the logs to Amazon Glacier. Ensure that the operating system shutdown procedure triggers a logs transmission when the Amazon EC2 instance is stopped/terminated. Use AWS Data Pipeline to process the data in Amazon Glacier and run reports every hour
  • C. On the web servers, create a scheduled task that executes a script to rotate and transmit the logs to an Amazon S3 bucket. Ensure that the operating system shutdown process triggers a logs transmission when the Amazon EC2 instance is stopped/terminated. Use AWS Data Pipeline to move log data from the Amazon S3 bucket to Amazon Redshift in order to process and run reports every hour
  • D. Install an AWS Data Pipeline Logs agent on every web server during the bootstrap process. Create a log group object in AWS Data Pipeline, and define metric filters to move processed log data directly from the web servers to Amazon Redshift and run reports every hour

Answer: AC
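
Explanation:
To make answer A's log group and metric filter concrete, here is a minimal boto3 sketch; the group name, filter pattern, and metric names are hypothetical stand-ins, not values from the question.

```python
import boto3

logs = boto3.client("logs")

# Log group that the CloudWatch Logs agent on each web server writes into.
logs.create_log_group(logGroupName="/webapp/access-logs")

# Metric filter that turns matching access-log events into a custom metric,
# which the hourly report job can then read back from CloudWatch.
logs.put_metric_filter(
    logGroupName="/webapp/access-logs",
    filterName="VisitorRequests",
    filterPattern="[ip, identity, user, timestamp, request, status, bytes]",
    metricTransformations=[{
        "metricName": "VisitorRequests",
        "metricNamespace": "WebApp",
        "metricValue": "1",
    }],
)
```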

NEW QUESTION 4
An Administrator needs to design the event log storage architecture for events from mobile devices.
The event data will be processed by an Amazon EMR cluster daily for aggregated reporting and analytics before being archived.
How should the administrator recommend storing the log data?

  • A. Create an Amazon S3 bucket and write log data into folders by device. Execute the EMR job on the device folders
  • B. Create an Amazon DynamoDB table partitioned on the device and sorted on date, and write log data to the table. Execute the EMR job on the Amazon DynamoDB table
  • C. Create an Amazon S3 bucket and write data into folders by day. Execute the EMR job on the daily folder
  • D. Create an Amazon DynamoDB table partitioned on EventID, and write log data to the table. Execute the EMR job on the table

Answer: C
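
Explanation:
Answer C's daily folders are nothing more than date-based S3 key prefixes, so each day's EMR run can target a single prefix. A minimal sketch with an assumed bucket name:

```python
import boto3
from datetime import datetime, timezone

s3 = boto3.client("s3")

# Keys such as logs/2016/03/14/events-001.log let the daily EMR job read
# exactly one folder: s3://example-log-bucket/logs/2016/03/14/
prefix = datetime.now(timezone.utc).strftime("logs/%Y/%m/%d/")
s3.put_object(
    Bucket="example-log-bucket",
    Key=prefix + "events-001.log",
    Body=b'{"device": "abc123", "event": "app_open"}\n',
)
```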

NEW QUESTION 5
When attached to an Amazon VPC, which two components provide connectivity to external networks? Choose 2 answers

  • A. Elastic IPs (EIP)
  • B. NAT Gateway (NAT)
  • C. Internet Gateway (IGW)
  • D. Virtual Private Gateway (VGW)

Answer: CD
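
Explanation:
As a reference point for answer C, attaching an Internet Gateway is two calls plus a default route; the VPC and route table IDs below are hypothetical. A Virtual Private Gateway (answer D) follows the same pattern via create_vpn_gateway and attach_vpn_gateway.

```python
import boto3

ec2 = boto3.client("ec2")

# Create the Internet Gateway, attach it to the VPC, then send
# internet-bound traffic (0.0.0.0/0) through it.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId="vpc-1234567890abcdef0")
ec2.create_route(
    RouteTableId="rtb-1234567890abcdef0",
    DestinationCidrBlock="0.0.0.0/0",
    GatewayId=igw_id,
)
```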

NEW QUESTION 6
You are deploying an application to collect votes for a very popular television show. Millions of users
will submit votes using mobile devices. The votes must be collected into a durable, scalable, and highly available data store for real-time public tabulation. Which service should you use?

  • A. Amazon DynamoDB
  • B. Amazon Redshift
  • C. Amazon Kinesis
  • D. Amazon Simple Queue Service

Answer: C
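
Explanation:
Kinesis fits because each vote is a small record on a high-throughput stream. A hedged sketch of the producer side, with stream and field names invented:

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# One record per vote; using the voter ID as the partition key spreads
# millions of concurrent writes across the stream's shards.
kinesis.put_record(
    StreamName="tv-show-votes",
    Data=json.dumps({"voter": "user-42", "choice": "contestant-7"}).encode(),
    PartitionKey="user-42",
)
```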

NEW QUESTION 7
You have an application running on an Amazon Elastic Compute Cloud instance that uploads 5 GB video objects to Amazon Simple Storage Service (S3). Video uploads are taking longer than expected, resulting in poor application performance. Which method will help improve the performance of your application?

  • A. Enable enhanced networking
  • B. Use Amazon S3 multipart upload
  • C. Leveraging Amazon CloudFront, use the HTTP POST method to reduce latency.
  • D. Use Amazon Elastic Block Store Provisioned IOPS and an Amazon EBS-optimized instance

Answer: B
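
Explanation:
With boto3, multipart upload is controlled through TransferConfig: parts upload in parallel, and a failed part retries alone instead of restarting the whole 5 GB object. The thresholds below are illustrative choices, not requirements.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # use multipart above 64 MB
    multipart_chunksize=64 * 1024 * 1024,  # 64 MB parts
    max_concurrency=8,                     # parts uploaded in parallel
)
s3.upload_file("video.mp4", "example-video-bucket", "uploads/video.mp4",
               Config=config)
```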

NEW QUESTION 8
A company needs to deploy services to an AWS region which they have not previously used. The company currently has an AWS Identity and Access Management (IAM) role for their Amazon EC2 instances, which permits the instances to access Amazon DynamoDB. The company wants their EC2 instances in the new region to have the same privileges. How should the company achieve this?

  • A. Create a new IAM role and associated policies within the new region
  • B. Assign the existing IAM role to the Amazon EC2 instances in the new region
  • C. Copy the IAM role and associated policies to the new region and attach it to the instances
  • D. Create the Amazon Machine Image of the instance and copy it to the desired region using the AMI Copy feature

Answer: B
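
Explanation:
IAM is a global service, which is why answer B requires no copying: the same role (via its instance profile) can be attached to instances in any region. A sketch with hypothetical identifiers:

```python
import boto3

# The role/instance profile already exists globally; only the EC2 client
# is region-specific.
ec2 = boto3.client("ec2", region_name="eu-west-1")
ec2.associate_iam_instance_profile(
    IamInstanceProfile={"Name": "dynamodb-access-role"},
    InstanceId="i-0abc123def456789a",
)
```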

NEW QUESTION 9
A company needs a churn prevention model to predict which customers will NOT renew their yearly
subscription to the company's service. The company plans to provide these customers with a promotional offer. A binary classification model that uses Amazon Machine Learning is required. On which basis should this binary classification model be built?

  • A. User profiles (age, gender, income, occupation)
  • B. Last user session
  • C. Each user's time-series events in the past 3 months
  • D. Quarterly results

Answer: C

NEW QUESTION 10
A US-based company is expanding their web presence into Europe. The company wants to extend their AWS infrastructure from Northern Virginia (us-east-1) into the Dublin (eu-west-1) region. Which of the following options would enable an equivalent experience for users on both continents?

  • A. Use a public-facing load balancer per region to load-balance web traffic, and enable HTTP health checks
  • B. Use a public-facing load balancer per region to load-balance web traffic, and enable sticky sessions
  • C. Use Amazon Route 53, and apply a geolocation routing policy to distribute traffic across both regions
  • D. Use Amazon Route 53, and apply a weighted routing policy to distribute traffic across both regions

Answer: C
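
Explanation:
A geolocation policy is expressed as record sets carrying a GeoLocation block; the sketch below routes European visitors to one endpoint and everyone else to a default ("*") record. Zone ID, names, and IPs are placeholders.

```python
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z1EXAMPLE",
    ChangeBatch={"Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "www.example.com.", "Type": "A",
            "SetIdentifier": "europe",
            "GeoLocation": {"ContinentCode": "EU"},
            "TTL": 60,
            "ResourceRecords": [{"Value": "203.0.113.10"}]}},   # eu-west-1 endpoint
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "www.example.com.", "Type": "A",
            "SetIdentifier": "default",
            "GeoLocation": {"CountryCode": "*"},                 # everyone else
            "TTL": 60,
            "ResourceRecords": [{"Value": "198.51.100.10"}]}},  # us-east-1 endpoint
    ]},
)
```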

NEW QUESTION 11
You are configuring your company’s application to use Auto Scaling and need to move user state
information. Which of the following AWS services provides a shared data store with durability and low latency?

  • A. Amazon Simple Storage Service
  • B. Amazon DynamoDB
  • C. Amazon EC2 instance storage
  • D. Amazon ElastiCache for Memcached

Answer: B
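
Explanation:
Amazon DynamoDB offers both properties the question asks for: multi-AZ durability and single-digit-millisecond access, whereas ElastiCache lacks the durability and S3 lacks the latency. A session-store sketch against an assumed table keyed on session_id:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
sessions = dynamodb.Table("user-sessions")  # hypothetical table, hash key: session_id

# State written here survives any single web server terminating during scale-in.
sessions.put_item(Item={
    "session_id": "sess-8c1f",
    "user_id": "user-42",
    "cart": ["sku-1", "sku-9"],
})
item = sessions.get_item(Key={"session_id": "sess-8c1f"})["Item"]
```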

NEW QUESTION 12
You are managing the AWS account of a big organization. The organization has more than 1,000 employees and wants to provide access to various AWS services to most of the employees. Which of the below mentioned options is the best possible solution in this case?

  • A. The user should create a separate IAM user for each employee and provide access to them as per the policy
  • B. The user should create an IAM role and attach STS to the role. The user should attach that role to the EC2 instance and set up AWS authentication on that server
  • C. The user should create IAM groups as per the organization's departments and add each user to the relevant group for better access control
  • D. Attach an IAM role with the organization's authentication service to authorize each user for various AWS services

Answer: D
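
Explanation:
Mechanically, federation means an identity broker authenticates the employee against the organization's own directory and then exchanges that for temporary AWS credentials, so no per-employee IAM users are needed. A hedged sketch with a hypothetical role:

```python
import boto3

# Called by the identity broker *after* the corporate directory has
# authenticated the employee.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/EmployeeAccessRole",
    RoleSessionName="jdoe",
    DurationSeconds=3600,
)["Credentials"]

# Temporary, role-scoped credentials instead of a permanent IAM user.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```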

NEW QUESTION 13
A company is using Amazon Machine Learning as part of a medical software application. The application will predict the most likely blood type for a patient based on a variety of other clinical tests that are available when blood type knowledge is unavailable.
What is the appropriate model choice and target attribute combination for the problem?

  • A. Multi-class classification model with a categorical target attribute
  • B. Regression model with a numeric target attribute
  • C. Binary Classification with a categorical target attribute
  • D. K-Nearest Neighbors model with a multi-class target attribute

Answer: A

NEW QUESTION 14
A user is running one instance for only 3 hours every day. The user wants to save some cost with the
instance. Which of the below mentioned Reserved Instance categories is advised in this case?

  • A. The user should not use RI; instead only go with the on-demand pricing
  • B. The user should use the AWS high utilized RI
  • C. The user should use the AWS medium utilized RI
  • D. The user should use the AWS low utilized RI

Answer: A
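
Explanation:
The arithmetic behind answer A is worth spelling out. The prices below are hypothetical placeholders, not AWS rates; the point is that at roughly 12% utilization the reservation's fixed cost is never recovered.

```python
# Hypothetical prices -- substitute real rates for your region/instance type.
on_demand_per_hour = 0.10   # $/hour, on-demand
ri_fixed_monthly = 30.00    # reservation fee amortized per month
ri_hourly = 0.04            # discounted hourly rate while running

hours_per_month = 3 * 30    # 3 hours/day

on_demand_cost = on_demand_per_hour * hours_per_month           # $9.00
reserved_cost = ri_fixed_monthly + ri_hourly * hours_per_month  # $33.60
print(f"on-demand ${on_demand_cost:.2f} vs reserved ${reserved_cost:.2f}")
```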

NEW QUESTION 15
An administrator needs to design a distribution strategy for a star schema in a Redshift cluster. The
administrator needs to determine the optimal distribution style for the tables in the Redshift schema. In which three circumstances would choosing Key-based distribution be most appropriate? (Select three)

  • A. When the administrator needs to optimize a large, slowly changing dimension table
  • B. When the administrator needs to reduce cross-node traffic
  • C. When the administrator needs to optimize the fact table for parity with the number of slices
  • D. When the administrator needs to balance data distribution and collocation of data
  • E. When the administrator needs to take advantage of data locality on a local node of joins and aggregates

Answer: ADE
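
Explanation:
A key-distributed design can be sketched in DDL: giving the fact table and its largest joined table the same DISTKEY collocates matching rows on one slice, which is the collocation and locality rationale behind answers D and E. The cluster address, table, and column names below are invented.

```python
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="admin", password="example-password",
)
with conn, conn.cursor() as cur:
    # Rows with the same customer_id land on the same slice, so joins and
    # aggregates on that key resolve locally instead of across nodes.
    cur.execute("""
        CREATE TABLE sales (
            sale_id     BIGINT,
            customer_id BIGINT,
            sale_date   DATE,
            amount      DECIMAL(12, 2)
        )
        DISTSTYLE KEY
        DISTKEY (customer_id)
        SORTKEY (sale_date);
    """)
```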

NEW QUESTION 16
A company that provides economics data dashboards needs to be able to develop software to display
rich, interactive, data-driven graphics that run in web browsers and leverage the full stack of web standards (HTML, SVG, and CSS).
Which technology is the most appropriate for this requirement?

  • A. D3.js
  • B. Python/Jupyter
  • C. R Studio
  • D. Hue

Answer: A

NEW QUESTION 17
Which of the following notification endpoints or clients are supported by Amazon Simple Notification Service? Choose 2 answers

  • A. Email
  • B. CloudFront distribution
  • C. File Transfer Protocol
  • D. Short Message Service
  • E. Simple Network Management Protocol

Answer: AD
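
Explanation:
SNS delivers to HTTP/HTTPS, email, SMS, SQS, Lambda, and mobile-push endpoints; CloudFront distributions and FTP are not supported protocols. A minimal sketch with placeholder endpoint values:

```python
import boto3

sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="example-alerts")["TopicArn"]

# Email and SMS are both supported subscription protocols.
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")
sns.subscribe(TopicArn=topic_arn, Protocol="sms", Endpoint="+15555550100")

sns.publish(TopicArn=topic_arn, Message="Nightly import finished")
```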

NEW QUESTION 18
When you put objects in Amazon S3, what is the indication that an object was successfully stored?

  • A. An HTTP 200 result code and MD5 checksum, taken together, indicate that the operation was successful
  • B. A success code is inserted into the S3 object metadata
  • C. Amazon S3 is engineered for 99.999999999% durability. Therefore there is no need to confirm that data was inserted
  • D. Each S3 account has a special bucket named _s3_logs. Success codes are written to this bucket with a timestamp and checksum

Answer: A
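
Explanation:
Answer A can be checked in code: S3 returns the object's ETag, which for a single-part, non-KMS-encrypted PUT equals the MD5 of the payload. A hedged sketch:

```python
import hashlib
import boto3

s3 = boto3.client("s3")
body = b"important payload"

resp = s3.put_object(Bucket="example-bucket", Key="data/object.bin", Body=body)

# HTTP 200 plus a matching MD5/ETag together confirm the object was stored
# intact. (ETag == MD5 only for single-part, non-SSE-KMS uploads.)
assert resp["ResponseMetadata"]["HTTPStatusCode"] == 200
assert resp["ETag"].strip('"') == hashlib.md5(body).hexdigest()
```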

NEW QUESTION 19
A company has reproducible data that they want to store on Amazon Web Services. The company may want to retrieve the data on a frequent basis. Which AWS storage option allows the customer to optimize storage costs and still achieve high availability for their data?

  • A. Amazon S3 Reduced Redundancy Storage
  • B. Amazon EBS Magnetic Volume
  • C. Amazon Glacier
  • D. Amazon S3 Standard Storage

Answer: A
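
Explanation:
Reduced Redundancy Storage is chosen per object at write time via the StorageClass parameter. (AWS has since deprecated RRS in favour of cheaper S3 Standard tiers, but it was the intended answer when this question was written.)

```python
import boto3

s3 = boto3.client("s3")

# Reproducible data tolerates RRS's lower durability in exchange for a
# lower price, while availability remains high.
with open("frame-0001.png", "rb") as f:
    s3.put_object(
        Bucket="example-derived-data",
        Key="renders/frame-0001.png",
        Body=f.read(),
        StorageClass="REDUCED_REDUNDANCY",
    )
```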

NEW QUESTION 20
An administrator is deploying Spark on Amazon EMR for two distinct use cases: machine learning
algorithms and ad hoc querying. All data will be stored in Amazon S3. A separate cluster will be deployed for each use case. The data volumes on Amazon S3 are less than 10 GB.
How should the administrator align instance types with each cluster's purpose?

  • A. Machine Learning on C instance types and ad-hoc queries on R instance types
  • B. Machine Learning on R instance types and ad-hoc queries on G2 instance types
  • C. Machine Learning on T instance types and ad-hoc queries on M instance types
  • D. Machine Learning on D instance types and ad-hoc queries on I instance types

Answer: A

NEW QUESTION 21
A new algorithm has been written in Python to identify spam e-mails. The algorithm analyzes the free text contained within a sample set of 1 million e-mails stored on Amazon S3. The algorithm must be scaled across a production dataset of 5 PB, which also resides in Amazon S3 storage.
Which AWS service strategy is best for this use case?

  • A. Copy the data into Amazon ElastiCache to perform text analysis on the in-memory data and export the results of the model into Amazon Machine Learning
  • B. Use Amazon EMR to parallelize the text analysis tasks across the cluster using a streaming program step
  • C. Use Amazon Elasticsearch Service to store the text and then use the Python Elasticsearch client to run analysis against the text index
  • D. Initiate a Python job from AWS Data Pipeline to run directly against the Amazon S3 text files

Answer: C

Explanation:
Reference: https://aws.amazon.com/blogs/database/indexing-metadata-in-amazon-elasticsearch-service-using-aws-lambda-and-python/

NEW QUESTION 22
A city has been collecting data on its public bicycle share program for the past three years. The SPB
dataset currently resides on Amazon S3. The data contains the following data points:
• Bicycle origination points
• Bicycle destination points
• Mileage between the points
• Number of bicycle slots available at the station (which is variable based on the station location)
• Number of slots available and taken at each station at a given time
The program has received additional funds to increase the number of bicycle stations available. All data is regularly archived to Amazon Glacier.
The new bicycle station must be located to provide the most riders access to bicycles. How should this task be performed?

  • A. Move the data from Amazon S3 into Amazon EBS-backed volumes and use an EC2-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization.
  • B. Use the Amazon Redshift COPY command to move the data from Amazon S3 into Redshift and perform a SQL query that outputs the most popular bicycle stations.
  • C. Persist the data on Amazon S3 and use a transient EMR cluster with spot instances to run a Spark streaming job that will move the data into Amazon Kinesis.
  • D. Keep the data on Amazon S3 and use an Amazon EMR-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization over EMRFS.

Answer: B
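
Explanation:
The COPY-then-query flow in answer B looks roughly like the following; the cluster address, table, and column names are invented, and the IAM role ARN is a placeholder.

```python
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="bikes", user="admin", password="example-password",
)
with conn, conn.cursor() as cur:
    # Bulk-load the S3 dataset into Redshift.
    cur.execute("""
        COPY trips
        FROM 's3://example-bike-data/trips/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        FORMAT AS CSV
        IGNOREHEADER 1;
    """)
    # Rank stations by ridership to suggest where a new station helps most.
    cur.execute("""
        SELECT origin_station, COUNT(*) AS rides
        FROM trips
        GROUP BY origin_station
        ORDER BY rides DESC
        LIMIT 10;
    """)
    print(cur.fetchall())
```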

NEW QUESTION 23
Company A operates in Country X. Company A maintains a large dataset of historical purchase orders
that contains personal data of its customers in the form of full names and telephone numbers. The dataset consists of five text files, 1 TB each. Currently the dataset resides on-premises due to legal requirements for storing personal data in-country. The research and development department needs to run a clustering algorithm on the dataset and wants to use the Amazon Elastic MapReduce (EMR) service in the closest AWS region. Due to geographic distance, the minimum latency between the on-premises system and the closest AWS region is 200 ms.
Which option allows Company A to do clustering in the AWS Cloud and meet the legal requirement of maintaining personal data in-country?

  • A. Anonymize the personal data portions of the dataset and transfer the data files into Amazon S3 in the AWS region. Have the EMR cluster read the dataset using EMRFS
  • B. Establish a Direct Connect link between the on-premises system and the AWS region to reduce latency. Have the EMR cluster read the data directly from the on-premises storage system over Direct Connect
  • C. Encrypt the data files according to the encryption standards of Country X, and store them in the AWS region in Amazon S3. Have the EMR cluster read the dataset using EMRFS
  • D. Use an AWS Import/Export Snowball device to securely transfer the data to the AWS region and copy the files onto an EBS volume. Have the EMR cluster read the dataset using EMRFS

Answer: B
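
Explanation:
For contrast with answer B, the anonymization step from option A could look like the sketch below (file, column, and bucket names are invented). Whether salted hashing counts as anonymization under Country X's law is a legal question, not a technical one.

```python
import csv
import hashlib
import boto3

SALT = "replace-with-a-secret-salt"  # kept on-premises, never uploaded

def pseudonymize(value: str) -> str:
    # One-way salted hash: records remain joinable across files, but the
    # original names and phone numbers are not recoverable from the output.
    return hashlib.sha256((SALT + value).encode()).hexdigest()

with open("orders.csv") as src, open("orders_anon.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row["full_name"] = pseudonymize(row["full_name"])
        row["telephone"] = pseudonymize(row["telephone"])
        writer.writerow(row)

boto3.client("s3").upload_file(
    "orders_anon.csv", "example-research-bucket", "dataset/orders_anon.csv")
```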

NEW QUESTION 24
A photo-sharing service stores pictures in Amazon Simple Storage Service (S3) and allows application sign-in using an OpenID Connect-compatible identity provider. Which AWS Security Token Service approach to temporary access should you use for the Amazon S3 operations?

  • A. Cross-Account Access
  • B. AWS identity and Access Management roles
  • C. SAML-based Identity Federation
  • D. Web identity Federation

Answer: D
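
Explanation:
Web identity federation is the STS flow built for OpenID Connect providers: the client presents the provider's token to AssumeRoleWithWebIdentity (an unsigned call) and receives temporary, role-scoped credentials for S3. The role ARN, bucket, and token below are placeholders.

```python
import boto3

oidc_token = "<ID token obtained from the OpenID Connect provider>"

sts = boto3.client("sts")
creds = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/PhotoAppS3Role",
    RoleSessionName="mobile-user-42",
    WebIdentityToken=oidc_token,
)["Credentials"]

# Temporary credentials scoped to the role's S3 permissions.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.put_object(Bucket="example-photos", Key="user-42/photo.jpg", Body=b"bytes")
```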

NEW QUESTION 25
When will you incur costs with an Elastic IP address (EIP)?

  • A. When an EIP is allocated
  • B. When it is allocated and associated with a running instance
  • C. When it is allocated and associated with a stopped instance
  • D. Costs are incurred regardless of whether the EIP is associated with a running instance

Answer: C
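
Explanation:
The billing rule is that an EIP is free only while associated with a running instance; it accrues charges when unassociated or when its instance is stopped. A small audit sketch for the unassociated case:

```python
import boto3

ec2 = boto3.client("ec2")

# Addresses without an AssociationId are allocated but unattached,
# and therefore accruing hourly charges.
for addr in ec2.describe_addresses()["Addresses"]:
    if "AssociationId" not in addr:
        print(f"{addr['PublicIp']} is unassociated and being billed")
```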

NEW QUESTION 26
......

Recommended! Get the full AWS-Certified-Big-Data-Specialty dumps in VCE and PDF from 2passeasy. Welcome to download: https://www.2passeasy.com/dumps/AWS-Certified-Big-Data-Specialty/ (New 243 Q&As Version)