New 2023 Realistic Free Amazon DAS-C01 Exam Dump Questions & Answer [Q86-Q104]

Share

New 2023 Realistic Free Amazon DAS-C01 Exam Dump Questions and Answer

DAS-C01 Practice Test Engine: Try These 159 Exam Questions


You can read the Best Solution to prepare AWS Certified Data Analytics Specialty Exam

TestPassKing offer you self-assessment tools that help you estimate yourself. Intuitive software interface The practical assessment tool for AWS Certified Data Analytics Specialty includes several self-assessment features, such as timed exams, randomized questions, multiple types of questions, test history, and test results, etc. You can change the question mode according to your skill level. This will help you to prepare for a valid AWS Certified Data Analytics Specialty exam dumps. There are many methods by which a person can prepare for the nonprofit cloud consultant exam. Some people prefer to watch tutorials and courses online, while others prefer to answer the questions from the AWS Certified Data Analytics Specialty exam from the previous year, and some people use appropriate preparation materials to prepare. All methods are valid, but the most useful way is to use AWS Certified Data Analytics Specialty. The preparation stuff is a complete set that allows people to know every detail about the certification and fully prepare the candidates. Certifications-questions is one of the reliable, verified and highly valued website that provides its online clients with highly detailed and related online exam preparation materials.


Amazon DAS-C01 Exam Syllabus Topics:

TopicDetails
Topic 1
  • Select the appropriate data analysis solution for a given scenario
  • Determine data access and retrievalpatterns2.3Selectanappropriate data layout, schema, structure, and format
Topic 2
  • Determine the operational characteristics of an analysis and visualization solution
  • Determine the operational characteristics of the collection system
Topic 3
  • Apply data protection and encryption techniques
  • Determine appropriate data processing solution requirements
Topic 4
  • Select the appropriate data visualization solution for a given scenario
  • Select a collection system that handles the frequency, volume, and source of data
Topic 5
  • Automate and operationalize a data processing solution
  • Determine the operational characteristics of a storage solution for analytics
Topic 6
  • Determine an appropriate system for cataloging data and managing meta data
  • Apply data governance and compliance controls

 

NEW QUESTION 86
An online retail company is migrating its reporting system to AWS. The company's legacy system runs data processing on online transactions using a complex series of nested Apache Hive queries. Transactional data is exported from the online system to the reporting system several times a day. Schemas in the files are stable between updates.
A data analyst wants to quickly migrate the data processing to AWS, so any code changes should be minimized. To keep storage costs low, the data analyst decides to store the data in Amazon S3. It is vital that the data from the reports and associated analytics is completely up to date based on the data in Amazon S3.
Which solution meets these requirements?

  • A. Create an Amazon Athena table with CREATE TABLE AS SELECT (CTAS) to ensure data is refreshed from underlying queries against the raw dataset. Create an AWS Glue Data Catalog to manage the Hive metadata over the CTAS table. Create an Amazon EMR cluster and use the metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.
  • B. Create an AWS Glue Data Catalog to manage the Hive metadata. Create an AWS Glue crawler over Amazon S3 that runs when data is refreshed to ensure that data changes are updated. Create an Amazon EMR cluster and use the metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.
  • C. Use an S3 Select query to ensure that the data is properly updated. Create an AWS Glue Data Catalog to manage the Hive metadata over the S3 Select table. Create an Amazon EMR cluster and use the metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.
  • D. Create an AWS Glue Data Catalog to manage the Hive metadata. Create an Amazon EMR cluster with consistent view enabled. Run emrfs sync before each analytics step to ensure data changes are updated.
    Create an EMR cluster and use the metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.

Answer: B

 

NEW QUESTION 87
A company wants to enrich application logs in near-real-time and use the enriched dataset for further analysis.
The application is running on Amazon EC2 instances across multiple Availability Zones and storing its logs using Amazon CloudWatch Logs. The enrichment source is stored in an Amazon DynamoDB table.
Which solution meets the requirements for the event collection and enrichment?

  • A. Configure the application to write the logs locally and use Amazon Kinesis Agent to send the data to Amazon Kinesis Data Streams. Configure a Kinesis Data Analytics SQL application with the Kinesis data stream as the source. Join the SQL application input stream with DynamoDB records, and then store the enriched output stream in Amazon S3 using Amazon Kinesis Data Firehose.
  • B. Export the raw logs to Amazon S3 on an hourly basis using the AWS CLI. Use Apache Spark SQL on Amazon EMR to read the logs from Amazon S3 and enrich the records with the data from DynamoDB.
    Store the enriched data in Amazon S3.
  • C. Export the raw logs to Amazon S3 on an hourly basis using the AWS CLI. Use AWS Glue crawlers to catalog the logs. Set up an AWS Glue connection for the DynamoDB table and set up an AWS Glue ETL job to enrich the data. Store the enriched data in Amazon S3.
  • D. Use a CloudWatch Logs subscription to send the data to Amazon Kinesis Data Firehose. Use AWS Lambda to transform the data in the Kinesis Data Firehose delivery stream and enrich it with the data in the DynamoDB table. Configure Amazon S3 as the Kinesis Data Firehose delivery destination.

Answer: D

Explanation:
Explanation
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html#FirehoseExample

 

NEW QUESTION 88
A company uses Amazon Elasticsearch Service (Amazon ES) to store and analyze its website clickstream data. The company ingests 1 TB of data daily using Amazon Kinesis Data Firehose and stores one day's worth of data in an Amazon ES cluster.
The company has very slow query performance on the Amazon ES index and occasionally sees errors from Kinesis Data Firehose when attempting to write to the index. The Amazon ES cluster has 10 nodes running a single index and 3 dedicated master nodes. Each data node has 1.5 TB of Amazon EBS storage attached and the cluster is configured with 1,000 shards. Occasionally, JVMMemoryPressure errors are found in the cluster logs.
Which solution will improve the performance of Amazon ES?

  • A. Decrease the number of Amazon ES shards for the index.
  • B. Increase the memory of the Amazon ES master nodes.
  • C. Increase the number of Amazon ES shards for the index.
  • D. Decrease the number of Amazon ES data nodes.

Answer: A

Explanation:
Explanation
https://aws.amazon.com/premiumsupport/knowledge-center/high-jvm-memory-pressure-elasticsearch/

 

NEW QUESTION 89
An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities. Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an application running on Amazon EC2 processes the data and makes search options and reports available for visualization by editors and marketers. The company wants to make website clicks and aggregated data available to editors and marketers in minutes to enable them to connect with users more effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)

  • A. Use Amazon Elasticsearch Service deployed on Amazon EC2 to aggregate, filter, and process the data.
    Refresh content performance dashboards in near-real time.
  • B. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon Elasticsearch Service from Amazon S3.
  • C. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch Service.
  • D. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams consumer to send records to Amazon Elasticsearch Service.
  • E. Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content performance dashboards in near-real time.

Answer: C,E

 

NEW QUESTION 90
An IoT company wants to release a new device that will collect data to track sleep overnight on an intelligent mattress. Sensors will send data that will be uploaded to an Amazon S3 bucket. About 2 MB of data is generated each night for each bed. Data must be processed and summarized for each user, and the results need to be available as soon as possible. Part of the process consists of time windowing and other functions. Based on tests with a Python script, every run will require about 1 GB of memory and will complete within a couple of minutes.
Which solution will run the script in the MOST cost-effective way?

  • A. AWS Glue with a PySpark job
  • B. AWS Lambda with a Python script
  • C. AWS Glue with a Scala job
  • D. Amazon EMR with an Apache Spark script

Answer: B

 

NEW QUESTION 91
A company uses Amazon Redshift as its data warehouse A new table includes some columns that contain sensitive data and some columns that contain non-sensitive data The data in the table eventually will be referenced by several existing queries that run many times each day A data analytics specialist must ensure that only members of the company's auditing team can read the columns that contain sensitive data All other users must have read-only access to the columns that contain non-sensitive data Which solution will meet these requirements with the LEAST operational overhead?

  • A. Grant all users read-only permissions to the columns that contain non-sensitive data Attach an 1AM policy to the auditing team with an explicit Allow action that grants access to the columns that contain sensitive data
  • B. Grant all users read-only permissions to the columns that contain non-sensitive data Use the GRANT SELECT command to allow the auditing team to access the columns that contain sensitive data
  • C. Grant the auditing team permission to read from the table Create a view of the table that includes the columns that contain non-sensitive data Grant the appropriate users read-only permissions to that view
  • D. Grant the auditing team permission to read from the table. Load the columns that contain non-sensitive data into a second table. Grant the appropriate users read-only permissions to the second table.

Answer: B

Explanation:
https://aws.amazon.com/jp/about-aws/whats-new/2020/03/announcing-column-level-access-control-for-amazon-redshift/

 

NEW QUESTION 92
A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning.
Which actions should the data analyst take?

  • A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.
  • B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
  • C. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.
  • D. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.

Answer: D

 

NEW QUESTION 93
A media company has been performing analytics on log data generated by its applications. There has been a recent increase in the number of concurrent analytics jobs running, and the overall performance of existing jobs is decreasing as the number of new jobs is increasing. The partitioned data is stored in Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) and the analytic processing is performed on Amazon EMR clusters using the EMR File System (EMRFS) with consistent view enabled. A data analyst has determined that it is taking longer for the EMR task nodes to list objects in Amazon S3.
Which action would MOST likely increase the performance of accessing log data in Amazon S3?

  • A. Use a hash function to create a random string and add that to the beginning of the object prefixes when storing the log data in Amazon S3.
  • B. Increase the read capacity units (RCUs) for the shared Amazon DynamoDB table.
  • C. Use a lifecycle policy to change the S3 storage class to S3 Standard for the log data.
  • D. Redeploy the EMR clusters that are running slowly to a different Availability Zone.

Answer: B

Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emrfs-metadata.html

 

NEW QUESTION 94
A data engineering team within a shared workspace company wants to build a centralized logging system for all weblogs generated by the space reservation system. The company has a fleet of Amazon EC2 instances that process requests for shared space reservations on its website. The data engineering team wants to ingest all weblogs into a service that will provide a near-real-time search engine. The team does not want to manage the maintenance and operation of the logging system.
Which solution allows the data engineering team to efficiently set up the web logging system within AWS?

  • A. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch logs and subscribe the Amazon Kinesis Firehose delivery stream to CloudWatch. Configure Amazon DynamoDB as the end destination of the weblogs.
  • B. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch logs and subscribe the Amazon Kinesis data stream to CloudWatch. Choose Amazon Elasticsearch Service as the end destination of the weblogs.
  • C. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch logs and subscribe the Amazon Kinesis Data Firehose delivery stream to CloudWatch. Choose Amazon Elasticsearch Service as the end destination of the weblogs.
  • D. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch logs and subscribe the Amazon Kinesis data stream to CloudWatch. Configure Splunk as the end destination of the weblogs.

Answer: C

Explanation:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ES_Stream.html

 

NEW QUESTION 95
A real estate company has a mission-critical application using Apache HBase in Amazon EMR. Amazon EMR is configured with a single master node. The company has over 5 TB of data stored on an Hadoop Distributed File System (HDFS). The company wants a cost-effective solution to make its HBase data highly available.
Which architectural pattern meets company's requirements?

  • A. Store the data on an EMR File System (EMRFS) instead of HDFS. Enable EMRFS consistent view.
    Create an EMR HBase cluster with multiple master nodes. Point the HBase root directory to an Amazon S3 bucket.
  • B. Store the data on an EMR File System (EMRFS) instead of HDFS and enable EMRFS consistent view.
    Create a primary EMR HBase cluster with multiple master nodes. Create a secondary EMR HBase read- replica cluster in a separate Availability Zone. Point both clusters to the same HBase root directory in the same Amazon S3 bucket.
  • C. Store the data on an EMR File System (EMRFS) instead of HDFS and enable EMRFS consistent view.
    Run two separate EMR clusters in two different Availability Zones. Point both clusters to the same HBase root directory in the same Amazon S3 bucket.
  • D. Use Spot Instances for core and task nodes and a Reserved Instance for the EMR master node.
    Configure
    the EMR cluster with multiple master nodes. Schedule automated snapshots using Amazon EventBridge.

Answer: B

 

NEW QUESTION 96
A company hosts an on-premises PostgreSQL database that contains historical dat a. An internal legacy application uses the database for read-only activities. The company's business team wants to move the data to a data lake in Amazon S3 as soon as possible and enrich the data for analytics.
The company has set up an AWS Direct Connect connection between its VPC and its on-premises network. A data analytics specialist must design a solution that achieves the business team's goals with the least operational overhead.
Which solution meets these requirements?

  • A. Create an Amazon RDS for PostgreSQL database and use AWS Database Migration Service (AWS DMS) to migrate the data into Amazon RDS. Use AWS Data Pipeline to copy and enrich the data from the Amazon RDS for PostgreSQL table and move the data to Amazon S3. Use Amazon Athena to query the data.
  • B. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Create an Amazon Redshift cluster and use Amazon Redshift Spectrum to query the data.
  • C. Upload the data from the on-premises PostgreSQL database to Amazon S3 by using a customized batch upload process. Use the AWS Glue crawler to catalog the data in Amazon S3. Use an AWS Glue job to enrich and store the result in a separate S3 bucket in Apache Parquet format. Use Amazon Athena to query the data.
  • D. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Use Amazon Athena to query the data.

Answer: A

 

NEW QUESTION 97
A company wants to improve user satisfaction for its smart home system by adding more features to its recommendation engine. Each sensor asynchronously pushes its nested JSON data into Amazon Kinesis Data Streams using the Kinesis Producer Library (KPL) in Java. Statistics from a set of failed sensors showed that, when a sensor is malfunctioning, its recorded data is not always sent to the cloud.
The company needs a solution that offers near-real-time analytics on the data from the most updated sensors.
Which solution enables the company to meet these requirements?

  • A. Update the sensors code to use the PutRecord/PutRecords call from the Kinesis Data Streams API with the AWS SDK for Java. Use AWS Glue to fetch and process data from the stream using the Kinesis Client Library (KCL). Instantiate an Amazon Elasticsearch Service cluster and use AWS Lambda to directly push data into it.
  • B. Set the RecordMaxBufferedTime property of the KPL to "0" to disable the buffering on the sensor side.
    Connect for each stream a dedicated Kinesis Data Firehose delivery stream and enable the data transformation feature to flatten the JSON file before sending it to an Amazon S3 bucket. Load the S3 data into an Amazon Redshift cluster.
  • C. Update the sensors code to use the PutRecord/PutRecords call from the Kinesis Data Streams API with the AWS SDK for Java. Use Kinesis Data Analytics to enrich the data based on a company-developed anomaly detection SQL script. Direct the output of KDA application to a Kinesis Data Firehose delivery stream, enable the data transformation feature to flatten the JSON file, and set the Kinesis Data Firehose destination to an Amazon Elasticsearch Service cluster.
  • D. Set the RecordMaxBufferedTime property of the KPL to "-1" to disable the buffering on the sensor side.
    Use Kinesis Data Analytics to enrich the data based on a company-developed anomaly detection SQL script. Push the enriched data to a fleet of Kinesis data streams and enable the data transformation feature to flatten the JSON file. Instantiate a dense storage Amazon Redshift cluster and use it as the destination for the Kinesis Data Firehose delivery stream.

Answer: D

 

NEW QUESTION 98
A company is sending historical datasets to Amazon S3 for storage. A data engineer at the company wants to make these datasets available for analysis using Amazon Athena. The engineer also wants to encrypt the Athena query results in an S3 results location by using AWS solutions for encryption. The requirements for encrypting the query results are as follows:
Use custom keys for encryption of the primary dataset query results.
Use generic encryption for all other query results.
Provide an audit trail for the primary dataset queries that shows when the keys were used and by whom.
Which solution meets these requirements?

  • A. Use server-side encryption with customer-provided encryption keys (SSE-C) for the primary dataset.
    Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
  • B. Use server-side encryption with AWS KMS managed customer master keys (SSE-KMS CMKs) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
  • C. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the primary dataset. Use SSE-S3 for the other datasets.
  • D. Use client-side encryption with AWS Key Management Service (AWS KMS) customer managed keys for the primary dataset. Use S3 client-side encryption with client-side keys for the other datasets.

Answer: C

 

NEW QUESTION 99
An ecommerce company is migrating its business intelligence environment from on premises to the AWS Cloud. The company will use Amazon Redshift in a public subnet and Amazon QuickSight. The tables already are loaded into Amazon Redshift and can be accessed by a SQL tool.
The company starts QuickSight for the first time. During the creation of the data source, a data analytics specialist enters all the information and tries to validate the connection. An error with the following message occurs: "Creating a connection to your data source timed out." How should the data analytics specialist resolve this error?

  • A. Create an IAM role for QuickSight to access Amazon Redshift.
  • B. Add the QuickSight IP address range into the Amazon Redshift security group.
  • C. Use a QuickSight admin user for creating the dataset.
  • D. Grant the SELECT permission on Amazon Redshift tables.

Answer: D

Explanation:
Explanation
Connection to the database times out
Your client connection to the database appears to hang or time out when running long queries, such as a COPY command. In this case, you might observe that the Amazon Redshift console displays that the query has completed, but the client tool itself still appears to be running the query. The results of the query might be missing or incomplete depending on when the connection stopped.

 

NEW QUESTION 100
A university intends to use Amazon Kinesis Data Firehose to collect JSON-formatted batches of water quality readings in Amazon S3. The readings are from 50 sensors scattered across a local lake. Students will query the stored data using Amazon Athena to observe changes in a captured metric over time, such as water temperature or acidity. Interest has grown in the study, prompting the university to reconsider how data will be stored.
Which data format and partitioning choices will MOST significantly reduce costs? (Choose two.)

  • A. Store the data in Apache Parquet format using Snappy compression.
  • B. Partition the data by sensor, year, month, and day.
  • C. Store the data in Apache Avro format using Snappy compression.
  • D. Store the data in Apache ORC format using no compression.
  • E. Partition the data by year, month, and day.

Answer: A,D

 

NEW QUESTION 101
An advertising company has a data lake that is built on Amazon S3. The company uses AWS Glue Data Catalog to maintain the metadat a. The data lake is several years old and its overall size has increased exponentially as additional data sources and metadata are stored in the data lake. The data lake administrator wants to implement a mechanism to simplify permissions management between Amazon S3 and the Data Catalog to keep them in sync Which solution will simplify permissions management with minimal development effort?

  • A. Set AWS Identity and Access Management (1AM) permissions tor AWS Glue
  • B. Use AWS Lake Formation permissions
  • C. Manage AWS Glue and S3 permissions by using bucket policies
  • D. Use Amazon Cognito user pools.

Answer: B

 

NEW QUESTION 102
A company is building a service to monitor fleets of vehicles. The company collects IoT data from a device in each vehicle and loads the data into Amazon Redshift in near-real time. Fleet owners upload .csv files containing vehicle reference data into Amazon S3 at different times throughout the day. A nightly process loads the vehicle reference data from Amazon S3 into Amazon Redshift. The company joins the IoT data from the device and the vehicle reference data to power reporting and dashboards. Fleet owners are frustrated by waiting a day for the dashboards to update.
Which solution would provide the SHORTEST delay between uploading reference data to Amazon S3 and the change showing up in the owners' dashboards?

  • A. Use S3 event notifications to trigger an AWS Lambda function to copy the vehicle reference data into Amazon Redshift immediately when the reference data is uploaded to Amazon S3.
  • B. Send reference data to Amazon Kinesis Data Streams. Configure the Kinesis data stream to directly load the reference data into Amazon Redshift in real time.
  • C. Create and schedule an AWS Glue Spark job to run every 5 minutes. The job inserts reference data into Amazon Redshift.
  • D. Send the reference data to an Amazon Kinesis Data Firehose delivery stream. Configure Kinesis with a buffer interval of 60 seconds and to directly load the data into Amazon Redshift.

Answer: A

 

NEW QUESTION 103
A company's data analyst needs to ensure that queries executed in Amazon Athena cannot scan more than a prescribed amount of data for cost control purposes. Queries that exceed the prescribed threshold must be canceled immediately.
What should the data analyst do to achieve this?

  • A. For each workgroup, set the control limit for each query to the prescribed threshold.
  • B. Configure Athena to invoke an AWS Lambda function that terminates queries when the prescribed threshold is crossed.
  • C. Enforce the prescribed threshold on all Amazon S3 bucket policies
  • D. For each workgroup, set the workgroup-wide data usage control limit to the prescribed threshold.

Answer: A

Explanation:
Explanation
https://docs.aws.amazon.com/athena/latest/ug/manage-queries-control-costs-with-workgroups.html

 

NEW QUESTION 104
......

Guaranteed Success in AWS Certified Data Analytics DAS-C01 Exam Dumps: https://passleader.testpassking.com/DAS-C01-exam-testking-pass.html