AWS DAS-C01 Free Practice Questions — Page 2

Data Analytics - Specialty • 5 questions • Answers & explanations included

Question 6

A manufacturing company uses Amazon Connect to manage its contact center and Salesforce to manage its customer relationship management (CRM) data. The data engineering team must build a pipeline to ingest data from the contact center and CRM system into a data lake that is built on Amazon S3. What is the MOST efficient way to collect data in the data lake with the LEAST operational overhead?

A. Use Amazon Kinesis Data Streams to ingest Amazon Connect data and Amazon AppFlow to ingest Salesforce data.
B. Use Amazon Kinesis Data Firehose to ingest Amazon Connect data and Amazon Kinesis Data Streams to ingest Salesforce data.
C. Use Amazon Kinesis Data Firehose to ingest Amazon Connect data and Amazon AppFlow to ingest Salesforce data.
D. Use Amazon AppFlow to ingest Amazon Connect data and Amazon Kinesis Data Firehose to ingest Salesforce data.

Correct Answer: C. Use Amazon Kinesis Data Firehose to ingest Amazon Connect data and Amazon AppFlow to ingest Salesforce data.

Why C is correct: Amazon Kinesis Data Firehose is purpose-built for streaming Amazon Connect data (call records, agent events) directly to S3 with automatic batching, compression, and transformation capabilities, all fully managed. Amazon AppFlow is specifically designed for SaaS application integration, including native Salesforce connectivity with pre-built connectors, automatic schema detection, and data transfer directly to S3. Both services are fully managed with minimal operational overhead.

Why others are wrong:
A. Kinesis Data Streams requires additional consumers and custom code to write to S3, adding operational overhead.
B. Kinesis Data Streams does not natively ingest Salesforce data.
D. AppFlow does not directly support Amazon Connect; Firehose is the native integration.
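To make the Firehose side of option C concrete, here is a minimal sketch of the request payload shape that Firehose's CreateDeliveryStream API accepts for an S3 destination. The stream name, bucket ARN, and role ARN are hypothetical; the code only builds the payload locally and makes no AWS call.

```python
def firehose_s3_config(stream_name: str, bucket_arn: str, role_arn: str) -> dict:
    """Payload shape for a Firehose delivery stream with an extended S3 destination."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",
        "ExtendedS3DestinationConfiguration": {
            "BucketARN": bucket_arn,
            "RoleARN": role_arn,
            # Firehose batches records by size or time before writing S3 objects,
            # which is the "automatic batching" the explanation refers to.
            "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
        },
    }

# Hypothetical names, for illustration only.
cfg = firehose_s3_config(
    "connect-ctr-to-s3",
    "arn:aws:s3:::example-data-lake",
    "arn:aws:iam::123456789012:role/firehose-delivery-role",
)
print(cfg["ExtendedS3DestinationConfiguration"]["CompressionFormat"])  # GZIP
```

The buffering hints are the knob that trades S3 object size against delivery latency; the defaults are usually fine for contact-center records.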

Question 7

A company has a data warehouse in Amazon Redshift that is approximately 500 TB in size. New data is imported every few hours and read-only queries are run throughout the day and evening. There is a particularly heavy load with no writes for several hours each morning on business days. During those hours, some queries are queued and take a long time to execute. The company needs to optimize query execution and avoid any downtime. What is the MOST cost-effective solution?

A. Enable concurrency scaling in the workload management (WLM) queue.
B. Add more nodes using the AWS Management Console during peak hours. Set the distribution style to ALL.
C. Use elastic resize to quickly add nodes during peak times. Remove the nodes when they are not needed.
D. Use a snapshot, restore, and resize operation. Switch to the new target cluster.

Correct Answer: A. Enable concurrency scaling in the workload management (WLM) queue.

Why A is correct: Concurrency scaling in Amazon Redshift automatically adds transient cluster capacity when queries are queued, and is specifically designed for read-heavy workloads with periodic spikes. It handles the morning query surge automatically without downtime, and you pay for the additional capacity only when it is actually used (per-second billing for concurrency scaling clusters). This directly addresses the queuing issue with minimal cost and zero downtime.

Why others are wrong:
B. Adding nodes requires manual intervention during peak hours and does not avoid downtime during resize operations; the ALL distribution style is also not cost-effective for 500 TB.
C. Elastic resize involves a brief period of unavailability (usually minutes), which violates the "avoid any downtime" requirement.
D. A snapshot, restore, and resize operation involves significant downtime during the switch to the new cluster.
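Concurrency scaling is enabled per WLM queue. Below is a hedged sketch of the general shape of a `wlm_json_configuration` parameter value with scaling set to `auto` on the read queue; the queue layout, user group name, and concurrency values are illustrative assumptions, not a recommended production configuration.

```python
import json

wlm_config = json.dumps([
    {
        "query_group": [],
        "user_group": ["readers"],       # hypothetical group for read-only analysts
        "query_concurrency": 5,
        "concurrency_scaling": "auto",   # burst to transient clusters when queries queue
    },
    {
        "query_group": [],
        "user_group": [],
        "query_concurrency": 5,          # default queue, no scaling
        "concurrency_scaling": "off",
    },
])

queues = json.loads(wlm_config)
print(queues[0]["concurrency_scaling"])  # auto
```

The JSON string would be applied to the cluster's parameter group; only queues marked `auto` can route queued read queries to concurrency scaling clusters.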

Question 8

A company analyzes historical data and needs to query data that is stored in Amazon S3. New data is generated daily as .csv files that are stored in Amazon S3. The company’s analysts are using Amazon Athena to perform SQL queries against a recent subset of the overall data. The amount of data that is ingested into Amazon S3 has increased substantially over time, and the query latency also has increased. Which solutions could the company implement to improve query performance? (Choose two.)

A. Use MySQL Workbench on an Amazon EC2 instance, and connect to Athena by using a JDBC or ODBC connector. Run the query from MySQL Workbench instead of Athena directly.
B. Use Athena to extract the data and store it in Apache Parquet format on a daily basis. Query the extracted data.
C. Run a daily AWS Glue ETL job to convert the data files to Apache Parquet and to partition the converted files. Create a periodic AWS Glue crawler to automatically crawl the partitioned data on a daily basis.
D. Run a daily AWS Glue ETL job to compress the data files by using the .gzip format. Query the compressed data.
E. Run a daily AWS Glue ETL job to compress the data files by using the .lzo format. Query the compressed data.

Correct Answers: B. Use Athena to extract the data and store it in Apache Parquet format on a daily basis. Query the extracted data.; C. Run a daily AWS Glue ETL job to convert the data files to Apache Parquet and to partition the converted files. Create a periodic AWS Glue crawler to automatically crawl the partitioned data on a daily basis.

Why B is correct: Apache Parquet is a columnar storage format that significantly improves Athena query performance through better compression and column pruning. By extracting and converting daily, you maintain updated data in an optimized format, reducing query latency dramatically compared to CSV files.

Why C is correct: This solution combines two optimizations: converting to Parquet format (columnar, compressed) and partitioning the data (allowing partition pruning to scan less data). The AWS Glue crawler automatically discovers new partitions daily, keeping the catalog updated. This addresses both the format inefficiency and the growing data volume, providing the best long-term query performance improvement.

Why others are wrong:
A. Using MySQL Workbench does not improve the underlying query performance; it is just a different client.
D & E. While compression helps, .gzip and .lzo do not provide the same performance benefits as the columnar Parquet format, and they do not address the growing data volume through partitioning.
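One low-effort way to do the conversion option B describes is an Athena CTAS (CREATE TABLE AS SELECT) statement that rewrites CSV-backed data as partitioned Parquet. The sketch below builds such a statement; the table, column, and bucket names are hypothetical, and a real job would also handle incremental daily loads.

```python
def ctas_to_parquet(source_table: str, target_table: str, output_location: str) -> str:
    """Build an Athena CTAS statement that writes partitioned, Snappy-compressed Parquet."""
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        "  parquet_compression = 'SNAPPY',\n"
        f"  external_location = '{output_location}',\n"
        "  partitioned_by = ARRAY['dt']\n"          # partition columns go last in SELECT
        ") AS\n"
        f"SELECT col_a, col_b, dt FROM {source_table}"
    )

sql = ctas_to_parquet("raw_csv_logs", "logs_parquet", "s3://example-bucket/parquet/")
print("PARQUET" in sql)  # True
```

Partition pruning then lets Athena scan only the `dt` values a query actually touches, which is what keeps latency flat as the data grows.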

Question 9

A company has a marketing department and a finance department. The departments are storing data in Amazon S3 in their own AWS accounts in AWS Organizations. Both departments use AWS Lake Formation to catalog and secure their data. The departments have some databases and tables that share common names. The marketing department needs to securely access some tables from the finance department. Which two steps are required for this process? (Choose two.)

A. The finance department grants Lake Formation permissions for the tables to the external account for the marketing department.
B. The finance department creates cross-account IAM permissions to the table for the marketing department role.
C. The marketing department creates an IAM role that has permissions to the Lake Formation tables.

Correct Answers: A. The finance department grants Lake Formation permissions for the tables to the external account for the marketing department.; C. The marketing department creates an IAM role that has permissions to the Lake Formation tables.

Why A is correct: Lake Formation uses a resource sharing model where the data owner (the finance department) must explicitly grant permissions on specific tables to external AWS accounts. This Lake Formation grant is essential for cross-account data access in a Lake Formation-managed environment.

Why C is correct: The marketing department needs an IAM role with appropriate permissions to assume and access the shared Lake Formation tables. This role acts as the identity that accesses the finance department's data through Lake Formation's permission model.

Why B is wrong: While IAM permissions are needed, Lake Formation uses its own permission model for table-level access control, not direct cross-account IAM table permissions. The proper approach is Lake Formation grants plus IAM roles, not pure IAM permissions to tables.
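The finance-side grant from option A can be sketched as the request shape of the Lake Formation `GrantPermissions` API. The account IDs, database, and table names below are hypothetical, and the dict is built locally without making any AWS call.

```python
FINANCE_ACCOUNT = "111111111111"    # data owner; its catalog holds the table
MARKETING_ACCOUNT = "222222222222"  # external account receiving access

grant_request = {
    # For cross-account sharing, the principal is the external account itself;
    # individual roles in that account are then granted access by its own admin.
    "Principal": {"DataLakePrincipalIdentifier": MARKETING_ACCOUNT},
    "Resource": {
        "Table": {
            "CatalogId": FINANCE_ACCOUNT,   # disambiguates shared database/table names
            "DatabaseName": "finance_db",
            "Name": "quarterly_revenue",
        }
    },
    "Permissions": ["SELECT", "DESCRIBE"],
}

print(grant_request["Permissions"])  # ['SELECT', 'DESCRIBE']
```

Note that `CatalogId` names the owning account, which matters here precisely because the two departments have databases and tables with common names.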

Question 10

A company developed a new elections reporting website that uses Amazon Kinesis Data Firehose to deliver full logs from AWS WAF to an Amazon S3 bucket. The company is now seeking a low-cost option to perform infrequent analysis of these logs, with visualizations, in a way that requires minimal development effort. Which solution meets these requirements?

A. Use an AWS Glue crawler to create and update a table in the Glue data catalog from the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.
B. Create a second Kinesis Data Firehose delivery stream to deliver the log files to Amazon OpenSearch Service (Amazon Elasticsearch Service). Use Amazon ES to perform text-based searches of the logs for ad-hoc analyses and use OpenSearch Dashboards (Kibana) for data visualizations.
C. Create an AWS Lambda function to convert the logs into .csv format. Then add the function to the Kinesis Data Firehose transformation configuration. Use Amazon Redshift to perform ad-hoc analyses of the logs using SQL queries and use Amazon QuickSight to develop data visualizations.
D. Create an Amazon EMR cluster and use Amazon S3 as the data source. Create an Apache Spark job to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.

Correct Answer: A. Use an AWS Glue crawler to create and update a table in the Glue data catalog from the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.

Why A is correct: This solution leverages serverless AWS services for minimal operational overhead. An AWS Glue crawler automatically discovers the schema of the WAF logs in S3 and creates and updates tables in the Glue Data Catalog. Athena provides ad-hoc SQL querying against S3 data with pay-per-query pricing (low cost for infrequent analysis). QuickSight connects directly to Athena for visualization with minimal development effort: dashboard creation is point-and-click. This is the most cost-effective and lowest-effort solution.

Why others are wrong:
B. Requires managing an OpenSearch cluster, which adds operational overhead and continuous costs even when not querying.
C. Requires Lambda development for CSV conversion and managing a Redshift cluster (high operational overhead and cost).
D. Requires managing an EMR cluster and developing Spark jobs, adding significant operational complexity and cost.
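Once the crawler has cataloged the logs, an ad-hoc query is a single StartQueryExecution call. The sketch below shows the shape of those parameters; the database, table, and result-bucket names are hypothetical, and the dict is built locally with no call made.

```python
# Parameters of the shape Athena's StartQueryExecution API accepts.
query_params = {
    "QueryString": (
        "SELECT action, COUNT(*) AS requests "
        "FROM waf_logs "                 # hypothetical table created by the Glue crawler
        "GROUP BY action "
        "ORDER BY requests DESC"
    ),
    "QueryExecutionContext": {"Database": "security_logs_db"},
    # Athena writes result files (CSV plus metadata) under this S3 prefix.
    "ResultConfiguration": {"OutputLocation": "s3://example-athena-results/"},
}

print(query_params["QueryExecutionContext"]["Database"])  # security_logs_db
```

Because billing is per data scanned, infrequent queries like this one cost only what they read, which is why option A fits the "low-cost, infrequent" requirement.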
