
Professional-Data-Engineer Exam Dumps - Google Cloud Certified Questions and Answers

Question # 54

You work for an advertising company, and you’ve developed a Spark ML model to predict click-through rates at advertisement blocks. You’ve been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data warehouse will be migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud. What should you do?

Options:

A.

Use Cloud ML Engine for training existing Spark ML models

B.

Rewrite your models on TensorFlow, and start using Cloud ML Engine

C.

Use Cloud Dataproc for training existing Spark ML models, but start reading data directly from BigQuery

D.

Spin up a Spark cluster on Compute Engine, and train Spark ML models on the data exported from BigQuery
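For context on option C, Dataproc clusters can read training data straight out of BigQuery through the spark-bigquery connector, so no export step is needed. A minimal PySpark sketch, assuming the connector is available on the cluster; the table name is hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ctr-training").getOrCreate()

# Load training data directly from BigQuery via the spark-bigquery connector.
# Table name is hypothetical.
df = (
    spark.read.format("bigquery")
    .option("table", "my_project.ads.clicks")
    .load()
)
df.printSchema()
```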

Question # 55

You are on the data governance team and are implementing security requirements to deploy resources. You need to ensure that resources are limited to only the europe-west3 region. You want to follow Google-recommended practices. What should you do?

Options:

A.

Deploy resources with Terraform and implement a variable validation rule to ensure that the region is set to the europe-west3 region for all resources.

B.

Set the constraints/gcp.resourceLocations organization policy constraint to in:eu-locations.

C.

Create a Cloud Function to monitor all resources created and automatically destroy the ones created outside the europe-west3 region.

D.

Set the constraints/gcp.resourceLocations organization policy constraint to in:europe-west3-locations.
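For reference, the resourceLocations constraint named in options B and D can also be set programmatically. A hedged sketch using the google-cloud-org-policy client library; the organization ID is hypothetical, and the exact message shapes should be verified against the library's docs:

```python
from google.cloud import orgpolicy_v2

client = orgpolicy_v2.OrgPolicyClient()

# Restrict resource locations for the whole organization.
# Organization ID is hypothetical.
policy = orgpolicy_v2.Policy(
    name="organizations/123456789/policies/gcp.resourceLocations",
    spec=orgpolicy_v2.PolicySpec(
        rules=[
            orgpolicy_v2.PolicySpec.PolicyRule(
                values=orgpolicy_v2.PolicySpec.PolicyRule.StringValues(
                    allowed_values=["in:europe-west3-locations"]
                )
            )
        ]
    ),
)
client.update_policy(policy=policy)
```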

Question # 56

You are designing a data processing pipeline. The pipeline must be able to scale automatically as load increases. Messages must be processed at least once, and must be ordered within windows of 1 hour. How should you design the solution?

Options:

A.

Use Apache Kafka for message ingestion and use Cloud Dataproc for streaming analysis.

B.

Use Apache Kafka for message ingestion and use Cloud Dataflow for streaming analysis.

C.

Use Cloud Pub/Sub for message ingestion and Cloud Dataproc for streaming analysis.

D.

Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
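For background on option D, Pub/Sub delivers messages at least once and Dataflow autoscales with load; a Beam pipeline can then impose 1-hour fixed windows. A minimal streaming sketch with the Beam Python SDK; the topic name is hypothetical:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/events"  # hypothetical topic
        )
        | "Window" >> beam.WindowInto(window.FixedWindows(60 * 60))  # 1-hour windows
        | "Count" >> beam.combiners.Count.Globally().without_defaults()
        | "Print" >> beam.Map(print)
    )
```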

Question # 57

You launched a new gaming app almost three years ago. You have been uploading log files from the previous day to a separate Google BigQuery table with the table name format LOGS_yyyymmdd. You have been using table wildcard functions to generate daily and monthly reports for all time ranges. Recently, you discovered that some queries that cover long date ranges are exceeding the limit of 1,000 tables and failing. How can you resolve this issue?

Options:

A.

Convert all daily log tables into date-partitioned tables

B.

Convert the sharded tables into a single partitioned table

C.

Enable query caching so you can cache data from previous months

D.

Create separate views to cover each month, and query from these views
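For illustration of options A and B, BigQuery can fold the LOGS_yyyymmdd shards into a single date-partitioned table with one DDL statement over a wildcard scan, after which long date ranges touch one table instead of a thousand. A sketch with the google-cloud-bigquery client; project and dataset names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Consolidate the daily LOGS_yyyymmdd shards into one partitioned table,
# deriving the partition date from each shard's suffix. Names are hypothetical.
query = """
CREATE TABLE `my_project.logs.events`
PARTITION BY log_date AS
SELECT PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS log_date, *
FROM `my_project.logs.LOGS_*`
"""
client.query(query).result()
```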

Question # 58

You use a dataset in BigQuery for analysis. You want to provide third-party companies with access to the same dataset. You need to keep the costs of data sharing low and ensure that the data is current. What should you do?

Options:

A.

Use Analytics Hub to control data access, and provide third-party companies with access to the dataset.

B.

Create a Dataflow job that reads the data in frequent time intervals and writes it to the relevant BigQuery dataset or Cloud Storage bucket for third-party companies to use.

C.

Use Cloud Scheduler to export the data on a regular basis to Cloud Storage, and provide third-party companies with access to the bucket.

D.

Create a separate dataset in BigQuery that contains the relevant data to share, and provide third-party companies with access to the new dataset.
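Whichever sharing mechanism is chosen, the cost and freshness concerns come from avoiding copies. For context, BigQuery can grant a third party read access to a live dataset in place, so readers always see current data. A sketch with the google-cloud-bigquery client; the dataset name and email are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset("my_project.shared_data")  # hypothetical

# Grant a third party read access to the live dataset; no copy is made.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="analyst@partner.example.com",  # hypothetical
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```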

Question # 59

You have some data, which is shown in the graphic below. The two dimensions are X and Y, and the shade of each dot indicates its class. You want to classify this data accurately using a linear algorithm.

To do this you need to add a synthetic feature. What should the value of that feature be?

Options:

A.

X^2+Y^2

B.

X^2

C.

Y^2

D.

cos(X)
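The intuition: if the two classes sit in concentric regions around the origin, no straight line in (X, Y) separates them, but the radial feature X^2+Y^2 turns the boundary into a simple threshold. A small NumPy illustration on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
xy = rng.normal(size=(1000, 2))     # columns are X and Y
r2 = (xy ** 2).sum(axis=1)          # synthetic feature: X^2 + Y^2
labels = (r2 > 1.0).astype(int)     # inner cluster vs. outer ring

# With the radial feature added, a linear model separates the classes by
# thresholding X^2 + Y^2 alone; no curve in (X, Y) space is needed.
print(labels[:10], r2[:10].round(2))
```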

Question # 60

You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?

Options:

A.

Copy your data to Compute Engine disks. Manage and run your jobs directly on those instances.

B.

Move your data to Cloud Storage. Run your jobs on Dataproc.

C.

Move your data to BigQuery. Convert your Spark scripts to a SQL-based processing approach.

D.

Rewrite your jobs in Apache Beam. Run your jobs in Dataflow.
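As background on running existing Spark jobs on a managed service, Dataproc accepts Spark jobs largely unchanged. A hedged sketch submitting a PySpark job with the google-cloud-dataproc client; the project, region, cluster, and file URI are all hypothetical:

```python
from google.cloud import dataproc_v1

region = "us-central1"  # hypothetical
client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

# Submit an existing PySpark script to a managed Dataproc cluster with the
# code unchanged. Project, cluster, and GCS path are hypothetical.
job = {
    "placement": {"cluster_name": "my-cluster"},
    "pyspark_job": {"main_python_file_uri": "gs://my-bucket/jobs/train.py"},
}
result = client.submit_job(project_id="my-project", region=region, job=job)
print(result.reference.job_id)
```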

Question # 61

You need to move 2 PB of historical data from an on-premises storage appliance to Cloud Storage within six months, and your outbound network capacity is constrained to 20 Mb/sec. How should you migrate this data to Cloud Storage?

Options:

A.

Use Transfer Appliance to copy the data to Cloud Storage

B.

Use gsutil cp -J to compress the content being uploaded to Cloud Storage

C.

Create a private URL for the historical data, and then use Storage Transfer Service to copy the data to Cloud Storage

D.

Use trickle or ionice along with gsutil cp to limit the amount of bandwidth gsutil utilizes to less than 20 Mb/sec so it does not interfere with the production traffic
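A quick back-of-the-envelope calculation shows why the constrained link rules out a purely network-based transfer:

```python
# Rough feasibility check for the network option: 2 PB at 20 Mb/sec.
data_bits = 2 * 10**15 * 8        # 2 PB in bits (decimal petabytes)
rate_bps = 20 * 10**6             # 20 megabits per second
seconds = data_bits / rate_bps
years = seconds / (3600 * 24 * 365)
print(f"~{years:.0f} years")      # roughly 25 years, far beyond six months
```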

Question # 62

Which SQL keyword can be used to reduce the number of columns processed by BigQuery?

Options:

A.

BETWEEN

B.

WHERE

C.

SELECT

D.

LIMIT
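The reason the column list matters is that BigQuery's storage is columnar, so a query reads (and bills for) only the columns named in SELECT. A dry-run sketch with the google-cloud-bigquery client, using hypothetical names, reports the bytes a query would process:

```python
from google.cloud import bigquery

client = bigquery.Client()

# A dry run reports bytes processed without running the query or incurring
# cost. Table and column names are hypothetical; compare against SELECT *.
config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(
    "SELECT user_id, event_ts FROM `my_project.logs.events`",
    job_config=config,
)
print(job.total_bytes_processed)
```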

Question # 63

Cloud Bigtable is a recommended option for storing very large amounts of ____________________________?

Options:

A.

multi-keyed data with very high latency

B.

multi-keyed data with very low latency

C.

single-keyed data with very low latency

D.

single-keyed data with very high latency
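For context, Bigtable addresses each row by a single row key, which is what enables low-latency point lookups at large scale. A sketch with the google-cloud-bigtable client; the project, instance, table, and key are hypothetical:

```python
from google.cloud import bigtable

# Names are hypothetical.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("my-table")

# A point lookup by the single row key is Bigtable's low-latency access path.
row = table.read_row(b"user#12345")
if row is not None:
    print(row.cells)
```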
