
Google Professional-Data-Engineer Exam: Complete Study and Preparation Tips

Question 31

Your startup has a web application that currently serves customers out of a single region in Asia. You are targeting funding that will allow your startup to serve customers globally. Your current goal is to optimize for cost, and your post-funding goal is to optimize for global presence and performance. You must use a native JDBC driver. What should you do?

Options:

A.

Use Cloud Spanner to configure a single-region instance initially, and then configure multi-region Cloud Spanner instances after securing funding.

B.

Use a Cloud SQL for PostgreSQL highly available instance first, and Bigtable with US, Europe, and Asia replication after securing funding.

C.

Use a Cloud SQL for PostgreSQL zonal instance first, and Bigtable with US, Europe, and Asia after securing funding.

D.

Use a Cloud SQL for PostgreSQL zonal instance first, and Cloud SQL for PostgreSQL with highly available configuration after securing funding.

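Option A mentions starting with a single-region Cloud Spanner instance; Cloud Spanner also provides an official open-source JDBC driver. Below is a minimal sketch of creating such an instance with the Python admin client; the project, instance, and configuration names are hypothetical, and a multi-region configuration could be adopted later.

from google.cloud import spanner

# Hypothetical project, instance, and configuration names.
client = spanner.Client(project="my-startup-project")
instance = client.instance(
    "customer-db",
    configuration_name="projects/my-startup-project/instanceConfigs/regional-asia-southeast1",
    node_count=1,
    display_name="Customer DB (single region)",
)
instance.create().result(timeout=300)  # wait for the long-running create operation
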
Question 32

Your globally distributed auction application allows users to bid on items. Occasionally, users place identical bids at nearly identical times, and different application servers process those bids. Each bid event contains the item, amount, user, and timestamp. You want to collate those bid events into a single location in real time to determine which user bid first. What should you do?

Options:

A.

Create a file on a shared file system and have the application servers write all bid events to that file. Process the file with Apache Hadoop to identify which user bid first.

B.

Have each application server write the bid events to Cloud Pub/Sub as they occur. Push the events from Cloud Pub/Sub to a custom endpoint that writes the bid event information into Cloud SQL.

C.

Set up a MySQL database for each application server to write bid events into. Periodically query each of those distributed MySQL databases and update a master MySQL database with bid event information.

D.

Have each application server write the bid events to Google Cloud Pub/Sub as they occur. Use a pull subscription to pull the bid events using Google Cloud Dataflow. Give the bid for each item to the user in the bid event that is processed first.

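Option D describes a streaming pipeline that pulls bid events from Cloud Pub/Sub and keeps the earliest bid per item. A minimal Apache Beam sketch in Python, assuming a hypothetical subscription path and JSON-encoded bid messages (on Dataflow this would run with the DataflowRunner):

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows

def earliest_bid(bids):
    # Earliest timestamp wins; the min over partial results is still the global min.
    return min(bids, key=lambda b: b["timestamp"])

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadBids" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/bid-events")  # hypothetical
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "Window" >> beam.WindowInto(FixedWindows(60))
        | "KeyByItem" >> beam.Map(lambda bid: (bid["item"], bid))
        | "FirstBidPerItem" >> beam.CombinePerKey(earliest_bid)
        | "Print" >> beam.Map(print)
    )
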
Question 33

Your company's data platform ingests CSV file dumps of booking and user profile data from upstream sources into Cloud Storage. The data analyst team wants to join these datasets on the email field available in both the datasets to perform analysis. However, personally identifiable information (PII) should not be accessible to the analysts. You need to de-identify the email field in both the datasets before loading them into BigQuery for analysts. What should you do?

Options:

A.

1. Create a pipeline to de-identify the email field by using recordTransformations in Cloud Data Loss Prevention (Cloud DLP) with masking as the de-identification transformation type.

2. Load the booking and user profile data into a BigQuery table.

B.

1. Create a pipeline to de-identify the email field by using recordTransformations in Cloud DLP with format-preserving encryption with FFX as the de-identification transformation type.

2. Load the booking and user profile data into a BigQuery table.

C.

1. Load the CSV files from Cloud Storage into a BigQuery table, and enable dynamic data masking.

2. Create a policy tag with the email mask as the data masking rule.

3. Assign the policy to the email field in both tables.

4. Assign the Identity and Access Management bigquerydatapolicy.maskedReader role for the BigQuery tables to the analysts.

D.

1. Load the CSV files from Cloud Storage into a BigQuery table, and enable dynamic data masking.

2. Create a policy tag with the default masking value as the data masking rule.

3. Assign the policy to the email field in both tables.

4. Assign the Identity and Access Management bigquerydatapolicy.maskedReader role for the BigQuery tables to the analysts.

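Options A and B both rely on Cloud DLP recordTransformations; format-preserving encryption (FFX) produces the same ciphertext for the same email in both files, so the join key survives de-identification. A minimal Python sketch, assuming a hypothetical project, a Cloud KMS-wrapped key, and a CSV row already parsed into a DLP table item:

from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()
parent = "projects/my-project/locations/global"  # hypothetical project

deidentify_config = {
    "record_transformations": {
        "field_transformations": [{
            "fields": [{"name": "email"}],
            "primitive_transformation": {
                "crypto_replace_ffx_fpe_config": {
                    "crypto_key": {
                        "kms_wrapped": {
                            "wrapped_key": b"...",  # key material wrapped by Cloud KMS (elided)
                            "crypto_key_name": "projects/my-project/locations/global/"
                                               "keyRings/dlp/cryptoKeys/email-key",
                        }
                    },
                    # Custom alphabet so typical email characters are encryptable in place.
                    "custom_alphabet": "abcdefghijklmnopqrstuvwxyz0123456789@._-",
                }
            },
        }]
    }
}

item = {
    "table": {
        "headers": [{"name": "email"}, {"name": "booking_id"}],
        "rows": [{"values": [
            {"string_value": "user@example.com"},
            {"string_value": "B-1001"},
        ]}],
    }
}

response = dlp.deidentify_content(
    request={"parent": parent, "deidentify_config": deidentify_config, "item": item}
)
print(response.item.table)  # email column now holds format-preserving ciphertext
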
Question 34

You are planning to use Cloud Storage as part of your data lake solution. The Cloud Storage bucket will contain objects ingested from external systems. Each object will be ingested once, and the access patterns of individual objects will be random. You want to minimize the cost of storing and retrieving these objects. You want to ensure that any cost optimization efforts are transparent to the users and applications. What should you do?

Options:

A.

Create a Cloud Storage bucket with Autoclass enabled.

B.

Create a Cloud Storage bucket with an Object Lifecycle Management policy to transition objects from Standard to Coldline storage class if an object age reaches 30 days.

C.

Create a Cloud Storage bucket with an Object Lifecycle Management policy to transition objects from Standard to Coldline storage class if an object is not live.

D.

Create two Cloud Storage buckets. Use the Standard storage class for the first bucket, and use the Coldline storage class for the second bucket. Migrate objects from the first bucket to the second bucket after 30 days.

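Option A's Autoclass bucket can be created directly from the Python client; Autoclass then moves each object between storage classes based on its own access pattern, with no change visible to readers. A minimal sketch, assuming a hypothetical project, bucket name, and location (the autoclass property requires a reasonably recent google-cloud-storage release):

from google.cloud import storage

client = storage.Client(project="my-project")   # hypothetical project
bucket = client.bucket("data-lake-ingest")       # hypothetical bucket name
bucket.autoclass_enabled = True                  # let Cloud Storage pick the class per object
new_bucket = client.create_bucket(bucket, location="ASIA-SOUTHEAST1")
print(new_bucket.autoclass_enabled)
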
Question 35

You need to set access to BigQuery for different departments within your company. Your solution should comply with the following requirements:

    Each department should have access only to their data.

    Each department will have one or more leads who need to be able to create and update tables and provide them to their team.

    Each department has data analysts who need to be able to query but not modify data.

How should you set access to the data in BigQuery?

Options:

A.

Create a dataset for each department. Assign the department leads the role of OWNER, and assign the data analysts the role of WRITER on their dataset.

B.

Create a dataset for each department. Assign the department leads the role of WRITER, and assign the data analysts the role of READER on their dataset.

C.

Create a table for each department. Assign the department leads the role of Owner, and assign the data analysts the role of Editor on the project the table is in.

D.

Create a table for each department. Assign the department leads the role of Editor, and assign the data analysts the role of Viewer on the project the table is in.

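Options A and B both work at the dataset level, where BigQuery access is expressed as access entries on each department's dataset. A minimal sketch of granting WRITER to a department-leads group and READER to an analysts group, assuming hypothetical project, dataset, and Google group names:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.get_dataset("my-project.finance_dept")  # hypothetical dataset

entries = list(dataset.access_entries)
entries.append(bigquery.AccessEntry(
    role="WRITER", entity_type="groupByEmail", entity_id="finance-leads@example.com"))
entries.append(bigquery.AccessEntry(
    role="READER", entity_type="groupByEmail", entity_id="finance-analysts@example.com"))
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])  # send only the updated ACL field
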
Question 36

You want to encrypt the customer data stored in BigQuery. You need to implement per-user crypto-deletion on data stored in your tables. You want to adopt native features in Google Cloud to avoid custom solutions. What should you do?

Options:

A.

Create a customer-managed encryption key (CMEK) in Cloud KMS. Associate the key to the table while creating the table.

B.

Create a customer-managed encryption key (CMEK) in Cloud KMS. Use the key to encrypt data before storing in BigQuery.

C.

Implement Authenticated Encryption with Associated Data (AEAD) BigQuery functions while storing your data in BigQuery.

D.

Encrypt your data during ingestion by using a cryptographic library supported by your ETL pipeline.

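Option C refers to BigQuery's built-in AEAD encryption functions, where each user can be given their own keyset; deleting a user's keyset row makes that user's ciphertext unrecoverable. A minimal sketch run through the Python client, with hypothetical project, dataset, table, and column names:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# One AEAD keyset per user; dropping a user's row here crypto-deletes their data.
client.query("""
CREATE TABLE IF NOT EXISTS `my-project.secure.user_keys` AS
SELECT 'user_123' AS user_id, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset
""").result()

# Encrypt each customer's data with that customer's own keyset.
client.query("""
CREATE OR REPLACE TABLE `my-project.secure.customer_data` AS
SELECT
  c.user_id,
  AEAD.ENCRYPT(k.keyset, c.email, c.user_id) AS email_encrypted
FROM `my-project.raw.customers` AS c
JOIN `my-project.secure.user_keys` AS k USING (user_id)
""").result()
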
Question 37

You are integrating one of your internal IT applications and Google BigQuery, so users can query BigQuery from the application’s interface. You do not want individual users to authenticate to BigQuery and you do not want to give them access to the dataset. You need to securely access BigQuery from your IT application.

What should you do?

Options:

A.

Create groups for your users and give those groups access to the dataset

B.

Integrate with a single sign-on (SSO) platform, and pass each user’s credentials along with the query request

C.

Create a service account and grant dataset access to that account. Use the service account’s private key to access the dataset

D.

Create a dummy user and grant dataset access to that user. Store the username and password for that user in a file on the file system, and use those credentials to access the BigQuery dataset

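Option C's service-account approach keeps end users out of the dataset ACL entirely: the application authenticates as the service account and runs queries on their behalf. A minimal sketch, assuming a hypothetical downloaded key file, dataset, and table (when the app runs on Google Cloud, an attached service account avoids handling key files at all):

from google.cloud import bigquery
from google.oauth2 import service_account

# Hypothetical key file exported for the application's service account.
credentials = service_account.Credentials.from_service_account_file(
    "bq-app-sa.json",
    scopes=["https://www.googleapis.com/auth/bigquery"],
)
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

rows = client.query(
    "SELECT status, COUNT(*) AS n FROM `my-project.reports.tickets` GROUP BY status"
).result()
for row in rows:
    print(row.status, row.n)
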
Question 38

The marketing team at your organization provides regular updates of a segment of your customer dataset. The marketing team has given you a CSV with 1 million records that must be updated in BigQuery. When you use the UPDATE statement in BigQuery, you receive a quotaExceeded error. What should you do?

Options:

A.

Reduce the number of records updated each day to stay within the BigQuery UPDATE DML statement limit.

B.

Increase the BigQuery UPDATE DML statement limit in the Quota management section of the Google Cloud Platform Console.

C.

Split the source CSV file into smaller CSV files in Cloud Storage to reduce the number of BigQuery UPDATE DML statements per BigQuery job.

D.

Import the new records from the CSV file into a new BigQuery table. Create a BigQuery job that merges the new records with the existing records and writes the results to a new BigQuery table.

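Option D sidesteps the DML quota by loading the CSV into its own table and applying all one million changes in a single MERGE job. A minimal sketch with the Python client, assuming hypothetical bucket, dataset, table, and column names:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# 1. Load the marketing CSV into a staging table.
load_job = client.load_table_from_uri(
    "gs://marketing-drop/customer_updates.csv",        # hypothetical path
    "my-project.crm.customer_updates_staging",
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
        write_disposition="WRITE_TRUNCATE",
    ),
)
load_job.result()

# 2. One MERGE statement applies every update as a single DML job.
client.query("""
MERGE `my-project.crm.customers` AS t
USING `my-project.crm.customer_updates_staging` AS s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN
  UPDATE SET t.segment = s.segment, t.email = s.email
WHEN NOT MATCHED THEN
  INSERT (customer_id, segment, email) VALUES (s.customer_id, s.segment, s.email)
""").result()
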
Question 39

You created an analytics environment on Google Cloud so that your data scientist team can explore data without impacting the on-premises Apache Hadoop solution. The data in the on-premises Hadoop Distributed File System (HDFS) cluster is in Optimized Row Columnar (ORC) formatted files with multiple columns of Hive partitioning. The data scientist team needs to be able to explore the data in a similar way as they used the on-premises HDFS cluster with SQL on the Hive query engine. You need to choose the most cost-effective storage and processing solution. What should you do?

Options:

A.

Import the ORC files to Bigtable tables for the data scientist team.

B.

Import the ORC files to BigQuery tables for the data scientist team.

C.

Copy the ORC files to Cloud Storage, then deploy a Dataproc cluster for the data scientist team.

D.

Copy the ORC files to Cloud Storage, then create external BigQuery tables for the data scientist team.

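Option D keeps the ORC files in Cloud Storage and exposes them to SQL through BigQuery external tables, with the Hive partition keys in the object paths surfaced as columns. A minimal sketch, assuming a hypothetical bucket layout and dataset:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")

external_config = bigquery.ExternalConfig("ORC")
external_config.source_uris = ["gs://analytics-lake/orders/*"]  # hypothetical bucket
hive_opts = bigquery.HivePartitioningOptions()
hive_opts.mode = "AUTO"                                          # infer partition key types
hive_opts.source_uri_prefix = "gs://analytics-lake/orders/"
external_config.hive_partitioning = hive_opts

table = bigquery.Table("my-project.lake.orders_external")
table.external_data_configuration = external_config
client.create_table(table)
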
Question 40

You have a petabyte of analytics data and need to design a storage and processing platform for it. You must be able to perform data warehouse-style analytics on the data in Google Cloud and expose the dataset as files for batch analysis tools in other cloud providers. What should you do?

Options:

A.

Store and process the entire dataset in BigQuery.

B.

Store and process the entire dataset in Cloud Bigtable.

C.

Store the full dataset in BigQuery, and store a compressed copy of the data in a Cloud Storage bucket.

D.

Store the warm data as files in Cloud Storage, and store the active data in BigQuery. Keep this ratio as 80% warm and 20% active.

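Whichever storage split is chosen, the "files for other clouds" half is typically produced with a BigQuery extract job to Cloud Storage, where a wildcard URI lets BigQuery shard a very large table into many files. A minimal sketch, assuming hypothetical table and bucket names:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")

extract_job = client.extract_table(
    "my-project.analytics.events",                      # hypothetical source table
    "gs://analytics-export/events/part-*.parquet",      # wildcard shards the export
    job_config=bigquery.ExtractJobConfig(destination_format="PARQUET"),
)
extract_job.result()
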
Exam Code: Professional-Data-Engineer
Exam Name: Google Professional Data Engineer Exam
Last Update: Dec 14, 2024
Questions: 372
