Professional-Data-Engineer Exam Dumps - Google Professional Data Engineer Exam

Question # 4

You have uploaded 5 years of log data to Cloud Storage. A user reported that some data points in the log data are outside of their expected ranges, which indicates errors. You need to address this issue and be able to run the process again in the future while keeping the original data for compliance reasons. What should you do?

A.

Import the data from Cloud Storage into BigQuery. Create a new BigQuery table, and skip the rows with errors.

B.

Create a Compute Engine instance and create a new copy of the data in Cloud Storage. Skip the rows with errors.

C.

Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to a new dataset in Cloud Storage.

D.

Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to the same dataset in Cloud Storage.
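For context, options C and D describe a Dataflow (Apache Beam) pipeline that replaces out-of-range values with a default. Below is a minimal Python sketch of that idea; the bucket paths, field name, and range bounds are placeholders rather than values from the question, and writing to a new prefix (as in option C) leaves the original data untouched for compliance.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder range bounds and default for the out-of-range check.
MIN_VALUE, MAX_VALUE = 0.0, 100.0
DEFAULT_VALUE = 0.0


def fix_out_of_range(record):
    """Replace an out-of-range value with a default instead of dropping the row."""
    value = record.get("value")
    if value is None or not (MIN_VALUE <= value <= MAX_VALUE):
        record["value"] = DEFAULT_VALUE
    return record


def run():
    # Runner, project, region, etc. are supplied as pipeline options at launch time.
    with beam.Pipeline(options=PipelineOptions()) as p:
        (
            p
            | "ReadLogs" >> beam.io.ReadFromText("gs://example-bucket/logs/*.json")
            | "Parse" >> beam.Map(json.loads)
            | "FixRange" >> beam.Map(fix_out_of_range)
            | "Serialize" >> beam.Map(json.dumps)
            # Writing to a new prefix keeps the original logs intact for compliance.
            | "WriteCleaned" >> beam.io.WriteToText("gs://example-bucket/logs-cleaned/part")
        )


if __name__ == "__main__":
    run()
```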

Question # 5

Your team is responsible for developing and maintaining ETLs in your company. One of your Dataflow jobs is failing because of some errors in the input data, and you need to improve the reliability of the pipeline, including being able to reprocess all failing data.

What should you do?

A.

Add a filtering step to skip these types of errors in the future, and extract erroneous rows from logs.

B.

Add a try… catch block to your DoFn that transforms the data, and extract erroneous rows from logs.

C.

Add a try… catch block to your DoFn that transforms the data, and write erroneous rows to Pub/Sub directly from the DoFn.

D.

Add a try… catch block to your DoFn that transforms the data, and use a sideOutput to create a PCollection that can be stored to Pub/Sub later.
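The try/catch-plus-side-output pattern in option D can be sketched with the Beam Python SDK's tagged outputs. Roughly, and with JSON parsing standing in for the real ETL transform and placeholder Cloud Storage paths as the error sink:

```python
import json

import apache_beam as beam
from apache_beam import pvalue
from apache_beam.options.pipeline_options import PipelineOptions


class ParseWithDeadLetter(beam.DoFn):
    """Wraps the transform in try/except and tags failures as a side output."""

    ERROR_TAG = "errors"

    def process(self, element):
        try:
            # Stand-in for the real ETL transform; any exception is caught below.
            yield json.loads(element)
        except Exception:
            # Failing elements become a separate PCollection instead of crashing the job.
            yield pvalue.TaggedOutput(self.ERROR_TAG, element)


def run():
    with beam.Pipeline(options=PipelineOptions()) as p:
        results = (
            p
            | "Read" >> beam.io.ReadFromText("gs://example-bucket/input/*.json")
            | "Transform" >> beam.ParDo(ParseWithDeadLetter()).with_outputs(
                ParseWithDeadLetter.ERROR_TAG, main="parsed"
            )
        )
        # The error PCollection can be written out (or published to Pub/Sub) for reprocessing.
        results[ParseWithDeadLetter.ERROR_TAG] | "WriteErrors" >> beam.io.WriteToText(
            "gs://example-bucket/dead-letter/part"
        )
        results.parsed | "Serialize" >> beam.Map(json.dumps) | "WriteGood" >> beam.io.WriteToText(
            "gs://example-bucket/output/part"
        )


if __name__ == "__main__":
    run()
```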

Question # 6

You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards, but the scanners currently transmit recipients’ personally identifiable information (PII) to analytics systems, which violates user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?

A.

Create an authorized view in BigQuery to restrict access to tables with sensitive data.

B.

Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.

C.

Use Stackdriver Logging to analyze the data passed through the entire pipeline to identify transactions that may contain sensitive information.

D.

Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.
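Option D can be sketched as a Pub/Sub-triggered Cloud Function that calls the Cloud DLP inspectContent API and routes messages with findings to a quarantine bucket for review. The project ID, bucket names, and info types below are placeholders, not values from the question:

```python
import base64

from google.cloud import dlp_v2, storage

dlp = dlp_v2.DlpServiceClient()
gcs = storage.Client()

PROJECT_ID = "example-project"                 # placeholder
QUARANTINE_BUCKET = "example-pii-quarantine"   # placeholder
CLEAN_BUCKET = "example-scans-clean"           # placeholder

INSPECT_CONFIG = {
    # Example info types; the real list would match the PII on shipping labels.
    "info_types": [
        {"name": "PERSON_NAME"},
        {"name": "STREET_ADDRESS"},
        {"name": "PHONE_NUMBER"},
    ],
    "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
}


def process_scan(event, context):
    """Background Cloud Function triggered by the scanners' Pub/Sub topic."""
    payload = base64.b64decode(event["data"]).decode("utf-8")

    response = dlp.inspect_content(
        request={
            "parent": f"projects/{PROJECT_ID}/locations/global",
            "inspect_config": INSPECT_CONFIG,
            "item": {"value": payload},
        }
    )

    # Quarantine messages with PII findings for review; pass the rest to analytics.
    bucket = QUARANTINE_BUCKET if response.result.findings else CLEAN_BUCKET
    gcs.bucket(bucket).blob(context.event_id).upload_from_string(payload)
```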

Question # 7

You are migrating your on-premises data warehouse to BigQuery. As part of the migration, you want to facilitate cross-team collaboration to get the most value out of the organization's data. You need to design an architecture that would allow teams within the organization to securely publish, discover, and subscribe to read-only data in a self-service manner. You need to minimize costs while also maximizing data freshness. What should you do?

A.

Create authorized datasets to publish shared data in the subscribing team's project.

B.

Create a new dataset for sharing in each individual team's project. Grant the subscribing team the bigquery.dataViewer role on the dataset.

C.

Use BigQuery Data Transfer Service to copy datasets to a centralized BigQuery project for sharing.

D.

Use Analytics Hub to facilitate data sharing.
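To make option B concrete, the read-only grant it describes can be applied with the BigQuery Python client by adding a READER access entry (the dataset-level equivalent of bigquery.dataViewer) to the shared dataset. The project, dataset, and group names below are placeholders; Analytics Hub (option D) is configured separately through its own exchanges and listings rather than this API.

```python
from google.cloud import bigquery

client = bigquery.Client(project="publisher-team-project")  # placeholder project

dataset = client.get_dataset("publisher-team-project.shared_sales")  # placeholder dataset

# Grant the subscribing team read-only access; READER on a dataset corresponds
# to the bigquery.dataViewer role mentioned in option B.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="groupByEmail",
        entity_id="subscriber-team@example.com",  # placeholder group
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```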

Question # 8

You have a BigQuery table that contains customer data, including sensitive information such as names and addresses. You need to share the customer data with your data analytics and consumer support teams securely. The data analytics team needs to access the data of all the customers, but must not be able to access the sensitive data. The consumer support team needs access to all data columns, but must not be able to access customers that no longer have active contracts. You enforced these requirements by using an authorized dataset and policy tags. After implementing these steps, the data analytics team reports that they still have access to the sensitive columns. You need to ensure that the data analytics team does not have access to restricted data. What should you do?

Choose 2 answers

A.

Create two separate authorized datasets; one for the data analytics team and another for the consumer support team.

B.

Ensure that the data analytics team members do not have the Data Catalog Fine-Grained Reader role for the policy tags.

C.

Enforce access control in the policy tag taxonomy.

D.

Remove the bigquery.dataViewer role from the data analytics team on the authorized datasets.

E.

Replace the authorized dataset with an authorized view. Use row-level security and apply a filter_expression to limit data access.
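Option E's row-level security relies on a row access policy whose filter_expression decides which rows a grantee can read. A minimal sketch, assuming a hypothetical customers table with a contract_status column and a placeholder consumer-support group, could run the DDL through the BigQuery Python client:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table, column, and group; the filter_expression keeps only rows
# for customers with active contracts visible to the consumer support team.
ddl = """
CREATE ROW ACCESS POLICY active_customers_only
ON `example-project.customer_data.customers`
GRANT TO ('group:consumer-support@example.com')
FILTER USING (contract_status = 'ACTIVE')
"""

client.query(ddl).result()  # runs the DDL statement in BigQuery
```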
