
Professional-Machine-Learning-Engineer Exam Dumps - Google Professional Machine Learning Engineer

Question # 4

As the lead ML Engineer for your company, you are responsible for building ML models to digitize scanned customer forms. You have developed a TensorFlow model that converts the scanned images into text and stores them in Cloud Storage. You need to use your ML model on the aggregated data collected at the end of each day with minimal manual intervention. What should you do?

A.

Use the batch prediction functionality of AI Platform

B.

Create a serving pipeline in Compute Engine for prediction

C.

Use Cloud Functions for prediction each time a new data point is ingested

D.

Deploy the model on AI Platform and create a version of it for online inference.

Question # 5

You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance?

Choose 2 answers

A.

Increase the score threshold.

B.

Decrease the score threshold.

C.

Add more positive examples to the training set.

D.

Add more negative examples to the training set.

E.

Reduce the maximum number of node hours for training.

Question # 6

You are a lead ML engineer at a retail company. You want to track and manage ML metadata in a centralized way so that your team can have reproducible experiments by generating artifacts. Which management solution should you recommend to your team?

A.

Store your tf.logging data in BigQuery.

B.

Manage all relational entities in the Hive Metastore.

C.

Store all ML metadata in Google Cloud’s operations suite.

D.

Manage your ML workflows with Vertex ML Metadata.

Question # 7

Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?

A.

Configure a Cloud Build trigger with the event set as "Pull Request"

B.

Configure a Cloud Build trigger with the event set as "Push to a branch"

C.

Configure a Cloud Function that builds the repository each time there is a code change.

D.

Configure a Cloud Function that builds the repository each time a new branch is created.

Question # 8

Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function. You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?

A.

Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available.

B.

Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team's budget.

C.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected.

D.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected.

Question # 9

Your team is building a convolutional neural network (CNN)-based architecture from scratch. The preliminary experiments running on your on-premises CPU-only infrastructure were encouraging, but have slow convergence. You have been asked to speed up model training to reduce time-to-market. You want to experiment with virtual machines (VMs) on Google Cloud to leverage more powerful hardware. Your code does not include any manual device placement and has not been wrapped in Estimator model-level abstraction. Which environment should you train your model on?

A.

A VM on Compute Engine and 1 TPU with all dependencies installed manually.

B.

A VM on Compute Engine and 8 GPUs with all dependencies installed manually.

C.

A Deep Learning VM with an n1-standard-2 machine and 1 GPU with all libraries pre-installed.

D.

A Deep Learning VM with more powerful CPU e2-highcpu-16 machines with all libraries pre-installed.

Question # 10

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

A.

Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

B.

Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C.

Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D.

Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

Question # 11

You are developing an ML model that uses sliced frames from video feed and creates bounding boxes around specific objects. You want to automate the following steps in your training pipeline: ingestion and preprocessing of data in Cloud Storage, followed by training and hyperparameter tuning of the object detection model using Vertex AI jobs, and finally deploying the model to an endpoint. You want to orchestrate the entire pipeline with minimal cluster management. What approach should you use?

A.

Use Kubeflow Pipelines on Google Kubernetes Engine.

B.

Use Vertex AI Pipelines with TensorFlow Extended (TFX) SDK.

C.

Use Vertex AI Pipelines with Kubeflow Pipelines SDK.

D.

Use Cloud Composer for the orchestration.

Question # 12

You need to develop an image classification model by using a large dataset that contains labeled images in a Cloud Storage Bucket. What should you do?

A.

Use Vertex AI Pipelines with the Kubeflow Pipelines SDK to create a pipeline that reads the images from Cloud Storage and trains the model.

B.

Use Vertex AI Pipelines with TensorFlow Extended (TFX) to create a pipeline that reads the images from Cloud Storage and trains the model.

C.

Import the labeled images as a managed dataset in Vertex AI, and use AutoML to train the model.

D.

Convert the image dataset to a tabular format using Dataflow. Load the data into BigQuery, and use BigQuery ML to train the model.

Question # 13

You work at a bank. You have a custom tabular ML model that was provided by the bank's vendor. The training data is not available due to its sensitivity. The model is packaged as a Vertex AI Model serving container, which accepts a string as input for each prediction instance. In each string the feature values are separated by commas. You want to deploy this model to production for online predictions, and monitor the feature distribution over time with minimal effort. What should you do?

A.

1. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective, and provide an instance schema.

B.

1. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective, and provide an instance schema.

C.

1. Refactor the serving container to accept key-value pairs as input format.

2. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective.

D.

1. Refactor the serving container to accept key-value pairs as input format.

2. Upload the model to Vertex AI Model Registry, and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective.

Question # 14

You have a custom job that runs on Vertex AI on a weekly basis. The job is implemented using a proprietary ML workflow that produces the datasets, models, and custom artifacts, and sends them to a Cloud Storage bucket. Many different versions of the datasets and models were created. Due to compliance requirements, your company needs to track which model was used for making a particular prediction, and needs access to the artifacts for each model. How should you configure your workflows to meet these requirements?

A.

Configure a TensorFlow Extended (TFX) ML Metadata database, and use the ML Metadata API.

B.

Create a Vertex AI experiment, and enable autologging inside the custom job.

C.

Use the Vertex AI Metadata API inside the custom job to create context, execution, and artifacts for each model, and use events to link them together.

D.

Register each model in Vertex AI Model Registry, and use model labels to store the related dataset and model information.

Question # 15

You are working with a dataset that contains customer transactions. You need to build an ML model to predict customer purchase behavior. You plan to develop the model in BigQuery ML, and export it to Cloud Storage for online prediction. You notice that the input data contains a few categorical features, including product category and payment method. You want to deploy the model as quickly as possible. What should you do?

A.

Use the TRANSFORM clause with the ML.ONE_HOT_ENCODER function on the categorical features at model creation, and select the categorical and non-categorical features.

B.

Use the ML.ONE_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.

C.

Use the CREATE MODEL statement and select the categorical and non-categorical features.

D.

Use the ML.ONE_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.
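For context on option C: BigQuery ML's CREATE MODEL statement automatically one-hot encodes STRING columns, so the categorical features can be passed in directly. A minimal sketch using the BigQuery Python client; the project, dataset, table, and column names are placeholders, not taken from the question.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# BigQuery ML one-hot encodes STRING columns such as product_category and
# payment_method automatically, so no manual TRANSFORM step is required here.
sql = """
CREATE OR REPLACE MODEL `my_dataset.purchase_model`
OPTIONS (model_type = 'logistic_reg',
         input_label_cols = ['purchased']) AS
SELECT product_category, payment_method, amount, purchased
FROM `my_dataset.transactions`
"""
client.query(sql).result()  # waits for the training job to finish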

Question # 16

You want to migrate a scikit-learn classifier model to TensorFlow. You plan to train the TensorFlow classifier model using the same training set that was used to train the scikit-learn model, and then compare the performances using a common test set. You want to use the Vertex AI Python SDK to manually log the evaluation metrics of each model and compare them based on their F1 scores and confusion matrices. How should you log the metrics?

A.

B.

C.

D.
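The answer choices for this question are not reproduced in text here. As background, a hedged sketch of manually logging evaluation metrics with the Vertex AI Python SDK (google-cloud-aiplatform); the project, experiment name, and metric values are placeholders.

from google.cloud import aiplatform

aiplatform.init(
    project="my-project",                 # placeholder project ID
    location="us-central1",
    experiment="classifier-comparison",   # placeholder experiment name
)

# One experiment run per framework so the two models can be compared side by side.
aiplatform.start_run("sklearn-baseline")
aiplatform.log_metrics({"f1_score": 0.81})
aiplatform.log_classification_metrics(
    labels=["negative", "positive"],
    matrix=[[948, 52], [37, 963]],        # confusion matrix rows (illustrative)
    display_name="sklearn-confusion-matrix",
)
aiplatform.end_run()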

Question # 17

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

A.

Write your data in TFRecords.

B.

Z-normalize all the numeric features.

C.

Oversample the fraudulent transactions 10 times.

D.

Use one-hot encoding on all categorical features.

Question # 18

You are an ML engineer at a global car manufacturer. You need to build an ML model to predict car sales in different cities around the world. Which features or feature crosses should you use to train city-specific relationships between car type and number of sales?

A.

Three individual features: binned latitude, binned longitude, and one-hot encoded car type

B.

One feature obtained as an element-wise product between latitude, longitude, and car type

C.

One feature obtained as an element-wise product between binned latitude, binned longitude, and one-hot encoded car type

D.

Two feature crosses as element-wise products: the first between binned latitude and one-hot encoded car type, and the second between binned longitude and one-hot encoded car type
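Option C describes a single cross of binned latitude, binned longitude, and one-hot encoded car type. A minimal sketch with the TensorFlow feature-column API; the bucket boundaries, vocabulary, and hash size are illustrative choices, not from the question.

import tensorflow as tf

latitude = tf.feature_column.numeric_column("latitude")
longitude = tf.feature_column.numeric_column("longitude")

# Bin the coordinates so nearby locations share a bucket.
lat_binned = tf.feature_column.bucketized_column(latitude, boundaries=list(range(-90, 91, 1)))
lon_binned = tf.feature_column.bucketized_column(longitude, boundaries=list(range(-180, 181, 1)))

car_type = tf.feature_column.categorical_column_with_vocabulary_list(
    "car_type", ["sedan", "suv", "truck", "hatchback"])   # placeholder vocabulary

# Cross all three so the model can learn a city-specific weight per car type.
city_car_cross = tf.feature_column.crossed_column(
    [lat_binned, lon_binned, car_type], hash_bucket_size=10_000)
cross_feature = tf.feature_column.indicator_column(city_car_cross)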

Question # 19

You deployed an ML model into production a year ago. Every month, you collect all raw requests that were sent to your model prediction service during the previous month. You send a subset of these requests to a human labeling service to evaluate your model’s performance. After a year, you notice that your model's performance sometimes degrades significantly after a month, while other times it takes several months to notice any decrease in performance. The labeling service is costly, but you also need to avoid large performance degradations. You want to determine how often you should retrain your model to maintain a high level of performance while minimizing cost. What should you do?

A.

Train an anomaly detection model on the training dataset, and run all incoming requests through this model. If an anomaly is detected, send the most recent serving data to the labeling service.

B.

Identify temporal patterns in your model’s performance over the previous year. Based on these patterns, create a schedule for sending serving data to the labeling service for the next year.

C.

Compare the cost of the labeling service with the lost revenue due to model performance degradation over the past year. If the lost revenue is greater than the cost of the labeling service, increase the frequency of model retraining; otherwise, decrease the model retraining frequency.

D.

Run training-serving skew detection batch jobs every few days to compare the aggregate statistics of the features in the training dataset with recent serving data. If skew is detected, send the most recent serving data to the labeling service.

Question # 20

You have recently used TensorFlow to train a classification model on tabular data. You have created a Dataflow pipeline that can transform several terabytes of data into training or prediction datasets consisting of TFRecords. You now need to productionize the model, and you want the predictions to be automatically uploaded to a BigQuery table on a weekly schedule. What should you do?

A.

Import the model into Vertex AI and deploy it to a Vertex AI endpoint. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components.

B.

Import the model into Vertex AI and deploy it to a Vertex AI endpoint. Create a Dataflow pipeline that reuses the data processing logic, sends requests to the endpoint, and then uploads predictions to a BigQuery table.

C.

Import the model into Vertex AI. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components.

D.

Import the model into BigQuery. Implement the data processing logic in a SQL query. On Vertex AI Pipelines, create a pipeline that uses the BigqueryQueryJobOp and the BigqueryPredictModelJobOp components.

Question # 21

You are working on a prototype of a text classification model in a managed Vertex AI Workbench notebook. You want to quickly experiment with tokenizing text by using a Natural Language Toolkit (NLTK) library. How should you add the library to your Jupyter kernel?

A.

Install the NLTK library from a terminal by using the pip install nltk command.

B.

Write a custom Dataflow job that uses NLTK to tokenize your text and saves the output to Cloud Storage.

C.

Create a new Vertex AI Workbench notebook with a custom image that includes the NLTK library.

D.

Install the NLTK library from a Jupyter cell by using the !pip install nltk --user command.
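Option D in practice: install into the running kernel's environment from a notebook cell, then check that the tokenizer works. The sample text is arbitrary.

# Run inside a Jupyter cell of the managed notebook.
!pip install nltk --user

import nltk
nltk.download("punkt")                     # tokenizer resources
from nltk.tokenize import word_tokenize

print(word_tokenize("Quick experiment with NLTK tokenization."))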

Question # 22

You created an ML pipeline with multiple input parameters. You want to investigate the tradeoffs between different parameter combinations. The parameter options are

• input dataset

• Max tree depth of the boosted tree regressor

• Optimizer learning rate

You need to compare the pipeline performance of the different parameter combinations measured in F1 score, time to train and model complexity. You want your approach to be reproducible and track all pipeline runs on the same platform. What should you do?

A.

1. Use BigQuery ML to create a boosted tree regressor, and use the hyperparameter tuning capability.

2. Configure the hyperparameter syntax to select different input datasets, max tree depths, and optimizer learning rates. Choose the grid search option.

B.

1. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

2. In the custom training step, use the Bayesian optimization method with F1 score as the target to maximize.

C.

1. Create a Vertex AI Workbench notebook for each of the different input datasets.

2. In each notebook, run different local training jobs with different combinations of the max tree depth and optimizer learning rate parameters.

3. After each notebook finishes, append the results to a BigQuery table.

D.

1. Create an experiment in Vertex AI Experiments.

2. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

3. Submit multiple runs to the same experiment, using different values for the parameters.
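A hedged sketch of option D with the Vertex AI SDK: one experiment, one compiled pipeline, and several runs with different parameter values. The project, bucket, template path, and parameter names are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1",
                experiment="pipeline-param-sweep")        # placeholder names

param_sets = [
    {"dataset_uri": "gs://my-bucket/ds_a.csv", "max_tree_depth": 6,  "learning_rate": 0.10},
    {"dataset_uri": "gs://my-bucket/ds_b.csv", "max_tree_depth": 10, "learning_rate": 0.05},
]

# One pipeline run per parameter combination, all tracked in the same experiment
# so F1 score, training time, and model complexity can be compared in one place.
for i, params in enumerate(param_sets):
    job = aiplatform.PipelineJob(
        display_name=f"sweep-run-{i}",
        template_path="training_pipeline.json",           # compiled pipeline spec
        parameter_values=params,
    )
    job.submit(experiment="pipeline-param-sweep")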

Question # 23

You are developing an ML model intended to classify whether X-ray images indicate bone fracture risk. You have trained a ResNet architecture on Vertex AI using a TPU as an accelerator; however, you are unsatisfied with the training time and memory usage. You want to quickly iterate your training code, but make minimal changes to the code. You also want to minimize impact on the model's accuracy. What should you do?

A.

Configure your model to use bfloat16 instead of float32

B.

Reduce the global batch size from 1024 to 256

C.

Reduce the number of layers in the model architecture

D.

Reduce the dimensions of the images used in the model
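Option A in code form: switching the Keras global policy to bfloat16 is a one-line change that reduces memory use and speeds up TPU training with little accuracy impact. The model and class count below are illustrative.

import tensorflow as tf

# Compute in bfloat16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

model = tf.keras.applications.ResNet50(weights=None, classes=2)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])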

Question # 24

You work for a large hotel chain and have been asked to assist the marketing team in gathering predictions for a targeted marketing strategy. You need to make predictions about user lifetime value (LTV) over the next 30 days so that marketing can be adjusted accordingly. The customer dataset is in BigQuery, and you are preparing the tabular data for training with AutoML Tables. This data has a time signal that is spread across multiple columns. How should you ensure that AutoML fits the best model to your data?

A.

Manually combine all columns that contain a time signal into an array. Allow AutoML to interpret this array appropriately. Choose an automatic data split across the training, validation, and testing sets.

B.

Submit the data for training without performing any manual transformations. Allow AutoML to handle the appropriate transformations. Choose an automatic data split across the training, validation, and testing sets.

C.

Submit the data for training without performing any manual transformations, and indicate an appropriate column as the Time column. Allow AutoML to split your data based on the time signal provided, and reserve the more recent data for the validation and testing sets.

D.

Submit the data for training without performing any manual transformations. Use the columns that have a time signal to manually split your data. Ensure that the data in your validation set is from 30 days after the data in your training set, and that the data in your testing set is from 30 days after your validation set.

Question # 25

You work for a global footwear retailer and need to predict when an item will be out of stock based on historical inventory data. Customer behavior is highly dynamic since footwear demand is influenced by many different factors. You want to serve models that are trained on all available data, but track your performance on specific subsets of data before pushing to production. What is the most streamlined and reliable way to perform this validation?

A.

Use the TFX ModelValidator tools to specify performance metrics for production readiness

B.

Use k-fold cross-validation as a validation strategy to ensure that your model is ready for production.

C.

Use the last relevant week of data as a validation set to ensure that your model is performing accurately on current data

D.

Use the entire dataset and treat the area under the receiver operating characteristics curve (AUC ROC) as the main metric.

Question # 26

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware item (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

A.

Create a text dataset on Vertex AI for entity extraction. Create two entities called "ingredient" and "cookware", and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset.

B.

Create a multi-label text classification dataset on Vertex AI. Create a test dataset, and label each recipe that corresponds to its ingredients and cookware. Train a multi-class classification model. Evaluate the model's performance on a holdout dataset.

C.

Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model's performance on a prelabeled dataset.

D.

Create a text dataset on Vertex AI for entity extraction. Create as many entities as there are different ingredients and cookware. Train an AutoML entity extraction model to extract those entities. Evaluate the model's performance on a holdout dataset.

Question # 27

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

A.

A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and an n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM

B.

A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM

C.

A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM

D.

A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM

Question # 28

You have been tasked with deploying prototype code to production. The feature engineering code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a Vertex Al custom training job. The two steps are not connected, and the model training must currently be run manually after the feature engineering step finishes. You need to create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. What should you do?

A.

Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end.

B.

Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook, and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow. Run the notebook cells sequentially to tie the steps together end-to-end.

C.

Use the Kubeflow Pipelines SDK to write code that specifies two components:

- The first is a Dataproc Serverless component that launches the feature engineering job.

- The second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job.

D.

Use the Kubeflow Pipelines SDK to write code that specifies two components:

- The first component initiates an Apache Spark context that runs the PySpark feature engineering code.

- The second component runs the TensorFlow custom model training code. Create a Vertex AI Pipelines job to link and run both components.
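For option C, a hedged sketch with the Kubeflow Pipelines SDK and google-cloud-pipeline-components. The file paths and IDs are placeholders, and the component parameter names shown here should be checked against the google-cloud-pipeline-components reference for the version you use.

from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp
from google_cloud_pipeline_components.v1.custom_job import (
    create_custom_training_job_from_component,
)

@dsl.component
def train_model(features_path: str):
    # Placeholder for the Vertex AI custom training logic.
    print(f"Training on {features_path}")

# Wrap the lightweight component so it runs as a Vertex AI custom training job.
train_job_op = create_custom_training_job_from_component(
    train_model, machine_type="n1-standard-8")

@dsl.pipeline(name="feature-engineering-and-training")
def pipeline(project: str, location: str):
    features = DataprocPySparkBatchOp(
        project=project,
        location=location,
        batch_id="feature-engineering-batch",                          # placeholder ID
        main_python_file_uri="gs://my-bucket/feature_engineering.py",  # placeholder
    )
    train_job_op(
        project=project,
        location=location,
        features_path="gs://my-bucket/features/",                      # placeholder
    ).after(features)

compiler.Compiler().compile(pipeline, "pipeline.json")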

Question # 29

You have recently trained a scikit-learn model that you plan to deploy on Vertex AI. This model will support both online and batch prediction. You need to preprocess input data for model inference. You want to package the model for deployment while minimizing additional code. What should you do?

A.

1. Upload your model to the Vertex AI Model Registry by using a prebuilt scikit-learn prediction container.

2. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data.

B.

1. Wrap your model in a custom prediction routine (CPR), and build a container image from the CPR local model.

2. Upload your scikit-learn model container to Vertex AI Model Registry.

3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job.

C.

1. Create a custom container for your scikit-learn model.

2. Define a custom serving function for your model.

3. Upload your model and custom container to Vertex AI Model Registry.

4. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job.

D.

1. Create a custom container for your scikit-learn model.

2. Upload your model and custom container to Vertex AI Model Registry.

3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data.

Question # 30

Your company manages an application that aggregates news articles from many different online sources and sends them to users. You need to build a recommendation model that will suggest articles to readers that are similar to the articles they are currently reading. Which approach should you use?

A.

Create a collaborative filtering system that recommends articles to a user based on the user’s past behavior.

B.

Encode all articles into vectors using word2vec, and build a model that returns articles based on vector similarity.

C.

Build a logistic regression model for each user that predicts whether an article should be recommended to a user.

D.

Manually label a few hundred articles, and then train an SVM classifier based on the manually classified articles that categorizes additional articles into their respective categories.

Question # 31

You have been given a dataset with sales predictions based on your company’s marketing activities. The data is structured and stored in BigQuery, and has been carefully managed by a team of data analysts. You need to prepare a report providing insights into the predictive capabilities of the data. You were asked to run several ML models with different levels of sophistication, including simple models and multilayered neural networks. You only have a few hours to gather the results of your experiments. Which Google Cloud tools should you use to complete this task in the most efficient and self-serviced way?

A.

Use BigQuery ML to run several regression models, and analyze their performance.

B.

Read the data from BigQuery using Dataproc, and run several models using SparkML.

C.

Use Vertex AI Workbench user-managed notebooks with scikit-learn code for a variety of ML algorithms and performance metrics.

D.

Train a custom TensorFlow model with Vertex AI, reading the data from BigQuery featuring a variety of ML algorithms.

Question # 32

You are developing a Kubeflow pipeline on Google Kubernetes Engine. The first step in the pipeline is to issue a query against BigQuery. You plan to use the results of that query as the input to the next step in your pipeline. You want to achieve this in the easiest way possible. What should you do?

A.

Use the BigQuery console to execute your query, and then save the query results into a new BigQuery table.

B.

Write a Python script that uses the BigQuery API to execute queries against BigQuery. Execute this script as the first step in your Kubeflow pipeline.

C.

Use the Kubeflow Pipelines domain-specific language to create a custom component that uses the Python BigQuery client library to execute queries

D.

Locate the Kubeflow Pipelines repository on GitHub. Find the BigQuery Query Component, copy that component's URL, and use it to load the component into your pipeline. Use the component to execute queries against BigQuery.
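For option D, a hedged sketch of loading a prebuilt component by URL with the Kubeflow Pipelines SDK. The URL and the input names of the loaded component are illustrative; check them against the component.yaml you actually load.

import kfp
from kfp import components

# Illustrative URL: point it at the BigQuery query component.yaml published in
# the kubeflow/pipelines GitHub repository for the release you target.
bigquery_query_op = components.load_component_from_url(
    "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
    "components/gcp/bigquery/query/component.yaml"
)

@kfp.dsl.pipeline(name="bq-query-first-step")
def pipeline(project_id: str):
    # Input names come from the loaded component definition.
    bigquery_query_op(
        query="SELECT * FROM `my_dataset.my_table`",            # placeholder query
        project_id=project_id,
        output_gcs_path="gs://my-bucket/bq-results/data.csv",   # placeholder path
    )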

Question # 33

You work with a data engineering team that has developed a pipeline to clean your dataset and save it in a Cloud Storage bucket. You have created an ML model and want to use the data to refresh your model as soon as new data is available. As part of your CI/CD workflow, you want to automatically run a Kubeflow Pipelines training job on Google Kubernetes Engine (GKE). How should you architect this workflow?

A.

Configure your pipeline with Dataflow, which saves the files in Cloud Storage. After the file is saved, start the training job on a GKE cluster.

B.

Use App Engine to create a lightweight Python client that continuously polls Cloud Storage for new files. As soon as a file arrives, initiate the training job.

C.

Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster

D.

Use Cloud Scheduler to schedule jobs at a regular interval. For the first step of the job, check the timestamp of objects in your Cloud Storage bucket. If there are no new files since the last run, abort the job.

Question # 34

You are collaborating on a model prototype with your team. You need to create a Vertex AI Workbench environment for the members of your team and also limit access to other employees in your project. What should you do?

A.

1. Create a new service account and grant it the Notebook Viewer role.

2. Grant the Service Account User role to each team member on the service account.

3. Grant the Vertex AI User role to each team member.

4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account.

B.

1. Grant the Vertex AI User role to the default Compute Engine service account.

2. Grant the Service Account User role to each team member on the default Compute Engine service account.

3. Provision a Vertex AI Workbench user-managed notebook instance that uses the default Compute Engine service account.

C.

1. Create a new service account and grant it the Vertex AI User role.

2. Grant the Service Account User role to each team member on the service account.

3. Grant the Notebook Viewer role to each team member.

4. Provision a Vertex AI Workbench user-managed notebook instance that uses the new service account.

D.

1. Grant the Vertex AI User role to the primary team member.

2. Grant the Notebook Viewer role to the other team members.

3. Provision a Vertex AI Workbench user-managed notebook instance that uses the primary user's account.

Question # 35

You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?

A.

Randomly redistribute the data, with 70% for the training set and 30% for the test set

B.

Use sparse representation in the test set

C.

Apply one-hot encoding on the categorical variables in the test data.

D.

Collect more data representing all categories
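Option C in practice: fit the encoder on the training split only and reuse it on the test split, so a category that happens to be absent from one split still maps to the same columns. The toy data below is illustrative.

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

train = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
test = pd.DataFrame({"color": ["red", "blue"]})      # "green" is missing here

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train[["color"]])                        # vocabulary comes from training data

train_encoded = encoder.transform(train[["color"]]).toarray()
test_encoded = encoder.transform(test[["color"]]).toarray()
print(encoder.get_feature_names_out(), test_encoded.shape)   # same width as training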

Question # 36

You are building a linear regression model on BigQuery ML to predict a customer's likelihood of purchasing your company's products. Your model uses a city name variable as a key predictive component. In order to train and serve the model, your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictable variables. What should you do?

A.

Create a new view with BigQuery that does not include a column with city information

B.

Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.

C.

Use Cloud Data Fusion to assign each city to a region labeled as 1, 2, 3, 4, or 5, and then use that number to represent the city in the model.

D.

Use TensorFlow to create a categorical variable with a vocabulary list. Create the vocabulary file, and upload it as part of your model to BigQuery ML.

Question # 37

You work for an advertising company and want to understand the effectiveness of your company's latest advertising campaign. You have streamed 500 MB of campaign data into BigQuery. You want to query the table, and then manipulate the results of that query with a pandas dataframe in an AI Platform notebook. What should you do?

A.

Use AI Platform Notebooks' BigQuery cell magic to query the data, and ingest the results as a pandas dataframe

B.

Export your table as a CSV file from BigQuery to Google Drive, and use the Google Drive API to ingest the file into your notebook instance

C.

Download your table from BigQuery as a local CSV file, and upload it to your AI Platform notebook instance. Use pandas.read_csv to ingest the file as a pandas dataframe.

D.

From a bash cell in your AI Platform notebook, use the bq extract command to export the table as a CSV file to Cloud Storage, and then use gsutil cp to copy the data into the notebook. Use pandas.read_csv to ingest the file as a pandas dataframe.
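Option A refers to the notebook cell magic (a cell that starts with %%bigquery df followed by the SQL). The same result with the BigQuery Python client, which works in any cell; the project and table names are placeholders.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")       # placeholder project ID

sql = """
SELECT campaign_id, SUM(impressions) AS impressions, SUM(clicks) AS clicks
FROM `my_dataset.campaign_events`                     -- placeholder table
GROUP BY campaign_id
"""

# Run the query and pull the result straight into a pandas DataFrame.
df = client.query(sql).to_dataframe()
print(df.head())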

Question # 38

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

A.

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.

Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources

D.

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Question # 39

You are developing an ML model to identify your company's products in images. You have access to over one million images in a Cloud Storage bucket. You plan to experiment with different TensorFlow models by using Vertex AI Training. You need to read images at scale during training while minimizing data I/O bottlenecks. What should you do?

A.

Load the images directly into the Vertex AI compute nodes by using Cloud Storage FUSE. Read the images by using the tf.data.Dataset.from_tensor_slices function.

B.

Create a Vertex AI managed dataset from your image data. Access the AIP_TRAINING_DATA_URI environment variable to read the images by using the tf.data.Dataset.list_files function.

C.

Convert the images to TFRecords and store them in a Cloud Storage bucket. Read the TFRecords by using the tf.data.TFRecordDataset function.

D.

Store the URLs of the images in a CSV file. Read the file by using the tf.data.experimental.CsvDataset function.
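For option C, a minimal tf.data input pipeline that streams TFRecord shards from Cloud Storage; the bucket path, feature keys, and image size are assumptions for illustration.

import tensorflow as tf

files = tf.io.gfile.glob("gs://my-bucket/train-tfrecords/*.tfrecord")   # placeholder path

def parse_example(serialized):
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),   # assumed feature keys
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, features)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    return tf.image.resize(image, [224, 224]) / 255.0, parsed["label"]

dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(128)
    .prefetch(tf.data.AUTOTUNE)
)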

Question # 40

You work for a semiconductor manufacturing company. You need to create a real-time application that automates the quality control process. High-definition images of each semiconductor are taken at the end of the assembly line in real time. The photos are uploaded to a Cloud Storage bucket along with tabular data that includes each semiconductor's batch number, serial number, dimensions, and weight. You need to configure model training and serving while maximizing model accuracy. What should you do?

A.

Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Deploy the model, and configure Pub/Sub to publish a message when an image is categorized into the failing class.

B.

Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Schedule a daily batch prediction job that publishes a Pub/Sub message when the job completes.

C.

Convert the images into an embedding representation. Import this data into BigQuery, and train a BigQuery ML k-means clustering model with two clusters. Deploy the model, and configure Pub/Sub to publish a message when a semiconductor's data is categorized into the failing cluster.

D.

Import the tabular data into BigQuery, use Vertex AI Data Labeling Service to label the data, and train an AutoML tabular classification model. Deploy the model, and configure Pub/Sub to publish a message when a semiconductor's data is categorized into the failing class.

Question # 41

You have deployed multiple versions of an image classification model on AI Platform. You want to monitor the performance of the model versions over time. How should you perform this comparison?

A.

Compare the loss performance for each model on a held-out dataset.

B.

Compare the loss performance for each model on the validation data

C.

Compare the receiver operating characteristic (ROC) curve for each model using the What-If Tool.

D.

Compare the mean average precision across the models using the Continuous Evaluation feature

Question # 42

You have developed a BigQuery ML model that predicts customer churn and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?

A.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor prediction drift.

3. Execute model retraining if there is significant distance between the distributions.

B.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor training/serving skew.

3. Execute model retraining if there is significant distance between the distributions.

C.

1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.

D.

1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.

Question # 43

You are building a TensorFlow text-to-image generative model by using a dataset that contains billions of images with their respective captions. You want to create a low-maintenance, automated workflow that reads the data from a Cloud Storage bucket, collects statistics, splits the dataset into training/validation/test datasets, performs data transformations, trains the model using the training/validation datasets, and validates the model by using the test dataset. What should you do?

A.

Use the Apache Airflow SDK to create multiple operators that use Dataflow and Vertex AI services. Deploy the workflow on Cloud Composer.

B.

Use the MLflow SDK and deploy it on a Google Kubernetes Engine cluster. Create multiple components that use Dataflow and Vertex AI services.

C.

Use the Kubeflow Pipelines (KFP) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

D.

Use the TensorFlow Extended (TFX) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

Question # 44

You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?

A.

1 = Dataflow, 2 = AI Platform, 3 = BigQuery

B.

1 = Dataproc, 2 = AutoML, 3 = Cloud Bigtable

C.

1 = BigQuery, 2 = AutoML, 3 = Cloud Functions

D.

1 = BigQuery, 2 = AI Platform, 3 = Cloud Storage

Question # 45

You work for a retail company. You have been asked to develop a model to predict whether a customer will purchase a product on a given day. Your team has processed the company's sales data, and created a table with the following columns:

• Customer_id

• Product_id

• Date

• Days_since_last_purchase (measured in days)

• Average_purchase_frequency (measured in 1/days)

• Purchase (binary class, if customer purchased product on the Date)

You need to interpret your model's results for each individual prediction. What should you do?

A.

Create a BigQuery table. Use BigQuery ML to build a boosted tree classifier. Inspect the partition rules of the trees to understand how each prediction flows through the trees.

B.

Create a Vertex AI tabular dataset. Train an AutoML model to predict customer purchases. Deploy the model to a Vertex AI endpoint and enable feature attributions. Use the "explain" method to get feature attribution values for each individual prediction.

C.

Create a BigQuery table. Use BigQuery ML to build a logistic regression classification model. Use the values of the coefficients of the model to interpret the feature importance, with higher values corresponding to more importance.

D.

Create a Vertex AI tabular dataset. Train an AutoML model to predict customer purchases. Deploy the model to a Vertex AI endpoint. At each prediction, enable L1 regularization to detect non-informative features.

Question # 46

You have trained a deep neural network model on Google Cloud. The model has low loss on the training data, but is performing worse on the validation data. You want the model to be resilient to overfitting. Which strategy should you use when retraining the model?

A.

Apply a dropout parameter of 0.2, and decrease the learning rate by a factor of 10.

B.

Apply an L2 regularization parameter of 0.4, and decrease the learning rate by a factor of 10.

C.

Run a hyperparameter tuning job on AI Platform to optimize for the L2 regularization and dropout parameters.

D.

Run a hyperparameter tuning job on AI Platform to optimize for the learning rate, and increase the number of neurons by a factor of 2.
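Options A through C all touch on regularization; a minimal Keras example of dropout plus L2 weight penalties (the 0.2 and 0.01 values and the input width are illustrative and would normally come out of hyperparameter tuning):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),                       # placeholder feature width
    tf.keras.layers.Dense(256, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dropout(0.2),                      # randomly drops 20% of activations
    tf.keras.layers.Dense(128, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])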

Question # 47

Your task is to classify whether a company logo is present in an image. You found out that 96% of the data does not include a logo. You are dealing with a data imbalance problem. Which metric should you use to evaluate the model?

A.

F1 Score

B.

RMSE

C.

F Score with higher precision weighting than recall

D.

F Score with higher recall weighting than precision

Question # 48

You recently developed a deep learning model using Keras, and now you are experimenting with different training strategies. First, you trained the model using a single GPU, but the training process was too slow. Next, you distributed the training across 4 GPUs using tf.distribute.MirroredStrategy (with no other changes), but you did not observe a decrease in training time. What should you do?

A.

Distribute the dataset with tf.distribute.Strategy.experimental_distribute_dataset

B.

Create a custom training loop.

C.

Use a TPU with tf.distribute.TPUStrategy.

D.

Increase the batch size.
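Option D in context: with tf.distribute.MirroredStrategy the global batch is split across replicas, so scaling the batch size with the replica count keeps each GPU busy. A minimal sketch with synthetic data:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
per_replica_batch = 64
global_batch = per_replica_batch * strategy.num_replicas_in_sync   # e.g. 256 on 4 GPUs

features = tf.random.uniform([1024, 32])
labels = tf.random.uniform([1024], maxval=2, dtype=tf.int32)
dataset = (tf.data.Dataset.from_tensor_slices((features, labels))
           .shuffle(1024).batch(global_batch).prefetch(tf.data.AUTOTUNE))

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

model.fit(dataset, epochs=2)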

Question # 49

You work for a toy manufacturer that has been experiencing a large increase in demand. You need to build an ML model to reduce the amount of time spent by quality control inspectors checking for product defects. Faster defect detection is a priority. The factory does not have reliable Wi-Fi. Your company wants to implement the new ML model as soon as possible. Which model should you use?

A.

AutoML Vision model

B.

AutoML Vision Edge mobile-versatile-1 model

C.

AutoML Vision Edge mobile-low-latency-1 model

D.

AutoML Vision Edge mobile-high-accuracy-1 model

Question # 50

You work for a magazine distributor and need to build a model that predicts which customers will renew their subscriptions for the upcoming year. Using your company’s historical data as your training set, you created a TensorFlow model and deployed it to AI Platform. You need to determine which customer attribute has the most predictive power for each prediction served by the model. What should you do?

A.

Use AI Platform notebooks to perform a Lasso regression analysis on your model, which will eliminate features that do not provide a strong signal.

B.

Stream prediction results to BigQuery. Use BigQuery’s CORR(X1, X2) function to calculate the Pearson correlation coefficient between each feature and the target variable.

C.

Use the AI Explanations feature on AI Platform. Submit each prediction request with the ‘explain’ keyword to retrieve feature attributions using the sampled Shapley method.

D.

Use the What-If tool in Google Cloud to determine how your model will perform when individual features are excluded. Rank the feature importance in order of those that caused the most significant performance drop when removed from the model.
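For reference, the current Vertex AI equivalent of requesting feature attributions at prediction time (the legacy AI Platform flow in option C relies on the same sampled Shapley attributions). The endpoint resource name and feature values are placeholders.

from google.cloud import aiplatform

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890")  # placeholder

response = endpoint.explain(instances=[{"tenure_months": 18, "num_issues": 2}])

for explanation in response.explanations:
    for attribution in explanation.attributions:
        # Per-feature contribution to this individual prediction.
        print(attribution.feature_attributions)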

Question # 51

You are training models in Vertex AI by using data that spans across multiple Google Cloud projects. You need to find, track, and compare the performance of the different versions of your models. Which Google Cloud services should you include in your ML workflow?

A.

Dataplex, Vertex AI Feature Store, and Vertex AI TensorBoard

B.

Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI Experiments

C.

Dataplex, Vertex AI Experiments, and Vertex AI ML Metadata

D.

Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Metadata

Question # 52

You work for an online grocery store. You recently developed a custom ML model that recommends a recipe when a user arrives at the website. You chose the machine type on the Vertex AI endpoint to optimize costs by using the queries per second (QPS) that the model can serve, and you deployed it on a single machine with 8 vCPUs and no accelerators.

A holiday season is approaching, and you anticipate four times more traffic during this time than the typical daily traffic. You need to ensure that the model can scale efficiently to the increased demand. What should you do?

A.

1. Maintain the same machine type on the endpoint.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, add a compute node to the endpoint.

B.

1. Change the machine type on the endpoint to have 32 vCPUs.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, scale the vCPUs further as needed.

C.

1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, investigate the cause.

D.

1. Change the machine type on the endpoint to have a GPU. Configure the endpoint to enable autoscaling based on GPU usage.

2. Set up a monitoring job and an alert for GPU usage.

3. If you receive an alert, investigate the cause.

Question # 53

You have trained a model by using data that was preprocessed in a batch Dataflow pipeline. Your use case requires real-time inference. You want to ensure that the data preprocessing logic is applied consistently between training and serving. What should you do?

A.

Perform data validation to ensure that the input data to the pipeline is the same format as the input data to the endpoint.

B.

Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Use the same code in the endpoint.

C.

Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline. Share this code with the end users of the endpoint.

D.

Batch the real-time requests by using a time window and then use the Dataflow pipeline to preprocess the batched requests. Send the preprocessed requests to the endpoint.

Question # 54

You received a training-serving skew alert from a Vertex AI Model Monitoring job running in production. You retrained the model with more recent training data, and deployed it back to the Vertex AI endpoint, but you are still receiving the same alert. What should you do?

A.

Update the model monitoring job to use a lower sampling rate.

B.

Update the model monitoring job to use the more recent training data that was used to retrain the model.

C.

Temporarily disable the alert. Enable the alert again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint.

D.

Temporarily disable the alert until the model can be retrained again on newer training data. Retrain the model again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint.

Question # 55

You are using Keras and TensorFlow to develop a fraud detection model. Records of customer transactions are stored in a large table in BigQuery. You need to preprocess these records in a cost-effective and efficient way before you use them to train the model. The trained model will be used to perform batch inference in BigQuery. How should you implement the preprocessing workflow?

A.

Implement a preprocessing pipeline by using Apache Spark, and run the pipeline on Dataproc. Save the preprocessed data as CSV files in a Cloud Storage bucket.

B.

Load the data into a pandas DataFrame. Implement the preprocessing steps using pandas transformations, and train the model directly on the DataFrame.

C.

Perform preprocessing in BigQuery by using SQL. Use the BigQueryClient in TensorFlow to read the data directly from BigQuery.

D.

Implement a preprocessing pipeline by using Apache Beam, and run the pipeline on Dataflow. Save the preprocessed data as CSV files in a Cloud Storage bucket.

Question # 56

You need to analyze user activity data from your company’s mobile applications. Your team will use BigQuery for data analysis, transformation, and experimentation with ML algorithms. You need to ensure real-time ingestion of the user activity data into BigQuery. What should you do?

A.

Configure Pub/Sub to stream the data into BigQuery.

B.

Run an Apache Spark streaming job on Dataproc to ingest the data into BigQuery.

C.

Run a Dataflow streaming job to ingest the data into BigQuery.

D.

Configure Pub/Sub and a Dataflow streaming job to ingest the data into BigQuery.

Question # 57

You are building an ML model to predict trends in the stock market based on a wide range of factors. While exploring the data, you notice that some features have a large range. You want to ensure that the features with the largest magnitude don’t overfit the model. What should you do?

A.

Standardize the data by transforming it with a logarithmic function.

B.

Apply a principal component analysis (PCA) to minimize the effect of any particular feature.

C.

Use a binning strategy to replace the magnitude of each feature with the appropriate bin number.

D.

Normalize the data by scaling it to have values between 0 and 1.
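Option D, written out: min-max scaling maps every feature to the [0, 1] range so large-magnitude features cannot dominate the gradient updates. A minimal NumPy sketch with made-up values:

import numpy as np

def min_max_scale(values: np.ndarray) -> np.ndarray:
    """Scale a single feature to the [0, 1] range."""
    v_min, v_max = values.min(), values.max()
    return (values - v_min) / (v_max - v_min)

prices = np.array([12.0, 850.0, 97.5, 3400.0])   # illustrative raw feature
print(min_max_scale(prices))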

Question # 58

You work for a company that manages a ticketing platform for a large chain of cinemas. Customers use a mobile app to search for movies they’re interested in and purchase tickets in the app. Ticket purchase requests are sent to Pub/Sub and are processed with a Dataflow streaming pipeline configured to conduct the following steps:

1. Check for availability of the movie tickets at the selected cinema.

2. Assign the ticket price and accept payment.

3. Reserve the tickets at the selected cinema.

4. Send successful purchases to your database.

Each step in this process has low latency requirements (less than 50 milliseconds). You have developed a logistic regression model with BigQuery ML that predicts whether offering a promo code for free popcorn increases the chance of a ticket purchase, and this prediction should be added to the ticket purchase process. You want to identify the simplest way to deploy this model to production while adding minimal latency. What should you do?

A.

Run batch inference with BigQuery ML every five minutes on each new set of tickets issued.

B.

Export your model in TensorFlow format, and add a tfx_bsl.public.beam.RunInference step to the Dataflow pipeline.

C.

Export your model in TensorFlow format, deploy it on Vertex AI, and query the prediction endpoint from your streaming pipeline.

D.

Convert your model with TensorFlow Lite (TFLite), and add it to the mobile app so that the promo code and the incoming request arrive together in Pub/Sub.

Question # 59

You work for an international manufacturing organization that ships scientific products all over the world. Instruction manuals for these products need to be translated to 15 different languages. Your organization's leadership team wants to start using machine learning to reduce the cost of manual human translations and increase translation speed. You need to implement a scalable solution that maximizes accuracy and minimizes operational overhead. You also want to include a process to evaluate and fix incorrect translations. What should you do?

A.

Create a workflow using Cloud Functions triggers. Configure a Cloud Function that is triggered when documents are uploaded to an input Cloud Storage bucket. Configure another Cloud Function that translates the documents using the Cloud Translation API and saves the translations to an output Cloud Storage bucket. Use human reviewers to evaluate the incorrect translations.

B.

Create a Vertex AI pipeline that processes the documents, launches an AutoML Translation training job, evaluates the translations, and deploys the model to a Vertex AI endpoint with autoscaling and model monitoring. When there is a predetermined skew between training and live data, re-trigger the pipeline with the latest data.

C.

Use AutoML Translation to train a model. Configure a Translation Hub project, and use the trained model to translate the documents. Use human reviewers to evaluate the incorrect translations.

D.

Use Vertex AI custom training jobs to fine-tune a state-of-the-art open source pretrained model with your data. Deploy the model to a Vertex AI endpoint with autoscaling and model monitoring. When there is a predetermined skew between the training and live data, configure a trigger to run another training job with the latest data.

Question # 60

One of your models is trained using data provided by a third-party data broker. The data broker does not reliably notify you of formatting changes in the data. You want to make your model training pipeline more robust to issues like this. What should you do?

A.

Use TensorFlow Data Validation to detect and flag schema anomalies.

B.

Use TensorFlow Transform to create a preprocessing component that will normalize data to the expected distribution, and replace values that don’t match the schema with 0.

C.

Use tf.math to analyze the data, compute summary statistics, and flag statistical anomalies.

D.

Use custom TensorFlow functions at the start of your model training to detect and flag known formatting errors.
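Option A sketched with TensorFlow Data Validation: infer a schema from a trusted training snapshot, then validate each new delivery from the broker against it. The file paths are placeholders.

import tensorflow_data_validation as tfdv

train_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/train/data.csv")
schema = tfdv.infer_schema(train_stats)

new_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/deliveries/latest.csv")
anomalies = tfdv.validate_statistics(statistics=new_stats, schema=schema)

# Flags missing columns, type changes, and unexpected values before training runs.
tfdv.display_anomalies(anomalies)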

Question # 61

You are building a model to predict daily temperatures. You split the data randomly and then transformed the training and test datasets. Temperature data for model training is uploaded hourly. During testing, your model performed with 97% accuracy; however, after deploying to production, the model's accuracy dropped to 66%. How can you make your production model more accurate?

A.

Normalize the data for the training and test datasets as two separate steps.

B.

Split the training and test data based on time rather than with a random split to avoid leakage.

C.

Add more data to your test set to ensure that you have a fair distribution and sample for testing.

D.

Apply data transformations before splitting, and cross-validate to make sure that the transformations are applied to both the training and test sets.

Full Access
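
Option B of the question above (a time-based split) is the usual fix for leakage in time-series data. A minimal pandas sketch, assuming a DataFrame with a timestamp column, is shown below; the file path, column names, and cutoff date are assumptions.

```python
# Sketch: split temperature data by time instead of randomly, so the test set
# only contains observations that occur after everything in the training set.
# The file path, column names, and cutoff date are assumptions.
import pandas as pd

df = pd.read_csv("temperatures.csv", parse_dates=["timestamp"])  # placeholder path
df = df.sort_values("timestamp")

cutoff = pd.Timestamp("2023-01-01")            # assumption: holds out the most recent period
train_df = df[df["timestamp"] < cutoff]
test_df = df[df["timestamp"] >= cutoff]

# Fit any scalers or other transformations on train_df only, then apply them to
# test_df, so no statistics from the future leak into training.
```
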
Question # 62

You recently deployed a pipeline in Vertex AI Pipelines that trains and pushes a model to a Vertex AI endpoint to serve real-time traffic. You need to continue experimenting and iterating on your pipeline to improve model performance. You plan to use Cloud Build for CI/CD. You want to quickly and easily deploy new pipelines into production, and you want to minimize the chance that the new pipeline implementations will break in production. What should you do?

A.

Set up a CI/CD pipeline that builds and tests your source code. If the tests are successful, use the Google Cloud console to upload the built container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines.

B.

Set up a CI/CD pipeline that builds your source code and then deploys the built artifacts into a pre-production environment. Run unit tests in the pre-production environment. If the tests are successful, deploy the pipeline to production.

C.

Set up a CI/CD pipeline that builds and tests your source code and then deploys the built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, deploy the pipeline to production.

D.

Set up a CI/CD pipeline that builds and tests your source code and then deploys the built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, rebuild the source code and deploy the artifacts to production.

Full Access
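
The options above all end with deploying a compiled pipeline. As a point of reference, the sketch below shows how a CI job could compile a KFP pipeline and submit it to Vertex AI Pipelines programmatically (rather than through the console). The pipeline here is a trivial stand-in for the real training pipeline, and the project, region, and bucket values are placeholders.

```python
# Sketch: compile a (trivial, stand-in) KFP v2 pipeline and run it on Vertex AI
# Pipelines. A Cloud Build step could run this against a pre-production project
# before promoting the same compiled artifact to production. Names are placeholders.
from kfp import dsl, compiler
from google.cloud import aiplatform


@dsl.component
def say_hello() -> str:
    return "hello"          # stand-in for the real training/evaluation components


@dsl.pipeline(name="training-pipeline")
def training_pipeline():
    say_hello()


compiler.Compiler().compile(
    pipeline_func=training_pipeline,
    package_path="training_pipeline.json",
)

aiplatform.init(project="my-project", location="us-central1")   # placeholders
job = aiplatform.PipelineJob(
    display_name="training-pipeline-preprod",
    template_path="training_pipeline.json",
    pipeline_root="gs://my-bucket/pipeline-root",                # placeholder
)
job.submit()   # or job.run(sync=True) to block until the run finishes
```
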
Question # 63

You are an ML engineer on an agricultural research team working on a crop disease detection tool that detects leaf rust spots in images of crops to determine the presence of a disease. These spots, which can vary in shape and size, are correlated with the severity of the disease. You want to develop a solution that predicts the presence and severity of the disease with high accuracy. What should you do?

A.

Create an object detection model that can localize the rust spots.

B.

Develop an image segmentation ML model to locate the boundaries of the rust spots.

C.

Develop a template matching algorithm using traditional computer vision libraries.

D.

Develop an image classification ML model to predict the presence of the disease.

Full Access
Question # 64

You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members of your team prefer a codeless interface for building Extract, Transform, Load (ETL) processes. Which service should you use?

A.

Dataflow

B.

Dataprep

C.

Apache Flink

D.

Cloud Data Fusion

Full Access
Question # 65

You are developing a custom TensorFlow classification model based on tabular data. Your raw data, which is stored in BigQuery, contains hundreds of millions of rows and includes both categorical and numerical features. You need to use a MinMax scaler on some numerical features and apply a one-hot encoding to some categorical features, such as SKU names. Your model will be trained over multiple epochs. You want to minimize the effort and cost of your solution. What should you do?

A.

1. Write a SQL query to create a separate lookup table to scale the numerical features.

2. Deploy a TensorFlow-based model from Hugging Face to BigQuery to encode the text features.

3. Feed the resulting BigQuery view into Vertex AI Training.

B.

1. Use BigQuery to scale the numerical features.

2. Feed the features into Vertex AI Training.

3. Allow TensorFlow to perform the one-hot text encoding.

C.

1. Use TFX components with Dataflow to encode the text features and scale the numerical features.

2. Export the results to Cloud Storage as TFRecords.

3. Feed the data into Vertex AI Training.

D.

1. Write a SQL query to create a separate lookup table to scale the numerical features.

2. Perform the one-hot text encoding in BigQuery.

3. Feed the resulting BigQuery view into Vertex AI Training.

Full Access
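
Option B of the question above splits the work: scale the numerical features in BigQuery and let TensorFlow handle the one-hot encoding. A hedged sketch of the TensorFlow side, using a Keras StringLookup layer for SKU names, is below; the vocabulary, feature names, and model head are assumptions.

```python
# Sketch: one-hot encode a categorical feature (e.g. SKU name) inside the model
# with Keras preprocessing layers, so no separate encoding job is needed.
# The vocabulary, feature names, and model head are hypothetical.
import tensorflow as tf

sku_vocabulary = ["SKU-001", "SKU-002", "SKU-003"]   # in practice, queried from BigQuery

sku_input = tf.keras.Input(shape=(1,), dtype=tf.string, name="sku")
numeric_input = tf.keras.Input(shape=(1,), dtype=tf.float32, name="price_scaled")

# StringLookup maps each SKU to a one-hot vector (with an out-of-vocabulary slot).
sku_onehot = tf.keras.layers.StringLookup(
    vocabulary=sku_vocabulary, output_mode="one_hot"
)(sku_input)

features = tf.keras.layers.Concatenate()([sku_onehot, numeric_input])
output = tf.keras.layers.Dense(1, activation="sigmoid")(features)

model = tf.keras.Model(inputs=[sku_input, numeric_input], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")
```
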
Question # 66

You developed a BigQuery ML linear regressor model by using a training dataset stored in a BigQuery table. New data is added to the table every minute. You are using Cloud Scheduler and Vertex AI Pipelines to automate hourly model training, and you use the model for direct inference. The feature preprocessing logic includes quantile bucketization and MinMax scaling on data received in the last hour. You want to minimize storage and computational overhead. What should you do?

A.

Create a component in the Vertex AI Pipelines directed acyclic graph (DAG) to calculate the required statistics, and pass the statistics on to subsequent components.

B.

Preprocess and stage the data in BigQuery prior to feeding it to the model during training and inference.

C.

Create SQL queries to calculate and store the required statistics in separate BigQuery tables that are referenced in the CREATE MODEL statement.

D.

Use the TRANSFORM clause in the CREATE MODEL statement in the SQL query to calculate the required statistics.

Full Access
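
Option D of the question above keeps the preprocessing inside the model by using the TRANSFORM clause, so the same statistics are applied automatically at prediction time. A sketch of what that CREATE MODEL statement could look like, submitted here through the BigQuery Python client, follows; the dataset, table, and column names are placeholders.

```python
# Sketch: embed quantile bucketization and MinMax scaling in the model itself
# with BigQuery ML's TRANSFORM clause. Dataset, table, and column names are
# placeholders.
from google.cloud import bigquery

client = bigquery.Client()

create_model_sql = """
CREATE OR REPLACE MODEL `my_dataset.demand_regressor`
TRANSFORM(
  ML.QUANTILE_BUCKETIZE(feature_a, 10) OVER() AS feature_a_bucketized,
  ML.MIN_MAX_SCALER(feature_b) OVER() AS feature_b_scaled,
  label
)
OPTIONS(
  model_type = 'LINEAR_REG',
  input_label_cols = ['label']
) AS
SELECT feature_a, feature_b, label
FROM `my_dataset.training_data`
"""

client.query(create_model_sql).result()  # waits for the training job to finish
```
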
Question # 67

You work for an online retailer. Your company has a few thousand short-lifecycle products. Your company has five years of sales data stored in BigQuery. You have been asked to build a model that will make monthly sales predictions for each product. You want to use a solution that can be implemented quickly with minimal effort. What should you do?

A.

Use Prophet on Vertex AI Training to build a custom model.

B.

Use Vertex AI Forecast to build an NN-based model.

C.

Use BigQuery ML to build a statistical ARIMA_PLUS model.

D.

Use TensorFlow on Vertex AI Training to build a custom model.

Full Access
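
Option C of the question above is BigQuery ML's built-in ARIMA_PLUS time-series model, which can fit one series per product from a single statement. A sketch (with placeholder dataset, table, and column names) follows.

```python
# Sketch: train one ARIMA_PLUS time series per product directly in BigQuery ML,
# then forecast the next 12 months. Dataset, table, and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

create_model_sql = """
CREATE OR REPLACE MODEL `my_dataset.monthly_sales_forecast`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'month',
  time_series_data_col = 'units_sold',
  time_series_id_col = 'product_id'     -- one series (and model) per product
) AS
SELECT month, units_sold, product_id
FROM `my_dataset.sales_history`
"""
client.query(create_model_sql).result()

forecast_sql = """
SELECT *
FROM ML.FORECAST(MODEL `my_dataset.monthly_sales_forecast`,
                 STRUCT(12 AS horizon, 0.9 AS confidence_level))
"""
for row in client.query(forecast_sql).result():
    print(row["product_id"], row["forecast_timestamp"], row["forecast_value"])
```
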
Question # 68

You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform and then using the best-tuned parameters for training. Hyperparameter tuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness. Which actions should you take?

Choose 2 answers

A.

Decrease the number of parallel trials.

B.

Decrease the range of floating-point values.

C.

Set the early stopping parameter to TRUE.

D.

Change the search algorithm from Bayesian search to random search.

E.

Decrease the maximum number of trials during subsequent training phases.

Full Access
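
Two of the levers above (early stopping and the search-space range) live in AI Platform's hyperparameter spec. The sketch below shows a Python dict mirroring the trainingInput.hyperparameters structure of a training job request; the tunable parameter names and values are hypothetical, and the job-submission call itself is omitted.

```python
# Sketch: a trainingInput.hyperparameters block (expressed as a Python dict)
# for an AI Platform training job, with trial early stopping enabled and a
# bounded search space. The tunable parameter names and ranges are hypothetical.
hyperparameter_spec = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "val_accuracy",
    "maxTrials": 20,
    "maxParallelTrials": 5,
    "enableTrialEarlyStopping": True,   # stop unpromising trials early
    "params": [
        {
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 1e-4,           # a narrower range shrinks the search space
            "maxValue": 1e-2,
            "scaleType": "UNIT_LOG_SCALE",
        },
        {
            "parameterName": "hidden_units",
            "type": "INTEGER",
            "minValue": 32,
            "maxValue": 256,
            "scaleType": "UNIT_LINEAR_SCALE",
        },
    ],
}
```
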
Question # 69

You are training an object detection model using a Cloud TPU v2. Training time is taking longer than expected. Based on this simplified trace obtained with a Cloud TPU profile, what action should you take to decrease training time in a cost-efficient way?

A.

Move from Cloud TPU v2 to Cloud TPU v3 and increase batch size.

B.

Move from Cloud TPU v2 to 8 NVIDIA V100 GPUs and increase batch size.

C.

Rewrite your input function to resize and reshape the input images.

D.

Rewrite your input function using parallel reads, parallel processing, and prefetch.

Full Access
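
Option D of the question above targets the input pipeline. A minimal tf.data sketch with parallel reads, parallel parsing, and prefetching is shown below; the file pattern, feature spec, and image size are assumptions.

```python
# Sketch: a tf.data input pipeline that overlaps I/O, parsing, and TPU compute
# by reading and mapping in parallel and prefetching batches. The file pattern,
# feature spec, and image size are assumptions.
import tensorflow as tf

FILE_PATTERN = "gs://my-bucket/train/*.tfrecord"   # placeholder
BATCH_SIZE = 1024

feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example["label"]

files = tf.data.Dataset.list_files(FILE_PATTERN, shuffle=True)
dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.AUTOTUNE)
)
```
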
Question # 70

You work for a credit card company and have been asked to create a custom fraud detection model based on historical data using AutoML Tables. You need to prioritize detection of fraudulent transactions while minimizing false positives. Which optimization objective should you use when training the model?

A.

An optimization objective that minimizes Log loss

B.

An optimization objective that maximizes the Precision at a Recall value of 0.50

C.

An optimization objective that maximizes the area under the precision-recall curve (AUC PR) value

D.

An optimization objective that maximizes the area under the receiver operating characteristic curve (AUC ROC) value

Full Access
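
For reference, the area under the precision-recall curve named in option C above can be estimated offline with scikit-learn, as in the short sketch below; the labels and scores are synthetic stand-ins for a validation set.

```python
# Sketch: estimate AUC-PR (average precision) for a fraud model's scores.
# Labels and scores here are synthetic stand-ins for a validation set.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])        # 1 = fraudulent
y_score = np.array([0.1, 0.3, 0.8, 0.2, 0.65, 0.05, 0.4, 0.9, 0.15, 0.2])

auc_pr = average_precision_score(y_true, y_score)
print(f"AUC-PR (average precision): {auc_pr:.3f}")

# The full curve shows the precision/recall trade-off across score thresholds.
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
```
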
Question # 71

You work for a company that is developing a new video streaming platform. You have been asked to create a recommendation system that will suggest the next video for a user to watch. After a review by an AI Ethics team, you are approved to start development. Each video asset in your company’s catalog has useful metadata (e.g., content type, release date, country), but you do not have any historical user event data. How should you build the recommendation system for the first version of the product?

A.

Launch the product without machine learning. Present videos to users alphabetically, and start collecting user event data so you can develop a recommender model in the future.

B.

Launch the product without machine learning. Use simple heuristics based on content metadata to recommend similar videos to users, and start collecting user event data so you can develop a recommender model in the future.

C.

Launch the product with machine learning. Use a publicly available dataset such as MovieLens to train a model using Recommendations AI, and then apply this trained model to your data.

D.

Launch the product with machine learning. Generate embeddings for each video by training an autoencoder on the content metadata using TensorFlow. Cluster content based on the similarity of these embeddings, and then recommend videos from the same cluster.

Full Access
Question # 72

You work for a large social network service provider whose users post articles and discuss news. Millions of comments are posted online each day, and more than 200 human moderators constantly review comments and flag those that are inappropriate. Your team is building an ML model to help human moderators check content on the platform. The model scores each comment and flags suspicious comments to be reviewed by a human. Which metric(s) should you use to monitor the model’s performance?

A.

Number of messages flagged by the model per minute.

B.

Number of messages flagged by the model per minute that are confirmed as inappropriate by humans.

C.

Precision and recall estimates based on a random sample of 0.1% of raw messages each minute, sent to a human for review.

D.

Precision and recall estimates based on a sample of messages flagged by the model as potentially inappropriate each minute.

Full Access
Question # 73

Your data science team is training a PyTorch model for image classification based on a pre-trained ResNet model. You need to perform hyperparameter tuning to optimize for several parameters. What should you do?

A.

Convert the model to a Keras model, and run a Keras Tuner job.

B.

Run a hyperparameter tuning job on AI Platform using custom containers.

C.

Create a Kubeflow Pipelines instance, and run a hyperparameter tuning job on Katib.

D.

Convert the model to a TensorFlow model, and run a hyperparameter tuning job on AI Platform.

Full Access
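
Option B of the question above keeps the PyTorch code as-is and runs it in a custom container. Inside that container, the training script typically reads the tuned parameters from command-line flags and reports the metric back with the cloudml-hypertune helper, roughly as sketched below; the flag names, metric tag, and training-loop placeholder are assumptions.

```python
# Sketch: the inside of a PyTorch training script run as an AI Platform
# hyperparameter tuning trial in a custom container. Flag names, the metric
# tag, and the training-loop placeholder are assumptions.
import argparse
import hypertune   # from the cloudml-hypertune package


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--momentum", type=float, default=0.9)
    parser.add_argument("--epochs", type=int, default=5)
    args = parser.parse_args()

    hpt = hypertune.HyperTune()
    for epoch in range(args.epochs):
        # ... train the ResNet-based PyTorch model for one epoch using
        #     args.learning_rate and args.momentum, then evaluate ...
        val_accuracy = 0.0   # placeholder: replace with the real validation accuracy

        # Report the metric so the tuning service can compare trials.
        hpt.report_hyperparameter_tuning_metric(
            hyperparameter_metric_tag="val_accuracy",
            metric_value=val_accuracy,
            global_step=epoch,
        )


if __name__ == "__main__":
    main()
```
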
Question # 74

You work on a team that builds state-of-the-art deep learning models by using the TensorFlow framework. Your team runs multiple ML experiments each week, which makes it difficult to track the experiment runs. You want a simple approach to effectively track, visualize, and debug ML experiment runs on Google Cloud while minimizing any overhead code. How should you proceed?

A.

Set up Vertex AI Experiments to track metrics and parameters. Configure Vertex AI TensorBoard for visualization.

B.

Set up a Cloud Function to write and save metrics files to a Cloud Storage bucket. Configure a Google Cloud VM to host TensorBoard locally for visualization.

C.

Set up a Vertex AI Workbench notebook instance. Use the instance to save metrics data in a Cloud Storage bucket and to host TensorBoard locally for visualization.

D.

Set up a Cloud Function to write and save metrics files to a BigQuery table. Configure a Google Cloud VM to host TensorBoard locally for visualization.

Full Access
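
Option A of the question above is the managed route. The sketch below shows the few lines of Vertex AI SDK code typically needed to log parameters and metrics to Vertex AI Experiments; the project, region, experiment name, and logged values are placeholders.

```python
# Sketch: track an experiment run with Vertex AI Experiments. Project, region,
# experiment name, run name, and the logged values are placeholders.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",            # placeholder
    location="us-central1",          # placeholder
    experiment="image-classifier",   # placeholder experiment name
)

aiplatform.start_run("run-2024-01-15")      # placeholder run name
aiplatform.log_params({"learning_rate": 0.001, "batch_size": 128})

# ... train the TensorFlow model ...

aiplatform.log_metrics({"val_accuracy": 0.92, "val_loss": 0.31})
aiplatform.end_run()
```
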
Question # 75

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

A.

Normalize the data using Google Kubernetes Engine

B.

Translate the normalization algorithm into SQL for use with BigQuery

C.

Use the normalizer_fn argument in TensorFlow's Feature Column API

D.

Normalize the data with Apache Spark using the Dataproc connector for BigQuery

Full Access
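
Option B of the question above pushes the Z-score computation into BigQuery itself. A sketch of the SQL, run here through the BigQuery Python client, follows; the table and column names are placeholders.

```python
# Sketch: compute Z-score normalization directly in BigQuery instead of in a
# Dataflow job. Table and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

normalize_sql = """
CREATE OR REPLACE TABLE `my_dataset.demand_normalized` AS
SELECT
  *,
  SAFE_DIVIDE(units_sold - AVG(units_sold) OVER (),
              STDDEV(units_sold) OVER ()) AS units_sold_z
FROM `my_dataset.demand_raw`
"""

client.query(normalize_sql).result()   # BigQuery does the computation in place
```
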
Question # 76

You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore. During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore's online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?

A.

Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs.

B.

Enable autoscaling of the online serving nodes in your featurestore.

C.

Enable autoscaling for the prediction nodes of your DeployedModel in the Vertex AI endpoint.

D.

Increase the worker count in the importFeatureValues request of your batch ingestion job.

Full Access
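
Option D of the question above adjusts the batch ingestion itself. If you ingest with the Vertex AI SDK, the worker count is exposed on the ingestion call, roughly as sketched below; the featurestore, entity type, feature IDs, and the exact keyword argument are assumptions that should be checked against the current SDK documentation.

```python
# Sketch (assumption: the Vertex AI SDK's ingest_from_bq exposes worker_count):
# raise the number of ingestion workers so the daily batch job finishes faster
# and competes for less time with online serving. All names are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")   # placeholders

entity_type = aiplatform.featurestore.EntityType(
    entity_type_name="customer",              # placeholder
    featurestore_id="my_featurestore",        # placeholder
)

entity_type.ingest_from_bq(
    feature_ids=["age", "lifetime_value"],    # placeholder feature IDs
    feature_time="event_timestamp",           # column holding the feature timestamp
    bq_source_uri="bq://my-project.my_dataset.daily_features",
    worker_count=10,                          # assumption: more workers shorten ingestion
)
```
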
Question # 77

You work for a textile manufacturing company. Your company has hundreds of machines, and each machine has many sensors. Your team used the sensor data to build hundreds of ML models that detect machine anomalies. Models are retrained daily, and you need to deploy these models in a cost-effective way. The models must operate 24/7 without downtime and make sub-millisecond predictions. What should you do?

A.

Deploy a Dataflow batch pipeline and a Vertex AI Prediction endpoint.

B.

Deploy a Dataflow batch pipeline with the RunInference API, and use model refresh.

C.

Deploy a Dataflow streaming pipeline and a Vertex AI Prediction endpoint with autoscaling.

D.

Deploy a Dataflow streaming pipeline with the RunInference API, and use automatic model refresh.

Full Access
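
Option D of the question above uses the Apache Beam RunInference API in a streaming Dataflow pipeline. A rough sketch with a scikit-learn model handler is below; the Pub/Sub subscription, model path, message format, and the choice of handler are assumptions made for illustration, and the automatic model refresh mechanism is omitted.

```python
# Sketch: a streaming Beam pipeline that reads sensor readings from Pub/Sub and
# runs them through RunInference with a scikit-learn anomaly model loaded from
# Cloud Storage. Subscription, model path, and message format are assumptions.
import json

import numpy as np
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

model_handler = SklearnModelHandlerNumpy(
    model_uri="gs://my-bucket/models/anomaly_detector.pkl"   # placeholder
)

options = PipelineOptions(streaming=True)   # plus the usual Dataflow runner options

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadSensorData" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/sensor-readings")
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "ToFeatureVector" >> beam.Map(
            lambda record: np.array(record["features"], dtype=np.float32))
        | "Predict" >> RunInference(model_handler)
        | "LogResults" >> beam.Map(print)   # replace with a real sink in production
    )
```
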
Question # 78

You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. What should you do?

A.

Use the AI Platform custom containers feature to receive training jobs using any framework.

B.

Configure Kubeflow to run on Google Kubernetes Engine and receive training jobs through TFJob.

C.

Create a library of VM images on Compute Engine, and publish these images in a centralized repository.

D.

Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.

Full Access
Question # 79

Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers' account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?

A.

1. Create a Pub/Sub topic for each user.

2. Deploy a Cloud Function that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.

B.

1. Create a Pub/Sub topic for each user.

2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.

C.

1. Build a notification system on Firebase.

2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold.

D.

1. Build a notification system on Firebase.

2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.

Full Access
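
Options C and D of the question above send the alert through Firebase Cloud Messaging. A minimal server-side sketch with the Firebase Admin SDK is shown below; the device token, message text, and the per-user prediction check are assumptions, with the forecast itself produced upstream.

```python
# Sketch: send a low-balance alert to a single user's device via Firebase Cloud
# Messaging after the forecasting model has predicted that the user's balance
# will drop below $25. The device token is a placeholder.
import firebase_admin
from firebase_admin import messaging

firebase_admin.initialize_app()   # uses application default credentials


def notify_low_balance(device_token: str, predicted_balance: float) -> None:
    """Send a push notification if the 3-day balance forecast is under $25."""
    if predicted_balance >= 25:
        return

    message = messaging.Message(
        notification=messaging.Notification(
            title="Low balance warning",
            body=f"Your balance is forecast to drop to ${predicted_balance:.2f} within 3 days.",
        ),
        token=device_token,
    )
    messaging.send(message)   # returns a message ID string


# Example usage: notify_low_balance("registration-token-from-app", 18.40)
```
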
Question # 80

You work at a bank. You need to develop a credit risk model to support loan application decisions. You decide to implement the model by using a neural network in TensorFlow. Due to regulatory requirements, you need to be able to explain the model's predictions based on its features. When the model is deployed, you also want to monitor the model's performance over time. You decided to use Vertex AI for both model development and deployment. What should you do?

A.

Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution drift.

B.

Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution skew.

C.

Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution drift.

D.

Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution skew.

Full Access