
Databricks-Machine-Learning-Professional Exam Dumps - Databricks Certified Machine Learning Professional

Question # 4

A data scientist has developed a scikit-learn random forest model, but they have not yet logged the model with MLflow. They want to obtain the input schema and the output schema of the model so they can document what type of data is expected as input.

Which of the following MLflow operations can be used to perform this task?

A.

mlflow.models.schema.infer_schema

B.

mlflow.models.signature.infer_signature

C.

mlflow.models.Model.get_input_schema

D.

mlflow.models.Model.signature

E.

There is no way to obtain the input schema and the output schema of an unlogged model.

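Reference sketch for the API named above: a minimal, hedged example of inferring a model signature for a not-yet-logged scikit-learn model with mlflow.models.signature.infer_signature. The dataset and estimator below are illustrative assumptions, not part of the question.

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from mlflow.models.signature import infer_signature

# Fit an example random forest (illustrative data only)
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=10).fit(X, y)

# infer_signature derives the input and output schemas from example data,
# even though the model has not been logged with MLflow yet
signature = infer_signature(X, model.predict(X))
print(signature.inputs)   # input schema
print(signature.outputs)  # output schema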
Question # 5

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark.

Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?

A.

client.list_run_infos(exp_id)

B.

spark.read.format("delta").load(exp_id)

C.

There is no way to programmatically return run-level results from an MLflow Experiment.

D.

mlflow.search_runs(exp_id)

E.

spark.read.format("mlflow-experiment").load(exp_id)

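Reference sketch, assuming an experiment ID string exp_id, an active MLflow client, and an active Spark session spark on Databricks; it illustrates the two run-retrieval calls named in the options.

import mlflow

# Returns run-level results for the experiment as a pandas DataFrame by default
runs_pdf = mlflow.search_runs(experiment_ids=[exp_id])

# On Databricks, run-level results can also be loaded directly into a Spark DataFrame
runs_sdf = spark.read.format("mlflow-experiment").load(exp_id)
runs_sdf.show()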
Question # 6

A machine learning engineer is attempting to create a webhook that will trigger a Databricks Job job_id when a model version for model model transitions into any MLflow Model Registry stage.

They have the following incomplete code block:

Which of the following lines of code can be used to fill in the blank so that the code block accomplishes the task?

A.

"MODEL_VERSION_CREATED"

B.

"MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"

C.

"MODEL_VERSION_TRANSITIONED_TO_STAGING"

D.

"MODEL_VERSION_TRANSITIONED_STAGE"

E.

"MODEL_VERSION_TRANSITIONED_TO_STAGING", "MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"

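The incomplete code block referenced in the question is not reproduced above. As a hedged sketch only, a Job-triggering Model Registry webhook created with the databricks-registry-webhooks package looks roughly like this; the workspace URL and access token are placeholders, and the blank from the question is left unfilled.

from databricks_registry_webhooks import RegistryWebhooksClient, JobSpec

job_spec = JobSpec(
    job_id=job_id,                              # the Databricks Job from the question
    workspace_url="https://<workspace-url>",    # placeholder
    access_token="<personal-access-token>",     # placeholder
)

webhook = RegistryWebhooksClient().create_webhook(
    model_name=model,
    events=["_____"],    # the blank from the question goes here
    job_spec=job_spec,
    status="ACTIVE",
)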
Question # 7

A machine learning engineer wants to log and deploy a model as an MLflow pyfunc model. They have custom preprocessing that needs to be completed on feature variables prior to fitting the model or computing predictions using that model. They decide to wrap this preprocessing in a custom model class ModelWithPreprocess, where the preprocessing is performed when calling fit and when calling predict. They then log the fitted model of the ModelWithPreprocess class as a pyfunc model.

Which of the following is a benefit of this approach when loading the logged pyfunc model for downstream deployment?

A.

The pyfunc model can be used to deploy models in a parallelizable fashion

B.

The same preprocessing logic will automatically be applied when calling fit

C.

The same preprocessing logic will automatically be applied when calling predict

D.

This approach has no impact when loading the logged pyfunc model for downstream deployment

E.

There is no longer a need for pipeline-like machine learning objects

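Reference sketch (class body and preprocessing logic are illustrative assumptions): wrapping a fitted estimator and its preprocessing in a custom pyfunc class so that the preprocessing travels with the logged model.

import mlflow
import mlflow.pyfunc

class ModelWithPreprocess(mlflow.pyfunc.PythonModel):
    def __init__(self, fitted_model):
        # fitted_model: an already trained estimator (assumed to exist)
        self.fitted_model = fitted_model

    def _preprocess(self, model_input):
        # custom preprocessing applied to the feature variables
        return model_input.fillna(0)

    def predict(self, context, model_input):
        # the same preprocessing runs automatically whenever predict is called
        return self.fitted_model.predict(self._preprocess(model_input))

with mlflow.start_run():
    mlflow.pyfunc.log_model("model", python_model=ModelWithPreprocess(fitted_model))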
Question # 8

Which of the following statements describes streaming with Spark as a model deployment strategy?

A.

The inference of batch processed records as soon as a trigger is hit

B.

The inference of all types of records in real-time

C.

The inference of batch processed records as soon as a Spark job is run

D.

The inference of incrementally processed records as soon as a trigger is hit

E.

The inference of incrementally processed records as soon as a Spark job is run

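A hedged sketch of the strategy asked about above: Spark Structured Streaming processes records incrementally and scores them each time the trigger fires. The paths and model URI are placeholders, and an active Spark session spark is assumed.

import mlflow.pyfunc

predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri="models:/model/Production")

stream_df = spark.readStream.format("delta").load("/path/to/incoming/features")

scored_df = stream_df.withColumn("prediction", predict_udf(*stream_df.columns))

(scored_df.writeStream
    .format("delta")
    .option("checkpointLocation", "/path/to/checkpoint")
    .trigger(processingTime="1 minute")   # inference runs each time the trigger is hit
    .start("/path/to/predictions"))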
Question # 9

A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on the customer-level Spark DataFrame spark_df, but it is missing a few of the static features that were used when training the model. The customer_id column is the primary key of both spark_df and the training set used when training and logging the model.

Which of the following code blocks can be used to compute predictions for spark_df when the missing feature values can be found in the Feature Store by searching for features by customer_id?

A.

df = fs.get_missing_features(spark_df, model_uri)

fs.score_model(model_uri, df)

B.

fs.score_model(model_uri, spark_df)

C.

df = fs.get_missing_features(spark_df, model_uri)

fs.score_batch(model_uri, df)

D.

df = fs.get_missing_features(spark_df)

fs.score_batch(model_uri, df)

E.

fs.score_batch(model_uri, spark_df)

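Reference sketch, assuming the model was packaged through the Feature Store client (for example via fs.log_model) so that batch scoring can look up the missing feature values by customer_id:

from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# score_batch joins the missing feature values from the Feature Store to
# spark_df using the customer_id primary key recorded with the training set,
# then returns the DataFrame with a "prediction" column appended
predictions_df = fs.score_batch(model_uri, spark_df)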
Question # 10

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

A.

Cloud-based compute

B.

None of these tools

C.

REST APIs

D.

Containers

E.

Autoscaling clusters

Question # 11

A machine learning engineer needs to deliver predictions of a machine learning model in real-time. However, the feature values needed for computing the predictions are available one week before the query time.

Which of the following is a benefit of using a batch serving deployment in this scenario rather than a real-time serving deployment where predictions are computed at query time?

A.

Batch serving has built-in capabilities in Databricks Machine Learning

B.

There is no advantage to using batch serving deployments over real-time serving deployments

C.

Computing predictions in real-time provides more up-to-date results

D.

Testing is not possible in real-time serving deployments

E.

Querying stored predictions can be faster than computing predictions in real-time

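A hedged sketch of the batch-serving pattern in this scenario: predictions are precomputed from the week-old feature values and stored, so serving becomes a fast lookup rather than model computation at query time. Table and column names are illustrative assumptions.

import mlflow.pyfunc

predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri="models:/model/Production")

features_df = spark.table("weekly_features")

(features_df
    .withColumn("prediction", predict_udf(*features_df.drop("customer_id").columns))
    .write.mode("overwrite")
    .saveAsTable("stored_predictions"))

# At query time, the application only reads the stored prediction
spark.table("stored_predictions").filter("customer_id = 42").select("prediction")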
Question # 12

A machine learning engineer is converting a Hyperopt-based hyperparameter tuning process from manual MLflow logging to MLflow Autologging. They are trying to determine how to manage nested Hyperopt runs with MLflow Autologging.

Which of the following approaches will create a single parent run for the process and a child run for each unique combination of hyperparameter values when using Hyperopt and MLflow Autologging?

A.

Starting a manual parent run before calling fmin

B.

Ensuring that a built-in model flavor is used for the model logging

C.

Starting a manual child run within the objective function

D.

There is no way to accomplish nested runs with MLflow Autologging and Hyperopt

E.

MLflow Autologging will automatically accomplish this task with Hyperopt

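Reference sketch (objective function and search space are placeholder assumptions, and a Databricks cluster is assumed for SparkTrials) of how Hyperopt is typically wrapped in a manually started parent run when MLflow autologging is enabled, so that each hyperparameter combination appears as a nested child run:

import mlflow
from hyperopt import fmin, tpe, hp, SparkTrials

mlflow.autolog()

def objective(params):
    # placeholder: train a model with `params` and return the validation loss
    return (params["max_depth"] - 5) ** 2

search_space = {"max_depth": hp.quniform("max_depth", 2, 10, 1)}

with mlflow.start_run(run_name="hyperopt_tuning"):   # manual parent run
    best_params = fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,
        max_evals=16,
        trials=SparkTrials(parallelism=4),
    )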
Question # 13

Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?

A.

All of these reasons

B.

JS is not normalized or smoothed

C.

None of these reasons

D.

JS is more robust when working with large datasets

E.

JS does not require any manual threshold or cutoff determinations

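A hedged illustration of the two drift measures being compared: the JS distance with log base 2 is bounded between 0 and 1, while the KS test returns a statistic and p-value that must be compared against a chosen significance cutoff. The data and bin count below are illustrative.

import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import ks_2samp

reference = np.random.normal(0.0, 1.0, 10_000)   # training-time feature values
current = np.random.normal(0.2, 1.0, 10_000)     # production feature values

# Discretize both samples over shared bins before computing JS distance
bins = np.histogram_bin_edges(np.concatenate([reference, current]), bins=50)
p, _ = np.histogram(reference, bins=bins, density=True)
q, _ = np.histogram(current, bins=bins, density=True)
js_distance = jensenshannon(p, q, base=2)             # bounded in [0, 1]

ks_statistic, p_value = ks_2samp(reference, current)  # thresholded by a chosen alpha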
Question # 14

Which of the following operations in Feature Store Client fs can be used to return a Spark DataFrame of a data set associated with a Feature Store table?

A.

fs.create_table

B.

fs.write_table

C.

fs.get_table

D.

There is no way to accomplish this task with fs

E.

fs.read_table

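Reference sketch, assuming a Feature Store table named "db.customer_features" already exists (the table name is illustrative):

from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# read_table returns the feature table's data set as a Spark DataFrame
features_df = fs.read_table(name="db.customer_features")
features_df.show()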
Question # 15

A data scientist has computed updated feature values for all primary key values stored in the Feature Store table features. In addition, feature values for some new primary key values have also been computed. The updated feature values are stored in the DataFrame features_df. They want to replace all data in features with the newly computed data.

Which of the following code blocks can they use to perform this task using the Feature Store Client fs?

A)

B)

C)

D)

E)

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

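The answer options above are screenshots that are not reproduced here. As a hedged reference sketch only, replacing all existing data in a Feature Store table with newly computed values is typically done with write_table in overwrite mode:

from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

fs.write_table(
    name="features",
    df=features_df,
    mode="overwrite",   # replaces the existing rows instead of merging into them
)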
Question # 16

Which of the following lists all of the model stages that are available in the MLflow Model Registry?

A.

Development, Staging, Production

B.

None, Staging, Production

C.

Staging, Production, Archived

D.

None, Staging, Production, Archived

E.

Development, Staging, Production, Archived

Question # 17

Which of the following MLflow Model Registry use cases requires the use of an HTTP Webhook?

A.

Starting a testing job when a new model is registered

B.

Updating data in a source table for a Databricks SQL dashboard when a model version transitions to the Production stage

C.

Sending an email alert when an automated testing Job fails

D.

None of these use cases require the use of an HTTP Webhook

E.

Sending a message to a Slack channel when a model version transitions stages

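A hedged sketch of an HTTP registry webhook (the URL and secret are placeholders), the kind of webhook that calls an external endpoint such as a Slack incoming webhook when a model version transitions stages:

from databricks_registry_webhooks import RegistryWebhooksClient, HttpUrlSpec

http_url_spec = HttpUrlSpec(
    url="https://hooks.slack.com/services/<placeholder>",
    secret="<shared-secret>",
)

webhook = RegistryWebhooksClient().create_webhook(
    model_name="model",
    events=["MODEL_VERSION_TRANSITIONED_STAGE"],
    http_url_spec=http_url_spec,
    status="ACTIVE",
)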
Question # 18

A machine learning engineer is migrating a machine learning pipeline to use Databricks Machine Learning. They have programmatically identified the best run from an MLflow Experiment and stored its URI in the model_uri variable and its Run ID in the run_id variable. They have also determined that the model was logged with the name "model". Now, the machine learning engineer wants to register that model in the MLflow Model Registry with the name "best_model".

Which of the following lines of code can they use to register the model to the MLflow Model Registry?

A.

mlflow.register_model(model_uri, "best_model")

B.

mlflow.register_model(run_id, "best_model")

C.

mlflow.register_model(f"runs:/{run_id}/best_model", "model")

D.

mlflow.register_model(model_uri, "model")

E.

mlflow.register_model(f"runs:/{run_id}/model")

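Reference sketch, assuming model_uri points at the logged "model" artifact of the best run (for example f"runs:/{run_id}/model"):

import mlflow

# Registers the logged model under the registry name "best_model"
model_version = mlflow.register_model(model_uri, "best_model")
print(model_version.name, model_version.version)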