AIP-210 Exam Dumps - CertNexus Certified Artificial Intelligence Practitioner

Question # 4

Workflow design patterns for machine learning pipelines:

A.

Aim to explain how the machine learning model works.

B.

Represent a pipeline with directed acyclic graph (DAG).

C.

Seek to simplify the management of machine learning features.

D.

Separate inputs from features.
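
For readers unfamiliar with the idea of representing a pipeline as a directed acyclic graph, here is a minimal sketch using Python's standard-library `graphlib` module. The step names are hypothetical and chosen only for illustration; real pipelines will have their own stages.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps, each mapped to the set of steps it depends on.
pipeline = {
    "ingest": set(),
    "validate": {"ingest"},
    "engineer_features": {"validate"},
    "train": {"engineer_features"},
    "evaluate": {"train"},
}

# static_order() yields each step only after all of its dependencies.
# It raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

Because every step depends only on earlier steps, the graph has no cycles and a valid execution order always exists.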

Question # 5

You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)

A.

K-means

B.

K-nearest neighbors

C.

Lasso regression

D.

Logistic regression

E.

Ridge regression

Question # 6

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question # 7

Which of the following pieces of AI technology provides the ability to create fake videos?

A.

Generative adversarial networks (GAN)

B.

Long short-term memory (LSTM) networks

C.

Recurrent neural networks (RNN)

D.

Support-vector machines (SVM)

Question # 8

Which of the following tests should be performed at the production level before deploying a newly retrained model?

A.

A/B test

B.

Performance test

C.

Security test

D.

Unit test

Question # 9

Which type of regression represents the following formula: y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable?

A.

Lasso regression

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
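
To make the formula y = c + b*x concrete, the sketch below estimates c and b from a small made-up dataset using ordinary least squares. The function name `fit_simple_linear` and the sample data are assumptions for illustration only.

```python
def fit_simple_linear(xs, ys):
    """Return (c, b) for y = c + b*x, estimated by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The intercept c makes the fitted line pass through the point of means.
    c = mean_y - b * mean_x
    return c, b

# Made-up data generated exactly from y = 3 + 2*x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 7.0, 9.0, 11.0]
c, b = fit_simple_linear(xs, ys)
print(c, b)
```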

Question # 10

A dataset can contain a range of values that depict a certain characteristic, such as grades on tests in a class during the semester. A specific student has so far received the following grades: 76, 81, 78, 87, 75, and 72. There is one final test in the semester. What minimum grade would the student need to achieve on the last test to get an 80% average?

A.

82

B.

89

C.

91

D.

94
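
The arithmetic behind this question can be checked with a short calculation, assuming all seven tests carry equal weight: the total needed across every test is the target average times the number of tests, and the required final grade is that total minus the points already earned.

```python
grades = [76, 81, 78, 87, 75, 72]
target_average = 80

# Assumption: the final test counts the same as each earlier test.
total_tests = len(grades) + 1
required = target_average * total_tests - sum(grades)
print(required)  # 560 - 469 = 91
```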

Question # 11

What is the primary benefit of the Federated Learning approach to machine learning?

A.

It does not require a labeled dataset to solve supervised learning problems.

B.

It protects the privacy of the user's data while providing well-trained models.

C.

It requires less computation to train the same model than a traditional approach.

D.

It uses large, centralized data stores to train complex machine learning models.

Question # 12

Why do data skews happen in the ML pipeline?

A.

Test and evaluation data are designed incorrectly.

B.

There is a mismatch between live input data and offline data.

C.

There is a mismatch between live output data and offline data.

D.

There is insufficient training data for evaluation.

Question # 13

Which of the following regressions will help when there is the existence of near-linear relationships among the independent variables (collinearity)?

A.

Clustering

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
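
As a rough illustration of how collinearity affects coefficient estimates, the sketch below fits a two-feature model with and without an L2 penalty using the closed-form solution (X^T X + lam*I)^-1 X^T y. The helper `ridge_two_features`, the data, and the penalty value are all made up for illustration, and the model has no intercept.

```python
def ridge_two_features(X, y, lam):
    """Closed-form L2-penalized least squares for two features, no intercept."""
    # Entries of the symmetric 2x2 matrix X^T X + lam * I.
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    # Entries of the vector X^T y.
    g0 = sum(r[0] * t for r, t in zip(X, y))
    g1 = sum(r[1] * t for r, t in zip(X, y))
    # Solve the 2x2 system with the explicit inverse.
    det = a * d - b * b
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)

# Made-up data: the second feature is almost a copy of the first.
X = [[1.0, 1.01], [2.0, 1.99], [3.0, 3.02], [4.0, 3.98]]
y = [2.1, 3.9, 6.1, 7.9]

ols = ridge_two_features(X, y, lam=0.0)    # no penalty: ordinary least squares
ridge = ridge_two_features(X, y, lam=1.0)  # hypothetical penalty strength
print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
```

With nearly collinear features, the unpenalized coefficients tend to be large and of opposite sign, while the penalized ones are shrunk toward smaller, more stable values.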

Question # 14

Which of the following best describes distributed artificial intelligence?

A.

It does not require hyperparameter tuning because the distributed nature accounts for the bias.

B.

It intelligently pre-distributes the weight of starting a neural network.

C.

It relies on a distributed system that performs robust computations across a network of unreliable nodes.

D.

It uses a centralized system to speak to decentralized nodes.

Question # 15

A product manager is designing an Artificial Intelligence (AI) solution and wants to do so responsibly, evaluating both positive and negative outcomes.

The team creates a shared taxonomy of potential negative impacts and conducts an assessment along vectors such as severity, impact, frequency, and likelihood.

Which modeling technique does this team use?

A.

Business

B.

Harms

C.

Process

D.

Threat

Question # 16

Which two of the following statements about the beta value in an A/B test are accurate? (Select two.)

A.

The Beta value is the rate of type II errors for the test.

B.

The Beta value is the rate of type I errors for the test.

C.

The statistical power of a test is the complement of the Beta value, or 1 - Beta.

D.

The Beta in an Alpha/Beta test represents one of the two variants of the A/B test.
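
The relationship between the type II error rate and statistical power can be sketched numerically. The example below assumes a one-sided z-test and a true effect expressed in standard-error units. The function name and the effect size of 2.5 are hypothetical choices for illustration.

```python
from statistics import NormalDist

def one_sided_z_test_power(effect_in_se, alpha=0.05):
    """Return (power, beta) for a one-sided z-test.

    effect_in_se is the true effect divided by the standard error.
    """
    std_normal = NormalDist()
    # Reject the null hypothesis when the z statistic exceeds this value.
    critical = std_normal.inv_cdf(1 - alpha)
    # Type II error: the effect is real but the statistic falls below the cutoff.
    beta = std_normal.cdf(critical - effect_in_se)
    return 1 - beta, beta

power, beta = one_sided_z_test_power(2.5)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```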

Question # 17

Personal data should not be disclosed, made available, or otherwise used for purposes other than specified with which of the following exceptions? (Select two.)

A.

If it is for a good cause.

B.

If it was collected accidentally.

C.

If it was requested by the authority of law.

D.

If it was with consent of the person it is collected from.

E.

If the data is only collected once.

Question # 18

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

A.

Decision tree

B.

Logistic regression

C.

Random forest

D.

XGBoost

Question # 19

An organization sells house security cameras and has asked their data scientists to implement a model to detect human faces, as distinguished from animals, so they can alert their customers only when a human gets close to their house.

Which of the following algorithms is an appropriate option with a correct reason?

A.

A decision tree algorithm, because the problem is a classification problem with a small number of features.

B.

k-means, because this is a clustering problem with a small number of features.

C.

Logistic regression, because this is a classification problem and our data is linearly separable.

D.

Neural network model, because this is a classification problem with a large number of features.

Question # 20

Which of the following metrics is being captured when performing principal component analysis?

A.

Kurtosis

B.

Missingness

C.

Skewness

D.

Variance
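
For intuition about what principal component analysis computes, the sketch below works through the two-dimensional case by hand: it builds the sample covariance matrix and finds its eigenvalues with the closed-form formula for a symmetric 2x2 matrix. The data are made up, and this is an illustration only, not a general PCA implementation.

```python
import math

# Made-up, strongly correlated two-dimensional data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (sxx + syy) / 2
half_gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2) / 2
eig1 = half_trace + half_gap
eig2 = half_trace - half_gap

# Each eigenvalue is the spread of the data along one principal component.
ratio = eig1 / (eig1 + eig2)
print(f"share of total spread on the first component: {ratio:.4f}")
```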

Question # 21

Which of the following is NOT a valid cross-validation method?

A.

Bootstrapping

B.

K-fold

C.

Leave-one-out

D.

Stratification
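
As background for the terms in this question, here is a minimal sketch of how a k-fold split partitions sample indices. The helper `k_fold_indices` is hypothetical and uses contiguous, unshuffled folds for simplicity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Setting `k` equal to `n_samples` makes every fold a single held-out sample, which is the leave-one-out scheme.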

Question # 22

R-squared is a statistical measure that:

A.

Combines precision and recall of a classifier into a single metric by taking their harmonic mean.

B.

Expresses the extent to which two variables are linearly related.

C.

Is the proportion of the variance for a dependent variable that's explained by independent variables.

D.

Represents the extent to which two random variables vary together.
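
For reference, R-squared is commonly computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below applies that formula to made-up values. The function name and the data are assumptions for illustration.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up actual values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]
r2 = r_squared(y_true, y_pred)
print(r2)  # SS_res = 1.0, SS_tot = 20.0, so R^2 = 0.95
```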

Question # 23

What is Word2vec?

A.

A bag of words.

B.

A matrix of how frequently words appear in a group of documents.

C.

A word embedding method that builds a one-hot encoded matrix from samples and the terms that appear in them.

D.

A word embedding method that finds characteristics of words in a very large number of documents.

Question # 24

A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist cannot forget to include?

A.

Cyberprotection

B.

Cybersecurity

C.

Data privacy

D.

Data security

Question # 25

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder
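
To show what turning categories into numbers looks like in practice, the sketch below hand-rolls two common schemes, count encoding and one-hot encoding, on a made-up column of color labels. It uses plain Python rather than any particular library's encoder classes.

```python
from collections import Counter

# Made-up categorical column.
colors = ["red", "green", "red", "blue", "red", "green"]

# Count encoding: replace each category with how often it occurs.
counts = Counter(colors)
count_encoded = [counts[c] for c in colors]

# One-hot encoding: one binary column per distinct category.
categories = sorted(set(colors))
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(count_encoded)  # [3, 2, 3, 1, 3, 2]
print(categories)     # ['blue', 'green', 'red']
print(one_hot[0])     # [0, 0, 1]
```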

Question # 26

You create a prediction model with 96% accuracy. While the model's true positive rate (TPR) is performing well at 99%, the true negative rate (TNR) is only 50%. Your supervisor tells you that the TNR needs to be higher, even if it decreases the TPR. Upon further inspection, you notice that the vast majority of your data is truly positive.

What method could help address your issue?

A.

Normalization

B.

Oversampling

C.

Principal components analysis

D.

Quality filtering
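
For context on class imbalance, here is a minimal sketch of random oversampling, which duplicates minority-class rows until both classes have the same number of examples. The function `random_oversample`, the binary-label assumption, and the data are hypothetical.

```python
import random

def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate random minority rows until both classes are the same size.

    Assumes a binary problem where every other label is the majority class.
    """
    rng = random.Random(seed)
    minority = [r for r, lab in zip(rows, labels) if lab == minority_label]
    majority_count = sum(1 for lab in labels if lab != minority_label)
    # Sample with replacement from the minority rows to fill the gap.
    extra = [rng.choice(minority) for _ in range(majority_count - len(minority))]
    return rows + extra, labels + [minority_label] * len(extra)

# Made-up data: 8 positive examples and 2 negative examples.
rows = list(range(10))
labels = [1] * 8 + [0] * 2
new_rows, new_labels = random_oversample(rows, labels, minority_label=0)
print(new_labels.count(1), new_labels.count(0))
```

Resampling of this kind should be applied to the training split only, so that evaluation data still reflects the real class balance.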
