Curious about Actual Snowflake SnowPro Certification (DSA-C02) Exam Questions?
Here are sample Snowflake SnowPro Advanced: Data Scientist Certification (DSA-C02) Exam questions from the real exam. You can get more Snowflake SnowPro Certification (DSA-C02) Exam premium practice questions at TestInsights.
Which of the following cross-validation variants may not be suitable for very large datasets with hundreds of thousands of samples?
Correct : B
Leave-one-out cross-validation (LOOCV) is not suitable for very large datasets because it requires training and evaluating one model for every sample in the dataset.
Cross validation
Cross-validation is a technique for evaluating a machine learning model and the basis for a whole class of model-evaluation methods. Its goal is to test the model's ability to predict new data that was not used in fitting it. The idea is to split the dataset into a number of subsets, hold one subset aside, train the model on the rest, and test the model on the held-out subset.
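As a concrete illustration, here is a minimal sketch of k-fold cross-validation using scikit-learn (the library choice and the synthetic data are assumptions; the exam text names no specific tools):

```python
# Minimal k-fold cross-validation sketch (scikit-learn is an assumed library).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary-classification data, purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Split into 5 folds; each fold is held out once while the model
# trains on the remaining 4 folds.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 held-out folds
```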
Leave-one-out cross validation
Leave-one-out cross-validation is K-fold cross-validation taken to its logical extreme, with K equal to N, the number of data points in the set. That means that, N separate times, the function approximator is trained on all the data except for one point, and a prediction is made for that point. As before, the average error is computed and used to evaluate the model. The evaluation given by leave-one-out cross-validation is therefore very expensive to compute.
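The cost is easy to see in code. Below is a minimal sketch of leave-one-out cross-validation, again using scikit-learn as an assumed library: with N samples it produces N train/test splits, so N separate models must be trained.

```python
# Minimal leave-one-out cross-validation sketch (scikit-learn assumed).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

loo = LeaveOneOut()
print(loo.get_n_splits(X))  # 100 splits: one per data point

# Each split trains on 99 samples and predicts the single held-out one;
# the mean score over all splits is the LOO estimate. With hundreds of
# thousands of samples this means hundreds of thousands of model fits.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=loo)
print(scores.mean())
```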
Which of the following cross-validation variants offers quicker cross-validation suitable for very large datasets with hundreds of thousands of samples?
Correct : C
The holdout method is suitable for very large datasets because it is the simplest and quickest-to-compute version of cross-validation.
Holdout method
In this method, the dataset is divided into two sets, the training set and the test set, with the basic property that the training set is bigger than the test set. The model is then trained on the training set and evaluated on the test set.
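As an illustration, here is a minimal sketch of the holdout method using scikit-learn's train_test_split (an assumed tool; the question names none). Only a single model is trained, which is why holdout stays fast even on very large datasets:

```python
# Minimal holdout-method sketch (scikit-learn assumed).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100_000, n_features=20, random_state=1)

# One split: 80% training, 20% test -- a single model fit, regardless
# of dataset size.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on the held-out 20%
```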
Which of the following is a common evaluation metric for binary classification?
Correct : D
The area under the ROC curve (AUC) is a common evaluation metric for binary classification; it measures a classifier's performance across all threshold values for the predicted probabilities. Other common metrics include accuracy, precision, recall, and the F1 score, which are derived from the confusion matrix of true positives, false positives, true negatives, and false negatives.
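For illustration, the sketch below computes AUC and the other metrics named above with scikit-learn (an assumed library). Note that AUC is computed from predicted probabilities, while the confusion-matrix metrics use thresholded labels:

```python
# Minimal binary-classification metrics sketch (scikit-learn assumed).
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]                  # ground-truth labels
y_prob = [0.1, 0.4, 0.8, 0.7, 0.9, 0.3, 0.6, 0.2]  # predicted P(class=1)
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]    # threshold at 0.5

print(roc_auc_score(y_true, y_prob))   # uses probabilities, not labels
print(accuracy_score(y_true, y_pred))  # the rest use thresholded labels
print(precision_score(y_true, y_pred))
print(recall_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
```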
The most widely used metrics and tools to assess a classification model are:
Correct : D
You are training a binary classification model to support admission approval decisions for a college degree program.
How can you evaluate whether the model is fair and does not discriminate based on ethnicity?
Correct : C
By using ethnicity as a sensitive field and comparing the disparity in selection rates and performance metrics across ethnicity values, you can evaluate the fairness of the model.
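As a rough illustration of such a disparity check, the sketch below groups model decisions by the sensitive field and compares selection rates per group (pandas is an assumed tool, and the column names and data are hypothetical):

```python
# Minimal selection-rate disparity sketch (pandas assumed; data hypothetical).
import pandas as pd

df = pd.DataFrame({
    "ethnicity": ["A", "A", "A", "B", "B", "B", "B", "A"],  # sensitive field
    "approved":  [1,   0,   1,   0,   0,   1,   0,   1],    # model decision
})

# Selection rate = fraction of applicants the model approves, per group.
rates = df.groupby("ethnicity")["approved"].mean()
print(rates)

# A large gap (or a ratio far from 1) between groups flags potential
# disparate impact worth investigating further.
print("disparity (max - min):", rates.max() - rates.min())
```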
Total 65 questions