A data scientist receives an update on a business case about a machine that has thousands of error codes. The data scientist creates the following summary statistics profile while reviewing the logs for each machine:
| Number of machines observed | 3,000,000
| Number of unique error codes observed | 19,000
| Median number of unique codes per machine | 7
| Median number of error transactions | 45
Which of the following is the most likely concern with respect to data design for model ingestion?
A data analyst wants to generate the most data using tables from a database. Which of the following is the best way to accomplish this objective?
A computer vision model is trained to identify cats on a training set that is composed of both cat and dog images. The model predicts a picture of a cat is a dog. Which of the following describes this error?
A data scientist is building a proof of concept for a commercialized machine-learning model. Which of the following is the best starting point?
Which of the following best describes the minimization of the residual term in a LASSO linear regression?
A data scientist needs to:
Build a predictive model that gives the likelihood that a car will get a flat tire.
Provide a data set of cars that had flat tires and cars that did not.
All the cars in the data set had sensors taking weekly measurements of tire pressure similar to the sensors that will be installed in the cars consumers drive.
Which of the following is the most immediate data concern?
Which of the following belong in a presentation to the senior management team and/or C-suite executives? (Choose two.)
Which of the following environmental changes is most likely to resolve a memory constraint error when running a complex model using distributed computing?
A data scientist needs to analyze a company's chemical businesses and is using the master database of the conglomerate company. Nothing in the data differentiates the data observations for the different businesses. Which of the following is the most efficient way to identify the chemical businesses' observations?
A data analyst wants to save a newly analyzed data set to a local storage option. The data set must meet the following requirements:
Be minimal in size
Have the ability to be ingested quickly
Have the associated schema, including data types, stored with it
Which of the following file types is the best to use?
PDF + Testing Engine
|
---|
$57.75 |
Testing Engine
|
---|
$43.75 |
PDF (Q&A)
|
---|
$36.75 |
CompTIA Free Exams |
---|
![]() |