CompTIA DY0-001 Exam
CompTIA DataX

Updated On: 7-Feb-2026

Which of the following modeling tools is appropriate for solving a scheduling problem?

  A. One-armed bandit
  B. Constrained optimization
  C. Decision tree
  D. Gradient descent

Answer(s): B

Explanation:

Scheduling problems require finding the best allocation of resources subject to constraints (e.g., time slots, resource availability), which is precisely what constrained optimization algorithms are designed to handle.
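
As a rough illustration of the explanation above, the sketch below treats a tiny scheduling problem as constrained optimization: minimize total cost, subject to one job per slot and an availability rule. The job names, costs, and the forbidden pair are hypothetical, and the brute-force search stands in for a real solver, which a problem of practical size would need.

```python
from itertools import permutations

# Hypothetical data: cost of running each job in each time slot.
COST = {
    "backup": {"09:00": 4, "10:00": 2, "11:00": 3},
    "report": {"09:00": 1, "10:00": 3, "11:00": 5},
    "deploy": {"09:00": 2, "10:00": 2, "11:00": 1},
}
# Hypothetical hard constraint: (job, slot) pairs that are not allowed.
FORBIDDEN = {("report", "09:00")}


def best_schedule(cost, forbidden):
    """Return (total_cost, assignment) for the cheapest feasible schedule."""
    jobs = list(cost)
    slots = list(next(iter(cost.values())))
    best = None
    # Each permutation gives every job its own slot (one job per slot).
    for order in permutations(slots, len(jobs)):
        pairs = list(zip(jobs, order))
        if any(pair in forbidden for pair in pairs):
            continue  # violates an availability constraint
        total = sum(cost[job][slot] for job, slot in pairs)
        if best is None or total < best[0]:
            best = (total, dict(pairs))
    return best


total, schedule = best_schedule(COST, FORBIDDEN)
print(total, schedule)
```

Calling `best_schedule(COST, set())` drops the availability rule and returns a cheaper schedule, which shows that the constraint, not only the objective, determines the answer.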



Which of the following environmental changes is most likely to resolve a memory constraint error when running a complex model using distributed computing?

  A. Converting an on-premises deployment to a containerized deployment
  B. Migrating to a cloud deployment
  C. Moving model processing to an edge deployment
  D. Adding nodes to a cluster deployment

Answer(s): D

Explanation:

Adding nodes to the cluster increases the total memory available to the distributed system, which relieves memory-constraint errors without changing the code or the deployment paradigm. Containerized and edge deployments do not inherently provide more memory, and migrating to the cloud does not by itself add nodes unless the cluster is explicitly scaled out.
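
The arithmetic behind that reasoning can be sketched as follows. This is a simplified estimate, assuming the workload partitions evenly across nodes; the function name, the 80% usable-memory allowance, and the workload figures are all hypothetical.

```python
import math


def min_nodes(required_gb, node_memory_gb, usable_fraction=0.8):
    """Smallest node count whose combined usable memory covers the workload.

    usable_fraction is an assumed allowance for OS and framework overhead.
    """
    usable_per_node = node_memory_gb * usable_fraction
    return math.ceil(required_gb / usable_per_node)


# Hypothetical workload: 500 GB of model state on 64 GB nodes.
print(min_nodes(500, 64))
```

If the current cluster has fewer nodes than this estimate, adding nodes is the change that closes the gap; the other options leave per-node and total memory unchanged.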



A data analyst wants to save a newly analyzed data set to a local storage option. The data set must meet the following requirements:

Be minimal in size

Have the ability to be ingested quickly

Have the associated schema, including data types, stored with it

Which of the following file types is the best to use?

  A. JSON
  B. Parquet
  C. XML
  D. CSV

Answer(s): B

Explanation:

Parquet is a columnar storage format that stores the schema, including column data types, in the file itself. It applies column-level compression and encoding to keep files small, and it supports fast reads for analytic workloads.
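
To make the schema requirement concrete, the standard-library sketch below shows what happens with CSV, one of the rejected options: a round trip returns every value as a string, because the format has nowhere to record data types. The sample row is hypothetical. Parquet avoids this by keeping the types in the file; with pandas, for instance, `DataFrame.to_parquet` and `pandas.read_parquet` preserve column dtypes, though they need a Parquet engine such as pyarrow installed.

```python
import csv
import io

# Hypothetical record with an int, a float, and a bool.
rows = [{"id": 1, "price": 9.5, "in_stock": True}]

# Write to an in-memory CSV, then read it back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "price", "in_stock"])
writer.writeheader()
writer.writerows(rows)
buffer.seek(0)
restored = list(csv.DictReader(buffer))

# Every value comes back as str: the original types were never stored.
print(restored[0])
print({key: type(value).__name__ for key, value in restored[0].items()})
```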



Which of the following is a key difference between KNN and k-means machine-learning techniques?

  A. KNN operates exclusively on continuous data, while k-means can work with both continuous and categorical data.
  B. KNN performs better with longitudinal data sets, while k-means performs better with survey data sets.
  C. KNN is used for finding centroids, while k-means is used for finding nearest neighbors.
  D. KNN is used for classification, while k-means is used for clustering.

Answer(s): D

Explanation:

KNN is a supervised algorithm that assigns labels based on the closest labeled examples, whereas k-means is an unsupervised method that partitions data into clusters by finding centroids without using any pre-existing labels.
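
The contrast can be seen in a minimal pure-Python sketch of both techniques. The points, labels, starting centroids, and function names are hypothetical, and the k-means loop runs a fixed number of iterations instead of checking for convergence. Note that `knn_predict` takes labels as input while `k_means` never sees them.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: majority label among the k nearest labeled points."""
    nearest = sorted(zip(train_points, train_labels),
                     key=lambda pair: dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_means(points, centroids, iterations=10):
    """Unsupervised: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["low", "low", "low", "high", "high", "high"]

# KNN needs the labels to classify a new point.
print(knn_predict(points, labels, query=(2.0, 2.0)))

# k-means groups the same points using only their coordinates.
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```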



A data scientist needs to:

Build a predictive model that gives the likelihood that a car will get a flat tire.

Provide a data set of cars that had flat tires and cars that did not.

All the cars in the data set had sensors taking weekly measurements of tire pressure, similar to the sensors that will be installed in the cars consumers drive.

Which of the following is the most immediate data concern?

  A. Granularity misalignment
  B. Multivariate outliers
  C. Insufficient domain expertise
  D. Lagged observations

Answer(s): D

Explanation:

Because tire-pressure sensors report only weekly measurements, you risk missing the critical pressure drop immediately preceding a flat. Those stale ("lagged") readings may not reflect the condition just before failure, undermining your model's ability to learn the true precursors to a flat tire.
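
A small sketch of how that staleness could be measured: given the reading dates and a failure date, it reports how many days old the most recent reading was when the flat occurred. The dates and the function name are hypothetical; with weekly sampling the gap can be as large as six days.

```python
from datetime import date, timedelta


def reading_lag_days(reading_dates, failure_date):
    """Days between the last reading on or before the failure and the failure."""
    prior = [d for d in reading_dates if d <= failure_date]
    if not prior:
        return None  # no reading precedes the failure
    return (failure_date - max(prior)).days


# Hypothetical weekly readings, one every seven days from 2025-01-06.
start = date(2025, 1, 6)
weekly = [start + timedelta(weeks=i) for i in range(8)]
flat_tire = date(2025, 2, 8)  # hypothetical failure date

print(reading_lag_days(weekly, flat_tire))
```

Computing this lag for every failure in the training set gives a quick view of how far the recorded pressure may be from the pressure just before the flat.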


