CompTIA DA0-002 Exam Questions
CompTIA Data+ (2025)

Updated On: 31-Mar-2026

A data analyst receives a request for the current employee head count and runs the following SQL statement:

SELECT COUNT(EMPLOYEE_ID) FROM JOBS

The returned head count is higher than expected because employees can have multiple jobs.
Which of the following should return an accurate employee head count?

  A. SELECT JOB_TYPE, COUNT DISTINCT(EMPLOYEE_ID) FROM JOBS
  B. SELECT DISTINCT COUNT(EMPLOYEE_ID) FROM JOBS
  C. SELECT JOB_TYPE, COUNT(DISTINCT EMPLOYEE_ID) FROM JOBS
  D. SELECT COUNT(DISTINCT EMPLOYEE_ID) FROM JOBS

Answer(s): D

Explanation:

This question falls under the Data Analysis domain of CompTIA Data+ DA0-002, which involves using SQL queries to analyze data and address issues like duplicates in datasets. The initial query counts every row in the JOBS table that has a non-null EMPLOYEE_ID. Because an employee with multiple jobs appears in multiple rows, the head count is inflated. The goal is to count each employee once.

SELECT JOB_TYPE, COUNT DISTINCT(EMPLOYEE_ID) FROM JOBS (Option A): This query is syntactically invalid because DISTINCT must go inside the parentheses, as in COUNT(DISTINCT EMPLOYEE_ID). It also selects JOB_TYPE, which is unnecessary for a total head count and would require a GROUP BY clause in standard SQL.

SELECT DISTINCT COUNT(EMPLOYEE_ID) FROM JOBS (Option B): This query does not fix the problem. Here DISTINCT applies to the rows of the result set, not to the values passed to COUNT. Because COUNT(EMPLOYEE_ID) already returns a single row, DISTINCT has no effect and duplicate EMPLOYEE_ID values are still counted.

SELECT JOB_TYPE, COUNT(DISTINCT EMPLOYEE_ID) FROM JOBS (Option C): This query uses COUNT(DISTINCT EMPLOYEE_ID) correctly, but it also selects JOB_TYPE without a GROUP BY clause, which standard SQL rejects. Even with GROUP BY JOB_TYPE added, it would return one count per job type rather than a total, and an employee holding two job types would be counted in both groups.

SELECT COUNT(DISTINCT EMPLOYEE_ID) FROM JOBS (Option D): This query correctly counts only unique EMPLOYEE_IDs by using the DISTINCT keyword within the COUNT function, providing an accurate total head count without grouping.

The DA0-002 Data Analysis domain emphasizes "given a scenario, applying the appropriate descriptive statistical methods using SQL queries," which includes handling duplicates with functions like COUNT(DISTINCT). Option D is the most direct and accurate method for a total unique head count.
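
The difference between the original query, Option B, and Option D can be checked with a small sketch. It assumes a hypothetical in-memory JOBS table in SQLite with made-up rows in which one employee holds two jobs; the table contents and the Python variable names are illustrative and not part of the exam question.

```python
import sqlite3

# Hypothetical JOBS table: employee 1 holds two jobs, so four rows describe three people.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JOBS (EMPLOYEE_ID INTEGER, JOB_TYPE TEXT)")
conn.executemany(
    "INSERT INTO JOBS VALUES (?, ?)",
    [(1, "Cashier"), (1, "Stocker"), (2, "Cashier"), (3, "Manager")],
)

# Original query: counts one row per job, so employee 1 is counted twice.
row_count = conn.execute("SELECT COUNT(EMPLOYEE_ID) FROM JOBS").fetchone()[0]

# Option D: counts each EMPLOYEE_ID value once.
head_count = conn.execute(
    "SELECT COUNT(DISTINCT EMPLOYEE_ID) FROM JOBS"
).fetchone()[0]

# Option B: DISTINCT de-duplicates the single-row result, so the count is unchanged.
option_b = conn.execute("SELECT DISTINCT COUNT(EMPLOYEE_ID) FROM JOBS").fetchone()[0]

print(row_count, head_count, option_b)  # 4 3 4
```

Only the Option D form drops the duplicate row for employee 1.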


Reference:

CompTIA Data+ DA0-002 Draft Exam Objectives, Domain 3.0 Data Analysis.



A data analyst created a dashboard to illustrate the traffic volume and mean response time for a call center. The traffic data is current, but the mean response time has not updated for more than an hour.
Which of the following is the best way to verify the data's freshness?

  A. Refactoring the code base
  B. Testing for network connectivity issues
  C. Checking the last time the calculation script ran
  D. Determining the number of calls with no timestamps

Answer(s): C

Explanation:

This question pertains to the Data Governance domain, which in DA0-002 includes ensuring data quality and freshness, especially in dashboards. The issue is that the mean response time isn't updating, while traffic data is current, indicating a potential issue with the data refresh process for the response time metric.

Refactoring the code base (Option A): Refactoring might improve long-term performance but doesn't directly address verifying data freshness.

Testing for network connectivity issues (Option B): Network issues could cause delays, but since traffic data is updating, connectivity is likely not the issue.

Checking the last time the calculation script ran (Option C): Mean response time is a calculated metric, likely derived from a script. Checking when the script last ran directly verifies if the data refresh process failed, making this the best approach.

Determining the number of calls with no timestamps (Option D): Missing timestamps might indicate data quality issues, but it doesn't directly verify why the mean response time isn't updating.

The DA0-002 Data Governance domain focuses on "data quality control concepts," including ensuring data freshness in reporting. Checking the script's last run time aligns with this objective.
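
A freshness check of this kind can be sketched as a comparison between each metric's last successful run time and an allowed maximum age. Everything below is an assumption for illustration: the is_stale helper, the 15-minute threshold, the metric names, and the timestamps are hypothetical, and a real dashboard would read the last run time from its scheduler or job log.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_run, now, max_age=timedelta(minutes=15)):
    """Return True when the last successful run is older than max_age."""
    return now - last_run > max_age

# Hypothetical "current time" and last-run timestamps for the two dashboard metrics.
now = datetime(2026, 3, 31, 12, 0, tzinfo=timezone.utc)
last_runs = {
    "traffic_volume": datetime(2026, 3, 31, 11, 55, tzinfo=timezone.utc),
    "mean_response_time": datetime(2026, 3, 31, 10, 40, tzinfo=timezone.utc),
}

# Collect the metrics whose calculation has not run recently enough.
stale = [name for name, ts in last_runs.items() if is_stale(ts, now)]
print(stale)  # ['mean_response_time']
```

With these sample values, only the response time calculation is flagged, which mirrors the scenario in the question.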


Reference:

CompTIA Data+ DA0-002 Draft Exam Objectives, Domain 5.0 Data Governance.



Which of the following pieces of information, if made public, results in a data privacy violation?

  A. Gender
  B. Driver's license
  C. Age
  D. Employment status

Answer(s): B

Explanation:

This question falls under the Data Governance domain, which in DA0-002 includes understanding data privacy and compliance with regulations like GDPR. The question asks which piece of information, if made public, constitutes a privacy violation, meaning it must be personally identifiable information (PII).

Gender (Option A): Gender is not typically considered PII on its own, as it does not uniquely identify a person.

Driver's license (Option B): A driver's license number is PII because it uniquely identifies an individual and can be linked to other personal information, such as name and address. Making it public violates privacy regulations.

Age (Option C): Age alone isn't PII, as it does not uniquely identify a person.

Employment status (Option D): Employment status (e.g., employed, unemployed) isn't PII, as it doesn't uniquely identify an individual.

The DA0-002 Data Governance domain includes "identifying PII and data privacy concepts," and a driver's license is a clear example of PII that, if exposed, results in a privacy violation.


Reference:

CompTIA Data+ DA0-002 Draft Exam Objectives, Domain 5.0 Data Governance.



A data analyst receives four files that need to be unified into a single spreadsheet for further analysis. All of the files have the same structure, number of columns, and field names, but each file contains different values.
Which of the following methods will help the analyst convert the files into a single spreadsheet?

  A. Merging
  B. Appending
  C. Parsing
  D. Clustering

Answer(s): B

Explanation:

This question is part of the Data Acquisition and Preparation domain, which involves combining data from multiple sources. The files have the same structure but different values, meaning they need to be stacked vertically into one dataset.

Merging (Option A): Merging typically joins datasets on a common key (e.g., a customer ID) to bring their columns together. Nothing in the scenario calls for that: the files already share the same columns and differ only in their rows.

Appending (Option B): Appending stacks datasets vertically, combining rows from files with the same structure into a single dataset, which matches the scenario.

Parsing (Option C): Parsing involves breaking down data (e.g., splitting text), not combining files.

Clustering (Option D): Clustering is a machine learning technique for grouping similar data points, not for combining files.

The DA0-002 Data Acquisition and Preparation domain includes "executing data manipulation," such as appending datasets with identical structures.
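
Appending can be sketched with Python's standard csv module. The four in-memory "files", their column names, and their values are hypothetical stand-ins for the analyst's files; the sketch checks that every header matches before stacking the rows.

```python
import csv
import io

# Four hypothetical files with identical headers but different rows.
files = [
    "id,region,amount\n1,East,100\n",
    "id,region,amount\n2,West,200\n",
    "id,region,amount\n3,South,300\n",
    "id,region,amount\n4,North,400\n",
]

combined = []   # rows from all files, stacked vertically
header = None   # column names taken from the first file

for text in files:
    reader = csv.DictReader(io.StringIO(text))
    if header is None:
        header = reader.fieldnames
    elif reader.fieldnames != header:
        # Appending only makes sense when the structures match.
        raise ValueError("column mismatch between files")
    combined.extend(reader)

print(len(combined))  # 4
```

The result keeps the same columns as each input and has as many rows as all inputs together, which is what distinguishes appending from a key-based merge.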


Reference:

CompTIA Data+ DA0-002 Draft Exam Objectives, Domain 2.0 Data Acquisition and Preparation.



A data analyst team needs to segment customers based on customer spending behavior. The team has one million rows of data like the information in the following sales order table:

Customer_ID  Region  Amount_spent  Product_category  Quantity_of_items
00123        East    20000         Baby              4
00124        West    30000         Home              6
00125        South   40000         Garden            7
00126        North   50000         Furniture         8
00127        East    60000         Baby              10

Which of the following techniques should the team use for this task?

  A. Standardization
  B. Concatenate
  C. Binning
  D. Appending

Answer(s): C

Explanation:

This question falls under the Data Analysis domain, focusing on techniques for segmenting data. The task is to segment customers based on spending behavior, which involves grouping numerical data (Amount_spent) into categories.

Standardization (Option A): Standardization rescales numerical data to a common scale (e.g., z-scores with mean 0 and standard deviation 1), but it doesn't segment customers into groups.

Concatenate (Option B): Concatenation joins values, typically text fields, end to end. It does not group customers into segments.

Binning (Option C): Binning involves grouping numerical data into discrete intervals (e.g., low, medium, high spending), which is ideal for segmenting customers based on spending behavior.

Appending (Option D): Appending combines datasets vertically, not relevant for segmentation.

The DA0-002 Data Analysis domain includes "applying the appropriate descriptive statistical methods," and binning is a common method for segmenting numerical data like spending amounts.
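
Binning can be sketched on the five sample rows from the table. The bin edges (30000 and 50000), the labels, and the spend_bin helper are assumptions chosen for illustration; the question does not specify how the segments should be defined.

```python
from bisect import bisect_right

# (Customer_ID, Amount_spent) pairs taken from the sample table.
spend = [
    ("00123", 20000),
    ("00124", 30000),
    ("00125", 40000),
    ("00126", 50000),
    ("00127", 60000),
]

# Hypothetical bins: below 30000 is low, 30000 up to 49999 is medium, 50000 and above is high.
edges = [30000, 50000]
labels = ["low", "medium", "high"]

def spend_bin(amount):
    """Map an amount to the label of the interval it falls into."""
    return labels[bisect_right(edges, amount)]

segments = {customer_id: spend_bin(amount) for customer_id, amount in spend}
print(segments)
# {'00123': 'low', '00124': 'medium', '00125': 'medium', '00126': 'high', '00127': 'high'}
```

Each customer lands in exactly one discrete interval, so the continuous Amount_spent column becomes a categorical segment.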


Reference:

CompTIA Data+ DA0-002 Draft Exam Objectives, Domain 3.0 Data Analysis.


