You are using MADlib for Linear Regression analysis. Which value does the statement return?
SELECT (linregr(depvar, indepvar)).r2 FROM zeta1;
A. Goodness of fit
C. Standard error
Which data asset is an example of quasi-structured data?
A. Webserver log
B. XML data file
C. Database table
D. News article
What would be considered "Big Data"?
A. An OLAP Cube containing customer demographic information about 100, 000, 000
B. Daily Log files from a web server that receives 100, 000 hits per minute
C. Aggregated statistical data stored in a relational database table
D. Spreadsheets containing monthly sales data for a Global 100 corporation
A data scientist plans to classify the sentiment polarity of 10, 000 product reviews collected from
the Internet. What is the most appropriate model to use? Suppose labeled training data is
A. Na?ve Bayesian classifier
B. Linear regression
C. Logistic regression
D. K-means clustering
In which lifecycle stage are test and training data sets created?