Practice Exam A

CAR WORTH DATA SET: For this data set, a representative sample of over eight hundred 2005 GM cars was selected, then an algorithm was developed following the 2005 Central Edition of the Kelley Blue Book to estimate retail price. Stata and SAS students can find this data set under “Practice Data” on the P-drive. R students can import the data with the following code:

R: 

car_worth <- read.csv("https://raw.githubusercontent.com/nazzstat/AppliedData/master/PracticeData/car_worth.csv")

SAS:

LIBNAME myFolder "P:\QAC\QAC201\PracticeData";

data new; set myFolder.car_worth; run;

Stata:

import delimited "https://raw.githubusercontent.com/nazzstat/AppliedData/master/PracticeData/car_worth.csv", clear

VARIABLE DESCRIPTIONS:

  • Price: suggested retail price of the used 2005 GM car in excellent condition. The condition of a car can greatly affect price. All cars in this data set were less than one year old when priced and considered to be in excellent condition.
  • Mileage: number of miles the car has been driven
  • Make: manufacturer of the car such as Saturn, Pontiac, and Chevrolet
  • Model: specific models for each car manufacturer such as Ion, Vibe, Cavalier
  • Trim (of car): specific type of car model such as SE Sedan 4D, Quad Coupe 2D
  • Type: body type such as sedan, coupe, etc.
  • Cylinder: number of cylinders in the engine
  • Liter: a more specific measure of engine size
  • Doors: number of doors
  • Cruise: indicator variable representing whether the car has cruise control (1 = cruise)
  • Sound: indicator variable representing whether the car has upgraded speakers (1 = upgraded)
  • Leather: indicator variable representing whether the car has leather seats (1 = leather)

Instructions

  • Import the car_worth CSV file.
  • Construct a plot so that you can assess the distribution of car prices from this sample. Answer Question 1 below.
  • Determine the min, max, and mean of car prices from this sample. Answer Questions 2-3 below.
  • Determine what percent of cars cost at most $20,000. Answer Question 4 below.
  • Construct a scatterplot to visually determine the strength of the association between price and mileage. Answer Question 5 below.
  • Construct an appropriate test to determine whether there is a significant linear relationship between price and mileage.  Answer Question 6 below.
  • Now, build a model that allows you to predict price from mileage. Answer Questions 7-8 below.
  • Determine the mean price for each make and then determine whether a car’s make is significantly associated with its price. If necessary, construct a post-hoc test to reveal which makes have significantly different mean prices. Answer Questions 9-10 below.
  • Build a model that allows you to assess the relationship between price and mileage, controlling for make. Answer Question 11 below.
  • Create a new variable called “LowMileage” that equals 1 if a car has mileage under 10,000 miles and 0 otherwise.
  • Create a new variable called “HighDemandFeatures” that equals the number of high demand features a particular car has. Suppose these features are cruise control, upgraded speakers, leather interior, and low mileage.
  • Finally, determine the mean price of the cars based on the number of high demand features they have. Answer Question 12 below.
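
The last three steps above (creating LowMileage, creating HighDemandFeatures, and summarizing price by feature count) can be sketched in R as follows. This is only an illustration on a small made-up data frame called toy, not the car_worth data, and it assumes the columns are named Price, Mileage, Cruise, Sound, and Leather; check the names in your own import, since they may differ (for example, in capitalization).

```r
# Toy data standing in for car_worth; all values are made up for illustration.
# Column names are an assumption and may not match your imported data exactly.
toy <- data.frame(
  Price   = c(30000, 22000, 15000, 12000, 18000, 25000),
  Mileage = c(5000, 9500, 10000, 24000, 31000, 8000),
  Cruise  = c(1, 1, 0, 0, 1, 1),
  Sound   = c(1, 0, 0, 1, 1, 1),
  Leather = c(1, 1, 0, 0, 0, 1)
)

# LowMileage: 1 if mileage is strictly under 10,000 miles, 0 otherwise
toy$LowMileage <- ifelse(toy$Mileage < 10000, 1, 0)

# HighDemandFeatures: how many of the four 0/1 indicators equal 1
toy$HighDemandFeatures <- toy$Cruise + toy$Sound + toy$Leather + toy$LowMileage

# Mean price for each value of HighDemandFeatures (0 through 4)
mean_by_features <- tapply(toy$Price, toy$HighDemandFeatures, mean)
print(mean_by_features)
```

Because every feature is coded 0/1, summing the indicators gives the count directly. Note that a car with exactly 10,000 miles is coded 0 here, since the instruction says “under 10,000 miles.”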

Questions to answer in Moodle:

Question 1:  Which of the following best describes the distribution of car prices?

Question 2:  What is the least expensive car in this sample? The most expensive?

Question 3: What is the mean car price in this sample?

Question 4: ___% of cars in the sample cost at most $20,000.

Question 5: Which of the following best describes the scatterplot you obtained?

Question 6: State the appropriate p-value that corresponds with the test you selected. What can you conclude?

Question 7: Suppose two cars are identical, except that one has one more mile on its odometer. What is the expected price difference based on this model?

Question 8: Suppose two cars are identical, except that one has 60,000 more miles on its odometer. What is the expected price difference based on this model?

Question 9: What is the mean price of the cars in the sample based on make?

Question 10: What are the test statistic and p-value for determining whether price is significantly associated with make? What can be concluded?

Question 11: After controlling for make, what can be said about the relationship between mileage and price?

Question 12: A car with no high demand features has an average price of __________. A car with 4 high demand features has an average price of ___________.
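
For Questions 7 and 8, it may help to recall how a simple linear regression slope is read: it is the expected change in the response for a one-unit increase in the predictor, so a difference of k units corresponds to k times the slope. A minimal sketch in R, using a made-up data frame (toy) rather than the car_worth data, with assumed column names Price and Mileage:

```r
# Toy data; values are made up for illustration and are not from car_worth.
toy <- data.frame(
  Mileage = c(5000, 10000, 15000, 20000, 25000, 30000),
  Price   = c(24500, 24000, 23400, 23100, 22400, 22000)
)

# Simple linear regression of price on mileage
fit <- lm(Price ~ Mileage, data = toy)

# The slope: expected change in price for one additional mile
slope <- coef(fit)[["Mileage"]]

# For a difference of k miles, the expected price difference is k * slope
k <- 60000
diff_k <- k * slope
print(c(per_mile = slope, per_k_miles = diff_k))
```

A negative slope means the car with more miles is expected to cost less.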

Check your answers by submitting them on Moodle under Practice A for Exam 4.