A completed spreadsheet file with all entries for the data analyses requested by the client and responses or plots to any specific questions embedded in the spreadsheet.

Background:

Mercedes Benz (MB) has prototyped a new V-4 engine for CLA and GLA class vehicles (sedan and mini-SUV, respectively). The prototype is a 2.4L V-block with a projected maximum output of 242 horsepower (HP) @ 5600 RPM. The prototype also integrates a single-coil turbo, which has a peak output of ~9 HP in the range of 2500-4500 RPMs. The manufacturer has noted that the turbo is susceptible to humidity fluctuations with higher output occurring at lower humidity levels and degraded output under high humidity conditions.

This past summer MB conducted field tests with a GLA class vehicle on three hot days (90F+). In all tests, the engine turbo was active and a professional driver was instructed to maintain 2500 RPM on a straight 5-mile segment of highway. Ten (10) observations were collected on engine output and vehicle speed in each mile of a test trial (50 observations for each run/day). (See the data in the associated spreadsheet file.) MB has advised that, based on prior dynamometer and wind tunnel tests, the following estimated engine output and vehicle speeds are expected under the various conditions:

CONDITION
Humidity
Turbo
RPM
HP
SPEED (MPH)

1
Low
ON
2500
Avg.=120; SD=4
Avg.=88; SD=1

2
Moderate
ON
2500
Avg.=105; SD=4
Avg.=85; SD=1

3
High
ON
2500
Avg.=92.5; SD=4
Avg.=82.5; SD=1

There are a few engine attribute relationships that MB expected to see in the field test data on the prototype model.

As turbo output efficiency decreases, HP boosting is degraded and lower speed results at stable RPM; and
With the turbo active, there is variability in prototype engine output (and, consequently, vehicle speed). However, MB expects a variance of +/- 4 HP and +/1 MPH to be conservative estimates of any variability.

MB has asked that you confirm or refute each of these relationships through data mining of the field test dataset.

Procedure:

1. [4 pts.] Spreadsheet development. The client has provided raw test data, which includes the engine status during testing and output readings. Find the spreadsheet here:

Observations are numbered for reference but the data objects are sorted based on HP readings with the max output first and the final reading is the minimum output. The dataset does not provide information on the test day or humidity conditions, etc. (but MB would like you to make some inference on the conditions through clustering). The client has also structured the spreadsheet for you to identify any correlation of the HP and Speed (SPD) outputs. There are also fields for completing an observation similarity analysis (using Euclidean distance) and additional fields for clustering observations to determine the frequency and density of data objects under specific operating conditions.

2. [3 pts.] Visual correlation analysis. Prepare an appropriate plot to visually represent any relationship between engine HP output and vehicle speed. If there is a trend in the data, this should be identified in your spreadsheet file. If there appear to be clusters of data objects in the plot, these should also be noted along with ranges of attribute values corresponding with the clusters.

3. [4 pts.] Correlation analysis and confirmation. MB prefers that you manually calculate any correlation between the prototype HP output and the observed vehicle speed at constant RPM. Please recall that correlation is the dot product of standard score transformations of each data point in a data set.

Complete the spreadsheet columns for calculating deviations of individual test observations from mean HP and Speed readings. The squared deviations will be automatically calculated for you. You will also need to calculate the SUMPRODUCT of the HP and Speed deviations as well as compute the root of the sum of the squared deviations in HP and Speed. These products should be used to compute an r-value (Pearson correlation coefficient). MB has also asked that you use the Excel CORREL functions to verify your manual calculation. (See the cells at the bottom of the data table.)

4. [4 pts.] Similarity analysis. Based on the expected average HP and Speed outcomes for each of the conditions shown in the table above, MB would like you to identify the minimum multi-dimensional Euclidean distance of each test observation from the expected output pairs (i.e., C1 – 120, 88; C2 – 105, 85; C3 – 92.5, 82.5). These distances should be reported in separate columns in the spreadsheet. The minimum distance for each data object should also be identified in the “MIN_DIST” column.

5. [4 pts.] Cluster analysis and inferences on groups. Once the minimum Euclidean distance for any observation (from the various condition means) is obtained, MB would like you to use a binary variable to indicate membership of each data object for test Condition 1, 2 or 3. On the basis of these classifications, MB also expects that you will calculate the frequency of observations under any test condition and that you will report the density of the data object clusters as a fraction (percentage) of the total sample size. (Again, see the cells at the bottom of the data table.) Are the clusters of equal sizes?

6. [6 pts.] Conclusions on client’s expectations. Based on your clustering of the various data objects, MB asks that you calculate the observed mean and standard deviations for HP and Speed for each of the identified operating condition data clusters. (You will need to sort the data accordingly.) Please respond to the questions in the spreadsheet file as to whether these calculations substantially deviate from the MB estimates in the above reference table. Please also address the expectations that MB provided for the test data. Does engine/turbo HP output appear to degrade with humidity? Are there lower vehicle speeds at comparable RPM under higher humidity? Is there variability of prototype HP output within specific operating conditions? Are MB estimates of any variability conservative (exceeding observed values)?

Expected Outcomes:

A completed spreadsheet file with all entries for the data analyses requested by the client and responses or plots to any specific questions embedded in the spreadsheet.
For Items 1-6 above, provide your written response in the spreadsheet file. Number your responses.
Please keep all written responses concise and to the point.

Reference no: EM132069492

WhatsApp
Hello! Need help with your assignments? We are here

GRAB 25% OFF YOUR ORDERS TODAY

X