# MASYGC 1210 - Programming Data Analysis Multiple Linear Regression Project

Group Assignment 2 MASY GC1210-201 Fall 2021

## PART 1 Multiple Linear Regression

### Understanding Retail Sales

• Eight-Twelve Convenience Stores Inc. is evaluating sales in its 27 franchise store locations to get a better understanding of how to plan new stores. Its franchise operators have provided annual sales, square footage (size), inventory, advertising spending, number of families in their district, and the number of competing stores in their district.
• Eight-Twelve Management would like to answer:
  • What factors affect sales in our franchises, and how?
  • Can we create a model that lets us design a store size, given certain demographics, to achieve a certain level of sales?

### Multiple Linear Regression Using Excel

• Use the attached course data file Franchises.xlsx.
• Open it using Excel.
• Following the practice of not changing the raw data, select the entire file, copy it as the shaped file into a new spreadsheet, and label the tab Model.
• Make sure the file is properly shaped by having the dependent variable (SALES) to the left of the independent variables (SQFT, INVENTORY, ADVERTISING, FAMILIES, STORES), and that the independent variable columns are contiguous.
• Invoke the Excel Analysis ToolPak and select the Regression function. For the y-variable, use the SALES data (make sure to include the column header). For the x-variable, use all the data in the independent variable columns (also include headers). Check the “My data has headers” box.
• You can put the results on another spreadsheet or right next to the data table. For now, select a cell next to the table.
• Analyze the results.
• How much of the variation in the SALES data can be explained by the model? (See R-squared.)
• How confident are you in the validity of this model? (See Significance F.)
• What are the relationships between SALES and the other factors? (See Coefficients.)
• Build a linear model using the intercept and the coefficients.
• How confident are we in each coefficient? (See P-value.)
• What sales does the model predict for the first store, and how far off is the prediction compared to the actual data?
• Now that management understands its existing stores, they would like you to apply what you found to project sales at a new store opening.
• We just opened a store in a neighborhood with 5,000 families. The store is 5,000 sq ft, we are planning to spend $5,000 a month on advertising and carry $250,000 in inventory, and there are 5 competing stores in the neighborhood. What are the projected sales? (Hint: use the information in the data dictionary to normalize the variables properly, and use the model equation.)
• Management would also like you to evaluate up to how much we should spend on advertising at a potential new store:
• We want to open a 10,000 sq ft store and realize $500,000 a month in sales in a neighborhood with 10,000 families. We are planning to spend $10,000 a month on advertising and carry $500,000 in inventory, and there are 10 competing stores in the neighborhood. How much should we spend monthly on advertising to realize our expected sales? (Hint: use the model equation in reverse.)

## PART 2 Time Series Analysis & Forecasting

### Presenting a Data Set and Analysis

• Your team should imagine itself as a consulting firm charged with evaluating a large publicly available data set. Demonstrate a wide application of the concepts learned in our Time Series Analysis & Forecasting class and text, and make a meaningful analysis of the data and the decisions it can inform.
• Choose an appropriate real-world dataset that allows you to graphically illustrate a “horizontal” pattern and that includes a MINIMUM of 5 years of data in a monthly breakdown or 10 years of data in a quarterly breakdown.
• Illustrate the data set graphically and apply moving average, weighted moving average, exponential smoothing, and linear regression to forecast a future year. Also demonstrate the calculation of the forecasting errors.
• Analyze and comment on the accuracy of these forecasts, what they tell you about the likely future behavior of the system you are analyzing, and what decisions they may inform. Note whether the data demonstrates seasonality or cyclicality, and apply the appropriate adjustments to your analysis.

Potential sources for data sets (optional)

• https://www.springboard.com/blog/data-science/free-public-datasets-data-science-project/
• https://libraryguides.missouri.edu/datasets/public-use
• https://guides.emich.edu/data/free-data
• https://r-dir.com/reference/datasets.html

## PART 3 Presenting a Decision Analysis

### Team 1 Case – Property Purchase Strategy

Glenn Foreman, President of Oceanview Development Corporation, is considering submitting a bid to purchase property that will be sold by sealed bid at a county tax foreclosure. Glenn’s initial judgment is to submit a bid of $5 million. Based on his experience, Glenn estimates that a bid of $5 million will have a 0.2 probability of being the highest bid and securing the property for Oceanview. The current date is June 1. Sealed bids for the property must be submitted by August 15. The winning bid will be announced on September 1.

If Oceanview submits the highest bid and obtains the property, the firm plans to build and sell a complex of luxury condominiums. However, a complicating factor is that the property is currently zoned for single-family residences only.
Glenn believes that a referendum could be placed on the voting ballot in time for the November election. Passage of the referendum would change the zoning of the property and permit construction of the condominiums.

The sealed-bid procedure requires the bid to be submitted with a certified check for 10% of the amount bid. If the bid is rejected, the deposit is refunded. If the bid is accepted, the deposit is the down payment for the property. However, if the bid is accepted and the bidder does not follow through with the purchase and meet the remainder of the financial obligation within six months, the deposit will be forfeited. In this case, the county will offer the property to the next highest bidder.

To determine whether Oceanview should submit the $5 million bid, Glenn conducted some preliminary analysis. This preliminary work provided an assessment of 0.3 for the probability that the referendum for a zoning change will be approved and resulted in the following estimates of the costs and revenues that will be incurred if the condominiums are built:

Cost and Revenue Estimates

| Item | Amount |
|---|---|
| Revenue from condominium sales | $15,000,000 |
| Cost: property | $5,000,000 |
| Cost: construction expenses | $8,000,000 |

If Oceanview obtains the property and the zoning change is rejected in November, Glenn believes that the best option would be for the firm not to complete the purchase of the property. In this case, Oceanview would forfeit the 10% deposit that accompanied the bid.

Because the likelihood that the zoning referendum will be approved is such an important factor in the decision process, Glenn suggested that the firm hire a market research service to conduct a survey of voters. The survey would provide a better estimate of the likelihood that the referendum for a zoning change would be approved. The market research firm that Oceanview Development has worked with in the past has agreed to do the study for $15,000.
The results of the study will be available on August 1, so that Oceanview will have this information before the August 15 bid deadline. The results of the survey will be either a prediction that the zoning change will be approved or a prediction that the zoning change will be rejected. After considering the record of the market research service in previous studies conducted for Oceanview, Glenn developed the following probability estimates concerning the accuracy of the market research information:

Perform an analysis of the problem facing the Oceanview Development Corporation, and prepare a report that summarizes your findings and recommendations. Include the following items in your report:

• A decision tree that shows the logical sequence of the decision problem
• A recommendation regarding what Oceanview should do if the market research information is not available
• A decision strategy that Oceanview should follow if the market research is conducted
• A recommendation as to whether Oceanview should employ the market research firm, along with the value of the information provided by the market research firm

Include the details of your analysis as an appendix to your report.

### Team 2 Case – Lawsuit Defense Strategy

John Campbell, an employee of Manhattan Construction Company, claims to have injured his back as a result of a fall while repairing the roof at one of the Eastview apartment buildings. He filed a lawsuit against Doug Reynolds, the owner of Eastview Apartments, asking for damages of $1,500,000. John claims that the roof had rotten sections and that his fall could have been prevented if Mr. Reynolds had told Manhattan Construction about the problem. Mr. Reynolds notified his insurance company, Allied Insurance, of the lawsuit. Allied must defend Mr. Reynolds and decide what action to take regarding the lawsuit.

Some depositions and a series of discussions took place between both sides. As a result, John Campbell offered to accept a settlement of $750,000. Thus, one option is for Allied to pay John $750,000 to settle the claim. Allied is also considering making John a counteroffer of $400,000 in the hope that he will accept a lesser amount to avoid the time and cost of going to trial.

Allied’s preliminary investigation shows that John’s case is strong; Allied is concerned that John may reject its counteroffer and request a jury trial. Allied’s lawyers spent some time exploring John’s likely reaction if they make a counteroffer of $400,000.
The lawyers concluded that it is adequate to consider three possible outcomes to represent John’s possible reaction to a counteroffer of $400,000: (1) John will accept the counteroffer and the case will be closed; (2) John will reject the counteroffer and elect to have a jury decide the settlement amount; or (3) John will make a counteroffer to Allied of $600,000.

If the case goes to a jury trial, Allied considers three outcomes possible: (1) the jury may reject John’s claim and Allied will not be required to pay any damages; (2) the jury will find in favor of John and award him $750,000 in damages; or (3) the jury will conclude that John has a strong case and award him the full amount of $1,500,000.

Key considerations as Allied develops its strategy for disposing of the case are the probabilities associated with John’s response to an Allied counteroffer of $400,000 and the probabilities associated with the three possible trial outcomes. Allied’s lawyers believe that the probability that John will accept a counteroffer of $400,000 is 0.10, the probability that John will reject a counteroffer of $400,000 is 0.40, and the probability that John will, himself, make a counteroffer to Allied of $600,000 is 0.50. If the case goes to court, they believe that the probability that the jury will award John damages of $1,500,000 is 0.30, the probability that the jury will award John damages of $750,000 is 0.50, and the probability that the jury will award John nothing is 0.20.

Perform an analysis of the problem facing Allied Insurance and prepare a report that summarizes your findings and recommendations.
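
The probabilities and dollar amounts stated in the case can be rolled back through the tree with a few lines of Python. This is only a minimal sketch under stated assumptions, not the required analysis: it assumes an expected-cost criterion, ignores legal fees and any other trial costs, and assumes that if John counters at $600,000 Allied's only choices are to pay that amount or go to trial. The variable and function names are illustrative.

```python
# Costs are dollars paid by Allied; lower is better.
SETTLE_NOW = 750_000      # accept John's initial offer
COUNTEROFFER = 400_000    # Allied's counteroffer
JOHN_COUNTER = 600_000    # John's possible counter to Allied

# Jury trial outcomes from the case: (probability, damages awarded)
trial_outcomes = [(0.30, 1_500_000), (0.50, 750_000), (0.20, 0)]


def expected_cost(outcomes):
    """Probability-weighted cost of a chance node."""
    return sum(p * cost for p, cost in outcomes)


trial_cost = expected_cost(trial_outcomes)

# ASSUMPTION: facing John's $600,000 counter, Allied either pays it or
# goes to trial, and picks whichever has the lower expected cost.
after_john_counter = min(JOHN_COUNTER, trial_cost)

# Chance node for John's reaction to the $400,000 counteroffer.
counteroffer_cost = expected_cost([
    (0.10, COUNTEROFFER),        # John accepts
    (0.40, trial_cost),          # John rejects; jury trial
    (0.50, after_john_counter),  # John counters at $600,000
])

print(f"Settle now:           ${SETTLE_NOW:,.0f}")
print(f"Expected trial cost:  ${trial_cost:,.0f}")
print(f"Expected counteroffer: ${counteroffer_cost:,.0f}")
```

An expected value summarizes only the average outcome; a risk profile additionally needs the probability of each final payment along the chosen strategy, which this sketch does not compute.
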
Be sure to include the following items:

• A decision tree
• A recommendation regarding whether Allied should accept John’s initial offer to settle the claim for $750,000
• A decision strategy that Allied should follow if they decide to make John a counteroffer of $400,000
• A risk profile for your recommended strategy

### Team 3 Case – Rob’s Market

Rob’s Market (RM) is a regional food store chain in the southwest United States. David White, director of Business Intelligence for RM, would like to initiate a study of the purchase behavior of customers who use the RM loyalty card (a card that customers scan at checkout to qualify for discounted prices). The use of the loyalty card allows RM to capture what is known as “point-of-sale” data, that is, a list of products purchased by a given customer as he/she checks out of the market. David feels that better understanding of which products tend to be purchased together could lead to insights for better pricing and display strategies as well as a better understanding of sales and the potential impact of different levels of coupon discounts. This type of analysis is known as market basket analysis, as it is a study of what different customers have in their “shopping baskets” as they check out of the store.

As a prototype study, David wants to investigate customer buying behavior with regard to bread, jelly, and peanut butter. RM’s Information Technology (IT) group, at David’s request, has provided a data set of purchases made by 1000 customers over a one-week period. The data set contains the following variables for each customer:

• Bread: wheat, white, or none
• Jelly: grape, strawberry, or none
• Peanut Butter: creamy, natural, or none

The variables appear in the above order from left to right in the data set, where each row is a customer. For example, the first record of the data set is: white grape none, which means that customer #1 purchased white bread, grape jelly, and no peanut butter.
The second record is: white strawberry none, which means that customer #2 purchased white bread, strawberry jelly, and no peanut butter. The sixth record in the data set is: none none none, which means that the sixth customer did not purchase bread, jelly, or peanut butter. Other records are interpreted in a similar fashion.

David would like you to do an initial study of the data (reference MarketBasket.xls included in Assignment post) to get a better understanding of RM customer behavior with regard to these three products. Prepare a report that gives insight into the purchase behavior of customers who use the RM loyalty card. At a minimum your report should include estimates of the following:

• The probability that a random customer does not purchase any of the three products (bread, jelly, or peanut butter).
• The probability that a random customer purchases white bread.
• The probability that a random customer purchases wheat bread.
• The probability that a random customer purchases grape jelly given that he/she purchases white bread.
• The probability that a random customer purchases strawberry jelly given that he/she purchases white bread.
• The probability that a random customer purchases creamy peanut butter given that he/she purchases white bread.
• The probability that a random customer purchases natural peanut butter given that he/she purchases white bread.
• The probability that a random customer purchases creamy peanut butter given that he/she purchases wheat bread.
• The probability that a random customer purchases natural peanut butter given that he/she purchases wheat bread.
• The probability that a random customer purchases white bread, grape jelly, and creamy peanut butter.

### Team 4 Case – College Softball Recruiting

College softball programs have a limited number of scholarships to offer promising high school seniors, so the programs invest a great deal of effort in evaluating these players.
One measure of performance the programs commonly use to evaluate recruits is the batting average, the proportion of at-bats (excluding times when the player is walked or hit by a pitch) in which the player gets a hit. For example, a player who gets 50 hits in 150 at-bats has a batting average of 50/150 = 0.333.

A college softball program is considering two players, Fran Hayes and Millie Marshall, who have recently completed their senior years of high school. Their respective statistics for their junior and senior years are as shown in the following Table.

The Athletic Director and Coach of the women’s softball team at a large public university are trying to decide to which of these two players they will offer an athletic scholarship (i.e., an opportunity to attend the university for free in exchange for playing on the university’s softball team). Take the following steps to determine which player had the better batting average over the two-year period provided in the table, and use your results to advise the Athletic Director and Coach on their decision.

• Calculate the batting average of each player for her junior year; then also calculate the batting average of each player for her senior year. Which player would this analysis lead you to choose?
• Calculate the batting average of each player for her combined junior and senior years. Which player would this analysis lead you to choose?
• After considering both of your analyses, which player would you choose? Why?

Prepare a report on your findings for the athletic director and coach of the college program. Focus on clearly explaining the discrepancy in your two analyses.
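
The per-year and combined calculations described above can be sketched in Python. The players' actual table is not reproduced in this handout, so the hit and at-bat counts below are purely hypothetical, made up for illustration and attached to placeholder names ("Player A", "Player B"). They are chosen only to show how a per-year comparison and a combined comparison can disagree when the number of at-bats differs a lot between years.

```python
from fractions import Fraction

# HYPOTHETICAL (hits, at_bats) per year; NOT the figures from the assignment.
records = {
    "Player A": {"junior": (12, 30), "senior": (60, 240)},
    "Player B": {"junior": (70, 200), "senior": (10, 50)},
}


def batting_average(hits, at_bats):
    """Proportion of at-bats in which the player gets a hit."""
    return Fraction(hits, at_bats)


def yearly_averages(seasons):
    """Batting average for each year separately."""
    return {year: batting_average(h, ab) for year, (h, ab) in seasons.items()}


def combined_average(seasons):
    """Total hits over total at-bats (not the mean of the yearly averages)."""
    total_hits = sum(h for h, _ in seasons.values())
    total_at_bats = sum(ab for _, ab in seasons.values())
    return batting_average(total_hits, total_at_bats)


for name, seasons in records.items():
    per_year = {y: round(float(a), 3) for y, a in yearly_averages(seasons).items()}
    combined = round(float(combined_average(seasons)), 3)
    print(name, per_year, "combined:", combined)
```

With these made-up numbers, Player A has the higher average in each individual year, yet Player B has the higher combined average, because the combined figure weights each year by its at-bats.
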