Counterfeiting banknotes has been a problem since the introduction of color photocopiers and computer image scanners. Counterfeits hurt the banking industry because they contribute to inflation and reduce the value of real money. Assume that you are a data mining expert who works in the banking industry.
The dataset banknotes.csv contains 5 variables (columns), and the file description-bank.docx describes them. The end goal is to build an appropriate model (or tool) that successfully predicts forgery. Using SAS Studio, perform the following tasks:
1. Explore the dataset by providing summary statistics and graphical summaries of all the variables.
2. Explain some of the key aspects of the data found in part 1.
3. Examine whether the dataset has any anomalies. Describe the method(s) you used as well as the results.
4. Examine whether there are any associations among the variables. Describe the approaches as well as the results.
5. Using one of the clustering techniques, analyze all the variables. Explain the results.
6. Using one of the classification techniques from the course, build a model that predicts forgery. Explain why you think the model you’ve chosen is the most appropriate for this dataset.
7. Evaluate the model. How well does the model fit? Can you improve the model? Explain.
Please post all SAS code required to accomplish each step.
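
As a starting point, the workflow above could be sketched in SAS roughly as follows. This is only a minimal outline under stated assumptions, not a model answer: the file path, the predictor names (x1 to x4), and the 0/1 label name (forged) are hypothetical placeholders that should be replaced with the actual names given in description-bank.docx. The choice of k-means clustering and logistic regression is likewise just one possible option among the techniques covered in the course.

```sas
/* ASSUMPTIONS (hypothetical, not taken from the dataset description):      */
/*   - the CSV has been uploaded to the path below                          */
/*   - four numeric predictors named x1-x4 and a 0/1 label named forged     */
%let path   = /home/user/banknotes.csv;
%let inputs = x1 x2 x3 x4;
%let target = forged;

/* Read the CSV into a SAS dataset */
proc import datafile="&path" out=work.banknotes dbms=csv replace;
    getnames=yes;
    guessingrows=max;
run;

/* Summary statistics, including a missing-value count per variable */
proc means data=work.banknotes n nmiss mean std min q1 median q3 max;
    var &inputs;
run;

proc freq data=work.banknotes;
    tables &target;
run;

/* Graphical summaries; PROC UNIVARIATE also lists extreme observations,    */
/* which is one simple way to look for anomalies                            */
proc univariate data=work.banknotes;
    var &inputs;
    histogram &inputs / normal;
run;

/* Association among the variables: correlations plus a scatter-plot matrix */
proc corr data=work.banknotes pearson spearman plots=matrix(histogram);
    var &inputs;
run;

/* Clustering: standardize first so that no variable dominates the distance */
proc stdize data=work.banknotes out=work.std method=std;
    var &inputs;
run;

proc fastclus data=work.std maxclusters=2 out=work.clusters;
    var &inputs;
run;

/* Compare the clusters found with the known labels */
proc freq data=work.clusters;
    tables cluster*&target;
run;

/* Classification: flag a 70% training sample, keep the rest for validation */
proc surveyselect data=work.banknotes out=work.split
                  samprate=0.7 outall seed=12345;
run;

/* Logistic regression on the training rows, scored on the held-out rows    */
proc logistic data=work.split(where=(selected=1)) plots(only)=roc;
    model &target(event='1') = &inputs;
    score data=work.split(where=(selected=0)) out=work.scored;
run;

/* Confusion matrix on the held-out rows: actual (F_) vs. predicted (I_)    */
proc freq data=work.scored;
    tables F_&target*I_&target;
run;
```

The held-out confusion matrix and the ROC curve give a first answer to "how well does the model fit?". Possible improvements to discuss include dropping weak predictors, adding interaction terms, or comparing against another classifier from the course.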