The following sections discuss the possible approaches. The first approach uses a business rule application and customer information to derive the analytical risk scores and risk category, which are then used to calculate the combined risk category. The analytical scoring does not happen in real time, so this approach is suitable for batch-mode operations, where the predictive model is run periodically on a batch of customer information and the business rules then use this outcome to arrive at the combined risk category.
This approach is described in Figure 8. The second approach uses an analytical scoring service to obtain the risk category in real time: a decision service is exposed that combines the analytical scoring service and the business rules service, as described in Figure 9. A third approach integrates the scoring service into the business rules using the LB02 Support Pac, which provides the functionality to define business rules that reference predictive scores obtained at runtime by invoking the SPSS scoring service.
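The shape of the second approach can be sketched in a few lines. This is an illustrative sketch only: the two service calls are stubbed as plain functions, the field names (`country_risk`, `politically_exposed`) and the two-level category scale are invented for the example, and in the real solution these would be invocations of the SPSS scoring service and the ODM hosted transparent decision service.

```python
# Sketch of the real-time approach: a wrapper decision service invokes an
# analytical scoring service and a business rules service, then combines
# the two risk categories. Both services are stubbed for illustration.

def analytical_scoring_service(customer):
    """Stub for the SPSS scoring service: returns a predicted risk category."""
    return "HIGH" if customer["country_risk"] > 7 else "LOW"

def business_rules_service(customer):
    """Stub for the ODM rules service: returns a rule-derived risk category."""
    return "HIGH" if customer["politically_exposed"] else "LOW"

def combined_risk_service(customer):
    """Wrapper decision service: the stricter of the two outcomes wins."""
    predicted = analytical_scoring_service(customer)
    rule_based = business_rules_service(customer)
    return "HIGH" if "HIGH" in (predicted, rule_based) else "LOW"

customer = {"country_risk": 9, "politically_exposed": False}
print(combined_risk_service(customer))  # analytical score drives HIGH here
```

The point of the wrapper is that callers see a single decision service and never need to know whether a given category came from the model or from the rules.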
PMML is an XML-based file format developed by the Data Mining Group to give applications a standard way to describe and exchange models produced by data mining and machine learning algorithms. Because PMML enables the scoring configuration to be imported into Operational Decision Manager, better business rules can be created using analytical outcomes at design time; the scoring service is then invoked during rule execution. If you use Analytical Decision Management, no additional integration is needed, because Analytical Decision Management already includes SPSS Modeler as part of its core analytic infrastructure, and predictive models can be imported into the solution.
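To make the format concrete, here is a minimal hand-written PMML fragment describing a single-split classification tree of the kind a risk model might export. It is illustrative only, not a real exported model; the field names are invented for this example.

```xml
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Illustrative risk-category tree (hand-written example)"/>
  <DataDictionary numberOfFields="2">
    <DataField name="countryRisk" optype="continuous" dataType="double"/>
    <DataField name="riskCategory" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel modelName="RiskTree" functionName="classification">
    <MiningSchema>
      <MiningField name="countryRisk"/>
      <MiningField name="riskCategory" usageType="target"/>
    </MiningSchema>
    <Node score="LOW">
      <True/>
      <Node score="HIGH">
        <SimplePredicate field="countryRisk" operator="greaterThan" value="7"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
```

Because the model is plain XML like this, a consuming product can read the data dictionary and tree structure without any knowledge of the tool that trained the model.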
An Analytical Decision Management application is a combination of business rules and predictive models. For example, an Analytical Decision Management application can combine predictive data for customer churn, lifetime value, and customer satisfaction with business rules from the marketing team to suggest the right offers for a telecom customer.
Building an Analytical Decision Management application typically involves seven steps, as shown in the accompanying figure. Simple applications might include only a few of these steps, while advanced applications might include all seven.
The application is defined by an XML template, and prebuilt applications are available that can be used to create a new application rather than building from scratch. The following screenshots show the settings for a sample Customer Interactions application that comes with the Analytical Decision Management software. Approach 5 uses Analytical Decision Management and references external business rules already created in Operational Decision Manager.
The following summary table lists all the approaches and the products used for each. Saba Bank makes an architectural decision to use Approach 2 for the following reasons: predictive model development and rule development can happen in parallel, yet each can feed into the other.
For example, data understanding might help harvest new business rules, or a business rule might be identified as one that could be considerably enhanced if combined with an analytical outcome. The following sections describe a simple predictive model that uses fictitious customer data to train the model. The scenario uses a classification model known as Chi-Square Automatic Interaction Detection (CHAID) to derive a risk category based on the input values selected as applicable to risk.
CHAID analysis is a form of analysis that determines how variables best combine to explain the outcome of a given dependent variable. For a detailed explanation of deploying a predictive model as a scoring service, see Integrating SPSS predictive analytics into Business Intelligence applications, Part 1. Training the predictive model involves using historical customer data in which the risk category is already known.
Figure 18 shows a snapshot of the sample customer data. Figure 19 shows how the input variables and the target variable for prediction are defined. Figure 20 shows a predictive modeling stream that is created to predict the risk category. Nodes, which represent individual operations on data, are linked together in a stream to represent the flow of data. Algorithms are represented by a modeling node, which is a five-sided shape. When a stream contains a modeling node, the resulting model is represented as a model "nugget," and is shaped like a gold nugget.
Figure 21 displays the final output. Comparing the predicted value with the available value helps assess model accuracy. If the predictive model is not found to be reasonably accurate, other modeling techniques are evaluated to decide the most suitable technique. After training models using a dataset where the risk category value is known, you can score records where the risk category value is not known by using the nugget that is generated during the training process.
A scoring data file contains the same fields as a training data file, except that the risk category is not set as the target field. The highlighted path in Figure 20 shows a scoring stream that writes the scoring outcome to a database. The sample predictive model kyc. The business rule development can happen in parallel. Saba Bank opted for the Operational Decision Manager business rule development approach (Approach 2) described in the section Selecting a suitable approach.
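The train-then-score flow described above can be mimicked with a toy standard-library sketch: "train" by recording the majority risk category per group (a stand-in for the generated model nugget), then apply it to scoring records that lack the target field. All data and the grouping field (`occupation`) are invented for illustration.

```python
from collections import Counter, defaultdict

# Training records carry the target field (risk_category);
# scoring records have the same fields minus the target.
training = [
    {"occupation": "trader",  "risk_category": "HIGH"},
    {"occupation": "trader",  "risk_category": "HIGH"},
    {"occupation": "trader",  "risk_category": "LOW"},
    {"occupation": "teacher", "risk_category": "LOW"},
    {"occupation": "teacher", "risk_category": "LOW"},
]

def train(records):
    """Toy 'nugget': the majority risk category observed per occupation."""
    buckets = defaultdict(Counter)
    for rec in records:
        buckets[rec["occupation"]][rec["risk_category"]] += 1
    return {occ: counts.most_common(1)[0][0] for occ, counts in buckets.items()}

nugget = train(training)

# Score unlabeled records by looking them up in the trained nugget.
scoring = [{"occupation": "trader"}, {"occupation": "teacher"}]
scores = [nugget[rec["occupation"]] for rec in scoring]
print(scores)  # ['HIGH', 'LOW']
```

A real nugget encodes a full tree rather than a lookup table, but the division of labor is the same: training consumes labeled records once, and scoring then applies the frozen model to any number of unlabeled records.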
The Operational Decision Manager business rule development approach is summarized in the accompanying figure. The initial phase of rule discovery and analysis is now complete, as described in the section Understanding business requirements. The next phase is rule development. The rule development details are out of scope for this article; refer to the article Develop decision services, Part 1: A smarter city case study for more details on rule development. Once the predictive models and business rules are developed and tested, the model is integrated and deployed as described in the section Approach 2 — Integration with Operational Decision Manager.
The key steps are deploying the predictive model and creating and deploying a decision service, as described in the following sections. The sample code provided demonstrates how this is done. This scoring service returns the risk category for the customer information provided. The predictive model kyc. After deploying the predictive model and the business rule applications, the next step is to create a wrapper decision service that returns the combined risk category, as shown in the accompanying figure. The business rule application is a simple example that uses an XML execution object model (XOM) and is enabled as a hosted transparent decision service after deployment on the Rule Execution Server.
Because the combined risk category calculation is itself a business rule in this case, another business rule application is created that returns the combined risk category based on the predicted risk category and the business rule risk category. The details of decision service development are beyond the scope of this article, but a sample decision service, CustomerRiskService, is provided to demonstrate the Saba solution. This section describes how the IBM decision management products help support predictive model validation and governance.
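One common way to author such a combination rule is as a severity ordering in which the stricter category wins. The three-level scale below is an assumption for illustration; the actual rule at Saba Bank would be authored and maintained in Operational Decision Manager, not in application code.

```python
# Sketch of a "combined risk category" rule: given the category predicted
# by the scoring service and the category from the business rules, return
# the stricter of the two. The LOW/MEDIUM/HIGH scale is assumed here.
SEVERITY = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}

def combined_risk_category(predicted, rule_based):
    """Return whichever of the two categories ranks higher in severity."""
    return max(predicted, rule_based, key=SEVERITY.__getitem__)

print(combined_risk_category("MEDIUM", "LOW"))  # prints MEDIUM
print(combined_risk_category("LOW", "HIGH"))    # prints HIGH
```

Keeping the combination logic in a rule (rather than hard-coding it) means the risk team can later change it, for example to escalate whenever the two sources disagree, without redeploying the application.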
The sample application provides a basic user interface to test the risk service, as shown in the following screenshots. Operational Decision Manager supports Decision Validation Services in both the developer and business user environments. Decision Warehouse is a tool within Operational Decision Manager that monitors rule execution and stores execution traces in a database. The risk management team at Saba Bank can then generate reports and assess the results. Governance is also put in place for business rules.
Operational Decision Manager supports business rule governance with Decision Validation Services and Decision Warehouse, and also with version management, access control, and reporting. This article provided a comprehensive overview of the IBM business rules and predictive analytics product stack for decision management, and covered the application development methodology using a case study. The author would like to thank Raj Rao for his helpful suggestions and review of the article.
The author would also like to thank Srinivasan Govindaraj for reviewing the predictive analytics part of this article. The customer risk application consists of the following components. Expectations were high since Dr. Andrew Ng is associated with this site and his course on machine learning is delightful. However, the course by Dr. Peng fell short of my expectations by some margin.
The instructor is a good communicator and an expert in R, and the topics of this course are highly relevant for learning R. The biggest problem for me with this course is its tone, which is highly didactic. If Dr. Peng could slightly redesign this course around applications and examples, it would become a fantastic course. This course is not as comprehensive as the above course on Coursera; however, its tone is much more applied and learner-friendly. The UCI Machine Learning Repository has tons of freely available datasets.
This site is not associated with R. The reason you may still want to go to this site is that they have provided links to research papers that have used these datasets. Let me create a loose parallel between Excel and R to offer you some advice about learning R. As I mentioned earlier, R has a vast number of add-on packages in the CRAN library and millions of functions for data analysis. This may sound a bit daunting to a new learner.
Moreover, if you have worked in Excel, you will know that there are just a handful of functions that you use repeatedly, based on your style of analysis. The same pattern will emerge with R as well. I second what Abhinav has said. Great job in helping others with relevant knowledge sharing. Keep up the outstanding work!!
Thanks Randy for pointing out that Modeler has included this feature to integrate R. I used Clementine in the past, and Modeler 14 was the last version that I tried. I think it would be great if someone could develop a viable platform like Modeler for R with a reasonable price tag. A little verbose at times, but worth reading.
Do you have something similar for Python as well? Thanks Pradeep, try this link to find free books on Python for data science. I still refer to it frequently. Perhaps list the year of publication above? I still need to read Advanced R by Wickham to evaluate it properly. However, it would be great if you could evaluate this book in a paragraph or two; I will add your comments with credit in this article along with the book cover.
I am sure the readers will find it useful. Excellent piece of information. Business Case Analysis with R: A Simulation Tutorial to Support Complex Business Decisions takes the use of R in a different direction, from data and statistical analysis to business case analysis. This is the top-level description: business case analysis, often conducted in spreadsheets, exposes decision makers to additional risks that arise just from the spreadsheet environment. The R language, traditionally used for statistical analysis, also provides a more explicit, flexible, and extensible environment than spreadsheets for conducting business case analysis.
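The kind of simulation-based business case analysis the book performs in R can be sketched briefly. For consistency with the other examples in this piece the sketch below is in Python; the profit model and every figure in it are invented, and it stands in for the richer models the book builds.

```python
import random

# Toy Monte Carlo business case: profit = volume * (price - unit_cost)
# - fixed_cost, with uncertain volume and unit cost modeled as triangular
# distributions. All numbers are invented for illustration.
random.seed(42)

def simulate_profit():
    volume = random.triangular(8_000, 12_000, 10_000)  # units sold (low, high, mode)
    unit_cost = random.triangular(4.0, 6.0, 5.0)       # cost per unit
    price, fixed_cost = 9.0, 25_000
    return volume * (price - unit_cost) - fixed_cost

profits = [simulate_profit() for _ in range(10_000)]
mean_profit = sum(profits) / len(profits)
loss_probability = sum(p < 0 for p in profits) / len(profits)
print(f"mean profit: {mean_profit:,.0f}")
print(f"probability of loss: {loss_probability:.1%}")
```

Unlike a point-estimate spreadsheet, the simulation yields a full distribution of outcomes, so the decision maker sees not just the expected profit but how likely the venture is to lose money at all.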
Thank you for sharing this. I have been looking for business case studies using R. Would love to see more!
Kayla, Nearly a year later, I just now saw your response. Were you able to make use of the tutorial? As always, Roopam, you have done fabulous work and a great service to the data analytics community in describing all of these resources for learning R and your personal experiences with them. As I am currently inexperienced with R and trying to get up to speed, it looks like the best sequence with online resources might be Code School, then Lynda, then Coursera, moving from basic to heavy duty. Does that make sense? Additionally, I am also trying to figure which of the R interfaces like R studio would be the best to pursue.
I must apologize, I have not read all of your blogs on YOU CANalytics; it is very possible you have commented elsewhere on these issues. Any thoughts you have on this would be much appreciated. Yes, your sequence of courses seems right to me in terms of difficulty levels. Between Code School and Lynda, you may want to squeeze in two more free courses. If you feel ready after them, you could skip Lynda altogether and move on to Kaggle challenges. Lynda, in my opinion, serves more as a warm-up; however, it is a good course to start with.
In terms of R interfaces, I am highly biased towards RStudio.
I have never used any other interface after using RStudio all these years. I used to rely on the base R interface, which I have not used for more than five years now. RStudio slowly grows on you, so I recommend sticking with it. You may want to try out Rattle as well. I have heard good reviews about the H2O package but have not tried it just yet. That is a great online resource as well; it is user-friendly, covers the R basics for those getting started, and also includes links to datasets. I think you need to look at the overall data science program offered by Coursera.
Dr. Peng's R Programming course is an introduction to R and is one of the subjects in it. The title of the book is Behind Every Good Decision. I read the book, and it has two main components in my view: examples of how to use business analytics to gain a competitive advantage (these examples are not exhaustive, but are more descriptive in nature), and the overall flow of a data science project in a business environment.
The great thing about this book is that it describes in a very rigorous way what steps to take to go from a business question to good insights, what pitfalls to avoid, and how to create an analytics organisation. My experience in engineering is that using a structured approach to solving problems is one of the most important aspects of making a project successful, and this book explains in great detail how to do that for data science. I reviewed it and found it to be very helpful.
I also have a book on using R for business case analysis, which is a slightly different use case for R from its usual data analytics. It incorporates principles of decision and risk analysis.
It is now available at https: R is a software environment for statistical computing that is widely used by data miners and statisticians for developing statistical software and performing data analysis. The blog is very informative; thanks for keeping it updated. Nice blog. I am a beginner in the R programming field; this information is useful for everyone, so please keep posting updates like these.
Can we use R for retail analytics also? Hi, I am really happy to have found such a helpful and fascinating post that is so well written.