As a business analyst, you want to get the most out of your data and have access to the best techniques to really understand what is going on in your business. Empowerment is at your fingertips with Smart Discovery in SAP Analytics Cloud.
Smart Discovery offers a viable way to use automated machine learning on top of your BI data, without losing precious analysis time on data preparation. Simply decide the business question you want to ask your data and let Smart Discovery analyze it for you by running a machine learning algorithm. You can then explore the generated results to gain insights into your data.
What’s new with Smart Discovery in Q1 2021?
In Q1 2021, a significant update to Smart Discovery was released to better answer your business question by helping you define the context of your question more clearly. Defining a better question means Smart Discovery can automatically prepare your data for you and create better analysis results to explore. To make sure you are satisfied with the business question you have defined, Smart Discovery now offers you a preview of your question before it starts its analysis. The three main benefits to you, as a business analyst, are that Smart Discovery:
- Bridges the knowledge gap between machine learning and BI analysis
- Automatically leverages data contained in existing BI models, eliminating manual preparation requirements
- Maps the output cleanly to the business question, making it simple to then refine and improve
Using Smart Discovery
Presented in this article is an explanation of some of the recent updates to Smart Discovery. This is a starting point for those who want to further explore their data and receive answers to real business questions with a few simple clicks.
1. Specify the Business Question
Smart Discovery helps you, as a business analyst, understand the process used to analyze your data. It helps you to specify the right question and quickly understand the generated results. You can refine the question by modifying the target or entity, filtering the dataset, or excluding variables from the analysis. The Target is the measure or dimension you would like to know more about, such as Revenue or Customer Churn. The Entity defines the dimension(s) describing the object in the data you would like to know more about, for instance Customer or Product. The entity describes the key that identifies each instance of a specified object. Smart Discovery will aggregate the data to the level described by the entity.
Previously, business analysts required specialist data science knowledge to effectively apply machine learning techniques to business data. Some of the challenges of this were:
- Selecting the correct machine learning technique for a particular problem
- Selecting and preparing the data
- Correctly interpreting the results
Smart Discovery allows granular yet simple specification of your business questions. Based on the question, the appropriate predictive algorithm is selected and the BI data is automatically prepared so that the algorithm can be applied. Smart Discovery then produces results that are easy to understand. Since the automatic data preparation allows machine learning to be applied directly to BI data, it is simple to refine the question or even ask more than one question at a time. Explore your data from different angles by asking Smart Discovery to analyze the same target in relation to different entities.
2. Confirm the Business Question
Smart Discovery analyzes the data and generates content that gives you insight into how the underlying variables influence a target relative to an entity within a dataset. Once the question is confirmed, Smart Discovery automatically prepares the data and builds a predictive model. In the example used here, the model forecasts Gross Margin for Customer Name. From this predictive model, it extracts and generates content that helps the analyst understand Gross Margin.
A key issue when applying machine learning to BI data is that the data is often not structured in a way that allows it to be applied easily. This can mean the results generated by machine learning do not match the user's expectations and can be misleading. When configuring Smart Discovery, you specify the question by selecting both the Target and the Entity. The Entity defines the object in the data you wish to explore and can be defined by one or multiple dimensions. This forms the key for the generated dataset. Specifying both parameters ensures the generated output is reliable and easily understood.
In this example, you specify the target as Gross Margin and the entity as Customer Name. The other dimensions in the data may play an important role in explaining the target and must be represented in the flattened dataset. Measures are aggregated at the entity level based on their aggregation type.
How a dimension is represented in the dataset depends on the relationship it has to the entity:
- If a dimension has a single value per entity, it is included in the dataset as is, with its original name. The relationship in this case is many to one (m:1).
- If there is a unique value of a dimension for each value of the key, the dimension is not included. In this case the relationship is one to one (1:1).
- If a dimension has multiple values per entity, a count of the distinct values is included with a “Number of” prefix. The relationship in this case can be many to many (m:m) or one to many (1:m).
Smart Discovery automatically prepares a dataset that contains one row of data for each instance of the entity. For example, if the selected entity is Customer ID, the dataset contains one row for each unique customer ID. Identifying the entity allows the automated machine learning to provide a more focused analysis.
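To make the preparation rules above more concrete, here is a minimal Python sketch of the flattening step. It is an illustration only, not the Smart Discovery implementation: the function name flatten_to_entity, the sample rows, and the assumption that every measure uses SUM aggregation are all hypothetical.

```python
from collections import defaultdict


def flatten_to_entity(rows, entity, measures):
    """Return one row per entity value, following the three dimension rules.

    Assumption: every measure is aggregated with SUM. Real models may use
    other aggregation types.
    """
    dimensions = [c for c in rows[0] if c != entity and c not in measures]

    # Group the transaction-level rows by the entity key.
    groups = defaultdict(list)
    for row in rows:
        groups[row[entity]].append(row)

    # Aggregate the measures at the entity level.
    flat = {}
    for key, group in groups.items():
        flat[key] = {entity: key}
        for m in measures:
            flat[key][m] = sum(r[m] for r in group)

    for dim in dimensions:
        distinct = {key: {r[dim] for r in group} for key, group in groups.items()}
        if all(len(vals) == 1 for vals in distinct.values()):
            singles = {key: next(iter(vals)) for key, vals in distinct.items()}
            if len(set(singles.values())) == len(singles):
                continue  # 1:1 with the key: the dimension is dropped
            for key, value in singles.items():
                flat[key][dim] = value  # m:1: kept as is, original name
        else:
            for key, vals in distinct.items():
                # 1:m or m:m: replaced by a distinct count
                flat[key]["Number of " + dim] = len(vals)
    return list(flat.values())


# Hypothetical transaction-level data.
sample_rows = [
    {"Customer Name": "Acme", "Customer ID": "C1", "Region": "West",
     "Product": "Bike", "Gross Margin": 100.0},
    {"Customer Name": "Acme", "Customer ID": "C1", "Region": "West",
     "Product": "Helmet", "Gross Margin": 20.0},
    {"Customer Name": "Bolt", "Customer ID": "C2", "Region": "West",
     "Product": "Bike", "Gross Margin": 80.0},
    {"Customer Name": "Core", "Customer ID": "C3", "Region": "East",
     "Product": "Bike", "Gross Margin": 50.0},
]

flattened = flatten_to_entity(sample_rows, "Customer Name", ["Gross Margin"])
for row in flattened:
    print(row)
```

With Customer Name as the entity, Customer ID is dropped (1:1), Region is kept as is (m:1), and Product becomes a "Number of Product" count. Note that on a very small sample a many-to-one dimension can look one-to-one by coincidence, so this check is only a rough stand-in for the relationship defined in the model.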
3. Automatically Generated Story
Smart Discovery automatically prepares the data for the business question, analyzes it, and generates content for you. This process automatically builds a predictive model for the specified target. In the example above, the insights provided on the Key Influencers, Unexpected Values, and Simulation pages are based on this model.
It is important to note the analysis is performed on a snapshot of the data at the time Smart Discovery is run. The analysis is not updated automatically in response to updates to the data. All the content generated by Smart Discovery is dynamic and changes based on the underlying data.
The Overview page provides visualizations to summarize the results for your target dimension or measure in relation to your entity.
The Key Influencers page is generated based on the predictive model. It lists up to 10 dimensions and measures that significantly impact the target, ranked from highest to lowest influence. For each influencer, visualizations show the average target value and the distribution of the target for each value of a dimension or for each binned value of a measure.
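The per-influencer summaries can be approximated with a short Python sketch. It assumes an already flattened dataset with one row per customer. The function names, the Discount measure, and the equal-width binning are hypothetical choices for illustration. The ranking of influencers comes from the predictive model and is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def average_target_by_value(rows, influencer, target):
    """Average target for each value of a dimension."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[influencer]].append(r[target])
    return {value: mean(targets) for value, targets in groups.items()}


def average_target_by_bin(rows, influencer, target, n_bins=2):
    """Average target for each equal-width bin of a measure (assumed binning)."""
    values = [r[influencer] for r in rows]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant measures
    groups = defaultdict(list)
    for r in rows:
        # The maximum value falls into the last bin.
        idx = min(int((r[influencer] - lo) / width), n_bins - 1)
        groups[idx].append(r[target])
    return {
        (lo + i * width, lo + (i + 1) * width): mean(targets)
        for i, targets in sorted(groups.items())
    }


# Hypothetical entity-level data: one row per customer.
customers = [
    {"Customer Name": "Acme", "Region": "West", "Discount": 0.0, "Gross Margin": 120.0},
    {"Customer Name": "Bolt", "Region": "West", "Discount": 10.0, "Gross Margin": 80.0},
    {"Customer Name": "Core", "Region": "East", "Discount": 20.0, "Gross Margin": 50.0},
    {"Customer Name": "Dyna", "Region": "East", "Discount": 5.0, "Gross Margin": 110.0},
]

by_region = average_target_by_value(customers, "Region", "Gross Margin")
by_discount = average_target_by_bin(customers, "Discount", "Gross Margin")
print(by_region)
print(by_discount)
```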
In this case there is a record in the data for every customer name. This record contains information at the customer level, such as the aggregated Gross Margin for that customer and any dimension values that are unique for that customer.
The Unexpected Values page lists records in the data where the value projected by the predictive model is very different from the actual value. These values are significant because the predicted values are based on the general patterns within the data, so these records are exceptions to the general rule. In this example, the value of Gross Margin for these customer names differs from the value predicted from the behavior of the other customers. These customers may be interesting to the analyst as they may reveal special cases that require further investigation, or may show issues with the underlying data quality.
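The idea of comparing predicted and actual values can be sketched in a few lines of Python. The rule Smart Discovery uses to decide what counts as unexpected is not described in this article, so the threshold below (a residual larger than k standard deviations of all residuals) is an assumption, as are the function name and the sample figures.

```python
from statistics import pstdev


def unexpected_values(records, k=1.5):
    """Return records whose actual value is far from the predicted value.

    Assumed rule: |actual - predicted| > k * standard deviation of residuals.
    """
    residuals = [r["actual"] - r["predicted"] for r in records]
    spread = pstdev(residuals)
    flagged = [
        {**r, "residual": res}
        for r, res in zip(records, residuals)
        if spread > 0 and abs(res) > k * spread
    ]
    # Largest deviations first.
    return sorted(flagged, key=lambda r: abs(r["residual"]), reverse=True)


# Hypothetical Gross Margin per customer: actual vs. model prediction.
records = [
    {"Customer Name": "Acme", "actual": 120.0, "predicted": 118.0},
    {"Customer Name": "Bolt", "actual": 80.0, "predicted": 83.0},
    {"Customer Name": "Core", "actual": 50.0, "predicted": 95.0},
    {"Customer Name": "Dyna", "actual": 110.0, "predicted": 108.0},
    {"Customer Name": "Echo", "actual": 70.0, "predicted": 71.0},
]

outliers = unexpected_values(records)
for row in outliers:
    print(row["Customer Name"], row["residual"])
```

In this made-up data only one customer deviates strongly from the prediction, which is the kind of record an analyst would then check for a special case or a data quality problem.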
The Simulation page gives you the opportunity to run different scenarios on the dimensions to see what potential impact they will have on the estimated value in future, providing a type of ‘what if’ analysis. Values can be provided for each of the Key Influencers and the predictive model is used to generate the expected value for Gross Margin.
The influencers are listed with an indication of the relative impact the selected values have on the expected value. In the example below, a customer with these properties has an expected Gross Margin of 2,278,419.
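A what-if simulation of this kind can be illustrated with a toy model in Python. The article does not say which type of predictive model Smart Discovery builds, so the additive form used here (a baseline plus one contribution per selected influencer value) is purely an assumption, and the function name and all numbers are invented for illustration.

```python
def simulate(baseline, contributions, selection):
    """Return the expected target value and the impact of each selected value.

    Assumed toy model: expected = baseline + sum of per-influencer contributions.
    """
    impacts = {
        influencer: contributions[influencer][value]
        for influencer, value in selection.items()
    }
    expected = baseline + sum(impacts.values())
    return expected, impacts


# Hypothetical model parameters for Gross Margin per customer.
baseline = 90.0
contributions = {
    "Region": {"West": 10.0, "East": -10.0},
    "Number of Product": {1: -5.0, 2: 15.0},
}

# Scenario: a customer in the West region buying two distinct products.
expected, impacts = simulate(
    baseline, contributions, {"Region": "West", "Number of Product": 2}
)
print(expected, impacts)
```

Changing one selected value and rerunning the function shows how the expected value moves, which mirrors the way the Simulation page lets you compare scenarios.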
With Smart Discovery, business analysts can use automated machine learning to quickly understand their BI data directly in SAP Analytics Cloud, without the need for data science or machine learning expertise. By simply specifying your business question, you can benefit from insights quickly generated by machine learning. The elimination of data preparation can be a game changer for quickly running analyses, interpreting simple results, and modifying the settings to run further analysis. This iterative approach allows you to gain a better understanding of your business data quickly and effectively. By simplifying the process, Smart Discovery empowers you to make better decisions and build informative stories, all while generating useful content on demand.
Want to experience and run Smart Discovery for yourself? Take the leap and start your journey towards making data-driven decisions with confidence by signing up for a 90-day free trial. Or, if you would like to request enhancements to Smart Discovery, please enter your requests on SAP Customer Influence.