
The Data Flow: a Quick Overview
What is the Data Flow?
We’re introducing an exciting new feature in SAP Data Warehouse Cloud: the Data Flow. The Data Flow makes it easy to integrate structured and semi-structured (JSON, XML) data from various data sources.
Moreover, the Data Flow:
- provides a new, easy-to-use, flow-based data modeling experience for ETL requirements;
- allows you to load and combine structured and semi-structured data from different SAP and non-SAP data sources, such as cloud file storage, database management systems (DBMS), or SAP S/4HANA;
- assists you with standard data transformation capabilities and scripting for advanced requirements.
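To make "combining structured and semi-structured data" more concrete, here is a minimal sketch in plain Python with pandas. It only illustrates the idea and is not how the Data Flow works internally; the records, column names, and table contents are all hypothetical.

```python
import pandas as pd

# Hypothetical semi-structured input: nested JSON order records.
orders_json = [
    {"order_id": 1, "customer": {"id": "C1", "country": "DE"}, "amount": 120.0},
    {"order_id": 2, "customer": {"id": "C2", "country": "US"}, "amount": 80.5},
]

# Hypothetical structured input: a customer master table.
customers = pd.DataFrame(
    {"customer_id": ["C1", "C2"], "name": ["Alpha GmbH", "Beta Inc."]}
)

# Flatten the nested JSON; nested keys become columns such as "customer.id".
orders = pd.json_normalize(orders_json)

# Combine the flattened records with the structured table.
combined = orders.merge(
    customers, left_on="customer.id", right_on="customer_id", how="left"
)
print(combined)
```

The flattening step is what turns nested fields into ordinary columns, so that the semi-structured records can be joined with a regular table.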
Transformation Capabilities
The Data Flow offers transformation capabilities that range from standard transformations to Python scripting for advanced transformations.
The Standard Transformations:
- Combine structured and semi-structured data
- Operators for projections, aggregations, joins, filters, and unions
- Data sources can be tables, CDS views, remote files, OData, and more
- Planned additional operators: Case, Lookup, Cleanse, Rules
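For readers who think in code, the standard operators listed above correspond roughly to the following pandas operations. This is only an analogy under assumed sample data (the tables and column names are hypothetical); in the Data Flow itself these steps are modeled graphically.

```python
import pandas as pd

# Hypothetical source tables.
sales_2020 = pd.DataFrame(
    {"product_id": [1, 2, 1], "region": ["EU", "EU", "US"], "revenue": [100.0, 50.0, 70.0]}
)
sales_2021 = pd.DataFrame(
    {"product_id": [1, 2], "region": ["EU", "US"], "revenue": [30.0, 20.0]}
)
products = pd.DataFrame({"product_id": [1, 2], "product_name": ["Laptop", "Monitor"]})

# Union: stack rows from both sources.
sales = pd.concat([sales_2020, sales_2021], ignore_index=True)

# Projection: keep only the columns that are needed downstream.
sales = sales[["product_id", "region", "revenue"]]

# Filter: keep only rows that match a condition.
eu_sales = sales[sales["region"] == "EU"]

# Join: enrich the rows with attributes from another table.
joined = eu_sales.merge(products, on="product_id", how="inner")

# Aggregation: total revenue per product.
result = joined.groupby("product_name", as_index=False)["revenue"].sum()
print(result)
```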
Scripting Capabilities:
- Advanced transformation requirements, such as text extraction
- Embedded scripting editor in the Data Flow Modeler
- Support for the standard Python 3 scripting language
- Pandas and NumPy libraries included
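As an illustration of the kind of text extraction a script could perform, here is a minimal sketch. It assumes the script receives the incoming rows as a pandas DataFrame and returns a DataFrame; the function name `transform`, the `comment` column, and the `INC-` ticket format are assumptions made for this example, so check the scripting editor for the exact signature it expects.

```python
import pandas as pd


def transform(data: pd.DataFrame) -> pd.DataFrame:
    """Extract a ticket number such as 'INC-1042' from a free-text column."""
    result = data.copy()  # leave the incoming frame untouched
    # str.extract returns the first capture group; rows without a match get NaN.
    result["ticket_id"] = result["comment"].str.extract(r"(INC-\d+)", expand=False)
    return result


# Hypothetical sample rows, used here only to try the function locally.
sample = pd.DataFrame(
    {"comment": ["Customer reported INC-1042 again", "No ticket referenced"]}
)
print(transform(sample))
```

Working on a copy keeps the function free of side effects, which makes it easier to test outside the modeler before pasting it into the editor.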
Data Integration Workflow – in a nutshell
The Data Integration Workflow with the Data Flow has three goals: explore the data model, combine data with existing data models, and share the results with others.
- Create Connections: Create a connection either in the central connection management or within the context of a space. Then, display the available semantic objects.
- Build Data Flow: Browse a wide variety of data sources and select source data objects. Then browse target tables (from the repository), define transformations, and map source data to target objects.
- Execute/Schedule Data Flow: Trigger the execution of the data flow and monitor its execution status in real time. Optionally, schedule a data flow or find other scheduled jobs in the job monitor.
- Story Designer/Data Modeling: Create data stories based on the loaded data on top of data warehouse analytical artifacts. Build new data models on other, integrated data sources.