Reading Time: 5 Min

The Data Flow Builder: A Quick Overview

Learning Article
  • What is the Data Flow Builder?

    The Data Flow Builder is a new feature in SAP Data Warehouse Cloud. It simplifies the integration of structured and semi-structured (JSON, XML) data from various data sources.

    Moreover, the Data Flow Builder:

    • provides a new, easy-to-use, data-flow-based modeling experience for ETL requirements;
    • allows you to load and combine structured and semi-structured data from different SAP and non-SAP data sources, such as cloud file storage, database management systems (DBMS), or SAP S/4HANA;
    • assists you with standard data transformation capabilities, plus scripting for advanced requirements.
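To make "combining structured and semi-structured data" concrete, here is a minimal sketch in plain pandas, outside of the Data Flow Builder itself. The sample records, column names, and the join key are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical semi-structured input, e.g. records parsed from a JSON file.
orders_json = [
    {"order_id": 1, "customer": {"id": "C1", "country": "DE"}, "amount": 120.0},
    {"order_id": 2, "customer": {"id": "C2", "country": "US"}, "amount": 80.0},
]

# Hypothetical structured input, e.g. rows read from a database table.
customers = pd.DataFrame(
    {"customer_id": ["C1", "C2"], "name": ["Alpha GmbH", "Beta Inc."]}
)

# Flatten the nested JSON: nested keys become dotted columns such as "customer.id".
orders = pd.json_normalize(orders_json)

# Combine the flattened records with the structured table.
combined = orders.merge(
    customers, left_on="customer.id", right_on="customer_id", how="left"
)
print(combined[["order_id", "name", "customer.country", "amount"]])
```

In the Data Flow Builder this kind of combination is modeled graphically; the sketch only shows the underlying idea.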

     

    Transformation Capabilities

    The Data Flow Builder offers transformations that range from standard operators to Python scripting for advanced requirements.

    Standard transformations:

    • Combine structured and semi-structured data
    • Operators for projections, aggregations, joins, filters, and unions
    • Data sources can be tables, CDS views, remote files, OData, and more
    • Additional operators planned: Case, Lookup, Cleanse, Rules

    Screenshot of the UI showing the standard operators (such as join, union, projection) in the Data Flow Builder.
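For readers who think in code, the standard operators map roughly onto familiar pandas operations. The following is a sketch of that analogy only, not of how the Data Flow Builder executes them; all table and column names are hypothetical.

```python
import pandas as pd

# Hypothetical source tables.
sales = pd.DataFrame({
    "product_id": [1, 2, 1, 3],
    "region": ["EU", "EU", "US", "US"],
    "revenue": [100.0, 50.0, 70.0, 20.0],
    "internal_note": ["a", "b", "c", "d"],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "category": ["Bikes", "Helmets", "Bikes"],
})

# Projection: keep only the columns that are needed downstream.
projected = sales[["product_id", "region", "revenue"]]

# Filter: keep only rows that satisfy a condition.
filtered = projected[projected["revenue"] >= 50.0]

# Join: enrich the sales rows with product attributes.
joined = filtered.merge(products, on="product_id", how="inner")

# Aggregation: total revenue per product category.
aggregated = joined.groupby("category", as_index=False)["revenue"].sum()

# Union: append rows from a second source with the same columns.
more_sales = pd.DataFrame({"product_id": [2], "region": ["US"], "revenue": [30.0]})
unioned = pd.concat([projected, more_sales], ignore_index=True)

print(aggregated)
```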

    Scripting Capabilities

    • Advanced transformation requirements, such as text extraction
    • Embedded scripting editor in the Data Flow Builder
    • Support for the standard Python 3 scripting language
    • Pandas and NumPy libraries included

    Screenshots showing Python code in the Script Node of the Data Flow Builder.
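As an illustration of the text-extraction use case, the following sketch shows what a script for such a node might look like. It assumes the script receives the incoming rows as a pandas DataFrame and returns a DataFrame through a function named `transform`; treat that entry point, the `comment` column, and the `INC-` ticket pattern as assumptions and check the script editor's template for the actual signature.

```python
import pandas as pd

def transform(data):
    """Add a ticket_id column extracted from a free-text column.

    Assumed entry point: takes a pandas DataFrame and returns one.
    """
    result = data.copy()  # avoid modifying the incoming frame in place
    # Pull the first token that looks like "INC-<digits>" out of the text;
    # rows without a match get a missing value.
    result["ticket_id"] = result["comment"].str.extract(r"(INC-\d+)", expand=False)
    return result

# Local usage example with hypothetical data.
sample = pd.DataFrame({"comment": ["See INC-1042 for details", "no ticket here"]})
out = transform(sample)
print(out)
```

Copying the frame before adding the column keeps the function free of side effects, which makes it easier to test locally before pasting it into the editor.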

    Data Integration Workflow – in a nutshell

    The Data Integration Workflow with the Data Flow Builder has three goals: explore the data model, combine data with existing data models, and share the results with others.

    1. Create Connections: Create a connection either in the central connection management or within the context of a space. Then display the available semantic objects.
    2. Build Data Flow: Browse the available data sources and select source data objects. Then browse target tables (from the repository), define transformations, and map source data to target objects.
    3. Execute/Schedule Data Flow: Trigger the execution of the data flow and follow the execution status in real time. Optionally, schedule the data flow or look up other scheduled jobs in the job monitor.
    4. Story Designer/Data Modeling: Create data stories based on the loaded data, on top of data warehouse analytical artifacts. Build new data models on other integrated data sources.