SAP Datasphere

ERPL-Web is the first native DuckDB client for SAP Datasphere. Discover, explore, and query your Datasphere data directly from SQL with full OAuth2 security.

Overview

SAP Datasphere (formerly SAP Data Warehouse Cloud) is SAP's cloud data platform. ERPL-Web integrates with both the DWAAS Core APIs and the Catalog OData service to provide:

  • ✅ OAuth2 authentication (authorization code & client credentials)
  • ✅ Discover spaces and assets automatically
  • ✅ Rich metadata for relational and analytical datasets
  • ✅ Query relational data with full SQL pushdown
  • ✅ Query analytical data with metrics and dimensions
  • ✅ Secure secret management
  • ✅ Automatic token refresh

Prerequisites

Before you start, you'll need:

  1. SAP Datasphere Tenant - Active Datasphere instance
  2. OAuth2 App Registration - Client ID and secret (or use pre-delivered credentials)
  3. Permissions - Access to spaces and views you want to query
  4. Network Access - Ability to reach your Datasphere tenant
tip

For quick testing, ERPL-Web can use Datasphere's pre-delivered OAuth2 client. You only need your tenant name and data center!


Quick Start

Step 1: Create OAuth2 Secret

-- Load the extension
LOAD erpl_web;

-- Create OAuth2 secret (minimal configuration)
CREATE SECRET datasphere (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'your-tenant',
    DATA_CENTER 'eu10',
    SCOPE 'default'
);

This opens your browser for authentication. After you log in, the tokens are stored securely.

Step 2: Discover Your Spaces

-- List all accessible spaces
SELECT * FROM datasphere_show_spaces();

Example Output:

┌──────────────┐
│ name         │
├──────────────┤
│ SALES        │
│ FINANCE      │
│ SUPPLY_CHAIN │
└──────────────┘

Step 3: Discover Assets in a Space

-- List assets in SALES space
SELECT name, object_type, technical_name
FROM datasphere_show_assets('SALES');

Example Output:

┌───────────────────┬─────────────┬───────────────────┐
│ name              │ object_type │ technical_name    │
├───────────────────┼─────────────┼───────────────────┤
│ Sales Analytics   │ View        │ SALES_ANALYTICS_V │
│ Customer Master   │ View        │ CUSTOMER_MASTER_V │
│ Revenue by Region │ View        │ REVENUE_REGION_V  │
└───────────────────┴─────────────┴───────────────────┘

Step 4: Query Your Data

-- Read relational data
SELECT *
FROM datasphere_read_relational('SALES', 'CUSTOMER_MASTER_V')
LIMIT 10;

Done! You're now querying SAP Datasphere from DuckDB.


OAuth2 Authentication

ERPL-Web supports multiple OAuth2 flows for different use cases.

Authorization Code Flow (Interactive)

Best for interactive use - opens browser for login:

CREATE SECRET datasphere (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'mytenant',
    DATA_CENTER 'eu10',
    SCOPE 'default',
    REDIRECT_URI 'http://localhost:65000' -- Default
);

What happens:

  1. A local server starts on port 65000
  2. Browser opens to Datasphere login
  3. You authenticate with your credentials
  4. Tokens are saved in the secret
  5. Tokens auto-refresh when expired
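For intuition, the browser URL in steps 2–3 is a standard OAuth2 authorization request. A minimal sketch in Python, assuming the common SAP BTP authentication host pattern — the real endpoint is resolved by ERPL-Web from TENANT_NAME and DATA_CENTER, so treat the host here as illustrative only:

```python
from urllib.parse import urlencode

def build_authorization_url(tenant, data_center, client_id,
                            redirect_uri="http://localhost:65000",
                            scope="default"):
    # Host pattern is an assumption for illustration, not the
    # extension's documented endpoint resolution.
    base = (f"https://{tenant}.authentication.{data_center}"
            ".hana.ondemand.com/oauth/authorize")
    query = urlencode({
        "response_type": "code",       # authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # where the local server listens
        "scope": scope,
    })
    return f"{base}?{query}"

print(build_authorization_url("mytenant", "eu10", "my-client-id"))
```

After login, the identity provider redirects back to `redirect_uri` with a one-time code, which the client exchanges for the access and refresh tokens stored in the secret.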

Client Credentials Flow (Service Accounts)

Best for automation and scripts - no browser needed:

CREATE SECRET datasphere_svc (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'mytenant',
    DATA_CENTER 'eu10',
    CLIENT_ID 'your-client-id',
    CLIENT_SECRET 'your-client-secret',
    GRANT_TYPE 'client_credentials',
    SCOPE 'default'
);
warning

Keep CLIENT_SECRET secure! Never commit it to version control. Use environment variables or secure vaults.

Using Pre-Delivered Client

If you don't have a custom OAuth2 app, use Datasphere's built-in client:

CREATE SECRET datasphere (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'mytenant',
    DATA_CENTER 'eu10'
    -- No CLIENT_ID or CLIENT_SECRET needed!
);

Data Centers

Specify your Datasphere data center:

Data Center   Location
eu10          Europe (Frankfurt)
us10          US East
ap10          Asia Pacific (Sydney)
jp10          Japan (Tokyo)
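As a sketch of how these two values combine, here is a hypothetical helper that derives a tenant host. The `<tenant>.<dc>.hcs.cloud.sap` pattern is an assumption for illustration, not documented behavior of the extension:

```python
KNOWN_DATA_CENTERS = {"eu10", "us10", "ap10", "jp10"}

def tenant_host(tenant_name, data_center):
    # Reject typos early; the host pattern below is illustrative only.
    if data_center not in KNOWN_DATA_CENTERS:
        raise ValueError(f"unknown data center: {data_center}")
    return f"https://{tenant_name}.{data_center}.hcs.cloud.sap"

print(tenant_host("acme-corp", "eu10"))
# -> https://acme-corp.eu10.hcs.cloud.sap
```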

Discovery Functions

ERPL-Web provides comprehensive discovery to explore your Datasphere environment.

datasphere_show_spaces

List all accessible spaces.

Signature:

datasphere_show_spaces([secret VARCHAR])

Returns: Table with column:

  • name (VARCHAR): Space ID

Example:

SELECT * FROM datasphere_show_spaces();

datasphere_show_assets

List assets in a space or across all spaces.

Signatures:

-- Assets in specific space
datasphere_show_assets(space_id VARCHAR [, secret VARCHAR])

-- Assets across all spaces
datasphere_show_assets([secret VARCHAR])

Returns: Table with columns:

  • name (VARCHAR): Asset name (label)
  • object_type (VARCHAR): Type (View, Table, etc.)
  • technical_name (VARCHAR): Technical identifier
  • space_name (VARCHAR): Space ID (only when querying all spaces)

Examples:

-- Assets in one space
SELECT * FROM datasphere_show_assets('SALES');

-- All accessible assets
SELECT * FROM datasphere_show_assets();

-- Filter by type
SELECT * FROM datasphere_show_assets('SALES')
WHERE object_type = 'View';

datasphere_describe_space

Get detailed space metadata.

Signature:

datasphere_describe_space(space_id VARCHAR [, secret VARCHAR])

Returns: Table with columns:

  • name (VARCHAR): Space ID
  • label (VARCHAR): Display label

Example:

SELECT * FROM datasphere_describe_space('SALES');

datasphere_describe_asset

Get comprehensive asset metadata including schema details.

Signature:

datasphere_describe_asset(
    space_id VARCHAR,
    asset_id VARCHAR
    [, secret VARCHAR]
)

Returns: Table with 15 columns:

Basic Metadata:

  • name (VARCHAR): Technical name
  • space_name (VARCHAR): Space ID
  • label (VARCHAR): Display label
  • asset_type (VARCHAR): Asset type

Access URLs:

  • asset_relational_metadata_url (VARCHAR)
  • asset_relational_data_url (VARCHAR)
  • asset_analytical_metadata_url (VARCHAR)
  • asset_analytical_data_url (VARCHAR)

Capabilities:

  • supports_analytical_queries (BOOLEAN)
  • has_relational_access (BOOLEAN)
  • has_analytical_access (BOOLEAN)

Schema Information:

  • relational_schema (STRUCT): Relational columns definition
  • analytical_schema (STRUCT): Metrics, dimensions, variables
  • odata_context (VARCHAR)
  • odata_metadata_etag (VARCHAR)

Example:

SELECT
    name,
    label,
    supports_analytical_queries,
    relational_schema,
    analytical_schema
FROM datasphere_describe_asset('SALES', 'REVENUE_REGION_V');

Reading Data

datasphere_read_relational

Query relational (table-like) data.

Signature:

datasphere_read_relational(
    space_id VARCHAR,
    asset_id VARCHAR
    [, secret VARCHAR]
    [, top BIGINT]
    [, skip BIGINT]
    [, params MAP(VARCHAR, VARCHAR)]
)

Parameters:

  • space_id: Space identifier
  • asset_id: Asset technical name
  • secret: Optional secret name (uses default if not specified)
  • top: Limit number of rows (OData $top)
  • skip: Skip rows (OData $skip)
  • params: Input parameters for parameterized views
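Under the hood these arguments become OData query options on the asset's relational data URL. A rough Python sketch — the parameter-segment shape for input parameters (`VIEW(P_YEAR='2024')/Set`) is an assumption for illustration:

```python
from urllib.parse import urlencode, quote

def build_relational_url(asset_url, top=None, skip=None, params=None):
    # Input parameters for parameterized views become an OData
    # parameter segment (assumed shape, for illustration).
    url = asset_url
    if params:
        args = ",".join(f"{k}='{quote(str(v))}'" for k, v in params.items())
        url += f"({args})/Set"
    options = {}
    if top is not None:
        options["$top"] = top    # server-side row limit
    if skip is not None:
        options["$skip"] = skip  # server-side offset for pagination
    if options:
        url += "?" + urlencode(options, safe="$")
    return url

print(build_relational_url(
    "https://host/relational/SALES/CUSTOMER_MASTER_V",
    top=50, skip=100))
# -> https://host/relational/SALES/CUSTOMER_MASTER_V?$top=50&$skip=100
```

Because `$top` and `$skip` are applied by the server, only the requested window of rows is transferred.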

Example:

-- Read all data
SELECT * FROM datasphere_read_relational('SALES', 'CUSTOMER_MASTER_V');

-- With limit
SELECT *
FROM datasphere_read_relational('SALES', 'CUSTOMER_MASTER_V', top := 100);

-- With pagination
SELECT *
FROM datasphere_read_relational(
    'SALES',
    'CUSTOMER_MASTER_V',
    top := 50,
    skip := 100
);

-- With parameters
SELECT *
FROM datasphere_read_relational(
    'SALES',
    'PARAMETERIZED_VIEW',
    params := {'YEAR': '2024', 'REGION': 'EMEA'}
);

datasphere_read_analytical

Query analytical (multidimensional) data.

Signature:

datasphere_read_analytical(
    space_id VARCHAR,
    asset_id VARCHAR
    [, secret VARCHAR]
    [, top BIGINT]
    [, skip BIGINT]
    [, params MAP(VARCHAR, VARCHAR)]
    [, metrics LIST(VARCHAR)]
    [, dimensions LIST(VARCHAR)]
)

Parameters:

  • space_id: Space identifier
  • asset_id: Asset technical name
  • secret: Optional secret name
  • top: Limit rows
  • skip: Skip rows
  • params: Input parameters
  • metrics: List of measures to retrieve
  • dimensions: List of dimensions to retrieve
tip

The metrics and dimensions parameters automatically generate an optimal OData $select clause for analytical queries.
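That generation step can be sketched in a few lines; the dimensions-before-metrics ordering here is an illustrative choice, not the extension's guaranteed output:

```python
def build_select_clause(metrics=None, dimensions=None):
    # Only the requested columns end up in $select, so the server
    # computes and transfers nothing else.
    columns = list(dimensions or []) + list(metrics or [])
    if not columns:
        return ""  # no $select -> all columns are returned
    return "$select=" + ",".join(columns)

print(build_select_clause(metrics=["TotalRevenue"],
                          dimensions=["Region", "Year"]))
# -> $select=Region,Year,TotalRevenue
```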

Examples:

-- Read all metrics and dimensions
SELECT * FROM datasphere_read_analytical('SALES', 'REVENUE_ANALYTICS');

-- Select specific metrics and dimensions
SELECT *
FROM datasphere_read_analytical(
    'SALES',
    'REVENUE_ANALYTICS',
    metrics := ['TotalRevenue', 'TotalCost', 'Profit'],
    dimensions := ['Region', 'ProductCategory', 'Year']
);

-- With parameters and limit
SELECT *
FROM datasphere_read_analytical(
    'SALES',
    'REVENUE_ANALYTICS',
    params := {'FISCAL_YEAR': '2024'},
    metrics := ['TotalRevenue'],
    dimensions := ['Quarter', 'Region'],
    top := 100
);

Complete Workflow Example

Here's a full end-to-end workflow:

-- 1. Setup
LOAD erpl_web;

CREATE SECRET datasphere (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'acme-corp',
    DATA_CENTER 'eu10'
);

-- 2. Discover spaces
SELECT * FROM datasphere_show_spaces();
-- Result: SALES, FINANCE, HR

-- 3. Explore SALES space
SELECT name, object_type, technical_name
FROM datasphere_show_assets('SALES')
ORDER BY name;

-- 4. Get metadata for specific asset
SELECT
    label,
    supports_analytical_queries,
    has_relational_access,
    has_analytical_access
FROM datasphere_describe_asset('SALES', 'Q1_REVENUE_VIEW');

-- 5. Query relational data
CREATE TABLE local_customers AS
SELECT *
FROM datasphere_read_relational('SALES', 'CUSTOMER_MASTER')
WHERE country_code = 'US';

-- 6. Query analytical data
CREATE TABLE revenue_by_region AS
SELECT *
FROM datasphere_read_analytical(
    'SALES',
    'REVENUE_ANALYTICS',
    metrics := ['Revenue', 'Margin'],
    dimensions := ['Region', 'Quarter']
);

-- 7. Analyze locally with DuckDB
SELECT
    region,
    SUM(revenue) AS total_revenue,
    AVG(margin) AS avg_margin
FROM revenue_by_region
GROUP BY region
ORDER BY total_revenue DESC;

Advanced Features

Named Parameters for Parameterized Views

Some Datasphere views require input parameters:

SELECT *
FROM datasphere_read_relational(
    'SALES',
    'MONTHLY_SALES_VIEW',
    params := {
        'P_YEAR': '2024',
        'P_MONTH': '03',
        'P_REGION': 'EMEA'
    }
);

Snake_Case Column Names

ERPL-Web converts column names to snake_case for consistency:

-- Datasphere column: TotalRevenue
-- DuckDB column: total_revenue

SELECT total_revenue, product_category
FROM datasphere_read_analytical('SALES', 'REVENUE_VIEW');
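The renaming can be approximated with a small regex; this is an illustrative sketch, not the extension's exact algorithm:

```python
import re

def to_snake_case(name):
    # Insert an underscore before each upper-case letter that follows
    # a lower-case letter or digit, then lower-case everything.
    s = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name)
    # Also split acronym runs like "HTTPServer" -> "HTTP_Server".
    s = re.sub(r"(?<=[A-Z])([A-Z][a-z])", r"_\1", s)
    return s.lower()

print(to_snake_case("TotalRevenue"))     # -> total_revenue
print(to_snake_case("ProductCategory"))  # -> product_category
```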

Combine with DuckDB Features

-- Export to Parquet
COPY (
    SELECT * FROM datasphere_read_relational('SALES', 'CUSTOMERS')
) TO 'customers.parquet' (FORMAT PARQUET);

-- Join Datasphere data with local data
SELECT
    c.customer_name,
    l.local_data
FROM datasphere_read_relational('SALES', 'CUSTOMERS') c
JOIN local_table l ON c.customer_id = l.id;

-- Window functions
SELECT
    region,
    revenue,
    AVG(revenue) OVER (PARTITION BY region) AS avg_regional_revenue
FROM datasphere_read_analytical('SALES', 'REVENUE_VIEW');

Secret Management

Using Config Files

Store credentials in a config file:

# ~/.datasphere/config.ini
[datasphere]
tenant_name=acme-corp
data_center=eu10
client_id=abc123
client_secret=secret456

CREATE SECRET datasphere_cfg (
    TYPE datasphere,
    PROVIDER config,
    CONFIG_FILE '/home/user/.datasphere/config.ini'
);

Using File Provider

Store credentials in JSON:

{
    "tenant_name": "acme-corp",
    "data_center": "eu10",
    "access_token": "...",
    "refresh_token": "..."
}

CREATE SECRET datasphere_file (
    TYPE datasphere,
    PROVIDER file,
    FILEPATH '/secure/path/datasphere-creds.json'
);

Multiple Secrets for Multiple Tenants

-- Production tenant
CREATE SECRET datasphere_prod (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'acme-prod',
    DATA_CENTER 'eu10'
);

-- Development tenant
CREATE SECRET datasphere_dev (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'acme-dev',
    DATA_CENTER 'eu10'
);

-- Use specific secret
SELECT *
FROM datasphere_read_relational(
    'SALES',
    'CUSTOMERS',
    secret := 'datasphere_prod'
);

Performance Tips

1. Use Column Selection

Select only columns you need for better performance:

-- Good: Specific columns
SELECT customer_id, customer_name, country
FROM datasphere_read_relational('SALES', 'CUSTOMERS');

-- Less efficient: All columns
SELECT * FROM datasphere_read_relational('SALES', 'CUSTOMERS');

2. Use top Parameter for Sampling

-- Quick sample for exploration
SELECT *
FROM datasphere_read_relational('SALES', 'LARGE_VIEW', top := 100);

3. Filter in DuckDB After Loading

OData filters can be limited. Sometimes it's faster to load data and filter locally:

-- Load once, filter multiple times
CREATE TABLE customer_cache AS
SELECT * FROM datasphere_read_relational('SALES', 'CUSTOMERS');

-- Fast local filtering
SELECT * FROM customer_cache WHERE country = 'US';
SELECT * FROM customer_cache WHERE revenue > 100000;

4. For Analytical Queries, Specify Metrics/Dimensions

-- Efficient: Only requested columns are transferred
SELECT *
FROM datasphere_read_analytical(
    'SALES',
    'REVENUE_VIEW',
    metrics := ['Revenue'],
    dimensions := ['Region']
);

Troubleshooting

OAuth2 Browser Not Opening

Issue: Browser doesn't open for authentication

Solution:

-- Check redirect_uri and port
CREATE SECRET datasphere (
    TYPE datasphere,
    PROVIDER oauth2,
    TENANT_NAME 'mytenant',
    DATA_CENTER 'eu10',
    REDIRECT_URI 'http://localhost:65000' -- Try a different port if this one is blocked
);

Token Expired

Issue: 401 Unauthorized after some time

Solution: Tokens are auto-refreshed. If refresh fails, recreate the secret:

DROP SECRET datasphere;
CREATE SECRET datasphere (...); -- Re-authenticate
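Conceptually, the auto-refresh behavior is just an expiry check with a safety margin before each request. A hypothetical sketch (names and margin are illustrative, not the extension's internals):

```python
import time

def needs_refresh(expires_at, margin_seconds=60):
    # Renew slightly BEFORE expiry so a request never goes out with a
    # token that dies mid-flight; only when the refresh itself fails
    # does the user need to recreate the secret.
    return time.time() >= expires_at - margin_seconds

# Token that expires in 30 seconds: inside the margin, refresh now.
print(needs_refresh(time.time() + 30))    # -> True
# Token valid for another hour: no refresh needed yet.
print(needs_refresh(time.time() + 3600))  # -> False
```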

Asset Not Found

Issue: 404 Not Found when querying asset

Solution:

-- Verify asset exists
SELECT * FROM datasphere_show_assets('SPACE_NAME');

-- Check exact technical_name (case-sensitive)
SELECT technical_name FROM datasphere_show_assets('SPACE_NAME')
WHERE name LIKE '%search_term%';

Permission Denied

Issue: 403 Forbidden

Solution:

  • Verify your user has access to the space
  • Check space permissions in Datasphere UI
  • Ensure OAuth2 scopes include necessary permissions

Enable Tracing

For detailed diagnostics:

SET erpl_trace_enabled = TRUE;
SET erpl_trace_level = 'DEBUG';

-- Run your query
SELECT * FROM datasphere_show_spaces();

-- Check trace logs for details

See Tracing & Diagnostics for more information.


Integration Patterns

ETL to Data Lake

-- Extract Datasphere data to Parquet
COPY (
    SELECT * FROM datasphere_read_relational('SALES', 'TRANSACTIONS')
    WHERE transaction_date >= '2024-01-01'
) TO 's3://datalake/sales/transactions.parquet' (FORMAT PARQUET);

Federated Analytics

-- Combine Datasphere with other sources
SELECT
    ds.customer_id,
    ds.customer_name,
    pg.order_count,
    sf.crm_score
FROM datasphere_read_relational('SALES', 'CUSTOMERS') ds
LEFT JOIN postgres_customers pg ON ds.customer_id = pg.id
LEFT JOIN snowflake_crm sf ON ds.customer_id = sf.customer_id;

Incremental Sync

-- Track last sync time
CREATE TABLE sync_metadata (
    table_name VARCHAR,
    last_sync_time TIMESTAMP
);

-- Initial load
INSERT INTO sync_metadata VALUES ('CUSTOMERS', CURRENT_TIMESTAMP);

-- Incremental updates (if Datasphere view supports it)
INSERT INTO local_customers
SELECT *
FROM datasphere_read_relational(
    'SALES',
    'CUSTOMERS',
    params := {'LAST_MODIFIED': (
        SELECT last_sync_time FROM sync_metadata WHERE table_name = 'CUSTOMERS'
    )::VARCHAR}
);

UPDATE sync_metadata
SET last_sync_time = CURRENT_TIMESTAMP
WHERE table_name = 'CUSTOMERS';


Summary

ERPL-Web's Datasphere integration provides:

  • Native Integration - First native DuckDB client for Datasphere
  • Secure - Full OAuth2 with automatic token refresh
  • Discoverable - Explore spaces and assets easily
  • Powerful - Query relational and analytical data
  • Flexible - Combine with all DuckDB features

Connect your SAP Datasphere to DuckDB analytics today!