Delta Replicate Stock Data from SAP ERP to Parquet
This guide demonstrates how to use SAP ODP (Operational Data Provisioning) to replicate stock data from SAP ERP to DuckDB and export it to Parquet files. ODP provides a standardized way to extract and replicate data from SAP systems.
What is SAP ODP?
SAP ODP (Operational Data Provisioning) is a framework that provides:
- Standardized Data Extraction: Consistent interface for data replication
- Delta Capabilities: Incremental data updates
- Real-time Processing: Near real-time data availability
- Multiple Formats: Support for various data formats
- Error Handling: Built-in error recovery and monitoring
Prerequisites
Before starting ODP replication, ensure you have:
- ERPL ODP extension installed (subscription required)
- Access to SAP ERP system with ODP enabled
- ODP extractors configured for your data sources
- User account with ODP authorizations
- Sufficient storage space for Parquet files
Understanding ODP Extractors
Types of ODP Extractors
- CDS Views: Core Data Services views as ODP sources
- BW Extractors: Traditional BW extractors
- Function Modules: Custom function modules
- Tables: Direct table access
Stock Data Extractors
Common ODP extractors for stock data:
- MCHB: Material stock by storage location
- MSKA: Material stock by sales area
- MSLB: Material stock by storage location and batch
- MCH1: Material stock by plant
Setting Up ODP Replication
Step 1: Discover Available Extractors
-- List all available ODP contexts
SELECT * FROM sap_odp_show_contexts();
-- List extractors in BW context
SELECT * FROM sap_odp_show('BW');
-- Search for stock-related extractors
SELECT * FROM sap_odp_show('BW')
WHERE data_source LIKE '%STOCK%';
Step 2: Get Extractor Metadata
-- Get detailed information about a specific extractor
SELECT * FROM sap_odp_describe('BW', 'MCHB$F');
This returns information about:
- Available fields
- Key fields
- Data types
- Extraction methods
Step 3: Initialize ODP Connection
-- Connect to SAP ERP system
SELECT odp_connect(
host = 'your-sap-server.com',
system_number = '00',
client = '100',
user = 'your-username',
password = 'your-password'
);
Basic ODP Replication
Simple Full Replication
-- Replicate all stock data from MCHB extractor
SELECT * FROM sap_odp_read_full('BW', 'MCHB$F');
Replication with Filters
-- Replicate stock data for specific materials
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'FULL',
selection => 'MATNR = "MAT001" AND WERKS = "1000"'
);
Delta Replication
Understanding Delta Replication
Delta replication only extracts changed data since the last extraction:
- Initial Load: Full data extraction
- Delta Loads: Only changed data
- Delta Tokens: Track extraction progress
- Error Handling: Manage failed extractions
Initial Delta Setup
-- Initialize delta replication for MCHB extractor
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'FULL',
subscriber_name => 'ERPL'
);
Delta Replication Process
-- Step 1: Check subscription status
SELECT * FROM sap_odp_show_subscriptions()
WHERE subscriber_name = 'ERPL' AND data_source = 'MCHB$F';
-- Step 2: Extract delta data
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'DELTA'
);
-- Step 3: Check extraction cursors
SELECT * FROM sap_odp_show_cursors();
Complete Stock Data Replication Example
Step 1: Setup Replication Environment
-- Create table for stock data
CREATE TABLE stock_data AS
SELECT * FROM sap_odp_read_full('BW', 'MCHB$F') WHERE 1=0;
-- Create subscription tracking table
CREATE TABLE subscription_status (
data_source VARCHAR,
subscriber_name VARCHAR,
last_extraction TIMESTAMP,
status VARCHAR
);
Step 2: Initial Data Load
-- Perform initial full load
INSERT INTO stock_data
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'FULL',
subscriber_name => 'ERPL'
);
-- Record subscription status
INSERT INTO subscription_status VALUES (
'MCHB$F',
'ERPL',
CURRENT_TIMESTAMP,
'ACTIVE'
);
Step 3: Delta Replication Function
-- Create function for delta replication
CREATE OR REPLACE FUNCTION replicate_stock_delta()
RETURNS TABLE (
records_processed INTEGER,
subscription_status VARCHAR
) AS $$
DECLARE
record_count INTEGER;
current_status VARCHAR;
BEGIN
-- Extract delta data
INSERT INTO stock_data
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'DELTA'
);
-- Update subscription status
UPDATE subscription_status
SET last_extraction = CURRENT_TIMESTAMP
WHERE data_source = 'MCHB$F' AND subscriber_name = 'ERPL';
-- Get current status
SELECT status INTO current_status
FROM subscription_status
WHERE data_source = 'MCHB$F' AND subscriber_name = 'ERPL';
-- Return results
SELECT COUNT(*) INTO record_count FROM stock_data;
RETURN QUERY SELECT record_count, current_status;
END;
$$ LANGUAGE plpgsql;
Step 4: Export to Parquet
-- Export stock data to Parquet
COPY (
SELECT
MATNR as material_number,
WERKS as plant,
LGORT as storage_location,
CHARG as batch,
CLABS as unrestricted_stock,
CUMLM as stock_in_transit,
CINSM as stock_in_quality_inspection
FROM stock_data
) TO 'stock_data.parquet' WITH HEADER;
Advanced ODP Features
Error Handling
-- Check for extraction errors
SELECT * FROM sap_odp_show_subscriptions()
WHERE status = 'ERROR';
-- Check extraction cursors
SELECT * FROM sap_odp_show_cursors();
-- Use recovery mode for failed extractions
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'RECOVER'
);
Performance Optimization
-- Use selection parameters to limit data
SELECT * FROM sap_odp_read_full(
'BW',
'MCHB$F',
replication_mode => 'DELTA',
selection => 'MATNR = "MAT001"'
);
-- Preview data before extraction
SELECT * FROM sap_odp_preview('BW', 'MCHB$F', LIMIT => 100);
Data Transformation
-- Transform data during extraction
SELECT
MATNR as material_number,
WERKS as plant,
LGORT as storage_location,
CLABS as stock_quantity,
CASE
WHEN CLABS > 1000 THEN 'HIGH_STOCK'
WHEN CLABS > 100 THEN 'MEDIUM_STOCK'
ELSE 'LOW_STOCK'
END as stock_level
FROM sap_odp_read_full('BW', 'MCHB$F');
Monitoring and Maintenance
Replication Monitoring
-- Check replication status
SELECT
subscriber_name,
data_source,
context,
status,
last_used_date
FROM sap_odp_show_subscriptions();
-- Monitor extraction cursors
SELECT * FROM sap_odp_show_cursors();
Data Quality Checks
-- Validate data completeness
SELECT
COUNT(*) as total_records,
COUNT(DISTINCT MATNR) as unique_materials,
COUNT(DISTINCT WERKS) as unique_plants
FROM stock_data;
-- Check for data anomalies
SELECT
MATNR,
WERKS,
CLABS
FROM stock_data
WHERE CLABS < 0 OR CLABS > 1000000;
Cleanup and Archiving
-- Archive old data
CREATE TABLE stock_data_archive AS
SELECT * FROM stock_data
WHERE extraction_date < CURRENT_DATE - INTERVAL '1 year';
-- Clean up archived data
DELETE FROM stock_data
WHERE extraction_date < CURRENT_DATE - INTERVAL '1 year';
Integration with Cloud Platforms
AWS S3 Integration
-- Export to S3
COPY (
SELECT * FROM stock_data
) TO 's3://your-bucket/stock-data/stock_data.parquet'
WITH HEADER;
Google Cloud Storage
-- Export to GCS
COPY (
SELECT * FROM stock_data
) TO 'gs://your-bucket/stock-data/stock_data.parquet'
WITH HEADER;
Azure Blob Storage
-- Export to Azure Blob
COPY (
SELECT * FROM stock_data
) TO 'az://your-container/stock-data/stock_data.parquet'
WITH HEADER;
Best Practices
1. Delta Replication Strategy
- Use appropriate delta modes (FULL, DELTA, INIT)
- Monitor delta tokens regularly
- Implement error recovery mechanisms
- Schedule regular delta extractions
2. Performance Optimization
- Use parallel extraction where possible
- Implement data compression
- Optimize filter conditions
- Monitor extraction performance
3. Data Quality
- Implement data validation rules
- Monitor data completeness
- Handle missing or invalid data
- Maintain data lineage
4. Security and Compliance
- Use secure connections
- Implement proper authorization
- Log all extractions
- Protect sensitive data
Troubleshooting
Common Issues
- Extraction Failures: Check ODP extractor status
- Delta Token Issues: Verify token validity
- Performance Problems: Optimize extraction parameters
- Data Quality Issues: Implement validation rules
Debugging Tips
-- Enable detailed logging
SET log_level = 'debug';
-- Check extractor status
SELECT * FROM sap_odp_show_subscriptions()
WHERE data_source = 'MCHB$F';
-- Test with minimal data
SELECT * FROM sap_odp_preview('BW', 'MCHB$F', LIMIT => 10);
Next Steps
Now that you understand ODP replication, explore:
- ERP Table Access - Reading SAP ERP tables
- BAPI Function Calls - Calling SAP BAPIs
- BW Query Execution - Working with SAP BW
- SQL Reference - Complete function reference
Additional Resources
- SAP ODP Documentation
- ERPL GitHub Repository
- SAP Community
- Contact Support - Get help with ODP replication