Data engineering teams confront numerous pitfalls when moving SAP HANA data to Snowflake for analytical purposes, including the high cost of SAP analytics solutions, the complexities of data migration, and the possibility of vendor lock-in. This article explores common pain points encountered during the data development lifecycle and offers insights into the SAP HANA to Snowflake migration challenges that arise when developing SAP analytical capabilities.
Requirements Gathering
This phase, where business objectives meet technical reality, often reveals a chasm between the two. Although SAP ERP systems are rich in data, extracting meaningful requirements from their cryptic structures can be challenging. The task becomes even more daunting when business and technical teams struggle to communicate effectively.
Communication is challenging for many reasons: conflicting terminology, ambiguous business requirements, misaligned priorities, and a lack of shared knowledge between the teams. Business teams may be clear on the broad goals yet find it hard to articulate precise technical needs. The result is conflicting rules, mismatched objectives, and erroneous end data models.
Consider the case of collecting financial data. The BSEG (Accounting Document Segment) and ACDOCA (Universal Journal) tables within SAP HANA are commonly used for developing financial data models. While aggregate and detailed information can be obtained from BSEG, the relevant references for financial transactions must be retrieved from ACDOCA, depending on the analytical requirements. Attempting to replicate the logic of SAP CDS views to develop data models within external systems such as Snowflake requires SME support and constant intervention. Addressing complexities such as these, and the many table relationships in SAP analytics, calls for a deep understanding of the system's intricacies.
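As a rough illustration, the sketch below shows the kind of join logic a CDS view might encapsulate, rewritten against BSEG and ACDOCA replicas in Snowflake. The database and schema names are placeholders, and the join columns, while standard SAP fields, should be verified against the replicated model.

```python
# Hypothetical sketch: joining replicated BSEG line items to ACDOCA journal
# entries in Snowflake. SAP_REPLICA.FI is a placeholder location; the SAP
# column names are standard but should be checked against your replica.
FINANCIAL_LINE_ITEMS_SQL = """
SELECT
    b.BUKRS AS company_code,
    b.BELNR AS document_number,
    b.GJAHR AS fiscal_year,
    b.BUZEI AS line_item,
    b.DMBTR AS amount_local_currency,
    a.RACCT AS gl_account,
    a.PRCTR AS profit_center
FROM SAP_REPLICA.FI.BSEG   AS b
JOIN SAP_REPLICA.FI.ACDOCA AS a
  ON  a.RBUKRS = b.BUKRS
  AND a.BELNR  = b.BELNR
  AND a.GJAHR  = b.GJAHR
  AND a.BUZEI  = b.BUZEI   -- some models join on ACDOCA.DOCLN instead
"""
```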
Data Modeling
A good run in the Requirements Gathering phase is not enough to guarantee smooth sailing during data modeling. The data modeling phase of an SAP HANA to Snowflake migration has its fair share of hurdles. A crucial part of the process is de-normalizing SAP's normalized transactional data for Snowflake's analytical engine. In doing so, one must not only pay attention to SAP's complex data relationships and business logic but also carefully map data types so that data can be aggregated and summarized appropriately.
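For the data type concern specifically, teams often start from an explicit ABAP-to-Snowflake mapping. The sketch below is one such mapping with assumed target precisions, not a definitive standard; amount (CURR) and quantity (QUAN) fields in particular need the source dictionary length and decimals to be mapped correctly.

```python
# Minimal sketch of an ABAP-to-Snowflake type mapping used when landing
# replicated tables. Target precisions are illustrative assumptions.
SAP_TO_SNOWFLAKE_TYPES = {
    "CHAR": "VARCHAR",        # fixed-length character, trailing blanks trimmed
    "NUMC": "VARCHAR",        # numeric text; keep as text to preserve leading zeros
    "DATS": "DATE",           # stored as 'YYYYMMDD' text in SAP, cast on load
    "TIMS": "TIME",           # stored as 'HHMMSS' text in SAP
    "DEC":  "NUMBER(38, 6)",
    "CURR": "NUMBER(23, 2)",  # pair with the currency key (CUKY) column
    "QUAN": "NUMBER(31, 14)", # pair with the unit of measure (UNIT) column
    "INT4": "INTEGER",
    "RAW":  "BINARY",
}
```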
Take, for instance, decoding SAP-specific long texts from the multipurpose cluster table STXL, which typically requires function modules like READ_TEXT, or replicating complex hierarchical structures (e.g., BOMs) and implementing change data capture to maintain real-time consistency. Operations like these require custom conversions and specialized knowledge. Two other areas that demand careful consideration during modeling are maintaining strong security and governance requirements and ensuring data quality through cleansing and validation.
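One hedged approach to the long-text problem is to call READ_TEXT during extraction (for example through SAP's PyRFC library) rather than trying to decode the STXL cluster payload inside Snowflake. In the sketch below, the connection parameters and the text OBJECT, ID, and NAME values are placeholders.

```python
# Sketch: pulling a long text through READ_TEXT during extraction.
# Connection details and text object/ID/name values are placeholders.
from pyrfc import Connection

conn = Connection(ashost="sap-host", sysnr="00", client="100",
                  user="EXTRACT_USER", passwd="********")

result = conn.call(
    "READ_TEXT",
    ID="F01",            # text ID (placeholder)
    LANGUAGE="E",
    NAME="4500000001",   # text name, e.g. a purchasing document number
    OBJECT="EKKO",       # text object (placeholder)
)

long_text = "\n".join(line["TDLINE"] for line in result["LINES"])
print(long_text)
```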
Overcoming these data modeling complexities is key to unlocking the full potential of SAP analytical solutions within the Snowflake environment, leading to faster, more insightful analytics and data-driven decision-making.
Technical Development
With the modeling specifications in hand, the development phase commences, during which the actual data pipelines are constructed. This stage emphasizes the practical implementation of those pipelines, a process that presents its own distinctive problems:
Skill Gaps
The skills essential to construct SAP analytics solutions within Snowflake are specialized. Teams may lack the requisite skills or require more training, causing delays.
Performance and Data Volume
SAP ERP systems produce massive amounts of data. It is not easy to extract, transform, and load (ETL) this data into Snowflake efficiently while maintaining performance.
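A common mitigation is to extract in bounded chunks and bulk-load, rather than streaming row by row. The sketch below, with assumed host, credentials, schema, and chunk size, pulls one fiscal year of ACDOCA from HANA into gzipped CSV parts that can then be PUT to a Snowflake stage and loaded with COPY INTO.

```python
# Illustrative sketch of chunked extraction from HANA into gzipped CSV files.
# Host, credentials, schema, table, and chunk size are assumptions to adapt.
import csv
import gzip

from hdbcli import dbapi  # SAP HANA Python client

CHUNK_ROWS = 500_000

conn = dbapi.connect(address="hana-host", port=30015,
                     user="EXTRACT_USER", password="********")
cursor = conn.cursor()
cursor.execute("SELECT * FROM SAPHANADB.ACDOCA WHERE GJAHR = '2024'")

part = 0
while True:
    rows = cursor.fetchmany(CHUNK_ROWS)
    if not rows:
        break
    with gzip.open(f"acdoca_part_{part:04d}.csv.gz", "wt", newline="") as fh:
        csv.writer(fh).writerows(rows)   # each part becomes one staged file
    part += 1

cursor.close()
conn.close()
```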
Error Handling and Debugging
Developing robust pipelines that handle errors gracefully and provide meaningful diagnostics is critical but challenging, especially when dealing with the idiosyncrasies of SAP data.
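At a minimum, every pipeline step benefits from retries and diagnostics that record which step failed, on which attempt, and with what inputs. A minimal sketch of that pattern follows; the step function passed in stands for whatever extraction or load call the pipeline actually makes.

```python
# Minimal retry-with-diagnostics pattern for a pipeline step.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sap_to_snowflake")

def run_with_retries(step, *args, attempts=3, backoff_seconds=60):
    """Run a pipeline step, logging full diagnostics on each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return step(*args)
        except Exception:
            log.exception("step=%s attempt=%d/%d failed with args=%r",
                          step.__name__, attempt, attempts, args)
            if attempt == attempts:
                raise                      # surface the error after the last try
            time.sleep(backoff_seconds * attempt)
```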
“Conversion Exit” Pitfalls
Conversion exits are often used in SAP to manage data format conversions. Inexperienced data engineering teams resort to copying SAP conversion exit logic into their ETL/ELT pipelines, increasing the possibility of failures.
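A frequently re-implemented example is the ALPHA conversion exit, which zero-pads purely numeric keys (document numbers, old-style material numbers) on input and strips the zeros on output. The sketch below shows both directions in Python; in Snowflake SQL the same effect is usually achieved with LPAD and LTRIM. Getting such logic subtly wrong is one way joins on document keys can start to fail.

```python
# Sketch of the ALPHA conversion exit behaviour, re-implemented in Python.
def alpha_input(value: str, length: int = 10) -> str:
    """Internal format: zero-pad purely numeric values to the field length."""
    value = value.strip()
    return value.zfill(length) if value.isdigit() else value

def alpha_output(value: str) -> str:
    """External format: strip leading zeros from numeric values."""
    stripped = value.lstrip("0")
    return stripped if stripped else "0"

assert alpha_input("4711") == "0000004711"
assert alpha_output("0000004711") == "4711"
```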
Replicating multilevel hierarchical models, such as Profit Center or BOM structures, within an external data warehouse like Snowflake poses a different pitfall. Accurately recreating these intricate structures demands significant data engineering expertise.
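When the hierarchy can be landed as a simple parent-child table, a recursive CTE is one way to flatten it in Snowflake. The table and column names below are placeholders for whatever shape the replicated hierarchy actually takes.

```python
# Hedged sketch: flattening a parent-child hierarchy with a recursive CTE.
# PROFIT_CENTER_HIER and its columns are placeholder names.
FLATTEN_HIERARCHY_SQL = """
WITH RECURSIVE flattened (node_id, parent_id, path, depth) AS (
    SELECT node_id,
           parent_id,
           node_id::VARCHAR AS path,
           1 AS depth
    FROM   PROFIT_CENTER_HIER
    WHERE  parent_id IS NULL              -- root nodes
    UNION ALL
    SELECT h.node_id,
           h.parent_id,
           f.path || ' > ' || h.node_id,
           f.depth + 1
    FROM   PROFIT_CENTER_HIER h
    JOIN   flattened f ON h.parent_id = f.node_id
)
SELECT * FROM flattened
"""
```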
Testing
The stability and integrity of the deployed data pipelines, particularly when supporting SAP analytics solutions, depend on both technical and business testing. While performance improvement is critical, our present focus is on another component of data migration: validating business logic. This is often difficult due to factors such as:
Data Variability
The considerable variability and context-specificity of SAP ERP data make it difficult to compile extensive test cases that cover all scenarios.
Automated Testing
Implementing automated testing frameworks that can handle the intricacies of SAP data and the transformations applied in Snowflake is difficult but essential for efficiency, particularly when the downstream system relies heavily on the upstream application for data and metadata integrity.
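Even a thin automated layer helps, for example a parameterized reconciliation test that compares row counts between source and target for every replicated table. The sketch below uses pytest; the two count helpers are hypothetical wrappers around the HANA and Snowflake connections the pipeline already maintains, and the table list is an assumption.

```python
# Minimal pytest-style reconciliation check between source and target.
import pytest

# Hypothetical helpers wrapping existing HANA and Snowflake connections.
from my_pipeline.connections import hana_count, snowflake_count

RECONCILED_TABLES = ["ACDOCA", "BSEG", "EKPO"]

@pytest.mark.parametrize("table", RECONCILED_TABLES)
def test_row_counts_match(table):
    source = hana_count(table)
    target = snowflake_count(table)
    assert source == target, f"{table}: HANA has {source} rows, Snowflake has {target}"
```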
Misaligned Expectations
Technical solutions may not meet the business's expectations, forcing rework and modifications. It is essential to involve business users in User Acceptance Testing (UAT); however, this can be difficult due to their limited availability and the need for consistency in the analytical models.
For instance, in purchasing scenarios, business validation is crucial. A single purchase order with multiple items delivered in parts can make it challenging to pinpoint which shipment was delayed if the actual goods receipt doesn't align with the vendor's confirmed schedule. If business testing of the data model doesn't cover this scenario, simply cross joining the available information might lead to inaccurate results.
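One way to frame that validation is an item-level comparison between confirmed schedule lines and actual goods receipts rather than a cross join. The sketch below uses the standard EKET (schedule lines) and EKBE (purchase order history) tables under assumed replica names; note that unambiguously matching a specific receipt to a specific schedule line usually needs additional business rules, which is precisely what business testing should surface.

```python
# Hedged sketch: flag late or missing goods receipts at purchase-order item
# level. SAP_REPLICA.MM is a placeholder location; verify columns per replica.
DELAYED_RECEIPTS_SQL = """
SELECT
    s.EBELN  AS purchase_order,
    s.EBELP  AS item,
    s.ETENR  AS schedule_line,
    s.EINDT  AS confirmed_delivery_date,
    r.BUDAT  AS goods_receipt_date,
    DATEDIFF('day', s.EINDT, r.BUDAT) AS delay_days
FROM SAP_REPLICA.MM.EKET AS s
LEFT JOIN SAP_REPLICA.MM.EKBE AS r
  ON  r.EBELN = s.EBELN
  AND r.EBELP = s.EBELP
  AND r.VGABE = '1'            -- goods receipt history records only
WHERE r.BUDAT IS NULL OR r.BUDAT > s.EINDT
"""
```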
In the case of technical testing, situations such as primary keys holding NULL values and leading to incorrect joins can be prevented by incorporating data quality checks, using automation scripts designed to perform data validation.
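A check of that kind can be as simple as generating a NULL-count query per table from its declared key columns and failing the load when the result is non-zero; a minimal sketch follows.

```python
# Minimal sketch: build a query that counts rows whose primary-key columns
# contain NULLs; the load should stop if the count is anything but zero.
def null_pk_count_sql(table: str, key_columns: list[str]) -> str:
    predicate = " OR ".join(f"{col} IS NULL" for col in key_columns)
    return f"SELECT COUNT(*) FROM {table} WHERE {predicate}"

# Example (assumed staging table and its SAP key columns):
# null_pk_count_sql("STAGING.BSEG", ["BUKRS", "BELNR", "GJAHR", "BUZEI"])
```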
Rework
Despite best efforts, rework remains a reality in data development, and it is common to hit one of these hurdles:
Continuous iterations can lead to iteration fatigue among team members, impacting morale and productivity.
Rework consumes additional time and resources. This resource drain often causes project delays and cost overruns.
Keeping documentation and knowledge bases up to date with ongoing changes is challenging yet crucial for long-term project success.
Cases that necessitate the addition of new business rules or configurations to capture details or variations in the business process are missed.
Incorrect logic is implemented with limited knowledge of the business processes.
Quick fixes are applied to incoming data by changing the loading strategy or altering the table's primary keys.
Next Steps:
Migrating from SAP HANA to Snowflake involves a number of challenges, and working with qualified professionals is key to success. But is there a faster or simpler way? Learn about the advantages and disadvantages of the classic "lift and shift" method.
Visit us at www.indigoChart.com or drop us a line at hello@indigochart.com