Taming complex cloud integrations with Informatica Cloud
New enterprise technology imperatives continuously capture attention because the perfect universal IT solution is still a long way off. Until that perfection is realized, there will be a need for system integration solutions, and there will be new challenges in implementing them.
The explosive proliferation of cloud solutions is creating challenges for both vendors and enterprises. Integration approaches that worked well for monolithic, on-premises applications are well understood and readily consumable, so re-using them can seem like a good idea. However, what looks easy at a high level may become a massive struggle when design moves to implementation. In the case of cloud adoption, integration APIs that are easily consumed by custom-developed applications can become unwieldy when they are needed in a cloud application that is configured declaratively rather than coded.
Technologies generally evolve to be more flexible, more reliable, and simpler to use. For early adopters, the legacy approach was to build a custom adapter to bridge the gaps between levels of maturity. These custom adapters are labor-intensive to develop and difficult to maintain. A faster approach is to combine pre-built, standards-based tools in order to reduce the complexity of the integration. In this article, I’ll examine one such solution to provide a concrete example that is adaptable to similar challenges.
Case Study: Company X's Workday-to-SAP S/4HANA Integration
Company X is a well-known consumer technology company that needed to synchronize HR data from Workday as a source to SAP for use in automatic workflow routing.
At the 50,000-foot level, integrating data from cloud-based Workday to SAP looks simple. Workday is a mature, enterprise-grade software-as-a-service (SaaS) application that has been serving major multi-nationals since 2008 and has a comprehensive library of SOAP-based APIs. SAP provides so many interfaces that the question is never whether something can be integrated, but which option best fits the needs of the particular solution.
Drop down to the 10,000-foot level and the first thing you notice is that (no surprise) the same data is modeled and formatted differently between the two applications. We decided at the outset to use Informatica Cloud as an integration-platform-as-a-service (iPaaS) solution, and these types of transformations fall right into a mature sweet-spot for Informatica. No real worries … yet.
Stepping into the arena and looking at the two sides of source and target, differences take on a much more distinct outline. Like any good ERP system, SAP has rules for data relationships, as does Workday. The challenge arises when those rules look very different.
APIs tend to follow data structures, so what looked like snapping two large puzzle pieces together from 50,000 feet now looks like a maze from 100 yards away. The fastest way to navigate a maze is with a well-defined map, which helps us get the data in the first time. Then we learn that the return trip with data updates introduces a whole new level of complexity. We still have our map, but when we actually navigate the maze, we learn that it changes on subsequent trips. Sometimes we even have to solve a puzzle blindfolded to get around the next corner.
First-Level Translations are Human
Initial decisions in an integration project are made with a view from the highest level of abstraction to prevent the dreaded "analysis paralysis." In this case, we made the decision to use Informatica Cloud as the iPaaS/ETL solution, Workday's API as the source integration point, and SAP's iDoc format for the target interface.
For some (rare) integration scenarios, once the source, target, and ETL tool are determined, necessary access is exposed to the ETL team, and everyone except the ETL team is done until user acceptance testing (UAT). In most cases, the best way to ensure success is for stakeholders and SMEs to be involved right up to post-production, with levels of participation varying along the way.
It may at first seem sufficient for the source and target teams to provide the ETL team with definitions and let them map the fields accordingly. In practice, the best way to create a mapping from one system to another is for all teams to walk through the lifecycle of the data, from the point where it first enters the system, via the interface that will be used in the integration, all the way to where it is used in the target system.
This process of "living the data" can reveal nuances that may not be obvious to anyone when defining the field mapping. A perfect example is the issue of field lengths in SAP. When data is entered through a UI, the zero-padding necessary to load a 4-digit number into an 8-digit field happens in the background. The same data sent via an iDoc can be rejected as improperly formatted.
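To make the padding issue concrete, here is a minimal sketch in Python of the kind of normalization the UI performs silently. The function name and the choice to reject non-numeric or oversized values are my assumptions for illustration; in the actual integration this logic would live in the mapping expressions, not in Python.

```python
def pad_numeric_field(value: str, width: int) -> str:
    """Left-pad a numeric string with zeros so it fills a fixed-width field.

    Hypothetical helper: the name and the validation rules are illustrative only.
    """
    digits = value.strip()
    if not digits.isdigit():
        raise ValueError(f"expected digits only, got {value!r}")
    if len(digits) > width:
        raise ValueError(f"{digits!r} does not fit in {width} characters")
    return digits.zfill(width)


# A 4-digit number destined for an 8-character field.
padded = pad_numeric_field("1234", 8)
print(padded)  # 00001234
```

Failing loudly on an oversized or non-numeric value is a deliberate choice in this sketch: a rejected record at mapping time is easier to trace than a rejected iDoc.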
Another area where working as a team to walk through the process has a positive impact is mitigating cultural differences. In system integration, "culture" can refer to how developers in different departments work (or in different companies, in the case of partner integrations), as well as terminology differences between countries. An example is when one group described a field as a “number + character,” which was interpreted to mean a VARCHAR (a field that will accept either numbers or letters). What the developer meant was a number that needed to be zero-padded to fit the size of the field. Such confusion will often be recognized during the mapping definition phase, when the whole team walks through the transitions together. Failure to do so can result in many work hours spent re-mapping fields and updating data ingestion processes.
For the integration performed for Company X, the initial pass of field mapping definitions encompassed 36 target fields to be sent using a single iDoc definition and was thought to be complete. The final integration that went into production had 69 fields in various combinations, transmitted using six different iDoc definitions.
An Elegant Solution with Multiple Tools
In theory, Company X's Workday-to-SAP integration could be done entirely within the Informatica Cloud Data Integration system (sometimes referred to as ICS). In practice, the effort to do so would be lengthy, painstaking, and so complex that changes or troubleshooting could only be done in a reasonable amount of time by the original developer — and even that person would probably forget some of the intricacies before long.
The Workday connector available from Informatica is an excellent tool for accessing data using the Workday API from within the data integration platform. The challenge is that the Workday API is designed to support traditional web service integrations, which typically happen through a framework that makes it easy to ignore unwanted data and loop through arrays to find a particular match. In a visually-oriented tool like the Informatica Cloud Mapping Designer, this is like looking for a needle in a field of haystacks.
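For contrast, this is roughly what "ignore unwanted data and loop through arrays" looks like in a code-based integration. The record below is a hypothetical, heavily simplified shape that I made up for illustration; it is not the actual Workday response schema, and the field names are assumptions.

```python
from typing import Optional

# Hypothetical, simplified record; NOT the real Workday response structure.
worker = {
    "worker_id": "1001",
    "personal_data": {"name": "Pat Example", "preferred_language": "en"},
    "positions": [
        {"position_id": "P-1", "primary": False, "organization": "ORG-10"},
        {"position_id": "P-2", "primary": True, "organization": "ORG-20"},
    ],
}


def primary_organization(record: dict) -> Optional[str]:
    """Return the organization of the primary position, or None if there is none."""
    # Everything outside "positions" is simply never touched.
    for position in record.get("positions", []):
        if position.get("primary"):
            return position["organization"]
    return None


print(primary_organization(worker))  # ORG-20
```

In code, the unwanted branches cost nothing. In a visual mapping tool, every one of those branches tends to show up as fields that someone has to scroll past.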
Workday is not unusual in providing a complex response object with its API. Fortunately, there is a way to make this easier with the Informatica Application Integration platform, a.k.a. Informatica Cloud Real Time (ICRT).
With ICRT, APIs are accessed by creating a service connector, which defines actions that can be called on the service based on input parameters, and then returns a response that is defined in the service connector. One of the wizard-driven response formats is "simplified XML," and for many integrations all that needs to be done is to create a service connector and expose it through a process that ICS can use as a source through one of its web service connectors.
In the case of integrating Workday HR data with SAP, much more work is necessary than just exposing the relevant Workday APIs as simplified XML. There are many ways in which the data could be managed and processed in ICRT and shared with ICS.
We first defined two data objects, one each for organizations and employees. The fields within the data objects are named and formatted as they will be used in the iDoc that will load them into SAP. For those 69 fields that are eventually populated in SAP, it may surprise you to learn that they are all variations of only 30 fields derived from the Workday API responses.
Those 30 fields are easily translated to the two data objects from one Workday API (getOrganizations) and one custom report-as-a-service (RaaS) of current employees from Workday for the initial data load. The complicated part is the difference in data hierarchies, where Workday associates positions to employees and SAP considers them to be children of organizations. To manage the sequencing necessary to maintain the hierarchy rules, the ICRT processes are divided into sub-processes, with common steps used in the population of organization data re-used by the employee load solution, ensuring that the appropriate positions are inserted before the employee iDocs are submitted.
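The ordering constraint can be sketched in plain Python. This is not how the ICRT sub-processes are written; it only illustrates the rule they enforce, which is that organizations go first, then the positions that hang off them, then the employees. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Employee:
    # Hypothetical fields; Workday attaches the position to the employee.
    employee_id: str
    position_id: str
    organization_id: str


def plan_load(organization_ids: List[str], employees: List[Employee]) -> List[Tuple[str, str]]:
    """Return (record_type, key) steps ordered organizations, positions, employees."""
    known_orgs = set(organization_ids)
    steps = [("organization", org_id) for org_id in organization_ids]

    seen_positions = set()
    for emp in employees:
        if emp.organization_id not in known_orgs:
            raise ValueError(f"employee {emp.employee_id} references unknown organization")
        # Each position is inserted once, under its organization, before any employee.
        if emp.position_id not in seen_positions:
            seen_positions.add(emp.position_id)
            steps.append(("position", emp.position_id))

    steps.extend(("employee", emp.employee_id) for emp in employees)
    return steps


plan = plan_load(
    ["ORG-10"],
    [Employee("E1", "P-1", "ORG-10"), Employee("E2", "P-1", "ORG-10")],
)
for step in plan:
    print(step)
```

The point of the sketch is that the position records have to be derived from the employee data but loaded ahead of it, which is why the organization steps are shared by both loads.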
Again, it is possible to use one platform, ICRT, to manage the entire integration from end to end. Yet the tooling would make this a very complex process, as a BPEL service would be forced to do double-duty as an ETL tool. Instead, the data objects are output to CSV files that serve as the source in an ICS mapping to transform the data into the SAP target.
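As an illustration of the hand-off, here is a minimal Python sketch of a data object serialized to CSV with a header row. The field names and sample values are hypothetical, and the real files were produced by the ICRT processes rather than by a script like this.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, TextIO


@dataclass
class OrganizationRecord:
    # Hypothetical columns, named as the target mapping would expect them.
    org_id: str
    org_name: str
    parent_org_id: str


def write_csv(records: Iterable[OrganizationRecord], stream: TextIO) -> None:
    """Write the records to a CSV stream, one row per record, with a header."""
    column_names = [f.name for f in fields(OrganizationRecord)]
    writer = csv.DictWriter(stream, fieldnames=column_names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# An in-memory buffer stands in for the file that the mapping would read.
buffer = io.StringIO()
write_csv([OrganizationRecord("00000010", "Finance", "00000001")], buffer)
print(buffer.getvalue())
```

A flat file with target-ready column names keeps the two tools loosely coupled: the process side only has to produce rows, and the mapping side only has to read them.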
Scott Nelson is a Senior Technical Director at Primitive Logic. Scott is a seasoned IT leader and agent of change with over 20 years of experience working with the full delivery life cycle of solutions for system integration, process automation, and human workflow applications. He is also deeply experienced working with enterprises in need of guidance through complex changes, and has a long track record of managing digital transformation, expansion and stabilization initiatives.