In this tutorial, we will discuss the Parse XML activity, which is useful for validating input XML files. With validation enabled, the activity makes sure that only data in a valid format gets processed, so we can treat it as the first level of validation for an XML input. Let's discuss it in detail.
Here I am going to create a project that takes a large XML file containing details for multiple books as input and returns separate XML files for each book.
Parse XML & Render XML BW Process
Input XML File
<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
  </book>
</catalog>
Read File: I used the Read File activity to read the large XML file and pass the XML string to the Parse XML activity. Read File can read the file in either binary or text format. For this example, I am using text format. Then, in the Input tab, configure the path of the input XML file.
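To make the text versus binary distinction concrete outside of BW, here is a minimal Python sketch. It is only an analogy, not how the Read File activity is implemented: the helper names `read_as_text` and `read_as_binary`, the file name `books.xml`, and the UTF-8 encoding are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the large catalog file used in this tutorial.
SAMPLE = '<?xml version="1.0"?><catalog><book id="bk101"/></catalog>'


def read_as_text(path: Path) -> str:
    """Return the decoded file content as a string (analogous to text format)."""
    return path.read_text(encoding="utf-8")  # UTF-8 is an assumption


def read_as_binary(path: Path) -> bytes:
    """Return the raw bytes of the file (analogous to binary format)."""
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    xml_path = Path(tmp) / "books.xml"  # hypothetical file name
    xml_path.write_text(SAMPLE, encoding="utf-8")
    text_content = read_as_text(xml_path)
    binary_content = read_as_binary(xml_path)
```

The text variant yields a string that can be handed straight to an XML parser, which is why the rest of this example sticks with text format.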
Parse XML: The Parse XML activity takes XML as input and returns an XML tree based on an XML Schema Definition (XSD) or a Document Type Definition (DTD), so we need to create an XSD before configuring the activity.
XSD
Once the XSD is created, we can start configuring the Parse XML activity. Because we used text format for Read File, set the input style to Text as well. Also make sure to enable the Validate Output checkbox so that the output of the activity is validated against the schema specified in the Output Editor tab; otherwise the activity will parse any well-formed XML. The reason for configuring this is to confirm that we are processing the expected data, with exactly the fields specified in the schema definition.
Parse XML Input configuration
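The idea behind validating the output can be sketched in Python as follows. This is not real XSD validation and not what BW does internally; it is a hand-rolled check, under the assumption that the schema describes each book as a fixed sequence of the six fields from the sample input. The function name `validate_catalog` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Field names and order taken from the sample input; treating them as a
# strict sequence is an assumption about the (unshown) XSD.
EXPECTED_FIELDS = ["author", "title", "genre", "price", "publish_date", "description"]


def validate_catalog(xml_string: str) -> ET.Element:
    """Parse the string and raise ValueError if the structure is unexpected."""
    root = ET.fromstring(xml_string)
    if root.tag != "catalog":
        raise ValueError(f"unexpected root element: {root.tag}")
    for book in root:
        if book.tag != "book":
            raise ValueError(f"unexpected element under catalog: {book.tag}")
        fields = [child.tag for child in book]
        if fields != EXPECTED_FIELDS:
            raise ValueError(f"book {book.get('id')} has fields {fields}")
    return root


valid_xml = (
    '<catalog><book id="bk101"><author>A</author><title>T</title>'
    "<genre>G</genre><price>1.00</price><publish_date>2000-10-01</publish_date>"
    "<description>D</description></book></catalog>"
)
# Missing four of the six fields, so the check should reject it.
invalid_xml = '<catalog><book id="bk102"><author>A</author><title>T</title></book></catalog>'
```

Calling `validate_catalog(valid_xml)` returns the parsed tree, while `validate_catalog(invalid_xml)` raises a `ValueError`, which mirrors how a validating parse rejects input that does not match the expected fields.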
Now click on the Output Editor tab to set the output format. First, click on the ‘+’ symbol and select ‘XML Element Reference’ from the list box. Then click on the ‘Find’ icon. This opens a ‘Select a Resource’ window that lists the schema we created earlier. Click on the schema, select the element under the tree tab, and click ‘OK’.
Parse XML Output Editor
Note: In some cases the Parse XML activity fails due to namespace issues, because the namespace of the incoming XML does not match the namespace in the XSD. To avoid this error, we can use the expression below in the Input tab for xmlString. Please note that you need to replace the namespace, schema location, and element names with the values from your own project.
concat('<?xml version="1.0" encoding="UTF-8"?><catalog xmlns="http://www.tibco.com/schemas/Sample/Schema.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.tibco.com/schemas/Sample/Schema.xsd XSLTSchema.xsd">', substring-after($Read-File/ns1:ReadActivityOutputTextClass/fileContent/textContent, '<catalog>'))
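To see what this expression does to the string, here is a rough Python equivalent. It is a sketch for illustration only: `substring_after` and `add_namespace` are hypothetical helpers that mimic the XPath `substring-after()` and `concat()` calls, and the sample input is shortened.

```python
import xml.etree.ElementTree as ET

NS = "http://www.tibco.com/schemas/Sample/Schema.xsd"

# Replacement opening tag, copied from the expression in this tutorial.
HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<catalog xmlns="{NS}" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    f'xsi:schemaLocation="{NS} XSLTSchema.xsd">'
)


def substring_after(text: str, marker: str) -> str:
    """Mimic XPath substring-after(): text after the first marker, or '' if absent."""
    _, found, rest = text.partition(marker)
    return rest if found else ""


def add_namespace(xml_text: str) -> str:
    """Swap the plain <catalog> opening tag for the namespaced one."""
    return HEADER + substring_after(xml_text, "<catalog>")


raw = '<?xml version="1.0"?><catalog><book id="bk102"><title>Midnight Rain</title></book></catalog>'
fixed = add_namespace(raw)
root = ET.fromstring(fixed)  # root and its children now carry the schema namespace
```

One caveat worth keeping in mind: the match is on the literal string `<catalog>`, so if the root tag in the file carries attributes or extra whitespace, `substring-after` returns an empty string and the result is no longer well-formed.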
Now you can do a test run and see the result.
Group Activity: I used a Group to iterate over each book element and pass it to the Render XML activity.
Group Activity
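Conceptually, the iterate group behaves like a loop over the book elements of the parsed tree. Below is a minimal Python sketch of that loop; `iterate_books` is a hypothetical name, and the 1-based counter is only an assumption meant to resemble a group index variable.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <book id="bk101"><title>XML Developer's Guide</title></book>
  <book id="bk102"><title>Midnight Rain</title></book>
</catalog>"""


def iterate_books(xml_string: str):
    """Yield (index, book element) pairs, one per book, starting at 1."""
    root = ET.fromstring(xml_string)
    for index, book in enumerate(root.findall("book"), start=1):
        yield index, book


# Each iteration hands a single book to the next step (Render XML in the process).
titles = [(index, book.findtext("title")) for index, book in iterate_books(CATALOG)]
```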
Render XML: As with Parse XML, we can use a schema definition for Render XML to generate the desired output format; that is, we can customize the tag names used in the output XML.
Render XML output schema
As you may have noticed, I have renamed some of the elements and removed the description field. Follow the same steps I explained for the Parse XML activity to set the output schema, this time under the ‘Input Editor’ tab. Then map the elements as needed.
Render XML input mapping
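The mapping can be pictured as a rename table applied to each book. The Python sketch below is an illustration under assumptions, not the BW mapper: the target names and their order are read off the sample output in this tutorial, and `render_book` and `FIELD_MAP` are hypothetical names.

```python
import xml.etree.ElementTree as ET

OUT_NS = "http://www.tibco.com/schemas/Sample/Schema1.xsd"

# (source tag, target tag) pairs in output order; description is left out on purpose.
FIELD_MAP = [
    ("author", "authorname"),
    ("genre", "genre"),
    ("price", "price"),
    ("publish_date", "publish_date"),
    ("title", "titlebook"),
]


def render_book(book: ET.Element) -> str:
    """Build a namespaced <bookname> element from one <book> and serialize it."""
    out = ET.Element(f"{{{OUT_NS}}}bookname")
    for source_tag, target_tag in FIELD_MAP:
        child = ET.SubElement(out, f"{{{OUT_NS}}}{target_tag}")
        child.text = book.findtext(source_tag)
    return ET.tostring(out, encoding="unicode")


book = ET.fromstring(
    '<book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title>'
    "<genre>Fantasy</genre><price>5.95</price><publish_date>2000-12-16</publish_date>"
    "<description>A former architect battles corporate zombies.</description></book>"
)
rendered = render_book(book)
```

Keeping the renames in one table makes it easy to see which input field feeds which output tag, which is essentially what the mapping panel shows graphically.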
When you run the process, the input and output of the Render XML activity look like the following.
Render XML input data
Render XML output data
Write File: In this example, I am writing all the content to the same file, and the output looks like the one below. Depending on your requirement, you can replace the Write File activity with a queue sender or a JDBC palette activity to perform the desired action.
<?xml version="1.0" encoding="UTF-8"?>
<ns0:bookname xmlns:ns0="http://www.tibco.com/schemas/Sample/Schema1.xsd">
  <ns0:authorname>Gambardella, Matthew</ns0:authorname>
  <ns0:genre>Computer</ns0:genre>
  <ns0:price>44.95</ns0:price>
  <ns0:publish_date>2000-10-01</ns0:publish_date>
  <ns0:titlebook>XML Developer's Guide</ns0:titlebook>
</ns0:bookname>
<?xml version="1.0" encoding="UTF-8"?>
<ns0:bookname xmlns:ns0="http://www.tibco.com/schemas/Sample/Schema1.xsd">
  <ns0:authorname>Ralls, Kim</ns0:authorname>
  <ns0:genre>Fantasy</ns0:genre>
  <ns0:price>5.95</ns0:price>
  <ns0:publish_date>2000-12-16</ns0:publish_date>
  <ns0:titlebook>Midnight Rain</ns0:titlebook>
</ns0:bookname>
<?xml version="1.0" encoding="UTF-8"?>
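Because each rendered document starts with its own XML declaration, a single file holding all of them is not one well-formed XML document. If you want one file per book instead, the idea looks roughly like the Python sketch below. It is only an illustration: `write_books` is a hypothetical helper, and naming each file after the book's id attribute is my assumption.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

CATALOG = (
    '<catalog><book id="bk101"><title>XML Developer\'s Guide</title></book>'
    '<book id="bk102"><title>Midnight Rain</title></book></catalog>'
)


def write_books(xml_string: str, out_dir: Path) -> list:
    """Write each book to its own file and return the paths in document order."""
    written = []
    for book in ET.fromstring(xml_string).findall("book"):
        target = out_dir / f"{book.get('id')}.xml"  # assumed naming scheme
        ET.ElementTree(book).write(str(target), encoding="UTF-8", xml_declaration=True)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    paths = write_books(CATALOG, Path(tmp))
    names = [path.name for path in paths]
    # Re-parse each file to confirm it is a well-formed document on its own.
    root_tags = [ET.parse(str(path)).getroot().tag for path in paths]
```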
I hope this tutorial helps you understand how the Parse XML and Render XML activities work. Let me know your feedback and suggestions in the comment section below.