Pass a String Longer than 64 KB from Java to RPG with PCML

Article ID: 53274

Q: Thank you for the articles about parsing XML from RPG using the Expat tool. I'd like to write an RPG program that uses Program Call Markup Language (PCML) to receive a large XML string as a parameter from a Java program. The RPG program should parse the XML string and return a response, also in XML. Unfortunately, RPG and PCML appear to be limited to 64 KB for regular strings, and only 16 KB for Unicode strings. I don't want to write the data to a file, because this is for a realtime application, but the XML document is often longer than 64 KB. Is there an API that will help me, or am I out of luck?

A: RPG's string limitations can be a royal pain. However, you're not completely out of luck! In this article, I demonstrate how RPG can work with parameters that are megabytes long, and how the parameter data can be parsed by Expat and returned to a Java program as a new XML document.

Variables Are an Illusion

RPG's alphanumeric variables are limited to a maximum of 64 KB. The UCS2 (Unicode) variables are limited to 16 K characters (16,383 characters, or 32,766 bytes). Nothing I say in this article will change that. However, variables are an illusion.

When the computer is running a compiled program, it knows nothing of variables. Variables are just a tool designed to make the programmer's job easier. From the computer's perspective, they don't really exist. All the computer knows about is memory. When your program is compiled, the compiler translates your variable names into references to memory addresses. These memory addresses are where the actual work gets done.

The system is not restricted to accessing only 64 KB of memory at a time. It can access 16 MB at a time. Because of this, you can handle longer parameters in your program by declaring a variable based on a pointer and changing the location that the pointer points to.

Pointer Basics

A pointer is a variable designed to hold the address of a byte in the system's memory. Just as a packed field stores a number, and an alphanumeric field stores letters, a pointer stores memory addresses.

When a variable is declared to be "based on" a pointer, it means that the variable accesses the memory that the pointer points to. If you change the address stored in the pointer, the spot in memory that you're accessing is changed.

Parameters Are Shared Memory

How do programs pass variables to one another as parameters? They do so by sharing memory. The caller tells the callee where each parameter is located in memory. In other words, parameters are implicitly based on a pointer, and that pointer is supplied by the calling program!

If your program wants to receive a 500,000-character UCS2 field (which is 1,000,000 bytes of memory, because each UCS2 character occupies two bytes) it can declare a 16383C parameter. Then it can declare a 10000C field based on a pointer. If that pointer is pointing to the address of the parameter, it can access the first 10,000 characters (or 20,000 bytes) of that UCS2 parameter. If you change the pointer to point 20,000 bytes later in memory, it accesses the next 10,000 characters of the parameter. You can repeat that until you've accessed the entire 1,000,000 bytes passed from the Java application.
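The pointer arithmetic in the preceding paragraph is easier to see in a small simulation. The following is a minimal Java sketch, not part of the sample program. The class name WindowedView, the readAll method, and the 10,000-character window size are illustrative assumptions. A ByteBuffer stands in for the memory shared by the caller, and a byte offset plays the role of the RPG pointer.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: simulates a 10000C field based on a
// pointer that is moved across a much larger UCS2 parameter.
class WindowedView {
    static final int WINDOW_CHARS = 10_000;  // size of the based field, in characters
    static final int BYTES_PER_CHAR = 2;     // each UCS2 character occupies two bytes

    // Reads totalChars characters from param, one window at a time.
    public static String readAll(ByteBuffer param, int totalChars) {
        StringBuilder out = new StringBuilder(totalChars);
        int offsetBytes = 0;                 // plays the role of the pointer
        int remaining = totalChars;
        while (remaining > 0) {
            int chunk = Math.min(WINDOW_CHARS, remaining);
            // View only the bytes covered by the current window
            ByteBuffer window = param.duplicate();
            window.position(offsetBytes);
            window.limit(offsetBytes + chunk * BYTES_PER_CHAR);
            out.append(window.asCharBuffer());
            // Move the "pointer" to the next window
            offsetBytes += chunk * BYTES_PER_CHAR;
            remaining -= chunk;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build a 25,000-character test parameter (50,000 bytes)
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 25_000; i++) {
            sb.append((char) ('A' + i % 26));
        }
        String original = sb.toString();
        ByteBuffer param = ByteBuffer.allocate(original.length() * BYTES_PER_CHAR);
        param.asCharBuffer().put(original);

        String copy = readAll(param, original.length());
        System.out.println("Round trip matches: " + copy.equals(original));
    }
}
```

Each pass through the loop corresponds to one change of the RPG pointer: the window is the same size every time, and only its starting address moves.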

An Example of Long Parameters

For the sake of example, let's say that I have a Java program that wants to get details for a list of items that my company makes. The Java program does so by creating an XML document containing the item numbers that it wants the details for. Here's a sample of what that document might look like:

<?xml version="1.0" encoding="UTF-16"?>
<request>
<detail id="00020" />
<detail id="00021" />
<detail id="00022" />
<detail id="00023" />
   .
   .
<detail id="91015" />
<detail id="91016" />
</request>

On my system, there are more than 4,800 different items in the list, and the total size of the XML data is 221,802 bytes. As my company expands its business, the number of items might grow, so I want to be able to handle at least twice as many items.
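For illustration, a request document of this shape could be assembled on the Java side roughly as follows. This is only a sketch under stated assumptions: the RequestBuilder class and buildRequest method are hypothetical names, and the item numbers are assumed to contain no characters that need XML escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the <request> document from a list of item numbers.
class RequestBuilder {
    public static String buildRequest(List<String> itemIds) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n");
        xml.append("<request>\n");
        for (String id : itemIds) {
            // Assumes ids are plain digits, so no escaping is done here
            xml.append("<detail id=\"").append(id).append("\" />\n");
        }
        xml.append("</request>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        String contents = buildRequest(Arrays.asList("00020", "00021", "00022"));
        System.out.print(contents);
        // A Java String counts characters; as UCS2 each one occupies two bytes
        System.out.println("characters: " + contents.length()
                + ", bytes as UCS2: " + contents.length() * 2);
    }
}
```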

The Java program wants to use PCML to pass this XML document to an RPG program, and that RPG program is to look up each item number and supply the details. For the sake of this demonstration, the only detail it supplies is the product description, and the response XML document looks like this:

<?xml version="1.0" encoding="UTF-16"?>
<response>
<detail id="00020" desc="SAUSAGE CHEESE GIFT BOX A" />
<detail id="00021" desc="SAUSAGE CHEESE GIFT BOX B" />
<detail id="00022" desc="SAUSAGE CHEESE GIFT BOX C" />
<detail id="00023" desc="SAUSAGE CHEESE GIFT BOX D" />
  .
  .
<detail id="91015" desc="BRATWURST TYPE A" />
<detail id="91016" desc="BRATWURST TYPE B" />
</response>

When all items have their descriptions filled in, this response document is more than 500,000 bytes long. Again, I want to provide room for the company to grow, so I want to allow at least 1,000,000 bytes for this parameter.

My RPG program should receive the request document, use the Expat XML parser to get the product information from it, then use pointer logic to create the response document and pass it back to the Java program.

The RPG Program's Parameter List

The RPG program that receives the XML request and returns the XML response has the following parameter list:

     D GIVEDESC        PR                  ExtPgm('GIVEDESC')
     D   Request                           likeds(XmlParm_t)
     D   Response                          likeds(XmlParm_t)
     D GIVEDESC        PI
     D   Request                           likeds(XmlParm_t)
     D   Response                          likeds(XmlParm_t)

     D XmlParm_t       ds                  qualified
     D   Len                         10I 0
     D   Data                     16383C

     D MAXRESP         c                    500000

As you can see, this program receives two parameters. The first is named Request, and the second is named Response. These parameters are declared to be like the XmlParm_t data structure, and that data structure contains two subfields, a length and a data subfield. I've also declared a named constant, MAXRESP, to represent the maximum length (in characters) that the RPG program returns in the Response parameter.

The Data fields are declared to be only 16383C long, and at first glance you might think that's the most data that can be stored in them. However, in reality, this program is called by the Java program. The Java program reserves enough memory for each field so that each one can store 1,000,000 bytes. Because parameters are passed by sharing memory between the two programs, there are 1,000,000 bytes of data in each of those two parameters.

Remember, the parameters don't really exist. Variables are an illusion! Java is passing a pointer that points to 1,000,000 bytes of memory. Because the RPG parameter shares that memory, the 16383C parameter is viewing the first 32,766 bytes of that place in memory. If I use pointer logic to view the bytes beyond those 32,766 bytes, I can see more of the parameter that Java passed to me.

The MAXRESP constant is there to keep track of how many characters I can actually use for each parameter. I check it later in my code to ensure that I don't go beyond the memory that Java has provided.

In my sample program, I use a Prototype (PR) and Procedure Interface (PI) instead of the more traditional *ENTRY PLIST to receive the parameters. I prefer to use that technique because it's more flexible and doesn't require any fixed-format RPG code. However, if you want to use *ENTRY PLIST instead, you can. Here's how you code it:

     D XmlParm_t       ds                  qualified
     D   Len                         10I 0
     D   Data                     16383C

     D MAXRESP         c                    500000

     D Request         ds                  likeds(XmlParm_t)
     D Response        ds                  likeds(XmlParm_t)

     C     *ENTRY        PLIST
     C                   PARM                    Request
     C                   PARM                    Response

The result of this is identical to that of the PR/PI method.

Parsing the Parameter with Expat

As with any program that uses Expat, I have to create an XML Parser object and then feed the XML data into Expat so it can parse it. Here's the code to do that:

         parser = XML_ParserCreate(XML_ENC_UTF16);
         if (parser = *NULL);
            ErrMsg = 'No memory to create XML parser!';
         endif;

         if ( %len(ErrMsg) = 0 );

            XML_SetStartElementHandler(parser: %paddr(start));

            if (XML_Parse( parser
                         : %addr(Request.data)
                         : Request.len * 2
                         : 1
                         ) = XML_STATUS_ERROR );
                 ErrMsg = 'Parse error at line '
                        + %char(XML_GetCurrentLineNumber(parser)) + ': '
                        + %str(XML_ErrorString(XML_GetErrorCode(parser)));
            endif;

            XML_ParserFree(parser);

         endif;

Explaining how Expat works is beyond this article's scope. If you're unfamiliar with Expat, please see the following article:
http://www.SystemiNetwork.com/article.cfm?id=53061

I'd like to point out the second and third parameters to XML_Parse() in the preceding example. The second parameter tells Expat the address in memory of the XML request; this is the address of the data portion of the Request parameter. Again, because this address is actually shared from the Java program, Expat views the Java program's memory! The third parameter tells Expat the length in bytes (the character count multiplied by two), and because that length can exceed the 16,383 characters of my RPG variable, Expat reads beyond the end of the variable to get the whole parameter. I don't even have to code any special logic!

Expat parses the XML document and calls my start() subprocedure (which is the Start Element Handler that I'm using in this sample program). The handler looks for <detail> XML tags that contain a product ID, and it loads them into an array of item numbers.

I don't include the code for the start() subprocedure in this article, though it is included in the code download, in case you're interested in seeing how it works. For now, suffice it to say that it loads an array called "items" with all the item numbers in the XML document, and it keeps track of the total number of items in the "num_items" variable.
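To give a rough idea of what a start element handler does, here is a Java analog that uses the SAX parser bundled with the JDK. It is not the RPG start() subprocedure from the code download. The StartHandlerSketch class and collectItems method are hypothetical names, and the sketch only assumes the request layout shown earlier.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical Java analog of a start element handler that collects item ids.
class StartHandlerSketch {
    public static List<String> collectItems(String xml) {
        final List<String> items = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Called once for every opening tag in the document
                if ("detail".equals(qName)) {
                    String id = attrs.getValue("id");
                    if (id != null) {
                        items.add(id);
                    }
                }
            }
        };
        try {
            // A character stream is used, so the parser reads the String as-is
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Request could not be parsed", e);
        }
        return items;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<request><detail id=\"00020\" /><detail id=\"00021\" /></request>";
        System.out.println(collectItems(xml));
    }
}
```

The list returned here corresponds to the "items" array, and its size corresponds to "num_items".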

Response Can Be an Error Document, if Needed

When the XML_Parse() routine from Expat finishes running, that entire array should be loaded. Immediately after calling XML_Parse(), I check the array and make sure that all is well. If anything is wrong, I create an XML document containing the error, and I return that to the Java program in lieu of the product details:

         if (%len(errMsg)=0 and num_items<1);
            ErrMsg = 'No items found in request!';
         endif;

         if (%len(errMsg)<>0);
            Response.data = %ucs2('<?xml version="1.0" encoding="UTF-16"?>'
                                + '<response>'
                                +   '<error>' + errMsg + '</error>'
                                + '</response>');
            Response.len = %len(%trimr(Response.data));
            *inlr = *on;
            return;
         endif;

If All Is Well, Create the Response

The preceding code ends the program (and returns control to the Java program) if any errors were found. If I make it beyond that, all is well and I have a list of items for which to return the description. The RPG program loops through that list of items, chains to the Item Master File (ITMMAST) to get the description, and then writes it to the response document. After that response document has been filled out, the program ends and returns control to the Java program:

         Response.len = 0;
         addResponse('<?xml version="1.0" encoding="utf-16"?>' + CRLF);
         addResponse('<response>');

         for x = 1 to num_items;
            chain (item(x)) ITMMAST;
            if %found;
               addResponse('<detail id="' + %char(item(x)) + '"'
                          +' desc="' + %trim(Descr) + '" />'
                          + CRLF );
            else;
               addResponse('<detail id="' + %char(item(x)) + '"'
                          +' error="No such item!" />'
                          + CRLF );
            endif;
         endfor;

         addResponse('</response>');
         *inlr = *on;
         return;

The preceding code creates the response XML document. For every item, it chains to the ITMMAST file to get the description. If the item isn't found, it tells Java by setting an "error" attribute on the <detail> tag. When an item is found, it sets the description in the "desc" attribute of the same tag.

The tricky part about sending back the response is that it's necessary to include more than 500,000 bytes of response data in the parameter. Ordinary RPG string manipulation can't do that, so I have to use pointer logic.

To keep the code relatively easy to read, I decided to write a subprocedure to do the pointer logic that adds more data to the end of the response. I called it the addResponse() subprocedure. It's simpler than you might think. Here's the code:

     P AddResponse     B
     D AddResponse     PI
     D   text                     16383A   varying const options(*varsize)

     D data            s          16383C   based(p_data)
     D newlen          s             10I 0

      /free

           newlen = Response.len + %len(text);
           if (newlen > MAXRESP);
               return;
           endif;

           p_data = %addr(Response.data) + (Response.len * 2);
           %subst(data:1:%len(text)) = %ucs2(text);
           Response.len = newlen;

      /end-free
     P                 E

The Len subfield of the Response parameter keeps track of how many characters have been added to the XML response parameter. The first thing that addResponse() needs to do is verify that it doesn't write past the 1,000,000-byte mark (the 500,000 UCS2 characters that MAXRESP allows), so it adds the length of the new data to the previous length of the field and verifies that the result is not longer than the MAXRESP constant. If it is longer, it quits the subprocedure to make sure that we don't write too far into the system's memory.

The data variable is a UCS2 field based on a pointer named p_data. The data variable can access up to 16,383 characters (or 32,766 bytes of memory) at a time. I set the pointer to point to the next available spot in the response parameter by retrieving the address of the parameter's start and adding the length (in bytes) of the data added previously. The result is the next available space in the parameter.

Next, I use the %SUBST() and %UCS2() built-in functions (BIFs) to assign the new text to the appropriate spot in the response document and convert it to Unicode.

Finally, I set the new length of the response document so that subsequent calls don't overwrite the data that I just wrote.
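The same append logic can be simulated in Java to make the offsets visible. This is a sketch only, not a translation of the sample program. The ResponseAppender class is a hypothetical name, the ByteBuffer stands in for the memory that the caller provides, and returning a boolean (where the RPG subprocedure returns silently) is a deliberate variation so that a caller can see when data was rejected.

```java
import java.nio.ByteBuffer;

// Hypothetical simulation of addResponse(): appends UCS2 text at a byte
// offset computed from the number of characters written so far.
class ResponseAppender {
    public static final int MAXRESP = 500_000;   // maximum characters, as in the RPG constant

    private final ByteBuffer data = ByteBuffer.allocate(MAXRESP * 2);
    private int len = 0;                         // plays the role of Response.len

    public boolean addResponse(String text) {
        int newLen = len + text.length();
        if (newLen > MAXRESP) {
            return false;                        // would write past the reserved memory
        }
        int offsetBytes = len * 2;               // next free byte, like p_data
        for (int i = 0; i < text.length(); i++) {
            data.putChar(offsetBytes + i * 2, text.charAt(i));
        }
        len = newLen;                            // later calls append after this text
        return true;
    }

    public int length() {
        return len;
    }

    public String contents() {
        ByteBuffer view = data.duplicate();
        view.position(0);
        view.limit(len * 2);
        return view.asCharBuffer().toString();
    }

    public static void main(String[] args) {
        ResponseAppender response = new ResponseAppender();
        response.addResponse("<response>");
        response.addResponse("<detail id=\"00020\" desc=\"SAUSAGE CHEESE GIFT BOX A\" />");
        response.addResponse("</response>");
        System.out.println(response.contents());
        System.out.println("characters used: " + response.length());
    }
}
```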

Creating the PCML Document

In order for Java to call the RPG program, I need a PCML document. Starting in V5R2, the RPG compiler can create that document for you when you compile your program. For example, you can use the following command to compile the sample code from this article:

CRTBNDRPG PGM(GIVEDESC) SRCFILE(xxx/QRPGLESRC) DBGVIEW(*LIST) +
          PGMINFO(*PCML) INFOSTMF('/tmp/GIVEDESC.pcml')

When I run this command to compile my program, the compiler generates the following data in the /tmp/GIVEDESC.pcml file:

<pcml version="4.0">
   <!-- RPG program: GIVEDESC  -->
   <!-- created: 2006-09-28-08.48.56 -->
   <!-- source: mylib/QRPGLESRC(GIVEDESC) -->
   <!-- 412 -->
   <struct name="XMLPARM_T">
      <data name="LEN" type="int" length="4" precision="31" usage="inherit" />
      <data name="DATA" type="char" length="16383" chartype="twobyte" ccsid="1200" usage="inherit" />
   </struct>
   <!-- 6 -->
   <program name="GIVEDESC" path="/QSYS.LIB/mylib.LIB/GIVEDESC.PGM">
      <data name="REQUEST" type="struct" struct="XMLPARM_T" usage="inputoutput" />
      <data name="RESPONSE" type="struct" struct="XMLPARM_T" usage="inputoutput" />
   </program>
</pcml>

The preceding PCML document has a problem. The RPG compiler still thinks that our REQUEST and RESPONSE data can hold only 16,383 characters. Furthermore, it doesn't know that the REQUEST document is to be used only for input, and the RESPONSE document is to be used only for output. To solve this problem, I open the GIVEDESC.pcml document with a text editor. On the System i, you can use the following command to edit the PCML code:

EDTF '/tmp/GIVEDESC.pcml'

The following shows the PCML as it should be for Java to call this program correctly. I changed three things: the length of the DATA field is now 500000, the usage of REQUEST is now input, and the usage of RESPONSE is now output:

<pcml version="4.0">
   <!-- RPG program: GIVEDESC  -->
   <!-- created: 2006-09-28-08.48.56 -->
   <!-- source: mylib/QRPGLESRC(GIVEDESC) -->
   <!-- 412 -->
   <struct name="XMLPARM_T">
      <data name="LEN" type="int" length="4" precision="31" usage="inherit" />
      <data name="DATA" type="char" length="500000" chartype="twobyte" ccsid="1200" usage="inherit" />
   </struct>
   <!-- 6 -->
   <program name="GIVEDESC" path="/QSYS.LIB/mylib.LIB/GIVEDESC.PGM">
      <data name="REQUEST" type="struct" struct="XMLPARM_T" usage="input" />
      <data name="RESPONSE" type="struct" struct="XMLPARM_T" usage="output" />
   </program>
</pcml>

The Java Side of the Fence

Assuming that the Java program has the XML request document in a String object named contents, the Java program can call the RPG program as follows:

    try {
        AS400 conn = new AS400("host","userid","password");
        ProgramCallDocument pcml = new ProgramCallDocument(conn,"GIVEDESC");

        pcml.setValue("GIVEDESC.REQUEST.LEN", Integer.valueOf(contents.length()));
        pcml.setValue("GIVEDESC.REQUEST.DATA", contents);

        if (pcml.callProgram("GIVEDESC")) {
          int len = pcml.getIntValue("GIVEDESC.RESPONSE.LEN");
          String data = pcml.getStringValue("GIVEDESC.RESPONSE.DATA",
                             BidiStringType.DEFAULT ).substring(0, len);
          System.out.println(data);
        }
        else {
           // Retrieve list of server messages
           AS400Message[] msgs = pcml.getMessageList("GIVEDESC");
           String msgId, msgText;

           // Iterate through messages and write them to standard output
           for (int m = 0; m < msgs.length; m++)
           {
               msgId = msgs[m].getID();
               msgText = msgs[m].getText();
               System.out.println("    " + msgId + " - " + msgText);
           }
           System.out.println("** Call to GIVEDESC failed. See messages above **");
        }
     }
     catch (Exception e) {
         e.printStackTrace(System.out);
         System.exit(1);
     }

In this code, the GIVEDESC.REQUEST.LEN field is set to the length of the contents variable that contains the XML document, and GIVEDESC.REQUEST.DATA is set to the XML data itself. Because the PCML document specifies a Coded Character Set Identifier (CCSID) of 1200, the XML document is automatically converted to UTF-16 if necessary.

After the call is complete, the Java program retrieves the GIVEDESC.RESPONSE.LEN field so that it knows the length of the data that the RPG program sent back. It then takes a substring of the GIVEDESC.RESPONSE.DATA field, so that the resulting String object contains only the characters that the RPG program actually wrote.

For the sake of demonstration, I'm printing the XML document to the screen so that you can see that you can successfully retrieve the full XML response from the RPG program. Of course, in a production program, you would parse the data and then use it as appropriate to the application.
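As one possible way to do that parsing, the following sketch reads the response with the DOM parser that ships with the JDK and collects the descriptions by item number. The ResponseParser class and parseDescriptions method are hypothetical names, and the sketch assumes the response layout shown earlier in this article, with the description in the "desc" attribute. Items that carry an "error" attribute instead are simply skipped here.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps each item id in the response to its description.
class ResponseParser {
    public static Map<String, String> parseDescriptions(String xml) {
        try {
            // A character stream is used, so the parser reads the String as-is
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList details = doc.getElementsByTagName("detail");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < details.getLength(); i++) {
                Element detail = (Element) details.item(i);
                if (detail.hasAttribute("desc")) {
                    result.put(detail.getAttribute("id"), detail.getAttribute("desc"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String data = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<response>"
                + "<detail id=\"00020\" desc=\"SAUSAGE CHEESE GIFT BOX A\" />"
                + "<detail id=\"99999\" error=\"No such item!\" />"
                + "</response>";
        System.out.println(parseDescriptions(data));
    }
}
```

In the calling code shown earlier, the data variable could be passed to a method like this one instead of being printed.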

Code Download

You can download the GIVEDESC sample program and its corresponding PCML file from the following link:
http://www.pentontech.com/IBMContent/Documents/article/53274_120_GiveDesc.zip

If you're interested in using the Expat XML parser, you can download a System i port from my Web site at the following link:
http://www.scottklement.com/expat
