String Lengths, Bits Testing, and Time Values

Article ID: 21180
Learn to use the strlen API, test specific bits within a field, and more

As you may recall, in "Structures, Data Types, and Error Notification" (February 2008, article ID 21181 at SystemiNetwork.com), we developed sample program READ_DIR to read the contents of a directory. Our next step is to look at how to process a stream file within a directory.

But first, remember that when you call READ_DIR with a parameter such as '/', the readdir API returns other objects (such as subdirectories) in addition to stream files (*STMFs). (For a recap of the fundamentals, see "Basic Rules for Calling an Industry Standard API" below.) You could, within your application environment, ensure that only *STMFs are stored in a given directory; however, knowing how to distinguish between various object types may be a useful capability. For that reason, we'll first look at another API: Get File Information, or stat.

The stat API

Figure 1 shows the prototype for the stat API. The stat API requires two parameters and returns a return value. The first parameter, path, is a pointer to a null-terminated character string representing the path of the file we want to gather information about. The second parameter, buf, is a pointer to a structure known as stat. This is an output parameter where the stat API will return information about the file identified by the path parameter. The return value from the stat API is a four-byte integer, where a value of 0 indicates a successful API call and a value of -1 denotes an error. In the case of an error, errno is updated to reflect the type of error.

By looking at the source member STAT of file QSYSINC/SYS, we can find the definition of the stat structure. Figure 2 shows the default structure for stat. The stat structure, as provided in source member STAT, is actually much more complex than the default structure shown here, because the C compiler can alter structure definitions based on constants defined during the compilation process. Describing this capability is beyond the scope of this article. You can also use the Information Center documentation for the stat API to determine the default definition in Figure 2.

To convert the stat structure's C definition, we'll follow the same steps we used for the dirent structure in "Structures, Data Types, and Error Notification" (see Other Articles in This Series, below). However, we immediately run into a problem. In Figure 2, the first field of the stat structure, st_mode, is defined as a type mode_t, but there's no typedef statement in source member STAT that provides the equivalent basic data type of int, char, pointer, and so on. So we have to expand our search.

The statement #include <sys/stat.h> tells us in which source file and member to look. Within the source member STAT, there are more #include statements than Figure 2 shows. In particular, there is the statement #include <sys/types.h>. If we search in member TYPES of QSYSINC/SYS, we will find the necessary typedef statements. The data type mode_t, for instance, is defined as an unsigned integer or, in CL, a *UInt. (For a review of why the C language provides the ability to use a type of mode_t rather than the standard unsigned integer, see "Structures, Data Types, and Error Notification.")

A bit later in the stat structure, we also encounter the field st_rdev64. Analysis of its data type, dev64_t, shows that it is defined as either an unsigned long long or a character array of eight bytes. Although we previously discussed the data type unsigned long, which in CL is a Type(*Uint) with Len(4), we haven't yet seen an unsigned long long. A long long is an extension to the C language; here it indicates that st_rdev64 is an eight-byte unsigned integer value. If the extension isn't used, which is the case when a constant named _LONG_LONG is not defined, st_rdev64 is treated as an eight-byte character field. Because CL does not support an eight-byte integer variable type, we'll use an eight-byte character field to represent the variable st_rdev64. (In a future article, we'll explore how to convert these eight-byte integer values to a Type(*Dec) CL variable so that the numeric value can be available to your application program, assuming that the eight-byte integer value does not exceed 999,999,999,999,999, the maximum value that can be represented by a CL *DEC variable.)
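
To make the relationship between an eight-byte character field and an eight-byte integer concrete, here is a small C sketch. It is not the CL conversion technique promised for a future article. The helper names bytes8_to_u64 and fits_cl_dec15 are hypothetical, and the code assumes the bytes are stored most significant first (big-endian).

```c
#include <assert.h>
#include <stddef.h>

/* Largest value a 15-digit CL *DEC variable can hold. */
#define CL_DEC15_MAX 999999999999999ULL

/* Hypothetical helper: combine eight bytes, most significant first
   (big-endian is an assumption here), into one unsigned 64-bit value. */
unsigned long long bytes8_to_u64(const unsigned char *bytes)
{
    unsigned long long value = 0;
    size_t i;

    assert(bytes != NULL);
    for (i = 0; i < 8; i++) {
        value = (value << 8) | bytes[i];
    }
    return value;
}

/* Returns 1 if the value fits in 15 decimal digits, else 0. */
int fits_cl_dec15(unsigned long long value)
{
    return value <= CL_DEC15_MAX;
}
```

An application would call fits_cl_dec15 before attempting to store the converted value in a 15-digit decimal variable.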

The &Stat Structure

Figure 3 shows the full source for our enhanced program READ_DIR2. You can see the CL equivalent of the stat structure at B in Figure 3. But before reviewing the code at B, you may want to try to build your own version of the stat structure, and then compare the &Stat structure at B with your results.

The &Stat structure holds many facts about the object. We'll focus on four fields: &St_ObjType, &St_Size, &St_Mode, and &St_MTime. The variable &St_ObjType is of interest to the application because it tells us the type of object that the variable &Path identifies; we want to determine whether that object is a *STMF so that we can further process its data.

The variable &St_Size tells us whether any data exists within the *STMF that is to be processed. The use of variable &St_Mode requires a stretch of the imagination in terms of our application, but by using it, we can see how to test specific bits within a field — a necessary capability when using some industry standard APIs.

To rationalize using &St_Mode, let's assume that the user running the READ_DIR2 program is using *PUBLIC authority when accessing the *STMF, and that we're testing specific bits within the variable &St_Mode to determine whether *PUBLIC has read rights to the *STMF. A more production-oriented approach is to use the Determine File Accessibility API access, which checks the authority of the current job (as opposed to *PUBLIC). The access API, however, doesn't require the testing of bits but instead that the caller set specific bits (see "API Parameters and Data Types," December 2007, article ID 21102, for a refresher on this topic).

The variable &St_MTime is of interest because it represents the time the file data was last modified. Though our sample program doesn't actually need this time-related information to successfully process the *STMFs found within a directory, it's important that we understand how to work with industry standard time formats. So I'll use this variable for demonstration purposes.

The DoWhile loop (at F in Figure 3) shows the mainline changes made to the initial READ_DIR program. The preceding processing is the same as that used in the initial READ_DIR program in "Structures, Data Types, and Error Notification."

Because the stat API requires the full path name of the object, the first change from the READ_DIR program is related to correctly setting the value for the variable &Path. The &Path variable is defined as a character variable with a length of 673 bytes (at A). The length of 673 is calculated based on the maximum size of a directory in the READ_DIR2 program (32 bytes), plus the maximum size of an object name returned by the readdir API (640 bytes), plus a '/' delimiter between the directory name and the object name (one byte). The 640-byte &Name variable returned by the readdir API already has one byte allocated for a null terminator, so there is no need to add one byte for another null-termination character.

The variables &Dir and &Name currently contain the directory name and object name, respectively, so we need to concatenate these two fields with a '/' delimiter in between them. To do this, we can use the CL command in Figure 4 — but using this approach, we could end up with &Path names such as '/test_dir/my_file.txt' and '/test_dir//my_file.txt', based on the initial value of &Dir_In that was passed to the READ_DIR2 program. These are both equivalent (as is '/test_dir//////my_file.txt' for that matter) in terms of referring to the object my_file.txt in directory test_dir, but multiple '/'s certainly look strange on displays and reports, so we'll code the application to avoid them.

Rather than using the ChgVar command in Figure 4, the READ_DIR2 program checks whether the last byte of the &Dir variable is a '/' and, if not, inserts one. This lets us introduce another industry standard API: Determine String Length, or strlen. This is a simple API that should be a welcome relief after working through the stat structure in Figure 2. Figure 5 shows the prototype for strlen.

The strlen API

The strlen API requires one parameter and provides a return value. The parameter string is defined as a pointer to a null-terminated character string that will not be modified by the API. The API returns a return value of type size_t that, as we know from source member STRING in file QSYSINC/H, is equivalent to a four-byte unsigned integer or, in CL, a *UInt. This return value is the length of the string parameter, not counting the null byte terminating the string. At C in Figure 3, the variable &Size is defined to reflect this return value.

At F, the READ_DIR2 program uses the strlen API to determine the length of the directory name stored in the variable &Dir. READ_DIR2 then examines the last byte of the value stored in &Dir to determine whether it is a '/'. Depending on what it finds, READ_DIR2 then sets the variable &Path to the appropriate value (where "appropriate" is subjectively interpreted as meaning that the '/' character is added only if the last character of &Dir is currently not a '/').
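
The same logic can be sketched in C, the language the strlen API is native to. This is an illustration only, not the CL code from Figure 3. The helper name join_path and its return convention are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "dir/name" in path, adding the '/' only
   when dir is non-empty and does not already end with one. Returns 0
   on success, or -1 if the result (plus its null terminator) would not
   fit in path_size bytes. */
int join_path(char *path, size_t path_size, const char *dir, const char *name)
{
    size_t dir_len;
    int needs_slash;
    int written;

    assert(path != NULL && dir != NULL && name != NULL);

    dir_len = strlen(dir);  /* length excludes the null terminator */
    needs_slash = (dir_len > 0 && dir[dir_len - 1] != '/');

    written = snprintf(path, path_size, "%s%s%s",
                       dir, needs_slash ? "/" : "", name);
    if (written < 0 || (size_t)written >= path_size) {
        return -1;
    }
    return 0;
}
```

With a 673-byte buffer, both '/test_dir' and '/test_dir/' combined with 'my_file.txt' produce '/test_dir/my_file.txt'.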

Having set the variable &Path, READ_DIR2 now calls the stat API. If the API encounters an error, which is indicated by the return value &RtnCde being set to -1, the perror API is called to display error-related text to the user. If no error occurs when calling the stat API, READ_DIR2 checks to see whether the object type is a stream file and, if so, calls the subroutine ProcesFile (at H). The remainder of the DoWhile processing (at F) is the same as that used in READ_DIR.

The ProcesFile subroutine first determines whether the *STMF contains any data by checking whether the variable &St_Size is greater than 0 (at H). If the variable is greater than 0, READ_DIR2 then checks to verify that the file can be accessed by a program running with *PUBLIC authority.

The variable &St_Mode, which is returned by the stat API, is documented as being a bit string, indicating the permissions and privileges (better known as authorities to most i5/OS users) associated with the file. From the stat API documentation, we also see that the C constants for &St_Mode are defined in member STAT of file QSYSINC/SYS, and the textual descriptions of these constants reside in the documentation for the Change File Authorizations API, or chmod.

Before delving into &St_Mode, let's first review how we set a bit string in "API Parameters and Data Types." Figure 6 at A shows how we were able to construct the bit string &OFlag by adding the variables &O_RDONLY and &O_SHR_NONE to set "on" the desired attributes, or bits, when opening a file. In the case of &St_Mode, the stat API returns a bit string such as that at B, representing various attributes that are on. Here, we need to determine which specific attributes or bits are on. One way to accomplish this is with the AND String API _ANDSTR. Figure 7 shows the parameters for the _ANDSTR API. The _ANDSTR API is not an industry standard API; rather, it is an MI instruction (a future article will address using MI instructions in a CL program). And in case you're wondering why no industry standard API exists for testing bits, it's because languages such as C have standard operators for this type of operation.

The _ANDSTR API

The _ANDSTR API has four required parameters and no return value. The first parameter, result, is a pointer to something named void. You may recall that the data type void can represent nothing or anything, depending on the context in which it is used. Up to this point, all uses of void have been to represent nothing — for instance, the use of void in the __errno API prototype to indicate that there are no required parameters, and the use of void in the perror API prototype to denote that there is no return value from the API.

In the case of the _ANDSTR API, though, the function prototype isn't simply using the void data type — rather, it's using a pointer to void. In this context, void represents anything. Specifically, the _ANDSTR API is declaring in the prototype that it needs a pointer to a buffer used to receive the results of the API call, and that the API doesn't care how that buffer is defined. In the READ_DIR2 program, no CL variable is declared for the result pointer parameter. Rather, using what we learned in "API Parameters and Data Types" concerning the equivalence of a pointer being passed *ByVal or pointing to the parameter being passed *ByRef, the CL variable &Result is passed by reference. The variable &Result is defined as a *UInt field that, as you will see, matches the definitions for the second and third parameters.

The second parameter, string1, is also a pointer to void (or anything). This parameter represents the first of two bit strings that are to be ANDed together. ANDing is the process of comparing two bit strings and then returning a bit string, where the only bits on in the result are those bits that were on in both of the input bit strings. If either or both of the corresponding bits are "off" in the input bit strings, the result field bit will also be off. In READ_DIR2, string1 is represented by the variable &St_Mode and is defined as a *UInt (at B in Figure 3). You can see an arbitrary possible setting of &St_Mode at B in Figure 6.

The third parameter in Figure 7, string2, is likewise a pointer to void and represents the second input bit string to be ANDed. In READ_DIR2, this is represented by variable &S_IROTH and is defined as a *UInt (at E in Figure 3). With &S_IROTH, you can see the read authority for others (or public) and several other bit constants associated with &St_Mode at C in Figure 6. (See "API Parameters and Data Types" as a refresher on how to determine these values.)

The fourth parameter, length, can be either a four-byte or eight-byte unsigned integer value. This parameter represents the number of bytes (eight-bit values) to AND across string1 and string2. CL does not support eight-byte unsigned integer values, so READ_DIR2 uses the variable &Length, defined at D in Figure 3, as *UInt with a value of 4. We use 4 because that is the number of bytes allocated for each of the fields &Result, &St_Mode, and &S_IROTH.
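
To illustrate what a byte-by-byte AND over a given length amounts to, here is a small C sketch that mirrors the four parameters described above. It is not the MI instruction itself. The function name and_bytes is hypothetical, and the void pointers show how a caller can pass buffers of any type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper that mimics the parameter layout described for
   _ANDSTR: result, string1, string2, length. It ANDs length bytes of
   string1 with string2 and stores them in result. */
void and_bytes(void *result, const void *string1, const void *string2,
               size_t length)
{
    unsigned char *out = result;
    const unsigned char *in1 = string1;
    const unsigned char *in2 = string2;
    size_t i;

    assert(result != NULL && string1 != NULL && string2 != NULL);
    for (i = 0; i < length; i++) {
        out[i] = in1[i] & in2[i];   /* bit is on only if on in both */
    }
}
```

Passing two four-byte unsigned integers and a length of 4 leaves in the result only the bits that are on in both inputs.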

At D in Figure 6, you can see the results of ANDing an arbitrary &St_Mode variable with the &S_IROTH variable. If the on bit(s) associated with &S_IROTH are on in variable &St_Mode, the resulting value &Result is the same as the original &S_IROTH value. So at H in Figure 3, READ_DIR2 ANDs &St_Mode and &S_IROTH, checks to see whether &Result is equal to &S_IROTH and, if true, knows that *PUBLIC has sufficient authority to further process the file identified by variable &Path. If additional bits must be checked (e.g., both *PUBLIC read and write authority), the application can add &S_IROTH and &S_IWOTH to a temporary working variable and then AND the result of this addition to &St_Mode to determine whether both bits are on.

Having verified that *PUBLIC has read access to the file, READ_DIR2 then displays the name of the file being processed, along with the date and time that the file's contents were last updated. The stat API uses variable &St_MTime to provide this date and time information. But before the date and time can be displayed, we need to format the time information into a human-readable form.

The ctime and gmtime APIs

In industry standard APIs, time is often defined as a time_t data type, a typedef for a signed integer, and stored as the number of seconds since an epoch of 00:00:00 January 1, 1970 UTC. So a value of 0 represents Thu Jan 1 00:00:00 1970; a value of -2147483648 (the most negative number a four-byte signed integer can hold) represents Fri Dec 13 20:45:52 1901; and a value of 2147483647 (the largest positive number that can be stored) represents Tue Jan 19 03:14:07 2038. Most users would not equate a value such as 1192283160 to Sat Oct 13 13:46:00 2007, so some formatting is generally necessary.

The Convert Time to Character String API, ctime, accepts a UTC-based time_t value and returns a 26-byte character string, such as that in the previous paragraph. Figure 8 shows the ctime API prototype.

The ctime API has one required parameter (time) and provides a return value. The return value is a pointer to a character string with a formatted value of time. READ_DIR2 uses ctime (at H in Figure 3). There are several other time-related C runtime APIs. One that is particularly useful for performing your own date and time formatting is the Convert Time API, or gmtime. The gmtime API accepts a time parameter and returns a pointer to a structure with individual fields for seconds, minutes, hours, day of month, month of year, years since 1900, day of week, and day of year.
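
As a sketch of doing your own formatting, the C code below uses gmtime together with the standard strftime function to produce text in the same field order that ctime uses. Standard ctime renders the time in the local time zone, so the sketch uses gmtime to keep the output in UTC and reproducible. The helper name format_utc is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: formats a time_t as UTC text in the same field
   order that ctime uses, for example "Sat Oct 13 13:46:00 2007".
   Returns the number of characters written, or 0 on failure. */
size_t format_utc(time_t when, char *text, size_t text_size)
{
    const struct tm *parts;

    assert(text != NULL);
    parts = gmtime(&when);      /* broken-down UTC fields */
    if (parts == NULL) {
        return 0;
    }
    /* tm_year counts from 1900 and tm_mon from 0; strftime handles both. */
    return strftime(text, text_size, "%a %b %d %H:%M:%S %Y", parts);
}
```

Unlike ctime, this version writes into a caller-supplied buffer and does not append a newline, so a 25-byte buffer is enough for the text and its null terminator.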

When working with time_t values, keep in mind that this industry standard method of tracking time cannot represent time before December 13, 1901, or after January 19, 2038. For your own application tracking of time-related information, you should use the i5/OS DB2 support. The i5/OS database supports a range of dates from January 1, 0001, to December 31, 9999.

Having displayed the name of the file being processed and the time the file was last updated, READ_DIR2 now does any processing of the *STMF required by the application (at H). After the file is processed, READ_DIR2 then increments the variable &Nbr_Files by 1. This variable is used to display the number of files processed (at G in Figure 3) when all stream files within the directory have been read. If the file was not processed because of either an authority problem or an empty file, an informational message is displayed, and the variable &Nbr_Missed is incremented by 1 to reflect the number of unprocessed files.

The ProcesFile subroutine then ends, and control is returned to the DoWhile loop (at F in Figure 3) so that the next directory entry can be read. When all of the directory entries have been processed, READ_DIR2 displays summary information (at G) and then ends.

Figure 9 provides the CL commands to create the READ_DIR2 program, make a test environment, and then test READ_DIR2. Assuming that the CL source for READ_DIR2 resides in source file QCLSRC of *LIBL, the two commands at A are necessary to create the READ_DIR2 program. To make a test environment for READ_DIR2, the CL command (at B) creates a directory named '/test'. The two CL commands at C then create the *STMF '/test/read_dir2.cl' using the source in member READ_DIR2 of VINING/QCLSRC, with the public data authority set to read and write. The CL command at D creates the *STMF /test/empty.cl with no data in it. The two commands at E create the *STMF '/test/not_auth.cl' with the public data authority set to none. In the commands at C and E, you will want to replace the library name of VINING with your own library name. The command at F shows how to call READ_DIR2 and have it process the contents of the '/test' directory.

More Still to Learn . . .

You now know how to follow embedded include files, determine the length of null-terminated character strings using the strlen API, test specific bits within a field, and work with time values. In the next article, we'll look at how to open and read a text *STMF to do processing of the file data.

Bruce Vining is president and co-founder of Bruce Vining Services, a consulting and contract programming firm located in Rochester, Minnesota. He spent 27 years working for IBM, initially with the IBM System/38 and most recently the IBM System i. He was responsible for IBM Design Control for System APIs from V2R3 through V6R1, in addition to having i5/OS design responsibilities in areas such as globalization and software serviceability.


Other Articles in This Series

For previous articles in this series, go to SystemiNetwork.com.

"API Parameters and Data Types" (December 2007, article ID 21102)

"Structures, Data Types, and Error Notification" (February 2008, article ID 21181)


Basic Rules for Calling an Industry Standard API

Previous articles in this series introduced many of the fundamentals involved in calling industry standard APIs, such as those for the ILE C language runtime and the IFS. Here's a recap of the basic rules:

  1. API names are case-sensitive.
  2. APIs expect all parameters to be passed *ByVal, though in the case of a pointer parameter, passing the pointed-to variable *ByRef also works.
  3. APIs expect character variables to be null terminated unless the API specifies otherwise (which is not the case with most APIs).
  4. Common types of data are encountered when using industry standard APIs.
  5. You can use these common data types to define additional types of data (both structures and individual fields) by employing a typedef statement.
  6. You can declare constants by using a #define statement.
  7. Error handling is typically done via an errno value.

— B.V.
