Having only ever watched about 5 minutes of Star Trek, the phrase "beam me up" is the only data point about the show that I'm aware of. In this article, the "beaming" applies to data that is not currently on your IBM i but that you desperately want to get into the system. About the time Star Trek was originally aired, beaming data into your system was done with punched cards, or the new-fangled 8" floppy disks (I recently had to explain to somebody that the term "floppy disk" originated with 8" disks, which were, indeed, floppy). If you were lucky, you worked in a shop where beaming data into the system might have involved the majestic 5251 twinax terminal.
Nowadays, we're getting more data than ever that needs to be beamed. The problem is, some of the data sets are quite large, in the order of tens or hundreds of thousands of records. As soon as you start uploading a large amount of data, your attention is immediately riveted on one data point: how fast is it? Some recent reports from the field indicate that several shops are having problems getting reasonable performance from their uploads.
A possible solution for uploads, that seems to perform well (compared to the field reports), is provided with Microsoft SQL Server. Lest you moan and groan, the technique I'll show in this article can be accomplished with the no-charge Microsoft SQL Server Express Edition, if you do not already have or do not plan to acquire one of the licensed versions of Microsoft SQL Server. In my example, I am using the SQL Server 2008 Enterprise Edition, which I acquired through my extensive contacts deep inside Microsoft (well, through my MSDN subscription).
Spoiler alert: using this technique, I uploaded 498,998 records (record length 456, 53 fields) in 18:07 minutes (1087 seconds), yielding an average upload speed of 459 records/second. The scoreboard is shown in Figure 1.
The June 2009 issue of My i-.NET includes the article Create a Linked Server with SQL Server and your IBM i. Rather than repeat all of the steps required to create the linked server, I will refer you to that article. The linked server technique uses the IBM i OLE DB Provider, which is a no-charge component of System i Access for Windows.
To use the linked server for an upload, the file that you're uploading to must already exist on your IBM i and it must be journaled. If the file is not journaled, the linked server technique reports an error and will not run. Having read through the field reports, I had some serious doubts about how well the linked server upload would work, based upon the journaling requirement.
In Figure 1, you can see the INSERT statement that I used. There are three significant sections of the statement:
The INTO clause uses the four-part identifier, described in the June 2009 article. The identifier is specified as:
system_name.catalog_name.schema_name.table_name
Translated into AS/400ease, the system_name is the TCP/IP name of your IBM i. The catalog_name is the name of the relational database directory entry. Use the Work with Relational Database Directory Entries (WRKRDBDIRE) command on your IBM i to view the names and select the correct name for your system. The schema_name is the name of the library and table_name is the file name to upload to.
The columns list is simply the fields that you want to insert into the table.
In Figure 1, the SELECT statement uses the * ("select all") shortcut, which I feel will be typical for most bulk upload operations. If the table that you are uploading from contains more columns than you will actually be uploading, you will need to specify a column list on the SELECT statement. The FROM clause of the SELECT statement specifies the SQL Server table that contains the data you want to upload. Since you are supplying a SELECT statement as the source of the data to be uploaded, you can also specify a WHERE clause, if you need to select a subset of data from the SQL Server table. For an INSERT operation to the IBM i, including an ORDER BY clause would be pointless and detrimental to performance.
Figure 1 is a screen capture from the SQL Server Management Studio. As shown in the figure, SQL Server earnestly believes that the entire statement is incorrect (shown by all of the squiggly underlines). Nevertheless, the statement runs when you execute it within the Management Studio. Upon completion, the statistics for the upload are shown in the Messages panel, with the elapsed time shown in the lower right corner.
In another test, I uploaded a table of 118,048 rows, 20 columns, record length 95 bytes, in 3:43 (223 seconds), for an average upload speed of 529 rows/second.
The tests were performed on an AS/400 model 270, rated at 370/CPW, with 1GB memory. There was no other work of any significance running on the machine at the time. The LAN connection uses 1Gb Ethernet. The SQL Server was running in a Microsoft Virtual PC 2007 environment using Windows XP with 1.5GB RAM and a 1Gb Ethernet connection. In short, the AS/400 is an obsolete, underpowered model being served by a non-server class workstation. I would expect better performance from more current systems, although I suspect that at some point the network speed becomes an issue. If you run similar tests on your system and get noticeably better (or worse) results, I would be interested in hearing from you. Please post your results as comments to this article so that everybody will benefit from your experience.
Many IBM i developers have studiously avoided "anything" to do with Microsoft products. Others feel a need to be somewhat surreptitious and not draw attention to themselves, and some have simply decided to use a tool when it fits. I don't know if this solution will be right for your shop, but here are some things to consider about using Microsoft SQL Server linked servers for data upload to your IBM i:
Regardless of your feelings or current experience with Microsoft SQL Server, it will take only a few hours of your time to investigate this as a possible solution to your upload needs. Feel free to comment on your experience, either good or bad!