Database Management Essentials

Article ID: 21039

DB2 for i5/OS is known for not requiring a database administrator. It earned this reputation by eliminating many of the low-level database management tasks such as storage allocation and spreading data across multiple devices. However, there are higher-level tasks and guidelines that you should use with your System i databases. This article describes some of the recommended practices for managing existing databases and creating new databases on System i servers.

Setting Up Databases

One unique aspect of DB2 for i5/OS is that it requires no installation or initial configuration. Just power on your System i server, and DB2 is automatically ready to store and process your business data.

By default, DB2 is configured to behave as a single systemwide database. There is no database object type; the term database is used to describe a group of DB2 objects on a system. A schema (or library) is the only DB2 container available to group or partition related database objects within this default configuration.

V5R2 delivered a new database setup option with enhanced support for Independent Auxiliary Storage Pools (IASP). An IASP is also known as an independent disk pool. These enhancements expand the capabilities of DB2 in two primary ways:

  • the ability to switch a DB2 database across multiple System i servers to increase availability
  • the ability to create multiple DB2 user databases on a single System i server
  • Switchable databases. To increase the availability of your stored business data, you can create a single user database in an IASP and then switch it across multiple systems (Figure 1). That flexibility is not available to databases created in the systemwide database (SYSBAS) that resides in the system ASP. If system outages occur on your primary system, the IASP and user database housed within the pool can be switched from one system to another. Higher availability is achieved because the independent disk pool can be made available (varied on) and unavailable (varied off) to a server without restarting the system.

    Multiple databases. Multiple database support provided by IASPs does not mean multiple versions or instances of the DB2 engine are running on the server. A single instance of DB2 is used by all of the databases on the server.

    From a DB2 viewpoint, IASPs provide the ability to create multiple user databases along with the systemwide database on a single System i server. Effectively, this means that you can create a schema with the same name multiple times on a single server. A schema name must be unique only within a user database (i.e., IASP), so schema DBLIB can exist three different times on a single server (Figure 2).

    This usage of multiple user databases is most beneficial when consolidating several branch offices or dealerships to run on a single server. Instead of separately purchasing and installing application software at each branch office, you can share a single copy of the application across all user databases. Figure 2 shows an example of a single copy of application code residing in the system ASP that can be run against any of the user databases stored in IASPs.

    Now that you understand the benefits of multiple databases, you may be wondering how the application decides which of the three DBLIB schemas to use. The following interfaces let you specify which user database to use during execution of the application:

    • INLASPGRP parameter on Job Description
    • SETASPGRP command
    • SQL CONNECT statement

    Creating multiple user databases on a server is not as simple as just specifying an extra keyword or two on an SQL statement or a CL command. For example, performance concerns can arise when the application program being shared across multiple IASPs uses SQL. Thus, you need to carefully plan and test your programs in addition to configuring the independent disk pools' hardware. For more details on implementing IASPs, consult the IBM Redbook IBM eServer iSeries Independent ASPs: A Guide to Moving Applications to IASPs (SG24-6802).

    Setting Up Schemas

    Now that you know the different places where a schema can reside on a System i server, let's briefly cover some of the setup details.

    As I mentioned earlier, an SQL schema is essentially the same as an i5/OS library. The difference is that the library created with the CREATE SCHEMA statement includes some extra objects created by DB2. Two of these are the journal and the journal receiver object; DB2 creates these objects to enable the automatic journaling of DB2 tables created with SQL. (I cover the recommendations on journal setup in the next section.)

    DB2 also creates a set of catalog views (e.g., SYSCOLUMNS, SYSTABLES). The catalog views created within the schema are not particularly interesting — they simply point to the systemwide catalog views contained in the QSYS2 and SYSIBM schemas. The catalog views within the schema have an extra filter defined to return metadata for only those tables, columns, and so forth, defined within the schema itself from the systemwide repositories. Some users make the mistake of specifying the WITH DATA DICTIONARY clause on the CREATE SCHEMA statement to create these catalog views; the catalog views are always created within the schema despite whether this clause is specified.

    Only those customers who need an Interactive Data Definition Utility (IDDU) data dictionary created for migration of old System/36 programs and objects should use the WITH DATA DICTIONARY clause. The majority of System i customers should not specify this option. In fact, IBM removed this clause from its SQL documentation and iSeries Navigator interfaces in V5R4 — too many customers were incorrectly specifying this clause and ran into trouble later when they wanted to create a table with a BLOB or CLOB column. A table containing columns with the LOB or DATALINK data types cannot be created in a schema that contains an IDDU data dictionary.

    Setting Up Journals

    Journaling is critical for database recovery and availability because all of the changes to your DB2 objects are logged. As you learned in the previous section, DB2 tries to automatically journal SQL-created tables by creating a journal and journal receiver object within the schema. Every time DB2 creates an SQL table, it looks in the schema for journal name QSQJRN. If DB2 finds a journal with that name, journaling automatically starts for the newly created table.

    Management options. If you're using a high-availability solution that requires the table be logged to a journal with a different name or the journal resides in a different schema, the V5R4 support for the QDFTJRN data area will be of interest to you. During the creation of a table, DB2 first looks in the target schema for the existence of the QDFTJRN data area before looking for the QSQJRN journal object. You can use this data area to specify which journal to use, which operation types should cause automatic journaling, and which types of objects to protect with journaling. You can also use QDFTJRN to have i5/OS automatically journal physical files, data areas, and data queues as they are created. For details on how to configure the contents of the QDFTJRN data area, refer to the IBM Technote "Journaling at object creation on DB2 for iSeries" (www.redbooks.ibm.com/Redbooks.nsf/portals/systemiTips).

    Many shops do not use journaling to protect their business data because of concerns over added management and performance overhead. Unfortunately, these shops are making their journaling decisions based on journaling capabilities that existed at the time the AS/400 was first introduced. Needless to say, IBM has vastly improved the management and performance of journals over the past 15 years.

    For simpler management, use the system-managed receiver option when creating the journal. When a journal receiver object hits its size threshold, the operating system will automatically create a new journal receiver object and replace the full receiver with this object. This system-managed processing eliminates the need for manual intervention from a system administrator.

    Performance options. Similar tuning options are available for streamlining journal performance. One classic option has been to create the journal objects in a private user ASP instead of in the system ASP along with the journal DB2 objects. The journal receivers were placed in the private user ASP so that a set of private disk arms could be dedicated to journal processing. By reducing disk arm contention, journal activity was able to complete faster and minimize the performance impact on application programs. However, the recommendation of private user ASPs to improve journal performance has waned as server hardware has evolved — in particular, the availability of large write caches for I/O adapters (IOA). Refer to the IBM Technote "Journaling — User ASPs Versus the System ASP" for further discussion on using user ASPs for journal objects.

    Reducing the amount of data written to the journal receiver object is another common way to decrease journal overhead. This technique not only reduces the amount of disk space required by journal receiver objects on your server, but by decreasing the size and frequency of journal entries, it also speeds up performance. In V5R1, IBM provided the capability to minimize journal entries. This minimal journal capability allows i5/OS to write only the data that was changed during a database update, instead of writing the entire record to the journal receiver. Obviously, writing 10 bytes of data to the journal is faster than writing a couple hundred bytes to move the entire record. To enable the minimal journal capability, specify *FILE or *FLDBDY for the MINENTDTA parameter on the Create Journal (CRTJRN) or Change Journal (CHGJRN) commands.

    Reducing the number of entries written to the journal receiver also improves performance. In addition to writing entries for each insert, update, and delete operation made to a DB2 table, i5/OS also writes an entry for each open and close. The Open and Close journal entries are not needed for database recovery operations such as an SQL rollback or the Apply Journal Changes (APYJRNCHG) command. Sometimes these entry types are useful when analyzing application behavior or debugging a problem, but that's not a frequent occurrence. Thus, these journal entry types are good candidates for elimination to speed up journal performance and reduce the size of journal receivers.

    The easiest way to remove the Open and Close entry types without starting and ending journaling is to use the Change Journal Object (CHGJRNOBJ) command. Here's an example of using this command to eliminate the Open and Close entries for all the tables within a specific schema:

    CHGJRNOBJ OBJ((MYSCHEMA/*ALL *FILE))
       ATR(*OMTJRNE) OMTJRNE(*OPNCLOSYN)

    For information on additional journal performance settings, such as journal caching and the *RMVINTENT journal attribute, see the IBM Redbook Striving for Optimal Journal Performance (SG24-6286).

    Setting Up Tables

    Now, let's focus the spotlight on the individual DB2 objects themselves. These sections concentrate on table setup and configuration on DB2 for i5/OS; however, the majority of this content applies to all DB2 object types.

    Naming standards. Before you create any database objects, it's recommended that you first define a naming convention for your programmers and staff to use. Though this recommendation seems like a theoretical concern that will just slow you down, a consistent naming convention helps your team be more productive in the long run. If column names are assigned at the whim of a user, analyzing and changing application code that references these random columns can be time-consuming. However, finding all of the primary key references in a program, for example, is much easier if all of the primary key field names contain an ID suffix (e.g, CUSTID). A naming standard is especially important when you're using the native (i.e., non-SQL) interfaces, where objects and field names cannot be longer than 10 characters. With SQL interfaces, you have more leeway, because object and column names can be up to 128 characters long.

    A good first step in defining your naming standards is to list all of the business terms that appear in your applications and data. Next, identify two- and three-character abbreviations for each term; for example, you can define "ADR" and "AD" as abbreviations that can be part of column names used to store address values. Once the naming standards are defined, you can house them in a word-processing document or use a more sophisticated design and modeling tool. One resource that can assist in defining naming conventions is the book SQL/400 Developer's Guide (29th Street Press, 2000).

    One other i5/OS-unique naming behavior that should not be overlooked when you create a standard is that objects created with SQL on i5/OS are actually assigned two names. As I mentioned previously, SQL interfaces enable you to create more descriptive object names by supporting a maximum length of 128 characters. However, the operating system supports only 10-character names. Hence, when an SQL object with a name longer than 10 characters is created, DB2 for i5/OS automatically generates a 10-character system name for that object. The system name is used on i5/OS CL commands or programs wanting to perform native record-level access on the SQL-created table. Let's look at an example of this behavior with the following table definitions:

    CREATE TABLE myschema/customer_orders (
      OrderID INTEGER,
      CustomerID INTEGER,
      OrderStatusCode CHAR(1),
      OrderDate DATE)
    CREATE TABLE myschema/customer_addresses(
      CustomerID INTEGER,
      AddressID INTEGER,
      AddressTypeCode CHAR(2),
      DateValidAddrStart DATE,
      DateValidAddrEnd DATE)

    For each table, DB2 generates a system name. The first tables created in the myschema schema will have the system names CUSTO00001 and CUSTO00002, respectively. The reason that dual names need to be part of your naming convention standard is that they are not guaranteed to be the same each time a table is created. For example, if these two tables were created in a different order on another system, the system names would flip-flop, with a system name value of CUSTO00001 for the customer_addresses table.

    The recommended approach is to use the RENAME TABLE statement to override the default name generated by DB2 so that you always know the system name for a particular SQL object. Here's an example of using the RENAME TABLE statement to assign a specific system name for the tables:

    RENAME TABLE myschema/customer_orders
      TO SYSTEM NAME custord
    RENAME TABLE myschema/customer_addresses
      TO SYSTEM NAME custadr

    You can use the FOR COLUMN clause to control the system column names generated by DB2 for column names greater than 10 characters in length.

    Relational table definitions. After you define a naming standard, the next step is to use those naming standards in the construction of the table definition. Again, many people want to rush in and create table objects as they need them in their application, without stopping to think about the overall table design and their relationship to other tables in the database. As a result, many System i customers will admit they are using DB2 tables as flat files instead of true relational tables.

    Designing a true relational table model is best accomplished by completing a logical data model layout before carrying out the physical implementation of the DB2 objects. Whether you start with the logical or physical model, your first step is to identify the primary key for a set of related values, defining a separate table for each related set of attributes and then defining the primary key. For example, you might designate the EmployeeID column as the primary key for uniquely identifying those attributes associated with a single employee. The primary key can be a single column or multiple columns.

    The next steps of the data normalization process involve eliminating redundant data from the table definition. For example, you shouldn't include the department name and mailing address within the employee table definition because they would have to be repeated for each employee in a department. Instead, you should store the constant facts for a department in a different table. The employee table definition stores the department number for each employee and looks up information from the department table as required.

    It is possible to get carried away with database table normalization. Although it's recommended to design your database to fifth normal form (5NF) in the logical data modeling process, most companies do not use that level of normalization in their physical database implementation. Most business applications use a physical database design with third normal form (3NF). The key here is to apply some basic normalization before creating tables for your application so that table design is not problematic as your application grows.

    Data type considerations. If performance is a top concern, consider the System i processor and instruction set when you choose column data types for your DB2 tables. An integer data type is the best-performing option when creating a column to store whole number values such as those for generated keys or quantity values. This is because DB2 can perform integer processing directly on the hardware. Decimal values — even those with 0 scale — require some level of software on top of the hardware capabilities of the System i processor.

    Regarding the storage and processing of decimal values, the i5/OS packed decimal data type is always a better-performing option than the zoned decimal data type. To understand the performance difference, let's look at how the DB2 engine implements the addition of zoned values. To add together two zoned decimal values, DB2 first has to convert each of these zoned decimal values into packed decimal values. The addition is then performed with the packed decimal values before the final step of converting the packed result back to a zoned decimal value. The performance cost of these extra conversions is why the packed decimal data type should be your choice out of the gate.

    Performance gains from minimizing conversions can also apply to the character data type. One of the most common character conversions occurs when using Java and WebSphere applications. Java string processing is based on the Unicode character set. Thus, when a Java application retrieves data from a character column in a DB2 for i5/OS table, the character string must be converted from its single-byte EBCDIC encoding to double-byte Unicode encoding. The same expensive conversion process occurs when a Java program writes data to a DB2 table. If Java or WebSphere programs primarily use a DB2 table, you can improve performance by storing the data with Unicode encoding. To accomplish this in DB2 for i5/OS, use the GRAPHIC or VARGRAPHIC data type and specify a CCSID value for one of the following Unicode character sets:

    • UTF-16, CCSID value of 1200
    • UTF-8, CCSID value of 1208
    • UCS-2, CCSID value of 13488

    For example, the following syntax would use the UTF-16 encoding for character strings stored in this company name column:

    ,,,    CompanyName GRAPHIC(20) CCSID 1200  …

    The graphic data types require twice the storage of a character string, so you trade more disk space for faster I/O performance.

    Common References

    Once you decide on a naming convention and standardize on data type usage, DDS field reference files and SQL column reference tables can help enforce these standards (for more information, see "The Essential Guide to SQL," September 2006, article ID 20636 at SystemiNetwork.com). Both of these mechanisms let you define a common set of columns and their attributes in a central repository. Then, when creating new DB2 objects, you can simply reference the column definitions in the reference table instead of duplicating the column attributes in every new table definition.

    Database Management with iSeries Navigator

    Now that you understand some key issues regarding creating new DB2 objects, let's look at the tools available for managing and administering existing databases. iSeries Navigator is the primary interface for DB2 management and interface. Although this tool is designed primarily for use with SQL-created objects, you can also use it to review the attributes for single-member physical files and logical files. (Both keyed and non-keyed logical files will be classified as SQL views within the various iSeries Navigator interfaces, such as the Schema view.)

    Figure 3 shows the available iSeries Navigator actions for a DB2 table when you right-click the object. The Description task returns the table-level settings, similar to the information returned by the Display File Description (DSPFD) command. The Definition task shows the specifications for each column in the table along with the constraint definitions. You can use the Generate SQL task to produce the actual SQL statements needed to re-create the object. If the table was created with DDS, this tool also converts your DDS specifications into the closest matching SQL statement.

    You have a couple of options if you are thinking about changing a table and want to use iSeries Navigator to figure out the related objects that might be affected by the change. The Show Related task (Figure 3) produces a textual list of the objects (e.g., indexes, constraints, views) related to the current table. This task is similar to using the Display Database Relations (DSPDBR) command. If you want to see these relationships in picture form, such as the tables associated with a journal object or the indexes over a particular table, use the Database Navigator Maps function (Figure 4).

    If you have some large DB2 objects that might frequently approach some of the DB2 maximum size limits, you can use the Health Center function to track and monitor these objects as they approach the limit. Access this feature by right-clicking the server name in the Database component of the iSeries Navigator navigation tree.

    Performance Management with iSeries Navigator

    Many performance tooling enhancements have been added to iSeries Navigator in the past two releases to simplify the process of DB2 performance tuning and management. These enhancements focus specifically on helping the tuning process for SQL-based applications and workloads.

    Before using any of these tools, you should be aware that the query optimization process that all SQL requests run through is heavily influenced by the size of your tables and the configuration of your server. Thus, even if you successfully use these tools to tune performance on your test server, you can see different performance behavior on your production server. This is because the test and production servers have differences in table sizes, memory pool configuration, number of disk arms, and processors — just some of the factors that influence the optimizer's decision making. Sound DB2 and SQL performance management should involve analyzing your SQL applications running on the production server before rolling out the new application to all of your end users.

    With this backdrop, let's review some of the performance tools available to help you manage DB2 performance on i5/OS. Current SQL and Plan Cache are two tools that look at live database performance data. You launch the Current SQL function by right-clicking the Databases object and selecting the "Current SQL for Job" action. This function shows the SQL statement that is currently running in a connection (or the last SQL statement executed, if none are currently running). Figure 5 shows the SQL statement text and additional data that is returned. You can also perform detailed analysis on the optimizer's implementation with Visual Explain.

    The SQL Plan Cache is a new addition to V5R4. With the appropriate filter, SQL Plan Cache lets you view all the SQL statements currently running on the system. You can use this tool several times a day to check on the top 10 longest-running SQL statements or the most frequently executed statements. With these filters, you can focus your detailed analysis on those statements that have the greatest impact on your overall DB2 performance. However, statements processed by Classic Query Engine (CQE) are not stored in the Plan Cache.

    DB2 also provides a way to dump the contents of the Plan Cache into a permanent object known as a snapshot. The snapshot functionality can capture Plan Cache data regularly and then compare the snapshot data when there's a change in your system performance. If you want to schedule the snapshot creation, IBM provides a stored procedure — QSYS2/DUMP_PLAN_CACHE — that can be called from a program on a scheduled basis.

    No coverage of DB2 for i5/OS performance management would be complete without a discussion about indexing. Figure 3 shows the task Show Indexes, which is also known as Index Evaluator. This evaluation tool was added to let you determine which indexes the query optimizer is actually using and which ones are just taking up space. The Index Evaluator listing includes SQL indexes, keyed logical files, and the indexes used to implement primary and foreign key constraints.

    The Index Evaluator provides input on existing indexes. In contrast, the Index Advisor provides feedback on suggested indexes from the query optimizer. The Index Advisor was enhanced in V5R4 so that it is always running. In the past, a tool such as the database monitor had to be started to get index advice back from the optimizer. Now, the index advice is available on demand; just right-click the database server name, schema, or table object, and select the Advisor task.

    Setting the Course with DB2

    An understanding of the up-front considerations for designing and implementing a database will ensure that you are fully using the relational capabilities and strengths of DB2 for i5/OS. And by leveraging iSeries Navigator's powerful graphical DB2 management interfaces, you simplify the management and care of your databases going forward. For a list of helpful database management resources, see Find Out More, below.

    Kent Milligan is a Senior DB2 for i5/OS Specialist on IBM’s ISV Enablement team for System i. Kent spent the first eight years of his IBM career as a member of the DB2 development group in Rochester. He speaks and writes regularly about relational database topics. You can reach him at kmill@us.ibm.com.


    Find Out More

    Here's a handy list of resources that have the information you need to keep your databases fit and trim.

    Articles at SystemiNetwork.com

    "5 Essential Ways to Use iSeries Navigator — SQL Plan Cache " (February 2007, article ID 20803)

    "5 Essential Ways to Use iSeries Navigator — Visual Explain " (January 2007, article ID 20791)

    "A Little Feedback Is a Good Thing " (January 2006, article ID 20380)

    "The Essential Guide to Business Intelligence " (August 2006, article ID 20633)

    "OpsNav Database: Getting a Clearer Picture " (October 2001, article ID 11438)

    "V5R4 DB2 UDB: Keeping It Simple " (April 2006, article ID 20477)

    IBM Redbooks

    DB2 Universal Database for iSeries Administration: The Graphical Way on V5R3 (SG24-6092)

    SQL Performance Diagnosis on IBM DB2 Universal Database for iSeries (SG24-6654)

    Education

    DB2 for i5/OS SQL Performance Workshop
    ibm.com/servers/eserver/iseries/service/igs/db2performance.html

    iSeries Navigator Downloadable Tutorials
    ibm.com/servers/enable/site/education/ibo/view.html?oc

    ProVIP Sponsors

    ProVIP Sponsors