If you've gotten knee-deep into XML parsing on IBM i, you've probably discovered that XML documents that have byte order marks (BOMs) cause trouble. IBM recently released some PTFs to cure that malady.
Licpgm 57xx-999:
5.4: MF45480
5.4.5: MF45502
6.1: MF45479
A BOM is a special character (U+FEFF) optionally found at the start of a Unicode document. Its purpose is to identify which Unicode encoding the document uses (and therefore how many bytes make up each unit of a character) and to specify whether the document is big endian or little endian. An XML parser should use the BOM to determine the encoding of the document. BOMs are part of Unicode, so when data is translated to another encoding (e.g., EBCDIC), they should be removed as part of that process. A bug in IBM i was preventing that from working properly: the BOM was being translated into garbage characters in the new encoding. The result was that XML parsing would fail with a message stating that invalid characters existed in the XML document.
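To make the BOM's role concrete, here is a minimal sketch in Python (not IBM i code) of how a program might inspect the first bytes of a document to work out its encoding and byte order. The function name detect_bom and the lookup table are illustrative assumptions; only the BOM constants come from Python's standard codecs module.

```python
import codecs

# Check the 4-byte signatures first: the UTF-32 little-endian BOM
# (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present.

    Hypothetical helper for illustration only.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

The returned length tells the caller how many leading bytes to skip so that the BOM never reaches the translation step.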
These PTFs update the underlying MI instruction, XLATEMB, which is used by many areas of the operating system, including RPG's XML-INTO and XML-SAX opcodes.
If you had to write code to strip these characters, you no longer have to! Just apply the PTFs, and life will be good again.
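For readers who never saw such a workaround, the idea was simply to drop the BOM before the data was translated. The following is a rough sketch of that idea in Python rather than RPG; the function name utf8_xml_to_ebcdic is hypothetical, and the choice of cp037 as the EBCDIC code page is an assumption made for the example.

```python
def utf8_xml_to_ebcdic(data: bytes, ebcdic_codec: str = "cp037") -> bytes:
    """Translate UTF-8 XML bytes to EBCDIC, dropping a leading BOM.

    Hypothetical helper; cp037 is an assumed target code page.
    """
    # The "utf-8-sig" codec decodes UTF-8 and discards a leading BOM
    # if one is present; input without a BOM decodes unchanged.
    text = data.decode("utf-8-sig")
    return text.encode(ebcdic_codec)
```

With or without a BOM on the input, the translated output is the same, which is the behavior the PTFs are described as restoring in the operating system itself.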
Paris & Gantner recently blogged about these PTFs.