-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug or not, when using xmltoolkit/xmlservice with invalid characters in IBM i database field #178
Comments
@ChriMech You could specify the binary data type, which returns data as a hex string. In your PHP code, you could then convert the hex characters into something usable. IbmiToolkit/ToolkitApi/Toolkit.php Line 1631 in 5ce2382
Try this and let us know if you receive hex characters that you can work with. |
If you are curious what XMLSERVICE does for binary type, here is where some of the relevant XMLSERVICE code for the binary type (return as hex) is done:
|
Thank you, but to be honest, I had something different in my mind, i.e. a suggestion for a future improvement of XMLSERVICE. Sometimes by mistake, text within character based Database fields (A-Z, etc.) contain some Bytes with EBCDIC Codes below x'40'. This is often caused by improper data import from non IBM i data sources to IBM i tables, when character conversion from UTF-8 to EBCDIC CCSID 1141 is not done correctly with special german or european characters like ä, ö, ü. In many cases this is not a big problem, because most IBM i related functions substitute “bad” characters with a replacement symbol (mainly a solid block or a square). This is true within all 5250 greenscreen applications, but also when using SQL to show table contents, or when using SQL from within PHP scripts. Even the remote program call APIs included in IBM i Client Access perform this substitution automatically, not to mention the old Net.Data scripting language on IBM i. Only XMLSERVICE does nothing like this, but produces invalid xml when such bad characters are encountered. This invalid xml eventually results in fatal errors within the toolkit. So my proposal for improvement of XMLSERVICE is: |
@ChriMech No guarantees, but would these options to ignore errors help on the PHP side? Perhaps you could test your XML example with this command: |
@ChriMech Have you had a chance to test a "bad" XML string with the command above? If it helps, we could incorporate these options into the toolkit. Thank you. |
@alanseiden I use the php xmltoolkit together with CW to call RPG programs. To do the test you suggested, i modified the php code in .../xmltoolkit/1.8.5/ToolkitServiceXML.php on line 807 (within function getParamsFromXml() ) to read like this: With this, the "ugly" PHP warnings don't come up anymore, but the function call to getParamsFromXml() returns false and produces the bad.xml file. (see attached zip file bad.zip). Consequently the toolkit program call fails. The invalid character within the xml string is a x'1A' shown in lines 36 and 50 of bad.xml. |
Thanks! We'll review your results. |
Although I understand the intent behind this issue, the difficulty of deciding what is "bad" data in a particular character set will make this request unlikely to be implemented unless someone submits a PR that can be tested under various character encodings. |
I don't object closing the issue. It really is a rare case.
|
@ChriMech I also encountered this issue when creating the CW and calling various "Q*" system APIs. Certain data structure fields produced binary data. I handled them using the "hole" or "binary" data types. The difficulty would be if we could not predict the binary data in advance, as in your case. @ChriMech I don't mind keeping the issue open if we want to resolve it on the PHP side, but if it's going to be an xmlservice enhancement then we should move it to the xmlservice project. Am I correct in thinking you feel it should be part of xmlservice? I do recall, for comparison, that the Easycom toolkit replaced binary characters with spaces. In the xmlservice project, we need to get momentum to improve the test suite structure. |
I'm going to reopen this issue. We have some ideas. |
I am using php xmltoolkit with xmlservice to fetch text from an ibm i database file through a call to an RPG programm. Everything works fine, unless the character database field contains "invalid" characters below EBCDIC x'40' (i.e x'3F'). This surely should not be, but somehow the database has thousands of records with one or more of this non-text characters within a text field, which should include an article description.
When the RPG Program call fetches such a text field through an RPG program parameter, i get this :
Warning: simplexml_load_string(): Entity: line 36: parser error : CData section not finished Articl in /QOpenSys/pkgs/share/zendphp/libraries/xmltoolkit/1.8.5/ToolkitServiceXML.php on line 807
Warning: simplexml_load_string(): <data var='TEXT' type='50a' ><![CDATA[Article �« description]]></data> in /QOpenSys/pkgs/share/zendphp/libraries/xmltoolkit/1.8.5/ToolkitServiceXML.php on line 807
Warning: simplexml_load_string(): ^ in /QOpenSys/pkgs/share/zendphp/libraries/xmltoolkit/1.8.5/ToolkitServiceXML.php on line 807
Warning: simplexml_load_string(): Entity: line 36: parser error : PCDATA invalid Char value 26 in /QOpenSys/pkgs/share/zendphp/libraries/xmltoolkit/1.8.5/ToolkitServiceXML.php on line 807
The invalid characters show as
�«
My question: Is this a bug in xmlservice or ToolkitServiceXML.php or do you have any suggestions, on how i should deal with the situation?
Thank you very much,
Christian
The text was updated successfully, but these errors were encountered: