Blank fields are not recognized. #707

Open
dlbfriend opened this issue Sep 2, 2024 · 3 comments
Labels
question Further information is requested

Comments

@dlbfriend

dlbfriend commented Sep 2, 2024

Background [Optional]

Hello
When I parse a data file with Cobrix, the data file and the copybook do not match. If a field is null or empty, the data file does not retain the space for it, and the following columns are shifted backward. There is a column that holds the record length, and I use it with the fixed-length record option. From position 40 onward the values are incorrect. Any suggestions for handling null values that do not retain their space?

Question

If empty fields are not represented by a space or a null character in the hex data, can I still parse the data file with Cobrix?

@dlbfriend dlbfriend added the question Further information is requested label Sep 2, 2024
@dlbfriend dlbfriend changed the title null was dectected in data file Blank fields are not recognized. Sep 2, 2024
@yruslan
Collaborator

yruslan commented Sep 4, 2024

You can add .option("debug", "true") so that a debug field is added for each field of your data. This way you can see which values were converted incorrectly and adjust the copybook to match the data. null is usually returned when a value cannot be parsed according to the specified field definition (e.g. an invalid COMP-3 number).
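The hex strings in the `_debug` fields can be checked by hand. The sketch below is not part of Cobrix; `DebugHex`, `asText`, and `asComp3` are hypothetical helper names, and EBCDIC code page 037 is an assumption (use the code page your data was actually written with). It decodes a debug value either as EBCDIC text or as a packed decimal (COMP-3), which helps to see whether the bytes under a field are plausible for its definition.

```scala
import java.nio.charset.Charset

object DebugHex {
  // Assumption: the data uses EBCDIC code page 037. The charset ships with
  // standard JDK builds but may be missing from trimmed-down runtimes.
  private val Ebcdic: Charset = Charset.forName("IBM037")

  /** Converts a hex string such as "C3C5F1" into raw bytes. */
  def hexToBytes(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "hex string must have an even number of digits")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  /** Decodes the bytes as EBCDIC text, as a PIC X field would hold them. */
  def asText(hex: String): String = new String(hexToBytes(hex), Ebcdic)

  /**
   * Decodes the bytes as a packed decimal (COMP-3): every nibble is a decimal
   * digit except the last one, which is the sign. Returns None when a nibble
   * is invalid, which is the situation where a parser would yield null.
   */
  def asComp3(hex: String): Option[BigInt] = {
    if (hex.length < 2) None
    else {
      val digits = hex.dropRight(1)
      val sign = hex.last.toUpper
      val digitsValid = digits.forall(c => c >= '0' && c <= '9')
      if (!digitsValid) None
      else sign match {
        case 'C' | 'F' | 'A' | 'E' => Some(BigInt(digits))   // positive or unsigned
        case 'D' | 'B'             => Some(-BigInt(digits))  // negative
        case _                     => None                   // not a sign nibble
      }
    }
  }
}
```

With these helpers, `DebugHex.asText("C3C5F1")` gives `"CE1"` and `DebugHex.asComp3("000000000F")` gives `Some(0)`, while a value such as `"624F00"` yields `None` because `F` is not a decimal digit in the middle of a packed number. A numeric field that decodes to `None` is a sign that the bytes under it belong to a different field.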

@dlbfriend
Author

I use the options below:
val df = spark.read
  .format("cobol")
  .option("copybook", copybookPath)
  .option("record_format", "F")
  .option("record_length_field", "L-LENGTH")
  .option("debug", "true")
  .load(inputFilePath)
L-LENGTH is a length field.


	"L_TR_XCHG_IND": "N",
	"L_TR_XCHG_IND_debug": "D5",
	"L_TR_DEP_DATE": 0,
	"L_TR_DEP_DATE_debug": "000000000F"
},

-- "L_TR_XCHG_DATA": {
-- "L_TR_XCHG_CURRNCY": "CE1",
-- "L_TR_XCHG_CURRNCY_debug": "C3C5F1",
-- "L_TR_XCHG_AMT_DEC": "",
-- "L_TR_XCHG_AMT_DEC_debug": "40",
-- "L_TR_XCHG_AMT_debug": "020200610F0235959C",
-- "L_TR_XCHG_RATE_debug": "F2F0F04040404040",
-- "L_TR_XCHG_OVRD": "",
-- "L_TR_XCHG_OVRD_debug": "40"

-- },
"L_TR_CAPTURE_DATA": {
"L_TR_ORIGIN": "",
"L_TR_ORIGIN_debug": "40020210",
"L_TR_OPER": "| 1",
"L_TR_OPER_debug": "624F001F0000F100",
"L_TR_SYS_DATE_debug": "00002C001C",
"L_TR_ENTRY_TIME_debug": "02020061",
"L_TR_BATCH_debug": "0F0000",
"L_TR_SEQ_debug": "000004",
"L_TR_COMMENT": "INT01 S",
"L_TR_COMMENT_debug": "59249C00000CC9D5E3F0F140E20000",

The L_TR_XCHG_DATA {} group should actually contain no data, but after parsing the data file the values are shifted backward.
There is no delimiter between L_TR_DEP_DATE_debug and L_TR_XCHG_CURRNCY_debug, and the null values are at different positions in each record.
Is there a way to automatically identify null or empty values in the data file?

@yruslan
Collaborator

yruslan commented Sep 9, 2024

It seems L-LENGTH does not fully reflect record length. Maybe some parts of the record are not counted towards the record size. This is why the first record seems to be parsed properly, but the second one is corrupted.
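To illustrate why an under-counted length corrupts every record after the first, here is a minimal sketch. It is not how Cobrix is implemented; `RecordSplit`, the one-byte length field at the start of each record, and the sample bytes are all assumptions made for the example.

```scala
object RecordSplit {
  /**
   * Splits a byte stream into records. The first byte of each record is read
   * as its declared length, and `adjustment` is added to get the real size.
   */
  def split(data: Array[Byte], adjustment: Int): List[Array[Byte]] = {
    @scala.annotation.tailrec
    def loop(offset: Int, acc: List[Array[Byte]]): List[Array[Byte]] =
      if (offset >= data.length) acc.reverse
      else {
        val declared = data(offset) & 0xFF          // unsigned length byte
        val size = declared + adjustment
        require(size > 0, s"non-positive record size at offset $offset")
        val end = math.min(offset + size, data.length)
        loop(end, data.slice(offset, end) :: acc)
      }
    loop(0, Nil)
  }

  // Two 6-byte records whose length byte says 4, i.e. 2 bytes are not counted.
  val sample: Array[Byte] = Array[Byte](
    4, 65, 65, 65, 65, 65,
    4, 66, 66, 66, 66, 66
  )
}
```

With no adjustment, `RecordSplit.split(RecordSplit.sample, 0)` cuts the first record after 4 bytes, so the second record starts in the middle of the first one and its "length" is read from a data byte. With `adjustment = 2` both records come out as 6 bytes each. The first record always starts at the right offset, which is why it can look correct even when the length is wrong.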

You can adjust the size by specifying arithmetic expressions in the record length field definition. For example:

// Adjust the record length by adding 4 bytes
.option("record_length_field", "L_LENGTH + 4")

or

// Adjust the record length by subtracting 4 bytes
.option("record_length_field", "L_LENGTH - 4")

Note. It is very important to use '_' in the field name there. Otherwise Cobrix might confuse the dash with the minus sign (for example, L - LENGTH - 4).
