Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spine should be able to process a value regex on output or input-output. #210

Open
godsyn opened this issue Jun 23, 2021 · 17 comments
Open
Milestone

Comments

@godsyn
Copy link

godsyn commented Jun 23, 2021

From a verbose query:

Click to show Data Query output for field 'volTotalSize2'
Total: 13.690000, Delta: 0.000000, Located input field 'volTotalSize2' [walk]
Total: 18.810000, Delta: 5.120000, Executing SNMP walk for data @ '.1.3.6.1.4.1.24681.2.2.19.1.3'
Found item [volTotalSize2='50.7 (50.7 TB)'] index: 1 [from regexp value parse]
Click to show Data Query output for field 'volUsedSize2'
Total: 18.810000, Delta: 0.000000, Located input field 'volUsedSize2' [walk]
Total: 21.320000, Delta: 2.510000, Executing SNMP walk for data @ '.1.3.6.1.4.1.24681.2.2.19.1.4'
Found item [volUsedSize2='23.0 (23.0 TB)'] index: 1 [from regexp value parse]

The XML used:

<interface>
        <name>QNAP Volumes via SNMP</name>
        <description>Queries a QNAP NAS for a list of monitorable volume</description>
        <oid_index>.1.3.6.1.4.1.24681.2.2.19.1.1</oid_index>
        <oid_num_indexes>.1.3.6.1.4.1.24681.2.2.18</oid_num_indexes>
        <index_order>volDescr:volStatus:volIndex</index_order>
        <index_order_type>numeric</index_order_type>
        <index_title_format>|chosen_order_field|</index_title_format>

        <fields>
                <volIndex>
                        <name>Index</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.24681.2.2.19.1.1</oid>
                </volIndex>
                <volDescr>
                        <name>Description</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.24681.2.2.19.1.2</oid>
                </volDescr>
                <volTotalSize2>
                        <name>Total Size</name>
                        <method>walk</method>
                        <source>VALUE/REGEXP:([0-9]*(\.[0-9]*)?) TB$</source>
                        <direction>input-output</direction>
                        <oid>.1.3.6.1.4.1.24681.2.2.19.1.3</oid>
                </volTotalSize2>
                <volUsedSize2>
                        <name>Used Size</name>
                        <method>walk</method>
                        <source>VALUE/REGEXP:([0-9]*(\.[0-9]*)?) TB$</source>
                        <direction>input-output</direction>
                        <oid>.1.3.6.1.4.1.24681.2.2.19.1.4</oid>
                </volUsedSize2>
                <volStatus>
                        <name>Status</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.24681.2.2.19.1.5</oid>
                </volStatus>
        </fields>
</interface>

The results received from cacti.log using spine:

2021-06-23 18:12:03 - SPINE: Poller[Main Poller] PID[15243] Device[SAN6 - QNAP ((IP removed))] HT[1] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] TT[2326.70] SNMP: v2: (IP removed), dsname: volTotalSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.3.1, value: 50.7
2021-06-23 18:12:01 - SPINE: Poller[Main Poller] PID[15243] Device[SAN6 - QNAP ((IP removed))] HT[1] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] TT[1188.25] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, value: U
2021-06-23 18:12:01 - SPINE: Poller[Main Poller] PID[15243] WARNING: Invalid Response, Device[SAN6 - QNAP ((IP removed))] HT[1] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, value: 23.0 TB 

The results received from cacti.log using cmd.php:

2021-06-23 18:17:03 - POLLER: Poller[Main Poller] PID[17114] Device[SAN6 - QNAP ((IP removed))] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] TT[1237.91] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, output: 23.0
2021-06-23 18:17:01 - POLLER: Poller[Main Poller] PID[17114] Device[SAN6 - QNAP ((IP removed))] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] TT[1302.64] SNMP: v2: (IP removed), dsname: volTotalSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.3.1, output: 50.7 

Cacti and spine version: 1.2.17

@TheWitness
Copy link
Member

Increase your timeouts, I'm sure the QNAP is just not keeping up. This is a common error. Also, watch your MAX OID's at the Device level. Some SNMP Agents don't handle settings over 1 well.

@godsyn
Copy link
Author

godsyn commented Jun 28, 2021

Upped timeouts to 100000. 1 OID was already set.
Same results.
Using cmd.php, no issue.
Using spine:

2021-06-28 13:27:01 - SPINE: Poller[Main Poller] PID[2645189] WARNING: Invalid Response, Device[SAN6 - QNAP ((IP removed))] HT[1] DS[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0] Graphs[SAN6 - QNAP ((IP removed)) - Space - Lun0San6_0 ] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, value: 29.3 TB

@netniV
Copy link
Member

netniV commented Jun 28, 2021

What do you see if you run this in debug mode manually?

<path_to_spine>/spine -f <host_id> -l <host_id> -C <path to spine config>/spine.conf -R -V 5

@godsyn
Copy link
Author

godsyn commented Jul 6, 2021

user@cacti:~$ sudo /usr/local/bin/spine -f 39 -l 39 -C /etc/spine.conf -R -V 5
SPINE: Using spine config file [/etc/spine.conf]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The path_php_server variable is /var/www/cacti/script_server.php
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The path_cactilog variable is /var/www/cacti/log/cacti.log
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The log_destination variable is 1 (FILE)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The path_php variable is /bin/php
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The availability_method variable is 2
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The ping_recovery_count variable is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The ping_failure_count variable is 2
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The ping_method variable is 2
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The ping_retries variable is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The ping_timeout variable is 400
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The snmp_retries variable is 3
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The log_perror variable is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The log_pwarn variable is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The boost_redirect variable is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The boost_rrd_update_enable variable is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The log_pstats variable is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The threads variable is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The polling interval is 60 seconds
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The number of concurrent processes is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The script timeout is 10
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The selective_device_debug variable is
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The spine_log_level variable is 2
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The number of php script servers to run is 5
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: StartDevice='39', EndDevice='39', TotalPHPScripts='1'
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The PHP Script Server is Required
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The Maximum SNMP OID Get Size is 10
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Total Connections made 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Creating Local Connection Pool of 1 threads.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Creating Local Connection 0.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Total Connections made 2
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Version 1.2.17 starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: MySQL is Thread Safe!
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Spine running as 0 UID, 0 EUID
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Spine is running as root.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Spine has got ICMP
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Initializing Net-SNMP API
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Issues with SNMP Header Version information, assuming old version of Net-SNMP.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Initializing PHP Script Server(s)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[0] PHP Script Server Routine Starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[0] PHP Script Server About to FORK Child Process
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[0] PHP Script Server Child FORK Success
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[0] Confirmed PHP Script Server running using readfd[8], writefd[7]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[1] PHP Script Server Routine Starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[1] PHP Script Server About to FORK Child Process
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[1] PHP Script Server Child FORK Success
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[1] Confirmed PHP Script Server running using readfd[10], writefd[9]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[2] PHP Script Server Routine Starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[2] PHP Script Server About to FORK Child Process
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[2] PHP Script Server Child FORK Success
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[2] Confirmed PHP Script Server running using readfd[12], writefd[11]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[3] PHP Script Server Routine Starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[3] PHP Script Server About to FORK Child Process
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[3] PHP Script Server Child FORK Success
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[3] Confirmed PHP Script Server running using readfd[14], writefd[13]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[4] PHP Script Server Routine Starting
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[4] PHP Script Server About to FORK Child Process
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[4] PHP Script Server Child FORK Success
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: SS[4] Confirmed PHP Script Server running using readfd[16], writefd[15]
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Spine will support multithread device polling.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Initial Value of Active Threads is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Valid Thread to be Created
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Active Threads is 1, Pending is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: In Poller, About to Start Polling of Device for Device ID 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Traversing Local Connection Pool for free connection.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Checking Local Pool ID 0.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Allocating Local Pool ID 0.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[0] HT[1] Total Time: 0.00083 Seconds
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Freeing Local Pool ID 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Device[0] HT[1] DEBUG: HOST COMPLETE: About to Exit Device Polling Thread Function
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: The Value of Active Threads is 0 for Device ID 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Active Threads is 0, Pending is 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Valid Thread to be Created
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Active Threads is 1, Pending is 1
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: In Poller, About to Start Polling of Device for Device ID 39
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Traversing Local Connection Pool for free connection.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Checking Local Pool ID 0.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Allocating Local Pool ID 0.
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] INFO: SNMP Device '(IP removed)' has timeout 100000000 (100000), retries 3
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] DEBUG: Entering SNMP Ping
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] SNMP Result: Device responded to SNMP
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] Checking for System Information Update
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] Updating Full System Information Table
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] DEBUG: Device[39] HT[1] RECACHE: Processing 4 items in the auto reindex cache for '(IP removed)'
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DQ[4] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 128377288 < output: 128379905)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DQ[6] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 128377288 < output: 128379905)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DQ[7] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 128377288 < output: 128379905)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DQ[11] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 128377288 < output: 128379905)
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] NOTE: There are '10' Polling Items for this Device
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] INFO: SNMP Device '(IP removed)' has timeout 100000000 (100000), retries 3
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[577] TT[0.19] SNMP: v2: (IP removed), dsname: load_15min, oid: .1.3.6.1.4.1.2021.10.1.3.3, value: 4.17
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[582] TT[1.66] SNMP: v2: (IP removed), dsname: mem_total, oid: .1.3.6.1.4.1.2021.4.5.0, value: 47853204
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[592] TT[0.19] SNMP: v2: (IP removed), dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.5, value: 1794772743
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[592] TT[0.15] SNMP: v2: (IP removed), dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.5, value: 3399406336
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[599] TT[0.15] SNMP: v2: (IP removed), dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.9, value: 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[599] TT[0.17] SNMP: v2: (IP removed), dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.9, value: 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[599] TT[0.26] SNMP: v2: (IP removed), dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.9, value: 0
2021-07-06 12:23:27 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[599] TT[0.51] SNMP: v2: (IP removed), dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.9, value: 0
2021-07-06 12:23:29 - SPINE: Poller[1] PID[3624258] WARNING: Invalid Response, Device[39] HT[1] DS[602] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, value: 28.9 TB
2021-07-06 12:23:29 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[602] TT[1906.25] SNMP: v2: (IP removed), dsname: volUsedSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.4.1, value: U
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] DS[602] TT[3154.81] SNMP: v2: (IP removed), dsname: volTotalSize2, oid: .1.3.6.1.4.1.24681.2.2.19.1.3.1, value: 50.7
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Setting up writes to local database
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] Device[39] HT[1] Total Time: 3.2 Seconds
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Freeing Local Pool ID 0
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Device[39] HT[1] DEBUG: HOST COMPLETE: About to Exit Device Polling Thread Function
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: The Value of Active Threads is 0 for Device ID 39
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] Active Threads is 0, Pending is 0
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] The Final Value of Threads is 0
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Closing Local Connection Pool ID 0
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Thread Cleanup Complete
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: SS[0] Script Server Shutdown Started
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: SS[1] Script Server Shutdown Started
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: SS[2] Script Server Shutdown Started
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: SS[3] Script Server Shutdown Started
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: SS[4] Script Server Shutdown Started
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: PHP Script Server Pipes Closed
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Allocated Variable Memory Freed
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: MYSQL Free & Close Completed
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] DEBUG: Net-SNMP Close Completed
2021-07-06 12:23:31 - SPINE: Poller[1] PID[3624258] Time: 3.5355 s, Threads: 1, Devices: 2

@netniV netniV transferred this issue from Cacti/cacti Jul 6, 2021
@godsyn
Copy link
Author

godsyn commented Jul 8, 2021

Issue persists in 1.2.18 (cacti and spine)

@netniV
Copy link
Member

netniV commented Jul 9, 2021

Yes, we are working towards a patch this weekend that will resolve the issues.

@TheWitness
Copy link
Member

The issue is that the snmpagent is returning a suffix of "TB", which means the response is likely a string. Can you run the folloiwng command:

snmpget -c blah -v2c hostname .1.3.6.1.4.1.24681.2.2.19.1.4.1

Post the output.

@TheWitness
Copy link
Member

Spine itself does not process the regex to remove the numeric suffix. There has been a feature request for this in the very past, but no one delivered a pull request to adopt it. So, you should move your Data Query to a Script or Script server based one.

@TheWitness TheWitness changed the title Spine not returning results when polling. Failes to honor 2nd (identical) regex. Spine should be able to process a value regex on output or input-output. Sep 5, 2021
@netniV
Copy link
Member

netniV commented Sep 6, 2021

@TheWitness if you can remind me which regex feature/code block this relates to in the core, I can make an attempt. I'll do it on it's own branch for testing, as I don't want to mess too much with the stability for 1.2.19 since we have now got things settled down again.

@TheWitness
Copy link
Member

TheWitness commented Sep 6, 2021

The real issue is this in the XML file: <source>VALUE/REGEXP:([0-9]*(\.[0-9]*)?) TB$</source>. It implies that the output from the snmpget should be regexed. This is fine, and the pcre2 regex functions make it easy. However, the problem is that there is no column in the poller_item table currently used to hold a regex. There are two columns that could be repurposed for that. Those are arg2 and arg3 (we never ever used them) in all my 19 years of Cacti.

What I would support, from a design perspective, would be the following:

  1. Rename arg2 to regex
  2. Have the various functions that update the poller_item table to add the regex from the xml file or from the data input method, for simple scripts.
  3. Update spine to read the regex column into memory, and then if it's nonblank, execute the regex_replace on the output data.
  4. Update cmd.php to do the same.
  5. Drop column arg3.
  6. Possibly rename arg1 to something else like data/arguments/command.

The issue with not doing it in the poller_item table is that it will otherwise require a bunch of file open/close activities, and some caching algo for all the potentially millions of data sources that Cacti will need to process. Not pretty.

@netniV netniV added this to the v1.3.0 milestone Sep 7, 2021
@oijkn
Copy link

oijkn commented Feb 8, 2022

My case may be similar, spine returns a value: U using the Get SNMP Data (Indexed) method of Cacti.

On the other hand, with the snmpwalk command, the value returned is the expected one.

This problem is present on my two Cacti environments.
The first one uses a multi-poller architecture (FreeBSD 12/Cacti v1.2.16/Spine v1.2.15).
The second one uses the normal mode (FreeBSD 12/Cacti v1.2.19/Spine v1.2.17).

The XML used:

<interface>
	<name>Get SNMP Optical</name>
	<description>Queries a host for optical</description>
	<oid_index>.1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20</oid_index>
	<oid_index_parse>OID/REGEXP:.*\.([0-9]\.[0-9]\.[0-9]{1,3})</oid_index_parse>
	<index_order>Index</index_order>
	<index_order_type>alphabetic</index_order_type>
	<fields>
		<Index>
			<name>Index</name>
			<method>walk</method>
			<source>index</source>
			<direction>input</direction>
		</Index>
		<SlotCardPort>
			<name>Slot/Card/Port</name>
			<method>walk</method>
			<source>OIDVALUE/REGEXP:.*([0-9]+)\.([0-9]+)\.([0-9]+):\1/\2/\3</source>
			<direction>input</direction>
			<oid>.1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20</oid>
		</SlotCardPort>
		<OpticalTx>
			<name>Optical Power Tx</name>
			<method>get</method>
			<source>VALUE/REGEXP:(^[-]?[0-9]+\.[0-9]+) dBm$</source>
			<direction>input-output</direction>
			<oid>.1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20</oid>
		</OpticalTx>
		<OpticalRx>
			<name>Optical Power Rx</name>
			<method>get</method>
			<source>VALUE/REGEXP:(^[-]?[0-9]+\.[0-9]+) dBm$</source>
			<direction>input-output</direction>
			<oid>.1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21</oid>
		</OpticalRx>
	</fields>
</interface>

Spine v1.2.15 cli result:

Device[2940] HT[1] RECACHE: Processing 1 items in the auto reindex cache for 'xxx.xxx.xxx.xxx
Device[2940] HT[1] DQ[32] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 119055037 < output: 119074376)
Device[2940] HT[1] NOTE: There are '15' Polling Items for this Device
Device[2940] INFO: SNMP Device 'xxx.xxx.xxx.xxx' has timeout 2000000 (2000), retries 2
Device[2940] HT[1] DS[661093] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.1, value: U
Device[2940] HT[1] DS[661093] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.1, value: U
Device[2940] HT[1] DS[581509] SNMP: v2: xxx.xxx.xxx.xxx, dsname: ucd_sysUptime, oid: .1.3.6.1.2.1.1.3.0, value: 119074387
Device[2940] HT[1] DS[581516] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.2, value: U
Device[2940] HT[1] DS[581516] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.2, value: U
Device[2940] HT[1] DS[581517] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.1, value: U
Device[2940] HT[1] DS[581517] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.1, value: U
Device[2940] HT[1] DS[581518] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.2, value: U
Device[2940] HT[1] DS[581518] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.2, value: U
Device[2940] HT[1] DS[581519] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.1, value: U
Device[2940] HT[1] DS[581519] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.1, value: U
Device[2940] HT[1] DS[581520] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.2, value: U
Device[2940] HT[1] DS[581520] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.2, value: U
Device[2940] HT[1] DS[581521] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.3, value: U
Device[2940] HT[1] DS[581521] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.3, value: -0.06

Spine v1.2.17 cli result:

Total[0.6737] DEBUG: Device[91] HT[1] RECACHE: Processing 1 items in the auto reindex cache for 'xxx.xxx.xxx.xxx'
Total[0.6819] Device[91] HT[1] DQ[50] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 118136509 < output: 118140586)
Total[0.6849] Device[91] HT[1] NOTE: There are '15' Polling Items for this Device
Total[0.6850] Device[91] INFO: SNMP Device 'xxx.xxx.xxx.xxx' has timeout 2500000 (2500), retries 3
Total[0.7975] Device[91] HT[1] DS[1611] TT[112.36] SNMP: v2: xxx.xxx.xxx.xxx, dsname: ucd_sysUptime, oid: .1.3.6.1.2.1.1.3.0, value: 118140588
Total[0.7976] Device[91] HT[1] DS[1612] TT[112.46] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.1, value: U
Total[0.7976] Device[91] HT[1] DS[1612] TT[112.50] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.1, value: U
Total[0.7976] Device[91] HT[1] DS[1613] TT[112.52] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.2, value: U
Total[0.7977] Device[91] HT[1] DS[1613] TT[112.54] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.2, value: U
Total[0.7977] Device[91] HT[1] DS[1614] TT[112.56] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.1, value: U
Total[0.7977] Device[91] HT[1] DS[1614] TT[112.58] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.1, value: U
Total[0.7977] Device[91] HT[1] DS[1615] TT[112.60] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.2, value: U
Total[0.7977] Device[91] HT[1] DS[1615] TT[112.63] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.2, value: U
Total[0.7978] Device[91] HT[1] DS[1616] TT[112.66] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.1, value: U
Total[0.8602] Device[91] HT[1] DS[1616] TT[62.41] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.1, value: -5.05
Total[0.8605] Device[91] HT[1] DS[1617] TT[62.68] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.2, value: -4.61
Total[0.8607] Device[91] HT[1] DS[1617] TT[62.84] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.2, value: -5.12
Total[0.8608] Device[91] HT[1] DS[1618] TT[62.99] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.3, value: -14.5
Total[0.8609] Device[91] HT[1] DS[1618] TT[63.14] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.3, value: -0.01

Snmpwalk cli result:

root@cacti:/root#snmpwalk -v2c -c ***** xxx.xxx.xxx.xxx .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.1.1 = STRING: "-3.49 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.1.2 = STRING: "-0.38 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.2.1 = STRING: "-5.79 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.2.2 = STRING: "-4.23 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.6.1 = STRING: "-5.05 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.6.2 = STRING: "-5.12 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.20.1.6.3 = STRING: "-0.01 dBm"

root@cacti:/root#snmpwalk -v2c -c ***** xxx.xxx.xxx.xxx .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.1.1 = STRING: "-2.49 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.1.2 = STRING: "-6.61 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.2.1 = STRING: "-5.75 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.2.2 = STRING: "-5.51 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.6.1 = STRING: "-5.75 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.6.2 = STRING: "-4.61 dBm"
SNMPv2-SMI::enterprises.7737.7.2.9.2.1.8.1.1.21.1.6.3 = STRING: "-14.5 dBm"

@TheWitness
Copy link
Member

You may have to reduce your max_oid's a bit. What is it set to?

@oijkn
Copy link

oijkn commented Feb 8, 2022

You may have to reduce your max_oid's a bit. What is it set to?

It's set to 1

@TheWitness
Copy link
Member

What do you have that set at?

@oijkn
Copy link

oijkn commented Feb 9, 2022

What do you have that set at?

The configured value for the Maximum OIDs Per Get Request is 1.

Just for curiosity, I increased this value to 30 and now Spine returns all expected values.

Device[2940] HT[1] RECACHE: Processing 1 items in the auto reindex cache for 'xxx.xxx.xxx.xxx
Device[2940] HT[1] DQ[32] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 123314694 < output: 123334429)
Device[2940] HT[1] NOTE: There are '15' Polling Items for this Device
Device[2940] INFO: SNMP Device 'xxx.xxx.xxx.xxx' has timeout 2000000 (2000), retries 2
Device[2940] HT[1] DS[661093] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.1, value: -2.52
Device[2940] HT[1] DS[661093] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.1, value: -3.52
Device[2940] HT[1] DS[581509] SNMP: v2: xxx.xxx.xxx.xxx, dsname: ucd_sysUptime, oid: .1.3.6.1.2.1.1.3.0, value: 123334434
Device[2940] HT[1] DS[581516] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.1.2, value: -6.58
Device[2940] HT[1] DS[581516] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.1.2, value: -0.39
Device[2940] HT[1] DS[581517] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.1, value: -5.75
Device[2940] HT[1] DS[581517] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.1, value: -5.79
Device[2940] HT[1] DS[581518] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.2.2, value: -5.52
Device[2940] HT[1] DS[581518] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.2.2, value: -4.24
Device[2940] HT[1] DS[581519] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.1, value: -5.76
Device[2940] HT[1] DS[581519] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.1, value: -5.05
Device[2940] HT[1] DS[581520] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.2, value: -4.59
Device[2940] HT[1] DS[581520] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.2, value: -5.12
Device[2940] HT[1] DS[581521] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalRx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.21.1.6.3, value: -14.5
Device[2940] HT[1] DS[581521] SNMP: v2: xxx.xxx.xxx.xxx, dsname: OpticalTx, oid: .1.3.6.1.4.1.7737.7.2.9.2.1.8.1.1.20.1.6.3, value: -0.02

@TheWitness
Copy link
Member

That's quite off. Sounds like maybe an agent bug. Can you do a tcpdump with boat setup. You can setup filters to limit the downstream ip address and try to not use a community string that you are okay sharing. The you can email the to [email protected]. Once I review them I can get more information on what happening.

Can you mention the make and model of the downstream device?

@TheWitness
Copy link
Member

Okay QNAS. Second best in the SOHO NAS space. Let me know if it's still under support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants