Improve the way the download of exernal database is working #224

Open
arno-st opened this issue Dec 23, 2024 · 11 comments

Comments

@arno-st
Contributor

arno-st commented Dec 23, 2024

On a slow internet connection, downloading the various databases can take a long time.
And because the download runs inside the Cacti process, it can cause timeouts on the poller.
23/12/2024 00:08:01 - POLLER: Poller[Main Poller] PID[1021061] WARNING: Cactid/Cron is out of sync with the Poller Interval! The Poller Interval is '60' seconds, with a maximum of a '60' second Cactid/Cron, but 419.4 seconds have passed since the last poll!
23/12/2024 00:07:15 - FLOWVIEW IRR UPDATE: Downloading arin.db.gz
23/12/2024 00:07:15 - FLOWVIEW IRR UPDATE: IRR Source:arin, Current Serial:4202309, Last Serial:4157315
23/12/2024 00:07:08 - FLOWVIEW IRR UPDATE: Downloading apnic.db.rtr-set.gz
23/12/2024 00:07:03 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route6.gz
23/12/2024 00:07:01 - MAILER INFO: Mail successfully sent via SMTP from 'Cacti 1.2.x Prod <EMAIL>', to 'Administrator <EMAIL>', cc '', bcc '', and took 0.07 seconds, Subject 'Cacti System Warning'
23/12/2024 00:07:01 - POLLER: Poller[Main Poller] PID[1021029] WARNING: Cactid/Cron is out of sync with the Poller Interval! The Poller Interval is '60' seconds, with a maximum of a '60' second Cactid/Cron, but 359.4 seconds have passed since the last poll!
23/12/2024 00:06:58 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route.gz
23/12/2024 00:06:53 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route-set.gz
23/12/2024 00:06:48 - FLOWVIEW IRR UPDATE: Downloading apnic.db.role.gz
23/12/2024 00:06:43 - FLOWVIEW IRR UPDATE: Downloading apnic.db.peering-set.gz
23/12/2024 00:06:38 - FLOWVIEW IRR UPDATE: Downloading apnic.db.organisation.gz
23/12/2024 00:06:33 - FLOWVIEW IRR UPDATE: Downloading apnic.db.mntner.gz
23/12/2024 00:06:28 - FLOWVIEW IRR UPDATE: Downloading apnic.db.key-cert.gz
23/12/2024 00:06:23 - FLOWVIEW IRR UPDATE: Downloading apnic.db.irt.gz
23/12/2024 00:06:18 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inetnum.gz
23/12/2024 00:06:13 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inet6num.gz
23/12/2024 00:06:08 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inet-rtr.gz
23/12/2024 00:06:03 - FLOWVIEW IRR UPDATE: Downloading apnic.db.filter-set.gz
23/12/2024 00:06:01 - MAILER INFO: Mail successfully sent via SMTP from 'Cacti 1.2.x Prod <EMAIL>', to 'Administrator <EMAIL>', cc '', bcc '', and took 0.06 seconds, Subject 'Cacti System Warning'
23/12/2024 00:06:01 - POLLER: Poller[Main Poller] PID[1020990] WARNING: Cactid/Cron is out of sync with the Poller Interval! The Poller Interval is '60' seconds, with a maximum of a '60' second Cactid/Cron, but 299.4 seconds have passed since the last poll!
23/12/2024 00:05:58 - FLOWVIEW IRR UPDATE: Downloading apnic.db.domain.gz
23/12/2024 00:05:53 - FLOWVIEW IRR UPDATE: Downloading apnic.db.aut-num.gz
23/12/2024 00:05:48 - FLOWVIEW IRR UPDATE: Downloading apnic.db.as-set.gz
23/12/2024 00:05:43 - FLOWVIEW IRR UPDATE: Downloading apnic.db.as-block.gz
23/12/2024 00:05:43 - FLOWVIEW IRR UPDATE: IRR Source:apnic, Current Serial:12769501, Last Serial:12760161
23/12/2024 00:05:01 - MAILER INFO: Mail successfully sent via SMTP from 'Cacti 1.2.x Prod <EMAIL>', to 'Administrator <EMAIL>', cc '', bcc '', and took 0.06 seconds, Subject 'Cacti System Warning'

Maybe we could run the download in the background with an exec, and set a flag to tell flowview_poller_bottom that the download is done; the import could then be done inline. That said, I don't see the import in this log!

Maybe keep the imported serial number and the new serial number in the settings table, so that if the two numbers differ, we know an import has to be done.

or something like that.
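
The two suggestions above (a background download with a "done" flag, plus a stored serial to compare against) could be sketched roughly as follows. This is only an illustration in Python, not Flowview's actual PHP code: the `settings` dict stands in for the Cacti settings table, and every function and key name here (`start_background_download`, `poller_bottom`, `irr_<source>_imported_serial`, and so on) is hypothetical.

```python
import threading

# Stand-in for the Cacti settings table (hypothetical key name).
settings = {"irr_arin_imported_serial": 4157315}


def fetch_current_serial(source):
    """Placeholder: a real version would ask the IRR source for its serial."""
    return {"arin": 4202309}[source]


def download_database(source, done_flag):
    """Placeholder for the slow download; sets the flag when finished."""
    # A real implementation would fetch the .db.gz file(s) here.
    done_flag.set()


def start_background_download(source):
    """Start a background download only if the serial has changed."""
    imported = settings.get(f"irr_{source}_imported_serial")
    current = fetch_current_serial(source)
    if imported == current:
        return None, current  # nothing new, so no download is started
    done_flag = threading.Event()
    worker = threading.Thread(target=download_database, args=(source, done_flag))
    worker.start()
    return done_flag, current


def poller_bottom(source, done_flag, current_serial):
    """Called on each poller cycle; imports only once the download is done."""
    if done_flag is None or not done_flag.is_set():
        return False  # still downloading, or nothing to do: never block
    # A real implementation would import the downloaded file here.
    settings[f"irr_{source}_imported_serial"] = current_serial
    return True
```

The point of the design is that the poller-side function never waits on the network: it only checks a flag and returns immediately if the download has not finished.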

@arno-st changed the title from "Improve the way th download of exernal database is working" to "Improve the way the download of exernal database is working" on Dec 23, 2024
@TheWitness
Member

It's supposed to be in background already.

@arno-st
Contributor Author

arno-st commented Feb 4, 2025

I'm coming back to this one. Maybe the title is wrong, but I do have an issue and can't find where it is.
Every night between 00:00 and 00:10 my poller is disrupted, and I get a lot of system errors telling me that the poller output table is not empty and that the poller is out of sync, like these (3 different messages, but the same night and the same 10-minute window):
WARNING: There are 47 processes detected as overrunning a polling cycle for poller id 1, please investigate.
WARNING: Cactid/Cron is out of sync with the Poller Interval for poller id 1! The Poller Interval is 60 seconds, with a maximum of a 60 seconds, but 482 seconds have passed since the last poll!
Maximum runtime of 58 seconds exceeded for poller id 1. Exiting.

If I disable flowview, everything is fine, so it looks like something is happening at midnight, but I can't figure out what exactly.
Is it the download of the external DB that takes forever to insert into the DB and locks something for too long?
Is it another flowview process that is doing some cleanup and locking the table?

How can I find where the glitch is?

@TheWitness
Member

How large is your system (cores, threads, memory)? At midnight, it downloads the Internet Routing Registry (IRR) databases from around the world. This process does a lot of I/O to the database, so if you are on a tiny VM without enough memory or threads, the system will hang. Also, if you have an "out of band" database backup taking place, and it's not done in a suitable way, your Cacti system will hang as well. If you are using mariabackup, you need to schedule that backup outside the time that the IRR database is being synced.

@arno-st
Contributor Author

arno-st commented Feb 10, 2025

Nothing runs in a VM.

My front end is:
CPU: 2x10 cores @ 2.2 GHz
RAM: 256 GiB
System disk: 1*440TB (OS)
Data disk: 2 drives mirrored, 2.91 TB

The database servers are 2 systems with MariaDB 10.6.17 and Galera:
CPU: 2x10 cores @ 2.2 GHz
RAM: 256 GiB
System disk: 1*440TB (OS)
Data disk: 4 drives mirrored, 2x2.91 TiB, 5.9TB usable

Cacti accesses the DB via MaxScale, configured with one service for Cacti and one for FlowView:

[CactiService]
type=service
connection_keepalive=1000ms
max_connections=16384
router=readwritesplit
servers=lslmysp12,lslmysp11
max_slave_connections=0

[FlowViewService]
type=service
max_connections=16384
router=readwritesplit
servers=lslmysp11,lslmysp12
master_accept_reads=true

@TheWitness
Member

Okay, so with the updates I made yesterday and this morning, you can see already that the load should drop appreciably the next time you run. Just search for IRR in the Cacti log and you should get something like this. If the serials are the same, the update will be skipped, and now thanks to the UPSERT, there will be much less I/O.

Image
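
For readers unfamiliar with the term, an UPSERT inserts a row if its key is new and updates the existing row otherwise, in a single statement. The sketch below illustrates the general technique using Python's built-in `sqlite3` module and SQLite's `ON CONFLICT ... DO UPDATE` syntax (available in SQLite 3.24 and later); MariaDB spells the same idea `INSERT ... ON DUPLICATE KEY UPDATE`. The table `irr_route` and its columns are made up for the example and are not Flowview's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per route, keyed on the route prefix.
conn.execute(
    "CREATE TABLE irr_route (route TEXT PRIMARY KEY, origin TEXT, source TEXT)"
)

UPSERT = """
INSERT INTO irr_route (route, origin, source) VALUES (?, ?, ?)
ON CONFLICT(route) DO UPDATE SET origin = excluded.origin, source = excluded.source
"""


def upsert_routes(rows):
    """Insert new routes and update existing ones in one transaction."""
    with conn:
        conn.executemany(UPSERT, rows)


# First load: one route.
upsert_routes([("192.0.2.0/24", "AS64500", "arin")])
# Second load: the existing route changes origin, and a new route appears.
upsert_routes([
    ("192.0.2.0/24", "AS64501", "arin"),
    ("198.51.100.0/24", "AS64502", "arin"),
])

rows = conn.execute("SELECT route, origin FROM irr_route ORDER BY route").fetchall()
```

After the second load the table holds two rows, with the first route updated in place rather than duplicated.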

@TheWitness
Member

Please report back @arno-st. Thanks!

@arno-st
Contributor Author

arno-st commented Feb 16, 2025

I just downloaded your fix; I will see tonight.
I got this error, but I'm not sure it's related:
16/02/2025 10:36:24 - DBCALL DEVEL: SQL Save on table 'parallel_database_query': 'a:11:{s:2:"id";i:0;s:6:"md5sum";s:32:"5aeee44906a265d445ae55b4d40ddec5";s:13:"md5sum_tables";s:32:"0d50f648a8536504e0a5a6213189185e";s:7:"user_id";i:16;s:12:"total_shards";i:2;s:9:"map_query";s:404:"{"sql_query":"SELECT src_addr, dst_addr, SUM(flows) AS flows, SUM(bytes) AS bytes, SUM(packets) AS packets, src_domain, dst_domain","sql_where":"WHERE ((dst_addr& INET6_ATON(?) = INET6_ATON(?) ORdst_addr & INET_ATON(?) = INET_ATON(?)))","sql_having":"","sql_order":"","sql_limit":"","sql_groupby":"GROUP BY src_addr, dst_addr","sql_params":["255.255.255.0","10.0.24.0","255.255.255.0","10.0.24.0"]}";s:9:"map_range";s:30:"(start_timeBETWEEN ? AND ?)";s:16:"map_range_params";s:45:"["2025-02-15 10:24:00","2025-02-16 10:24:00"]";s:12:"reduce_query";s:352:"{"sql_query":"SELECT INET6_NTOA(src_addr) AS src_addr, INET6_NTOA(dst_addr) AS dst_addr, SUM(flows) AS flows, SUM(bytes) AS bytes, SUM(packets) AS packets, src_domain, dst_domain","sql_where":"","sql_having":"","sql_groupby":"GROUP BY INET6_NTOA(src_addr), INET6_NTOA(dst_addr)","sql_order":"ORDER BY bytes DESC","sql_limit":"LIMIT 50","sql_params":[]}";s:7:"created";s:19:"2025-02-16 10:36:24";s:12:"time_to_live";i:1739720184;}'
16/02/2025 10:36:19 - CMDPHP SQL Backtrace: (/plugins/flowview/flow_collector.php[812]:process_fv10(), /plugins/flowview/flow_collector.php[1703]:flowview_db_execute(), /plugins/flowview/database.php[70]:db_execute(), /lib/database.php[420]:db_execute_prepared())
16/02/2025 10:36:19 - CMDPHP ERROR: A DB Exec Failed!, Error: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ' 0, 0, 'spr-itunes', '', 75, '2025-02-16 10:36:19.0', '2025-02-16 10:36:19.00...' at line 1
16/02/2025 10:36:19 - CMDPHP PHP ERROR WARNING Backtrace: (/plugins/flowview/flow_collector.php[812]:process_fv10(), /plugins/flowview/flow_collector.php[1623]:unpack(), CactiErrorHandler())
16/02/2025 10:36:19 - ERROR PHP WARNING in Plugin 'flowview': unpack(): Type N: not enough input, need 4, have 1 in file: /usr/share/cacti/plugins/flowview/flow_collector.php on line: 1623

Do you want a new issue for it?

@TheWitness
Member

Yup

@arno-st
Contributor Author

arno-st commented Feb 19, 2025

So here is my latest output:
19/02/2025 00:12:59 - FLOWVIEW IRR UPDATE: IRR Source:arin, Current Serial:4780813, Last Serial:4211078
19/02/2025 00:12:52 - FLOWVIEW IRR UPDATE: Downloading apnic.db.rtr-set.gz
19/02/2025 00:12:47 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route6.gz
19/02/2025 00:12:42 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route.gz
19/02/2025 00:12:37 - FLOWVIEW IRR UPDATE: Downloading apnic.db.route-set.gz
19/02/2025 00:12:32 - FLOWVIEW IRR UPDATE: Downloading apnic.db.role.gz
19/02/2025 00:12:27 - FLOWVIEW IRR UPDATE: Downloading apnic.db.peering-set.gz
19/02/2025 00:12:22 - FLOWVIEW IRR UPDATE: Downloading apnic.db.organisation.gz
19/02/2025 00:12:17 - FLOWVIEW IRR UPDATE: Downloading apnic.db.mntner.gz
19/02/2025 00:12:12 - FLOWVIEW IRR UPDATE: Downloading apnic.db.key-cert.gz
19/02/2025 00:12:07 - FLOWVIEW IRR UPDATE: Downloading apnic.db.irt.gz
19/02/2025 00:12:02 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inetnum.gz
19/02/2025 00:11:57 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inet6num.gz
19/02/2025 00:11:52 - FLOWVIEW IRR UPDATE: Downloading apnic.db.inet-rtr.gz
19/02/2025 00:11:47 - FLOWVIEW IRR UPDATE: Downloading apnic.db.filter-set.gz
19/02/2025 00:11:42 - FLOWVIEW IRR UPDATE: Downloading apnic.db.domain.gz
19/02/2025 00:11:37 - FLOWVIEW IRR UPDATE: Downloading apnic.db.aut-num.gz
19/02/2025 00:11:31 - FLOWVIEW IRR UPDATE: Downloading apnic.db.as-set.gz
19/02/2025 00:11:26 - FLOWVIEW IRR UPDATE: Downloading apnic.db.as-block.gz
19/02/2025 00:11:26 - FLOWVIEW IRR UPDATE: IRR Source:apnic, Current Serial:12918836, Last Serial:12771613
19/02/2025 00:07:17 - FLOWVIEW IRR UPDATE: Downloading altdb.db.gz
19/02/2025 00:07:17 - FLOWVIEW IRR UPDATE: IRR Source:altdb, Current Serial:138363, Last Serial:135847
19/02/2025 00:07:10 - FLOWVIEW IRR UPDATE: Downloading afrinic.db.gz
19/02/2025 00:07:10 - FLOWVIEW IRR UPDATE: IRR Source:afrinic, Current Serial:1524828, Last Serial:1499612

And my problem occurs during the first 2 downloads/imports.
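
One way to narrow down which step is slow is to measure the time between consecutive FLOWVIEW IRR UPDATE lines in the Cacti log. The helper below is a rough Python sketch, not part of Cacti or Flowview: `slow_steps` and its `threshold` parameter are hypothetical names, and the sample lines are copied from the output above. It assumes the log is listed newest first, as in this thread. The gap after a line is only a proxy, since it covers the download, the import, and anything else that ran before the next line was written.

```python
import re
from datetime import datetime

ENTRY = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - FLOWVIEW IRR UPDATE: (.*)"
)


def slow_steps(lines, threshold=30):
    """Return (message, seconds) for steps followed by a gap >= threshold."""
    entries = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
            entries.append((stamp, match.group(2)))
    # The log is newest first: reverse it, then rely on the stable sort so
    # that entries sharing a timestamp end up in chronological order.
    entries = sorted(reversed(entries), key=lambda entry: entry[0])
    result = []
    for (start, message), (end, _) in zip(entries, entries[1:]):
        gap = (end - start).total_seconds()
        if gap >= threshold:
            result.append((message, gap))
    return result


sample = [
    "19/02/2025 00:11:26 - FLOWVIEW IRR UPDATE: IRR Source:apnic, Current Serial:12918836, Last Serial:12771613",
    "19/02/2025 00:07:17 - FLOWVIEW IRR UPDATE: Downloading altdb.db.gz",
    "19/02/2025 00:07:17 - FLOWVIEW IRR UPDATE: IRR Source:altdb, Current Serial:138363, Last Serial:135847",
    "19/02/2025 00:07:10 - FLOWVIEW IRR UPDATE: Downloading afrinic.db.gz",
    "19/02/2025 00:07:10 - FLOWVIEW IRR UPDATE: IRR Source:afrinic, Current Serial:1524828, Last Serial:1499612",
]

result = slow_steps(sample)
```

On this excerpt the only gap above the threshold is the 249 seconds that follow the "Downloading altdb.db.gz" line.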

@TheWitness
Member

It looks like it's downloading some things for the first time. Any warnings in the log about being unable to reach a site?

@TheWitness
Member

When everything is running correctly, you will see every day "Current Serial and Last Serial", and when they are different, it will re-update the database for that IRR Provider.
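
To check this from the log rather than by eye, the serial lines can be parsed and compared. The following is a small Python sketch under the assumption that the lines keep the "IRR Source:..., Current Serial:..., Last Serial:..." format shown in this thread; `sources_needing_update` is a hypothetical helper name, and the `example` source with equal serials is invented to show the skipped case.

```python
import re

SERIAL = re.compile(r"IRR Source:(\w+), Current Serial:(\d+), Last Serial:(\d+)")


def sources_needing_update(lines):
    """Map each source whose serials differ to its (last, current) pair."""
    changed = {}
    for line in lines:
        match = SERIAL.search(line)
        if match:
            source = match.group(1)
            current, last = int(match.group(2)), int(match.group(3))
            if current != last:
                changed[source] = (last, current)
    return changed


log_lines = [
    "19/02/2025 00:07:17 - FLOWVIEW IRR UPDATE: IRR Source:altdb, Current Serial:138363, Last Serial:135847",
    "19/02/2025 00:07:10 - FLOWVIEW IRR UPDATE: IRR Source:afrinic, Current Serial:1524828, Last Serial:1499612",
    # Invented line: equal serials, so this source would be skipped.
    "19/02/2025 00:07:05 - FLOWVIEW IRR UPDATE: IRR Source:example, Current Serial:100, Last Serial:100",
]

changed = sources_needing_update(log_lines)
```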
