Enhancements - Performance with one thousand of hosts #12

ellis2323 · 2016-11-21T18:03:33Z

Hello,

I'm trying to use Snap and your plugin to collect metrics into InfluxDB. I have succeeded in getting the few metrics I need from one host, but now I need to collect 500 metrics on a thousand nodes.
My current approach is to create one task per host. I have boosted the limits in snapd.conf to allow several hundred plugins to load... After 150 hosts, Snap crashed. I suspect this is not the right way to do it.
I suspect the right approach is to modify this plugin to accept an array of hosts.

Best regards,

ellis2323 · 2016-11-21T18:12:30Z

In my case, I have a few groups of hosts that share the same metrics. My idea is to create one task per group. So, for my use case, I would implement something like an array for the "snmp_agent_address" key.
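
The one-task-per-group idea could be sketched roughly as follows, in Python for illustration only. It assumes a hypothetical inventory that maps each host to its metric names, and it emits one task-like dictionary per group with a list under "snmp_agent_address". The list-valued key is the proposal made in this comment, not something the plugin is known to accept, and the metric names and dictionary shape are made up for the example.

```python
from collections import defaultdict


def group_hosts_by_metrics(inventory):
    """Group hosts that share exactly the same set of metrics."""
    groups = defaultdict(list)
    for host, metrics in inventory.items():
        # frozenset makes the metric set hashable and order-independent.
        groups[frozenset(metrics)].append(host)
    return groups


def tasks_for_groups(groups):
    """Build one task-like dict per group (hypothetical shape)."""
    tasks = []
    for metrics, hosts in groups.items():
        tasks.append({
            "metrics": sorted(metrics),
            # Proposed in this thread: a list of addresses, not one string.
            "snmp_agent_address": sorted(hosts),
        })
    return tasks


# Hypothetical inventory: three hosts, two distinct metric sets.
inventory = {
    "10.0.0.1:161": ["ifInOctets", "ifOutOctets"],
    "10.0.0.2:161": ["ifOutOctets", "ifInOctets"],
    "10.0.1.1:161": ["sysUpTime"],
}
tasks = tasks_for_groups(group_hosts_by_metrics(inventory))
```

With this grouping, the number of tasks scales with the number of distinct metric sets rather than with the number of hosts.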

kindermoumoute · 2016-11-29T18:25:12Z

Hi @ellis2323,

> now I need to collect 500 metrics on a thousand nodes.

Can you give more detail on this part? Do you actually want 500 metrics * 1000 nodes?

> My current approach is to create one task per host.

Are you using Tribe?

> I have boosted the limits in snapd.conf to allow several hundred plugins to load...

Do you load hundreds of plugins on one node?

ellis2323 · 2016-11-29T19:19:06Z

Hello,

I'm working for an operator with thousands of routers, switches, and optical devices. There are no real computers, and I can't install anything on them... I'm currently testing several solutions such as LibreNMS and Shinken. So when I discovered Snap Telemetry, I was hoping to use it to collect many SNMP states from all the equipment.
Also, Tribe is not the solution. I tried to load 300 instances of the SNMP + InfluxDB plugins, but my VM used too much memory and crashed.

My current solution is to use Telegraf, which works well with many nodes.

otsuarez · 2016-12-09T20:24:48Z

Hi,
I'm having the same issue. Is there a roadmap for a solution to this scenario?
Best,

ellis2323 · 2016-12-10T09:45:09Z

I've read the code, and it is possible to solve this with a few lines of code. The main difficulty is error handling. Today, when an SNMP target doesn't respond (after the timeout and retries), the task stops. In my scenario we don't want this behaviour, but I'm not sure whether that fits the philosophy of this tool.
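
The behaviour wanted here, where one unresponsive target does not stop collection from the others, could look roughly like this generic Python sketch. It is not the plugin's code: `collect_all`, `fake_poll`, and the addresses are hypothetical stand-ins, and a real poller would issue SNMP requests with its own timeout and retry settings.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_all(targets, poll, max_workers=16):
    """Poll every target; record failures instead of aborting the whole run."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {target: pool.submit(poll, target) for target in targets}
        for target, future in futures.items():
            try:
                results[target] = future.result()
            except Exception as exc:
                # A failing target is noted and skipped; the others continue.
                errors[target] = exc
    return results, errors


def fake_poll(target):
    """Hypothetical stand-in for an SNMP query; one target never answers."""
    if target == "10.0.0.2":
        raise TimeoutError("no response")
    return {"sysUpTime": 12345}


results, errors = collect_all(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_poll)
```

Returning the errors alongside the results lets the caller decide whether a partial collection counts as a success or a failure.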

nanliu · 2016-12-12T17:47:20Z

@ellis2323, you can disable this behavior by setting max-failures: -1 in the task configuration, per:
https://github.com/intelsdi-x/snap/blob/master/docs/TASKS.md#max-failures

What are the other issues you've seen?
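
As a rough illustration of where that setting might sit, the Python snippet below builds a task manifest as a dictionary and prints it as JSON. Only the "max-failures": -1 entry comes from this thread. The overall layout is an assumption based on the linked TASKS.md, and the metric namespace, config path, interval, and address are placeholders rather than values confirmed for this plugin.

```python
import json


def build_task(agent_address, interval="60s"):
    """Return a task manifest dict; only "max-failures" comes from the thread."""
    return {
        "version": 1,
        "schedule": {"type": "simple", "interval": interval},
        # -1 as suggested above, so repeated failures do not stop the task.
        "max-failures": -1,
        "workflow": {
            "collect": {
                # Hypothetical metric namespace, for illustration only.
                "metrics": {"/intel/snmp/example": {}},
                "config": {
                    # Assumed config path; check the plugin's own examples.
                    "/intel/snmp": {"snmp_agent_address": agent_address},
                },
            }
        },
    }


manifest = build_task("10.0.0.1:161")
print(json.dumps(manifest, indent=2))
```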

kindermoumoute added the question label Nov 29, 2016
