Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gluon-statistics-mcast: adding initial package #2367

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

T-X
Copy link
Contributor

@T-X T-X commented Jan 1, 2022

This package collects disected statistics on multicast
and management traffic and provides them via respondd in
"statistics-extended". Such information can be helpful
for further development and comparison of mesh routing
protocols and troubleshooting network issues.

The statistics gathered disect into batman-adv
packets as well. Statistics are collected both on client
bridge ports and br-client as well as mesh interfaces.

This introduces a performance penalty but tries to
keep the impact on unicast traffic small.

Note that respondd replies are rather large
(several UDP datagrams, even when LZMA compressed).
Therefore you might want to use slower intervals for
querying respondd "statistics-extended".

Signed-off-by: Linus Lüssing <[email protected]>


This PR requires / depends on the libpcap batman-adv support patch provided in a separate PR here: #2366

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch 2 times, most recently from 5fba865 to 3e07c5f Compare January 5, 2022 20:21
@T-X
Copy link
Contributor Author

T-X commented Jan 5, 2022

Changelog v2:

  • adding some more double quotes to setup and teardown shell scripts to make the linter happy

Note: The linter still complains about the following:

package/gluon-statistics-mcast/files/lib/gluon/core/mesh/teardown.d/40-bpfcountd:3:7: warning: Possible misspelling: IFNAME may not be assigned, but ifname is. [SC2153]

Which is bogus, that's exactly what this line there is supposed to check: https://github.com/freifunk-gluon/gluon/pull/2367/files#diff-808af07a6378d71a5b65b5614fe138b14d17d21f8d4c068ecc31e97813b97592R3. And the linter seems to get confused about these two distinct variables, the global "IFNAME" vs. the function local "ifname".

@neocturne
Copy link
Member

@T-X You can suppress the warning using # shellcheck disable=SC2153, or choose more distinct variable names.

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 3e07c5f to 8e22b67 Compare January 9, 2022 19:10
@T-X
Copy link
Contributor Author

T-X commented Jan 9, 2022

Changelog v3:

  • renamed a shell script variable from "ifname" to "uci_ifname" to work around the following, wrong linter warning:

package/gluon-statistics-mcast/files/lib/gluon/core/mesh/teardown.d/40-bpfcountd:3:7: warning: Possible misspelling: IFNAME may not be assigned, but ifname is. [SC2153]

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 8e22b67 to 06d95f2 Compare January 18, 2022 13:04
@T-X
Copy link
Contributor Author

T-X commented Jan 18, 2022

Changelog v4:

  • Added more protocols to track for their multicast packets:
  • Fixed gluon-bpfcountd cleanup for a wifi AP interfaces
    • before tried to find bridge ports of br-client via /etc/config/network/.ifname, which would leave out the wifi AP interface(s)) -> fixing this by checking current bridge ports and storing it for later cleanup on ifdown (on ifdown, the bridge is already gone in hotplug script, so we need to store that information in advance)
  • Added mmfd0 interface tracking support
  • Added ip(6) protocols MANET and OSPF to allowed unicast pre-filter (PREFILTER_IP{4,6}_UC)

Tested in BABEL VMs that babel/l3roamd/mmfd multicast packets are tracked fine (note that for instance mmfd seems to forward via individual unicast transmissions? - so for overall protocol overhead evaluation the multicast packet counts might not be complete for BABEL networks).

@neocturne
Copy link
Member

neocturne commented Jan 24, 2022

The bpfcountd package could be added to https://github.com/freifunk-gluon/packages instead. It might also make sense to submit it for https://github.com/openwrt/packages, it looks quite useful. Please also add a link to the tool's README to the Makefile's description field.

set bpfcountd.${SECTION_OUT}.filterfile='/lib/gluon/bpfcountd/mesh.filters'
set bpfcountd.${SECTION_OUT}.disabled='0'

commit bpfcountd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing files to flash from setup.d scripts should be avoided. Better options would be to either run bpfcountd directly without procd, or to create a separate initscript that uses configuration stored in /var (with or without UCI).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually stored in /var, more precisely in /var/run/bpfcountd/gluon-config :-). See UCI_CONFIG_DIR.

Copy link
Member

@neocturne neocturne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a very superficial review - I'd like to have a closer look soon.

package/gluon-statistics-mcast/src/respondd.c Outdated Show resolved Hide resolved
package/gluon-statistics-mcast/src/respondd.c Outdated Show resolved Hide resolved
package/gluon-statistics-mcast/src/respondd.c Outdated Show resolved Hide resolved
[CAP_OUT] = "out",
};

static int check_socket_suffix(char *socketname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use const char *. There are lots more arguments that should be const in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgh, I just realized how evil strchr() is. Why the hell can it return a "char *" from a "const char *"??

For socketname it seems okay to change it to const (as long as I'm not missing something). But with parse_line() I probably shouldn't change "char *line" to "const char *line", as there is a *start = '\0'; in there, which modifies *line.

package/gluon-statistics-mcast/src/respondd.c Outdated Show resolved Hide resolved
package/gluon-statistics-mcast/src/respondd.c Outdated Show resolved Hide resolved
@neocturne
Copy link
Member

Note that respondd replies are rather large
(several UDP datagrams, even when LZMA compressed).
Therefore you might want to use slower intervals for
querying respondd "statistics-extended".

Is respondd really the right tool to provide this data? UDP is lossy, even more so when many nodes send long, fragmented packets more or less at the same time... How long does the data generation take? I'm wondering if the data might be provided through uhttpd instead...

@T-X
Copy link
Contributor Author

T-X commented Mar 10, 2022

Is respondd really the right tool to provide this data? UDP is lossy, even more so when many nodes send long, fragmented packets more or less at the same time... How long does the data generation take? I'm wondering if the data might be provided through uhttpd instead...

Yes, had been wondering about this, too, relying on fragments is not really nice... At least we have support for delayed responses since freifunk-gluon/packages@e7fbcf7.

For an HTTP / uhttpd provider I was wondering whether we should put more thought on a unified format before adding that. As it might be interesting to add other detailed statistics later for all kinds of data. Or maybe we might want to move the existing statistics provided in respondd to HTTP at some point, too?

An alternative I was thinking about:

Adding TCP support to respondd. For all the respondd providers we have so far a respondd querier should of course keep using UDP. But then for more verbose providers, like statistics-extended, a respondd querier could then just cherry-pick it via a unicast+TCP query from the few, specific nodes which offer this provider. We could add a list of available respondd providers to the "nodeinfo" provider, so that a respondd statistics-extended querier knows in advance which few nodes it can query for it via unicast/TCP.

This should basically be a similar packet overhead as HTTP in the end?

@T-X
Copy link
Contributor Author

T-X commented Mar 10, 2022

Or another, easy way would be to add a small, generic wrapper to http://.../cgi-bin/respondd. Which could receive a respondd query encoded as a HTTP POST. And cgi-bin-respondd could proxy it as UDP to "[::1]:1001", which could take advantage of the higher, 65536 bytes MTU on the "lo" interface. And the UDP reply from [::1]:1001 would then be sent back via HTTP in one or more TCP segments.

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 06d95f2 to e06c926 Compare March 17, 2022 00:51
@T-X
Copy link
Contributor Author

T-X commented Mar 17, 2022

Changelog v5:

  • rebased to current master (no changes)
  • adding Reticulum default multicast peer discovery port 29716 (https://github.com/markqvist/Reticulum, https://github.com/markqvist/Reticulum/blob/master/RNS/Interfaces/AutoInterface.py#L12)
  • removing "bpfcountd." prefix from socket files for simplified parsing
  • changing __CAP_COUNT to CAP_MAX, adding CAP_UNDEF for error return code
  • changing "unsigned long long __num" to "unsigned long long value"
  • removing some pointers for int64_t bytes/packets to function passing
  • adding some "const" for socketname to function passing
  • adding/updating some code comments
  • instead of skipping "." and "..", skip all files/directories starting with "."
  • moved "check_socket_suffix()" check
  • simplified parse_socket_dir() by using regex.h
  • simplified parse_socket_iface() by removing iface_{start,end} pointers

Still ToDo:

  • more reliable transfer way than UDP via IP fragments
  • also remeasured statistics-extended message sizes on an actual 2.4+5ghz node, which has more interfaces than my previous VM tests:
compression algorithm size
uncompressed 416 kB
deflate compressed 21 kB

With xz instead of deflate the following results would in theory be possible:

compression algorithm size
xz (-6 / default) compressed 5.7 kB
xz -6e compressed 5.2 kB
xz -9 compressed 5.7 kB
xz -9e compressed 5.2 kB
xz -3 compressed 6.2 kB
xz -3e compressed 5.2 kB
xz -0 compressed 9.2 kB
xz -0e compressed 5.2 kB

(results will vary with different multicast traffic patterns in the network)

@T-X
Copy link
Contributor Author

T-X commented Mar 22, 2022

Changelog v6:

  • adding commit "gluon-status-page: add generic respondd cgi-bin"

This allows to query bulky respondd providers like the "statistics-extended" introduced in this PR over HTTP instead of UDP, to avoid retrieving the UDP data in multiple IP fragments.

Together with freifunk-gluon/packages#256 this would allow a respondd querier to retrieve the multicast statistics as follows:

Every n minutes:
  send UDP respondd query "GET nodeinfo respondd-providers" -> [ff05::2:1001]:1001

  for each reply:
    if reply:respondd-providers->statistics-extended:
      HTTP GET "http://<nodeip>/cgi-bin/respondd?statistics-extended"

@T-X T-X mentioned this pull request Mar 22, 2022
4 tasks
@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from d01840d to 0559868 Compare March 24, 2022 00:01
@T-X
Copy link
Contributor Author

T-X commented Mar 24, 2022

Changelog v7:

  • rebased to master (no changes)
  • fixed a typo in the label for RETICULUM (vs. RETICULIM) protocol
  • added OLSRv1 UDP port (698) for counting (OLSRv2 already present under the MANET labels)

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 0559868 to 0b61595 Compare April 4, 2022 14:24
@T-X
Copy link
Contributor Author

T-X commented Apr 4, 2022

Changelog v8:

  • use colon separator for BLA, too (BLA-CLAIM, BLA-UNCLAIM, etc. to BLA:CLAIM, BLA:UNCLAIM, etc.)

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 0b61595 to 0f3ee91 Compare June 26, 2022 19:48
@T-X
Copy link
Contributor Author

T-X commented Jun 26, 2022

Changelog v9:

  • rebased to current master (no changes/conflicts)
  • bpfcountd: updating bpfcountd to add support for libpcap buffer size configuration option, adding buffersize option to init script
  • gluon-statistics-mcast: reducing libpcap buffer size from 2MB (default) to 128KB -> reduces overall memory usage of gluon-statistics-mcast to ~13MB, now all bpfcountd instances start fine on a 64MB device like the the Gl.inet GL-USB150 without causing an instant out-of-memory

@@ -0,0 +1,34 @@
include $(TOPDIR)/rules.mk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this package to OpenWrt packages (I See that LICENCE identifies is missing).

bpfcountd was created to obtain packet statistics in larger networks
without stressing the cpu resources. bpfcountd will count the amount
of packages and bytes over time (for each defined rule). The rules
are defined using the tcpdump filter syntax (bpf). The collected
data is provided on a unix socket in plaintext.

Signed-off-by: Linus Lüssing <[email protected]>
@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 0f3ee91 to 003b8d9 Compare December 7, 2022 12:53
@T-X
Copy link
Contributor Author

T-X commented Dec 7, 2022

Changelog v10:

  • rebased to current master (no changes/conflicts)
  • updated bpfcountd package to the version that was merged upstream (bpfcountd: add initial package openwrt/packages#19975 + bpfcountd: remove incomplete/broken namespace feature openwrt/packages#20027):
    • added licenses (MIT)
    • added PKG_MIRROR_HASH
    • removed PKG_BUILD_DEPENDS
    • minor whitespace changes
    • added service_triggers() to init file
    • removed instance/namespace variable in init script -> fixes start/stop with no instance specified
    • removed some debug echos from init script
  • updated gluon-statistics-mcast package's setup and teardown scripts to the new bpfcountd init script version
    • use named instead of anonymous UCI section names now, with a "gluon_" prefix

@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 003b8d9 to 04b13c6 Compare December 7, 2022 13:59
@T-X
Copy link
Contributor Author

T-X commented Dec 7, 2022

Changelog v11:

  • fix gluon-statistics-mcast socket symlink in setup script (regression introduced in v10)
  • explicitly use /bin/ash in gluon-statistics-mcast setup and teardown scripts to make linter happy about string replacements

This ignores the following warning:

  warning: In POSIX sh, string replacement is undefined. [SC3060]

As string replacements are supported by ash and we are only going to use
ash.

This allows us to use something short like ${QUERY_STRING//+/ } to
replace all '+' to a whitespace in an URL or to make an ${IFNAME//-/_}
UCI compatible in the future, for example. Instead of needing to use
something long like "echo ${IFNAME} | tr '-' '_'.

Signed-off-by: Linus Lüssing <[email protected]>
This package collects disected statistics on multicast
and management traffic and provides them via respondd in
"statistics-extended". Such information can be helpful
for further development and comparison of mesh routing
protocols and troubleshooting network issues.

The statistics gathered disect into batman-adv
packets as well. Statistics are collected both on client
bridge ports and br-client as well as mesh interfaces.

This introduces a performance penalty but tries to
keep the impact on unicast traffic small.

Note that respondd replies are rather large
(several UDP datagrams, even when LZMA compressed).
Therefore you might want to use slower intervals for
querying respondd "statistics-extended".

Signed-off-by: Linus Lüssing <[email protected]>
Adding a cgi-bin script which allows to query a node's respondd over HTTP.
This is especially useful when the response is expected to be large and
a UDP response would lead to an unreliable transfer of multiple IP
fragments, like for the upcoming "statistics-extended" respondd
provider. HTTP/TCP is then the more reliable approach.

Query examples with HTTP GET:

  $ curl --compressed "http://<IP6ADDR>/cgi-bin/respondd"
  -> defaults to nodeinfo

  $ curl --compressed "http://<IP6ADDR>/cgi-bin/respondd?nodeinfo+statistics"

Query examples with HTTP POST:

  $ curl --data "" --compressed "http://<IP6ADDR>/cgi-bin/respondd"
  -> defaults to nodeinfo

  $ curl --data "nodeinfo statistics" --compressed "http://<IP6ADDR>/cgi-bin/respondd"

Signed-off-by: Linus Lüssing <[email protected]>
@T-X T-X force-pushed the pr-gluon-statistics-mcast branch from 04b13c6 to c6693b7 Compare December 9, 2022 17:31
@T-X
Copy link
Contributor Author

T-X commented Dec 9, 2022

Changelog v12:

  • workaround for linter did not work, so reverting back to #!/bin/sh and instead adding a new commit with an exception to silence linter

@github-actions github-actions bot added the 3. topic: build This is about the build system label Dec 9, 2022
@AiyionPrime

This comment was marked as duplicate.

Copy link
Member

@AiyionPrime AiyionPrime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a complete review, but only the results of me checking for affected PRs in #2765.

Comment on lines +2 to +7
"client"|\
"mmfd")
;;
*)
exit 0
;;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"client"|\
"mmfd")
;;
*)
exit 0
;;
"client"|\
"mmfd")
;;
*)
exit 0
;;

Comment on lines +58 to +65
"ifup")
bpfcountd_start
;;
"ifdown")
bpfcountd_stop
;;
*)
;;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"ifup")
bpfcountd_start
;;
"ifdown")
bpfcountd_stop
;;
*)
;;
"ifup")
bpfcountd_start
;;
"ifdown")
bpfcountd_stop
;;
*)
;;

Comment on lines +10 to +17
GET)
;;
POST)
QUERY_STRING="$(cat)"
;;
*)
badrequest
;;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GET)
;;
POST)
QUERY_STRING="$(cat)"
;;
*)
badrequest
;;
GET)
;;
POST)
QUERY_STRING="$(cat)"
;;
*)
badrequest
;;

Comment on lines +1 to +24
/*
Copyright (c) 2021, Linus Lüssing <[email protected]>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*
Copyright (c) 2021, Linus Lüssing <linus.luessing@c0d3.blue>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/* SPDX-FileCopyrightText: 2021, Linus Lüssing <[email protected]> */
/* SPDX-License-Identifier: BSD-2-Clause */

}

static int parse_socket_iface(const char *socketname, enum capdir dir,
char *iface)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *iface)
char *iface)

}

static int add_num_to_json(struct json_object *dir_obj, const char *key,
int64_t num)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int64_t num)
int64_t num)

}

static int add_line_to_json(struct json_object *iface_obj, enum capdir dir,
const char *rule, int64_t bytes, int64_t packets)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const char *rule, int64_t bytes, int64_t packets)
const char *rule, int64_t bytes, int64_t packets)

* And adds it to the provided json object.
*/
static int parse_line(struct json_object *obj, enum capdir dir,
char *line)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *line)
char *line)

strncpy(addr.sun_path, socketname, sizeof(addr.sun_path) - 1);

ret = connect(sd, (const struct sockaddr *)&addr,
sizeof(addr));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sizeof(addr));
sizeof(addr));


/* skip files not ending in ".sock" */
if (check_socket_suffix(dp->d_name) < 0)
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
continue;
continue;

@AiyionPrime AiyionPrime added the 2. status: waiting-on-author Waiting on some action from the author label Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2. status: waiting-on-author Waiting on some action from the author 3. topic: build This is about the build system 3. topic: package Topic: Gluon Packages 3. topic: respondd 3. topic: status-page
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants