Hostname resolution of API client is not updated timely #2872

Closed
moritzj29 opened this issue Mar 6, 2024 · 4 comments

@moritzj29

moritzj29 commented Mar 6, 2024

What happened?

My issue is with cs-firewall-bouncer-iptables, but I guess it is a general API client topic.
I have a setup where the CrowdSec LAPI runs on a different host than the bouncer, so the bouncer has to connect to the LAPI via a URL (not just 127.0.0.1).

I had some hostname resolution issues on my machine. As a result, the CrowdSec LAPI could no longer be reached, as the 404 error code shows:

time="06-03-2024 08:59:54" level=error msg="http code 404, invalid body: invalid character 'p' after top-level value"

But even after I fixed the DNS issues on the machine and name resolution was working again (tested with nslookup), the bouncer still reported the same 404 error.

When I restart crowdsec-firewall-bouncer, everything works immediately.

I suppose the API client keeps using the wrong IP and does not try to resolve the hostname again (at least not within 1 h).
The hostname is only resolved again after a restart.

What did you expect to happen?

The CrowdSec API client should resolve the hostname again for every request (at least when a request fails) and use the most recently resolved IP.

How can we reproduce it (as minimally and precisely as possible)?

Actually, I can only reproduce it the other way around, but that still demonstrates that changes to the DNS config are not picked up in a timely manner:

  1. Have your bouncer running.
  2. Change your DNS config to point your CrowdSec URL to a wrong IP.
  3. The CrowdSec API client will still connect to the old host (so it keeps working, although it should fail).

Anything else we need to know?

No response

Crowdsec version

# LAPI on different host
$ cscli version
2024/03/06 10:11:26 version: v1.6.0-4192af30
2024/03/06 10:11:26 Codename: alphaga
2024/03/06 10:11:26 BuildDate: 2024-01-31_12:35:08
2024/03/06 10:11:26 GoVersion: 1.21.6
2024/03/06 10:11:26 Platform: docker
2024/03/06 10:11:26 libre2: C++
2024/03/06 10:11:26 Constraint_parser: >= 1.0, <= 3.0
2024/03/06 10:11:26 Constraint_scenario: >= 1.0, <= 3.0
2024/03/06 10:11:26 Constraint_api: v1
2024/03/06 10:11:26 Constraint_acquis: >= 1.0, < 2.0

$ crowdsec-firewall-bouncer -version
version: v0.0.28-debian-pragmatic-af6e7e25822c2b1a02168b99ebbf8458bc6728e5
BuildDate: 2023-10-02_11:36:24
GoVersion: 1.20.1

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux node1 5.15.0-97-generic #107-Ubuntu SMP Wed Feb 7 13:26:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Enabled collections and parsers

$ cscli hub list -o raw
# paste output here

Acquisition config

# On Linux:
$ cat /etc/crowdsec/acquis.yaml /etc/crowdsec/acquis.d/*
# paste output here

# On Windows:
C:\> Get-Content C:\ProgramData\CrowdSec\config\acquis.yaml
# paste output here

Config show

$ cscli config show
# paste output here

Prometheus metrics

$ cscli metrics
# paste output here

Related custom configs versions (if applicable) : notification plugins, custom scenarios, parsers etc.

$ cat /etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml
mode: iptables
pid_dir: /var/run/
update_frequency: 10s
daemonize: true
log_mode: file
log_dir: /var/log/
log_level: info
log_compression: true
log_max_size: 100 # = 100 MB
log_max_backups: 3
log_max_age: 30
api_url: https://crowdsec.example.com:443/
api_key: <API_KEY>
insecure_skip_verify: false
disable_ipv6: false
deny_action: DROP
deny_log: false # set to true to enable log in kern.log
supported_decisions_types:
  - ban
#to change log prefix
#deny_log_prefix: "crowdsec: "
#to change the blacklists name
blacklists_ipv4: crowdsec-blacklists
blacklists_ipv6: crowdsec6-blacklists
#type of ipset to use
ipset_type: nethash
#if present, insert rule in those chains
iptables_chains:
  - INPUT
  - FORWARD
#  - DOCKER-USER

## nftables
nftables:
  ipv4:
    enabled: true
    set-only: false
    table: crowdsec
    chain: crowdsec-chain
    priority: -10
  ipv6:
    enabled: true
    set-only: false
    table: crowdsec6
    chain: crowdsec6-chain
    priority: -10

nftables_hooks:
  - input
  - forward

# packet filter
pf:
  # an empty string disables the anchor
  anchor_name: ""

prometheus:
  enabled: true
  listen_addr: 0.0.0.0
  listen_port: 60601
moritzj29 added the kind/bug (Something isn't working) label Mar 6, 2024

github-actions bot commented Mar 6, 2024

@moritzj29: Thanks for opening an issue, it is currently awaiting triage.

In the meantime, you can:

  1. Check Crowdsec Documentation to see if your issue can be self resolved.
  2. You can also join our Discord.
  3. Check Releases to make sure your agent is on the latest version.

@LaurenceJJones
Contributor

LaurenceJJones commented Mar 6, 2024


From my very limited knowledge of the subject: by default, Go uses the local system DNS resolver to find the IP address (and inherits any caching done by that resolver). However, if the names are stored in the /etc/hosts file, then for performance reasons Go only reads this file on the first HTTP request and caches it for the rest of the process life cycle.

If it is the latter, I would be wary of changing this behaviour, as reading a file on every request could, depending on its size, cause extreme CPU/RAM overhead for very little gain.

@moritzj29
Author

Thanks for your reply! Actually, I did not set any entry in /etc/hosts but relied on the system's DNS resolver.

Clearing the system's DNS cache also had no effect.

To me it looks like the hostname is not resolved for every HTTP request; instead, the request is retried against the same IP over and over again (which makes sense in some cases, but maybe not for a 404).

For now I have added a static entry in /etc/hosts to be independent of any DNS issues. I just have to remember to update it manually and restart if anything changes.

@LaurenceJJones
Contributor

Further investigation shows that there is no caching on the Go side unless the developers have implemented it, and we use the default http handler. The only possible cause I can think of is a retry loop: since the resolution has already been done at that point, this may cause the issue. However, I tried to replicate it over the weekend, to no avail.

Classing this as not planned until a reproducible scenario can be achieved.

LaurenceJJones closed this as not planned Mar 11, 2024