can nacos replicas be 1 in cluster mode? #5662

Closed
zcz3313 opened this issue May 10, 2021 · 14 comments

Comments

@zcz3313

zcz3313 commented May 10, 2021

Describe the bug
The Nacos naming module does not work when replicas = 1 in cluster mode.

Expected behavior
The naming module works correctly.

Actual behavior
Requests to the naming module API return HTTP 503.

How to Reproduce
Details are in the following comment:
nacos-group/nacos-k8s#214 (comment)

Desktop (please complete the following information):

  • OS: [CentOS]
  • Version [nacos-server 1.4.2]
  • Module [naming]
  • SDK [spring-cloud-alibaba-nacos]

Additional context
Running in a k8s environment.

@KomachiSion
Collaborator

It should be OK. Try starting with 1 replica directly, rather than starting with 3 and scaling down to 1.

@zcz3313
Author

zcz3313 commented May 11, 2021

> It should be OK. Try starting with 1 replica directly, rather than starting with 3 and scaling down to 1.

I'm sorry, it still does not work after another try.
I have tested this case in both Docker and k8s environments, and the results are the same.

My docker-compose file:
[image]
Env file:
[image]

@zcz3313
Author

zcz3313 commented May 11, 2021

> It should be OK. Try starting with 1 replica directly, rather than starting with 3 and scaling down to 1.

My k8s case is exactly as you described: it was started with 1 replica.

@zcz3313
Author

zcz3313 commented May 11, 2021

> It should be OK. Try starting with 1 replica directly, rather than starting with 3 and scaling down to 1.

My k8s case is exactly as you described: it was started with 1 replica.

The naming module API returns 503 in both the Docker and the k8s case.
[image]

The Nacos client in my service also throws an error:
[image]
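
Since the screenshots were not preserved, the 503 check can be sketched as follows. This is a minimal sketch, assuming the Nacos 1.x naming Open API path `/nacos/v1/ns/instance/list` and the default port 8848; the host and the service name `example-service` are hypothetical placeholders, not values from the thread.

```shell
#!/usr/bin/env bash

# Build the instance-list URL of the Nacos 1.x naming Open API.
# Host, port and service name are illustrative assumptions.
naming_list_url() {
  local server="${1:-localhost:8848}"
  local service="${2:-example-service}"
  echo "http://${server}/nacos/v1/ns/instance/list?serviceName=${service}"
}

# Print only the HTTP status code returned by the naming API.
# curl prints 000 when it cannot connect at all.
naming_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(naming_list_url "$@")"
}
```

Calling `naming_status localhost:8848 example-service` against the single-replica setup described in this thread would be expected to show the reported 503.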

@realJackSun
Collaborator

@zcz3313
I think two factors may cause this problem:
(1) The data cache was not cleaned.
(2) The modification made in issue #5350.

Could you deploy your Nacos cluster with the following procedure and help us find the exact reason?

1. Stop all Nacos nodes in the cluster.
2. Clean the cached data under ~/nacos/data.
3. Restart Nacos.
4. Retry the request and see whether the problem is solved.

@zcz3313
Author

zcz3313 commented May 12, 2021

> @zcz3313
> I think two factors may cause this problem:
> (1) The data cache was not cleaned.
> (2) The modification made in issue #5350.
>
> Could you deploy your Nacos cluster with the following procedure and help us find the exact reason?
>
> 1. Stop all Nacos nodes in the cluster.
> 2. Clean the cached data under ~/nacos/data.
> 3. Restart Nacos.
> 4. Retry the request and see whether the problem is solved.

My docker-compose file does not mount ~/nacos/data on a host path.
In my k8s case, I also clean the NFS volumes before each test.
So I think there is nothing cached in my case.
Have you tried my docker-compose file?
I would like to know whether it works for you.
By the way, does the Nacos client version have to match the server version?
My client version is 1.2.1, while the server version is 1.4.2.

@realJackSun
Collaborator

@zcz3313 There should be a directory named 'data' under your ${nacos_home} directory. I mean delete that directory and retry.

@zcz3313
Author

zcz3313 commented May 13, 2021

> @zcz3313 There should be a directory named 'data' under your ${nacos_home} directory. I mean delete that directory and retry.

I found that directory on my host machine and recreated the Nacos container after deleting it, but it is still not working.
I wonder whether this behavior has anything to do with this directory?
I use Docker and do not mount any volume into the container.
The cache on my host machine should have nothing to do with this.
Or do you mean I should delete the cache inside the container and restart it?

@realJackSun
Collaborator

realJackSun commented May 13, 2021

@zcz3313 Yes, delete the cache inside the Docker container and restart it. If it still does not work, I will try to find out whether the problem is caused by another factor.

> I found that directory on my host machine and recreated the Nacos container after deleting it, but it is still not working.
> I wonder whether this behavior has anything to do with this directory?

The directory on the host machine has nothing to do with the problem, but the cache inside the Docker container may cause it.

@zcz3313
Author

zcz3313 commented May 13, 2021

> @zcz3313 Yes, delete the cache inside the Docker container and restart it. If it still does not work, I will try to find out whether the problem is caused by another factor.
>
> > I found that directory on my host machine and recreated the Nacos container after deleting it, but it is still not working.
> > I wonder whether this behavior has anything to do with this directory?
>
> The directory on the host machine has nothing to do with the problem, but the cache inside the Docker container may cause it.

Still not working. Here are my steps:
1. Run 'docker-compose up -d' on the host to create the Nacos container.
2. Run 'docker exec -it nacos bash' on the host to attach to the container.
3. Run 'rm -rf ~/nacos/data' inside the container, then 'exit' to leave it.
4. Run 'docker-compose stop' on the host to stop the container.
5. Run 'docker-compose start' on the host to start the container.

By the way, I cannot delete directories inside the container while it is stopped.
That is why steps 3 and 4 are needed.
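
The five steps above can be collected into one script. This is only a sketch under assumptions: the container name `nacos` and the path `~/nacos/data` are taken from the steps as written, while the `run` wrapper and the `DRY_RUN` variable are hypothetical helpers added here so the commands can be previewed without touching Docker.

```shell
#!/usr/bin/env bash

# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Repeat the steps from the comment: create the container, delete the
# data directory inside it, then stop and start it again.
reset_nacos_data() {
  local container="${1:-nacos}"
  run docker-compose up -d
  run docker exec "$container" bash -c 'rm -rf ~/nacos/data'
  run docker-compose stop
  run docker-compose start
}
```

Running `reset_nacos_data nacos` prints the four commands; `DRY_RUN=0 reset_nacos_data nacos` executes them. As the thread shows, clearing this directory did not resolve the 503 in this case.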

@zcz3313
Author

zcz3313 commented May 20, 2021

> @zcz3313 Yes, delete the cache inside the Docker container and restart it. If it still does not work, I will try to find out whether the problem is caused by another factor.
>
> > I found that directory on my host machine and recreated the Nacos container after deleting it, but it is still not working.
> > I wonder whether this behavior has anything to do with this directory?
>
> The directory on the host machine has nothing to do with the problem, but the cache inside the Docker container may cause it.

Still not working. Here are my steps:
1. Run 'docker-compose up -d' on the host to create the Nacos container.
2. Run 'docker exec -it nacos bash' on the host to attach to the container.
3. Run 'rm -rf ~/nacos/data' inside the container, then 'exit' to leave it.
4. Run 'docker-compose stop' on the host to stop the container.
5. Run 'docker-compose start' on the host to start the container.

By the way, I cannot delete directories inside the container while it is stopped.
That is why steps 3 and 4 are needed.

@JackSun-Developer Can you reproduce this problem by following my steps? Is this a bug?

@KomachiSion
Collaborator

The error seems to be a Distro problem. It needs further investigation.

@realJackSun
Collaborator

@zcz3313 I can reproduce your problem.
If you set the replica count to 1, you can avoid this problem with the following configuration.

In application.properties, set

nacos.naming.data.warmup=false

With this setting the problem does not occur, and Nacos works correctly.
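
One way to apply this setting from a script is sketched below. The `set_property` helper is hypothetical (it is not part of Nacos), and the file location is an assumption: Nacos distributions usually keep the file at `${nacos_home}/conf/application.properties`, but check your own deployment.

```shell
#!/usr/bin/env bash

# Set key=value in a Java properties file, replacing an existing
# "key=" line if there is one and appending the entry otherwise.
set_property() {
  local file="$1" key="$2" value="$3"
  local tmp
  touch "$file"
  tmp="$(mktemp)"
  # Keep every line that does not start with "key=" (literal match, no regex).
  awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
  echo "${key}=${value}" >> "$tmp"
  # Write back in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_property "$NACOS_HOME/conf/application.properties" nacos.naming.data.warmup false`, followed by a restart of the Nacos node; `NACOS_HOME` here is a placeholder for the install directory.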

@zcz3313
Author

zcz3313 commented May 24, 2021

> If you set the replica count to 1, you can avoid this problem with the following configuration.
>
> In application.properties, set
>
> nacos.naming.data.warmup=false
>
> With this setting the problem does not occur, and Nacos works correctly.

@JackSun-Developer
I have tested it, and it works. It also looks good in 3-replica mode.
Is this a bug? Warmup shouldn't cause this problem.
Anyway, thanks.

zcz3313 closed this as completed May 30, 2021