
issue of fetchpgd #14

Open
wangclin opened this issue Apr 28, 2023 · 8 comments

Comments

@wangclin

Hi, sorry to bother you again.
When I try to run only fetchpgd on CommEfficient-attacks, to test the accuracy difference between the SparseFed and FetchSGD methods, I encounter a problem.
My hyperparameters are --dataset_dir data/cifar10 --tensorboard --dataset_name CIFAR10 --model ResNet9 --mode fetchpgd --k 10000 --num_blocks 1 --num_rows 1 --num_cols 325000 --num_clients 200 --num_workers 10 --error_type virtual --local_momentum 0.0 --virtual_momentum 0.9
The k, num_rows, and num_cols values I use are the same as for fetchSGD.
But I got: CommEfficient-attacks\CommEfficient-attacks\CommEfficient\fed_worker.py", line 177, in worker_loop
sum_g += g
RuntimeError: The size of tensor a (500000) must match the size of tensor b (6568640) at non-singleton dimension 1
Could you please help me fix it? Thanks a lot!

@kiddyboots216
Owner

It looks like in fetchpgd we're trying to add together a full gradient and a sketch. The sketching step should come after the pgd step.
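A minimal sketch of that intended ordering, with toy stand-ins (`sketch`, `pgd_project`, and `worker_update` are hypothetical names, not the repo's actual API): the PGD projection should run on the full-dimensional gradient first, and only then should the gradient be compressed, so that every `g` summed in `worker_loop` has the same (sketched) shape.

```python
import torch

def sketch(grad, num_cols):
    # toy stand-in for a Count Sketch: compress to num_cols dimensions
    return grad[:, :num_cols].clone()

def pgd_project(grad):
    # toy stand-in for the attacker's PGD projection (full dimension)
    return grad.clamp(-1.0, 1.0)

def worker_update(grad, num_cols):
    grad = pgd_project(grad)        # still full-dimensional here
    return sketch(grad, num_cols)   # compressed only after the pgd step

# both workers now return same-shaped sketches, so the sum is valid;
# summing a raw gradient with an already-sketched one is what produces
# the "size of tensor a (500000) must match ... (6568640)" error
g1 = worker_update(torch.randn(1, 100), 10)
g2 = worker_update(torch.randn(1, 100), 10)
sum_g = g1 + g2
```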

@wangclin
Author

Is fetchpgd the implementation of the SparseFed paper? I studied the source code and found that true_topk is actually the mode in line with the SparseFed paper in terms of communication efficiency.
I think fetchpgd may include more robustness than true_topk; is that right?

@kiddyboots216
Owner

I think fetchpgd is actually some unfinished code. It's supposed to evaluate the adaptive attack that combines SparseFed with my other paper, Neurotoxin. We actually have results for it, so I might need to check whether the finished implementation is on another server.

But the idea is basically that the attacker does multiple steps of PGD according to Neurotoxin, where they project the update onto the bottom-k gradients at each iteration. Then the server implements SparseFed by doing the overall top-k operation.

As you noted, the robustness defenses in SparseFed are just top-k and then the sketching.
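That attacker/server interplay can be sketched roughly as follows (all function names here are illustrative, not the repo's actual code): the attacker projects each PGD step onto the bottom-k coordinates of the benign gradient, Neurotoxin-style, and the server keeps only the overall top-k of the aggregate, SparseFed-style.

```python
import torch

def bottom_k_mask(benign_grad, k):
    # Neurotoxin-style mask: keep the coordinates where the benign
    # gradient is smallest in magnitude, so the poisoned update hides
    # in directions the server's top-k is unlikely to retain
    idx = torch.argsort(benign_grad.abs())[:k]
    mask = torch.zeros_like(benign_grad)
    mask[idx] = 1.0
    return mask

def attacker_pgd(poison_grad, benign_grad, k, steps=3, lr=0.1):
    # multiple PGD steps, each projected onto the bottom-k coordinates
    update = torch.zeros_like(poison_grad)
    mask = bottom_k_mask(benign_grad, k)
    for _ in range(steps):
        update = update - lr * poison_grad
        update = update * mask
    return update

def server_top_k(agg_grad, k):
    # SparseFed-style defense: keep only the k largest-magnitude
    # coordinates of the aggregated update
    out = torch.zeros_like(agg_grad)
    idx = torch.topk(agg_grad.abs(), k).indices
    out[idx] = agg_grad[idx]
    return out

torch.manual_seed(0)
benign = torch.randn(1000)
poison = torch.randn(1000)
malicious_update = attacker_pgd(poison, benign, k=100)
served = server_top_k(benign + malicious_update, k=50)
```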

@wangclin
Author

Thanks for the great help.
In fact, I am making some attempts in the communication-efficiency field, so I need to compare against some SOTA methods. Hence I want to compare with FetchSGD and SparseFed. Currently I use your sketch mode (FetchSGD) and true_topk mode (SparseFed), and I don't know whether that is suitable.
I tried fetchpgd but found that its processing does not match SparseFed, so I use true_topk (to calculate transmission bytes and accuracy).

@kiddyboots216
Owner

Oh, I think that for communication efficiency you should be using the main branch and not the attacks branch. In case you mean communication efficiency with robustness: FetchSGD is in the main branch, and SparseFed is just per-user-gradient-clipping + either top-k or sketching.
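The per-user clipping part of that recipe is straightforward; here is a minimal sketch (`clip_per_user` is a hypothetical helper, not the repo's actual function): each client's update is rescaled so its L2 norm is at most a fixed bound before aggregation, which caps any single user's influence.

```python
import torch

def clip_per_user(grads, max_norm):
    # per-user gradient clipping: scale each client update so its L2
    # norm is at most max_norm, then average across clients
    clipped = []
    for g in grads:
        scale = (max_norm / (g.norm() + 1e-12)).clamp(max=1.0)
        clipped.append(g * scale)
    return torch.stack(clipped).mean(dim=0)

torch.manual_seed(0)
grads = [torch.randn(10) * 5.0 for _ in range(4)]
avg = clip_per_user(grads, max_norm=1.0)
```

The clipped average would then be fed to either the top-k or the sketching compression step.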

@wangclin
Author

Yeah, I know. In fact, I want to cite true_topk because it provides a novel perspective on server-side compression (I find most papers focus on client-side compression), but I cannot find a paper that properly proposes this method.
So I want to cite SparseFed to introduce this idea in the communication-efficiency field (of course I will explain that SparseFed focuses on robustness rather than efficiency).
Or could you please point me to the source papers that inspired true_topk? That would help me a lot with my paper writing, thx~

@kiddyboots216
Owner

Sure. SparseFed doesn't introduce top-k; top-k is introduced by some of the papers that we cite in FetchSGD. In particular, the mechanism that we use with memory comes from:

https://arxiv.org/abs/1809.07599

I would note that the FetchSGD work does compare to true top-k for the server compression.
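The "with memory" mechanism from that paper (Stich et al., "Sparsified SGD with Memory") can be sketched as follows; `topk_with_memory` is an illustrative name, not the repo's actual function. The residual left over from previous rounds is added back before selecting the k largest-magnitude coordinates, so no gradient mass is permanently discarded.

```python
import torch

def topk_with_memory(grad, memory, k, lr):
    # error-feedback top-k: fold the accumulated residual back in,
    # send the k largest-magnitude coordinates, and keep everything
    # that was not sent as the new memory for the next round
    acc = memory + lr * grad
    out = torch.zeros_like(acc)
    idx = torch.topk(acc.abs(), k).indices
    out[idx] = acc[idx]
    new_memory = acc - out  # coordinates not transmitted this round
    return out, new_memory

torch.manual_seed(0)
memory = torch.zeros(1000)
grad = torch.randn(1000)
sent, memory = topk_with_memory(grad, memory, k=100, lr=0.1)
```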

@wangclin
Author

OK, thank you a lot~ How kind of you!
