-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update existing lists #144
base: main
Are you sure you want to change the base?
Conversation
This is quite a large diff. Happy to iterate if there are things you want to change. |
I'm not sure list patching can work reliably. A domain can be in another list instead of the previous list after blocklist update and would mess up the patching process. For example: We have two lists and the first one is full. If a new domain is added to the first list after the blocklist update, the last domain would end up being in the second list. We wouldn't be able to check its existence without going through all the current lists. Before blocklist update
After blocklist update
Things become even more complicated if the next update is from a different set of blockklists. Therefore, to make it work reliably, we can only go through all the current lists but that would defeat the purpose to save time and requests because there isn't an API to get all items of all the lists. |
This change does check all existing lists. It is a reliable operation and idempotent such that should an error occur, the same command can be used again to complete any missing changes. The benefits of this change are not only (slightly) faster operations, but avoiding any period where the lists or rules are unapplied which would leave the network and users vulnerable to accessing otherwise blocked hosts. The maximum requests in this flow is 2x number_of_lists (GET + PATCH to every list) in the worst case. The requests in the current flow is 2x number_of_lists (DELETE + POST to recreate every list) in the normal case. |
To avoid the need to delete all lists and recreate them, we can update existing lists only when their contents had changed. This processes the diffs between the desired list of domains and the existing lists. Removing entires that are no longer in the desired lists and appending any new entries. This prefers to minimize the number of PATCH calls by appending entries to the lists we're already patching for the removals. The priority for additions is: 1. Add to lists we're already patching for removals. 2. Add to existing lists with fewer than LIST_ITEM_SIZE entries. 3. Create a new list.
I've been using this PR and the performance difference is pretty huge, using only OISD small it takes the average runtime from 2m45s to about 30s. (Comparing against a full delete+create & this PR to compare+patch). I have came across a bug though, the script errors when no lists currently exist, it only works where at least 1 list already exists. Example workflow:
|
Thanks for the feedback and bug report. Will get that bug patched today.
…On Mon, Dec 30, 2024 at 7:07 AM Kieran Brown ***@***.***> wrote:
I've been using this PR and the performance difference is pretty huge,
using only OISD small it takes the average runtime from 2m45s to about 30s.
(Comparing against a full delete+create & this PR to compare+patch).
I have came across a bug though, the script errors when no lists currently
exist, it only works where at least 1 list already exists.
Example workflow:
https://github.com/kieranbrown/dns/actions/runs/12520357125/job/34925911614
file:///home/runner/work/dns/dns/lib/api.js:78
const cgpsLists = lists.filter(({ name }) => name.startsWith("CGPS List"));
^
TypeError: Cannot read properties of null (reading 'filter')
at synchronizeZeroTrustLists (file:///home/runner/work/dns/dns/lib/api.js:78:27)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
at async file:///home/runner/work/dns/dns/cf_list_create.js:143:3
—
Reply to this email directly, view it on GitHub
<#144 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABTT2ED3CNFAVTVOVV2MKFT2IFORVAVCNFSM6AAAAABS2ELTJKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNRVGU4TQNZZGU>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Can confirm that issue is resolved 🙏 I've just encountered another issue when adding a new blocklist. Running a full delete+create fixed the issue so it seems to be related to adding a new Cloudflare list during synchronisation. Should be able to reproduce by starting off with a small blocklist, let it create at least 1 Cloudflare list, then add another blocklist causing it to create more Cloudflare lists. Example workflow:
|
Where there are 0 existing CGPS List Chunks, but other existing list(s), the calculation for the next list number would result in NaN
Apologies for this oversight and thanks for the continued testing and feedback. |
Magnificent work; thank you for being patient and quickly fixing new problems as they come up. Because there are so many moving parts involved in this process, I'd continue testing the new code further before merging.
Aside from what I've mentioned, it works beautifully. Thanks again. |
Fixed your 3. comment since it's an easy change. For 1. and 2. the change would be more intrusive and other than organization, doesn't change the outcome much. |
I’m still coming across the Chuck NaN problem when the script wants to create a new list https://github.com/kieranbrown/dns/actions/runs/12702722844/job/35409402648 |
Do you have any manually named lists using the same prefix?
I’m not going to be able to look at this for a while. I’ve been affected by
the California wildfires.
…On Fri, Jan 10, 2025 at 1:00 AM Kieran Brown ***@***.***> wrote:
I’m still coming across the Chuck NaN problem when the script wants to
create a new list
https://github.com/kieranbrown/dns/actions/runs/12702722844/job/35409402648
—
Reply to this email directly, view it on GitHub
<#144 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABTT2ED4KD5UDQ5JD5QXKST2J6D3LAVCNFSM6AAAAABS2ELTJKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKOBSGEYDKOBYGM>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I would like to test this out myself. ps: I'm very sorry to hear you are impacted by the CA fires. That is terrible and I wish you all the best. |
To avoid the need to delete all lists and recreate them, we can update existing lists only when their contents had changed.
This processes the diffs between the desired list of domains and the existing lists. Removing entires that are no longer in the desired list and appending any new entries. This prefers to minimize the number of PATCH calls by appending entries to the lists we're already patching for the removals.
The priority for additions is: