
Data inconsistencies in Octopoes: a proposal to retire celery and validate the model continuously #3585

Open
originalsouth opened this issue Sep 26, 2024 · 12 comments · Fixed by #3682
Labels
bug (Something isn't working), octopoes (Issues related to octopoes), xtdb

Comments

@originalsouth
Contributor


Describe the bug
Since VisualOctopoesStudio, several bugs regarding Octopoes' data model have come to light.

A subset of these bugs (#3498, #3564, and #3577) addressed the various mechanisms by which dangling, self-proving OOIs can occur, like the one in the screenshot below, causing all kinds of bugs, like #3205.

[screenshot: a dangling self-proving OOI]

With the fixes for these bugs merged, we still sporadically obtain such a self-proving OOI on the current main:

{
  "KATFindingType/description": "This nameserver does not have an IPv6 address.",
  "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
  "KATFindingType/impact": "Some users may not be able to reach your website without IPv6 support.",
  "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
  "KATFindingType/recommendation": "Ensure that the nameserver has an IPv6 address that can be reached.",
  "KATFindingType/risk_score": 0,
  "KATFindingType/risk_severity": "recommendation",
  "KATFindingType/source": "https://www.rfc-editor.org/rfc/rfc3901.txt",
  "object_type": "KATFindingType",
  "user_id": null,
  "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
}

With its history:

[
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 50,
    "txTime": "2024-09-26T08:24:18Z",
    "validTime": "2024-09-26T08:24:18Z"
  },
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 51,
    "txTime": "2024-09-26T08:24:18Z",
    "validTime": "2024-09-26T08:24:18Z"
  },
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 52,
    "txTime": "2024-09-26T08:24:18Z",
    "validTime": "2024-09-26T08:24:18Z"
  },
  {
    "contentHash": "0000000000000000000000000000000000000000",
    "doc": null,
    "txId": 93,
    "txTime": "2024-09-26T08:24:35Z",
    "validTime": "2024-09-26T08:24:26Z"
  },
  {
    "contentHash": "591cee413cbbb4bb17d003563afa0cd32e31dc41",
    "doc": {
      "KATFindingType/description": "This nameserver does not have an IPv6 address.",
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/impact": "Some users may not be able to reach your website without IPv6 support.",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/recommendation": "Ensure that the nameserver has an IPv6 address that can be reached.",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "recommendation",
      "KATFindingType/source": "https://www.rfc-editor.org/rfc/rfc3901.txt",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 66,
    "txTime": "2024-09-26T08:24:34Z",
    "validTime": "2024-09-26T08:24:27Z"
  }
] 

And the Origin's history (as there is only one transaction, the Origin itself is shown implicitly):

[
  {
    "contentHash": "b7bd2966301c32f40b8e1dbdd36ccc8e40c3d540",
    "doc": {
      "method": "kat_kat_finding_types_normalize",
      "origin_type": "affirmation",
      "result": [
        "KATFindingType|KAT-NAMESERVER-NO-IPV6"
      ],
      "source": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "source_method": "kat-finding-types",
      "task_id": "481e45e5-6dad-4f1e-8e5e-67c8442989aa",
      "type": "Origin",
      "xt/id": "Origin|affirmation|kat_kat_finding_types_normalize|kat-finding-types|KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 66,
    "txTime": "2024-09-26T08:24:34Z",
    "validTime": "2024-09-26T08:24:27Z"
  }
]

(Note that an XTDB transaction can contain multiple entities.)

In the history of the OOI there is something odd, namely a 9-second lag between its validTime and its txTime. This is caused by several interacting factors:

  1. The definition of "now" as valid time is often ambiguous: sometimes it is passed along all the way from Rocky and sometimes it is generated impromptu, causing time differences between operations and transactions.
  2. Some transactions in Octopoes are executed impromptu in the main thread, while others are handled by the event manager, which invokes Celery; the Celery transactions are queued and processed by parallel workers that, as is typical for parallel workers, are not synchronized.
  3. Affirmation of objects takes a relatively long time, as they are processed by Boefjes under control of the scheduler.

What seems to be happening graphically is:

---
title: Transaction graph
---
%%{init: { 'gitGraph': { 'mainBranchName': 'Octopoes'}} }%%
gitGraph:
   commit id: "Initial OOI 0"
   commit id: "Initial OOI 1" type: HIGHLIGHT
   branch "Affirmation"
   commit id: "Affirmation queued"
   checkout "Octopoes"
   commit id: "OOI delete queued"
   branch "Deletion"
   checkout "Affirmation"
   commit id: "Affirmation processed"
   checkout "Octopoes"
   merge "Affirmation"
   commit id: "Affirmed OOI"
   checkout "Deletion"
   commit id: "OOI delete at queue time"

where the timing of the deletion event and the affirmation is such that after the deletion is queued (with its validTime), the OOI is affirmed (and by the affirmation implicitly recreated), and only afterwards is the deletion executed (for that previously queued validTime).

Proposed resolution(s)

  1. Retire Celery
    The event manager in Octopoes uses Celery as a worker pool. Celery has been a source of issues within Octopoes, see for instance Slow clearence level aggregation #2171, where the upstream Celery/Billiard issue Long hangs when os.sysconf('SC_OPEN_MAX') is large celery/billiard#399 remains untouched. While Celery has nice features, it seems overkill for our case and a source of delay, in this case accumulating up to 9 s. To mitigate this behavior we would like a fast worker pool that can work in parallel but does not change the order of creation and deletion events on the same "inference-spacetimeline", as this violates causality. In addition, we would like Octopoes to be able to query the event queue, so it can block or reject certain findings based on issued deletion events. As far as we know, Celery has no trivial way to query the queue like this. This can all be easily done with a custom worker-pool implementation managed by Octopoes, retiring Celery, and thus we propose to do so (a rough sketch follows after this list).

  2. Validate the model continuously
    Similar to a filesystem, we ideally never have any errors, but if errors do occur we would like to have the tools to detect them, and possibly fix them. Currently we have neither in Octopoes. We propose to implement a low-priority thread that validates the current Octopoes state for (logical) inconsistencies; once one is found, a user can opt to have it fixed automatically where possible, or fix/mitigate the error themselves. Such a tool within Octopoes will make OpenKAT both more reliable and more transparent; additionally, it is an excellent way for an OpenKAT system administrator to file well-documented issues should such errors occur (a sketch also follows after this list).
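A minimal sketch of what such an Octopoes-managed worker pool could look like (purely illustrative: OrderedEventQueue, Event, and pending_deletion_for are hypothetical names, not existing Octopoes APIs). A single FIFO worker preserves the submission order of creation and deletion events, and the queue can be inspected so Octopoes can reject an affirmation for an OOI that has a pending deletion:

# Hypothetical sketch for proposal 1: an Octopoes-managed, queryable event queue.
# A single worker processes events strictly in submission order (preserving causality);
# pending_deletion_for() lets Octopoes inspect the queue before accepting an affirmation.
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Event:
    kind: str  # e.g. "create", "delete", "affirm"
    reference: str  # OOI primary key
    handler: Callable[["Event"], None]
    payload: Any = None


class OrderedEventQueue:
    def __init__(self) -> None:
        self._queue: "queue.Queue[Event]" = queue.Queue()
        self._pending: list[Event] = []
        self._lock = threading.Lock()
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, event: Event) -> None:
        with self._lock:
            self._pending.append(event)
        self._queue.put(event)

    def pending_deletion_for(self, reference: str) -> bool:
        """True if a deletion for this OOI is still queued, so affirmations can be rejected."""
        with self._lock:
            return any(e.kind == "delete" and e.reference == reference for e in self._pending)

    def _work(self) -> None:
        while True:
            event = self._queue.get()
            try:
                event.handler(event)
            finally:
                with self._lock:
                    self._pending.remove(event)
                self._queue.task_done()

And a sketch of the continuous validator from proposal 2, where find_inconsistencies and report stand in for whatever logical checks and reporting Octopoes would actually run against XTDB:

# Hypothetical sketch for proposal 2: a low-priority background loop that periodically
# scans the current Octopoes state for logical inconsistencies (e.g. self-proving OOIs)
# and reports them, so an operator can choose to fix or mitigate them.
import threading
import time
from typing import Any, Callable


def start_validator(
    find_inconsistencies: Callable[[], list[Any]],
    report: Callable[[Any], None],
    interval_seconds: float = 60.0,
) -> threading.Thread:
    def loop() -> None:
        while True:
            for inconsistency in find_inconsistencies():
                report(inconsistency)
            time.sleep(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True, name="octopoes-validator")
    thread.start()
    return thread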

OpenKAT version
main

@originalsouth originalsouth added bug Something isn't working octopoes Issues related to octopoes xtdb labels Sep 26, 2024
@github-project-automation github-project-automation bot moved this to Incoming features / Need assessment in KAT Sep 26, 2024
@originalsouth originalsouth moved this from Incoming features / Need assessment to To be discussed in KAT Sep 26, 2024
@dekkers
Contributor

dekkers commented Sep 26, 2024

If I understand it correctly the problem is that an affirmation and deletion can happen at the same time and the affirmation can overwrite the deletion. Deleting is probably only part of the problem, because as far as I can see this can also happen if the affirmation and an update conflict, because an affirmation saves the whole object and potentially overwrites the data of the update.

This is a pretty common problem with databases and concurrency and the usual solution is to use transactions to make sure the saved data is consistent. With XTDB we can do that using match in v1 or using ASSERT in v2. The match/assert should guard against an earlier/concurrent transaction doing conflicting changes. This should prevent saving an affirmation for an already deleted object.

Other than that I disagree that what celery currently does can be easily done with a threadpool, because we also need to take into account race conditions, resilience against crashes and scalability. Maybe it can be done with a threadpool, but I don't think we should think about it as something that is easy to do. Also note that a "fast thread pool that can work parallel" does not exist with Python if what is meant is executing Python code in parallel because of the GIL. And it will still take a few years before there is a Python without GIL that we can use...
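A hedged sketch of the match-guarded write being suggested here, posting XTDB v1 tx-ops over HTTP; the endpoint path and node routing are assumptions based on the xtdb-cli.py calls later in this thread, so treat it as illustrative rather than as the actual Octopoes change:

# Hedged sketch: guard an affirmation put with an XTDB v1 "match" op so the whole
# transaction aborts if the stored document no longer equals the expected content
# (for instance because it was deleted by an earlier/concurrent transaction).
# The submit-tx URL below is an assumption, not actual Octopoes client code.
import httpx


def affirm_if_unchanged(base_url: str, node: str, expected_doc: dict, affirmed_doc: dict) -> None:
    tx_ops = [
        ["match", expected_doc["xt/id"], expected_doc],  # whole tx aborts if the current doc differs
        ["put", affirmed_doc],
    ]
    httpx.post(f"{base_url}/_xtdb/{node}/submit-tx", json={"tx-ops": tx_ops}).raise_for_status()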

@originalsouth
Contributor Author

Thanks @dekkers for your comments and concerns.

If I understand it correctly the problem is that an affirmation and deletion can happen at the same time and the affirmation can overwrite the deletion. Deleting is probably only part of the problem, because as far as I can see this can also happen if the affirmation and an update conflict, because an affirmation saves the whole object and potentially overwrites the data of the update.

"Same time" could be somewhat misleading; the point is more that causality, i.e. the order of transactions, is not preserved by the mix of various mechanisms launched by Octopoes. Indeed, an affirmation re-saves the whole OOI.

This is a pretty common problem with databases and concurrency and the usual solution is to use transactions to make sure the saved data is consistent. With XTDB we can do that using match in v1 or using ASSERT in v2. The match/assert should guard against an earlier/concurrent transaction doing conflicting changes. This should prevent saving an affirmation for an already deleted object.

I am aware of the various "atomic" methods one can apply to prevent data from being modified in parallel. I do not see, however, how this solves our problem. Note that in this case the OOI is retroactively deleted in the past (from the future -- if that makes sense). It is more a problem of logic within Octopoes than of putting a simple lock on a transaction, because an object can be legitimately deleted and then reintroduced. Fundamentally, this logic has to be assessed by Octopoes.

Other than that I disagree that what celery currently does can be easily done with a threadpool, because we also need to take into account race conditions, resilience against crashes and scalability. Maybe it can be done with a threadpool, but I don't think we should think about it as something that is easy to do. Also note that a "fast thread pool that can work parallel" does not exist with Python if what is meant is executing Python code in parallel because of the GIL. And it will still take a few years before there is a Python without GIL that we can use...

While I agree that it is a terrible idea to write anything of this sort in Python (as stated many, many times before), Celery has been a source of frustration throughout the Octopoes project -- beyond my own experience, this is something I have also gathered from various developers in the team. Apart from that, I do not see how else we can reduce the overhead in calls and the long delays in execution, query the queue, and manage the queue execution priority by transaction type (as alluded to above).
That said, it is my opinion that the GIL concern has limited bearing on what can be done to address the issue... sure, our thread pool will not be truly parallel, but we can definitely make it parallel enough, as we are mostly making calls to XTDB. Alternatively, if we want true parallelism, we can spawn processes or use any modern language other than Python that is actually suited for the task (as Celery/Billiard also does). See also https://superfastpython.com/threadpool-python or particularly https://superfastpython.com/threadpool-python/#What_About_the_Global_Interpreter_Lock_GIL.
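To illustrate the "parallel enough" point: the workers would mostly be waiting on XTDB over HTTP, and the GIL is released during that I/O, so plain threads already overlap the waiting. A toy example (the URL and documents are placeholders):

# Toy illustration: I/O-bound work (HTTP calls to XTDB) releases the GIL while waiting,
# so a thread pool overlaps the waiting even though only one thread runs Python bytecode
# at a time. URL and documents are placeholders.
from concurrent.futures import ThreadPoolExecutor

import httpx


def submit(tx_ops: list) -> int:
    return httpx.post("http://localhost:3000/_xtdb/demo/submit-tx", json={"tx-ops": tx_ops}).status_code


all_ops = [[["put", {"xt/id": f"doc-{i}"}]] for i in range(10)]
with ThreadPoolExecutor(max_workers=4) as pool:
    statuses = list(pool.map(submit, all_ops))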

Thanks.

@originalsouth
Contributor Author

Making the following changes:

--- a/octopoes/octopoes/xtdb/client.py
+++ b/octopoes/octopoes/xtdb/client.py
@@ -197,7 +197,8 @@ class XTDBSession:
         self._operations.append(operation)

     def put(self, document: stltr | dict[str, Any], valid_time: datetime):
-        self.add((OperationType.PUT, document, valid_time))
+        self.add((OperationType.PUT, document, datetime.now(timezone.utc)))
+        self.commit()

     def commit(self) -> None:
         if self._operations:

Still yields these problems:

[
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 50,
    "txTime": "2024-09-30T12:51:04Z",
    "validTime": "2024-09-30T12:51:04Z"
  },
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 51,
    "txTime": "2024-09-30T12:51:04Z",
    "validTime": "2024-09-30T12:51:04Z"
  },
  {
    "contentHash": "3edeec9adadc2bbb8a7fa5d28c577ba46ba20881",
    "doc": {
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 54,
    "txTime": "2024-09-30T12:51:04Z",
    "validTime": "2024-09-30T12:51:04Z"
  },
  {
    "contentHash": "0000000000000000000000000000000000000000",
    "doc": null,
    "txId": 92,
    "txTime": "2024-09-30T12:51:23Z",
    "validTime": "2024-09-30T12:51:15Z"
  },
  {
    "contentHash": "591cee413cbbb4bb17d003563afa0cd32e31dc41",
    "doc": {
      "KATFindingType/description": "This nameserver does not have an IPv6 address.",
      "KATFindingType/id": "KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/impact": "Some users may not be able to reach your website without IPv6 support.",
      "KATFindingType/primary_key": "KATFindingType|KAT-NAMESERVER-NO-IPV6",
      "KATFindingType/recommendation": "Ensure that the nameserver has an IPv6 address that can be reached.",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "recommendation",
      "KATFindingType/source": "https://www.rfc-editor.org/rfc/rfc3901.txt",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NAMESERVER-NO-IPV6"
    },
    "txId": 61,
    "txTime": "2024-09-30T12:51:20Z",
    "validTime": "2024-09-30T12:51:16Z"
  }
]

Where the time lapse seems consistently smaller.

@originalsouth
Contributor Author

As @Donnype proposed, for now a good solution for this particular problem will be to have affirmations not modify the validTime. This solution will be implemented and tested. Thanks!

@originalsouth
Contributor Author

As @Donnype proposed, for now a good solution for this particular problem will be to have affirmations not modify the validTime. This solution will be implemented and tested. Thanks!

The behavior is even funnier than we imagined...

  1. Not changing the validTime is not an option because then the order of documents is not guaranteed
  2. So we add an ε > 0, in this case 1 second, to the object's validTime: then there is so little time that the affirmation does not account for another object yielding the same OOI, and the history gets kind of superseded by the old unaffirmed object, which does not get reaffirmed because it is considered already affirmed but actually is not... (similar to the situation in Affirmations keep existing even after normalizers do not yield them anymore #3499).

Remedies:

  1. Make ε bigger and hope the situation converges: this is a poor solution because yielding the object from another source can in principle take arbitrarily long.
  2. Keep ε = 1 second, and do not allow previously affirmed objects to be updated: this yields a race condition...

∴ In this case, ideally, we want "to be affirmed" OOIs not to be created again from another source, so the correct mode of operation is not to recreate OOIs that don't yield new information (see the sketch below)... (this does not solve expiration problems). Of course this only holds for KATFindingTypes... For a more generic approach we either affirm before object creation and assess whether the new information adds anything (or needs to be updated) -- or we could introduce something like shadow OOIs that wait for affirmation, where only the information yield of a merge is assessed afterwards...
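A rough sketch of "do not recreate OOIs that don't yield new information" (hypothetical helper names, deliberately naive heuristic): a bare "pending" skeleton coming from another source would then never overwrite an already affirmed document.

# Hypothetical sketch: only put an OOI if it actually adds information compared to what
# is already stored, so a skeleton recreated from another source cannot supersede an
# affirmed document. fetch_current is a stand-in for a lookup against XTDB.
from datetime import datetime
from typing import Any, Callable


def adds_information(new_doc: dict[str, Any], current_doc: dict[str, Any] | None) -> bool:
    if current_doc is None:
        return True  # nothing stored yet, so this is genuinely new information
    # Naive heuristic: only fields the stored document does not have yet count as new.
    return any(key not in current_doc for key in new_doc)


def put_if_informative(
    session,
    new_doc: dict[str, Any],
    valid_time: datetime,
    fetch_current: Callable[[str, datetime], dict[str, Any] | None],
) -> None:
    if adds_information(new_doc, fetch_current(new_doc["xt/id"], valid_time)):
        session.put(new_doc, valid_time)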

Thoughts @underdarknl @Donnype @dekkers @noamblitz? Thanks!

@originalsouth
Contributor Author

Example with 1s validTime increase patch:

[
  {
    "contentHash": "3d9195c46754adbc1798e576737b5b9eee2a6ea6",
    "doc": {
      "KATFindingType/id": "KAT-NO-CAA",
      "KATFindingType/primary_key": "KATFindingType|KAT-NO-CAA",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NO-CAA"
    },
    "txId": 15,
    "txTime": "2024-10-02T06:37:12Z",
    "validTime": "2024-10-02T06:37:12Z"
  },
  {
    "contentHash": "c110668dd1c8a187aabb1c9d11436413773cea03",
    "doc": {
      "KATFindingType/description": "This zone does not carry at least one CAA record.",
      "KATFindingType/id": "KAT-NO-CAA",
      "KATFindingType/impact": "All Certificate Authorities may issue certificates for you domain.",
      "KATFindingType/primary_key": "KATFindingType|KAT-NO-CAA",
      "KATFindingType/recommendation": "Set a CAA record to limit which CA's are allowed to issue certs.",
      "KATFindingType/risk_score": 3.9,
      "KATFindingType/risk_severity": "low",
      "KATFindingType/source": "https://letsencrypt.org/docs/caa/",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NO-CAA"
    },
    "txId": 41,
    "txTime": "2024-10-02T06:37:27Z",
    "validTime": "2024-10-02T06:37:13Z"
  },
  {
    "contentHash": "8c05c12a41a300e083191bddd270f6b4c25c34c9",
    "doc": {
      "KATFindingType/description": "This zone does not carry at least one CAA record.",
      "KATFindingType/id": "KAT-NO-CAA",
      "KATFindingType/impact": "All Certificate Authorities may issue certificates for you domain.",
      "KATFindingType/primary_key": "KATFindingType|KAT-NO-CAA",
      "KATFindingType/recommendation": "Set a CAA record to limit which CA's are allowed to issue certs.",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "KATFindingType/source": "https://letsencrypt.org/docs/caa/",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NO-CAA"
    },
    "txId": 238,
    "txTime": "2024-10-02T06:41:12Z",
    "validTime": "2024-10-02T06:41:12Z"
  }
]

@originalsouth
Contributor Author

  1. Not changing the validTime is not an option because then the order of documents is not guaranteed

To emphasize the issue with this solution:

[
  {
    "contentHash": "04a63d118eac4f1173fb82e5b61aafc27e6d7d35",
    "doc": {
      "KATFindingType/description": "This hostname does not have an SPF record.",
      "KATFindingType/id": "KAT-NO-SPF",
      "KATFindingType/impact": "E-mail from this domain can potentially be spoofed if DMARC is not (properly) implemented in combination with DKIM and SPF.",
      "KATFindingType/primary_key": "KATFindingType|KAT-NO-SPF",
      "KATFindingType/recommendation": "Set an SPF record to protect your domain.",
      "KATFindingType/risk_score": 6.9,
      "KATFindingType/risk_severity": "medium",
      "KATFindingType/source": "https://www.cloudflare.com/en-gb/learning/dns/dns-records/dns-spf-record/",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NO-SPF"
    },
    "txId": 44,
    "txTime": "2024-10-02T08:11:58Z",
    "validTime": "2024-10-02T08:11:41Z"
  },
  {
    "contentHash": "c85d1c16b7ca764bd8a626c539bc1623801ef040",
    "doc": {
      "KATFindingType/id": "KAT-NO-SPF",
      "KATFindingType/primary_key": "KATFindingType|KAT-NO-SPF",
      "KATFindingType/risk_score": 0.0,
      "KATFindingType/risk_severity": "pending",
      "object_type": "KATFindingType",
      "user_id": null,
      "xt/id": "KATFindingType|KAT-NO-SPF"
    },
    "txId": 18,
    "txTime": "2024-10-02T08:11:42Z",
    "validTime": "2024-10-02T08:11:41Z"
  }
]

Where XTDB reports that the OOI (at a validTime > "2024-10-02T08:11:41Z") is:

{
  "KATFindingType/id": "KAT-NO-SPF",
  "KATFindingType/primary_key": "KATFindingType|KAT-NO-SPF",
  "KATFindingType/risk_score": 0,
  "KATFindingType/risk_severity": "pending",
  "object_type": "KATFindingType",
  "user_id": null,
  "xt/id": "KATFindingType|KAT-NO-SPF"
}

@underdarknl
Contributor

underdarknl commented Oct 2, 2024

Given the following expectations for an OOI lifecycle:

insert t1 (declaration)
update t2 (affirm)
update t3 (affirm )
delete t2.5 (not proven anymore)
== on t3.1 it does not exist
insert t4 (declaration)
update t5 (affirm )
== on t5.1 it does exist

To make this work, we probably need to check (on t2.5) if there are any future affirmations that we need to 'undo', up to the point where we have a new declaration. From t4, any affirmation is valid again.

Another scenario and its expectation:
insert t1 (declaration)
update t2 (affirmation)
delete t3
update t2.5 (affirmation)

== on t3.1 it does not exist.

To make this work we probably need to assert on t2.5 that the object does in fact exist, which in this case it does not(ish?), or we should figure out how to make XTDB follow the valid-time (T) timeline instead of the transaction log when consolidating these transactions.
deleted?
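A hedged sketch of the "undo future affirmations" idea above; get_entity_history, originated_from_affirmation, and retract_at are made-up helpers (a real implementation would have to join the OOI history with the corresponding Origin documents), so this only shows the intent:

# Hypothetical sketch: when a delete arrives with valid time t (t2.5 above), walk the
# OOI's history after t and undo affirmation writes until something other than an
# affirmation recreated the OOI. All three helpers are made up for illustration.
from datetime import datetime


def undo_future_affirmations(
    reference: str,
    deletion_valid_time: datetime,
    get_entity_history,  # hypothetical: ascending entity history, with documents
    originated_from_affirmation,  # hypothetical: did this version come from an affirmation Origin?
    retract_at,  # hypothetical: remove the write at that valid time
) -> None:
    for version in get_entity_history(reference):
        if version.valid_time <= deletion_valid_time or version.doc is None:
            continue
        if not originated_from_affirmation(reference, version.valid_time):
            break  # a new declaration legitimately recreates the OOI; stop undoing from here
        retract_at(reference, version.valid_time)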

@dekkers
Contributor

dekkers commented Oct 2, 2024

insert t1 (declaration)
update t2 (affirm)
update t3 (affirm )
delete t2.5 (not proven anymore)
== on t3.1 it does not exist

XTDB2 seems to do this differently than XTDB1 (if I understand Benny's first example correctly). In XTDB2 the delete will cause the row to not exist anymore after t2.5:

xtdb=> INSERT INTO finding_types (_id, _valid_from) VALUES ('KAT-NAMESERVER-NO-IPV6', DATE '2024-01-01');
INSERT 0 0
xtdb=> INSERT INTO finding_types (_id, _valid_from, description, risk_score) VALUES ('KAT-NAMESERVER-NO-IPV6', DATE '2024-01-02', 'This nameserver does not have an IPv6 address.', 6);
INSERT 0 0
xtdb=> INSERT INTO finding_types (_id, _valid_from, description, risk_score) VALUES ('KAT-NAMESERVER-NO-IPV6', DATE '2024-01-03', 'This nameserver does not have an IPv6 address.', 8);
INSERT 0 0
xtdb=> SELECT *, _valid_from, _valid_to FROM finding_types FOR VALID_TIME ALL;
          _id           |                  description                   | risk_score |        _valid_from        |         _valid_to         
------------------------+------------------------------------------------+------------+---------------------------+---------------------------
 KAT-NAMESERVER-NO-IPV6 | This nameserver does not have an IPv6 address. |          8 | 2024-01-03 00:00:00+00:00 | 
 KAT-NAMESERVER-NO-IPV6 | This nameserver does not have an IPv6 address. |          6 | 2024-01-02 00:00:00+00:00 | 2024-01-03 00:00:00+00:00
 KAT-NAMESERVER-NO-IPV6 |                                                |            | 2024-01-01 00:00:00+00:00 | 2024-01-02 00:00:00+00:00
(3 rows)

xtdb=> DELETE FROM finding_types FOR VALID_TIME FROM TIMESTAMP '2024-01-02T12:00:00Z' WHERE _id = 'KAT-NAMESERVER-NO-IPV6';
DELETE 0
xtdb=> SELECT *, _valid_from, _valid_to FROM finding_types FOR VALID_TIME ALL;
          _id           |                  description                   | risk_score |        _valid_from        |         _valid_to         
------------------------+------------------------------------------------+------------+---------------------------+---------------------------
 KAT-NAMESERVER-NO-IPV6 | This nameserver does not have an IPv6 address. |          6 | 2024-01-02 00:00:00+00:00 | 2024-01-02 12:00:00+00:00
 KAT-NAMESERVER-NO-IPV6 |                                                |            | 2024-01-01 00:00:00+00:00 | 2024-01-02 00:00:00+00:00
(2 rows)

@originalsouth
Contributor Author

To make this work we probably need to assert on t2.5 that the object does in fact exist, which in this case it does not(ish?), or we should figure out how to make XTDB follow the valid-time (T) timeline instead of the transaction log when consolidating these transactions.
deleted?

Yes, this is the whole point: we need to find the delete event in the queue -- because it has not been processed yet.

Hence, this:

# When an Origin is saved while the source OOI does not exist, reject saving the results
try:
    self.ooi_repository.get(origin.source, valid_time)
except ObjectNotFoundException:
    if (
        origin.origin_type not in [OriginType.DECLARATION, OriginType.AFFIRMATION]
        and origin.source not in origin.result
    ):
        raise ValueError("Origin source of observation does not exist")
    elif origin.origin_type == OriginType.AFFIRMATION:
        logger.debug("Affirmation source %s already deleted", origin.source)
        return
is not enough to prevent the mechanism from occurring.

@originalsouth
Contributor Author

Ok... so with #3624 merged:

Let us study this scenario, and let us make it a multiple-choice question:

With this following script:

#!/usr/bin/env zsh

DIR=${0:a:h}
PREFIX="nl-kat-coordination"
XTDB="$PREFIX/octopoes/tools/xtdb-cli.py"
NODE="ShaCheDeChungKe"

cd $DIR

curl -s -H "Content-Type: application/edn" -H "Accept: application/edn" -X POST "http://localhost:3000/_xtdb/create-node" -d '{:node "'"$NODE"'"}'

DELTA=2s
$XTDB -n $NODE submit-tx '["put", {"xt/id": "fries"}]'
sleep $DELTA

NOW=$(TZ=UTC date --iso-8601=s | awk -F\+ '{print $1}')
sleep $DELTA

$XTDB -n $NODE submit-tx '["put", {"xt/id": "fries", "topping": "mayonnaise"}]'
sleep $DELTA

$XTDB -n $NODE submit-tx '["delete", "fries", "'"$NOW"'"]'

cd -

exit 0

Does the node "ShaCheDeChungKe" have fries?
A: Yes, with mayonnaise
B: No
C: Yes, without mayonnaise
D: What is ShaCheDeChungKe?

The answer might look familiar:

octopoes/tools/xtdb-cli.py -n "ShaCheDeChungKe" history --with-docs "fries" | jq

{
  "topping": "mayonaise",
  "xt/id": "fries"
}

octopoes/tools/xtdb-cli.py -n "ShaCheDeChungKe" history --with-docs "fries" | jq

[
  {
    "txTime": "2024-10-07T10:15:34Z",
    "txId": 0,
    "validTime": "2024-10-07T10:15:34Z",
    "contentHash": "1e03284fe7ea929a3e2b26026366eb394738e948",
    "doc": {
      "xt/id": "fries"
    }
  },
  {
    "txTime": "2024-10-07T10:15:41Z",
    "txId": 2,
    "validTime": "2024-10-07T10:15:37Z",
    "contentHash": "0000000000000000000000000000000000000000",
    "doc": null
  },
  {
    "txTime": "2024-10-07T10:15:39Z",
    "txId": 1,
    "validTime": "2024-10-07T10:15:39Z",
    "contentHash": "b0aebd3a3d4e001397fdf56f678dbefea31a1dd5",
    "doc": {
      "topping": "mayonnaise",
      "xt/id": "fries"
    }
  }
]

@originalsouth originalsouth moved this from To be discussed to Blocked in KAT Oct 8, 2024
@originalsouth
Contributor Author

This issue will be addressed in bulk with the transition to XTDB2, the reformulation of the data model, the introduction of nibbles, etc., and is therefore currently blocked.

@github-project-automation github-project-automation bot moved this from Blocked to Done in KAT Oct 16, 2024
@underdarknl underdarknl reopened this Oct 16, 2024
@github-project-automation github-project-automation bot moved this from Done to Backlog / To do in KAT Oct 16, 2024
@originalsouth originalsouth moved this from Backlog / To do to Blocked in KAT Oct 16, 2024