Project status #37

Open
clrxbl opened this issue Aug 3, 2023 · 2 comments

clrxbl commented Aug 3, 2023

Just stumbled upon this project and I'm wondering what its current state is. Is it safe to use in, e.g., production environments?

This project seems like a good contender to replace our usage of Consul's key-value store, which we currently abuse. CAS seems like a better solution for our use case.

thraxil (Owner) commented Aug 4, 2023

Hi @clrxbl!

So, yeah, the status is in a bit of a weird spot. I've been running a 10-node cask cluster for about 9 years (the life of this codebase) and storing a few TB of my own data (my music collection, raw photos, DVD rips, backups, etc.) and it's been reliable and stable (I've lost and replaced hard drives, but it recovers like it should and I haven't lost any data).

But, I never really promoted this project or pushed it beyond my own use-cases. I don't really know of anyone else using it for production.

So, I'd consider it stable and maintained, but really only for my own fairly narrow use-case. E.g., I'm not really using the S3 integration (and I dropped Dropbox and other integrations when their APIs changed and I didn't have the bandwidth to set up new accounts, etc. just to keep them running). When I originally designed cask, security also wasn't a major concern of mine, so SSL support is pretty minimal and it all pretty much expects to be running on a private network. If your needs are similar, it might be fine.

Otherwise, I do keep it updated as far as security updates for dependencies and fixing any bugs that I encounter (that just hasn't happened in a while for me because my usage is so stable). I'm also happy to accept contributions and improvements.

I've done my share of abusing Consul KV in my day, so I have an idea where you're coming from. Relevant for comparison, I'd say:

  • restarting a cask cluster after a hardware/power failure, etc. and getting it back into a working state without losing data is way simpler than fixing a broken Consul cluster (there's no leader election or anything to complicate things). As long as you don't change the node IDs, you pretty much just have to restart any nodes that were down and it will do the rest. Even if you end up changing node IDs, it still shouldn't lose data; it will just potentially want to move a bunch of it around and that could take a while if you have a lot of data.
  • Consul KV ought to be faster in the common case though, especially for reads, since it (usually, I think) keeps a lot more data in memory. Cask is going to read data off disk every time, and that's the best case: if you request a key from a node that doesn't have it stored locally, it will check other nodes in the cluster for it before returning a response, and that can be pretty slow. Cask is a bit more oriented towards cold storage and relatively infrequent access for large amounts of larger data (it doesn't have a 512k limit like Consul KV).
  • since it's a CAS system, there's no "delete" like Consul key-value has. Once you store something in cask, the only way to really delete it would be to stop the cluster, go in manually on each node and delete the underlying files, and then start it back up.
  • cask doesn't provide any kind of query/list equivalent. Likewise, there's no equivalent to Consul's watch.
  • cask doesn't support ACLs or really have any permissions model. If you can reach a cask node to read, you can also write to it.
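
To make the CAS and read-path points above concrete, here is a minimal in-memory sketch in Python. It is not cask's code or API: the `Node` class, its `put`/`get` methods, and the choice of SHA-256 as the hash are all assumptions made for illustration. It only shows why a content-addressed store has no caller-chosen keys or overwrites, and why a read from a node that lacks the key is slower than a local read.

```python
import hashlib


class Node:
    """Toy stand-in for one storage node (hypothetical, not cask's API)."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}   # key (hex digest) -> bytes held by this node
        self.peers = []   # other Node objects in the same toy cluster

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so callers never choose it.
        # SHA-256 is an assumption here, not necessarily what cask uses.
        key = hashlib.sha256(data).hexdigest()
        # Storing identical bytes again is a no-op: same content, same key.
        self.blobs.setdefault(key, data)
        return key

    def get(self, key: str):
        # Best case: the blob is stored on this node.
        if key in self.blobs:
            return self.blobs[key]
        # Otherwise ask the other nodes before answering, which is slower.
        for peer in self.peers:
            if key in peer.blobs:
                return peer.blobs[key]
        return None


a, b = Node("a"), Node("b")
a.peers, b.peers = [b], [a]

key = a.put(b"hello")
same_key = a.put(b"hello")       # identical content maps to the same key
other_key = a.put(b"hello!")     # different content always gets a new key
via_peer = b.get(key)            # b has nothing locally, so it asks a
```

Because a key is just the hash of the stored bytes, "updating" a value means storing new content under a new key, which is why there is no in-place overwrite or delete operation in this model.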

clrxbl (Author) commented Aug 4, 2023

I've already had to raise Consul KV's 512k limit to 1 or 2MB, and its keeping more data in memory seems to have become a bit problematic over time: I'm currently dealing with it running out of memory and eating a lot of CPU when a lot of applications read data from it at once.

I've also had to deal with Consul's cluster recovery being very annoying in the past as well, e.g. leader election never electing a leader, but I believe versions 1.14+ fixed quite a few issues around leader election.

I'd be using cask internally only, so I'm generally okay with the security concerns, and I don't intend to use S3 either.

Thanks for the in-depth reply. I'll be sure to look at cask again to re-evaluate my options; it seems like an interesting project.
