Allow disk-based persistence to be completely disabled #2228
Conversation
Thanks for the contribution! Azurite not only supports saving data on disk; it also allows saving blob metadata in SQL. Will this change also disable saving data in SQL? (If so, the parameter name/description might need to change.)
This is intended for CI testing scenarios where Azurite is started at the beginning of the pipeline, some tests are run, and then the CI pipeline ends. Generally, I find myself using Azurite for cases where I don't care much if the storage is lost; it even annoys me to have to clean up leftover storage containers. Today, in my project NuGet/Insights (which I've also used as a general test of Azurite as a side goal), I have a step that cleans up containers, queues, and tables based on a test name convention. This is workable, and necessary for tests against real Azure Storage, but for CI pipelines it would be nice to just stop the Node process and know all clean-up is done. Additionally, I've found that some of my tests run surprisingly slowly, and it turned out to be related to how queue messages are persisted in Azurite. A disk write is done per enqueue and dequeue, which means there's a lot more I/O than if LokiJS were used to persist message content, since Loki only writes periodically -- and this is noticeable in the current performance difference between Table and Queue write-then-read flows.
In short, CI runs where we start Azurite, run tests, then (optionally) stop Azurite.
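For context, a minimal sketch of that kind of CI step is shown below. It assumes Azurite is available via npx, that `--silent` is acceptable, and that `dotnet test` stands in for whatever test command the pipeline runs; the process management is illustrative rather than prescriptive.

```powershell
# Start Azurite with in-memory persistence for the duration of the CI run.
# Nothing is written to disk, so there is no workspace or leftover storage to clean up.
$azurite = Start-Process npx -ArgumentList "azurite", "--inMemoryPersistence", "--silent" -PassThru

try {
    # Run the test suite against the local Azurite endpoints (placeholder command).
    dotnet test
}
finally {
    # Stopping the process discards all in-memory state; no explicit clean-up step is needed.
    Stop-Process -Id $azurite.Id -ErrorAction SilentlyContinue
}
```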
It should still work. It does allow
Yes, I did some test hacks to run all the tests I could with this change, but I'm not sure exactly how you'd like to get full coverage. Should we run all tests both with and without the `--inMemoryPersistence` option?
Great idea! I'll give that a try. Maybe this addresses the previous point.
I'll take this as a TODO. It's certainly related to the amount of memory available to the node process.
I'll take this as a TODO. I am not sure how Node operates when it's low on memory. My guess is that big allocations can fail and trigger GC, and a severe memory shortage would cause the process to crash.
I did some testing. The approach is surprisingly scalable: when physical memory is exhausted, the OS provides virtual memory. I was not able to get any "out of memory" errors. I included a screenshot in the design doc.
Thanks for the contribution! Will update you later.
Great, thanks @blueww! Let me know if there are any updates I should make or questions I can answer. I'm happy to jump on a Teams call too if you think that will help ease communication.
Hi @joelverhagen, I have done some sanity checks with the PR:
Hey @blueww! I believe I have addressed all of your comments. Please let me know if more is needed 😃
I have tested repeatedly uploading and deleting blobs with Azurite running with "--inMemoryPersistence", but the result is not what I expected: it looks like deleted blobs are not cleared from memory. Would you please help take a look?
Test scenario: Upload 100 MB * 10 blobs, then delete all 10 blobs. Then upload 100 MB * 10 blobs again, and delete them again, and so on.
Test result: After uploading/deleting 100 MB * 10 blobs ~15 times, I hit the following error:
The memory usage chart (I waited > 15 minutes after the error happened) is shown in the screenshot. For your reference, my test script:

```powershell
$ctx = New-AzStorageContext -ConnectionString "AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;DefaultEndpointsProtocol=http;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;TableEndpoint=http://127.0.0.1:10002/devstoreaccount1;"
$containerName = "test"
New-AzStorageContainer -Name $containerName -Context $ctx
for ($i =0 ; $i -le 1000; $i++ )
{
Echo "create Blobs"
for ($j=0 ; $j -le 10; $j++)
{
Set-AzStorageBlobContent -Container $containerName -Blob "testblob_$($i)_$($j)" -File C:\temp\testfile_102400K_0 -Context $ctx -Force
}
echo "Delete blobs"
for ($j=0 ; $j -le 10; $j++)
{
Remove-AzStorageBlob -Container $containerName -Blob "testblob_$($i)_$($j)" -Context $ctx -PassThru -Force
}
}
```
@blueww - the default blob GC interval is 10 minutes (see Azurite/src/blob/utils/constants.ts, line 21 at 49d1065), so deleted blob content is not released from memory until the BlobGCManager runs. If you were to attempt the upload more than 10 minutes after process start, you should see success. I see a spike in CPU right around the 10-minute mark, so I'm guessing the blob GC actually happened, but your uploads did not continue after you hit the error (around the 3-4 minute mark).
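To illustrate the timing point, here is a hedged variation of the test script above (reusing its `$ctx`, `$containerName`, and test file) that deletes a batch of blobs and then waits comfortably past the default 10-minute GC interval before uploading the next batch; the 11-minute sleep is just a conservative guess, not a documented value.

```powershell
# Delete the previously uploaded batch of blobs.
for ($j = 0; $j -le 10; $j++)
{
    Remove-AzStorageBlob -Container $containerName -Blob "testblob_0_$($j)" -Context $ctx -Force
}

# Give the BlobGCManager time to run and release the deleted extents from memory
# (assuming the default 10-minute GC interval).
Start-Sleep -Seconds (11 * 60)

# This next batch should now succeed instead of hitting the memory limit.
for ($j = 0; $j -le 10; $j++)
{
    Set-AzStorageBlobContent -Container $containerName -Blob "testblob_1_$($j)" -File C:\temp\testfile_102400K_0 -Context $ctx -Force
}
```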
Regarding the memory consumption, I wasn't able to reproduce this behavior. I changed the GC period to 5 minutes to ease testing and set an extent limit of 5 GB. A debug log will shed light on this. You'll want to look for a line like this:
It's possible that Node isn't releasing the memory even after the blob GC occurs. I am not an expert in Node.js GC, but after a bit of reading it seems like it could be related to Node's default memory settings. There is also a way you could run a similar test against the existing disk-based persistence: I tried with a ramdisk on Ubuntu WSL (steps: https://askubuntu.com/a/453755) and similarly encountered failures (HTTP 500, not 409, of course) at my 5 GB ramdisk limit. However, in that case the blob GC didn't find any extents at all. I tried many versions of Azurite (historical tags like ...).
So an alternate explanation is that there is an existing bug in the blob GC manager that you encountered. But we need a debug log to know.
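If it helps, here is a rough sketch of how such a debug log could be captured, assuming `--inMemoryPersistence` combines with the existing `--debug` and `--silent` flags as expected and that "GC" is a reasonable pattern to search for in the log:

```powershell
# In one terminal: start Azurite with in-memory persistence and write a verbose debug log to a file.
npx azurite --inMemoryPersistence --silent --debug .\azurite-debug.log

# In another terminal, after reproducing the upload/delete cycle,
# look for blob GC activity near the end of the log.
Select-String -Path .\azurite-debug.log -Pattern "GC" | Select-Object -Last 20
```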
Thanks for the investigation! I used the Perfmon tool to chart memory/CPU. I have tested the following 3 scenarios, and this time it looks like GC works in all of them.
Besides that, to avoid customer questions about why deleted blob content still occupies memory, do you think we should mention how GC works in the README.md section "Use in-memory storage"?

Scenario 1
Scenario 2
Scenario 3
Sure, I'll do that.
Generally I've seen this behavior in several GC'd runtimes. In my head it's basically "the GC sometimes does the hard work only when memory pressure is felt" (i.e. lazily) -- which, in this case, is when the next round of blobs comes in. But this is my gut check, not specific expertise on the Node GC.
I've updated the README with these details. |
Thanks for the quick update! |
PR Branch Destination
This adds a new `--inMemoryPersistence` option, which allows the disk-based extent storage to become memory based and the LokiJS persistence settings to use only memory. I'm happy to break this up into smaller PRs if you'd like.
This addresses the following feature request (also by me): #2227
Summary of changes:
- A `docs` structure mimicking how we do public specs on the NuGet team.
- The `--inMemoryPersistence` option, plumbed through the code.
- `MemoryExtentStore` as an alternative to `FSExtentStore`.
- `QueueServer` and `TableServer` creation moved to a test factory, allowing centralized configuration.
- `--inMemoryStorage` documented right after the disk-based persistence docs.

Always Add Test Cases
All test cases (except one) work with the `--inMemoryPersistence` option. The one failing test is skipped for `--inMemoryPersistence`; it makes sense that it fails because it requires reading a checked-in legacy LokiJS disk persistence.

Design spec
The rendered design spec is available here: https://github.com/joelverhagen/Azurite/blob/in-memory/docs/designs/2023-10-in-memory-persistence.md
For comments, it's available along with the rest of the PR here:
https://github.com/Azure/Azurite/pull/2228/files#diff-638244b70a10f551a0a6a66d691f8d88cf555fa0b9da1710e3d3bac119c05c39
Performance
Unsurprisingly, blob and queue are much faster without writing extents to disk.