Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out of memory exception when export a content type that has more than 80,000 content items #11406

Open
HengzheLi opened this issue Mar 19, 2022 · 11 comments

Comments

@HengzheLi
Copy link

Describe the bug

When use remote deployment feature and deployment a recipe that export the content items of a content type than import those items to a remote instance. The Out-of-memory exception is thrown.

"exceptions": [
      {
        "Depth": 0,
        "ClassName": "System.OutOfMemoryException",
        "Message": "Exception of type 'System.OutOfMemoryException' was thrown.",
        "Source": "System.Private.CoreLib",
        "StackTraceString": "   at System.Collections.Generic.Dictionary`2.Resize(Int32 newSize, Boolean forceNewHashCodes)\r\n   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)\r\n   at Newtonsoft.Json.Linq.JPropertyKeyedCollection.InsertItem(Int32 index, JToken item)\r\n   at System.Collections.ObjectModel.Collection`1.Insert(Int32 index, T item)\r\n   at Newtonsoft.Json.Linq.JContainer.InsertItem(Int32 index, JToken item, Boolean skipParentCheck)\r\n   at Newtonsoft.Json.Linq.JObject.InsertItem(Int32 index, JToken item, Boolean skipParentCheck)\r\n   at Newtonsoft.Json.Linq.JContainer.TryAddInternal(Int32 index, Object content, Boolean skipParentCheck)\r\n   at Newtonsoft.Json.Linq.JContainer.Add(Object content)\r\n   at Newtonsoft.Json.Linq.JContainer.ReadProperty(JsonReader r, JsonLoadSettings settings, IJsonLineInfo lineInfo, JContainer parent)\r\n   at Newtonsoft.Json.Linq.JContainer.ReadContentFrom(JsonReader r, JsonLoadSettings settings)\r\n   at Newtonsoft.Json.Linq.JContainer.ReadTokenFrom(JsonReader reader, JsonLoadSettings options)\r\n   at Newtonsoft.Json.Linq.JProperty.Load(JsonReader reader, JsonLoadSettings settings)\r\n   at OrchardCore.ContentManagement.ContentItemConverter.ReadJson(JsonReader reader, Type objectType, Object existingValue, JsonSerializer serializer)\r\n   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.DeserializeConvertable(JsonConverter converter, JsonReader reader, Type objectType, Object existingValue)\r\n   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)\r\n   at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)\r\n   at OrchardCore.Abstractions.Pooling.PoolingJsonSerializer.Deserialize(String content, Type type)\r\n   at OrchardCore.Data.PoolingJsonContentSerializer.Deserialize(String content, Type type)\r\n   at YesSql.Session.Get[T](IList`1 documents, String collection)\r\n   at YesSql.Services.DefaultQuery.Query`1.ListImpl()\r\n   at YesSql.Services.DefaultQuery.Query`1.ListImpl()\r\n   at OrchardCore.Contents.Deployment.ContentDeploymentSource.ProcessDeploymentStepAsync(DeploymentStep step, DeploymentPlanResult result)\r\n   at OrchardCore.Deployment.Core.Services.DeploymentManager.ExecuteDeploymentPlanAsync(DeploymentPlan deploymentPlan, DeploymentPlanResult result)\r\n   at OrchardCore.Deployment.Remote.Controllers.ExportRemoteInstanceController.Execute(Int32 id, String remoteInstanceId, String returnUrl)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeInnerFilterAsync()\r\n--- End of stack trace from previous location ---\r\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextExceptionFilterAsync>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)",
        "RemoteStackTraceString": null,
        "RemoteStackIndex": 0,
        "HResult": -2147024882,
        "HelpURL": null
      }
    ],

Expected behavior

The remote deployment can be finish with succeed.

Screenshots

If applicable, add screenshots to help explain your problem.
N/A

@Piedone
Copy link
Member

Piedone commented Mar 21, 2022

Would it be possible for your to share the offending export file?

@HengzheLi
Copy link
Author

Do you mean the deployment recipe or the exported result file? If the exported file, I think it can't be. The exception was thrown before the exported file was generated.

@HengzheLi
Copy link
Author

I follow the StrackTrace of the exception and found that in ContentDeploymentSource.cs OC loads all content to local memory before flush them to the exported zip file.

foreach (var contentItem in await _session.Query<ContentItem, ContentItemIndex>(x => x.Published && x.ContentType.IsIn(contentStep.ContentTypes)).ListAsync())
            {
                var objectData = JObject.FromObject(contentItem);

                // Don't serialize the Id as it could be interpreted as an updated object when added back to YesSql.
                objectData.Remove(nameof(ContentItem.Id));

                if (contentStep.ExportAsSetupRecipe)
                {
                    objectData[nameof(ContentItem.Owner)] = "[js: parameters('AdminUserId')]";
                    objectData[nameof(ContentItem.Author)] = "[js: parameters('AdminUsername')]";
                    objectData[nameof(ContentItem.ContentItemId)] = "[js: uuid()]";
                    objectData.Remove(nameof(ContentItem.ContentItemVersionId));
                    objectData.Remove(nameof(ContentItem.CreatedUtc));
                    objectData.Remove(nameof(ContentItem.ModifiedUtc));
                    objectData.Remove(nameof(ContentItem.PublishedUtc));
                }
                data.Add(objectData);
            }

I think it makes sense why the Out-of-memory exception was thrown based on the code snippet which I capture from ProcessDeploymentStepAsync method. The memory would run out eventually if the exporting list is large enough.

@HengzheLi HengzheLi changed the title Out of memory exception when export a content type that has move than 80,000 content items Out of memory exception when export a content type that has more than 80,000 content items Mar 22, 2022
@Piedone
Copy link
Member

Piedone commented Mar 22, 2022

Sorry, I misinterpreted this being an issue with import... Yeah, of course you can't share the export file then.

This indeed looks like a risky code. Would you venture into fixing it in a pull request?

@HengzheLi
Copy link
Author

I don't know. Seems maybe a stream object can help on this. But as I see, for now, it looks like a little hard to pass a stream out from a DeloymentSource to an upper caller.

@jtkech
Copy link
Member

jtkech commented Mar 25, 2022

Would need to be done in batches of lower size, each batch being processed in an isolated shell scope (we have helpers for this) that will commit its own transaction on disposing.

@Skrypt
Copy link
Contributor

Skrypt commented Mar 29, 2022

What if the actual batches fail in the middle of the process. Maybe it would be worth investing time in making this more robust.

@jtkech
Copy link
Member

jtkech commented Mar 29, 2022

What if the actual batches fail in the middle of the process

Yes, I thought about that, was just for info, in fact related to custom imports (not exports but still with batches) we do from old databases. In our case if it fails we have UI notifications and logged errors. Then on a new try, items already imported are not processed again (idempotency).

@sebastienros
Copy link
Member

The solution is to follow the TODO here: https://github.com/OrchardCMS/OrchardCore/blob/main/src/OrchardCore.Modules/OrchardCore.Contents/Deployment/ContentDeploymentSource.cs#L22

Batch the query, and create a separate file for each batch, with a step that links to the file.

@sebastienros sebastienros added this to the 1.x milestone Apr 7, 2022
@MikeAlhayek
Copy link
Member

There is also a problem when trying to execute a large recipe file. The request times out. batching the request may solve the timeout issue too.

@Piedone Piedone added perf and removed needs triage labels Apr 25, 2024
@Piedone Piedone modified the milestones: 2.x, 2.1 May 3, 2024
@MikeAlhayek MikeAlhayek modified the milestones: 2.1, 2.x Nov 13, 2024
Copy link
Contributor

We triaged this issue and set the milestone according to the priority we think is appropriate (see the docs on how we triage and prioritize issues).

This indicates when the core team may start working on it. However, if you'd like to contribute, we'd warmly welcome you to do that anytime. See our guide on contributions here.

@Piedone Piedone added the P2 label Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants