[Bug/Support]: elpaca<-create is potentially non-atomic causing some order to be processed twice #421

Open
2 of 3 tasks
Alan-Chen99 opened this issue Feb 9, 2025 · 5 comments

Comments

@Alan-Chen99

Confirmation

  • I have checked the documentation (README, Wiki, docstrings, etc)
  • I am checking these without reading them.
  • I have searched previous issues to see if my question is a duplicate.

Elpaca Version

forked from 141b2f5

Operating System

ubuntu

Description

elpaca<-create calls elpaca-menu-functions which calls the async url-retrieve-synchronously. During that another package will potentially depend on the package and cause elpaca<-create to be called again for the same id.

This comes up after using the new lock file feature where some package does not search inside menus, so it is no longer the case that all the menus are fetched up front.

@progfolio
Owner

progfolio commented Feb 9, 2025

Thanks for the report.

> elpaca<-create calls elpaca-menu-functions which calls the async
> url-retrieve-synchronously.

Typo? url-retrieve-synchronously is synchronous.

> During that another package will potentially
> depend on the package and cause elpaca<-create to be called again for the same
> id.
>
> This comes up after using the new lock file feature where some package does not
> search inside menus, so it is no longer the case that all the menus are fetched
> up front.

Can you provide a test using the elpaca-test macro?
It may help me understand the issue more clearly.

The way the lock file is intended to work is that the recipes for all init packages and their dependencies are written. Then they are used as the first menu. This should prevent other menus from being checked during init altogether.

@Alan-Chen99
Author

Unfortunately, the only logs I have are at
https://github.com/Alan-Chen99/dotfiles3/actions/runs/13228588924/job/36922616492#logs
This doesn't reproduce locally; I'm not sure why.
In the logs, compat failed with "Unable to find main elisp file for \"compat\"", which the stack trace shows occurring during the invocation of elpaca<-create(compat). I'm not sure whether the "Unable to find main elisp file" error is caused by this; it might be due to another problem.
Eventually this leads to dependents of compat not being marked as failed properly, though I'm not sure why.

@progfolio
Owner

It's hard to tell exactly what's going on based off of those logs alone.
It looks like you've introduced some advice in the system, which I have no way to evaluate.
If you're able to find a reliable way to reproduce the issue, feel free to comment here and we can look into it more.

@progfolio
Owner

I've found a reliable way to reproduce the issue.
This will fail for Emacs 29 and below:

(elpaca-test
  :interactive t
  :init
  (elpaca transient)
  (elpaca magit))

Your initial analysis seems probable. Compat is queued twice.

> > elpaca<-create calls elpaca-menu-functions which calls the async
> > url-retrieve-synchronously.
>
> Typo? url-retrieve-synchronously is synchronous.

After digging into the source of url-retrieve-synchronously, I see it calls accept-process-output, which will allow subprocesses to run. I think that may have something to do with it, but I still don't see how the race could occur, considering queuing the packages all takes place on the main elisp thread.

Adding an explicit declaration for compat works around the issue:

(elpaca-test
  :interactive t
  :init
  (elpaca compat)
  (elpaca transient)
  (elpaca magit))

And the issue doesn't occur when the menu caches are present.
I'll have to dig into this more.

@Alan-Chen99
Author

> After digging into the source of url-retrieve-synchronously, I see it calls accept-process-output, which will allow subprocesses to run. I think that may have something to do with it, but I still don't see how the race could occur, considering queuing the packages all takes place on the main elisp thread.

I believe that during accept-process-output, other process filters get run, which can cause a package to advance to the next step and queue its dependencies.
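
The suspected mechanism is re-entrancy on a single thread, not true parallelism. A minimal sketch of it follows. It is not Elpaca's code: every `demo-` name is hypothetical, and `demo-fetch-menu` only stands in for a blocking call such as url-retrieve-synchronously, whose wait in accept-process-output can let a process filter run. The guarded variant shows one possible mitigation (reserving the id before blocking), offered as an assumption rather than the project's actual fix.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defvar demo-created nil
  "Ids created so far.  A duplicate entry means an id was processed twice.")

(defvar demo-pending nil
  "Ids whose creation has started but not finished (guarded variant only).")

(defvar demo-during-fetch nil
  "Function called once in the middle of the simulated fetch.
Hypothetical stand-in for a process filter that runs while the
main thread waits in `accept-process-output'.")

(defun demo-fetch-menu ()
  "Simulate a blocking menu fetch that lets another callback run."
  (when demo-during-fetch
    (let ((callback demo-during-fetch))
      (setq demo-during-fetch nil)      ; fire only once
      (funcall callback))))

(defun demo-create-unsafe (id)
  "Check-then-act: ID is recorded only after the fetch returns."
  (unless (memq id demo-created)
    (demo-fetch-menu)                   ; a re-entrant call can happen here
    (push id demo-created)))

(defun demo-create-guarded (id)
  "Reserve ID before blocking so a re-entrant call sees it and bails out."
  (unless (or (memq id demo-pending) (memq id demo-created))
    (push id demo-pending)
    (unwind-protect
        (progn
          (demo-fetch-menu)
          (push id demo-created))
      (setq demo-pending (delq id demo-pending)))))

(defun demo-run (create-fn)
  "Call CREATE-FN for `compat' while a dependent requests `compat' mid-fetch.
Return the resulting list of created ids."
  (setq demo-created nil
        demo-pending nil
        demo-during-fetch (lambda () (funcall create-fn 'compat)))
  (funcall create-fn 'compat)
  demo-created)
```

Evaluating `(demo-run #'demo-create-unsafe)` records compat twice, while `(demo-run #'demo-create-guarded)` records it once, even though everything runs on the main thread.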
