Unauthorized Error when Loading Large Datasets in Metadata Editor #2547

Open
vchendrix opened this issue Oct 9, 2024 · 5 comments
Labels: bug, ESS-DIVE (Issues associated with the ESS-DIVE project)

vchendrix (Collaborator) commented Oct 9, 2024

When loading a dataset with over 100 data files in the Metadata Editor, users may encounter an error message "You are not authorized to edit this data set." However, this error message is not always displayed, and the page may appear to load normally. Upon inspecting the JavaScript console, a 401 Unauthorized error is visible.

Steps to Reproduce:

  1. Load a dataset with over 100 data files in the Metadata Editor.
  2. Observe the error message "You are not authorized to edit this data set." (if displayed).
  3. Check the JavaScript console for a 401 Unauthorized error.

Expected Behavior:

The Metadata Editor should load the dataset without errors, regardless of the number of data files.

Actual Behavior:

The Metadata Editor displays an error message or fails to load the dataset, with a 401 Unauthorized error visible in the JavaScript console.

Additional Context:

  • The issue may be related to the concurrent access of the meta endpoint for all 100 files.
  • The error is intermittent, and the page may load normally on some attempts.
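
If concurrent access to the meta endpoint is the trigger, one way to test the hypothesis outside the editor is to replay the per-file requests with a cap on how many are in flight at once. The following is a minimal sketch, not MetacatUI's code: `fetch_all`, `FakeMetaEndpoint`, and the `max_in_flight` value are hypothetical names chosen for illustration, and the fake endpoint stands in for a real authenticated GET on `/meta/<PID>`.

```python
import asyncio


async def fetch_all(pids, fetch_one, max_in_flight=6):
    """Call fetch_one(pid) for every PID, with at most max_in_flight running at once."""
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(pid):
        async with gate:  # wait here while max_in_flight requests are active
            return pid, await fetch_one(pid)

    return dict(await asyncio.gather(*(guarded(pid) for pid in pids)))


class FakeMetaEndpoint:
    """Hypothetical stand-in for the meta endpoint; records peak concurrency."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def __call__(self, pid):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0)  # yield so other tasks get a chance to start
        self.in_flight -= 1
        return 200


endpoint = FakeMetaEndpoint()
pids = [f"file-{i}" for i in range(120)]  # placeholder PIDs, not real identifiers
statuses = asyncio.run(fetch_all(pids, endpoint))
```

Swapping the fake for a real request function and comparing the 401 rate at different `max_in_flight` values would show whether the failures track concurrency.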

Screen shot of error
Screenshot 2024-10-09 at 4 39 24 PM

Screencast of Issue
https://drive.google.com/file/d/1gMWnKKXP0esWlOtjtaNQHS67xHUJ3YQ4/view?usp=sharing

Example Error Information:

vchendrix added the bug label Oct 9, 2024

mburrus commented Oct 11, 2024

I'm seeing this error on another private dataset with over 100 files: https://data.ess-dive.lbl.gov/view/ess-dive-161d0c0f88a0849-20240815T185407332338

Error messages I'm getting:
A mix of 200, 401, and 501 responses, plus (failed) net::ERR_HTTP2_SERVER_REFUSED_STREAM errors, across the data file PIDs.
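
To tell a stable permissions problem from an intermittent one, it can help to record the outcome per PID on two page loads and compare which PIDs fail each time. A rough sketch follows. The function names and the sample data are made up for illustration; real data would come from the browser's network panel.

```python
from collections import Counter


def summarize(results):
    """results maps PID -> HTTP status (int) or a transport error name (str)."""
    counts = Counter(results.values())
    failed = {pid for pid, outcome in results.items() if outcome != 200}
    return counts, failed


def compare_loads(first, second):
    """Split failing PIDs into those that fail on both loads and those that fail on one."""
    _, failed_first = summarize(first)
    _, failed_second = summarize(second)
    return {
        "failed_both": failed_first & failed_second,
        "failed_once": failed_first ^ failed_second,
    }


# Hypothetical outcomes for two loads of the same package.
load_1 = {"pid-a": 200, "pid-b": 401, "pid-c": 501,
          "pid-d": "ERR_HTTP2_SERVER_REFUSED_STREAM"}
load_2 = {"pid-a": 401, "pid-b": 200, "pid-c": 501, "pid-d": 200}
report = compare_loads(load_1, load_2)
```

PIDs that land in `failed_once` point toward an intermittent cause, while PIDs in `failed_both` are better candidates for an access policy check.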

Screenshot:
Screenshot 2024-10-11 at 3 12 16 PM

More Context

  • I was able to open an edit session for this dataset on September 30th and did so without hitting this error. On that attempt, I was trying to re-save the access policy from the share panel as a way for the files to inherit the metadata access policy. Not all the data files were updated and a number of them responded with a red exclamation point.
  • Before this attempt, I had tried to update the rightsHolder from the backend and hit some kind of concurrency error.

mbjones added the ESS-DIVE (Issues associated with the ESS-DIVE project) label Oct 15, 2024

mbjones (Member) commented Oct 15, 2024

@vchendrix @mburrus Can you describe the access policies on all of the objects in this package?

  • Does your logged in ORCID have write access to all of the objects in this package?
  • Are all of the objects set with identical access policies?
    • if yes, then does the logged in ORCID have write access to all objects (metadata, resource map, data)?
    • If no, then:
      • are the metadata file and resource map file writable by the logged in ORCID?
      • are all of the data files the same policy, and what permissions does your ORCID have?

@robyngit @rushirajnenuji the http2 error may be a new side-effect of enabling HTTP/2 as a more efficient protocol on servers lately. Maybe @vchendrix can let us know if they support client requests for only HTTP/1.1 or also HTTP/2. Some of our test servers at NCEAS use HTTP/2, but AFAIK none of our production servers have it enabled yet.

vchendrix (Collaborator, Author) commented

> @vchendrix @mburrus Can you describe the access policies on all of the objects in this package?
>
>   • Does your logged in ORCID have write access to all of the objects in this package?

Yes.

>   • if yes, then does the logged in ORCID have write access to all objects (metadata, resource map, data)?

Yes.

curl -H "Authorization: Bearer $ESS_DIVE_AUTH_TOKEN" "https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/query/solr?q=id:ess-dive-3a48ab5f69ecf8d-20240108T174327967&fl=id,writePermission&wt=json" 
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"id:ess-dive-3a48ab5f69ecf8d-20240108T174327967",
      "fl":"id,writePermission",
      "wt":"javabin",
      "version":"2"}},
  "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"ess-dive-3a48ab5f69ecf8d-20240108T174327967",
        "writePermission":["CN=ess-dive-admins,DC=dataone,DC=org",
          "CN=watershed-function-sfa-admin,DC=dataone,DC=org",
          "CN=urn:node:ESS_DIVE,DC=dataone,DC=org"]}]
  }}

>   • If no, then:
>     • are the metadata file and resource map file writable by the logged in ORCID?
>     • are all of the data files the same policy, and what permissions does your ORCID have?
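
The Solr response above lists group subjects rather than an individual ORCID, so checking write access means comparing the whole set of subjects a user holds against `writePermission`. Here is a small sketch of that comparison, assuming the caller already knows the user's group memberships. `has_write_access` is a hypothetical helper and the ORCID shown is a placeholder. It only inspects the `writePermission` field and does not model anything else the server may consider.

```python
import json

# Trimmed copy of the Solr response shown in this comment.
SOLR_RESPONSE = """
{"response": {"numFound": 1, "start": 0, "docs": [
  {"id": "ess-dive-3a48ab5f69ecf8d-20240108T174327967",
   "writePermission": ["CN=ess-dive-admins,DC=dataone,DC=org",
                       "CN=watershed-function-sfa-admin,DC=dataone,DC=org",
                       "CN=urn:node:ESS_DIVE,DC=dataone,DC=org"]}]}}
"""


def has_write_access(solr_json, pid, subjects):
    """Return True if any of the given subjects appears in writePermission for pid."""
    docs = json.loads(solr_json)["response"]["docs"]
    for doc in docs:
        if doc["id"] == pid:
            return bool(set(doc.get("writePermission", [])) & set(subjects))
    return False  # PID not present in the response


pid = "ess-dive-3a48ab5f69ecf8d-20240108T174327967"
# Placeholder ORCID plus one group the user is assumed to belong to.
my_subjects = ["http://orcid.org/0000-0000-0000-0000",
               "CN=ess-dive-admins,DC=dataone,DC=org"]
can_write = has_write_access(SOLR_RESPONSE, pid, my_subjects)
```

Running the same check over every PID in the package would answer whether all objects share a policy that grants the logged-in user write access.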

> @robyngit @rushirajnenuji the http2 error may be a new side-effect of enabling HTTP/2 as a more efficient protocol on servers lately. Maybe @vchendrix can let us know if they support client requests for only HTTP/1.1 or also HTTP/2. Some of our test servers at NCEAS use HTTP/2, but AFAIK none of our production servers have it enabled yet.

Our services support HTTP/2 and HTTP/1.1

% curl -s -I -X HEAD https://data.ess-dive.lbl.gov
HTTP/2 200 
date: Tue, 15 Oct 2024 18:21:02 GMT
content-type: text/html
content-length: 10352
set-cookie: INGRESSCOOKIE=19cdfff91257311df6e1f2f92cc10ee1|47d24e7c0dbc2412b1cf3a747b30e59a; Path=/; Secure; HttpOnly
x-frame-options: SAMEORIGIN
last-modified: Fri, 23 Aug 2024 21:21:03 GMT
etag: "2870-620605a3a8dc0"
accept-ranges: bytes
access-control-allow-origin: 
access-control-allow-headers: Authorization, Content-Type, Origin, Cache-Control
access-control-allow-methods: GET, POST, PUT, OPTIONS
access-control-allow-credentials: true
strict-transport-security: max-age=15724800; includeSubDomains

% curl -s -I -X HEAD --http1.1 https://data.ess-dive.lbl.gov
HTTP/1.1 200 OK
Date: Tue, 15 Oct 2024 18:21:53 GMT
Content-Type: text/html
Content-Length: 10352
Connection: keep-alive
Set-Cookie: INGRESSCOOKIE=29a1440a13a68111b1a3b64412631550|47d24e7c0dbc2412b1cf3a747b30e59a; Path=/; Secure; HttpOnly
X-Frame-Options: SAMEORIGIN
Last-Modified: Fri, 23 Aug 2024 21:21:03 GMT
ETag: "2870-620605a3a8dc0"
Accept-Ranges: bytes
Access-Control-Allow-Origin: 
Access-Control-Allow-Headers: Authorization, Content-Type, Origin, Cache-Control
Access-Control-Allow-Methods: GET, POST, PUT, OPTIONS
Access-Control-Allow-Credentials: true
Strict-Transport-Security: max-age=15724800; includeSubDomains

mburrus commented Oct 17, 2024

Hi @mbjones, I have a follow-up on the dataset that Val provided details for.

Considering that the unauthorized error is intermittently appearing and sometimes the user can edit the dataset, I told the user that they should go ahead and reload the edit session until it works. They were able to load the edit session eventually, but then they encountered an unexpected error message when they attempted to submit changes and their dataset was corrupted.

Here are the steps they took:

  1. Reload the submit URL until it works
  2. Change some metadata fields
  3. Upload 6 new files. 4/6 files uploaded successfully with a check mark. 2/6 files failed to upload and had a red exclamation mark.
  4. Click "Submit Dataset"
  5. Submission failed with an error message at the top of the page: "The requested identifier <PID> is already used by another data object and therefore can not be used for this object..."
  6. Go back to view landing page and see that files are no longer listed in the file table. I confirmed that the resource map is now missing.

Here's the quote from the user:

While it seemed that refreshing a few times worked to load the dataset, upon submission of my edits, I received an error for two of the new files I was attempting to upload (see screenshots). It now appears that some of the edits were saved (e.g., abstract, title), however, I am no longer able to see the files at the top of my dataset under "Files in this dataset" ( I see these listed at the bottom of my dataset).

Screenshots:
Screenshot 2024-10-16 at 5 07 48 PM
Screenshot 2024-10-16 at 5 07 53 PM

vchendrix (Collaborator, Author) commented Oct 18, 2024

@mbjones I looked a little more into this issue when loading the data table in the editor. The /meta calls that return 401 errors don't seem to be authenticating the token correctly. The token is present and valid, but the following Metacat error is logged.

Error for https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/meta/ess-dive-5c5a631453d321e-20231130T213147717572

2024-10-18T21:41:24.459477371Z metacat 20241018-21:41:24: [ERROR]: D1ResourceHandler: Serializing exception with code 401: READ not allowed on ess-dive-0c7f0edc620a810-20231130T213140216564 for subject[s]: public; authenticatedUser; http://orcid.org/0000-0001-9061-8952;  [edu.ucsb.nceas.metacat.restservice.D1ResourceHandler:serializeException:591]
org.dataone.service.exceptions.NotAuthorized: READ not allowed on ess-dive-0c7f0edc620a810-20231130T213140216564 for subject[s]: public; authenticatedUser; http://orcid.org/0000-0001-9061-8952; 
2024-10-18T21:41:24.459482481Z 	at edu.ucsb.nceas.metacat.dataone.D1AuthHelper.prepareAndThrowNotAuthorized(D1AuthHelper.java:461) ~[metacat.jar:?]
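
When collecting more of these log entries, it may be worth extracting the PID and subject list from each NotAuthorized line and comparing the PID with the one in the request URL. In the excerpt as pasted, the PID in the request URL and the PID named in the NotAuthorized message are different; the thread does not say whether that is a copy/paste mix-up or part of the bug. The sketch below is illustrative only: the regular expression is fitted to the single log line above, and `parse_not_authorized` and `pid_from_meta_url` are hypothetical helper names.

```python
import re
from urllib.parse import unquote

# Pattern fitted to the one log line quoted above; other Metacat messages may differ.
LOG_PATTERN = re.compile(
    r"code (?P<code>\d+): (?P<permission>\w+) not allowed on (?P<pid>\S+) "
    r"for subject\[s\]: (?P<subjects>[^\[]*)"
)


def parse_not_authorized(line):
    """Extract status code, permission, PID, and subjects from a log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    subjects = [s.strip() for s in match["subjects"].split(";") if s.strip()]
    return {
        "code": int(match["code"]),
        "permission": match["permission"],
        "pid": match["pid"],
        "subjects": subjects,
    }


def pid_from_meta_url(url):
    """Return the (URL-decoded) PID that follows /meta/ in a request URL."""
    return unquote(url.rsplit("/meta/", 1)[1])


url = ("https://data.ess-dive.lbl.gov/catalog/d1/mn/v2/meta/"
       "ess-dive-5c5a631453d321e-20231130T213147717572")
line = ("[ERROR]: D1ResourceHandler: Serializing exception with code 401: "
        "READ not allowed on ess-dive-0c7f0edc620a810-20231130T213140216564 "
        "for subject[s]: public; authenticatedUser; "
        "http://orcid.org/0000-0001-9061-8952;  "
        "[edu.ucsb.nceas.metacat.restservice.D1ResourceHandler:serializeException:591]")

entry = parse_not_authorized(line)
pid_mismatch = entry["pid"] != pid_from_meta_url(url)
```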
