Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate Alternate Repository Location URLS #16817

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ewdurbin
Copy link
Member

@ewdurbin ewdurbin commented Oct 1, 2024

User behavior for this configuration has shown that most URLs supplied are not simple indexes at all, predominately they have been source repository links.

This is a bare-bones approach to validating that the URL is actually a simple index.

project_name=self.project.name,
)
)
return self.manage_project_settings(add_alternate_repository_form=form)

Check warning

Code scanning / CodeQL

Reflected server-side cross-site scripting Medium

Cross-site scripting vulnerability due to a
user-provided value
.

Copilot Autofix AI about 9 hours ago

To fix the reflected server-side cross-site scripting vulnerability, we need to ensure that any user-provided data is properly escaped before being rendered in the template. In this case, we can use the html.escape() function from Python's standard library to escape the form data before it is included in the dictionary returned by the manage_project_settings method.

warehouse/manage/views/__init__.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/warehouse/manage/views/__init__.py b/warehouse/manage/views/__init__.py
--- a/warehouse/manage/views/__init__.py
+++ b/warehouse/manage/views/__init__.py
@@ -1161,2 +1161,3 @@
 
+        import html
         return {
@@ -1170,3 +1171,3 @@
             ),
-            "add_alternate_repository_form_class": add_alt_repo_form,
+            "add_alternate_repository_form_class": html.escape(str(add_alt_repo_form)),
         }
EOF
@@ -1161,2 +1161,3 @@

import html
return {
@@ -1170,3 +1171,3 @@
),
"add_alternate_repository_form_class": add_alt_repo_form,
"add_alternate_repository_form_class": html.escape(str(add_alt_repo_form)),
}
Copilot is powered by AI and may make mistakes. Always verify output.
@@ -14,6 +14,8 @@

import wtforms

from urllib3 import PoolManager, Timeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: ‏we use requests in other places in the code (example), and this appears to be the first time we're using urllib3 directly. Is that to get both connect and read timeouts? If so, then we can pass those as a tuple to requests like so:

response = requests.head(field.data, timeout=(1.0, 1.0), headers=...)

But that apparently has the same affect as only specifying it once, since th value will be applied to both, so we could satisfy with

response = requests.head(field.data, timeout=1.0, headers=...)

https://requests.readthedocs.io/en/latest/user/advanced/#timeouts

I'm also wondering why we aren't using request.http.head(...) from

def includeme(config):
config.add_request_method(
ThreadLocalSessionFactory(config.registry.settings.get("http")),
name="http",
reify=True,
)
more - is it that the forms don't have the request object available to them?

)
except Exception as exc:
raise wtforms.validators.ValidationError(
_(f"Unable to parse simple index at given url: {exc}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_(f"Unable to parse simple index at given url: {exc}")
_(f"The URL provided does not respond as a Simple repository: {exc}")

And maybe even tack on a URL? https://packaging.python.org/specifications/simple-repository-api/

@@ -1129,7 +1129,7 @@ def __init__(self, project, request):
self.add_alternate_repository_form_class = AddAlternateRepositoryForm

@view_config(request_method="GET")
def manage_project_settings(self):
def manage_project_settings(self, add_alternate_repository_form=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is this because we're using two methods for GET vs POST, and how to make sure that the form values aren't reset on validation failure? I don't know if I've seen this pattern before, and wanted to understand it more.‏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants