Add a custom :contains-regexp() pseudo class? #117
Comments
It's important to note that Beautiful Soup already provides regex support, so we don't strictly need this, but it might be nice to incorporate regex into selectors in some way as well. We just need to decide whether we are willing to pay the cost of committing to a solution, and what that solution should look like.
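For reference, here is a small sketch of the regex support Beautiful Soup already has, which is the baseline any selector-level feature would be competing with. The HTML snippet and variable names are made up for illustration; `find_all` accepting compiled patterns for `string=` and for attribute values is existing Beautiful Soup behavior.

```python
import re

from bs4 import BeautifulSoup

# Made-up document, just for illustration.
html = """
<article>
  <p class="a">test-abc1</p>
  <p class="b">nothing here</p>
  <p class="c" data-item="142">TEST-xyz</p>
</article>
"""
soup = BeautifulSoup(html, "html.parser")

content_pattern = re.compile(r"test-[a-z\d]+", re.I)

# `string=` takes a compiled pattern and is searched against each tag's .string.
by_text = soup.find_all("p", string=content_pattern)

# Attribute filters take compiled patterns as well.
by_attr = soup.find_all("p", attrs={"data-item": re.compile(r"1[0-9]{2}")})

# `class` is multi-valued, so each value is a list; take the first entry.
text_classes = [p["class"][0] for p in by_text]
attr_classes = [p["class"][0] for p in by_attr]
```

So the open question is less "can we do regex" and more "is it worth being able to express this inside a selector string".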
If we do this, a name like When defining regex keywords, should we require them to be in the form of custom CSS variables? Or could we extend custom selectors, maybe: if you give a regex pattern instead of a selector string, it searches a tag's content? Just some ideas.
Thinking about this more, we really could use custom selectors to do regex. Currently we take a string for a given custom pseudo-class, but we could accept a custom pseudo-class object as well. The object could take a selector and a text search value (a regex or a string). You could even extend it to allow attribute values as well. So, just thinking out loud here, assuming custom is a hashable object:

import re
import soupsieve as sv

# sv.CustomPseudo is the object proposed here; it does not exist yet.
custom = {
    ':--custom-pseudo': sv.CustomPseudo(
        'p.class',
        text=re.compile(r'test-[a-z\d]+', re.I),
        attr={'data-item': re.compile(r'1[0-9]{2}')}
    )
}
sv.compile('article div > :--custom-pseudo', custom=custom)

It may even be possible to allow a custom function, but I'm not sure yet. As long as things remain hashable and pickle-able, it would be doable, but I imagine this may not always behave properly when sending in a function, as the patterns get cached. Caching a pattern with a function does not guarantee you'd get the same behavior... I think I'd pass on functions for now.
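To make the hashable and pickle-able concern a bit more concrete, here is a rough standard-library sketch. The `CustomPseudo` dataclass below is a hypothetical stand-in for the proposed object, not anything in soupsieve, and it stores attributes as a tuple of pairs on the assumption that a plain dict would get in the way of hashing.

```python
import pickle
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CustomPseudo:
    """Hypothetical container for the idea above; not a soupsieve API."""
    selector: str
    text: Optional[re.Pattern] = None
    # Tuple of (name, pattern) pairs, since a dict field would not be hashable.
    attr: Tuple[Tuple[str, re.Pattern], ...] = ()


def make_spec():
    return CustomPseudo(
        "p.class",
        text=re.compile(r"test-[a-z\d]+", re.I),
        attr=(("data-item", re.compile(r"1[0-9]{2}")),),
    )


# Two specs built separately compare and hash equal, so a cache keyed on
# the spec would find the entry again.
cache = {make_spec(): "compiled selector placeholder"}
cache_hit = make_spec() in cache

# Compiled patterns survive a pickle round trip.
pattern = re.compile(r"test-[a-z\d]+", re.I)
pattern_round_trip = pickle.loads(pickle.dumps(pattern)) == pattern


def can_pickle(obj):
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda cannot be pickled, which is part of why functions are awkward here.
lambda_ok = can_pickle(lambda tag: True)
```

Beyond pickling, two functions with identical source are still different objects, so a cache keyed on them would not see them as the same selector, which fits the hesitation about accepting functions.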
Another possibility is to extend contains and the attribute equal case to accept custom template variables. You would define regular expressions under custom variable names, each a valid identifier, and reference them in the selector with a $ prefix:

regexp = {
    'content-pattern': re.compile(r'test-[a-z\d]+', re.I),
    'attr-pattern': re.compile(r'1[0-9]{2}')
}
sv.compile('p:contains($content-pattern)[data-item=$attr-pattern]', regexp=regexp)

Maybe this is the most straightforward approach? If nothing else, it is another option. Custom patterns may still need a way to
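As a rough illustration of how the variable lookup might work, here is a standalone sketch that only uses `re`. The `$name` grammar and the `find_references` helper are assumptions made for this example; nothing like this exists in soupsieve's parser.

```python
import re

regexp = {
    "content-pattern": re.compile(r"test-[a-z\d]+", re.I),
    "attr-pattern": re.compile(r"1[0-9]{2}"),
}

# Assumed reference syntax: `$` followed by a CSS-like identifier.
REFERENCE = re.compile(r"\$(-?[A-Za-z_][\w-]*)")


def find_references(selector, patterns):
    """Return (name, compiled pattern) pairs for each `$name` in the selector."""
    resolved = []
    for match in REFERENCE.finditer(selector):
        name = match.group(1)
        if name not in patterns:
            raise KeyError(f"undefined regexp variable: ${name}")
        resolved.append((name, patterns[name]))
    return resolved


refs = find_references(
    "p:contains($content-pattern)[data-item=$attr-pattern]", regexp
)
names = [name for name, _ in refs]
```

Failing early on an undefined name seems preferable to silently treating `$name` as literal text, but that is a design choice that would still need to be made.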
If we end up doing #175, this would not be needed.
This is currently open as an exploratory idea. This would be a custom pseudo-class that would allow for regular expression searches of content. The idea would probably be not to include regular expressions directly in the pattern, but most likely references to compiled patterns.
Do we make this like contains, and have it search all children of p looking for the pattern, or do we constrain it to the target element of p? Or do we have two variants that do all children or only the target: :-regexp() and :-regexp-direct (or some other name that gets the idea across).

Anyways, this is just an idea, but maybe in the future (if we flesh this out enough), we can implement this.
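The difference between the two variants can be sketched with plain Beautiful Soup, filtering after the fact. The helper names and the sample HTML are invented for illustration; this is not how a selector-level implementation would necessarily work.

```python
import re

from bs4 import BeautifulSoup, NavigableString

# Made-up markup: one paragraph matches only via a descendant,
# the other matches in its own text.
html = (
    "<div>"
    "<p id='outer'>intro <span>test-abc</span></p>"
    "<p id='own'>test-xyz <span>other</span></p>"
    "</div>"
)
soup = BeautifulSoup(html, "html.parser")
pattern = re.compile(r"test-[a-z\d]+", re.I)


def matches_any_text(tag):
    # "All children" variant: search the tag's text including descendants.
    return bool(pattern.search(tag.get_text()))


def matches_direct_text(tag):
    # "Direct" variant: only text nodes that are direct children of the tag.
    own = "".join(c for c in tag.children if isinstance(c, NavigableString))
    return bool(pattern.search(own))


paragraphs = soup.find_all("p")
all_ids = [p["id"] for p in paragraphs if matches_any_text(p)]
direct_ids = [p["id"] for p in paragraphs if matches_direct_text(p)]
```

Here the first paragraph only matches in the "all children" variant, because its matching text lives inside the nested span.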