Some server owners have long asked for a way to stream a defined set of pages with the bot.
I used to think the best way to do this would be something like glob patterns, but that approach has several problems. For one, you would have to either re-implement glob matching or pull in a library that does it. There is also the question of whether glob syntax would clash with actual MediaWiki titles. After researching this for a bit, I decided that simply allowing regular expressions (regexps) is good enough to meet this need.
Here are the theoretical requirements for any potential implementation:
- Regexps can be passed only to the `--title` attribute of the configuration.
- Regexps should be passed using the `--title /.*/` syntax (i.e. always wrapped in `//`), since this keeps the number of parameters to a minimum and gives a simple way to tell what is a regexp and what is not (`str.StartsWith('/')`). This needs to account for articles like https://en.wikipedia.org/wiki//b/, which are unlikely to have their own stream feeds but probably still need some way to be referenced in EventStreams (e.g. `:/b/`?).
- The code should define a reasonable `MatchTimeout` (0.5 seconds?) and try/catch errors from slow regexps to prevent ReDoS attacks.
- Regexps should be tested against the timeout when they are passed, and the bot should reject slow ones at the configuration step (`!openStream`).
- Regexps should match the whole string for clarity (`^…$`) and should not ignore case.
- (If we can find a way) Regexps should allow as few features as possible to keep them simple.
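To make the syntax rules above concrete, here is a minimal sketch of how a `--title` value could be classified and matched. It is written in Python purely for illustration (the `str.StartsWith` and `MatchTimeout` names suggest the bot itself targets .NET), and `TitleFilter` and `parse_title` are hypothetical names, not part of the bot. The `:/b/` escape is only the idea floated above, not a settled syntax. Python's standard `re` module has no match timeout, so the ReDoS requirement is not covered here.

```python
import re
from dataclasses import dataclass
from typing import Optional, Pattern


@dataclass
class TitleFilter:
    """Hypothetical holder for either a literal title or a compiled regexp."""
    literal: Optional[str] = None
    pattern: Optional[Pattern[str]] = None

    def matches(self, title: str) -> bool:
        if self.pattern is not None:
            # fullmatch requires the whole title to match,
            # which is the intent behind wrapping a pattern in ^...$
            return self.pattern.fullmatch(title) is not None
        return title == self.literal


def parse_title(value: str) -> TitleFilter:
    """Classify a --title value as a literal title or a /regexp/."""
    # Assumed escape: ":/b/" refers to the literal page titled "/b/".
    if value.startswith(":/"):
        return TitleFilter(literal=value[1:])
    # "/.../" is treated as a regexp; both slashes and a non-empty body are required.
    if len(value) > 2 and value.startswith("/") and value.endswith("/"):
        # No re.IGNORECASE flag, so matching stays case-sensitive.
        # re.compile raises re.error on an invalid pattern; the caller
        # would reject the configuration in that case.
        return TitleFilter(pattern=re.compile(value[1:-1]))
    return TitleFilter(literal=value)
```

With this sketch, `parse_title("/Talk:.*/")` yields a filter that accepts `Talk:Foo` but rejects both `My Talk:Foo` (not a whole-string match) and `talk:foo` (case differs), while `parse_title(":/b/")` matches only the literal title `/b/`.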
There might be other notable things I forgot; please report them if you read this issue and can think of any.
Another idea: add a `--title-matches` key (the name is open to discussion; `--in-title`?) that applies only to `--namespace` streams. That would be simpler to process and would require fewer changes to the current shaky structure of the code.
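A rough sketch of what that narrower variant could look like, again in Python for illustration only. The event shape (a dict with `namespace` and `title` keys) and the function name `filter_namespace_events` are assumptions made for this example, not the bot's actual data model.

```python
import re
from typing import Any, Dict, Iterable, Iterator


def filter_namespace_events(
    events: Iterable[Dict[str, Any]],
    namespace: int,
    title_pattern: str,
) -> Iterator[Dict[str, Any]]:
    """Yield events from one namespace whose title fully matches the pattern."""
    # Compile once per stream; matching is case-sensitive (no IGNORECASE flag).
    compiled = re.compile(title_pattern)
    for event in events:
        if event.get("namespace") != namespace:
            continue
        # fullmatch: the whole title has to match, not just a substring.
        if compiled.fullmatch(event.get("title", "")):
            yield event
```

Because the namespace check already narrows the stream, the title regexp here acts only as an extra filter, which is what would keep the change small.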