I've got a page with a bunch of JavaScript (which I believe Mechanize will make no effort to handle), and helpfully there is something in that script that looks like the start of a tag. This ends badly! I get the ParseError("nested SELECTs") exception raised when a genuine select is found.
So, two questions:
1. Long term, is it right that Mechanize is parsing the contents of the <script> tag at all? Not handling javascript is fine, but trying to parse that javascript as if it was HTML won't work well.
2. Can anyone think of a short term fix to get me moving? Can I manipulate pages within Mechanize (in this case so I can delete the <script> elements)?
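To illustrate what I would expect for (1), here is a minimal sketch using the Python 3 standard library's `html.parser` rather than Mechanize's own parser. It treats the contents of a `<script>` element as raw text, so a fake tag inside a JavaScript string is never reported as a start tag. The page content and the names `StartTagCollector` and `collect_start_tags` are made up for the example.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Records the name of every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def collect_start_tags(html):
    """Parse `html` and return the start tags in document order."""
    parser = StartTagCollector()
    parser.feed(html)
    parser.close()
    return parser.start_tags


# Hypothetical page: the script writes something that looks like a tag.
PAGE = (
    "<form>"
    "<script>document.write('<select name=\"fake\">');</script>"
    "<select name='real'><option>x</option></select>"
    "</form>"
)

if __name__ == "__main__":
    # Only one "select" should show up: the genuine one.
    print(collect_start_tags(PAGE))
```

The fake `<select` inside the script is skipped here because the parser switches to a raw-text mode until it sees `</script>`. Something similar in Mechanize's parser is what I have in mind.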
I found an answer for (2) - see below. Epic hack, but works :)
No idea how to fix (1), won't waste time looking until someone confirms it is at least something that should be fixed. [For instance, I don't know if a script tag can ever have children - if so that would complicate things, as we'd only want to exclude the script contents and not its children]
_sgmllib_copy.py:parse_starttag()

             self.__starttag_text = rawdata[start_pos:j]
    +        if not self.rawdata[i-1] == "'": # HACK - quick way to exclude fake tags hiding in javascript
    +            self.finish_starttag(tag, attrs)
             return j
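A less invasive alternative to patching the parser might be to strip the `<script>` elements from the page text before it gets parsed, as asked in (2). This is only a sketch under assumptions: the helper name `strip_script_elements` is made up, it works on a decoded string, and a regular expression like this will miss unusual cases (for example a literal `</script>` inside a JavaScript string).

```python
import re

# Matches an opening <script ...> tag, everything up to the first
# closing </script>, and the closing tag itself (case-insensitive,
# spanning newlines).
SCRIPT_ELEMENT = re.compile(
    r"<script\b[^>]*>.*?</script\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_script_elements(html):
    """Return `html` with every <script>...</script> element removed."""
    return SCRIPT_ELEMENT.sub("", html)


if __name__ == "__main__":
    # Hypothetical page with a fake tag hiding in a JavaScript string.
    page = (
        "<html><script type='text/javascript'>"
        "document.write('<select');"
        "</script><select name='a'></select></html>"
    )
    print(strip_script_elements(page))
```

I believe the cleaned text can then be handed back to Mechanize by fetching the current response, replacing its body with `response.set_data(...)` and calling `set_response(response)` on the browser. I have not verified that against this version, and the body may need decoding and re-encoding if it comes back as bytes.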