[feature] implement the selenium-wire webdriver #730

Open
zodman opened this issue Nov 12, 2019 · 1 comment

Comments

@zodman

zodman commented Nov 12, 2019

https://github.com/wkeeling/selenium-wire

It looks awesome: it logs the HTTP(S) requests made by the browser.

@turicas
Contributor

turicas commented Nov 9, 2024

I would love to have this feature implemented and I'm willing to create a PR if the maintainers accept the idea.

In many web scraping projects I need information about the requests the browser makes and the responses it receives: sometimes to avoid repeating a request (to save an image, for example, since the browser has already downloaded it), and other times to inspect URLs and headers. selenium-wire is a handy library for this, but it's not directly available in splinter.

Current solution

My current solution requires monkey-patching splinter.driver.webdriver.firefox. After installing requirements with pip install splinter selenium-wire blinker==1.7.0, run:

import time

def start_browser():
    from seleniumwire.webdriver import Firefox as FirefoxWireDriver
    from splinter.browser import get_driver
    from splinter.driver.webdriver import firefox as splinter_firefox

    splinter_firefox.Firefox = FirefoxWireDriver
    browser = get_driver(splinter_firefox.WebDriver)
    return browser

browser = start_browser()
browser.visit("https://brasil.io/")
time.sleep(5)
print(len(browser.driver.requests))  # 48
browser.quit()
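
Once the patched browser is running, the captured traffic can be reused instead of fetched again. The following is a minimal sketch, assuming selenium-wire's request objects expose `url` and `response` (with `headers` and `body`) as described in its README. `find_image_bodies` is a hypothetical helper name, not part of splinter or selenium-wire:

```python
def find_image_bodies(requests):
    """Map URL -> raw response body for every captured image response.

    `requests` is any iterable of selenium-wire style request objects,
    for example `browser.driver.requests`.
    """
    images = {}
    for request in requests:
        response = request.response
        if response is None:  # the browser never received a response
            continue
        content_type = response.headers.get("Content-Type", "") or ""
        if content_type.startswith("image/"):
            images[request.url] = response.body
    return images
```

It would be called as `find_image_bodies(browser.driver.requests)` before `browser.quit()`. If the server sent a `Content-Encoding` header, the body may still be compressed and need decoding before it is written to disk.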

First proposal

If you're uncomfortable supporting selenium-wire directly, would it be possible to at least change the splinter.driver.webdriver.BaseWebDriver interface? If we add a class method get_driver_class, I could implement a solution without monkey-patching:

import time

from splinter.driver.webdriver.firefox import WebDriver

class FirefoxWireDriver(WebDriver):
    @classmethod
    def get_driver_class(cls):
        # This would be called by __init__ and passed to _setup_firefox()
        from seleniumwire.webdriver import Firefox
        return Firefox

def start_browser():
    from splinter.browser import get_driver

    browser = get_driver(FirefoxWireDriver)
    return browser

browser = start_browser()
browser.visit("https://brasil.io/")
time.sleep(5)
print(len(browser.driver.requests))  # 48
browser.quit()

The code would be longer, but less hacky.
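
To make the proposal concrete, here is a simplified sketch of how a base class could consult such a hook. `BaseWebDriver` below is a stand-in written for illustration, not splinter's actual class, and `FakeFirefox`, `FakeWireFirefox` and `WireWebDriver` are hypothetical names that let the example run without starting a browser:

```python
class FakeFirefox:
    """Stand-in for selenium.webdriver.Firefox (no browser is started)."""

    def __init__(self, **options):
        self.options = options


class FakeWireFirefox(FakeFirefox):
    """Stand-in for seleniumwire.webdriver.Firefox."""

    def __init__(self, **options):
        super().__init__(**options)
        self.requests = []  # selenium-wire exposes captured requests here


class BaseWebDriver:
    """Illustrative base class: the hook decides which driver is built."""

    @classmethod
    def get_driver_class(cls):
        return FakeFirefox

    def __init__(self, **options):
        driver_class = self.get_driver_class()
        self.driver = driver_class(**options)


class WireWebDriver(BaseWebDriver):
    @classmethod
    def get_driver_class(cls):
        # A real subclass would import seleniumwire.webdriver.Firefox here.
        return FakeWireFirefox
```

With this shape, a subclass only overrides the hook, and the rest of the setup logic in `__init__` is inherited unchanged.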

Another good improvement would be to add an official way to register new drivers, something like:

from splinter import Browser
from mymodule import FirefoxWireDriver
Browser.register("firefox-wire", FirefoxWireDriver)

browser = Browser("firefox-wire")
browser.visit(...)
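
A registry like this could be small. The sketch below is a self-contained toy, not splinter's code: `_DRIVERS` only mirrors the idea of `splinter.browser._DRIVERS`, while `register` and `DummyDriver` are hypothetical names used for illustration:

```python
_DRIVERS = {}  # name -> driver class


def register(name, driver_class):
    """Make `driver_class` available under `name`."""
    if name in _DRIVERS:
        raise ValueError(f"driver {name!r} is already registered")
    _DRIVERS[name] = driver_class


def Browser(driver_name, *args, **kwargs):
    """Instantiate the driver registered under `driver_name`."""
    try:
        driver_class = _DRIVERS[driver_name]
    except KeyError:
        raise ValueError(f"unknown driver {driver_name!r}") from None
    return driver_class(*args, **kwargs)


# Expose the function as Browser.register to match the spelling proposed above.
Browser.register = register


class DummyDriver:
    """Placeholder driver so the example runs without a real browser."""

    def __init__(self, headless=False):
        self.headless = headless


Browser.register("dummy", DummyDriver)
browser = Browser("dummy", headless=True)
```

Rejecting duplicate names is a design choice in this sketch; a real implementation might prefer to let third-party code override built-in drivers.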

Second proposal

The second (and, for me, ideal) proposal would be to add direct support for selenium-wire. The library could be an optional requirement, and the implementation would be the same as above (creating the class FirefoxWireDriver in splinter.driver.webdriver.firefox_wire), plus adding "firefox-wire" to splinter.browser._DRIVERS.
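
Keeping selenium-wire optional mostly means importing it lazily and failing with a clear message when it is missing. A rough sketch of that pattern follows; `import_optional` and `load_wire_firefox` are hypothetical helper names, not existing splinter functions:

```python
import importlib


def import_optional(module_name, install_hint):
    """Import `module_name`, or raise an ImportError that says how to get it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this driver; {install_hint}"
        ) from exc


def load_wire_firefox():
    """Return seleniumwire.webdriver.Firefox, importing it only when needed."""
    webdriver = import_optional(
        "seleniumwire.webdriver", "install it with: pip install selenium-wire"
    )
    return webdriver.Firefox
```

Because the import happens inside the function, users who never ask for "firefox-wire" would not need selenium-wire installed.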

What do you think?
