Save custom attribute like wait_time in SeleniumRequest into Request.meta #67

Open. laggardkernel wants to merge 1 commit into base: develop

@laggardkernel laggardkernel commented Jul 10, 2020

The canonical way to share information between a Request and a downloader middleware is Request.meta.
Custom attributes such as Request.custom_attr should be avoided, because they may be dropped
when the Scheduler serializes and deserializes the request.

For example, scrapy-redis converts a Request into a dict with scrapy.utils.reqser.request_to_dict(), so any custom attribute set on the Request is lost. Only the fields listed in the dict below survive:

# scrapy.utils.reqser
def request_to_dict(request, spider=None):
    """Convert Request object to a dict.

    If a spider is given, it will try to find out the name of the spider method
    used in the callback and store that as the callback.
    """
    cb = request.callback
    if callable(cb):
        cb = _find_method(spider, cb)
    eb = request.errback
    if callable(eb):
        eb = _find_method(spider, eb)
    d = {
        'url': to_unicode(request.url),  # urls should be safe (safe_string_url)
        'callback': cb,
        'errback': eb,
        'method': request.method,
        'headers': dict(request.headers),
        'body': request.body,
        'cookies': request.cookies,
        'meta': request.meta,
        '_encoding': request._encoding,
        'priority': request.priority,
        'dont_filter': request.dont_filter,
        'flags': request.flags,
        'cb_kwargs': request.cb_kwargs,
    }
    if type(request) is not Request:
        d['_class'] = request.__module__ + '.' + request.__class__.__name__
    return d
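
The effect can be reproduced without Scrapy. The following is a minimal sketch, not the code in this pull request: `FakeRequest`, `AttrSeleniumRequest`, `MetaSeleniumRequest`, and `round_trip` are hypothetical stand-ins that only imitate the "copy a fixed set of fields" behaviour of `request_to_dict()` shown above.

```python
# Stand-in classes for illustration only; these are NOT Scrapy's real classes.
class FakeRequest:
    def __init__(self, url, meta=None):
        self.url = url
        self.meta = dict(meta) if meta else {}

    def to_dict(self):
        # Like request_to_dict(), only a fixed set of fields is copied.
        return {"url": self.url, "meta": self.meta}

    @classmethod
    def from_dict(cls, d):
        return cls(url=d["url"], meta=d["meta"])


class AttrSeleniumRequest(FakeRequest):
    """Stores wait_time as a plain attribute (the fragile approach)."""

    def __init__(self, url, wait_time=None, **kwargs):
        super().__init__(url, **kwargs)
        self.wait_time = wait_time


class MetaSeleniumRequest(FakeRequest):
    """Stores wait_time in meta (the approach proposed here)."""

    def __init__(self, url, wait_time=None, **kwargs):
        super().__init__(url, **kwargs)
        # Only write when a value is given, so a value already present
        # in a restored meta dict is not overwritten.
        if wait_time is not None:
            self.meta["wait_time"] = wait_time


def round_trip(request):
    # Simulate a scheduler that serializes to a dict and rebuilds the request.
    return type(request).from_dict(request.to_dict())


lost = round_trip(AttrSeleniumRequest("https://example.com", wait_time=10))
kept = round_trip(MetaSeleniumRequest("https://example.com", wait_time=10))
print(lost.wait_time)          # the attribute falls back to its default
print(kept.meta["wait_time"])  # the value travels with meta
```

In this sketch a middleware would read `request.meta.get("wait_time")` instead of `request.wait_time`, which keeps working whether or not the scheduler serialized the request in between.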
