
Add ESB ID capture #281

Open: wants to merge 4 commits into master
Conversation

blakefinney

Copied the GSIS_ID function to pick up the ESB ID, which is used for injuries and player headshots.
@blakefinney blakefinney mentioned this pull request Jan 6, 2017
Moved the function out, named parameters profile_url
@@ -181,6 +193,7 @@ def meta_from_soup_row(team, soup_row):
return {
'team': team,
'profile_id': profile_id_from_url(profile_url),
'esb_id': esb,
BurntSushi (Owner)

I don't think this is sufficient. I think you also need to add it as a member on the Player object?

Do you have code that works with this patch?
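What "adding it as a member on the Player object" could look like, as a purely hypothetical sketch. The real nflgame Player class reads many more fields, and the names and IDs below are illustrative only:

```python
# Hypothetical sketch only: the real nflgame Player class differs.
# It illustrates the shape of the requested change: if the scraped
# esb_id is written to players.json but never read back here, it is
# silently dropped when Player objects are built.
class Player(object):
    def __init__(self, data):
        self.team = data['team']
        self.profile_id = data['profile_id']
        # The new member under discussion; .get() tolerates older
        # players.json files that predate the esb_id field.
        self.esb_id = data.get('esb_id')

p = Player({'team': 'NE', 'profile_id': 1234567, 'esb_id': 'ABC000000'})
print(p.esb_id)  # ABC000000
```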

blakefinney (Author)

I initially thought I had it working in my code when it was fetching correctly.

Then I used: game.players.playerid(pid)

which fetched the player data from a particular game but was omitting the esb_id.

Working on it now.

@ochawkeye (Contributor)

Not trying to prematurely optimize this, but when we're dealing with thousands of players it's probably prudent to be aware that you'll end up hitting every profile_url twice: once to collect the gsis_id and again to collect the esb_id. Wouldn't it be gentler on NFL.com to just grab both IDs in one pass?
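The one-pass idea above could be sketched roughly like this. The "GSIS ID"/"ESB ID" markers are assumptions about what the profile page embeds, not verified markup, and the function name is invented for illustration:

```python
import re

# Sketch of fetching the profile page once and pulling both IDs from
# the same HTML. The field labels below mirror what NFL.com profile
# pages reportedly embedded at the time; treat them as assumptions.
GSIS_RE = re.compile(r'GSIS\s+ID:\s*([0-9-]+)')
ESB_RE = re.compile(r'ESB\s+ID:\s*([A-Z0-9]+)')

def ids_from_profile(html):
    """Return (gsis_id, esb_id) from one profile page's HTML,
    with None for whichever ID is missing."""
    gsis = GSIS_RE.search(html)
    esb = ESB_RE.search(html)
    return (gsis.group(1) if gsis else None,
            esb.group(1) if esb else None)

# One HTTP request would feed both captures (dummy page content):
page = "<!-- ESB ID: ADA218591 --> <!-- GSIS ID: 00-0023459 -->"
print(ids_from_profile(page))  # ('00-0023459', 'ADA218591')
```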

@blakefinney (Author)

@ochawkeye Probably best to. In hindsight, after running the update a few times I noticed it slowed significantly. Should I update the gsis_id function to fetch both IDs and feed them in that way?

@blakefinney (Author)

Improving efficiency. Make sure nflgame Player object has the "esb_id" passed through

@blakefinney blakefinney closed this Jan 6, 2017
@BurntSushi (Owner)

@ochawkeye Good catch, I missed that. Yeah, the update script is specifically designed around limiting HTTP traffic.

@blakefinney (Author)

@ochawkeye @BurntSushi I've been trying to see when the gsis_id function (here: https://github.com/blakefinney/nflgame/blob/5af4de168872493fbd315d92c3adda0ff8e9ba45/nflgame/update_players.py#L116 ) is called.

But I think it only gets called when a player isn't in nflgame yet (correct me if I'm wrong).

So it seems like the best approach would be to only include the ESB capture on a full scan? But I can't run that because I'm getting:

    File "C:/Python27/Lib/site-packages/nflgame/update_players.py", line 365, in run
      for _, schedule in nflgame.sched.games.itervalues():
    ValueError: too many values to unpack

What do you recommend?

@ochawkeye (Contributor)

A previously submitted pull request #224 and a current pull request #266 attempted to address that issue. You can implement something like those as a workaround.

@blakefinney (Author)

blakefinney commented Jan 6, 2017

@ochawkeye It's not the same issue (multiple commas in names); I've already patched that myself.

I believe it's a different one: the same as in issue #202.

@ochawkeye (Contributor)

Oh...that one. The old .itervalues() to .iteritems() one.

You can either do this:

for schedule in nflgame.sched.games.itervalues():
    # If the game is too far in the future, skip it...

or this:

for _, schedule in nflgame.sched.games.iteritems():
    # If the game is too far in the future, skip it...
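A self-contained illustration of the bug and both fixes. This uses Python 3 names (.values()/.items(); the Python 2 script used .itervalues()/.iteritems()), and the schedule data is dummy:

```python
# Dummy stand-in for nflgame.sched.games: keys are game ids, values
# are schedule dicts with more than two keys each.
games = {
    '2016090800': {'home': 'DEN', 'away': 'CAR', 'week': 1},
    '2016091100': {'home': 'ATL', 'away': 'TB', 'week': 1},
}

# .values() yields one schedule dict per step; unpacking a dict with
# more than two keys into two names iterates its keys and raises the
# reported error:
try:
    for _, schedule in games.values():
        pass
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)

# Fix 1: drop the extra name.
for schedule in games.values():
    assert 'home' in schedule

# Fix 2: iterate (key, value) pairs instead.
for _, schedule in games.items():
    assert 'home' in schedule
```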

@blakefinney (Author)

How about this, guys? It still works in a similar way to the original PR I submitted, but when reading a row for a rostered player it checks whether it's a full_scan. If it is, it goes ahead and scrapes the Player Profile page; if not, it skips it.

It'll also be getting the ESB ID at the same time as the GSIS ID when scraping a completely new player, so it shouldn't ever request the same profile page twice.

Obviously this significantly increases the runtime of the --full-scan players update, but hopefully doing that once will get every player up to date with their ESB ID.
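The flow described above might be sketched like this. The fetch is stubbed so the example is self-contained; all names are illustrative, not nflgame's actual code:

```python
# Record every simulated HTTP request so we can check none repeat.
FETCHES = []

def fetch_ids(profile_url):
    FETCHES.append(profile_url)          # one simulated HTTP request
    return ('00-0000000', 'AAA000000')   # dummy (gsis_id, esb_id)

def meta_for_row(team, profile_url, full_scan=False):
    meta = {'team': team, 'profile_url': profile_url}
    if full_scan:
        # Only a --full-scan pays the per-player request, and that one
        # request yields both IDs, so no page is fetched twice.
        meta['gsis_id'], meta['esb_id'] = fetch_ids(profile_url)
    return meta

meta_for_row('NE', 'http://example.com/p1')                  # no fetch
meta_for_row('NE', 'http://example.com/p1', full_scan=True)  # one fetch
print(len(FETCHES))  # 1
```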

@blakefinney blakefinney reopened this Jan 7, 2017
@ochawkeye (Contributor)

@blakefinney Would you mind tossing your players.json into this PR as well? Including it would save everyone having to full_scan to bring themselves up to latest(-ish).

@blakefinney (Author)

@ochawkeye I've actually been working on doing it slightly differently. One of the other features I was curious about was a player's injury status leading up to a game.

I've been working on scraping injury status from http://www.nfl.com/injuries, and that could include the ESB ID capture. On that page they list players in the HTML, but Beautiful Soup doesn't pick it up. However, they have a script tag that contains a JSON object of the injuries, which I've successfully scraped and then assigned to players via the esb_id.

I just want to test it again with this week's injury report, and may have a different PR up.

Let me know if it's worth doing. Maybe you as well, @BurntSushi.
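The script-tag approach described above can be sketched roughly like this. The variable name injuryData and the JSON shape are invented for illustration and do not reflect the real page's markup:

```python
import json
import re

# Pull a JSON payload out of an inline <script> tag, as described for
# http://www.nfl.com/injuries. The pattern assumes the page assigns
# the object to a JavaScript variable; both the name and the keys
# below are made-up placeholders.
SCRIPT_JSON_RE = re.compile(r'var\s+injuryData\s*=\s*(\{.*?\})\s*;', re.S)

def injuries_from_html(html):
    m = SCRIPT_JSON_RE.search(html)
    if m is None:
        return {}
    return json.loads(m.group(1))

html = '''
<script type="text/javascript">
  var injuryData = {"players": [{"esbId": "ADA218591", "status": "Questionable"}]};
</script>
'''
data = injuries_from_html(html)
# Each record could then be matched to a Player via its esb_id.
print(data["players"][0]["esbId"])  # ADA218591
```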
