-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ESB ID capture #281
base: master
Are you sure you want to change the base?
Add ESB ID capture #281
Conversation
Copied the GSIS_ID function to pick up the ESB ID which is used for Injuries and Player Headshots.
Moved the function out, named parameters profile_url
@@ -181,6 +193,7 @@ def meta_from_soup_row(team, soup_row): | |||
return { | |||
'team': team, | |||
'profile_id': profile_id_from_url(profile_url), | |||
'esb_id': esb, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is sufficient. I think you also need to add it as a member on the Player
object?
Do you have code that works with this patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially thought I had it in my code when it was fetching correctly.
Then used: game.players.playerid(pid)
Which fetched the player data from a particular game but was omitting the esb_id.
Working on it now
Not trying to prematurely optimize this, but when we're dealing with thousands of players it's probably prudent to be aware that you'll end up hitting every |
@ochawkeye Probably best to. In hindsight after running the update a few times I noticed it slowed significantly. Would update the gsis_id function to fetch both essentially and feed it in that way? |
Improving efficiency. Make sure nflgame Player object has the "esb_id" passed through |
@ochawkeye Good catch, I missed that. Yeah, the update script is specifically designed around limiting HTTP traffic. |
@ochawkeye @BurntSushi I've been trying to see when the gsis_id function (here: https://github.com/blakefinney/nflgame/blob/5af4de168872493fbd315d92c3adda0ff8e9ba45/nflgame/update_players.py#L116 ) is called. But I think it only gets called when a player isn't in nflgame yet (Correct me if I'm wrong) So it seems like the best approach would be to only include the ESB capture on a full-scan? But I can't run that because I'm getting " File "C:/Python27/Lib/site-packages/nflgame/update_players.py", line 365, in run What do you recommend? |
@ochawkeye It's not the same issue (multiple commas in names) as I've already patched that myself. It's a different one I believe, the same one as in this issue: #202 |
Oh...that one. The old Can either do this: for schedule in nflgame.sched.games.itervalues():
# If the game is too far in the future, skip it... or this: for _, schedule in nflgame.sched.games.iteritems():
# If the game is too far in the future, skip it... |
…hen picking up new players
How about this guys? Still works in a similar way to the original PR I submitted, but when reading a row for a rostered player it checks if it's on full_scan. If it is, then go ahead and scrape the Player Profile page, if not skip over it. It'll also be getting the ESB at the same time as the GSIS when scraping it for a completely new player so it shouldn't ever request the same profile page twice. Obviously significant increases the speed of the --full-scan update players, but hopefully doing that once will get every player up to date with their ESB |
@blakefinney Would you mind tossing your |
@ochawkeye I've actually been working on doing it slightly differently. One of the other features that I was curious about was player's injury status leading up to a game. I've been working on a scraping injury status from http://www.nfl.com/injuries and that could include the ESB ID capture. On that page they list players in the html, but Beautiful Soup doesn't pick it up. However they have a script tag that contains a JSON object of the injuries which I've successfully scraped and then assigned to players via the esb_id. Just want to test it again with this week's injury report and may have a different PR up. Let me know if it's worth doing. Maybe you as well @BurntSushi |
Copied the GSIS_ID function to pick up the ESB ID which is used for Injuries and Player Headshots.