change schedule scrape helper to include offdays
#86
base: main
Signed-off-by: Jack Lichtenstein <[email protected]>
Walkthrough
The changes modify the ESPN Women's Basketball (WBB) calendar retrieval functionality.
Sequence Diagram
sequenceDiagram
participant User
participant espn_wbb_calendar
participant __onoffdays_wbb_calendar
participant ESPN API
User->>espn_wbb_calendar: Call with season, onoffdays
espn_wbb_calendar->>__onoffdays_wbb_calendar: Request calendar data
__onoffdays_wbb_calendar->>ESPN API: Fetch on-days URL
__onoffdays_wbb_calendar->>ESPN API: Fetch off-days URL
ESPN API-->>__onoffdays_wbb_calendar: Return data
__onoffdays_wbb_calendar->>__onoffdays_wbb_calendar: Concatenate DataFrames
__onoffdays_wbb_calendar->>__onoffdays_wbb_calendar: Remove duplicate URLs
__onoffdays_wbb_calendar-->>espn_wbb_calendar: Return unique calendar data
espn_wbb_calendar-->>User: Return calendar DataFrame
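The concatenate-and-deduplicate steps in the diagram can be sketched without any network access. This is a minimal illustration in plain Python, not the library's implementation: `merge_calendar_dates` and the sample timestamps are hypothetical, and only the scoreboard URL prefix and the 10-character slice are taken from the suggested diff in this review.

```python
from typing import Iterable

# Scoreboard URL prefix as it appears in the suggested diff in this review.
SCOREBOARD_URL = (
    "http://site.api.espn.com/apis/site/v2/sports/basketball/"
    "womens-college-basketball/scoreboard?dates="
)


def merge_calendar_dates(on_days: Iterable[str], off_days: Iterable[str]) -> list:
    """Combine on-day and off-day timestamps into rows with unique URLs.

    Hypothetical helper: it mirrors the concat + unique(subset="url") steps,
    but keeps the first occurrence of each URL in input order.
    """
    rows = []
    seen_urls = set()
    for raw in [*on_days, *off_days]:
        date_url = raw[:10]  # first 10 characters, like str.slice(0, 10)
        url = SCOREBOARD_URL + date_url
        if url in seen_urls:
            continue  # same date already present, skip the duplicate
        seen_urls.add(url)
        rows.append({"dates": raw, "dateURL": date_url, "url": url})
    return rows


# Made-up timestamps; the real API payload format is an assumption here.
rows = merge_calendar_dates(
    ["2024-11-04T08:00Z", "2024-11-05T08:00Z"],
    ["2024-11-05T08:00Z", "2024-11-06T08:00Z"],
)
```

Unlike this sketch, a Polars `unique` call does not guarantee row order unless `maintain_order=True` is passed, so callers that depend on chronological order may want to sort the result explicitly.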
Actionable comments posted: 0
🧹 Nitpick comments (1)
sportsdataverse/wbb/wbb_schedule.py (1)
185-200: Add error handling and consider performance optimization.
While the implementation is functional, consider these improvements:
Error Handling:
- Add error handling for failed API requests.
- Validate response data structure before accessing nested fields.
Performance:
- Consider parallelizing the API calls to reduce latency.
Here's a suggested implementation with error handling and parallel requests:
 def __onoffdays_wbb_calendar(season, **kwargs):
+    import concurrent.futures
+    import functools
+
+    def fetch_calendar(url, **kwargs):
+        try:
+            resp = download(url=url, **kwargs)
+            data = resp.json()
+            if not data or 'eventDate' not in data or 'dates' not in data['eventDate']:
+                return []
+            return data['eventDate']['dates']
+        except Exception as e:
+            print(f"Error fetching calendar data: {e}")
+            return []
+
     url_on = f"https://sports.core.api.espn.com/v2/sports/basketball/leagues/womens-college-basketball/seasons/{season}/types/2/calendar/ondays"
     url_off = f"https://sports.core.api.espn.com/v2/sports/basketball/leagues/womens-college-basketball/seasons/{season}/types/2/calendar/offdays"
-    resp_on = download(url=url_on, **kwargs)
-    txt_on = resp_on.json().get("eventDate").get("dates")
-    url_off = f"https://sports.core.api.espn.com/v2/sports/basketball/leagues/womens-college-basketball/seasons/{season}/types/2/calendar/offdays"
-    resp_off = download(url=url_off, **kwargs)
-    txt_off = resp_off.json().get("eventDate").get("dates")
+
+    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
+        fetch_with_kwargs = functools.partial(fetch_calendar, **kwargs)
+        futures = [
+            executor.submit(fetch_with_kwargs, url_on),
+            executor.submit(fetch_with_kwargs, url_off)
+        ]
+        txt_on, txt_off = [f.result() for f in futures]
+
     result = pl.concat(
         [pl.DataFrame(txt_on, schema=["dates"]), pl.DataFrame(txt_off, schema=["dates"])],
         how="diagonal_relaxed"
     )
+    if result.height == 0:
+        return pl.DataFrame(schema={"dates": pl.Utf8, "dateURL": pl.Utf8, "url": pl.Utf8})
+
     result = result.with_columns(dateURL=pl.col("dates").str.slice(0, 10))
     result = result.with_columns(
         url="http://site.api.espn.com/apis/site/v2/sports/basketball/womens-college-basketball/scoreboard?dates="
         + pl.col("dateURL")
     ).unique(subset="url")
     return result
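The error-handling and parallel-request ideas above can be tried offline by injecting the fetch function. The following is a rough sketch under stated assumptions: `extract_dates`, `fetch_all`, and the `fetch_json` callable are hypothetical names, not part of sportsdataverse, and the only thing taken from the review is the `eventDate` / `dates` payload shape.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def extract_dates(payload: Any) -> list:
    """Return payload["eventDate"]["dates"], or [] if the shape is unexpected."""
    if not isinstance(payload, dict):
        return []
    event_date = payload.get("eventDate")
    if not isinstance(event_date, dict):
        return []
    dates = event_date.get("dates")
    return dates if isinstance(dates, list) else []


def fetch_all(urls: list, fetch_json: Callable[[str], Any]) -> list:
    """Fetch every URL concurrently; a failed request yields an empty list."""

    def safe_fetch(url: str) -> list:
        try:
            return extract_dates(fetch_json(url))
        except Exception:
            # Assumption: partial data is preferable to raising here.
            return []

    # max(1, ...) avoids ThreadPoolExecutor rejecting max_workers=0.
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as executor:
        # executor.map returns results in the same order as the input URLs.
        return list(executor.map(safe_fetch, urls))


# Stand-in for real HTTP calls: a lookup that raises KeyError for "bad".
responses = {
    "on": {"eventDate": {"dates": ["on-d"]}},
    "odd": {"eventDate": None},
    "off": {"eventDate": {"dates": ["off-d"]}},
}
result = fetch_all(["on", "bad", "odd", "off"], lambda url: responses[url])
```

Passing the fetcher in as an argument keeps the validation and concurrency logic testable without touching the ESPN API. Silently returning an empty list on failure is a design choice; logging the exception or re-raising may suit a library better.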
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
sportsdataverse/wbb/wbb_schedule.py (3 hunks)
🔇 Additional comments (2)
sportsdataverse/wbb/wbb_schedule.py (2)
141-146: LGTM! Parameter rename and docstring update are clear.
The parameter rename from ondays to onoffdays better reflects its expanded functionality to handle both on-days and off-days.
157-158: LGTM! Implementation changes are consistent.
The condition and helper function call have been correctly updated to use the renamed parameter.
Summary by CodeRabbit