Adding twitch #1132

Closed
32 changes: 32 additions & 0 deletions docs/modules/twitch.md
@@ -0,0 +1,32 @@
```python
from scrape_up import twitch
```

### Scrape

First, create an instance of the `TwitchScraper` class:

```python
twitch_scraper = twitch.TwitchScraper()
```

| Methods                              | Details                                                                                     |
| ------------------------------------ | ------------------------------------------------------------------------------------------- |
| `.scrape_title_description(channel)` | Returns the stream title if the channel is live, or the channel description if it is offline. |


---

Example, using KaiCenat's Twitch channel:
```python
scraper = twitch.TwitchScraper()
title = scraper.scrape_title_description("kaicenat")
print(title)
```

Output (if Live): ⚔️100+ HOUR STREAM⚔️ELDEN RING DLC MARATHON⚔️CLICK HERE⚔️LORD DWARF⚔️ELITE GAMER⚔️FOCUS⚔️


**WARNING!**
For smaller Twitch channels with little stream time, the method may return Twitch's generic description instead of a channel-specific one:
"Twitch is the world's leading video platform and community for gamers."
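
Callers may want to tell this generic fallback apart from a real title or description. The following is a minimal sketch of one way to do that. The constant `GENERIC_DESCRIPTION` and the helper `is_generic_description` are hypothetical names and are not part of `scrape_up`. The comparison assumes the scraper returns the generic text exactly as quoted above.

```python
# Generic description quoted in the warning above.
# Assumption: the scraper returns this text verbatim for affected channels.
GENERIC_DESCRIPTION = (
    "Twitch is the world's leading video platform and community for gamers."
)


def is_generic_description(text):
    """Return True when text is missing or equals Twitch's generic description.

    Hypothetical helper, not part of scrape_up.
    """
    if text is None:
        return True
    # Ignore surrounding whitespace before comparing.
    return text.strip() == GENERIC_DESCRIPTION
```

A caller could pass the result of `scrape_title_description(channel)` to this helper and treat a `True` result as "no channel-specific text available".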
18 changes: 18 additions & 0 deletions documentation.md
Expand Up @@ -733,3 +733,21 @@ boxoffice = imdb.BoxOffice()
| Methods | Details |
| --------------- | ------------------------------------------------------------------------------- |
| `.top_movies()` | Returns the top box office movies, weekend and total gross, and weeks released. |

### Twitch

```py
from scrape_up import twitch
```

Create an instance of the `TwitchScraper` class:

```python
twitch_scraper = twitch.TwitchScraper()
```

| Method                              | Details                                                                                     |
| ----------------------------------- | ------------------------------------------------------------------------------------------- |
| `scrape_title_description(channel)` | Returns the stream title if the channel is live, or the channel description if it is offline. |

---
59 changes: 59 additions & 0 deletions src/scrape_up/twitch/twitch.py
@@ -0,0 +1,59 @@
from bs4 import BeautifulSoup

import requests

'''
WARNING!!!
For smaller Twitch channels with little stream time, the generic description may be returned instead:
"Twitch is the world's leading video platform and community for gamers."

Steps for use:

1. Create a "TwitchScraper()" class instance.
2. Call the instance's method "scrape_title_description()", passing the channel name as a string.
3. The stream title is returned if the channel is live; the channel description is returned if it is offline.

Example, using KaiCenat's Twitch channel:

scraper = TwitchScraper()
title = scraper.scrape_title_description("kaicenat")
print(title)

'''
class TwitchScraper:

    def __init__(self):
        self.status = None

    # Returns the text inside the quotes of the meta tag's content attribute.
    # Relies on the tag being rendered as '<meta content="..." ...>', so the
    # value starts at index 15.
    def get_in_quotes(self, input_string):
        if input_string is None:
            return None
        start_index = 15
        end_index = input_string.find('"', start_index)
        # str.find returns -1 when there is no closing quote
        if end_index == -1:
            return "Error: No suitable title/description"
        return input_string[start_index:end_index]

    # Returns the stream title if the channel is live, or the channel
    # description if it is offline
    def scrape_title_description(self, channel: str):
        url = f"https://www.twitch.tv/{channel}"
        try:
            response = requests.get(url)
            if response.status_code != 200:
                print(f"Failed to retrieve the page. This channel might not exist. Status code: {response.status_code}")
                return None
            # make soup to parse html
            soup = BeautifulSoup(response.content, 'html.parser')
            # find the meta tag that contains the title or description
            meta_tag_2 = soup.find('meta', {'property': "og:description"})
            # soup.find returns None when the tag is missing; without this
            # check, str(None) would be passed on as the string "None"
            if meta_tag_2 is None:
                return None
            # return just the string text itself
            return self.get_in_quotes(str(meta_tag_2))

        except requests.RequestException:
            print("Failed to scrape.")
            return None
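
The fixed-offset slicing in `get_in_quotes` depends on `content` being the first attribute of the rendered tag. As a hedged alternative, the sketch below reads the `content` attribute by name, so attribute order does not matter. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. The names `OgDescriptionParser` and `extract_og_description` are hypothetical and not part of this pull request.

```python
from html.parser import HTMLParser


class OgDescriptionParser(HTMLParser):
    """Collects the content of the first <meta property="og:description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop after the first match.
        if tag != "meta" or self.description is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:description":
            # Attribute values arrive already unescaped (e.g. &quot; becomes ").
            self.description = attr_map.get("content")


def extract_og_description(html):
    """Return the og:description text from an HTML string, or None if absent."""
    parser = OgDescriptionParser()
    parser.feed(html)
    return parser.description


# Offline sample; no request to Twitch is made here.
sample_html = '<head><meta property="og:description" content="Sample stream title"/></head>'
print(extract_og_description(sample_html))
```

With BeautifulSoup, the same value can be read from the tag returned by `soup.find(...)` using `tag.get("content")`, which would avoid converting the tag to a string at all.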