Import public posts from the main club into the dev/local environment #1212

Merged
merged 3 commits into from
Jun 20, 2024
Changes from 2 commits
136 changes: 136 additions & 0 deletions authn/management/commands/import_posts_to_dev.py
@@ -0,0 +1,136 @@
import json
import urllib.request
from datetime import datetime, timedelta

from django.conf import settings
from django.core.management import BaseCommand

from posts.models.post import Post
from users.models.user import User
from common.markdown.markdown import markdown_text
from utils.strings import random_string


class Command(BaseCommand):
    help = "Import posts from the original vas3k.club into dev/local builds"

    def add_arguments(self, parser):
        parser.add_argument(
            "--pages",
            type=int,
            default=1,
            help="Number of pages to fetch from the feed",
        )

        parser.add_argument(
            "--skip",
            type=int,
            default=0,
            help="Number of pages to skip",
        )

        parser.add_argument(
            "--force",
            action="store_true",
            help="Replace posts if they already exist",
        )

    def handle(self, *args, **options):
        if not settings.DEBUG:
            return self.stdout.write("☢️ Only for running in DEBUG mode")

        result = {
            'post_exists': 0,
            'post_created': 0,
            'user_created': 0
        }

        for x in range(options['skip'], options['pages'] + options['skip']):
            url = "https://vas3k.club/feed.json?page={}".format(x + 1)
            self.stdout.write("📁 {}".format(url))
            req = urllib.request.Request(url)
            req.add_header('User-Agent', 'Mozilla/5.0')
Contributor

I suggest setting a distinguishable User-Agent here, in case this script starts getting abused.

Suggested change
- req.add_header('User-Agent', 'Mozilla/5.0')
+ req.add_header('User-Agent', 'poststodev')

Contributor Author

@igoose1 changed it to posts-to-dev

            response = urllib.request.urlopen(req)
            data = json.load(response)
            for item in data['items']:
                # skip private posts
                if not (item['_club']['is_public']):
Contributor

Suggested change
- if not (item['_club']['is_public']):
+ if not item['_club']['is_public']:

Contributor Author

@igoose1 a PHP habit, I'll fix it

                    continue

                author, created = create_user(item['authors'][0])
                if created:
                    result['user_created'] += 1
                    self.stdout.write(" 👤 \"{}\" user created".format(author.full_name))

                defaults = dict(
                    id=item['id'],
                    title=item['title'],
                    type=item['_club']['type'],
                    slug=random_string(10),
                    text=item['content_text'],
                    html=markdown_text(item['content_text']),
                    image=author.avatar,  # hack for "project"-type posts, to avoid hitting vas3k.club an extra time
                    created_at=item['date_published'],
                    comment_count=item['_club']['comment_count'],
                    view_count=item['_club']['view_count'],
                    upvotes=item['_club']['upvotes'],
                    is_visible=True,
                    is_visible_in_feeds=True,
                    is_commentable=True,
                    is_approved_by_moderator=True,
                    is_public=True,
                    author_id=author.id,
                    is_shadow_banned=False,
                    published_at=item['date_published'],
                    coauthors=[]
                )

                exists = False
                try:
                    post = Post.objects.get(id=item['id'])
                    exists = True
                except Post.DoesNotExist:
                    post = Post.objects.create(**defaults)

                if exists and not options['force']:
                    result['post_exists'] += 1
                    self.stdout.write(" 📌 \"{}\" already exists".format(item['title']))
                    continue

                post.__dict__.update(defaults)
                post.save()

                result['post_created'] += 1
                self.stdout.write(" 📄 \"{}\" post created".format(item['title']))

        self.stdout.write("")
        self.stdout.write("Summary:")
        self.stdout.write("📄 New posts: {}".format(result['post_created']))
        self.stdout.write("📌 Already existed: {}".format(result['post_exists']))
        self.stdout.write("👤 New users: {}".format(result['user_created']))


def create_user(author):
    split = author['url'].split('/')
    slug = split[-2]
Contributor

Suggested change
- split = author['url'].split('/')
- slug = split[-2]
+ *_, slug, _ = author['url'].split('/') # takes SLUG from "https://vas3k.club/user/SLUG/"

Contributor Author

@igoose1 again, I've only been working with Python at the framework level for a couple of days; before that I mostly wrote one-off scripts. As you say. `*_` is a cool construct, I almost fell in love


    defaults = dict(
        slug=slug,
        avatar=author['avatar'],
        email=random_string(30),
        full_name=author['name'],
        company="FAANG",
        position="Team Lead, of course",
        balance=10000,
        created_at=datetime.utcnow(),
        updated_at=datetime.utcnow(),
        membership_started_at=datetime.now(),
Contributor

Suggested change
- membership_started_at=datetime.now(),
+ membership_started_at=datetime.utcnow(),

Contributor Author

@igoose1 my bad, fixing it

        membership_expires_at=datetime.utcnow() + timedelta(days=365 * 100),
        is_email_verified=True,
        moderation_status=User.MODERATION_STATUS_APPROVED,
        roles=[],
    )

    user, created = User.objects.get_or_create(slug=slug, defaults=defaults)

    return user, created
Contributor

Suggested change
- user, created = User.objects.get_or_create(slug=slug, defaults=defaults)
- return user, created
+ return User.objects.get_or_create(slug=slug, defaults=defaults)
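
The slug extraction discussed in the review of `create_user` can be tried in isolation. The sketch below is a standalone illustration, not code from the PR: `extract_slug` is a hypothetical helper name, and it assumes author URLs always look like `https://vas3k.club/user/SLUG/` with a trailing slash, as the reviewer's comment indicates.

```python
def extract_slug(url: str) -> str:
    """Return SLUG from a URL shaped like "https://vas3k.club/user/SLUG/".

    Hypothetical helper for illustration; assumes the trailing slash is present.
    """
    # Splitting a URL that ends with "/" yields an empty string as the last
    # element, so the slug is the second-to-last element of the list.
    *_, slug, _ = url.split('/')
    return slug


if __name__ == "__main__":
    print(extract_slug("https://vas3k.club/user/example/"))
```

Starred unpacking and `split[-2]` pick the same element, so both depend equally on the trailing slash: without it, either form would return the `user` path segment instead of the slug.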

22 changes: 22 additions & 0 deletions docs/setup.md
@@ -87,3 +87,25 @@ To run telegram bot you have to:
## Docker-compose

Check out our [docker-compose.yml](https://github.com/vas3k/vas3k.club/blob/master/docker-compose.yml) to understand the infrastructure.

## Load posts from main vas3k.club to dev/local database

Sometimes you need to fill your database with some post/user data from the real project. For this case you can use the `import_posts_to_dev` command.

The command fetches https://vas3k.club/feed.json and copies `is_public=True` posts to your database:

Contributor

Suggested change
- The command fetches https://vas3k.club/feed.json and copies `is_public=True` posts to your database:
+ The command fetches https://vas3k.club/feed.json and copies public posts to your database:

Contributor Author @trin4ik, Jun 20, 2024

@igoose1 I decided to phrase it Python-style, since Pythonistas will be reading it. Are you sure it should just say public?

Contributor

@trin4ik, your way is fine too.

```bash
# fetch first page
$ python3 manage.py import_posts_to_dev

# fetch first 10 pages
$ python3 manage.py import_posts_to_dev --pages 10

# fetch 10 pages, skipping the first 5 (pages 6 to 15)
$ python3 manage.py import_posts_to_dev --pages 10 --skip 5

# fetch 10 pages, skipping the first 5, and update existing posts
$ python3 manage.py import_posts_to_dev --pages 10 --skip 5 --force

# if you use docker-compose
$ docker exec -it club_app python3 manage.py import_posts_to_dev --pages 2
```
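
To see which feed pages a given combination of `--pages` and `--skip` requests, the loop from the command can be reproduced as a pure function. This is a sketch for illustration only: `feed_page_urls` and `FEED_URL` are hypothetical names that do not exist in the PR, and the function only mirrors the command's `range(skip, pages + skip)` loop with 1-based page numbers.

```python
from typing import List

FEED_URL = "https://vas3k.club/feed.json?page={}"


def feed_page_urls(pages: int = 1, skip: int = 0) -> List[str]:
    """List the feed URLs the import loop would request (hypothetical helper)."""
    # Same arithmetic as the command: x runs over range(skip, pages + skip)
    # and the requested page number is x + 1.
    return [FEED_URL.format(x + 1) for x in range(skip, pages + skip)]


if __name__ == "__main__":
    for url in feed_page_urls(pages=3, skip=5):
        print(url)
```

With `pages=3, skip=5` the list covers pages 6, 7 and 8, which shows that `--skip` counts pages to skip rather than naming the first page to fetch.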