Commit

Upload code
cuifengcn committed Jan 2, 2023
1 parent 20d3e75 commit 1a0a750
Showing 129 changed files with 15,143 additions and 1 deletion.
8 changes: 8 additions & 0 deletions .idea/.gitignore

4 changes: 4 additions & 0 deletions .idea/encodings.xml

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

4 changes: 4 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

8 changes: 8 additions & 0 deletions .idea/wechat-video-generate.iml

22 changes: 21 additions & 1 deletion README.md
@@ -1,2 +1,22 @@
# wechat-video-generate
A tool for generating WeChat conversation videos with one click
___
## One-click WeChat conversation video generator
### Usage
1. Download this project's code
2. Install the dependencies listed in requirements.txt
3. Run main.py

### Tutorial
1. Enter a keyword
![](docs/1.png)
2. Enter the number of images to download and click Download
![](docs/2.png)
3. OCR recognizes the text in the downloaded images and displays it (see the sketch after this list)
![](docs/3.png)
4. Manually correct anything the OCR misrecognized
![](docs/4.png)
5. Click Generate and wait for the video to finish
![](docs/5.png)
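
The OCR engine behind step 3 is not shown in this README, so the following is only a rough, hypothetical sketch of that step using the `easyocr` package as a stand-in; the package choice, language codes, and image path are assumptions, not this project's actual code:

```python
import easyocr

# Stand-in OCR pass over one downloaded chat screenshot; the real project may
# use a different engine. "ch_sim" + "en" covers simplified Chinese and English.
reader = easyocr.Reader(["ch_sim", "en"])
for _bbox, text, confidence in reader.readtext("downloads/1.jpeg"):
    # Low-confidence lines are the ones to fix by hand in step 4
    print(f"{confidence:.2f}  {text}")
```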

### Reference projects
[adsda](http://baidu.com)
Binary file added docs/1.png
Binary file added docs/2.png
Binary file added docs/3.png
Binary file added docs/4.png
Binary file added docs/5.png
Binary file added faces/10001.jpg
Binary file added faces/10002.jpg
Binary file added faces/10003.jpg
Binary file added faces/10004.jpg
Binary file added faces/10005.jpg
Binary file added faces/10006.jpg
Binary file added faces/10007.jpg
Binary file added faces/10008.jpg
Binary file added faces/10009.jpg
Binary file added faces/10010.jpg
Binary file added faces/10011.jpg
Binary file added faces/10012.jpg
Binary file added faces/10013.jpg
Binary file added faces/10014.jpg
Binary file added faces/10015.jpg
Binary file added faces/10016.jpg
Binary file added faces/10017.jpg
Binary file added faces/10018.jpg
Binary file added faces/10019.jpg
Binary file added faces/10020.jpg
Binary file added faces/10021.jpg
Binary file added faces/10022.jpg
Binary file added faces/10023.jpg
Binary file added faces/10024.jpg
Binary file added faces/10025.jpg
Binary file added faces/10026.jpg
Binary file added faces/10027.jpg
Binary file added faces/10028.jpg
Binary file added faces/10029.jpg
Binary file added faces/10030.jpg
Binary file added faces/10031.jpg
Binary file added faces/10032.jpg
Binary file added faces/10033.jpg
Binary file added faces/10034.jpg
Binary file added faces/10035.jpg
Binary file added faces/10036.jpg
Binary file added faces/10037.jpg
Binary file added faces/10038.jpg
Binary file added faces/10039.jpg
Binary file added faces/10040.jpg
Binary file added faces/10041.jpg
Binary file added faces/10042.jpg
Binary file added faces/10043.jpg
Binary file added faces/10044.jpg
Binary file added faces/10045.jpg
Binary file added faces/10046.jpg
Binary file added faces/10047.jpg
Binary file added faces/10048.jpg
Binary file added faces/10049.jpg
Binary file added faces/10050.jpg
Binary file added faces/10051.jpg
Binary file added faces/10052.jpg
Binary file added faces/10053.jpg
Binary file added faces/10054.jpg
Binary file added faces/10055.jpg
Binary file added faces/10056.jpg
Binary file added faces/10057.jpg
Binary file added faces/10058.jpg
Binary file added faces/10059.jpg
Binary file added faces/10060.jpg
5 changes: 5 additions & 0 deletions main.py
@@ -0,0 +1,5 @@
if __name__ == "__main__":
    import flet as ft
    from ui.ui import main

    ft.app(target=main)
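
For context, `ft.app(target=main)` opens a flet window and passes a `flet.Page` to the `main` callable exported by `ui/ui.py`, which is not reproduced in this excerpt. A minimal sketch of that entry-point pattern with placeholder widgets, not the project's real UI:

import flet as ft


def main(page: ft.Page):
    # Placeholder layout only; the project's actual controls live in ui/ui.py.
    page.title = "wechat-video-generate"
    page.add(ft.TextField(label="keyword"), ft.Text("OCR results will appear here"))


ft.app(target=main)  # same launch call as main.py above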
19 changes: 19 additions & 0 deletions methods/audio_split.py
@@ -0,0 +1,19 @@
from moviepy.editor import AudioFileClip
import logging
import glob


class AudioSplitClient:
    def start(self, path, logger=logging.getLogger()):
        if not path:
            logger.error("Audio split: path is empty!")
            return
        files = glob.glob(path + "/*.mp4")
        if not files:
            logger.error("Audio split: no mp4 files found")
            return
        for _file in files:
            my_audio_clip = AudioFileClip(_file)
            # Write a .wav next to the source file, keeping the original base name
            my_audio_clip.write_audiofile(".".join(_file.split(".")[:-1]) + ".wav")
            logger.info(f"{_file}: audio extracted")
        logger.info("Audio split finished")
225 changes: 225 additions & 0 deletions methods/baidu_pics.py
@@ -0,0 +1,225 @@
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import argparse
import json
import logging
import os
import re
import socket
import sys

import time
import urllib
import urllib.error
import urllib.parse
import urllib.request
from threading import Thread
from typing import Optional
from utils import WORK_PATH

timeout = 5
socket.setdefaulttimeout(timeout)

"""
Baidu image search crawler.
"""


class BaiduPicsClient:
    # delay between requests, in seconds
    __time_sleep = 0.1
    __amount = 0
    __start_amount = 0
    __counter = 0
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0",
        "Cookie": "",
    }
    __per_page = 30

    # Fetches image URLs from Baidu image search and downloads the files.
    # time_sleep: interval between downloads, in seconds.
    def __init__(self):
        self.time_sleep = 0.1
        self.logger = logging.getLogger()
        self.thread = None  # reserved for a background download thread

    # Extract the file extension from a URL, defaulting to .jpeg
    @staticmethod
    def get_suffix(name):
        m = re.search(r"\.[^\.]*$", name)
        if m and len(m.group(0)) <= 5:
            return m.group(0)
        return ".jpeg"

    @staticmethod
    def handle_baidu_cookie(original_cookie, cookies):
        """
        :param string original_cookie:
        :param list cookies:
        :return string:
        """
        if not cookies:
            return original_cookie
        result = original_cookie
        for cookie in cookies:
            result += cookie.split(";")[0] + ";"
        return result.rstrip(";")

    # Save the images referenced by one page of search results
    def save_image(self, rsp_data, word):
        path = WORK_PATH.joinpath(word)
        path.mkdir(exist_ok=True, parents=True)
        # Continue numbering from whatever is already in the folder
        self.__counter = len(list(path.iterdir())) + 1
        for image_info in rsp_data["data"]:
            try:
                if "replaceUrl" not in image_info or len(image_info["replaceUrl"]) < 1:
                    continue
                obj_url = image_info["replaceUrl"][0]["ObjUrl"]
                thumb_url = image_info["thumbURL"]
                url = (
                    "https://image.baidu.com/search/down?tn=download&ipn=dwnl"
                    "&word=download&ie=utf8&fr=result&url=%s&thumburl=%s"
                    % (urllib.parse.quote(obj_url), urllib.parse.quote(thumb_url))
                )
                time.sleep(self.time_sleep)
                suffix = self.get_suffix(obj_url)
                # Send a browser User-Agent to reduce 403 responses
                opener = urllib.request.build_opener()
                opener.addheaders = [
                    (
                        "User-agent",
                        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                        "Chrome/83.0.4103.116 Safari/537.36",
                    ),
                ]
                urllib.request.install_opener(opener)
                # Download and save the image
                filepath = path.joinpath(str(self.__counter) + str(suffix))
                urllib.request.urlretrieve(url, filepath)
                if os.path.getsize(filepath) < 5:
                    self.logger.info("Downloaded an empty file, skipping!")
                    os.unlink(filepath)
                    continue
            except urllib.error.HTTPError as urllib_err:
                self.logger.error(urllib_err)
                continue
            except Exception as err:
                time.sleep(1)
                self.logger.error(str(err))
                self.logger.error("Unknown error, giving up on this image")
                continue
            else:
                self.logger.info("Saved one more image, " + str(self.__counter) + " so far")
                self.__counter += 1
        return

    # Fetch search-result pages and download the images they reference
    def get_images(self, word):
        search = urllib.parse.quote(word)
        # pn: offset of the first result on the current page
        pn = self.__start_amount
        while pn < self.__amount:
            url = (
                "https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj"
                "&ct=201326592&is=&fp=result&queryWord=%s&cl=2&lm=-1&ie=utf-8&oe=utf-8"
                "&adpicid=&st=-1&z=&ic=&hd=&latest=&copyright=&word=%s&s=&se=&tab=&width=&height=&face=0"
                "&istype=2&qc=&nc=1&fr=&expermode=&force=&pn=%s&rn=%d&gsm=1e&1594447993172="
                % (search, search, str(pn), self.__per_page)
            )
            # Send custom headers to reduce 403 responses
            try:
                time.sleep(self.time_sleep)
                req = urllib.request.Request(url=url, headers=self.headers)
                page = urllib.request.urlopen(req)
                self.headers["Cookie"] = self.handle_baidu_cookie(
                    self.headers["Cookie"], page.info().get_all("Set-Cookie")
                )
                rsp = page.read()
                page.close()
            except UnicodeDecodeError as e:
                self.logger.error(e)
                self.logger.error(f"-----UnicodeDecodeError url: {url}")
            except urllib.error.URLError as e:
                self.logger.error(e)
                self.logger.error(f"-----URLError url: {url}")
            except socket.timeout as e:
                self.logger.error(e)
                self.logger.error(f"-----socket timeout: {url}")
            else:
                # Parse the JSON response
                rsp_data = json.loads(rsp, strict=False)
                if "data" not in rsp_data:
                    self.logger.error("Anti-crawler response received, retrying automatically!")
                else:
                    self.save_image(rsp_data, word)
                    # Move on to the next page
                    self.logger.info("Downloading the next page")
                    pn += self.__per_page
        self.logger.info("Download task finished")
        return

    def start(self, word, total_page=1, start_page=1, per_page=30, delay=0.1, logger=None):
        if logger is not None:
            self.logger = logger
        self.__start(word, total_page, start_page, per_page, delay)

    def stop(self):
        if self.thread is not None:
            if self.thread.is_alive():
                self.thread.join(0)
            self.thread = None

    def __start(self, word, total_page=1, start_page=1, per_page=30, delay=0.1):
        """
        Crawler entry point.
        :param delay: delay between requests, in seconds
        :param word: keyword to crawl
        :param total_page: number of result pages to fetch; total images = total_page * per_page
        :param start_page: first page to fetch (1-based)
        :param per_page: results per page
        :return:
        """
        self.__per_page = int(per_page)
        self.__start_amount = (int(start_page) - 1) * self.__per_page
        self.__amount = int(total_page) * self.__per_page + self.__start_amount
        self.time_sleep = float(delay)
        self.get_images(word)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        parser = argparse.ArgumentParser()
        parser.add_argument("-w", "--word", type=str, help="keyword to crawl", required=True)
        parser.add_argument(
            "-tp", "--total_page", type=int, help="total number of pages to crawl", required=True
        )
        parser.add_argument("-sp", "--start_page", type=int, help="page to start from", required=True)
        parser.add_argument(
            "-pp",
            "--per_page",
            type=int,
            help="results per page",
            choices=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
            default=30,
            nargs="?",
        )
        parser.add_argument("-d", "--delay", type=float, help="delay between requests (seconds)", default=0.05)
        args = parser.parse_args()

        crawler = BaiduPicsClient()
        crawler.start(
            args.word, args.total_page, args.start_page, args.per_page, args.delay
        )
    else:
        # Without command-line arguments, run with the defaults below
        crawler = BaiduPicsClient()
        crawler.start(
            "美女", 10, 2, 30, 0.05
        )  # keyword "美女", 10 pages of 30 starting at page 2 (300 images), 0.05 s delay
        # crawler.start("二次元 美女", 10, 1)  # keyword "二次元 美女"
        # crawler.start("帅哥", 5)  # keyword "帅哥"
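
A minimal usage sketch for BaiduPicsClient, assuming it is run from the project root so that utils.WORK_PATH resolves; the keyword and page counts here are illustrative:

import logging

from methods.baidu_pics import BaiduPicsClient

logging.basicConfig(level=logging.INFO)
crawler = BaiduPicsClient()
# Fetch 2 pages of 30 results (60 images) for the keyword, starting at page 1,
# with a 0.1 s pause between requests; files are saved under WORK_PATH/<keyword>/
crawler.start("风景", total_page=2, start_page=1, per_page=30, delay=0.1)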