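Bilibili API in practice: scraping 小约翰可汗's Bilibili space signature
Update, October 13, 2021: the main program has been updated and an image gallery added; see the new material at the end of this post for the code.
Signature record: https://pa.ci/ljk/index.html
Avatar record: https://pa.ci/ljk/images.html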
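Bilibili used to publish its API directly at docs.bilibili.cn, but later closed it to the public because the load was too high. Fortunately, people are still collecting the API endpoints on GitHub. With the API, the user's profile comes back as JSON, and the signature text is the value of the sign field. The scraping is done by a Python script that crontab runs every 15 minutes, and the records are saved to a CSV file. A simple PHP page reads the CSV and displays it at https://pa.ci/ljk/index.html.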
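To make the JSON step concrete, here is a minimal sketch of pulling a field out of the card response without raising an exception when the structure is not what was expected. The `data` / `card` / `sign` path is the one used in this post; the helper name `extract_card_field` and the sample payload are made up for illustration and are not real API output.

```python
from typing import Any, Optional


def extract_card_field(payload: dict, field: str) -> Optional[str]:
    """Return payload['data']['card'][field], or None if any level is missing."""
    data: Any = payload.get("data") or {}
    card: Any = data.get("card") or {}
    value = card.get(field)
    return value if isinstance(value, str) else None


# Hypothetical, heavily trimmed payload (not real API output).
sample = {
    "data": {
        "card": {
            "sign": "example signature",
            "face": "https://example.com/face/abc.jpg",
        }
    }
}

print(extract_card_field(sample, "sign"))
print(extract_card_field({"data": None}, "sign"))
```

Returning None instead of raising lets a cron job skip one run quietly when the response is malformed, rather than crashing with a KeyError.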
Here is the Python script. The CSV file is small, so I skipped MySQL and simply read and rewrite the whole file.
#!/usr/bin/env python3
import csv
import time

import requests

url = 'https://api.bilibili.com/x/web-interface/card'
params = (
    ('mid', '23947287'),
)
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0'
}

# Fetch the user card and pull out the signature text.
response = requests.get(url=url, params=params, headers=headers).json()
sign_text = response['data']['card']['sign']

file_path = r'record.csv'

# Read the existing records; the newest one is always the first row.
with open(file_path, newline='', encoding='utf-8') as f:
    lines = list(csv.reader(f))

# Only write when the signature differs from the most recent record.
if not lines or str(lines[0][1]) != str(sign_text):
    time_update = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    lines.insert(0, [time_update, sign_text])
    with open(file_path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(lines)
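The compare-then-prepend step can also be isolated into a function, which makes it easy to try out without calling the API. The sketch below is one possible arrangement, not the script that runs on the site: `prepend_if_changed` is a hypothetical name, and unlike the script above it treats a missing CSV file as an empty history.

```python
import csv
import os
import tempfile
import time


def prepend_if_changed(csv_path, value):
    """Insert [timestamp, value] as the first row if value differs from
    the current first row. Returns True when a row was written."""
    rows = []
    if os.path.exists(csv_path):
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f))
    # The newest record is the first row; its second column holds the value.
    if rows and len(rows[0]) > 1 and rows[0][1] == value:
        return False
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    rows.insert(0, [stamp, value])
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return True


# Try it against a throwaway file instead of the real record.csv.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.csv")
    print(prepend_if_changed(path, "first"))   # no file yet, so a row is written
    print(prepend_if_changed(path, "first"))   # same value, nothing written
    print(prepend_if_changed(path, "second"))  # changed value, new first row
```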
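Here is the PHP page. It just reads the CSV directly. The file is small for now, so the response time is acceptable; I don't know whether high I/O will drag the server down once the file gets much larger.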
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>小约翰可汗's signature record</title>
<link rel="shortcut icon" href="favicon.ico">
</head>
<body>
<center>
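<h1>Did 小约翰可汗 skip the update today?</h1>
<p>How should I know? Go and see for yourself!</p>
<h3>小约翰可汗's Bilibili space signature record, checked every 15 minutes.</h3>
<p>小约翰可汗's signature record is at <a href="https://pa.ci/ljk">https://pa.ci/ljk</a> (ljk stands for Little John Khan)</p>
<p>For details about this site, see <a href="https://pa.ci/137.html">https://pa.ci/137.html</a></p>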
<?php
//echo 'Main blog: <a href="https://pa.ci">https://pa.ci</a>';
//echo '<br>';
echo "<table>\n\n";
// Open the CSV file
$file = fopen("record.csv", "r");
if ($file !== false) {
    // Fetch data from the CSV file row by row
    while (($data = fgetcsv($file)) !== false) {
        // One table row per CSV row
        echo "<tr>";
        foreach ($data as $i) {
            echo "<td>" . htmlspecialchars($i) . "</td>";
        }
        echo "</tr> \n";
    }
    // Close the file
    fclose($file);
}
echo "\n</table>";
?>
</center>
</body>
</html>
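On the I/O worry: because new rows are always inserted at the top of the file, a reader that only needs recent history can stop after the first few rows instead of loading everything. Here is a minimal Python sketch of that idea; `newest_rows` and the default limit of 50 are hypothetical choices, and the sample rows are invented. The same early exit could presumably be applied in the PHP loop by counting rows and breaking out.

```python
import csv
import io
from itertools import islice


def newest_rows(lines, limit=50):
    """Return at most `limit` rows from the top of a newest-first CSV.

    `lines` is any iterable of text lines, such as an open file.
    islice stops pulling from the reader once the limit is reached,
    so the rest of the file is never parsed.
    """
    return list(islice(csv.reader(lines), limit))


# Invented sample data standing in for record.csv.
sample = io.StringIO(
    "2021-10-13 12:00:00,third signature\n"
    "2021-10-12 12:00:00,second signature\n"
    "2021-10-11 12:00:00,first signature\n"
)

for stamp, sign in newest_rows(sample, limit=2):
    print(stamp, sign)
```

This only helps the display side. The scraper still rewrites the whole file whenever the signature changes, which happens far less often than page views.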
Feature update: the script now records both the signature and the avatar.
The main program, written in Python, is as follows:
#!/usr/bin/env python3
import csv
import os
import time
from pathlib import Path

import requests

url = 'https://api.bilibili.com/x/web-interface/card'
params = (
    ('mid', '23947287'),
)
headers = {
    "user-agent": ""
}

# Fetch the user card, then pull out the signature and the avatar URL.
response = requests.get(url=url, params=params, headers=headers).json()
sign_text = response['data']['card']['sign']
avatar_url = response['data']['card']['face']
# The last component of the avatar URL serves as the local file name.
file_name = Path(avatar_url).name

# Signature history: the newest record is always the first row.
file_path = r'time.csv'
with open(file_path, newline='', encoding='utf-8') as f:
    lines = list(csv.reader(f))
if not lines or str(lines[0][1]) != str(sign_text):
    time_update = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    lines.insert(0, [time_update, sign_text])
    with open(file_path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(lines)

# Avatar history: download the image only when its file name has changed.
file_path_avatar = r'avatar.csv'
with open(file_path_avatar, newline='', encoding='utf-8') as f:
    lines = list(csv.reader(f))
if not lines or str(lines[0][1]) != str(file_name):
    save_path = r'images/'
    complete_name = os.path.join(save_path, file_name)
    response = requests.get(avatar_url)
    with open(complete_name, "wb") as image_file:
        image_file.write(response.content)
    time_update = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    lines.insert(0, [time_update, file_name])
    with open(file_path_avatar, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(lines)
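The script takes the avatar's file name from the last component of its URL. That works as long as the URL is a plain path, but a query string or fragment would end up in the file name. A slightly more defensive sketch, assuming only the standard library, is shown here; `avatar_file_name` is a hypothetical helper and the URLs are placeholders, not real avatar addresses.

```python
import posixpath
from urllib.parse import urlsplit


def avatar_file_name(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    return posixpath.basename(urlsplit(url).path)


# Placeholder URLs for illustration only.
print(avatar_file_name("https://example.com/bfs/face/abc123.jpg"))
print(avatar_file_name("https://example.com/bfs/face/abc123.jpg?size=240#top"))
```

Both calls print abc123.jpg, so the comparison against avatar.csv would stay stable even if a URL parameter were added.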
The image gallery, written in PHP, shows the newest image at the top.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>小约翰可汗's signature record</title>
<link rel="shortcut icon" href="favicon.ico">
</head>
<body>
<center>
<h1>Did 小约翰可汗 skip the update today?</h1>
<p>How should I know? Go and see for yourself!</p>
<h3>小约翰可汗's Bilibili space signature and avatar record, checked every 15 minutes.</h3>
<p>Signature record: <a href="https://pa.ci/ljk/index.php">https://pa.ci/ljk/index.php</a> (ljk stands for Little John Khan)</p>
<p>Avatar record: <a href="https://pa.ci/ljk/images.php">https://pa.ci/ljk/images.php</a></p>
<p>For details about this site, see <a href="https://pa.ci/137.html">https://pa.ci/137.html</a></p>
<?php
// Same images/ directory that the Python script saves avatars into
$dirname = './images/';
$images = glob($dirname . '*.jpg');
$mostrecent = 0;
$mostrecentimg = null;
// Scan for the most recently modified image
foreach ($images as $image) {
    $imagemod = filemtime($image);
    if ($mostrecent < $imagemod) {
        $mostrecentimg = $image;
        $mostrecent = $imagemod;
    }
}
// Display the newest image first (skipped when there are no images yet)
if ($mostrecentimg !== null) {
    echo '<img src="' . $mostrecentimg . '" height="300"/><br />';
}
foreach ($images as $image) {
    // The most recent image was already output above, so skip it here
    if ($image == $mostrecentimg) continue;
    echo '<img src="' . $image . '" height="300"/><br />';
}
?>
</center>
</body>
</html>
Awesome work!
You flatter me.
Could you make the image page show dates as well?
I'll get to it when I have some free time.