The target page is Top Wallpapers - wallhaven.cc (https://wallhaven.cc/toplist).
Press F12 to open the browser developer tools, switch to the Network tab, and refresh the page. In the Name column, find the first toplist?page= request, click Headers, and copy the user-agent value from the request headers.
The headers are as follows:
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'
}
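Before writing the full crawler, it can help to confirm that the site accepts these headers. The snippet below is only a minimal sanity-check sketch (the status-code print is just a quick test, not part of the final script):

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'}

# Request the first toplist page and check that the server responds normally.
resp = requests.get('https://wallhaven.cc/toplist?page=1', headers=headers)
print(resp.status_code)  # 200 means the page was returned and the user-agent was accepted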
Run the crawler to download and save the wallpapers; the files are saved to a Wallhaven folder under the current working directory.
The complete code is as follows:
# Import the required libraries
import requests
from lxml import etree
import os
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'
}


def get_html_info(page):
    # Request one toplist page and parse it into an lxml element tree
    url = f'https://wallhaven.cc/toplist?page={page}'
    resp = requests.get(url, headers=headers)
    resp_html = etree.HTML(resp.text)
    return resp_html


def get_pic(resp_html):
    # Collect the detail-page link of every thumbnail on the list page
    pic_url_list = []
    lis = resp_html.xpath('//*[@id="thumbs"]/section[1]/ul/li')
    for li in lis:
        pic_url = li.xpath('./figure/a/@href')[0]
        pic_url_list.append(pic_url)
    # Visit each detail page, extract the resolution and the full-size image URL, then download
    for pic_url in pic_url_list:
        resp2 = requests.get(pic_url, headers=headers)
        r_html2 = etree.HTML(resp2.text)
        pic_size = r_html2.xpath('//*[@id="showcase-sidebar"]/div/div[1]/h3/text()')[0]
        final_url = r_html2.xpath('//*[@id="wallpaper"]/@src')[0]
        pic = requests.get(url=final_url, headers=headers).content
        if not os.path.exists('Wallhaven'):
            os.mkdir('Wallhaven')
        # Save the image, naming it by its resolution plus the tail of the image URL
        with open(os.path.join('Wallhaven', pic_size + final_url[-10:]), mode='wb') as f:
            f.write(pic)
        print(pic_size + final_url[-10:] + ' downloaded, {} wallpapers saved so far'.format(len(os.listdir('Wallhaven'))))


def main():
    page_range = range(1, 10)  # Crawl pages 1-9 (adjust the range to change how many pages are downloaded)
    for i in page_range:
        r = get_html_info(i)
        get_pic(r)
        print(f'=============== Page {i} downloaded ===============')


if __name__ == '__main__':
    main()
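If the site responds slowly or starts rejecting rapid requests, the per-image download step can be made gentler by adding a timeout and a short pause. The helper below is only a sketch of that idea; download_one is a hypothetical name, and the five-second timeout and one-second delay are arbitrary values, not something the original script uses:

import time
import requests


def download_one(final_url, save_path, headers):
    # Hypothetical helper: fetch a single wallpaper with a request timeout,
    # write it to disk, then pause briefly before the next download.
    pic = requests.get(final_url, headers=headers, timeout=5).content
    with open(save_path, mode='wb') as f:
        f.write(pic)
    time.sleep(1)  # be polite to the server between image downloads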