Crawler for ultra-HD wallpapers from wallhaven.cc


Tested on: 2021-02-16

    • 1. Reference blog
    • 2. Python code

1. Reference blog

Source (will be removed on request if it infringes):
https://blog.csdn.net/qq_41849471/article/details/89607706

2. Python code

Image save path:
save_dir = 'C:/Users/Administrator/Pictures/wallpaper/'

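Run the script directly in PyCharm and the images are downloaded to this directory; for the other parameters, see the blog linked above.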
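The save path above is tied to one Windows account. A minimal sketch of a more portable alternative follows, assuming you are happy to keep wallpapers under the current user's Pictures folder. The helper name `resolve_save_dir` and its `base` parameter are hypothetical and do not appear in the original script.

```python
import tempfile
from pathlib import Path


def resolve_save_dir(folder_name, base=None):
    """Return the download folder, creating it (and any parents) if missing.

    `base` defaults to ~/Pictures/wallpaper. Both the helper name and this
    default are illustrative assumptions, not part of the original script.
    """
    root = Path(base) if base is not None else Path.home() / "Pictures" / "wallpaper"
    target = root / folder_name
    # parents=True also creates "wallpaper" itself when it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    # Demonstrate with a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(resolve_save_dir("wallhaven_wallpaper", base=tmp))
```

Calling `resolve_save_dir("wallhaven_wallpaper")` without `base` would create the folder under the home directory of whoever runs the script.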

```python
import os
import sys
import time

import requests
from lxml import etree


def get_pictures(url, folder_name, dest_count, c):
    """Download the full-size image behind one wallpaper detail page.

    Returns 1 on success and 0 on failure so the caller can keep a running total.
    """
    html = requests.get(url)
    res = etree.HTML(html.content)
    matches = res.xpath('//img[@id="wallpaper"]/@src')
    if not matches:
        # The detail page had no full-size image (for example, an error page).
        print("No wallpaper image found at " + url)
        return 0
    img_url = matches[0]
    img_name = img_url.split('/')[-1]
    try:
        img_html = requests.get(img_url)
        save_dir = 'C:/Users/Administrator/Pictures/wallpaper/' + folder_name
        # makedirs also creates missing parent directories (os.mkdir does not).
        os.makedirs(save_dir, exist_ok=True)
        with open(save_dir + '/' + img_name, 'wb') as f:
            f.write(img_html.content)
        print("Downloading image {} =====> ".format(c + 1) + img_name + ' -----success!')
        return 1
    except Exception:
        print("Downloading image {} =====> ".format(c + 1) + img_name + ' -----failure!')
        return 0


def get_next_url(url, folder_name, stars_num, dest_count, downloaded):
    """Process one search-result page and return the updated download count."""
    html = requests.get(url)
    res = etree.HTML(html.content)
    next_urls = res.xpath("//a[@class='preview']/@href")
    stars = res.xpath("//div[@class='thumb-info']/a[1]/text()")
    # Keep only the wallpapers whose like count reaches the threshold.
    res_url = [link for link, star in zip(next_urls, stars) if int(star) >= int(stars_num)]
    total = downloaded
    for link in res_url:
        total += get_pictures(link, folder_name, dest_count, total)
        if total >= dest_count:
            sys.exit("Target reached!")
    if len(next_urls) == 0:
        print("No more images!")
        time.sleep(3)
        sys.exit("0")
    return total


if __name__ == "__main__":
    print("Choose a mode: 1. Filter by range  2. Keyword search  3. Both")
    # style = input()
    style = '1'
    categories = ['0', '0', '0']
    purity = ['0', '0', '0']
    url = ""
    keyword = ""
    sort_list = [
        # 'https://wallhaven.cc/search?categories=101&purity=110&atleast=2560x1080&topRange=1M&sorting=toplist&order=desc&page={}',
        'https://wallhaven.cc/search?categories={}&purity={}&atleast=2560x1080&ratios=16x9&topRange=1M&sorting=toplist&order=desc&page={}'
        # , 'https://wallhaven.cc/search?q={}&categories={}&purity={}&sorting=date_added&order=desc&page={}'
        # , 'https://alpha.wallhaven.cc/search?q={}&categories={}&purity={}&resolutions=1920x1080&topRange=1M&sorting=toplist&order=desc&page={}',
        # 'https://alpha.wallhaven.cc/search?q={}&categories={}&purity={}&resolutions=1920x1080&sorting=random&order=desc&page={}',
        # 'https://alpha.wallhaven.cc/search?q={}&search_image=&page={}'
    ]
    if style == '1' or style == '3':
        if style == '3':
            print("Enter a search keyword (English recommended):")
            # The active template has no q= parameter, so the keyword is not sent.
            keyword = input().replace(' ', '+')
        print("Choose image categories: 1. General  2. Anime  3. People "
              "(multiple allowed, default all, separate choices with spaces)")
        # selection_str = input()
        selection_str = ''
        selection = selection_str.split()
        for i in selection:
            try:
                categories[int(i) - 1] = '1'
            except (ValueError, IndexError):
                categories = ['1', '1', '1']
        if not selection:
            # Apply the advertised default; the original left this at '000'.
            categories = ['1', '1', '1']
        print("Choose purity options: 1. SFW  2. Sketchy "
              "(multiple allowed, separate choices with spaces, SFW recommended)")
        # selection_str = input()
        selection_str = ''
        selection = selection_str.split()
        for i in selection:
            try:
                purity[int(i) - 1] = '1'
            except (ValueError, IndexError):
                purity = ['1', '0', '0']
        purity[2] = '0'  # never request NSFW
        if selection_str == "":
            # Empty input falls back to SFW + Sketchy.
            purity = ['1', '1', '0']
        print("Choose a sort order: 1. Latest  2. Toplist  3. Random (single choice, default Random)")
        # selection_str = input()
        selection_str = '2'
        count = 1
        while selection_str not in ('1', '2', '3', '') and count <= 3:
            print("Please enter a valid choice (the default is used after repeated errors)")
            selection_str = input()
            count += 1
        # Only one template is active in sort_list, so every sorting choice maps to it.
        # Original mapping: sort_list[int(selection_str) - 1], with Random (index 2) as the default.
        url = sort_list[0]
    elif style == '2':
        # Keyword-only search used sort_list[3], which is commented out above;
        # re-enable a "q={}" template before switching to this mode.
        sys.exit("Keyword search is disabled: no keyword template is active in sort_list.")
    print("Enter the folder name:")
    # folder_name = input()
    folder_name = 'wallhaven_wallpaper'
    while folder_name == "":
        folder_name = input()
    print("Enter the minimum number of likes:")
    # stars_num = input()
    stars_num = 20
    print("Enter the target number of images:")
    # dest_count = input()
    dest_count = 200
    downloaded = 0  # images downloaded so far; used to stop at dest_count
    for i in range(1, 999):
        print('get the page: {}'.format(i))
        page_url = url.format("".join(categories), "".join(purity), i)
        print("getting from " + page_url)
        downloaded = get_next_url(page_url, folder_name, stars_num, int(dest_count), downloaded)
```
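The script fills the search URL by formatting bit strings such as `101` into a template from `sort_list`. As a rough sketch of the same idea, the query string can also be assembled with the standard library's `urlencode`, which makes each parameter explicit. The parameter names and values (`categories`, `purity`, `atleast`, `ratios`, `topRange`, `sorting`, `order`, `page`, `q`) are copied from the templates above; the helpers `flags_to_bits` and `build_search_url` are hypothetical names introduced only for this illustration, and the sketch does not contact the site.

```python
from urllib.parse import urlencode

BASE_URL = "https://wallhaven.cc/search"


def flags_to_bits(selected, size=3):
    """Turn 1-based choices such as [1, 3] into a bit string such as '101'."""
    bits = ["0"] * size
    for choice in selected:
        if not 1 <= choice <= size:
            raise ValueError("choice out of range: {}".format(choice))
        bits[choice - 1] = "1"
    return "".join(bits)


def build_search_url(categories, purity, page, keyword=None):
    """Assemble a search URL from the parameters used in sort_list."""
    params = {}
    if keyword:
        # urlencode turns spaces into '+', like the script's replace(' ', '+').
        params["q"] = keyword
    params.update({
        "categories": flags_to_bits(categories),
        "purity": flags_to_bits(purity),
        "atleast": "2560x1080",
        "ratios": "16x9",
        "topRange": "1M",
        "sorting": "toplist",
        "order": "desc",
        "page": page,
    })
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # General + People, SFW + Sketchy, first page of results.
    print(build_search_url(categories=[1, 3], purity=[1, 2], page=1))
```

Validating the choices in `flags_to_bits` rejects out-of-range input up front, whereas the script silently resets to a default when a choice cannot be applied.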
