[Python] Scraping free proxy IPs from xici and Kuaidaili

news/2024/12/2 23:01:47/

Contents

    • Scraping high-anonymity proxy IPs from xici with Python
    • Scraping high-anonymity proxy IPs from Kuaidaili with Python

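Sometimes you need a supply of proxy IPs. Two common free sources are xici and Kuaidaili (快代理). Below is the code for scraping both.
The scraping is done with requests.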

Scraping high-anonymity proxy IPs from xici with Python

import random

import requests
from bs4 import BeautifulSoup


class get_xici_ip():
    # Return a random User-Agent string to make the scraper harder to block
    # (get_one below still sends a fixed User-Agent)
    def random_agent(self):
        user_agents = [
            "Mozilla/5.0 (iPod; U; CPU iPhone OS 4_3_2 like Mac OS X; zh-cn) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5",
            "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_2 like Mac OS X; zh-cn) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5",
            "MQQBrowser/25 (Linux; U; 2.3.3; zh-cn; HTC Desire S Build/GRI40;480*800)",
            "Mozilla/5.0 (Linux; U; Android 2.3.3; zh-cn; HTC_DesireS_S510e Build/GRI40) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
            "Mozilla/5.0 (SymbianOS/9.3; U; Series60/3.2 NokiaE75-1 /110.48.125 Profile/MIDP-2.1 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413",
            'Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
            'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
            'Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11',
        ]
        return random.choice(user_agents)

    # Scrape the proxy table on the page and return a list of "ip:port" strings
    def get_ip_list(self, url, headers):
        web_data = requests.get(url, headers=headers)
        soup = BeautifulSoup(web_data.text, 'lxml')
        ips = soup.find_all('tr')
        ip_list = []
        # Start at 1 to skip the first table row
        for i in range(1, len(ips)):
            ip_info = ips[i]
            tds = ip_info.find_all('td')
            # On xici, tds[1] holds the IP and tds[2] the port
            ip_list.append(tds[1].text + ':' + tds[2].text)
        return ip_list

    # Pick one scraped address at random and wrap it in a requests-style proxies dict
    def get_random_ip(self, ip_list):
        proxy_list = []
        for ip in ip_list:
            proxy_list.append('http://' + ip)
        proxy_ip = random.choice(proxy_list)
        proxies = {'http': proxy_ip}
        return proxies

    # Scrape a random page (1 to 10) of the high-anonymity list and return one proxy
    def get_one(self):
        url = 'http://www.xicidaili.com/nn/%s' % random.randint(1, 10)
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36'}
        ip_list = self.get_ip_list(url, headers=headers)
        print(ip_list)
        return self.get_random_ip(ip_list)

Calling the class

c = get_xici_ip()
d = c.get_one()
print(d)

Output

['27.25.196.242:9999', '117.91.232.146:9999', '111.177.178.107:9999', '111.177.188.158:9999', '111.177.179.103:9999', '111.177.181.81:9999', '183.148.133.158:9999', '110.52.235.25:9999', '111.177.187.63:9999', '111.177.172.18:9999', '111.177.178.175:9999', '116.209.54.63:9999', '183.148.140.20:9999', '116.209.52.115:9999', '117.90.2.139:9999', '111.177.177.212:9999', '119.102.189.134:9999', '119.102.188.140:9999', '119.102.188.156:9999', '121.61.2.196:9999', '49.86.180.90:9999', '219.139.141.112:9999', '111.177.189.26:9999', '111.177.191.179:9999', '122.192.174.244:9999', '111.177.167.67:9999', '125.123.139.143:9999', '125.126.210.203:9999', '125.123.140.229:9999', '171.41.84.191:9999', '111.177.185.8:9999', '110.52.235.27:9999', '123.163.117.72:9999', '111.181.35.17:9999', '113.121.146.190:9999', '111.176.29.245:9999', '116.209.58.5:9999', '111.177.175.161:9999', '113.122.169.65:9999', '121.61.2.8:808', '121.61.0.140:9999', '111.176.23.161:9999', '116.209.54.236:9999', '171.41.85.124:9999', '125.126.209.156:9999', '180.119.68.211:9999', '111.177.191.214:9999', '58.50.1.139:9999', '59.62.166.108:9999', '115.151.2.63:9999', '111.177.179.41:9999', '171.41.84.200:9999', '115.151.5.40:53128', '59.62.164.163:9999', '121.61.2.128:9999', '116.209.54.117:9999', '111.177.161.26:9999', '125.123.140.246:9999', '111.181.35.55:9999', '125.123.143.70:9999', '171.41.85.163:9999', '112.85.130.88:9999', '121.61.0.165:9999', '171.80.136.10:9999', '111.177.188.81:9999', '115.151.2.101:9999', '171.41.85.201:9999', '113.121.145.6:9999', '121.61.0.98:9999', '171.41.86.14:9999', '111.177.172.77:9999', '111.177.171.222:9999', '110.52.235.11:9999', '183.148.145.122:9999', '110.52.235.206:9999', '111.177.189.246:9999', '110.52.235.237:9999', '58.50.3.137:9999', '117.90.137.148:9999', '116.209.58.116:9999', '116.209.53.154:9999', '110.52.235.123:9999', '175.165.146.223:1133', '115.151.3.7:9999', '116.209.54.220:9999', '111.79.198.71:9999', '115.151.2.189:9999', '116.209.54.48:9999', 
'116.209.54.235:9999', '116.7.176.29:8118', '59.62.165.245:9999', '115.151.7.159:9999', '222.189.190.47:9999', '183.15.121.77:3128', '111.177.170.247:9999', '111.181.61.163:9999', '112.85.170.173:9999', '115.151.2.37:9999', '116.209.56.92:9999', '121.61.2.242:9999']
{'http': 'http://183.148.140.20:9999'}
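The scraped list above is used exactly as it comes off the page. If the table layout changes, some entries may no longer be well-formed "ip:port" strings, so a small validation pass before picking one can be useful. The following is a minimal sketch using only the standard library; `is_valid_proxy` and `filter_proxies` are hypothetical helper names, not part of the original class.

```python
import ipaddress


def is_valid_proxy(entry):
    """Return True if entry looks like 'IPv4:port' with a port in 1..65535."""
    host, _, port = entry.strip().rpartition(":")
    try:
        ipaddress.IPv4Address(host)   # raises a ValueError subclass if malformed
        port_number = int(port)       # raises ValueError if not an integer
    except ValueError:
        return False
    return 1 <= port_number <= 65535


def filter_proxies(entries):
    """Keep well-formed entries, dropping duplicates while preserving order."""
    seen = set()
    result = []
    for entry in entries:
        entry = entry.strip()
        if is_valid_proxy(entry) and entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


# Two entries copied from the output above, plus made-up bad ones for illustration
sample = [
    "27.25.196.242:9999",
    "121.61.2.8:808",
    "not-an-ip:80",
    "1.2.3.4:99999",
    "27.25.196.242:9999",
]
print(filter_proxies(sample))  # ['27.25.196.242:9999', '121.61.2.8:808']
```

This only checks the format. It says nothing about whether the proxy is still alive.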

Scraping high-anonymity proxy IPs from Kuaidaili with Python

Pick one of the scraped IP addresses at random.

import random

import requests
from bs4 import BeautifulSoup


class get_kuaidaili_ip():
    # Return a random User-Agent string to make the scraper harder to block
    # (get_one below still sends a fixed User-Agent)
    def random_agent(self):
        user_agents = [
            "Mozilla/5.0 (iPod; U; CPU iPhone OS 4_3_2 like Mac OS X; zh-cn) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5",
            "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_2 like Mac OS X; zh-cn) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5",
            "MQQBrowser/25 (Linux; U; 2.3.3; zh-cn; HTC Desire S Build/GRI40;480*800)",
            "Mozilla/5.0 (Linux; U; Android 2.3.3; zh-cn; HTC_DesireS_S510e Build/GRI40) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
            "Mozilla/5.0 (SymbianOS/9.3; U; Series60/3.2 NokiaE75-1 /110.48.125 Profile/MIDP-2.1 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413",
            'Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
            'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
            'Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11',
        ]
        return random.choice(user_agents)

    # Scrape the proxy table on the page and return a list of "ip:port" strings
    def get_ip_list(self, url, headers):
        web_data = requests.get(url, headers=headers)
        soup = BeautifulSoup(web_data.text, 'lxml')
        ips = soup.find_all('tr')
        ip_list = []
        # Start at 1 to skip the first table row
        for i in range(1, len(ips)):
            ip_info = ips[i]
            tds = ip_info.find_all('td')
            # On Kuaidaili, tds[0] holds the IP and tds[1] the port
            ip_list.append(tds[0].text + ':' + tds[1].text)
        return ip_list

    # Pick one scraped address at random and wrap it in a requests-style proxies dict
    def get_random_ip(self, ip_list):
        proxy_list = []
        for ip in ip_list:
            proxy_list.append('http://' + ip)
        proxy_ip = random.choice(proxy_list)
        proxies = {'http': proxy_ip}
        return proxies

    # Scrape a random page (1 to 10) of the high-anonymity list and return one proxy
    def get_one(self):
        url = 'https://www.kuaidaili.com/free/inha/%s/' % random.randint(1, 10)
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36'}
        ip_list = self.get_ip_list(url, headers=headers)
        print(ip_list)
        return self.get_random_ip(ip_list)
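One thing to watch for: get_random_ip calls random.choice on the scraped list, and random.choice raises IndexError on an empty sequence. If the site blocks the request or the page has no table rows, get_one will crash rather than report that nothing was found. A hedged sketch of a safer pick follows; `pick_proxy` is a hypothetical standalone helper, not part of the class above.

```python
import random


def pick_proxy(ip_list):
    """Return a requests-style proxies dict for a random entry, or None if the list is empty."""
    if not ip_list:
        # Nothing was scraped (blocked request, changed layout, empty page)
        return None
    return {'http': 'http://' + random.choice(ip_list)}


print(pick_proxy([]))                       # None
print(pick_proxy(['121.61.27.120:9999']))   # {'http': 'http://121.61.27.120:9999'}
```

The caller can then decide whether to retry another page or fall back to a direct connection when None comes back.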

Calling the class

c = get_kuaidaili_ip()
d = c.get_one()
print(d)

Output:

['121.61.27.120:9999', '163.204.242.44:9999', '115.151.5.138:9999', '121.239.127.128:9999', '1.192.245.72:9999', '121.232.194.13:9000', '125.123.136.50:9999', '60.13.42.8:9999', '111.177.169.209:9999', '183.147.30.228:9000', '110.52.235.238:9999', '180.118.128.86:9999', '49.89.85.101:9999', '163.204.245.36:9999', '115.151.7.86:9999']
{'http': 'http://111.177.169.209:9999'}
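The dictionary printed above has only an 'http' key. requests selects a proxy by the scheme of the target URL, so with this dictionary an https:// request would not go through the proxy. Free proxies also tend to expire quickly, so it can help to test one before relying on it. The sketch below shows one possible way to do both; `build_proxies` and `proxy_works` are hypothetical helpers, and the test URL http://httpbin.org/ip is an assumption that can be swapped for any page you are allowed to fetch.

```python
def build_proxies(ip_port):
    """Build a requests-style proxies dict covering both http and https targets."""
    proxy_url = 'http://' + ip_port
    return {'http': proxy_url, 'https': proxy_url}


def proxy_works(ip_port, test_url='http://httpbin.org/ip', timeout=5):
    """Return True if a GET through the proxy succeeds within the timeout.

    test_url is an assumed endpoint used only for illustration.
    """
    # Imported here so build_proxies can be used without requests installed
    import requests

    try:
        response = requests.get(test_url, proxies=build_proxies(ip_port), timeout=timeout)
        return response.status_code == 200
    except requests.RequestException:
        # Connection errors, timeouts, and proxy errors all count as "not working"
        return False


print(build_proxies('111.177.169.209:9999'))
```

Usage would look like `[p for p in ip_list if proxy_works(p)]`, keeping in mind that this makes one network request per entry.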

