2 Fetch the valid proxy IPs from the database and format them
First, we define a class so that it can be imported later to fetch a usable proxy IP.
import requests

# conn / cus are the MySQL connection and cursor created earlier in the script,
# when the crawled proxies were written into the ip_info table.


class GetIp(object):

    def judge_ip(self, ip, port, proxy_type):
        """
        Check whether a proxy is still usable.
        :param ip:
        :param port:
        :param proxy_type: "http" or "https"
        :return: bool
        """
        http_url = "http://www.baidu.com"
        proxy_ip = "{0}://{1}:{2}".format(proxy_type, ip, port)
        try:
            proxy_dict = {
                proxy_type: proxy_ip
            }
            response = requests.get(http_url, proxies=proxy_dict)
        except Exception as e:
            print("invalid ip and port", ip, port, proxy_type, e)
            self.del_ip(ip)
            return False
        else:
            code = response.status_code
            if code == 200:
                return True
            else:
                self.del_ip(ip)
                return False

    def del_ip(self, ip):
        """Delete an invalid IP from the table."""
        sql = """delete from ip_info WHERE ip='{}'""".format(ip)
        cus.execute(sql)
        conn.commit()

    def get_random_ip(self):
        """
        Pick a random row, verify it, and return a formatted proxy string.
        :return: "proxy_type://ip:port"
        """
        sql = """select ip,port,proxy_type from ip_info order by RAND() limit 1;"""
        cus.execute(sql)
        for ip_info in cus.fetchall():
            ip = ip_info[0]
            port = ip_info[1]
            proxy_type = ip_info[2]
            if self.judge_ip(ip, port, proxy_type):
                return "{0}://{1}:{2}".format(proxy_type, ip, port)
            else:
                # the invalid row has already been deleted, so try another one
                return self.get_random_ip()


if __name__ == '__main__':
    test_net = GetIp()
    ip = test_net.get_random_ip()
    print(ip)
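The class above relies on a MySQL connection conn and cursor cus that are set up in the crawling part of the script, together with the ip_info table it queries. A minimal sketch of that setup, assuming pymysql and placeholder credentials, might look like this:

import pymysql

# Hypothetical connection parameters -- replace them with your own.
conn = pymysql.connect(host="127.0.0.1", user="root",
                       password="root", db="proxy", charset="utf8")
cus = conn.cursor()

# The ip_info table used by GetIp; column names follow the SQL in the class.
create_sql = """
CREATE TABLE IF NOT EXISTS ip_info (
    ip VARCHAR(20) NOT NULL,
    port VARCHAR(10) NOT NULL,
    proxy_type VARCHAR(10) NOT NULL,
    PRIMARY KEY (ip)
);
"""
cus.execute(create_sql)
conn.commit()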
Sample output (screenshot: result of calling the function)
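Because the class is meant to be imported later, one natural place to use it is a Scrapy downloader middleware that attaches a fresh, pre-validated proxy to every request. A rough sketch follows; the module path "tools.crawl_ip" is hypothetical, so adjust the import to wherever the class actually lives:

# Hypothetical module path -- change it to match your project layout.
from tools.crawl_ip import GetIp


class RandomProxyMiddleware(object):
    """Attach a random, pre-validated proxy to every outgoing request."""

    def process_request(self, request, spider):
        get_ip = GetIp()
        # Scrapy reads the proxy for a request from request.meta["proxy"]
        request.meta["proxy"] = get_ip.get_random_ip()

The middleware would then be enabled through DOWNLOADER_MIDDLEWARES in the project's settings.py.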
To sum up, this script has two points of interest: first, how to crawl the proxy IPs, where I used the selectors from the Scrapy framework to extract the data; second, how to tell whether a proxy IP is usable, by sending a test request to Baidu through it.
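For the first point, Scrapy's standalone Selector can be used even outside a full Scrapy project. The sketch below assumes a generic proxy-list page; the URL and the XPath expressions are placeholders and depend on which site you actually crawl:

import requests
from scrapy import Selector

# Placeholder URL -- substitute the proxy-list page you actually crawl.
resp = requests.get("https://example.com/free-proxy-list")
sel = Selector(text=resp.text)

# Placeholder XPath: many free-proxy pages list one proxy per table row.
for row in sel.xpath("//table//tr")[1:]:
    cells = row.xpath("./td/text()").getall()
    if len(cells) >= 3:
        ip, port, proxy_type = cells[0], cells[1], cells[2].lower()
        print(ip, port, proxy_type)  # or insert the row into ip_info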