Crawling and "cracking" anime-girl images from a certain website (plus edition), Python - Linux大棚

admin (administrators group)


import os
import random
import re  # regular expressions, used to pull links out of the HTML
import time

import requests
'''Tested OK; the pages decode as GBK.'''
'''
k = int(input('Number of pages: '))
for i in range(2, k + 1):
    ip = random.randint(0, 12)
    headers = {'user-agent': 'your own UA string'}
    url = '{}.html'.format(i)
    response = requests.get(url, headers=headers)
    response.encoding = 'gbk'
    html = response.text
    url_2 = re.findall('<li><a href="(.*?)" target="_blank"><img src=".*?" alt=".*?" /><b>.*?</b></a>', html)
    print(url_2)
'''
#
# url = '.html'  # add the format() call yourself; the page numbers are sequential. Remember to add a sleep.
k = int(input("Enter the number of pages to crawl: "))
uum = []  # detail-page links collected from the listing pages

# User-agent strings found online
UA_LIST = [
    'Mozilla/5.0 (compatible; U; ABrowse 0.6; Syllable) AppleWebKit/420+ (KHTML, like Gecko)',
    'Mozilla/5.0 (compatible; U; ABrowse 0.6;  Syllable) AppleWebKit/420+ (KHTML, like Gecko)',
    'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Acoo Browser 1.98.744; .NET CLR 3.5.30729)',
    'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Acoo Browser 1.98.744; .NET CLR   3.5.30729)',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0;   Acoo Browser; GTB5; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;   SV1) ; InfoPath.1; .NET CLR 3.5.30729; .NET CLR 3.0.30618)',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SV1; Acoo Browser; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; Avant Browser)',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Acoo Browser; SLCC1;   .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Acoo Browser; GTB5; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; Maxthon; InfoPath.1; .NET CLR 3.5.30729; .NET CLR 3.0.30618)',
    'Mozilla/4.0 (compatible; Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Acoo Browser 1.98.744; .NET CLR 3.5.30729); Windows NT 5.1; Trident/4.0)',
    'Mozilla/4.0 (compatible; Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6; Acoo Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727); Windows NT 5.1; Trident/4.0; Maxthon; .NET CLR 2.0.50727; .NET CLR 1.1.4322; InfoPath.2)',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Acoo Browser; GTB6; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; InfoPath.1; .NET CLR 3.5.30729; .NET CLR 3.0.30618)',
]

# Step 1: walk listing pages 2..k and collect the links to the detail pages.
for j in range(2, k + 1):
    headers = {'user-agent': random.choice(UA_LIST)}  # rotate the user agent per request
    # .html
    u_2 = '{}.html'.format(j)  # prepend the site's listing URL here (left blank in the original)
    response = requests.get(u_2, headers=headers)
    response.encoding = 'gbk'
    html = response.text
    url_2 = re.findall('<li><a href="(.*?)" target="_blank"><img src=".*?" alt=".*?" /><b>.*?</b></a>', html)
    uum += url_2
    time.sleep(0.1)

# Output directory. The backslashes are doubled so the closing quote is not escaped.
save_dir = 'D:\\点击获取资源壁纸破解\\'
if not os.path.exists(save_dir):
    os.mkdir(save_dir)

# Step 2: open every detail page, find the image URLs, and save the images.
for i in range(0, len(uum)):
    print('Crawling {}'.format(i))
    url = '' + uum[i]  # prepend the site's base URL here (left blank in the original)
    headers = {'user-agent': random.choice(UA_LIST)}
    response = requests.get(url=url, headers=headers)
    response.encoding = 'gbk'
    time.sleep(0.1)  # leave a little buffer time between requests
    html = response.text
    # html.encode('utf-8')
    # urls = re.findall('<img lazysrc="(.*?)" lazysrc2x=".*?" height="348px" alt=".*?" title=".*?" />', html)
    urls = re.findall('<img src="(.*?)" data-pic=".*?" alt=".*?" title=".*?"></a>', html)
    print(urls)
    for img_url in urls:
        img_url = '/' + img_url  # prepend the image host here (left blank in the original)
        name = img_url.split('/')[-1]  # file name is the last path segment
        img_response = requests.get(img_url, headers=headers)
        with open(save_dir + name, mode='wb') as f:
            f.write(img_response.content)
    if i == len(uum) - 1:
        print("Crawl finished")
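The two regular expressions in the script can be tried offline before pointing them at a live site. The sketch below wraps them in two small functions and runs them on made-up markup. The function names and the sample HTML are assumptions for illustration only; the sample is shaped to match the patterns and is not taken from the real site.

```python
import re

# Patterns copied from the script; everything else here is hypothetical.
LINK_RE = re.compile(
    r'<li><a href="(.*?)" target="_blank"><img src=".*?" alt=".*?" /><b>.*?</b></a>'
)
IMAGE_RE = re.compile(r'<img src="(.*?)" data-pic=".*?" alt=".*?" title=".*?"></a>')


def extract_detail_links(listing_html):
    """Return the href of every thumbnail entry found on a listing page."""
    return LINK_RE.findall(listing_html)


def extract_image_paths(detail_html):
    """Return the src of every matching image found on a detail page."""
    return IMAGE_RE.findall(detail_html)


# Made-up markup, written only so that the patterns have something to match.
SAMPLE_LISTING = (
    '<ul>'
    '<li><a href="/a/1.html" target="_blank"><img src="/t/1.jpg" alt="one" /><b>one</b></a></li>'
    '<li><a href="/a/2.html" target="_blank"><img src="/t/2.jpg" alt="two" /><b>two</b></a></li>'
    '</ul>'
)
SAMPLE_DETAIL = (
    '<a href="#"><img src="/u/big1.jpg" data-pic="/u/big1.jpg" alt="one" title="one"></a>'
)

if __name__ == "__main__":
    print(extract_detail_links(SAMPLE_LISTING))
    print(extract_image_paths(SAMPLE_DETAIL))
```

Because the patterns spell out attribute order and spacing exactly, a small change in the site's markup makes them return an empty list, so an empty result is worth checking for before downloading anything.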

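Over the past few days I noticed that the high-resolution image URLs on some sites I had looked at before follow a traceable pattern, and I was planning to write a crawler with async requests and Selenium. Then I took another look today and realized I had missed something: the high-resolution image URL sits right there in the page, reachable with a single XPath. So I quickly wrote a crawler for a fixed category page, such as 4K landscapes or 4K anime. It only crawls pages 1 to n of that category, where n is whatever you choose, so you can grab as many as you like. That said, I urge everyone to be kind to the site: it has already added a CAPTCHA, which makes this harder. I have added countermeasures against its anti-crawling checks as well, and if those eventually stop working I will hand-write an image recognizer. So here is the source code for crawling the high-resolution images. If it helps, please leave a like before you go. Thanks!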
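The post does not give the XPath itself, so the following is only a rough sketch of the same idea: reading the image URL from the parsed tag structure instead of a regular expression. It uses the standard-library `html.parser` as a stand-in for an XPath query such as `//img/@src`. The class name, the helper function, and the sample markup (including the `_4k` file name) are all assumptions.

```python
from html.parser import HTMLParser


class ImgAttrCollector(HTMLParser):
    """Collect one attribute (default: src) from every <img> tag."""

    def __init__(self, attr="src"):
        super().__init__()
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        for name, value in attrs:
            if name == self.attr and value:
                self.values.append(value)


def collect_img_attr(html, attr="src"):
    """Return the given attribute of every <img> tag, in document order."""
    parser = ImgAttrCollector(attr)
    parser.feed(html)
    parser.close()
    return parser.values


# Made-up markup for illustration; the real page structure may differ.
SAMPLE_PAGE = (
    '<div><a href="#"><img src="/u/a.jpg" data-pic="/u/a_4k.jpg" alt="x"></a>'
    '<img src="/u/b.jpg" /></div>'
)

if __name__ == "__main__":
    print(collect_img_attr(SAMPLE_PAGE))              # every src value
    print(collect_img_attr(SAMPLE_PAGE, "data-pic"))  # every data-pic value
```

Unlike the regular expressions in the script, this does not depend on attribute order or spacing, which may make it more tolerant of small markup changes.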

The result looks like this:


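Copyright notice: this article, "Crawling and "cracking" anime-girl images from a certain website (plus edition), Python", was contributed by a site user, and the views in it are the author's alone. To repost it, contact the author and credit the source: http://www.roclinux.cn/b/1693576744a230226.html. This site only provides information storage; it does not own the content and assumes no related legal liability. If you find content on this site that appears to be plagiarized, infringing, or in violation of laws or regulations, it will be removed immediately once verified.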
