Java Web Crawler

Java Web Crawler: Scraping Honor of Kings Hero Images

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

/**
 * @author 郭珂
 */
public class TextMain {
    public static void main(String[] args) throws IOException {
        // Open a connection to the Honor of Kings hero list page.
        // The URL is truncated in the source text; only ".shtml" remains.
        Connection connection = Jsoup.connect(".shtml");
        // Fetch the page as a Document so its HTML can be queried.
        Document document = connection.get();
        // Find the <ul> that holds the hero avatars.
        Element elementUL = document.selectFirst("[class=herolist clearfix]");
        // Collect the <li> items inside that list.
        Elements elementLis = elementUL.select("li");
        for (Element elementLi : elementLis) {
            // Each <li> contains one <a> linking to the hero's detail page.
            Element elementA = elementLi.selectFirst("a");
            // The href attribute is the relative address of the detail page.
            String hrefURL = elementA.attr("href");
            // The link text is the hero's name.
            String innerText = elementA.text();
            // Join the site prefix and the href into a full path.
            // The prefix is also truncated in the source text; only "/" remains.
            String path = "/" + hrefURL;
            // Open the detail page and fetch its Document.
            Connection newConnection = Jsoup.connect(path);
            Document newDocument = newConnection.get();
            // Find the <div> that carries the large image as its background.
            Element div = newDocument.selectFirst("[class=zk-con1 zk-con]");
            // The image address sits between the single quotes in the style attribute.
            String backgroundURL = div.attr("style");
            int left = backgroundURL.indexOf("'");
            int right = backgroundURL.lastIndexOf("'");
            String newBG = backgroundURL.substring(left + 1, right);
            URL url = new URL("https:" + newBG);
            // Stream the image from the URL to the local disk (D:\King must already exist).
            // try-with-resources closes both streams even if a read or write fails.
            try (InputStream inputStream = url.openStream();
                 FileOutputStream fileOutputStream =
                         new FileOutputStream("D:\\King\\" + innerText + ".jpg")) {
                // Small buffer for copying the image data in chunks.
                byte[] b = new byte[1024];
                int count = inputStream.read(b);
                while (count != -1) {
                    fileOutputStream.write(b, 0, count);
                    count = inputStream.read(b);
                }
            }
        }
    }
}
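Two steps in the listing above are easy to get wrong: pulling the image address out of the style attribute by searching for quote characters, and writing the downloaded bytes to disk. The following is only a sketch of one possible alternative, not the original author's method. The class name HeroImageHelper, both method names, and the example.com style value are hypothetical and exist only for illustration. It uses a regular expression to find the url(...) value and Files.copy to write the stream, and it runs offline because stand-in bytes replace the real download.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class HeroImageHelper {

    // Matches url('...'), url("...") or url(...) inside an inline style value.
    private static final Pattern CSS_URL =
            Pattern.compile("url\\(\\s*['\"]?([^'\")]+?)['\"]?\\s*\\)");

    /** Returns the URL found in the style value, or empty if there is none. */
    static Optional<String> extractBackgroundUrl(String style) {
        Matcher m = CSS_URL.matcher(style);
        if (!m.find()) {
            return Optional.empty();
        }
        String raw = m.group(1).trim();
        // Protocol-relative addresses ("//host/path") need a scheme,
        // which the original code adds by prepending "https:".
        return Optional.of(raw.startsWith("//") ? "https:" + raw : raw);
    }

    /** Copies the stream into dir/<name>.jpg, creating dir if it is missing. */
    static Path saveImage(InputStream in, Path dir, String name) throws IOException {
        Files.createDirectories(dir);
        // Replace characters that Windows does not allow in file names.
        String safeName = name.replaceAll("[\\\\/:*?\"<>|]", "_");
        Path target = dir.resolve(safeName + ".jpg");
        try (InputStream source = in) {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Made-up style value in the shape the original code expects.
        String style = "background:url('//example.com/hero/big.jpg') center 0";
        String url = extractBackgroundUrl(style)
                .orElseThrow(() -> new IllegalStateException("no url in style"));
        System.out.println(url); // prints https://example.com/hero/big.jpg

        // Stand-in bytes instead of a real download, so the sketch runs offline.
        Path dir = Files.createTempDirectory("king");
        Path saved = saveImage(new ByteArrayInputStream(new byte[] {1, 2, 3}), dir, "demo");
        System.out.println(Files.size(saved)); // prints 3
    }
}
```

In the crawler itself, saveImage could receive url.openStream() and the hero name in place of the stand-in values. Returning an Optional lets the loop skip a hero whose page lacks the expected style value, where the substring call would throw an exception.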


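Copyright notice: The article "Java Web Crawler" was contributed by a site user, and the views expressed are the author's own. To reprint it, contact the author and cite the source: http://www.roclinux.cn/b/1687962083a162493.html. This site only provides information storage, holds no ownership of the content, and assumes no related legal liability. Content found to involve plagiarism, infringement, or violations of law will be removed once verified.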
