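Details
jsoup 1.12.1 has been released. This version brings many usability improvements, faster parsing, better memory efficiency, and a number of bug fixes.
jsoup is an HTML parser for Java that can parse a URL or a string of HTML directly. It provides a very convenient API for extracting and manipulating data using DOM methods, CSS selectors, and jQuery-like operations.
jsoup's main features are:
Parse HTML from a URL, file, or string;
Find and extract data using DOM traversal or CSS selectors;
Manipulate HTML elements, attributes, and text.
jsoup is released under the MIT license, so it can be used safely in commercial projects.
Sample code: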
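The features listed above can be illustrated with a minimal sketch. It assumes jsoup (1.11.1 or later, for `selectFirst`) is on the classpath; the class name `JsoupBasics`, the sample markup, and the base URI `https://example.com/` are made up for illustration and do not come from the release notes.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupBasics {
    // Hypothetical sample markup used only for this sketch.
    static final String HTML =
        "<div id=\"news\"><a href=\"/a\">First</a> <a href=\"/b\">Second</a></div>";

    // Parse an HTML string and collect "text -> absolute URL" for each matching link.
    static List<String> linkSummaries(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> out = new ArrayList<>();
        for (Element link : doc.select("#news a[href]")) {
            // absUrl resolves the relative href against the base URI
            out.add(link.text() + " -> " + link.absUrl("href"));
        }
        return out;
    }

    // Change an attribute and the text of the first link, then return its HTML.
    static String rewriteFirstLink(String html) {
        Document doc = Jsoup.parse(html);
        Element first = doc.selectFirst("#news a");
        first.attr("rel", "nofollow").text("Updated");
        return first.outerHtml();
    }

    public static void main(String[] args) {
        linkSummaries(HTML, "https://example.com/").forEach(System.out::println);
        System.out.println(rewriteFirstLink(HTML));
    }
}
```

To parse a live page instead of a string, `Jsoup.connect(url).get()` returns the same kind of `Document`; it is left out here so the sketch runs without network access.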

The complete changelog follows:
Changes
Change: removed deprecated method to disable TLS cert checking in Connection.validateTLSCertificates().
Change: some internal methods have been rearranged; if you extended any of the Jsoup internals you may need to make updates.
Updated jetty-server (which is used for integration tests) to latest 9.2 series (9.2.28).
Improvements
Improvement: documents now remember their parser, so when later manipulating them, the correct HTML or XML tree builder is reused, as are the parser settings like case preservation.
Improvement: Jsoup now detects the character set of the input if specified in an XML Declaration, when using the HTML parser. Previously that only happened when the XML parser was specified.
Improvement: if the document's input character set does not support encoding, flip it to one that does.
Improvement: if a start tag is missing a > and a new tag is seen with a <, treat that as a new tag. (This differs from the HTML5 spec, which would make an attribute with a name beginning with <, but in practice this impacts too many pages.)
Improvement: performance tweaks when parsing start tags, data, tables.
Improvement: added Element.nextElementSiblings() and Element.previousElementSiblings()
Improvement: treat center tags as block tags.
Improvement: allow forms to be submitted with Content-Type=multipart/form-data without requiring a file upload; automatically set the mime boundary.
Improvement: Jsoup will now detect if an input file or URL is binary, and will refuse to attempt to parse it, with an IO Exception. This prevents runaway processing time and wasted effort creating meaningless parsed DOM trees.
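As a rough illustration of the new `Element.nextElementSiblings()` and `Element.previousElementSiblings()` methods mentioned in the list above, the sketch below collects the element siblings on either side of one list item. It assumes jsoup 1.12.1 or later on the classpath; the class name `SiblingDemo` and the sample markup are hypothetical.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class SiblingDemo {
    // Hypothetical sample markup used only for this sketch.
    static final String HTML =
        "<ul><li>one</li><li id=\"mid\">two</li><li>three</li><li>four</li></ul>";

    // Locate the element whose siblings we want to inspect.
    static Element middle(String html) {
        return Jsoup.parse(html).selectFirst("#mid");
    }

    // Text of every element sibling that follows the #mid element.
    static List<String> textsAfter(String html) {
        return middle(html).nextElementSiblings().eachText();
    }

    // Number of element siblings that precede the #mid element.
    static int countBefore(String html) {
        return middle(html).previousElementSiblings().size();
    }

    public static void main(String[] args) {
        System.out.println(textsAfter(HTML));
        System.out.println(countBefore(HTML));
    }
}
```

Both methods return an `Elements` collection, so the result can be filtered or iterated like any other selection.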
Bug Fixes
Bugfix: when using the tag case preserving parsing settings, certain HTML tree building rules were not followed for upper case tags.
Bugfix: when converting a Jsoup document to a W3C DOM, if an element is namespaced but not in a defined namespace, set it to the global namespace.
Bugfix: attributes created with the Attribute constructor with just spaces for names would incorrectly pass validation.
Bugfix: some pseudo XML Declarations were incorrectly handled when using the XML Parser, leading to an IndexOutOfBoundsException when parsing.
Bugfix: when parsing URL parameter names in an attribute that is not correctly HTML encoded, and near the end of the current buffer, those parameters may be incorrectly dropped. (Improved CharacterReader mark/reset support.)
Bugfix: boolean attribute values would be returned as null, vs an empty string, when accessed via the Attribute#getValue() method.
Bugfix: orphan Attribute objects (i.e. created outside of a parse or an Element) would throw an NPE on Attribute#setValue(val).
Bugfix: Element.shallowClone() was not making a clone of its attributes.
Bugfix: fixed an ArrayIndexOutOfBoundsException in HttpConnection.looksLikeUtf8() when testing small strings in specific character ranges.
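Two of the fixes above are easy to observe from calling code: boolean attributes read through `Attribute#getValue()`, and `Element.shallowClone()` keeping attributes. The following is a small sketch of what that looks like, assuming jsoup 1.12.1 or later on the classpath (earlier versions would behave differently, per the notes above). The class name `AttributeFixesDemo` and the sample markup are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Attribute;
import org.jsoup.nodes.Element;

public class AttributeFixesDemo {
    // Hypothetical sample markup: "hidden" is a boolean attribute with no value.
    static final String HTML = "<div id=\"box\" hidden><p>child</p></div>";

    static Element box(String html) {
        return Jsoup.parse(html).selectFirst("#box");
    }

    // Read an attribute through the Attribute object rather than Element.attr().
    static String valueViaAttribute(String html, String name) {
        for (Attribute a : box(html).attributes()) {
            if (a.getKey().equals(name)) {
                return a.getValue();
            }
        }
        return null; // attribute not present on the element
    }

    // shallowClone() copies the element itself (tag and attributes) but not its children.
    static Element shallowCopy(String html) {
        return box(html).shallowClone();
    }

    public static void main(String[] args) {
        System.out.println("hidden = \"" + valueViaAttribute(HTML, "hidden") + "\"");
        Element copy = shallowCopy(HTML);
        System.out.println(copy.attr("id") + ", children: " + copy.childNodeSize());
    }
}
```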