1.0.0-rc.2
Pre-release
This release changes Cheerio's default parser to the Parse5 HTML parser.
Parse5 is an excellent project that rigorously conforms to the HTML standard.
It does not support XML, so Cheerio continues to use htmlparser2 when working
with XML documents.
This switch addresses many long-standing bugs in Cheerio, but some users may
experience slower behavior in performance-critical applications. In addition,
htmlparser2 is more forgiving of invalid markup, which can be useful when input
is sourced from a third party and cannot be corrected. For these reasons, the
load method also accepts a DOM structure as produced by the htmlparser2
library. See the project's "readme" file for more details on this usage
pattern.
Migrating from version 0.x
cheerio.load( html[, options ] )
This method continues to act as a "factory" function. It produces functions
that define an API that is similar to the global jQuery function provided by
the jQuery library. The generated function operates on a DOM structure based
on the provided HTML.
In releases prior to version 1.0, the provided HTML was interpreted as a
document fragment. Starting with version 1.0, strings provided to the load
method are interpreted as documents. The same input now produces a $ function
that operates on a full HTML document, including an <html> document element
with nested <head> and <body> tags. This mimics web browser behavior much more
closely, but may require alterations to existing code.
For example, the following code will produce different results between 0.x and
1.0 releases:
const $ = cheerio.load('<p>Hello, <b>world</b>!</p>');
$.root().html();
//=> In version 0.x: '<p>Hello, <b>world</b>!</p>'
//=> In version 1.0: '<html><head></head><body><p>Hello, <b>world</b>!</p></body></html>'
Users wishing to parse, manipulate, and render full documents should not need
to modify their code. Likewise, code that does not interact with the "root"
element should not be affected by this change. (In the above example, the
expression $('p') returns the same result across Cheerio versions: a Cheerio
collection whose only member is a paragraph element.)
However, users wishing to render document fragments should now explicitly
create a "wrapper" element to contain their input.
// First, create a Cheerio function "bound" to an empty document (this is
// similar to loading an empty page in a web browser)
const $ = cheerio.load('');
// Next, create a "wrapper" element for the input fragment:
const $wrapper = $('<div/>');
// Finally, supply the input markup as the content for the wrapper:
$wrapper.append('<p>Hello, <b>world</b>!</p>');
$wrapper.html();
//=> '<p>Hello, <b>world</b>!</p>'
Change log:
- Update History.md (and include migration guide) (Mike Pennisi)
- Rename useHtmlParser2 option (Mike Pennisi)
- Remove documentation for xmlMode option (Mike Pennisi)
- Document advanced usage with htmlparser2 (Mike Pennisi)
- Correct errors in Readme.md (Mike Pennisi)
- Improve release process (Mike Pennisi)
- 1.0.0-rc.1 (Mike Pennisi)
- Update unit test (Mike Pennisi)
- Normalize code style (Mike Pennisi)
- Added support for nested wrapping. (Diane Looney)
- Add nested wrapping test (Toni Helenius)
- Added $.merge following the specification at https://api.jquery.com/jquery.merge/. Added test cases for $.merge. (Diane Looney)
- Clarify project scope in README file (Mike Pennisi)
- .text() ignores script and style tags (#1018) (Haleem Assal)
- Test suite housekeeping (#1016) (DianeLooney)
- experiment with a job board (Matthew)
- Change options format (inikulin)
- Add test for #997 (inikulin)
- Update .filter function docs. (Konstantin)
- Standardise readme on ES6 variable declarations (Dekatron)
- Use documents via $.load (inikulin)
- Use parse5 as a default parser (closes #863) (inikulin)
- Fix small typo in Readme (Darren Scerri)
- Report test failures in CI (Mike Pennisi)
- serializeArray should not ignore input elements without value attributes (Ricardo Gladwell)
- Disallow variable shadowing (Mike Pennisi)
- Update hasClass method (sufisaid)
- Added MIT License fixes #902 (Prasanth Vaaheeswaran)
- chore(package): update dependencies (greenkeeper[bot])
- Use modular lodash package (#913) (Billy Janitsch)