mirror of
https://github.com/WordPress/WordPress.git
synced 2026-06-19 07:37:07 +00:00
HTML API: Ensure that code points always encode to UTF-8
This was brought up during fuzz testing of the HTML API. After `mb_chr()` was polyfilled and the HTML decoder began relying on it, sites with a non-UTF-8 charset selected could produce corrupted text, or text encoded as non-UTF-8 bytes, when creating text from code points while decoding numeric character references. While these sites have broader issues with non-UTF-8 support, this change ensures that code point encoding remains deterministic.

Developed in: https://github.com/WordPress/wordpress-develop/pull/12155
Discussed in: https://core.trac.wordpress.org/ticket/65372

Follow-up to [62424].

Props dmsnell, jonsurrell.
See #65372.

Built from https://develop.svn.wordpress.org/trunk@62487

git-svn-id: http://core.svn.wordpress.org/trunk@61768 1a063a9b-81f0-0310-95a4-ce76da25c4cd
@@ -424,7 +424,7 @@ class WP_HTML_Decoder {
 	 * @return string Converted code point, or `�` if invalid.
 	 */
 	public static function code_point_to_utf8_bytes( $code_point ): string {
-		$string = mb_chr( $code_point );
+		$string = mb_chr( $code_point, 'UTF-8' );
 
 		return false !== $string ? $string : '�';
 	}
@@ -16,7 +16,7 @@
  *
  * @global string $wp_version
  */
-$wp_version = '7.1-alpha-62486';
+$wp_version = '7.1-alpha-62487';
 
 /**
  * Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.
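
The effect of the one-line change to `code_point_to_utf8_bytes()` can be illustrated with a minimal standalone sketch. It assumes the `mbstring` extension is loaded and that the internal encoding has been switched to a non-UTF-8 charset; `ISO-8859-1` and the code point U+00E9 are chosen here only for illustration and do not come from the commit. When the encoding argument is omitted, `mb_chr()` falls back to the value of `mb_internal_encoding()`, so the same code point can yield different bytes.

```php
<?php
// Illustrative sketch only: simulate an environment whose internal
// encoding is not UTF-8 (the charset here is an assumption).
$previous_encoding = mb_internal_encoding();
mb_internal_encoding( 'ISO-8859-1' );

$code_point = 0xE9; // U+00E9 LATIN SMALL LETTER E WITH ACUTE

// No encoding argument: the result follows mb_internal_encoding().
$implicit = mb_chr( $code_point );

// Explicit encoding: the result is UTF-8 regardless of the environment.
$explicit = mb_chr( $code_point, 'UTF-8' );

// Restore the original setting so the sketch has no side effects.
mb_internal_encoding( $previous_encoding );

echo bin2hex( $implicit ), "\n"; // single byte in ISO-8859-1
echo bin2hex( $explicit ), "\n"; // two-byte UTF-8 sequence
```

Passing `'UTF-8'` explicitly makes the output independent of any global encoding setting, which is the determinism the commit message describes.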