#104 rhino xhr does not use contentType;charset or other content sniffing

#104 new

Reported by nickg | March 15th, 2010 @ 02:33 AM

In env-js/src/platform/rhino @ line 251

    xhr.responseText = java.nio.charset.Charset.forName("UTF-8").
        decode(java.nio.ByteBuffer.wrap(baos.toByteArray())).toString()+"";

This will either explode with an exception or produce junk if the document is in Latin-1 (or an equivalent encoding).
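A test for this could start from something like the sketch below. It is only an illustration and makes some assumptions: it runs under Node.js rather than Rhino, and Node's `Buffer` stands in for the Java byte array, so it shows the mis-decoding itself rather than the env-js code path.

```javascript
// Assumption: Node.js, with Buffer standing in for baos.toByteArray().
// "café" encoded as Latin-1: the final byte 0xE9 is not valid UTF-8 on its own.
var latin1Bytes = Buffer.from([0x63, 0x61, 0x66, 0xE9]);

// What a hard-coded UTF-8 decode produces for these bytes.
var asUtf8 = latin1Bytes.toString("utf8");

// What the document author intended.
var asLatin1 = latin1Bytes.toString("latin1");

// Under UTF-8 decoding the stray byte turns into U+FFFD (the replacement
// character), so the two strings differ.
var garbled = asUtf8.indexOf("\uFFFD") !== -1;
```

A real regression test would fetch a Latin-1 document through the xhr object and compare `responseText` against the expected string.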

The first step is to write a test to prove this. Then, ideally, follow a basic charset-sniffing algorithm:

use the charset parameter of the Content-Type header
if the document is HTML, look for a meta tag
look at byte-order marks
default to UTF-8

There are more complicated ways to do this, but this should cover the basics.
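The steps listed above could be sketched roughly as follows. `sniffCharset` is a hypothetical helper, not something that exists in env-js, and it assumes the body is available as an array of unsigned byte values (a Java `byte[]` in Rhino is signed, so each value would need `& 0xFF` first). The regular expressions are deliberately simple and only scan the first 1024 bytes for a meta tag.

```javascript
// Hypothetical helper (not part of env-js): picks a charset name for a response.
//   contentType: value of the Content-Type header, may be null or empty
//   bytes: array-like of unsigned byte values (0-255) from the response body
function sniffCharset(contentType, bytes) {
    var header = contentType || "";
    var match;

    // 1. charset parameter of the Content-Type header
    match = /;\s*charset\s*=\s*["']?([^\s;"']+)/i.exec(header);
    if (match) {
        return match[1].toLowerCase();
    }

    // 2. for HTML, scan the start of the body (read as ASCII) for a meta charset
    if (/html/i.test(header)) {
        var head = "";
        for (var i = 0; i < bytes.length && i < 1024; i++) {
            head += String.fromCharCode(bytes[i]);
        }
        match = /<meta[^>]*charset\s*=\s*["']?([^\s"'\/>;]+)/i.exec(head);
        if (match) {
            return match[1].toLowerCase();
        }
    }

    // 3. byte-order marks
    if (bytes.length >= 3 && bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
        return "utf-8";
    }
    if (bytes.length >= 2 && bytes[0] === 0xFE && bytes[1] === 0xFF) {
        return "utf-16be";
    }
    if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE) {
        return "utf-16le";
    }

    // 4. nothing found: default to UTF-8
    return "utf-8";
}
```

The returned name could then be passed to `java.nio.charset.Charset.forName(...)` in place of the hard-coded "UTF-8", with a fallback if the name is not a charset the JVM supports.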

On a different note, a bit above (line 219):

   var contentEncoding = connection.getContentEncoding() || "utf-8",

This should just be || ''. The Content-Encoding header describes how the body is encoded for transfer (for example gzip); it is not used for the character set.
