Item9126: Wrong encoding in CompareRevisionsAddOn

Priority: Normal
CurrentState: Closed
AppliesTo: Extension
Component: CompareRevisionsAddOn
WaitingFor:
I use {Languages}{'pt-br'}{Enabled} ticked, {Site}{Locale} is pt_BR.utf8 and {Site}{CharSet} is utf-8.

Everything works fine except when I enable CompareRevisionsAddOn and look at a topic's history (everything else keeps working fine): all "special" characters appear to be double-encoded. The problem is that $element->as_HTML() is called without an $entities argument, which makes HTML::Element encode all "unsafe" characters. From the HTML::Element documentation:

Returns a string representing in HTML the element and its descendants. The optional argument $entities specifies a string of the entities to encode. For compatibility with previous versions, specify '<>&' here. If omitted or undef, all unsafe characters are encoded as HTML entities. See HTML::Entities for details. If passed an empty string, no entities are encoded.

I changed the call from:
        return $element->as_HTML( undef, undef, {} );

to:
        return $element->as_HTML( q|'"<>%&|, undef, {} );

The list of "dangerous" characters is taken from the "safe" encoding. With this change everything worked as expected.

Any concerns about committing this change?
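To make the difference between the two calls concrete, here is a minimal sketch that uses HTML::Entities directly (the module the quoted documentation refers to) rather than going through HTML::Element. The sample word and the variable names are only illustrative, and it assumes the text reaches the encoder as undecoded UTF-8 bytes, as in this report.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# "ação" as undecoded UTF-8 bytes (illustrative sample, not from the add-on).
my $bytes = "a\xC3\xA7\xC3\xA3o";

# Default set of unsafe characters: every high-bit byte is turned into
# its own Latin-1 entity, so each two-byte letter becomes two entities.
my $all = encode_entities($bytes);

# Restricted set, as in the changed call: the UTF-8 bytes pass through
# untouched and only the listed characters would be encoded.
my $restricted = encode_entities( $bytes, q|'"<>%&| );

print "$all\n";           # a&Atilde;&sect;&Atilde;&pound;o
print "$restricted\n";    # the original bytes, unchanged
```

The first result is what shows up as "double-encoded" text in the browser; the second keeps the bytes intact but leaves the parser working on an undecoded string.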

-- GilmarSantosJr - 08 Jun 2010

After more reading, I implemented this change (relative to trunk):

$ git diff
diff --git a/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm b/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Com
index ed0a7e3..6d31949 100755
--- a/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm
+++ b/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm
@@ -322,6 +322,10 @@ sub _getTree {
     my $tree = new HTML::TreeBuilder;
     $tree->implicit_body_p_tag(1);
     $tree->p_strict(1);
+    if ( $Foswiki::cfg{UseLocale} ) {
+        require Encode;
+        $text = Encode::decode( $Foswiki::cfg{Site}{CharSet}, $text );
+    }
     $tree->parse($text);
     $tree->eof;
     $tree->elementify;

And it worked, without the change described in my previous comment. With the first solution, the parse() method prints lots of messages to STDERR about parsing undecoded utf-8 strings. This solution produces no warnings.

So, what is the best fix? Any other suggestions?
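As a rough illustration of what the decode step in the patch does to the text before it is parsed, the sketch below uses only the core Encode module. The %cfg hash is a hypothetical stand-in for %Foswiki::cfg holding the settings from this report, and the sample word is made up for the example.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for %Foswiki::cfg (assumption for this sketch).
my %cfg = ( UseLocale => 1, Site => { CharSet => 'utf-8' } );

# "ação" as raw UTF-8 bytes, before any decoding.
my $text = "a\xC3\xA7\xC3\xA3o";
my $byte_length = length $text;    # counts bytes: 6

if ( $cfg{UseLocale} ) {
    # Same step as the patch: turn site-charset bytes into Perl characters.
    $text = Encode::decode( $cfg{Site}{CharSet}, $text );
}
my $char_length = length $text;    # counts characters: 4

printf "bytes: %d, characters: %d, utf8 flag: %s\n",
    $byte_length, $char_length, utf8::is_utf8($text) ? 'on' : 'off';
```

Once the string holds characters instead of bytes, each accented letter is a single character to whatever consumes it next, which fits the observation that the warnings about undecoded utf-8 go away.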

-- GilmarSantosJr - 08 Jun 2010

 
Topic revision: r5 - 13 Sep 2010 - 21:31:51 - KennethLavrsen