Tamer Embaby
2007-04-04 08:59:37 UTC
All,
I have a character encoding problem in my environment:
$ uname -a
SunOS vulcano 5.10 Generic_118844-26 i86pc i386 i86pc
Server: Apache/2.0.58 (Unix) mod_perl/2.0.3 Perl/v5.8.4
I'm hosting a commercial application under mod_perl. The site we are
dealing with has Arabic characters, so I added the following to the
Apache configuration to enable the UTF-8 charset:
AddDefaultCharset UTF-8
The application itself doesn't handle character set encoding. I verified
with the vendor that they don't do anything with character encoding, and
they confirmed that their application works fine with the same settings,
so the problem is with my environment.
Somehow, something is transforming characters with values above 0x7f
into HTML character entities (&#XX;), so documents with Arabic letters
arrive at the browser corrupted.
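
To make the symptom concrete, here is a minimal Python sketch of the kind
of transformation I am seeing. It is only an illustration, not a diagnosis:
it assumes some output step that can emit ASCII only and falls back to
numeric character references. The function name to_entities and the sample
word are made up for the example.

```python
# Illustration only: a hypothetical ASCII-only output step that replaces
# every non-ASCII character with a numeric character reference.

def to_entities(text: str) -> bytes:
    """Encode text as ASCII, turning non-ASCII characters into &#NNNN;."""
    return text.encode("ascii", errors="xmlcharrefreplace")

# Sample Arabic word, written with escapes so the source stays ASCII.
sample = "\u0645\u0631\u062d\u0628\u0627"

expected_on_wire = sample.encode("utf-8")   # what the browser should get
observed_on_wire = to_entities(sample)      # the kind of output I am seeing

print(expected_on_wire)
print(observed_on_wire)
```

In the first case every Arabic letter is a two-byte UTF-8 sequence; in the
second the same letters come out as decimal references, which matches the
&#XX; pattern described above.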
I suspect that either Apache or mod_perl is doing this. Apache itself
serves static UTF-8 files correctly (without transforming UTF-8
characters into HTML character entities).
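
One way to compare a static response with a mod_perl response is to save
both bodies and classify the bytes. The sketch below is a hedged example
of that check; classify_body and its return labels are hypothetical names
chosen for this message, and it only inspects bytes that have already
been captured.

```python
import re

# Matches decimal (&#1605;) and hexadecimal (&#x645;) character references.
NUMERIC_ENTITY = re.compile(rb"&#(?:\d+|[xX][0-9a-fA-F]+);")


def classify_body(body: bytes) -> str:
    """Roughly classify how non-ASCII text is represented in a body."""
    has_high_bytes = any(b > 0x7F for b in body)
    if has_high_bytes:
        try:
            body.decode("utf-8")
        except UnicodeDecodeError:
            return "non-utf8 bytes"
        return "raw utf-8"
    if NUMERIC_ENTITY.search(body) is not None:
        return "numeric entities"
    return "plain ascii"


static_body = "\u0645\u0631\u062d\u0628\u0627".encode("utf-8")
dynamic_body = b"&#1605;&#1585;&#1581;&#1576;&#1575;"
print(classify_body(static_body))
print(classify_body(dynamic_body))
```

If the static file is reported as raw UTF-8 while the dynamic page is
reported as numeric entities, the conversion is happening somewhere in the
dynamic path rather than in Apache's static file handling.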
Below is additional info about my server.
Would anyone have an idea about what might be causing this, and how to
correct it?
I have a hunch that it has something to do with the locale passed to
mod_perl, and that I should be using "PerlPassEnv LANG" or something
similar.
Any pointers are appreciated.
Thanks,
Tamer
----- INFO BEGIN -----
$ ../../bin/apachectl -l
Compiled in modules:
core.c
mod_access.c
mod_auth.c
mod_include.c
mod_log_config.c
mod_env.c
mod_setenvif.c
prefork.c
http_core.c
mod_mime.c
mod_status.c
mod_autoindex.c
mod_asis.c
mod_cgi.c
mod_negotiation.c
mod_dir.c
mod_imap.c
mod_actions.c
mod_userdir.c
mod_alias.c
mod_so.c
$ locale -a
C
POSIX
de
es
fi
fr
iso_8859_1
nl
ru
sl
$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
$ perl -V
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Platform:
osname=solaris, osvers=2.10, archname=i86pc-solaris-64int
uname='sunos localhost 5.10 i86pc i386 i86pc'
config_args=''
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=define use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
-D_TS_ERRNO',
optimize='-O2 -fno-strict-aliasing',
cppflags=''
ccversion='GNU gcc', gccversion='', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long long', ivsize=8, nvtype='double', nvsize=8,
Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =''
libpth=/lib /usr/lib /usr/ccs/lib
libs=-lsocket -lnsl -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-R
/usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE'
cccdlflags='-fPIC', lddlflags='-G'
Characteristics of this binary (from libperl):
Compile-time options: USE_64_BIT_INT USE_LARGE_FILES
Locally applied patches:
22667 The optree builder was looping when constructing the ops
...
22715 Upgrade to FileCache 1.04
22733 Missing copyright in the README.
22746 fix a coredump caused by rv2gv not fully converting a PV
...
22755 Fix 29149 - another UTF8 cache bug hit by substr.
22774 [perl #28938] split could leave an array without ...
22775 [perl #29127] scalar delete of empty slice returned
garbage
22776 [perl #28986] perl -e "open m" crashes Perl
22777 add test for change #22776 ("open m" crashes Perl)
22778 add test for change #22746 ([perl #29102] Crash on assign
...
22781 [perl #29340] Bizarre copy of ARRAY make sure a pad op's
...
22796 [perl #29346] Double warning for int(undef) and abs(undef)
...
22818 BOM-marked and (BOMless) UTF-16 scripts not working
22823 [perl #29581] glob() misses a lot of matches
22827 Smoke [5.9.2] 22818 FAIL(F) MSWin32 WinXP/.Net SP1 (x86/1
cpu)
22830 [perl #29637] Thread creation time is hypersensitive
22831 improve hashing algorithm for ptr tables in perl_clone:
...
22839 [perl #29790] Optimization busted: '@a = "b", sort @a' ...
22850 [PATCH] 'perl -v' fails if local_patches contains code
snippets
22852 TEST needs to ignore SCM files
22886 Pod::Find should ignore SCM files and dirs
22888 Remove redundant %SIG assignments from FileCache
23006 [perl #30509] use encoding and "eq" cause memory leak
23074 Segfault using HTML::Entities
23106 Numeric comparison operators mustn't compare addresses of
...
23320 [perl #30066] Memory leak in nested shared data structures
...
23321 [perl #31459] Bug in read()
Built under solaris
Compiled at Jan 21 2005 15:48:11
@INC:
/usr/perl5/5.8.4/lib/i86pc-solaris-64int
/usr/perl5/5.8.4/lib
/usr/perl5/site_perl/5.8.4/i86pc-solaris-64int
/usr/perl5/site_perl/5.8.4
/usr/perl5/site_perl
/usr/perl5/vendor_perl/5.8.4/i86pc-solaris-64int
/usr/perl5/vendor_perl/5.8.4
/usr/perl5/vendor_perl
----- INFO END -----
--
Tamer Embaby <***@itworx.com>
" f u cn rd ths, u cn gt a gd jb n cmptr prgrmmng. "