Saturday, January 29, 2011

Set default MySQL connect charset for PHP (in RHEL)?

We're running a hundred or so legacy PHP websites on an older server which runs Gentoo Linux. When these sites were built, latin1 was still the common charset in both PHP and MySQL.

To make sure those older sites used latin1 by default, while still allowing newer sites to use utf8 (our current standard), we set the default connect charset in php.ini:

mysql.connect_charset = latin1
mysqli.connect_charset = latin1
pdo_mysql.connect_charset = latin1

Specific, more modern sites could override this in their bootstrapping code with:

<?php
// The second argument is the link identifier returned by mysql_connect().
mysql_set_charset("utf8", $link);

...and all was well.

Now the server is overloaded and we're no longer with that hoster, so we're moving all these sites to a faster server at our standard hoster, which uses RHEL 5 as their OS of choice.

While setting up this new server I discovered, to my surprise, that the *.connect_charset directives are a Gentoo-specific patch to PHP, and RHEL's version of PHP doesn't recognize them! Now how do I set PHP to connect to MySQL with the latin1 charset?

I thought about setting a default in my.cnf but would prefer not to force every app and client to default to latin1. Our policy is to use utf8, and we'd like to restrict the exception to PHP only. Also, converting every legacy site to properly use utf8 is not feasible, since many are of the "touch 'em and you break 'em" kind. We simply don't have the time to go fix them all.

How would I set a default mysql/mysqli/pdo_mysql connection charset to latin1 for PHP, while still allowing individual scripts to override this to utf8 with mysql_set_charset()?

  • default_charset = "latin1" should do the trick, placed inside php.ini.

    Edit: This obviously isn't exactly the same thing, so you may have better control by using this .htaccess directive for each of those old domains:

    AddDefaultCharset ISO-8859-1

    Though I haven't tested it.

    Dan Carley : Those two options only affect output character encoding from PHP and Apache. They won't affect how data is read from MySQL, further down the chain.
    Martijn Heemels : Those options determine how PHP outputs to the browser, not how it retrieves data from the database. So I'm afraid they won't work.
    Martijn Heemels : Also, be careful with the default_charset option, since that *forces* all output to be the specified charset, and cannot be overridden by a .htaccess or a meta tag.
    From gekkz
  • Might this do what you are after?

    mysql_query('SET NAMES latin1');
    

    (Preferably called right after you've established the database connection.)

    Dan Carley : This performs the same bootstrapping function as `mysql_set_charset()` but isn't the recommended approach as per the notes section of http://php.net/manual/en/function.mysql-set-charset.php
    Martijn Heemels : Like Dan says, 'mysql_set_charset()' is the recommended version of that statement. I'd still need to modify every single site though. Maybe I could use PHP's auto_prepend_file?
    andol : While mysql_set_charset() might be the recommended approach, it requires PHP 5 >= 5.2.3, which currently isn't available in RHEL. The PHP version in RHEL 5.4 is 5.1.6.
    Martijn Heemels : @andol, good that you mention it. We're using Zend Server, which includes PHP 5.2.11. Before you ask, it doesn't recognize the *.connect_charset directives either.
    From andol
  • Well, after some searching it appears the mysql*.connect_charset directives are a Gentoo-specific patch. I've found no way to get the same specific behaviour with RHEL's default PHP package or Zend Server's PHP stack.

    I've resorted to defaulting MySQL to use latin1, because the majority of sites on this server are legacy. New sites have their charset defined explicitly so they will override the default.
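
For a newer site, the "define the charset explicitly" part of that resolution could look roughly like the sketch below. It is only an illustration under some assumptions: it uses mysqli and mysqli_set_charset() (the mysqli counterpart of mysql_set_charset()) rather than the older mysql_* functions discussed above, and the helper names pick_charset() and connect_with_charset(), the whitelist of charsets, and the connection parameters are all hypothetical, not something from the thread.

```php
<?php
// Hypothetical helper: return the requested charset if it is on our
// whitelist, otherwise fall back to the legacy default (latin1).
function pick_charset($requested = null, $default = 'latin1')
{
    $allowed = array('latin1', 'utf8');
    if ($requested !== null && in_array($requested, $allowed, true)) {
        return $requested;
    }
    return $default;
}

// Hypothetical helper: open a connection and set its charset explicitly,
// so the site does not depend on whatever the server default happens to be.
// Note: on newer PHP versions a failed mysqli_connect() may throw an
// exception instead of returning false, depending on mysqli_report().
function connect_with_charset($host, $user, $pass, $db, $charset = null)
{
    $link = mysqli_connect($host, $user, $pass, $db);
    if (!$link) {
        return false;
    }
    if (!mysqli_set_charset($link, pick_charset($charset))) {
        mysqli_close($link);
        return false;
    }
    return $link;
}
```

A new site's bootstrap would then call something like connect_with_charset($host, $user, $pass, $db, 'utf8'), while a legacy site that passes no charset ends up on latin1 either way. This does not remove the need for a server-side default, since untouched legacy code never calls the helper.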
