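In the previous post, "Apache garbled text", I described how to fix an Apache server encoding problem with .htaccess. This time, let's take a closer look at the other options .htaccess offers for configuring the server's character encoding.
This post draws mainly on a W3C technical document: Setting charset information in .htaccess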
Specifying by extension
Use the AddCharset directive to associate the character encoding with all files having a particular extension in the current directory and its subdirectories. For example, to serve all files with the extension .html as UTF-8, open the .htaccess file in a plain text editor and type the following line:
AddCharset UTF-8 .html
The extension can be specified with or without a leading dot. You can add multiple extensions to the same line. This will still work if you have file names such as example.en.html or example.html.en.
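As a small illustration of the two points above, the sketch below lists several extensions in one directive and shows the form without a leading dot. The .htm and txt extensions are chosen only for illustration and do not come from the original article.

```apache
# Several extensions can share one directive.
AddCharset UTF-8 .html .htm
# The leading dot is optional; this is equivalent to ".txt".
AddCharset UTF-8 txt
```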
The example will cause all files with the extension .html to be served as UTF-8. The HTTP Content-Type header will contain a line that ends with the 'charset' information as shown in the example that follows.
Content-Type: text/html; charset=UTF-8
Note:
All files with this extension in all subdirectories of the current location will also be served as UTF-8. If, for some reason, you need to serve the odd file with a different encoding you will need to override this using additional directives.
Note:
You can associate the character encoding with any extension attached to your file. For example, suppose you do language negotiation and you have pages in two languages that follow the model example.en.html and example.ja.html. Let's also suppose that you are happy to serve English pages using your server's ISO-8859-1 default, but want to serve Japanese files in UTF-8. To do this, you can associate the character encoding with the language extension, as follows:
AddCharset UTF-8 .ja
Take note, however, that, if you can, it might be a better solution to change the server default to UTF-8, or serve all files in new directories as UTF-8.
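The text does not name a directive for changing the default. One option, assuming your host allows .htaccess files to override FileInfo settings, is Apache's AddDefaultCharset directive, which adds a charset parameter to text/plain and text/html responses that do not already declare one. A minimal sketch:

```apache
# Make UTF-8 the default for this directory and everything below it.
# Affects only text/plain and text/html responses with no explicit charset.
AddDefaultCharset UTF-8
```

An AddCharset rule that matches a file still takes effect, because the default is only applied when no charset has been set.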
Note:
It is also possible to achieve the same result using the AddType directive, although this declares both the character encoding and the MIME type at the same time. The decision as to which is most appropriate will depend in part on how you are using extensions for content negotiation. If you are using different extensions to express the document type and the character encoding, this is less likely to be appropriate.
AddType 'text/html; charset=UTF-8' html
Changing the occasional file
Let's now assume that you want to serve only one file as UTF-8 in a large directory where all the other older files are correctly served as ISO-8859-1. The file you want to serve as UTF-8 is called example.html. Open the .htaccess file in a plain text editor and type the following:
<Files "example.html">
AddCharset UTF-8 .html
</Files>
What we did here was wrap the directive discussed in the previous section in some markup that identifies the specific file we are concerned with. If you have the need, there is also a slightly different syntax that allows you to specify a number of file names using a regular expression.
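The regular-expression form is not spelled out here. One way to write it, assuming Apache's FilesMatch container is the syntax meant and using two hypothetical file names, might look like this:

```apache
# Match example.html and sample.html (hypothetical names) by regular expression.
<FilesMatch "^(example|sample)\.html$">
AddCharset UTF-8 .html
</FilesMatch>
```

The pattern is anchored with ^ and $ so that names such as myexample.html are not matched by accident.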
Note:
It is also possible to achieve the same result using the AddType directive shown above, or, in this case, the ForceType directive, although these declare both the character encoding and the MIME type at the same time.
<Files "example.html">
ForceType 'text/html; charset=UTF-8'
</Files>
Note:
Any files with the same name in a subdirectory of the current location will also be served as UTF-8, unless you create a counter directive in the relevant directory.
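A counter directive could be sketched as follows, assuming a hypothetical subdirectory named archive/ whose own example.html is still encoded as ISO-8859-1. The rule goes in a .htaccess file inside that subdirectory:

```apache
# archive/.htaccess (hypothetical location)
# Serve this directory's example.html as ISO-8859-1 again.
<Files "example.html">
AddCharset ISO-8859-1 .html
</Files>
```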
More complex scenarios
When two extension rules apply to the same document, the order of the extensions in the file name matters: the rightmost matching extension takes precedence. Thus, in the following example
AddCharset UTF-8 .utf8
AddCharset windows-1252 .html
the file 'example.utf8.html' will be served as "windows-1252" and 'example.html.utf8' as UTF-8.