Writing Multilingual Web Pages
Websites are one of the most effective tools for reaching your audience and customers anywhere in the world. They are viewed all over the globe, from Manchester to Montreal and Makkah to Moscow, by people who speak different languages and belong to different cultures. This diversity of culture and language demands the same flexibility from websites, which is why we need multilingual web pages. Different cultures see and approach things in different ways, and you will need to take this into account if you want your site to truly reach a global audience. Multilingual websites demand an exceptional mix of linguistic, cultural and technological knowledge to deliver intelligent, interactive interfaces smoothly in various languages.
Multilingualism of a site means here that the same textual content is available to users in different language versions, for all or at least some of the pages. Normally each page is in one language only, or at least mostly so. Sometimes multilingualism can be implemented so that one page contains text in several languages. However, this is usually practical only if there are just a few languages and the texts are short, e.g. on a page where the main content is an image accompanied by short captions in two or three languages. In most cases, separate pages in separate languages are needed.
Multilingualism in this sense normally means that each language version of a page is in a file of its own and can be referred to using a Web address (URL) of its own. But since it would be difficult to announce the address of a French version to French-speaking people, the address of a German version to German-speaking people, etc., it would be best if the same address could be used by all, so that everyone would get the page in their own language, or in whichever of the available languages they understand best. This can be partly achieved using automatic language negotiation; on the user side, this only requires that users specify their language preferences once in the settings of the browser. But for several reasons, the language negotiation mechanism is not sufficient (nor, on the other hand, is it indispensable). In any case the author should write explicit links, through which the user can move e.g. from a German version to a French version and vice versa.
Language negotiation can greatly improve the usability of a site. It is however not necessary, even if the pages exist in different language versions. Neither should one regard it as sufficient. In any case, linking to different language versions is needed.
There are strong reasons to provide links to different language versions even if the server supports language negotiation and arrangements have been made to utilize that. The reasons include the following:
· Browser support for language negotiation cannot be trusted. Some browsers have no support, but more importantly, general awareness of the issue among users is still rather limited. Thus, the information sent by a browser can be in serious conflict with the actual preferences of the user.
· Cache-related problems may cause the browser to receive the wrong language version.
· Users may wish to compare the different language versions or otherwise make use of them. Perhaps someone does not understand a statement in a French version even if French is his native language, but checking the corresponding statement in an English version may help (especially in areas where English is dominant in technical terminology).
· The user may "run into" a language specific page in different ways - by following a link, by using a search engine like AltaVista, or by using an address announced somewhere. This may mean that the entire language negotiation mechanism is bypassed. So the user might run into a page which is all Greek to him but which also exists in a language he knows. Thus, if the page has links to the other versions, it will help.
In practice, it is best to start with linking and only then consider whether there is a need and a possibility to use language negotiation, too. This is an example of "augmentative authoring": do first something simple that works for sure, though perhaps primitively, and whenever you add new possibilities, make sure that the old ways still keep working, at least in situations where the new ones don't.
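As a minimal sketch of the "linking first" approach, the Java snippet below builds the list of links from one language version of a page to the others. The file-naming scheme (index.en.html, index.de.html), the class name LanguageLinks, and the chosen languages are assumptions made for illustration only; the language names are also assumed to be trusted text that needs no HTML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LanguageLinks {
    /**
     * Builds an HTML fragment that links to every language version of a page
     * except the one currently displayed.
     * Assumes versions are stored as "<baseName>.<languageCode>.html",
     * which is a hypothetical naming scheme.
     */
    static String build(String baseName, String currentLang, Map<String, String> languages) {
        StringBuilder html = new StringBuilder("<ul class=\"languages\">\n");
        for (Map.Entry<String, String> entry : languages.entrySet()) {
            String code = entry.getKey();
            if (code.equals(currentLang)) {
                continue; // no link to the version the reader is already on
            }
            html.append("  <li><a href=\"").append(baseName).append('.').append(code)
                .append(".html\" hreflang=\"").append(code)
                .append("\" lang=\"").append(code).append("\">")
                .append(entry.getValue()).append("</a></li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        // Each language is named in that language, so readers can recognize their own.
        Map<String, String> languages = new LinkedHashMap<>();
        languages.put("en", "English");
        languages.put("de", "Deutsch");
        languages.put("it", "Italiano");
        System.out.print(build("index", "de", languages));
    }
}
```

The hreflang attribute tells browsers and search engines which language the link target is in, and the lang attribute marks the language of the link text itself. Because the links are plain HTML, they keep working whether or not language negotiation is configured on the server.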
It's a tough question whether language-specific or generic links should be used within the site itself and in references to its pages from outside. Normally generic links are preferable, but such an approach makes things more difficult if the user wishes to read pages in a language which is not topmost in his preferences. For example, if I'd like to know, perhaps just out of curiosity, what information exists in Italian on the Debian site, then I can select its main page in Italian. But when I follow links there, I will get versions as determined by the language preferences in my browser. It is true that I can, after finding myself on a page in Finnish, find there a link to the Italian version, but the problem is then repeated whenever I follow a link. This, however, should probably be regarded as an exceptional case, which should be handled by the user, e.g. by temporarily changing the language preferences in the browser. To summarize, links should normally be generic.
Before ASP.NET, developing multilingual applications in plain HTML was a daunting task. Programmers had to develop multiple pages to support multiple languages. Although PHP and ASP were introduced, they brought programmers little relief in terms of saving time. ASP.NET together with Visual Studio makes life easier for a programmer. With namespaces such as System.Globalization and System.Resources, which provide the CultureInfo and ResourceManager classes, a programmer can comfortably go beyond the scope of a single language.
Using Unicode:
Unicode is designed to allow single documents to contain characters or text from many scripts and languages, and to allow those documents to be used on computers with operating systems in any language and still remain intelligible. It is therefore ideally suited to the World Wide Web.
The HTML 4.0 Specification made a major step towards internationalizing the World Wide Web by adopting the Universal Character Set (as specified in ISO/IEC 10646 Information Technology - Universal Multiple-Octet Coded Character Set (UCS)) as the document character set for HTML. The UCS as specified in ISO/IEC 10646-1:2000 is precisely equivalent to the Unicode Standard 3.0.
RFC 2070 (Internationalization of the Hypertext Markup Language) has also been incorporated into HTML 4.0, which now includes provision for languages that are written right-to-left (such as Arabic and Hebrew), for appropriate punctuation, and for combining of letters and diacritics. Recent versions of Internet Explorer go even further, with support for Mongolian, which is written top-to-bottom.
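To illustrate how a single document can carry text from several scripts, here is a small Java sketch that encodes one string containing Latin, Arabic, and Japanese text as UTF-8 (one encoding of Unicode) and decodes it again. The class name UnicodeRoundTrip and the sample greeting are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;

class UnicodeRoundTrip {
    // One string mixing Latin, Arabic, and Japanese text. The non-ASCII characters
    // are written as escape sequences so the source compiles under any file encoding.
    static final String GREETING =
        "Hello, \u0645\u0631\u062d\u0628\u0627, \u3053\u3093\u306b\u3061\u306f";

    /** Encodes text as UTF-8 bytes, as they would be sent to a browser. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back into text. */
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode(GREETING);
        System.out.println(GREETING.length() + " characters, " + bytes.length + " bytes");
        System.out.println("round trip intact: " + GREETING.equals(decode(bytes)));
    }
}
```

The byte count is larger than the character count because UTF-8 uses one byte for each ASCII character but two or three bytes for each of the Arabic and Japanese characters here. For the browser to decode the page correctly, the server or the page itself must declare the same encoding that was used to write it.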
Developing Multilingual Web Applications Using JavaServer Pages Technology:
JavaServer Pages (JSP) technology has become a favorite tool for developers of web applications. With JSP pages, page authors can design dynamic web pages without needing extensive programming knowledge. At the same time, web developers can use an extensible tag mechanism to harness the power of underlying software components. JavaServer Pages (and several related technologies) form the presentation layer of web applications. With JSP pages, the developer can create dynamic web pages that interact with business logic, databases, and other services available on the network. Pages developed using JSP technology combine HTML, XML, or other static content with XML-like tags that connect to underlying software libraries, which are typically written in the Java programming language. Java technologies that are particularly important in this context are the JavaBeans components architecture (as a general-purpose interface between JSP and Java classes), the Java Database Connectivity (JDBC) APIs for access to SQL databases, and various libraries for XML processing.
JSP pages themselves are compiled to Java code in the form of servlets for execution. Servlets are web server extensions that are compiled and linked into the server, thus enabling faster execution than scripting languages. Servlets directly programmed in the Java programming language and JSP pages are often used together, with servlets acting as controllers and JSP pages as views of the application.
JavaServer Pages and the underlying servlet technology provide extensive support for handling HTTP request and response information as well as for session maintenance using cookies or URL rewriting.
An important reason for using JSP technology is that it allows the work of page authors and application developers to be separated. While it is possible to embed Java statements directly into JSP pages, developers have realized that this is best avoided and now prefer custom tags.
When designing a multilingual web application, you must first decide how to determine the user's language and locale preferences, and how to match those preferences against the set of locales that the application and the underlying Java runtime environment support. This section first describes the external environment and requirements web applications have to deal with. Next, we'll take a look at the functionality provided by the underlying Java 2 Standard Edition (J2SE) platform, and finally see how JavaServer Pages Standard Tag Library tags connect the environment and J2SE.
A web application has two ways to determine the user's language preferences: First, it can use language and locale preferences that are transmitted from the browser to the server using the HTTP request header field Accept-Language. While the standard provides for a variety of language tags, normal use combines an ISO 639 language code (for example, ja for Japanese) and an ISO 3166 country code (for example, IT for Italy). Browsers usually let the user create a list of languages as part of their preferences. Unfortunately, this way is not very reliable; users may or may not create the list, and the list may or may not include a locale that the application supports. Because of these uncertainties, multilingual applications usually also employ a second way: They let the user choose directly from a list of supported languages, and store the chosen language as part of the user's profile or just for the duration of the session. A good approach is to use the Accept-Language information initially, when nothing is known about the user, but give the user an opportunity to choose a language explicitly on the application's start page.
It's worth noting that the Accept-Language locales are intended primarily for language and cultural preferences. They shouldn't, for example, be interpreted as indicating the user's country of residence. Also, in many cases browsers provide locales that only have a language code, while some locale-sensitive functionality (for example, date formatting) varies from country to country. In many cases, it may be reasonable to assume the conventions of the main country using the language (if no country is specified); for example, use a date format known in Japan if only Japanese is specified. However, if important functionality depends on the country (for example, the currency), the user needs to be given a chance to correct the assumption.
In many cases, web applications are assembled from several components, which may be localized for different language sets. One particularly interesting component is the Java runtime environment, which in some locale-sensitive areas of functionality (such as date formatting) may support over 100 locales in over 40 languages, far more than typical web applications. Thus, the developer of an application has to decide whether to restrict localized functionality to the languages supported throughout the application, or take advantage of the capabilities of each component. The first approach has the advantage that the user sees pages that use the same language throughout, while the second may result in pages that mix different languages -- one language for most of the text, but a different one for, say, formatted dates.
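The approach described above, using Accept-Language initially but letting an explicit user choice take precedence, might be sketched in plain Java as follows. The class name LocaleResolver, the list of supported locales, and the default locale are assumptions for illustration. In a real servlet or JSP application the header value and the stored choice would come from the request and the session, which are represented here as plain strings so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class LocaleResolver {
    // Locales this hypothetical application has been translated into.
    static final List<Locale> SUPPORTED =
        Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.JAPANESE);
    static final Locale DEFAULT = Locale.ENGLISH;

    /**
     * Picks a locale: an explicit user choice wins, then the Accept-Language
     * header, then the application default.
     */
    static Locale resolve(String explicitChoice, String acceptLanguage) {
        if (explicitChoice != null) {
            Locale chosen = Locale.forLanguageTag(explicitChoice);
            if (SUPPORTED.contains(chosen)) {
                return chosen;
            }
        }
        if (acceptLanguage != null && !acceptLanguage.trim().isEmpty()) {
            try {
                // parse() orders the ranges by their q-values, highest first
                List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
                // lookup() falls back from "fr-CA" to "fr" when only the language is supported
                Locale match = Locale.lookup(ranges, SUPPORTED);
                if (match != null) {
                    return match;
                }
            } catch (IllegalArgumentException malformedHeader) {
                // ignore an unparsable header and use the default below
            }
        }
        return DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "fr-CA,fr;q=0.8,en;q=0.5")); // header only
        System.out.println(resolve("ja", "fr-CA,fr;q=0.8,en;q=0.5")); // explicit choice
        System.out.println(resolve(null, "it-IT"));                   // nothing supported
    }
}
```

Locale.LanguageRange.parse and Locale.lookup are part of java.util.Locale from Java 8 onward. Restricting the result to the SUPPORTED list corresponds to the first of the two options described above: the user sees one language throughout, even though the Java runtime itself could format dates for many more locales.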
Keywords:
Of course, there’s little point in putting all this effort into a cross-cultural design if your global audience is unable to find your site. To make the most of search engines you will need to put the same careful research into choosing keywords for other nations as you do for your home country.
A direct ‘dictionary translation’ of the keywords used for the home version of your site will not suffice. The terms people use in their searches often include synonyms, acronyms, colloquialisms and abbreviations. For example, a direct French translation of ‘car insurance’ would be ‘assurance voiture’. However, a far more popular search term in France is ‘assurance auto’ (with ‘auto’ in this context being an abbreviation of ‘automobile’).
You will need to find out what is popular in your target area and choose your keywords accordingly. Google’s keyword tool can help with this.
Benefits of writing Multilingual Web Pages:
- The organization's mandate can reach new potential users without print media and associated costs (which is a separate issue by itself and treated in another section of this manual).
- As a communications solution, it can be scaled to any audience size and projected into any location at no extra cost.
- Rapidly changing content can be published and revised in many languages simultaneously without the need to manage an inventory of printed material for global distribution.
- If you already have a Web site in one language, then its content is already in digital form, organized for translation. It's simply a matter of delivering the files to a translation agency by e-mail or FTP. This unique characteristic of Web sites reduces turnaround and logistical problems for you the client, especially on highly interactive sites.
- The organization needs to be aware of its resource allocation and the additional costs which translations may incur.