I think one of the best ways is to use <link> elements, like:
<link rel="alternate" hreflang="en-gb" href="http://en-gb.site.com" />
<link rel="alternate" hreflang="pt-br" href="http://pt-br.site.com" />
This alone will make search engines show the content in the right language, in most cases.
But obviously this is not enough. To get the user's language, a good way is to read window.navigator.language and then run it through some kind of BCP 47 matcher, so if the browser reports en-us it picks the closest available language, en-gb, for example. If there is no close language, it falls back to a default, which is whatever you define.
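To make the fallback idea concrete, here is a minimal sketch that uses only the standard library. It is a simplification and not a real BCP 47 matcher: it tries an exact tag, then the same primary language, then a default. The function name closestLanguage and the list of tags are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// closestLanguage returns the index of the entry in available that best
// matches requested: an exact tag first, then the same primary language
// (the part before the first "-"), then fallback.
// This is a simplification, not a full BCP 47 matcher.
func closestLanguage(requested string, available []string, fallback int) int {
	requested = strings.ToLower(requested)
	for i, tag := range available {
		if strings.ToLower(tag) == requested {
			return i
		}
	}
	primary := strings.SplitN(requested, "-", 2)[0]
	for i, tag := range available {
		if strings.SplitN(strings.ToLower(tag), "-", 2)[0] == primary {
			return i
		}
	}
	return fallback
}

func main() {
	available := []string{"en-gb", "pt-br"} // hypothetical site languages

	fmt.Println(closestLanguage("en-us", available, 0)) // 0 (en-gb)
	fmt.Println(closestLanguage("pt-pt", available, 0)) // 1 (pt-br)
	fmt.Println(closestLanguage("ja-jp", available, 0)) // 0 (fallback)
}
```

A real matcher also handles scripts, regions and deprecated tags, so for production I would still prefer a proper library over this.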
This is the first time I have tried to build a website with multiple languages, so what I'm going to write here is what I'm doing, but it has not been "production tested" and there might be other problems with this method.
I'm using GopherJS (Go), so I just copied what I used and tried to simplify it, but you'll need to find equivalent functions for Javascript.
What I have done is simply:
<div data-text="welcome"></div>
Then, when the page starts, you get the client language using window.navigator.language, and there is a dictionary and a list of languages:
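// Idiomas lists the supported languages; the order defines the index used in Dicionario.
var Idiomas = []language.Tag{language.English, language.BrazilianPortuguese}

// texts maps each data-text key to its translations, ordered like Idiomas
// (definition assumed here, it matches how the map is used).
type texts map[string][]string

var Dicionario = texts{
	"welcome": {
		"Welcome",   // language.English
		"Bem-vindo", // language.BrazilianPortuguese
	},
	"another-text": {
		"Another text",
		"Outro texto",
	},
}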
Now, to decide which content to display, I use golang.org/x/text/language; you should find an equivalent for Javascript. With it, the whole thing simply becomes:
var idiomaEscolhido string
// Get the language from the subdomain, if there is one
subdominio := strings.Split(js.Global.Get("location").Get("host").String(), ".")
if len(subdominio) > 1 && subdominio[0] != "meusite" {
	idiomaEscolhido = subdominio[0]
}
// Fall back to the browser language if none has been set so far
if idiomaEscolhido == "" {
	idiomaEscolhido = js.Global.Get("navigator").Get("language").String()
}
// Find the closest available language
_, index := language.MatchStrings(language.NewMatcher(Idiomas), idiomaEscolhido)
index is 0 for language.English and 1 for language.BrazilianPortuguese. First we use the language of the subdomain (as in en-us.meusite.com), but if there is no subdomain (as in meusite.com) we use the language of the browser. So when the page loads, just do:
for _, el := range t.QuerySelectorAll("[data-text]") {
	key := el.GetAttribute("data-text")
	textos, ok := Dicionario[key]
	// Skip keys that are missing or have no entry for this language
	if !ok || index >= len(textos) {
		logs.Warn("No text found for " + key)
		continue
	}
	t.SetText(el, textos[index])
}
The idea of this code is to find every data-text element and replace its content. So, if there is a data-text="welcome", it gets the text stored in the Dicionario map. Then, Dicionario["welcome"][0] will be English, and Dicionario["welcome"][1] will be Portuguese.
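The lookup can also be written as a small function that falls back to the default language when a translation is missing, instead of leaving the element empty. This is only a sketch: the translate function and the choice of index 0 as the default are my assumptions, not part of the code above.

```go
package main

import "fmt"

// texts maps a data-text key to its translations, ordered like the language list.
type texts map[string][]string

var dictionary = texts{
	"welcome":      {"Welcome", "Bem-vindo"},
	"another-text": {"Another text"}, // Portuguese entry deliberately missing
}

// translate returns the text for key in the language at index. It falls back
// to index 0 (assumed to be the default language) when that translation is
// missing, and reports false when the key is unknown.
func translate(d texts, key string, index int) (string, bool) {
	entries, ok := d[key]
	if !ok || len(entries) == 0 {
		return "", false
	}
	if index < 0 || index >= len(entries) || entries[index] == "" {
		index = 0
	}
	return entries[index], true
}

func main() {
	fmt.Println(translate(dictionary, "welcome", 1))      // Bem-vindo true
	fmt.Println(translate(dictionary, "another-text", 1)) // Another text true
	_, found := translate(dictionary, "missing", 1)
	fmt.Println(found) // false
}
```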
This has some limitations, such as date and plural formatting. For example, in Brazil we use DD/MM/YYYY, while in other places it is MM/DD/YYYY.
One way to mitigate this is to use an international standard format, but that may look strange in some cases.
On the performance side, there is a way to reduce the impact. Since you have to find every data-text element and then insert its text, you can use HTMLTemplateElement (<template>) or HTMLSlotElement (<slot>). These two elements are not rendered until they are instantiated. This means that any data-text inside them will not be found and will not be translated up front. Then, every time you display a template, you will have to translate the text it contains.
This reduces the impact, since only what has been instantiated gets translated. In other words, only what the user sees is translated, and you do not waste time translating what the user will never see.
On the SEO side, you can do on the server the same thing the JS does on the client. The purpose here is simple: when en-us.site.com is accessed, the bot already sees the content in English, because every data-text is filled in advance, even if the client does not have Javascript.
For this you can create, in my case, an http.Handler that runs some code using htmlquery, which uses XPath. It then creates the files index_en_us.html and index_pt_br.html, both with every data-text already filled in and stored ahead of time.
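To give an idea of what the pre-filling step does, here is a rough sketch that uses only the standard library. Note that it swaps htmlquery and XPath for a regular expression, which is fragile and only handles empty elements written exactly like the example above; the prerender function and the dictionary are assumptions for the illustration.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// emptyDataText matches an opening tag carrying data-text="key" that is
// immediately followed by a closing tag, i.e. an element with no content yet.
var emptyDataText = regexp.MustCompile(`(<[^<>]*\sdata-text="([^"]+)"[^<>]*>)(</)`)

// prerender fills every empty data-text element of page with the text for
// the language at index. Elements without a translation are left untouched,
// so the client can still fill them later.
func prerender(page string, dict map[string][]string, index int) string {
	return emptyDataText.ReplaceAllStringFunc(page, func(m string) string {
		parts := emptyDataText.FindStringSubmatch(m)
		entries := dict[parts[2]]
		if index < 0 || index >= len(entries) {
			return m
		}
		return parts[1] + html.EscapeString(entries[index]) + parts[3]
	})
}

func main() {
	dict := map[string][]string{
		"welcome": {"Welcome", "Bem-vindo"},
	}
	page := `<div data-text="welcome"></div>`

	// In a real setup each result would be written to its own file,
	// for example index_en_us.html and index_pt_br.html.
	fmt.Println(prerender(page, dict, 0)) // <div data-text="welcome">Welcome</div>
	fmt.Println(prerender(page, dict, 1)) // <div data-text="welcome">Bem-vindo</div>
}
```

A real HTML parser is the safer choice here, since a regular expression will break on nested or unusual markup.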
When the user accesses en-us.site.com, they will be served index_en_us.html. However, if they click "Portuguese", the effect is immediate and they do not have to wait for a request to pt-br.site.com, since the translation can also be done on the client.
If you use Node.js you can do the same thing, and maybe even reuse the client code on the server.
When the user accesses the default page, site.com, they will be served index_en_us.html. However, the content will then be translated on the client side, based on the language of the browser.
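The choice of file per host can be sketched as below. The pages map, the pageForHost function and the handler are assumptions for the example; only the file names come from what I described above.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// pages maps a language subdomain to its pre-rendered file.
var pages = map[string]string{
	"en-us": "index_en_us.html",
	"pt-br": "index_pt_br.html",
}

// defaultPage is served when the host has no known language subdomain.
const defaultPage = "index_en_us.html"

// pageForHost picks the pre-rendered file from the first label of the host.
func pageForHost(host string) string {
	// Drop an optional port, e.g. "pt-br.site.com:8080".
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	sub := strings.ToLower(strings.SplitN(host, ".", 2)[0])
	if page, ok := pages[sub]; ok {
		return page
	}
	return defaultPage
}

// handler serves the file chosen for the request's Host header.
func handler(w http.ResponseWriter, r *http.Request) {
	http.ServeFile(w, r, pageForHost(r.Host))
}

func main() {
	fmt.Println(pageForHost("pt-br.site.com")) // index_pt_br.html
	fmt.Println(pageForHost("site.com"))       // index_en_us.html
	// To actually serve: http.ListenAndServe(":8080", http.HandlerFunc(handler))
}
```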
You could also read the Accept-Language header on the server and serve the content in the best language. However, many CDNs (such as Cloudflare) do not respect Vary. So even if you return a Vary: Accept-Language header, to indicate that the content differs depending on Accept-Language, they still serve the same result for every request. So I preferred to send the default content and let the client translate it in Javascript. Translating on the client side is also a plus because it is immediate, so the user can switch language at any time without having to wait.
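If you do want to try the Accept-Language route, the header has to be parsed and ordered by its q-values first. The sketch below does this by hand with the standard library; parseAcceptLanguage is a name I made up, and it ignores some details of the header grammar. In Go, golang.org/x/text/language also provides ParseAcceptLanguage, which is the more robust option.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseAcceptLanguage returns the language tags of an Accept-Language header,
// lowercased and ordered by descending q-value (a missing q counts as 1.0).
func parseAcceptLanguage(header string) []string {
	type pref struct {
		tag string
		q   float64
	}
	var prefs []pref
	for _, part := range strings.Split(header, ",") {
		fields := strings.Split(strings.TrimSpace(part), ";")
		tag := strings.ToLower(strings.TrimSpace(fields[0]))
		if tag == "" {
			continue
		}
		q := 1.0
		for _, f := range fields[1:] {
			f = strings.TrimSpace(f)
			if strings.HasPrefix(f, "q=") {
				if v, err := strconv.ParseFloat(f[2:], 64); err == nil {
					q = v
				}
			}
		}
		prefs = append(prefs, pref{tag, q})
	}
	// A stable sort keeps the header order for tags with equal q-values.
	sort.SliceStable(prefs, func(i, j int) bool { return prefs[i].q > prefs[j].q })

	tags := make([]string, len(prefs))
	for i, p := range prefs {
		tags[i] = p.tag
	}
	return tags
}

func main() {
	fmt.Println(parseAcceptLanguage("pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7"))
	// [pt-br pt en-us en]
}
```

The first tag in the result would then go through the same matching step as the browser language, and the response would still need the Vary: Accept-Language header, with the CDN caveat above.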