How not to do Localization
The content of this blog post applies to all Apple platforms and to most other user-facing operating systems and frameworks. For brevity, I'll use iOS as an example, but almost everything in here can be done similarly on other platforms.
Some time ago I stumbled upon the following code in a project I was working on:
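The listing itself is missing here. Going by the description later in this post (a "CURRENT_LOCALE" key in each Localizable.strings file, whose value is then set as the date formatter's locale), the pattern presumably looked roughly like the sketch below. The two dictionaries stand in for the German and English strings files so that the snippet runs on its own; the function names and the lookup helper are assumptions, not the original code.

```swift
import Foundation

// Stand-ins for the two translation files (assumed contents):
//   de.lproj/Localizable.strings:  "CURRENT_LOCALE" = "de_DE";
//   en.lproj/Localizable.strings:  "CURRENT_LOCALE" = "en_US";
let germanStrings = ["CURRENT_LOCALE": "de_DE"]
let englishStrings = ["CURRENT_LOCALE": "en_US"]

// Hypothetical replacement for NSLocalizedString so the sketch needs no app bundle.
func localizedString(_ key: String, appLanguage: String) -> String {
    let table = appLanguage == "de" ? germanStrings : englishStrings
    return table[key] ?? key
}

// The anti-pattern: the locale is derived from the translation file
// instead of from the user's settings.
func makeDateFormatter(appLanguage: String) -> DateFormatter {
    let formatter = DateFormatter()
    formatter.dateStyle = .short
    formatter.timeStyle = .short
    let identifier = localizedString("CURRENT_LOCALE", appLanguage: appLanguage)
    formatter.locale = Locale(identifier: identifier)
    return formatter
}

// Anyone who sees the English translation ends up with US conventions.
let englishFormatter = makeDateFormatter(appLanguage: "en")
print(englishFormatter.locale.identifier)
```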
What's wrong with this, you ask? Well, almost everything.
What it does
The Localizable.strings files are used to provide translations for strings used within an app on Apple's platforms. The first one above is the translation file for German, the second one is for English. E.g. to translate the title of a button we could do the following:
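A minimal sketch of such a lookup, assuming both files contain a "Done" key (mapped to "Fertig" in the German file and to "Done" in the English one). The key name and the comment text are assumptions, and the value argument is only there so that the snippet still returns something sensible when it is run without any strings file.

```swift
import Foundation

// Assumed entries:
//   de.lproj/Localizable.strings:  "Done" = "Fertig";
//   en.lproj/Localizable.strings:  "Done" = "Done";

// Looks up the key in the strings file that best matches the user's languages.
// If no entry is found, the fallback passed as `value` is returned.
let doneTitle = NSLocalizedString("Done",
                                  value: "Done",
                                  comment: "Title of the button that finishes editing")

// In an app, this string would then be assigned as the button's title.
print(doneTitle)
```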
NSLocalizedString checks the device's current language and looks at the appropriate file to fetch the string. So if you have set your device to English, the button's title will be "Done"; if you have set it to German, the button's title will be "Fertig".
Except NSLocalizedString (and the whole localization system of iOS/macOS) is much more intelligent. E.g. it will find the best matching language. This is what my "Language and Region" settings screen looks like:
On the first screen, you can see that my iPhone is currently set to German (=Deutsch). But below you see my list of "preferred languages". This list is used to determine which language should be used. If I use an app that was developed in French, uses French as the default language and is localized to English but not to German, you might assume that the app would be displayed in French to me (as it's the default language and German isn't available). But that's not what happens. Because I have set English as my second choice, the system chooses the English localization of the app for me. That's great because my French is really rusty.
But there is one more thing. If you look closely, on the second screen you can see that my language is actually set to "Deutsch (Deutschland)", so it's actually "German (Germany)" or, to use its locale identifier: de_DE.
This specifies the variant of German spoken in Germany in contrast to the one spoken in Austria (de_AT).
You are probably more familiar with variants of English like en_GB and en_US.
These variants allow you to fine-tune your translations to the specific dialect of a language.
But now let's look at the example above again. We had translations for en and de, but not for de_DE. Now, if the app's default language is English, does that mean I would get the app in English because de_DE is not available and neither is my second choice en_GB? So would I have to set my preferred languages to "German (Germany), German (Austria), German (Switzerland), English (U.K.), English, English (Australia), English (India)" just to be sure I get a language I like instead of an app's default language? Again, the system is more intelligent than that. It knows that de_DE is a variant of de, and by supplying only a de translation the app developer assures us that this translation works for all variants of German. So even though I never set de as an acceptable language, this is what I will get (which will most likely be what I expect).
Great, so we can have base translations for a language family and we can have specific dialects.
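The matching described above can be tried out with Bundle.preferredLocalizations(from:forPreferences:), which takes the localizations an app ships and an ordered list of user preferences. This is only a sketch: the two example apps and the preference list are assumptions that mirror the scenario above, with en-GB as the identifier for British English.

```swift
import Foundation

// The user's ordered language preferences: German (Germany), then English (U.K.).
let userPreferences = ["de-DE", "en-GB"]

// App developed in French and additionally localized to English only.
let frenchApp = Bundle.preferredLocalizations(from: ["fr", "en"],
                                              forPreferences: userPreferences)

// App that ships plain "en" and "de" translations without regional variants.
let germanApp = Bundle.preferredLocalizations(from: ["en", "de"],
                                              forPreferences: userPreferences)

print(frenchApp) // the English localization wins over the French default
print(germanApp) // plain "de" is accepted as a match for de-DE
```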
So what's the problem with putting an en_US "CURRENT_LOCALE" key into the en file and using it for the date formatter?
Okay, it would use US-American names for the month even for someone from the UK.
But last I checked, they spelt at least the month and weekday names the same, so what does it matter?
Language ≠ Locale
The problem is that we are not just setting the language of the date formatter by setting its locale but much more.
The locale encompasses not only the language (and the dialect) used, but also which currency is used, how numbers are formatted, which calendar to use and, surprise, how dates are formatted.
You can specify all that in a locale identifier, so e.g. en_US@calendar=japanese specifies that I want the US-English language variant, all the defaults from the US region, but a Japanese calendar.
But even if I don't specify all properties of a Locale in the identifier, the region sets some defaults. So for example, my locale identifier de_DE means the following:
- Language: Germany's German
- Decimal Separator: , (yes we use a comma, so 2,5 means two-and-a-half here)
- Calendar: Gregorian
- Currency Symbol: €
- Date Formatting: 28.07.17, 16:05 [1] (We specify the day of the month before the month and we use a dot to separate the components. We also like a 24-hour time style.)
and many other things. If you want to see all the things specified by a Locale, take a look at the documentation by Apple or of your favourite framework.
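These defaults can also be inspected in code. A small sketch using Foundation's Locale; the choice of properties printed here is an assumption about what is most interesting, not an exhaustive list.

```swift
import Foundation

let german = Locale(identifier: "de_DE")

print(german.decimalSeparator ?? "?")   // separator between integer and fraction
print(german.groupingSeparator ?? "?")  // separator between groups of digits
print(german.currencyCode ?? "?")       // ISO 4217 currency code
print(german.currencySymbol ?? "?")     // symbol shown to the user
print(german.calendar.identifier)       // default calendar of the region

// A keyword in the identifier overrides a single default of the region.
let usWithJapaneseCalendar = Locale(identifier: "en_US@calendar=japanese")
print(usWithJapaneseCalendar.calendar.identifier)
```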
So an en_US Locale without any other specifiers means the following:
- Language: US English
- Decimal Separator: .
- Calendar: Gregorian
- Currency Symbol: $
- Date Formatting: 7/28/17 4:05 PM
But what does an en_GB Locale default look like? This:
- Language: British English
- Decimal Separator: .
- Calendar: Gregorian
- Currency Symbol: £
- Date Formatting: 28/07/2017, 16:05
So people [2] from the UK apparently agree with the Germans that it should be day-month-year and that one should use a 24-hour time format.
But they use slashes as separators instead of the dots used by Germans.
Oh, and did you notice that people from the US and Germany are usually happy with the year specified as 17, whereas people from the UK would also like to know the millennium we are talking about?
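To reproduce this comparison, the same point in time can be run through a DateFormatter with the short date and time styles for each locale. This is a sketch for experimenting only: the helper name and the fixed UTC time zone are assumptions, and en_GB is used as the identifier for British English. Setting the locale explicitly is done here purely for demonstration.

```swift
import Foundation

// 28 July 2017, 16:05 UTC, built explicitly so the result is reproducible.
var utcCalendar = Calendar(identifier: .gregorian)
utcCalendar.timeZone = TimeZone(secondsFromGMT: 0)!
let date = utcCalendar.date(from: DateComponents(year: 2017, month: 7, day: 28,
                                                 hour: 16, minute: 5))!

func shortString(for date: Date, localeIdentifier: String) -> String {
    let formatter = DateFormatter()
    // Demonstration only; app code keeps the default (current) locale.
    formatter.locale = Locale(identifier: localeIdentifier)
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.dateStyle = .short
    formatter.timeStyle = .short
    return formatter.string(from: date)
}

for identifier in ["de_DE", "en_US", "en_GB"] {
    print(identifier, shortString(for: date, localeIdentifier: identifier))
}
```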
The Problem
So by using the language file to determine the locale code, doing that incorrectly (because e.g. en does not automatically mean en_US and de does not automatically mean de_DE) and using the resulting code, e.g. en_US, to instantiate and set a Locale on the date formatter, we override the defaults the user has specified in the system preferences.
This leads to a user from the UK, or anyone who prefers their apps in a variant of English, getting their dates formatted as 7/28/17 and being forced to use AM/PM again.
It also means that if we set the locale of a NumberFormatter the same way, people from the UK would get $ as their currency symbol. And while people from Germany and Austria format a number like ten-thousand-and-a-half like this: 10.000,5 and the UK and US format it like this: 10,000.5, German-speaking people from Switzerland format it like this: 10'000.5.
But the Swiss can't seem to agree, as people speaking French in Switzerland do it like this: 10 000.5.
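The number examples can be checked the same way with a NumberFormatter. Again a sketch with an assumed helper name; the exact grouping characters for the Swiss locales depend on the locale data shipped with the OS, so no output is listed for them.

```swift
import Foundation

func decimalString(_ value: Double, localeIdentifier: String) -> String {
    let formatter = NumberFormatter()
    formatter.numberStyle = .decimal
    // Demonstration only; app code keeps the default (current) locale.
    formatter.locale = Locale(identifier: localeIdentifier)
    return formatter.string(from: NSNumber(value: value)) ?? ""
}

for identifier in ["de_DE", "en_US", "de_CH", "fr_CH"] {
    print(identifier, decimalString(10_000.5, localeIdentifier: identifier))
}
```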
And while those are nice defaults for the separate regions, maybe some people would like to change them. Maybe I want my device language to be German but I kinda like those slashes the people from the UK have, and because I'm programming a lot I'm actually more used to having . as a decimal separator. So I just go to the system preferences and change my region to U.K. but keep the language as Deutsch.
In macOS you can even configure your own date and number formats to use, if you like. While this is not yet possible on iOS, who says it won't be introduced with the next iOS version?
And that's not all: I can also change the calendar. Maybe I want to try out an Islamic calendar. There are several choices; I will go with the islamicTabular one and get 05.11.1438 AH for the date above. And while we are at it, if I change my language to Arabic the number symbols also change, so it looks like this: ٥ ذو. ق، ١٤٣٨ هـ. I hope you all have browsers with proper Unicode support.
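The effect of the calendar can be simulated by giving a DateFormatter a different Calendar. In a real app this comes from the user's settings via the current locale; the explicit locale, calendar and medium date style below are assumptions made so the sketch is reproducible, and the exact strings depend on the locale data of the OS.

```swift
import Foundation

var utcCalendar = Calendar(identifier: .gregorian)
utcCalendar.timeZone = TimeZone(secondsFromGMT: 0)!
let date = utcCalendar.date(from: DateComponents(year: 2017, month: 7, day: 28,
                                                 hour: 16, minute: 5))!

func dateString(localeIdentifier: String, calendar identifier: Calendar.Identifier) -> String {
    let formatter = DateFormatter()
    // Demonstration only; app code keeps the default locale and calendar.
    formatter.locale = Locale(identifier: localeIdentifier)
    var calendar = Calendar(identifier: identifier)
    calendar.timeZone = TimeZone(secondsFromGMT: 0)!
    formatter.calendar = calendar
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.dateStyle = .medium
    formatter.timeStyle = .none
    return formatter.string(from: date)
}

print(dateString(localeIdentifier: "de_DE", calendar: .gregorian))
print(dateString(localeIdentifier: "de_DE", calendar: .islamicTabular))
print(dateString(localeIdentifier: "ar_SA", calendar: .islamicTabular))
```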
How to do it correctly
So, how can we support all the fancy configurations a user has made? If I can't get the locale code from the language file, how am I supposed to figure out what the user wants? What Locale am I supposed to set on that date formatter?
Well, that's actually really easy: Apple provides [NSLocale currentLocale], which contains all the settings the user made. So you can just use that. But you know what? If you create a new DateFormatter (or any other Formatter for that matter), that's actually the default. So all you need to do to support all of this awesomeness: Don't break it!
Do not set the locale property if you don't know what you are doing and how locales work.
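In Swift, the "don't break it" version is simply a formatter whose locale is never touched. A minimal sketch; the chosen styles are an assumption, pick whichever fit your UI.

```swift
import Foundation

let formatter = DateFormatter()
// No locale is set: the formatter already uses the user's current settings.
formatter.dateStyle = .medium
formatter.timeStyle = .short

// The default locale of a fresh formatter is the current one.
print(formatter.locale.identifier == Locale.current.identifier)
print(formatter.string(from: Date()))
```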
Other DON'Ts
- Do not set the dateFormat property on DateFormatters that are formatting user-visible dates. Only use it for formatting dates sent to other computers that need a fixed format. And in that case you probably want ISO8601DateFormatter anyway.
- Do not rely on strings produced by formatters having a specific length in your UI. As you can see above there are lots of different versions.
- In general, do not assume any region, calendar, currency, decimal delimiter, number symbol, date format etc. for your users.
- Don't use preferredLanguages to figure out which language to request from a server that supplies data to your app. preferredLanguages contains all the user's languages including the ones your app doesn't support. So if the user's device is set to French, but your app only supports English, but your server supports French, your app will request French data from the server and suddenly mix English and French.
  - To figure out which of your supported languages the user prefers use [[NSBundle mainBundle] preferredLocalizations].
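Two of the points above, sketched in Swift: a fixed, machine-readable timestamp via ISO8601DateFormatter, and the language to request from a server taken from the bundle's preferredLocalizations. The "en" fallback is an assumption for the case that the bundle reports no localization at all.

```swift
import Foundation

// Machine-readable timestamp: fixed format, independent of the user's locale.
let isoFormatter = ISO8601DateFormatter()
let timestamp = isoFormatter.string(from: Date(timeIntervalSince1970: 0))
print(timestamp) // 1970-01-01T00:00:00Z

// Language to request from the server: the localization the app itself is
// using, not the first entry of the user's preferred languages.
let appLanguage = Bundle.main.preferredLocalizations.first ?? "en"
print("Requesting server content in:", appLanguage)
```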
DOs
- Do use Formatters. Don't even think about trying to format numbers, dates, date intervals, currencies, person names or any kind of measurements yourself.
  No, not even if your app should only be released in one single country that only speaks a single language (how many of those are there?).
  And [NSString stringWithFormat:@"Your number: %d", theNumber] (or string interpolation in Swift) counts as "rolling your own formatter" [3]. Apple did a really good job taking care of all those variants and edge cases. Take advantage of it!
  - The DateComponentsFormatter is good for formatting durations and things like "x minutes remaining". However, it has some limitations, e.g. it cannot create a string like "x minutes ago".
- If you need more fine-grained control over the date format (do you really?), use the setLocalizedDateFormatFromTemplate method to specify the components you need but let the DateFormatter take care of the ordering.
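A short sketch of the formatters mentioned in this list. The helper name, the "MMMMd" template and the sample values are assumptions; the explicit locales are only there to show that the same template yields a different component order per locale.

```swift
import Foundation

var utcCalendar = Calendar(identifier: .gregorian)
utcCalendar.timeZone = TimeZone(secondsFromGMT: 0)!
let date = utcCalendar.date(from: DateComponents(year: 2017, month: 7, day: 28))!

// Ask for "day and full month name"; the formatter decides order and punctuation.
func dayAndMonth(localeIdentifier: String) -> String {
    let formatter = DateFormatter()
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    // Demonstration only; app code keeps the default (current) locale.
    formatter.locale = Locale(identifier: localeIdentifier)
    // The template must be applied after the locale has been set.
    formatter.setLocalizedDateFormatFromTemplate("MMMMd")
    return formatter.string(from: date)
}
print(dayAndMonth(localeIdentifier: "en_US")) // July 28
print(dayAndMonth(localeIdentifier: "de_DE")) // 28. Juli

// Durations and phrases such as "x minutes remaining".
let durationFormatter = DateComponentsFormatter()
durationFormatter.unitsStyle = .full
durationFormatter.allowedUnits = [.minute]
durationFormatter.includesTimeRemainingPhrase = true
print(durationFormatter.string(from: 5 * 60) ?? "")

// If a format string is unavoidable, at least format its numbers for the current locale.
print(String.localizedStringWithFormat("Your number: %ld", 1_234_567))
```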
Further Reading
- Download this playground and play around with a few formatters to get a better feel for them.
- Take a look at Apple's Internationalization and Localization Guide.
- The Reviewing Language and Region Settings chapter gives a nice overview of how the Calendar app behaves for different languages, regions and calendars.
- ISO 639 for language codes
- ISO 3166 for country codes
- ISO 15924 for script codes
- ISO 4217 for currency codes
- Internationalization Best Practices from WWDC 2016
[1] This uses the short dateStyle and timeStyle of DateFormatter. The longer styles already contain month names in some cases and I wanted to focus on the order of the components and the delimiters, not translations of names.
[2] Whenever I'm referring to "people from" in this post what I actually mean is "many people from" or "the default in this country is". And the "default" I'm describing is the result of an iOS DateFormatter or NumberFormatter set to the specific language and region. It might be that Apple got their defaults wrong for one of my examples; in that case, please let me know.
[3] If you really, really need to use stringWithFormat: for a string with numbers displayed to a user, at least use localizedStringWithFormat: