Generating and Parsing Localized Numbers In Windows
While Windows supports dozens or even hundreds of languages, its localization APIs require quite a bit of getting used to. Below is how I solved some common problems related to formatting and parsing a number for a specific locale.
The function GetNumberFormat() formats a number for a particular locale. Its simplest usage looks something like:
    #include <windows.h>

    // Newer SDK headers already define ARRAYSIZE; only define it if missing.
    #ifndef ARRAYSIZE
    #define ARRAYSIZE(x) ( sizeof(x) / sizeof((x)[0]) )
    #endif

    TCHAR buf[80];
    int ret = GetNumberFormat
    (
        LOCALE_USER_DEFAULT, // locale
        0,                   // dwFlags
        TEXT("1234567.89"),  // lpValue
        NULL,                // lpFormat
        buf,                 // lpNumberStr
        ARRAYSIZE(buf)       // cchNumber
    );
    ASSERT(ret != 0);
buf now contains the number 1234567.89 formatted for the user’s default locale. For example, for the English-United States locale, buf will contain “1,234,567.89”; for German-Germany, “1.234.567,89”; for Hindi, “12,34,567.89”.
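To make the differences between those three outputs concrete, here is a small portable sketch that applies grouping by hand. It is not how GetNumberFormat() is implemented and GroupDigits() is a hypothetical helper of my own. It assumes the group sizes are listed from the right and that the last size repeats.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not a Windows API: inserts groupSep into the integer
// digits and joins the fraction with decimalSep. groupSizes lists the group
// sizes starting from the right; the last size repeats for remaining digits.
std::string GroupDigits(const std::string& intDigits,
                        const std::string& fracDigits,
                        const std::vector<std::size_t>& groupSizes,
                        const std::string& groupSep,
                        const std::string& decimalSep)
{
    std::string out;
    std::size_t pos = intDigits.size();  // digits not yet emitted
    std::size_t idx = 0;                 // index of the current group size
    while (pos > 0)
    {
        std::size_t size = groupSizes.empty() ? pos : groupSizes[idx];
        if (size == 0 || size > pos)
            size = pos;                  // leftmost group takes what is left
        const std::string group = intDigits.substr(pos - size, size);
        out = out.empty() ? group : group + groupSep + out;
        pos -= size;
        if (idx + 1 < groupSizes.size())
            ++idx;                       // otherwise keep repeating the last size
    }
    if (!fracDigits.empty())
        out += decimalSep + fracDigits;
    return out;
}
```

With this sketch, GroupDigits("1234567", "89", {3}, ",", ".") gives 1,234,567.89, the same call with "." and "," swapped gives 1.234.567,89, and group sizes {3, 2} give 12,34,567.89.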
The format of the lpValue parameter is important. From GetNumberFormat()’s MSDN documentation:
lpValue
[in] Pointer to a null-terminated string containing the number string to
format. This string can only contain the following characters. All other
characters are invalid. The function returns an error if the string
indicated by lpValue deviates from these rules.
- Characters '0' through '9'.
- One decimal point (dot) if the number is a floating-point value.
- A minus sign in the first character position if the number is a negative value.
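As a rough illustration of those rules, the following portable sketch checks a candidate string before it is handed to GetNumberFormat(). LooksLikeValidLpValue() is a hypothetical helper, not part of the Windows API, and the requirement of at least one digit is my own assumption because the quoted rules do not cover it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical pre-check, not a Windows API: returns true when value uses
// only the characters the documentation allows for lpValue.
// Assumption: at least one digit is required (the quoted rules do not say).
bool LooksLikeValidLpValue(const std::string& value)
{
    std::size_t i = 0;
    if (i < value.size() && value[i] == '-')
        ++i;                       // minus sign only in the first position
    bool sawDigit = false;
    bool sawDot = false;
    for (; i < value.size(); ++i)
    {
        const unsigned char ch = static_cast<unsigned char>(value[i]);
        if (std::isdigit(ch))
            sawDigit = true;
        else if (ch == '.' && !sawDot)
            sawDot = true;         // at most one decimal point
        else
            return false;          // separators, exponents, spaces, '+', ...
    }
    return sawDigit;
}
```

Strings such as "1,234.5", "1e5" or "+1" fail this check, which is why the input has to be produced in plain fixed notation rather than copied from user-facing text.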
Given these constraints, I’ve found the easiest way to convert, say, a double to a string for use as lpValue is to use StringCchPrintf() (or _sntprintf(); note that wnsprintf() does not handle floating-point format specifiers), as in:
    #include <strsafe.h>

    int GetNumberFormatDbl(LCID locale, DWORD dwFlags, double value,
                           const NUMBERFMT* lpFormat, LPTSTR lpNumberStr,
                           int cchNumber)
    {
        // DBL_MAX is 1.7976931348623158e+308. Printed with "%lf", -DBL_MAX
        // needs 1 sign + 309 integer digits + 1 decimal point + 6 fractional
        // digits + 1 null terminator = 318 characters.
        TCHAR szBuf[318];
        HRESULT hr = StringCchPrintf(szBuf, ARRAYSIZE(szBuf),
                                     TEXT("%lf"), value);
        if (FAILED(hr))
        {
            SetLastError(ERROR_INVALID_PARAMETER);
            return 0;
        }
        return GetNumberFormat(locale, dwFlags, szBuf, lpFormat,
                               lpNumberStr, cchNumber);
    }
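One caveat: because "%lf" always prints exactly six fractional digits, GetNumberFormatDbl() does not deal well with very small numbers (below 1e-5 or so), which lose most or all of their significant digits before GetNumberFormat() ever sees them.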
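The effect is easy to reproduce off Windows. The sketch below uses std::snprintf() as a portable stand-in for StringCchPrintf(); ToFixedString() is a hypothetical helper, and it assumes the C runtime is still in the default "C" locale so that the decimal separator is a dot.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Portable stand-in for the StringCchPrintf() call: formats value in fixed
// notation with fracDigits digits after the decimal point.
// Assumption: the C runtime uses the default "C" locale ('.' as separator).
std::string ToFixedString(double value, int fracDigits)
{
    char buf[512];  // sketch-sized; ample for finite doubles and small fracDigits
    const int n = std::snprintf(buf, sizeof(buf), "%.*f", fracDigits, value);
    if (n < 0 || n >= static_cast<int>(sizeof(buf)))
        return std::string();  // formatting failed or would not fit
    return std::string(buf, static_cast<std::size_t>(n));
}
```

ToFixedString(0.0000001, 6) yields 0.000000, while ToFixedString(0.0000001, 10) yields 0.0000001000. If small values matter, one option is to pass an explicit, larger precision when building the lpValue string; how many of those digits are finally displayed still depends on the format given to GetNumberFormat().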
Parsing a Localized Number String
I spent a lot of time trying to figure out the best way to parse a localized number string until Michael Kaplan mentioned VariantChangeTypeEx(). Once I had that, the rest was (relatively) easy:
    #include <oleauto.h>  // BSTR and VARIANT functions; link with oleaut32.lib

    // Convert szStr to a BSTR. Returns NULL on failure. Result must be
    // freed with SysFreeString.
    BSTR TstrToBstr(LPCTSTR szStr)
    {
    #if defined(UNICODE)
        return SysAllocString(szStr);
    #else
        BSTR bstrRet = NULL;
        // First call computes the required length, including the terminator
        int cch = MultiByteToWideChar(CP_ACP, 0, szStr, -1, NULL, 0);
        if (cch != 0)
        {
            WCHAR* pswz = new WCHAR[cch];
            cch = MultiByteToWideChar(CP_ACP, 0, szStr, -1, pswz, cch);
            if (cch != 0)
            {
                bstrRet = SysAllocString(pswz);
            }
            delete[] pswz;
        }
        return bstrRet;
    #endif
    }

    // Converts the localized number string szNumber to a double using the
    // given locale. Returns TRUE and sets *pVal on success, FALSE otherwise.
    BOOL LocalizedStrToDbl(LCID lcid, LPCTSTR szNumber, double* pVal)
    {
        BOOL bSuccess = FALSE;
        // Set out parameter regardless
        *pVal = 0;
        BSTR bstr = TstrToBstr(szNumber);
        if (bstr != NULL)
        {
            VARIANT var;
            VariantInit(&var);
            // bstr will be freed on VariantClear
            var.bstrVal = bstr;
            var.vt = VT_BSTR;
            // Convert in place from VT_BSTR to VT_R8 using the locale's rules
            HRESULT hr = VariantChangeTypeEx(&var, &var, lcid, 0, VT_R8);
            if (hr == S_OK)
            {
                *pVal = var.dblVal;
                bSuccess = TRUE;
            }
            VariantClear(&var);
        }
        return bSuccess;
    }
Using VarR8FromStr() instead of VariantChangeTypeEx() is also an option.
If you pass NULL as the lpFormat parameter to GetNumberFormat(), you use the locale’s default number formatting information. I often find this to be unacceptable; for example, many times I want to control the number of fractional digits I display. To do this, you need to provide a filled-in NUMBERFMT structure to GetNumberFormat().
I suggest starting with the locale’s default NUMBERFMT and then changing only the members you require. Because Windows does not seem to provide a way to retrieve a locale’s default NUMBERFMT, we’ll have to roll our own.
To populate the members of NUMBERFMT we are going to use the function GetLocaleInfo(). The map between NUMBERFMT members and the LCTYPEs to pass to GetLocaleInfo() is as follows:
| NUMBERFMT Member | LCTYPE Constant |
| --- | --- |
| NumDigits | LOCALE_IDIGITS |
| LeadingZero | LOCALE_ILZERO |
| Grouping | LOCALE_SGROUPING |
| lpDecimalSep | LOCALE_SDECIMAL |
| lpThousandSep | LOCALE_STHOUSAND |
| NegativeOrder | LOCALE_INEGNUMBER |
GetLocaleInfo() always returns strings, but many of these strings need to be converted to UINTs. Furthermore, the conversion between the LOCALE_SGROUPING string and the Grouping member is quite tricky; read [How to fill in that number grouping member of NUMBERFMT][9] for more information.
We now have enough information to write the function to retrieve a locale-default NUMBERFMT:
    #include <tchar.h>  // _tcstol, _tcscmp

    // Converts a grouping string returned by GetLocaleInfo(LOCALE_SGROUPING)
    // into a UINT understood by NUMBERFMT.
    UINT GroupingStrToUint(LPCTSTR szGrouping)
    {
        LPCTSTR szCurr = szGrouping;
        UINT ret = 0;
        while (true)
        {
            ret *= 10;
            if (*szCurr == TEXT('\0'))
                break;
            TCHAR* pch;
            ret += _tcstol(szCurr, &pch, 10);
            if (_tcscmp(pch, TEXT(";0")) == 0)
                break;
            // Skip the ';' separator, but never step past the terminator
            szCurr = (*pch == TEXT('\0')) ? pch : pch + 1;
        }
        return ret;
    }

    // Fills the default NUMBERFMT structure for a given locale.
    // pFmt->lpDecimalSep and pFmt->lpThousandSep must point to valid buffers
    // of size cchDecimalSep and cchThousandSep respectively.
    BOOL GetDefaultNumberFmt(LCID lcid, NUMBERFMT* pFmt, int cchDecimalSep,
                             int cchThousandSep)
    {
        TCHAR szBuf[80];
        int ret = ::GetLocaleInfo(lcid, LOCALE_IDIGITS, szBuf, ARRAYSIZE(szBuf));
        if (ret == 0)
            return FALSE;
        pFmt->NumDigits = _tcstol(szBuf, NULL, 10);

        ret = ::GetLocaleInfo(lcid, LOCALE_ILZERO, szBuf, ARRAYSIZE(szBuf));
        if (ret == 0)
            return FALSE;
        pFmt->LeadingZero = _tcstol(szBuf, NULL, 10);

        ret = ::GetLocaleInfo(lcid, LOCALE_SGROUPING, szBuf, ARRAYSIZE(szBuf));
        if (ret == 0)
            return FALSE;
        pFmt->Grouping = GroupingStrToUint(szBuf);

        ret = ::GetLocaleInfo(lcid, LOCALE_SDECIMAL, pFmt->lpDecimalSep,
                              cchDecimalSep);
        if (ret == 0)
            return FALSE;

        ret = ::GetLocaleInfo(lcid, LOCALE_STHOUSAND, pFmt->lpThousandSep,
                              cchThousandSep);
        if (ret == 0)
            return FALSE;

        ret = ::GetLocaleInfo(lcid, LOCALE_INEGNUMBER, szBuf, ARRAYSIZE(szBuf));
        if (ret == 0)
            return FALSE;
        pFmt->NegativeOrder = _tcstol(szBuf, NULL, 10);

        return TRUE;
    }
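To see what the grouping conversion produces for typical inputs, here is a char-only sketch of the same idea that compiles without the Windows headers. GroupingStrToUintA() is a hypothetical name of my own, and the sketch takes care never to advance past the string terminator when the input has no trailing ";0".

```cpp
#include <cstdlib>
#include <cstring>

// char-only sketch of the grouping conversion, for experimenting off Windows.
// Hypothetical name; not part of the Windows API.
unsigned GroupingStrToUintA(const char* szGrouping)
{
    const char* szCurr = szGrouping;
    unsigned ret = 0;
    while (true)
    {
        ret *= 10;
        if (*szCurr == '\0')
            break;                 // no trailing ";0": keep the extra factor of 10
        char* pch = nullptr;
        ret += static_cast<unsigned>(std::strtol(szCurr, &pch, 10));
        if (std::strcmp(pch, ";0") == 0)
            break;                 // trailing ";0": stop without the extra factor
        // Skip the ';' separator, but never step past the terminator
        szCurr = (*pch == '\0') ? pch : pch + 1;
    }
    return ret;
}
```

With this sketch, "3;0" becomes 3, "3;2;0" becomes 32, "3" becomes 30 and "3;2" becomes 320.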
Now that we have these functions, we can use them to better control the output from GetNumberFormat(), as in:
    // Converts the double value to a localized string for the specified locale
    // with the given number of fractional digits.
    BOOL DblToLocalizedStr(LCID lcid, double value, int nDigits,
                           LPTSTR szStr, int cchStr)
    {
        // Get locale-default NUMBERFMT
        TCHAR szDecimalSep[5];
        TCHAR szThousandSep[5];
        NUMBERFMT fmt;
        fmt.lpDecimalSep = szDecimalSep;
        fmt.lpThousandSep = szThousandSep;
        if (!GetDefaultNumberFmt(lcid, &fmt, ARRAYSIZE(szDecimalSep),
                                 ARRAYSIZE(szThousandSep)))
            return FALSE;

        // Override the NumDigits member of NUMBERFMT
        fmt.NumDigits = nDigits;

        // Format the number with the custom NUMBERFMT
        int ret = GetNumberFormatDbl(lcid, 0, value, &fmt, szStr, cchStr);
        return (ret != 0);
    }