Input and output code pages
- It’s possible to get the Win32 compiler to be in a weird state where it starts writing UTF-8 encoded bytes for ASCII string data. This seems to depend on the order of `#pragma code_page` directives in a file and/or the code page given on the CLI.
  - A Windows-1252 encoded file that has only `1 RCDATA { "Ó" }` in it will (with all default settings) be compiled into the single byte `0xD3` (`Ó` in Windows-1252). If the option `/c65001` is given to the CLI, it will instead be compiled into `0xEF 0xBF 0xBD`, the UTF-8 sequence for `U+FFFD` (the replacement character).
  - A Windows-1252 encoded file that has `1 RCDATA { "Ó" }` with `#pragma code_page(1252)` before it will be compiled into the single byte `0xD3` (`Ó` in Windows-1252). If the option `/c65001` is given to the CLI, it will instead be compiled into `0xC3 0x93`, the UTF-8 sequence for `Ó`.
    - The `/c65001` behavior can be ‘counteracted’ if there is any `#pragma code_page` before the `#pragma code_page(1252)` (it will compile into `0xD3` again).
  - A Windows-1252 encoded file that has `1 RCDATA { "Ó" }` with `#pragma code_page(65001)` before it will be compiled into the single byte `0x3F` (the `?` character). If the option `/c65001` is given to the CLI, it will instead be compiled into `0xEF 0xBF 0xBD`, the UTF-8 sequence for `U+FFFD` (the replacement character).
    - The `/c65001` behavior can be triggered without using the `/c` option if there is any `#pragma code_page` before the `#pragma code_page(65001)`.
  - It seems like there are actually two distinct settings: input code page and output code page. The first `#pragma code_page` does not affect the output code page, but all the rest do. The `/c` CLI option affects both.
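The "two distinct settings" hypothesis can be checked against the observations listed above with a small model. The Python sketch below is only a toy reconstruction of that hypothesis, not how the Win32 compiler is actually implemented. The names `compile_narrow_string`, `pragmas`, and `cli_code_page` are made up for illustration. It assumes a default code page of 1252, and it assumes that invalid input and unrepresentable output behave like Python's `errors="replace"` (`U+FFFD` on decode, `?` on encode), which happens to match the bytes reported above.

```python
from typing import Optional, Sequence

# Assumption: the default code page is Windows-1252, as in the observations above.
DEFAULT_CODE_PAGE = 1252

# Only the two code pages discussed in this section are modeled.
CODECS = {1252: "cp1252", 65001: "utf-8"}


def compile_narrow_string(
    raw: bytes,
    pragmas: Sequence[int] = (),
    cli_code_page: Optional[int] = None,
) -> bytes:
    """Model the bytes written for a string literal under the hypothesis.

    raw           -- the literal's bytes as they appear in the .rc file
    pragmas       -- code pages of the #pragma code_page directives that
                     precede the literal, in file order
    cli_code_page -- the value passed to /c, if any
    """
    # Hypothesis: the /c option sets both the input and output code page.
    start = DEFAULT_CODE_PAGE if cli_code_page is None else cli_code_page
    input_cp = output_cp = start

    for index, code_page in enumerate(pragmas):
        # Every pragma changes the input code page...
        input_cp = code_page
        # ...but the first pragma leaves the output code page alone.
        if index > 0:
            output_cp = code_page

    text = raw.decode(CODECS[input_cp], errors="replace")
    return text.encode(CODECS[output_cp], errors="replace")


if __name__ == "__main__":
    cases = [
        ("no pragma", [], None),
        ("no pragma, /c65001", [], 65001),
        ("pragma 1252", [1252], None),
        ("pragma 1252, /c65001", [1252], 65001),
        ("two pragmas ending in 1252, /c65001", [65001, 1252], 65001),
        ("pragma 65001", [65001], None),
        ("pragma 65001, /c65001", [65001], 65001),
        ("two pragmas ending in 65001", [1252, 65001], None),
    ]
    for label, pragmas, cli in cases:
        out = compile_narrow_string(b"\xd3", pragmas, cli)
        print(f"{label}: {out.hex(' ')}")
```

Under these assumptions the model reproduces every byte sequence listed above, which is consistent with (but does not prove) the idea that the first `#pragma code_page` only changes the input code page.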