Encoding Analysis Tables

by Arnold Cross
14 May 2018

Content

These tables summarize how UTF-8 encoded Russian text is saved in the md and html files when an Rmd file is knitted in RStudio on Windows 7 with various values of the encoding setting. The tables after the first one present test results under different software configurations. The tests are explained in more detail in the report "knitr Encoding in RStudio on Win7" at https://github.com/crossaw/Knit-Win7-Encoding-Report.
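To make the table labels concrete, the following Python sketch shows what UTF-8 encoded Russian text looks like when its bytes are decoded with the wrong encoding. It is written in Python rather than R purely for illustration. It assumes that a label such as ISO8859-1, win1252 or Chinese means the UTF-8 bytes were interpreted under that encoding; the sample word and the helper name misread are hypothetical, and the linked report remains the authority on what each label means.

```python
# Illustrative sample only; any Cyrillic string behaves the same way.
text = "Привет"
utf8_bytes = text.encode("utf-8")


def misread(data, wrong_encoding):
    """Decode bytes under the given encoding, as a misconfigured tool would.

    Bytes that have no meaning in that encoding become U+FFFD.
    """
    return data.decode(wrong_encoding, errors="replace")


# Assumed mapping from the table's encoding settings to Python codec names.
settings = [
    ("UTF-8", "utf-8"),
    ("ISO8859-1", "latin-1"),
    ("WINDOWS-1252", "cp1252"),
    ("GB18030", "gb18030"),
]

for label, codec in settings:
    print(f"{label:13} {misread(utf8_bytes, codec)!r}")
```

Only the UTF-8 row reproduces the original word. The ISO8859-1 and WINDOWS-1252 rows show two Western characters for every Cyrillic letter, and the GB18030 row shows Chinese characters.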

Ideal

For perspective, this is what the results would be if RStudio and knitr were working as I would expect.

                  PASTED TEXT                         KNITTED TEXT
                  (pasted into the Rmd file)          (from a code chunk)
Encoding Setting  RStudio     md          html        md          html
----------------  ----------  ----------  ----------  ----------  ----------
[Ask]             ?           ?           ?           ?           ?
ISO8859-1         ISO8859-1   ISO8859-1   ISO8859-1   ISO8859-1   ISO8859-1
GB18030           Chinese     Chinese     Chinese     Chinese     Chinese
UTF-8             CORRECT     CORRECT     CORRECT     CORRECT     CORRECT
WINDOWS-1252      win1252     win1252     win1252     win1252     win1252

Most Common Results

This table shows the results under most of the test configurations. The specific configurations which produced these results are identified in the next table.

                  PASTED TEXT                         KNITTED TEXT
                  (pasted into the Rmd file)          (from a code chunk)
Encoding Setting  RStudio     md          html        md          html
----------------  ----------  ----------  ----------  ----------  ----------
[Ask]             ISO8859-1   CORRECT     ISO8859-1   CORRECT     ISO8859-1
ISO8859-1         ISO8859-1   CORRECT     ISO8859-1   CORRECT     ISO8859-1
GB18030           Chinese     win1252     Chinese     Strange     win1252
UTF-8             CORRECT     CORRECT     CORRECT     win1252     win1252
WINDOWS-1252      win1252     CORRECT     win1252     CORRECT     win1252

Configurations With Above Results

In this table, the configurations which produced the above results are identified as "YES". Configurations which were tested but produced different results are identified as "Different". Configurations which were not tested are identified as "Not tested". Impossible configurations are left blank.

                 64-bit Win7 Pro    32-bit Win7 Pro
                 (Dell E6430)       (Dell E6410)
                 RStudio v1.1.442   RStudio v1.1.442   RStudio v1.1.383
---------------  ----------------   ----------------   ----------------
64-bit R v3.4.4  YES
32-bit R v3.4.4  YES                YES                Not tested
64-bit R v3.4.0  YES
32-bit R v3.4.0  YES                Different          Different

32-bit Windows with R v3.4.0 and RStudio v1.1.442

This table shows the results on a Dell E6410 running 32-bit Windows 7 Pro with RStudio v1.1.442 set to use R v3.4.0.

                  PASTED TEXT                                KNITTED TEXT
                  (pasted into the Rmd file)                 (from a code chunk)
Encoding Setting  RStudio    md             html             md                       html
----------------  ---------  -------------  ---------------  -----------------------  -----------------------
[Ask]             ISO8859-1  ISO8859-1      modified 8859-1  ISO8859-1                modified 8859-1
ISO8859-1         ISO8859-1  ISO8859-1      modified 8859-1  ISO8859-1                modified 8859-1
GB18030           Chinese    Chinese        Chinese          win1252                  win1252
UTF-8             CORRECT    CORRECT        CORRECT          ISO8859-1                ISO8859-1
WINDOWS-1252      win1252    win1252 Twice  win1252 Twice    ISO8859-1, then win1252  ISO8859-1, then win1252
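The compound entries "win1252 Twice" and "ISO8859-1, then win1252" can be pictured as the same misreading applied more than once. The Python sketch below assumes that each pass means the text was saved as UTF-8 and then reopened under the wrong encoding; that reading of the labels is my assumption, and the helper name misread_chain and the sample word are hypothetical.

```python
def misread_chain(text, *wrong_codecs):
    """Save text as UTF-8, reopen it under a wrong codec, and repeat."""
    for codec in wrong_codecs:
        # Bytes that have no meaning in the codec become U+FFFD.
        text = text.encode("utf-8").decode(codec, errors="replace")
    return text


sample = "Привет"  # illustrative sample only

once = misread_chain(sample, "cp1252")
twice = misread_chain(sample, "cp1252", "cp1252")
latin_then_win = misread_chain(sample, "latin-1", "cp1252")

results = [
    ("win1252", once),
    ("win1252 Twice", twice),
    ("ISO8859-1, then win1252", latin_then_win),
]

for label, value in results:
    print(f"{label:24} {len(value):3d} chars  {value!r}")
```

Each pass lengthens the text: the six-letter sample becomes 12 characters after one pass and grows again on the second, which is one way to tell a double misreading from a single one.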

32-bit Windows with R v3.4.0 and RStudio v1.1.383

This table shows the results on a Dell E6410 running 32-bit Windows 7 Pro with RStudio v1.1.383 set to use R v3.4.0.

                  PASTED TEXT                            KNITTED TEXT
                  (pasted into the Rmd file)             (from a code chunk)
Encoding Setting  RStudio    md         html             md         html
----------------  ---------  ---------  ---------------  ---------  ---------------
All               ISO8859-1  ISO8859-1  modified 8859-1  ISO8859-1  modified 8859-1