0

I'm reading in an Excel file and trying to output test exactly as shown in Excel, i.e. the Display text rather than the underlying raw value.

For this purpose I'm using the DataFormatter as follows:

DataFormatter formatter = new DataFormatter();
String displayValue = formatter.formatCellValue(cell);

The problem is, the displayValue I get for this one Spreadsheet changes based on whether I run the POI code inside Tomcat or not. Outside Tomcat (in a UnitTest), the displayed text is: "29-Apr-21" When run inside my Tomcat Webapp, the value extracted is "29-Apr.-21"

I can see in the org.apache.poi.ss.usermodel.DataFormatter object, there's a member variable called 'dateSymbols' of type java.text.DateFormatSymbols This has a member array 'shortMonths' where the abbreviated months have periods in the Tomcat case, but not when run in the UnitTest.

My first suspicion was that there was some environmental Locale differences, but in BOTH cases the formatter.locale = "en_AU" . What other environmental differences could lead to the use of different month abbreviations?

Black
  • 5,023
  • 6
  • 63
  • 92

2 Answers2

0

If you do

DataFormatter formatter = new DataFormatter(); 

then DataFormatter uses the environment locale. That may be different in different environments.

If you always want the same result, you should construct DataFormatter using a given locale. For example:

DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("en", "AU"));

That always will lead to result "29-Apr-21" when Excel number format is DD-MMM-YY.

If using German locale for example:

DataFormatter dataFormatter = new DataFormatter(new java.util.Locale("de", "DE"));

then this will lead to "29-Apr.-21".

Axel Richter
  • 56,077
  • 6
  • 60
  • 87
  • As stated in my question, the default locale of the DataFormatter is en_AU in both cases, so some other environmental difference else is influencing the shortMonths format. Explicitly setting the locale to en_AU makes no difference – Black Jul 13 '21 at 23:14
0

I'm answering this in case it helps anyone as it's certainly an unexpected discovery.

The difference between the environments here is the use of a different JVM.

  • Using the standard Oracle Java 1.8 JDK results in shortMonth values without periods, e.g. "Apr"
  • Using the same code with OpenJDK DCEVM-11.0.7+4 results in shortMonth values without periods, e.g. "Apr."

Again, I stress that the Locale is the same in both cases - even inspecting these values produces the same results in both cases:

System.out.println(locale.getDisplayVariant());
System.out.println(locale.getDisplayLanguage());
System.out.println(locale.getLanguage());
System.out.println(locale.getDisplayName());
System.out.println(locale.getCountry());
System.out.println(locale.getVariant());

I'm unsure whether this is considered a 'bug' in DCEVM, but it's something people should know about.

Black
  • 5,023
  • 6
  • 63
  • 92
  • 2
    One suggestion: There have been changes between Java 8 and Java 11 regarding which locale data provider the JVM uses (i.e. which data files it loads its locale reference data from ). You may want to experiment with `-Djava.locale.providers=...` and use different values (`COMPAT`,`CLDR`,`SPI`) to see how those affect your output. See [this answer](https://stackoverflow.com/a/65221042/12567365) for more information. – andrewJames Jul 13 '21 at 23:50