There are already several posts on Stack Overflow about converting a file to a specific charset in Java, but in my case I receive a CSV file from a third-party application, and I then need to process its data and store it in a database.

My system accepts CSV files in UTF-8, but sometimes I receive files that are not in UTF-8, and I have no way of knowing which charset they use.

To convert an input CSV file to UTF-8, it is important to know the source file's encoding, and in my case I cannot determine it. I have tried InputStreamReader.getEncoding() and the IOUtils classes, but they report CP-1252 for both UTF-8 and non-UTF-8 files.
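One possible explanation for the identical result: InputStreamReader.getEncoding() returns the name of the encoding the reader was configured with (the platform default when none is passed), and it does not inspect the bytes. The sketch below illustrates this under that assumption; the class and method names are hypothetical, and ISO-8859-1 stands in for whatever the default was on the machine in question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class GetEncodingDemo {

    // Returns the charset the reader reports for the given bytes.
    // The result depends only on 'configured', never on 'data'.
    static Charset reportedCharset(byte[] data, Charset configured) {
        try (InputStreamReader reader =
                 new InputStreamReader(new ByteArrayInputStream(data), configured)) {
            // getEncoding() may return a historical name such as "UTF8" or
            // "ISO8859_1", so resolve it back to a Charset before comparing.
            return Charset.forName(reader.getEncoding());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Both calls report the configured charset, whatever the bytes contain.
        System.out.println(reportedCharset(utf8Bytes, StandardCharsets.ISO_8859_1));
        System.out.println(reportedCharset(latin1Bytes, StandardCharsets.ISO_8859_1));
    }
}
```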

So, is there any way to convert the incoming CSV file to UTF-8 without having to identify its original encoding?

Sumit Ghosh
  • Take a look at https://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream – Roman Puchkovskiy Jan 14 '18 at 17:02
  • Are you saying you need your program to guess every time or that you just need to guess one encoding from one or more samples and then code your program to use that? Is it possible to configure/ask the sending system/person to always use one encoding that you tell them to or to get them to say each time? Regarding guessing: CP437 would always work and would possibly always be wrong; ISO 8859-1 would also always work and might be less likely to be wrong; … UTF-8 might not always work. – Tom Blodget Jan 16 '18 at 23:18
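The guessing approach from the comment above could be sketched roughly as follows: decode the bytes strictly as UTF-8, and only if that fails fall back to a single-byte charset that accepts any byte sequence. This is a heuristic under stated assumptions, not a reliable detector. The class and method names are hypothetical, the fallback charset is a guess the caller has to make, and a fallback decode can succeed while still producing the wrong characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class Utf8OrFallback {

    // Decodes 'data' as UTF-8 if it is well-formed, otherwise with 'fallback'.
    static String decode(byte[] data, Charset fallback) {
        try {
            // REPORT makes the decoder throw on invalid input instead of
            // silently substituting U+FFFD replacement characters.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the fallback charset (a guess).
            return new String(data, fallback);
        }
    }

    // Re-encodes the input as UTF-8, whichever branch decoded it.
    static byte[] toUtf8(byte[] data, Charset fallback) {
        return decode(data, fallback).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] converted = toUtf8(latin1Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(new String(converted, StandardCharsets.UTF_8));
    }
}
```

Files that are already valid UTF-8 pass through unchanged, so the same call can be applied to every incoming file. Pure ASCII content decodes identically either way.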

0 Answers