I have a table with a column named Machining that holds text in two languages.

Machining
-------------------------------
Ho?t d?ng do trên máy do t?a d?
B?t d?u ho?t d?ng trong
Measuring operation on the coordinate measuring machine
Operation initiation

How can I identify which language each row is in, for example by adding a new column called Language?

Machining                                                 Language
----------------------                                    -----------
Ho?t d?ng do trên máy do t?a d?                           Vietnamese
B?t d?u ho?t d?ng trong                                   Vietnamese
Measuring operation on the coordinate measuring machine   English
Operation initiation                                      English
TKN
  • How do you know if a string is "English" or not? You will need a CASE WHEN, but the condition is unclear. According to your sample data, all strings including a "?" are Vietnamese, but I guess that's just because of incomplete sample data? You must tell us the exact condition if you want an answer, please. The string "B?t d?u ho?t d?ng trong " does not contain any non English letters or characters, so how to know this is not an English string? – Jonas Metzler Nov 28 '22 at 07:26
  • `Ho?t d?ng do trên máy do t?a d?` certainly isn't Vietnamese, it's just gibberish, presumably because you've *tried* to store Vietnamese in a `varchar` column that doesn't support the characters you need. – Thom A Nov 28 '22 at 12:00
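
The character loss described in the last comment can be reproduced outside the database. The sketch below is only an illustration in Python: it assumes the `varchar` column uses a Windows-1252 style code page (the question does not say which one), and `simulate_varchar` is a hypothetical helper, not a database function. The real conversion in the database may differ in detail.

```python
def simulate_varchar(text: str, codepage: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page.

    Characters the code page cannot represent are replaced by '?',
    which is roughly what a non-Unicode column does on insert.
    The code page is an assumption, not taken from the question.
    """
    return text.encode(codepage, errors="replace").decode(codepage)


# "Ho\u1ea1t" is the Vietnamese word with U+1EA1 (a with dot below).
print(simulate_varchar("Ho\u1ea1t"))            # Ho?t

# Western European letters such as U+00EA and U+00E1 exist in cp1252,
# so they survive the round trip unchanged.
print(simulate_varchar("tr\u00ean m\u00e1y"))
```

Once a character has been replaced by `?`, the original cannot be recovered from the stored value, so the data has to be reloaded into a Unicode column.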

1 Answer

You have to provide the language when you insert the data; the database has no way of knowing which language a piece of text is in.

For example, when you insert the data into the database, each statement should set the language explicitly:

    insert into table_name (Machining, Language) values ('Ho?t d?ng do trên máy do t?a d?', 'Vietnamese');
    insert into table_name (Machining, Language) values ('Operation initiation', 'English');
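
A minimal runnable sketch of the same idea, using Python's built-in `sqlite3` module as a stand-in for the real database (the question does not name the engine). The table and column names come from the answer above; the column types are assumptions.

```python
import sqlite3

# An in-memory SQLite database is only a stand-in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (Machining TEXT NOT NULL, Language TEXT NOT NULL)"
)

# The language is supplied by the application at insert time.
rows = [
    ("B?t d?u ho?t d?ng trong", "Vietnamese"),
    ("Operation initiation", "English"),
]
# Parameter placeholders avoid quoting mistakes in the values.
conn.executemany(
    "INSERT INTO table_name (Machining, Language) VALUES (?, ?)", rows
)
conn.commit()

stored = conn.execute(
    "SELECT Machining, Language FROM table_name ORDER BY rowid"
).fetchall()
for machining, language in stored:
    print(f"{machining:<30}{language}")
```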

Hyde
  • Hello @Hyde, the database actually has a single field (Machining) of string type, and users save both language versions of the text in it. We have managed to split the text so that each line is its own row in a temp table. The remaining question is how to identify which rows are English and which are non-English. – TKN Nov 28 '22 at 07:02
  • Detecting the language of a piece of text is not something a database can do. If you do not want to label the rows manually, or the data volume is too large for that, you can try an AI tool; for example, see this list: https://stackoverflow.com/questions/39142778/how-to-determine-the-language-of-a-piece-of-text – Hyde Nov 29 '22 at 01:03
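
Short of a real language-detection tool, a crude rule can at least separate the sample rows. The sketch below is a heuristic, not language detection: it assumes, based only on the sample data in the question, that non-English rows contain either non-ASCII letters or a `?` left inside a word by the lossy storage. `guess_language` is a hypothetical helper and will mislabel English text that happens to break these assumptions.

```python
import re

# A '?' squeezed between two letters, as in "B?t", suggests a character
# that was lost when the text was stored (assumption from the sample data).
LOST_CHAR = re.compile(r"[A-Za-z]\?[A-Za-z]")


def guess_language(text: str) -> str:
    """Label a row 'English' or 'Non-English' using two crude signals."""
    has_non_ascii = any(ord(ch) > 127 for ch in text)
    has_lost_char = LOST_CHAR.search(text) is not None
    return "Non-English" if has_non_ascii or has_lost_char else "English"


samples = [
    "B?t d?u ho?t d?ng trong",
    "Measuring operation on the coordinate measuring machine",
    "Operation initiation",
]
for line in samples:
    print(f"{line:<58}{guess_language(line)}")
```

An ordinary question mark at the end of an English sentence does not trigger the rule, because it is not followed by a letter.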