
I have a MATLAB code base whose comments are written in Swedish. It looks something like this:

% Syntax: result = ocr(DOC, METHOD, fname)
% DOC - bild som ska processas
% METHOD - ann eller knear
% fname - full filename of the net ('ann' method) or the database 
%         ('knear' method)
%         default: ann20.mat resp db4000.mat
function result = ocr(DOC, METHOD, fname)

% Segmentera bilden
disp('Segmenting...');
[ROWB, CH] = segment(DOC, 0.99, 0.99);

% Analysera den 
switch lower(METHOD)
  case 'ann', 
    % ladda in neuronnät, inför NET, E, CP
    if isempty(fname)
      load ./db/ann50.mat;
    else
      load(fname);
    end

Google Translate came to my rescue. Here is the result of pasting the code into the translation box, which is fairly satisfactory:

% Syntax: result = ocr (DOC, METHOD, fname)
% DOC - image to be processed
% METHOD - ann or knear
% Fname - full filename of the net ('ann' method) or the database
% ('Knear' method)
% Default: ann20.mat respectively db4000.mat
function result = ocr (DOC, METHOD, fname)

Segment image%
disp ('Segmenting ...');
[ROWB, CH] = segment (DOC, 0.99, 0.99);
% Analyze the
switch lower (METHOD)
   case 'ann'
     % Load the neural networks, for NET, E, CP
     f isempty (fname)
       ./db/ann50.mat load;
     else
       load (fname);
     end
  1. Can I automate this process for a multi-file code base, and if so, how?
  2. How can I deal with errors such as the conversion of "% Segmentera bilden" to "Segment image%"?
Gary Barrett
Saurabh Kumar

1 Answer


You cannot rely on Google Translate to keep the % at the start of the line; it is known to move punctuation around and even to merge or split lines. It may also try to translate keywords or variable names. For a reliable solution, write a small helper script, e.g. in Ruby or whichever quick-and-dirty programming language you prefer.

This helper script should:
* go through every file in the code base (make a backup first)
* look at every line without evaluating it
* extract everything after a % and feed that into Google Translate (a separate request for each comment, to prevent mix-ups)
* replace the Swedish comment in the file with the Google translation
* save the file and move on to the next one
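
The steps above could be sketched roughly as follows, here in Python rather than Ruby. This is only a minimal sketch under a few assumptions: `translate` is a hypothetical callable that you supply yourself (for example, a thin wrapper around whatever translation service you have access to), the files are assumed to be UTF-8 (older MATLAB files may well be Latin-1, so the encoding is a parameter), and the comment detection is a simple heuristic that skips a % inside a single-quoted string but does not handle the transpose operator or %{ ... %} block comments. Because only the text after the % is sent for translation, the % itself cannot be moved or dropped.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional, Tuple


def split_comment(line: str) -> Tuple[str, Optional[str]]:
    """Split a MATLAB line into (code, comment); comment is None if absent.

    Heuristic: the comment starts at the first '%' that is not inside a
    single-quoted string. Transpose (') and block comments are not handled.
    """
    in_string = False
    for i, ch in enumerate(line):
        if ch == "'":
            in_string = not in_string
        elif ch == "%" and not in_string:
            return line[:i], line[i + 1:]
    return line, None


def translate_line(line: str, translate: Callable[[str], str]) -> str:
    """Translate only the comment text; code and the '%' stay untouched."""
    code, comment = split_comment(line)
    if comment is None:
        return line
    # Keep extra '%' (as in '%%' cell markers) and indentation as they are.
    stripped = comment.lstrip("% \t")
    lead = comment[: len(comment) - len(stripped)]
    text = stripped.rstrip()
    if not text:
        return line
    return f"{code}%{lead}{translate(text)}"


def translate_file(path: Path, translate: Callable[[str], str],
                   encoding: str = "utf-8") -> None:
    """Rewrite one file in place, one translation request per comment."""
    lines = path.read_text(encoding=encoding).splitlines()
    out = [translate_line(line, translate) for line in lines]
    path.write_text("\n".join(out) + "\n", encoding=encoding)


def translate_tree(root: str, translate: Callable[[str], str],
                   encoding: str = "utf-8") -> None:
    """Back up and translate every .m file below root."""
    for path in sorted(Path(root).rglob("*.m")):
        shutil.copy2(path, path.with_name(path.name + ".bak"))  # backup first
        translate_file(path, translate, encoding)
```

To use it, you would call something like `translate_tree("path/to/codebase", my_translate)`, where `my_translate` is your own function that takes a Swedish string and returns the English one. Each file gets a `.bak` copy next to it before being rewritten, so a bad translation run can be undone.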

Sprachprofi