How is text processed when a .doc file is converted to a .pdf file? I tried to intercept the "Tj" operator using PDFBox. The sentence "interchange features of PDF. Again, the resulting PDF file can be viewed with a viewer application, such as " is broken into
"interchange features of PDF. Agai" and "n, the resulting PDF file can be viewed with a viewer application, such as ". The arguments to the TJ operator were
[COSArray{[COSString{in}, COSInt{5}, COSString{t}, COSInt{5}, COSString{er}, COSInt{-4}, COSString{ch}, COSInt{5}, COSString{an}, COSInt{4}, COSString{g}, COSInt{5}, COSString{e }, COSInt{-2}, COSString{f}, COSInt{10}, COSString{eat}, COSInt{5}, COSString{ur}, COSInt{10}, COSString{es o}, COSInt{6}, COSString{f }, COSInt{-2}, COSString{P}, COSInt{6}, COSString{DF}, COSInt{6}, COSString{.}, COSInt{13}, COSString{ Ag}, COSInt{3}, COSString{ai}]}] and
[COSArray{[COSString{n, t}, COSInt{6}, COSString{he }, COSInt{10}, COSString{r}, COSInt{-2}, COSString{esu}, COSInt{5}, COSString{lt}, COSInt{8}, COSString{in}, COSInt{5}, COSString{g}, COSInt{5}, COSString{ P}, COSInt{4}, COSString{DF}, COSInt{6}, COSString{ f}, COSInt{-2}, COSString{il}, COSInt{5}, COSString{e }, COSInt{8}, COSString{ca}, COSInt{4}, COSString{n b}, COSInt{3}, COSString{e }, COSInt{8}, COSString{view}, COSInt{9}, COSString{ed wit}, COSInt{6}, COSString{h a}, COSInt{14}, COSString{ v}, COSInt{-3}, COSString{ie}, COSInt{12}, COSString{we}, COSInt{8}, COSString{r}, COSInt{8}, COSString{ app}, COSInt{5}, COSString{li}, COSInt{5}, COSString{ca}, COSInt{4}, COSString{t}, COSInt{5}, COSString{io}, COSInt{7}, COSString{n, s}, COSInt{6}, COSString{uc}, COSInt{5}, COSString{h as}, COSInt{7}, COSString{ }]}]
Is this because of the way a .doc is converted into a PDF, or is it because of the text blocks referred to in the last answer of this question? What is the significance of the COSInt values between the COSString values? I don't really understand text blocks, but I don't think there should be a problem if I try to intercept the Tj operator. Would it be the same if I processed a PDF created from a PDF file?
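
To make the question concrete, here is a minimal sketch in plain Java, without PDFBox, that models the first array above as a list of strings and integers. The class and method names (TjArrayDemo, joinText, sumAdjustments) are hypothetical. It assumes the integers are glyph position adjustments, which the PDF specification describes as thousandths of a text space unit, and not part of the text, so only the string elements are joined:

```java
import java.util.Arrays;
import java.util.List;

public class TjArrayDemo {
    // The first TJ array from the question: strings are text, integers are assumed to be adjustments.
    static final List<Object> FIRST = Arrays.asList(
        "in", 5, "t", 5, "er", -4, "ch", 5, "an", 4, "g", 5, "e ", -2,
        "f", 10, "eat", 5, "ur", 10, "es o", 6, "f ", -2, "P", 6, "DF", 6,
        ".", 13, " Ag", 3, "ai");

    // Concatenate only the string elements; numeric elements are skipped.
    static String joinText(List<Object> tjArray) {
        StringBuilder sb = new StringBuilder();
        for (Object element : tjArray) {
            if (element instanceof String) {
                sb.append((String) element);
            }
        }
        return sb.toString();
    }

    // Sum the numeric elements, in the array's own units (assumed: thousandths of a text space unit).
    static int sumAdjustments(List<Object> tjArray) {
        int total = 0;
        for (Object element : tjArray) {
            if (element instanceof Number) {
                total += ((Number) element).intValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(joinText(FIRST));       // prints "interchange features of PDF. Agai"
        System.out.println(sumAdjustments(FIRST)); // prints 75
    }
}
```

Under that assumption the numbers only nudge glyph positions, so joining the strings recovers the text fragment. It does not explain why the sentence is split across two TJ calls.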