I am writing Kotlin code in Android Studio. The user chooses a file from the phone, and I need to access its content as a String. The file picker gives me a Uri?. With that Uri? I can extract text from .csv and .txt files:

if (typeOfFile == ".txt" || typeOfFile == ".csv") {
    try {
        val ins: InputStream? = contentResolver?.openInputStream(uriFromSelectedFile)
        val reader = BufferedReader(ins!!.reader())
        textIWant = reader.readText()

...
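
For reference, the stream-reading step above could be written as a small helper that also closes the stream when it is done. This is only a minimal sketch: `readAllText` is a hypothetical name that does not appear in the original code, UTF-8 is an assumed encoding, and a `ByteArrayInputStream` stands in for the stream returned by `contentResolver.openInputStream(...)` so that the snippet runs outside Android.

```kotlin
import java.io.InputStream

// Hypothetical helper (not part of the original post): reads the whole stream
// as UTF-8 text and closes it afterwards, even if reading throws.
fun readAllText(input: InputStream): String =
    input.bufferedReader(Charsets.UTF_8).use { it.readText() }

fun main() {
    // An in-memory stream stands in for contentResolver.openInputStream(uri).
    val sample = "name,age\nAda,36".byteInputStream()
    println(readAllText(sample))
}
```

Wrapping the read in `use` avoids leaking the stream, which the snippet above leaves open.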

Detecting the file type also works fine, but nothing seems to work when it comes to opening PDF files. I tried using Apache PDFBox in various ways. The PDF I am trying to open is a simple one-pager that contains only extractable text (it can be copied), like this pdf.

This is one of the things I tried. The phone freezes when the file to open is a PDF:

if (typeOfFile == ".pdf") {
    try {
        val myPDDocument: PDDocument = PDDocument(COSDocument(ScratchFile(File(uriFromSelectedFile.path))))
        textIWant = PDFTextStripper().getText(myPDDocument)

...

I've been trying for days. Does anyone know how to do this in Kotlin?

Sciveo
  • Please include the code that you tried, and link to the PDF. Make sure that the PDF has text to extract (try copy & paste from Adobe Reader) – Tilman Hausherr Jul 27 '20 at 06:54
  • Hope it's better now – Sciveo Jul 27 '20 at 22:27
  • Your code looks very suspicious. In Java, loading a PDF is "PDDocument.load(file)". What you're doing is different. I don't know Kotlin, so please find how to call a static method. Seems it isn't easy: https://stackoverflow.com/questions/40352684/ – Tilman Hausherr Jul 28 '20 at 06:21
  • If you managed to correct your code so that the PDF opens and text stripping works, please answer the question yourself, so this will help others. – Tilman Hausherr Aug 01 '20 at 10:58
  • I will as soon as it works. Right now I still have no clue how to do it. I am continuing with more research and trying... – Sciveo Aug 01 '20 at 11:51
  • Maybe this could also help? https://stackoverflow.com/questions/34588117/ – Tilman Hausherr Aug 02 '20 at 11:34

1 Answer

It worked using tom_roush.pdfbox (the Android port of PDFBox) and a companion object:

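import com.tom_roush.pdfbox.pdmodel.PDDocument
import com.tom_roush.pdfbox.text.PDFTextStripper
import java.io.InputStream

class MainActivity : AppCompatActivity() {

    companion object PdfParser {
        // Loads the PDF from the stream and returns its text.
        // Returns an empty string if the document is encrypted.
        fun parse(fis: InputStream): String {
            var content = ""
            // use {} closes the document even if text extraction throws.
            PDDocument.load(fis).use { pdfDocument ->
                if (!pdfDocument.isEncrypted) {
                    content = PDFTextStripper().getText(pdfDocument)
                }
            }
            return content
        }
    }

    // rest of the activity omitted
}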

Calling the parse function of the companion object:

val fis: InputStream = contentResolver?.openInputStream(uriFromSelectedFile)!!
textIWant = parse(fis)
Sciveo