I am using some great code taken from StackOverflow and converted to VB NET to:
- Extract all rows of text from all speadsheets in an excel .xlsx file
- Append each row to a StringBuilder.
The code runs really well, but my stringbuilder entries are split every 1026 characters.
i.e. Text reads
..This is Sta
ckOverflow..
instead of
'..This is StackOverflow..' and splitting at a natural row break.
Imports System.Text
Imports System.Linq
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Spreadsheet
Public Class clsParseXLS
Public Shared Sub parseXLSX(strFileName As String, sbTxtFromFile As StringBuilder)
Dim intFirst As Integer = 1
Try
Using spreadsheetDocument__1 As SpreadsheetDocument = SpreadsheetDocument.Open(strFileName, False)
Dim workbookPart As WorkbookPart = spreadsheetDocument__1.WorkbookPart
For Each worksheetPart As WorksheetPart In workbookPart.WorksheetParts
Dim reader As OpenXmlReader = OpenXmlReader.Create(worksheetPart)
While reader.Read()
If reader.ElementType Is GetType(Row) Then
reader.ReadFirstChild()
Do
If reader.ElementType Is GetType(Cell) Then
Dim c As Cell = DirectCast(reader.LoadCurrentElement(), Cell)
If c.DataType IsNot Nothing AndAlso c.DataType.Value.ToString = "SharedString" Then
Dim ssi As SharedStringItem = workbookPart.SharedStringTablePart.SharedStringTable.Elements(Of SharedStringItem)().ElementAt(Integer.Parse(c.CellValue.InnerText))
If Not ssi Is Nothing Then
If intFirst = 1 Then
sbTxtFromFile.Append(ssi.Text.Text)
intFirst = 2
Else
sbTxtFromFile.Append(Environment.NewLine & reader.GetText())
intFirst = 2
End If
End If
Else
If Not c.CellValue Is Nothing Then
If intFirst = 1 Then
sbTxtFromFile.Append(c.CellValue.InnerText)
intFirst = 2
Else
sbTxtFromFile.Append(Environment.NewLine & c.CellValue.InnerText)
intFirst = 2
End If
End If
End If
'Console.Out.Write("{0}: {1} ", c.CellReference, cellValue)
End If
Loop While reader.ReadNextSibling()
'Console.Out.WriteLine()
End If
End While
Next
End Using
Catch ex As Exception
MsgBox(ex.ToString)
End Try
End Sub
End Class
Does anyone know what I can do to remedy the situation and split on the rows?
Any help anyone can offer is most gratefully appreciated.
Many thanks to @Hans for posting the original code I am using. Using OpenXmlReader