0

I have a requirement to read an excel document and place the data contained in the cells in the database. However, I problem I notice is that when I got to read the data out of the rows, they do not come out in the order they appear in the form. How do I work this out please?

public void getrowdata(){

 IEnumerable<Row> dataRows = from row in s.worksheetpart.Worksheet.Descendants<Row>()
                                                where row.RowIndex > 6
                                                select row;

// extract the data in the row in order 
                foreach (Row row in dataRows)
                { 
                    var cellValues = from cell in row.Descendants<Cell>()
                                     select ((cell.CellValue != null && cell.DataType != null && cell.DataType.HasValue)
                                         && (sharedString.HasChildren && int.Parse(cell.CellValue.InnerText) < sharedString.ChildElements.Count)
                                         ? sharedString.ChildElements[int.Parse(cell.CellValue.InnerText)].InnerText
                                         : ((cell.CellValue != null && cell.CellValue.InnerText != null) ? cell.CellValue.InnerText : String.Empty));


//--cellValues.toArray() and then access each cell via index in array


}

  public void ReadDSheetsToBuffer()
        {
            try
            { 
                //Open the Excel workbook.
                using (SpreadsheetDocument document = SpreadsheetDocument.Open(file.FullName, true))
                {
                    //References to the workbook and Shared String Table.
                    workBook = document.WorkbookPart.Workbook;
                    workSheets = workBook.Descendants<Sheet>();
                    sharedStrings = document.WorkbookPart.SharedStringTablePart.SharedStringTable; 
                    ExtractSheetstoMemory2(document);

                }
            }
            catch (Exception ex)
            {
                throw ex.GetBaseException();
            }
        }

Sample File found at Sample Excel File I read with the code

And below is the type of way I access the values stored up in the cells in the row. . .

 if (values[228] != null)
                itemdetail.Custom1 = rowvalues[228].Trim();
            if (values[229] != null)
                itemdetail.Custom2 = rowvalues[229].Trim();
            if (values[230] != null)
                itemdetail.Custom3 = rowvalues[230].Trim();
            if (values[231] != null)
                itemdetail.Custom4 = rowvalues[231].Trim();
            if (values[232] != null)
                itemdetail.Custom5 = rowvalues[232].Trim();
            if (values[233] != null)
                itemdetail.Custom6 = rowvalues[233].Trim();

My Attempt at using cell reference to access the cell innertext

foreach (Row row in dataRows)
{
    if (row.RowIndex > 6)
    {
        String theCell = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(1) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell2 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(2) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell3 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(3) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell4 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(4) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell5 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(5) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell6 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(6) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell7 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(7) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell8 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(8) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell9 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(9) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell10 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(10) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell11 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(11) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell112 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(12) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell13 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(13) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell14 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(14) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
        String theCell15 = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(15) + row.RowIndex.ToString()).FirstOrDefault().InnerText;
    }
}
Matt Ellen
  • 11,268
  • 4
  • 68
  • 90
Kobojunkie
  • 6,375
  • 31
  • 109
  • 164
  • Could you upload an example excel file? So I will have a look at it. – Hans Oct 04 '12 at 18:34
  • I have uploaded the file to http://www.filedropper.com/dcttestdata – Kobojunkie Oct 04 '12 at 19:12
  • The row number is the "r" attribute of the "x:row" element. You probably have to use that to map them back into their proper position and not assume that they are using storage order to implicitly position them. – RBarryYoung Oct 04 '12 at 19:46
  • I am not necessarily concerned about row I am processing but the other in which the values in the row are processed. – Kobojunkie Oct 04 '12 at 20:10
  • Is there some sort of row indexing feature that would allow me access elements in the row directly to get at the value without necessarily needing to iterate throuugh the row, one cell at a time? – Kobojunkie Oct 04 '12 at 20:11
  • I would appreciate any help here please. I need to read about 245 cells in each row and I need to find a way to read them in the correct order please. – Kobojunkie Oct 04 '12 at 23:33
  • The fastest way to do this is to process the whole structure yourself once, building your own indexes and structures as you go. – RBarryYoung Oct 05 '12 at 04:11
  • I don't understand what you mean. Process it once in what way? – Kobojunkie Oct 05 '12 at 04:15
  • @Kobojunkie: I mean, read in the whole XML document into your own structure, indexed in whatever way is convenient for you. – RBarryYoung Oct 09 '12 at 19:16

1 Answers1

0

You need to look at the CellReference property (which is the r attribute for the cell in the XML file) as you iterate over the individual cells. This contains the cell's address in A1 format where A is the column and 1 is the row.

Column 233 on, for example, row 68 would be HY68. You may find this question useful in how to generate the column letters. You could then examine the CellReference for each cell which was a descendant of the row you were checking, extract the column letters and use a switch statement to populate the relevant part of your ItemDetail

The only obvious way I can see to avoid iterating over the individual cells might be to use XPath to extract them directly from the underlying XML/.xslx file. Not sure how to do that in C# with the SpreadsheetDocument object and its descendants though.

You could also order one of your existing queries by the CellReference property which would get the cells in order but might cause problems with empty cells (i.e the 10th cell returned might not be the cell from column 10)


edit: dealing with missing cells/null references just requires you to check the reference before accessing the .InnerText property.

Cell theCell = row.Descendants<Cell>().Where(c => c.CellReference == ExcelColumnFromNumber(1) + row.RowIndex.ToString()).FirstOrDefault();

String theCellValue = "";

if (theCell != null)
{
  theCellValue = theCell.InnerText;
}

It would probably be tidier to wrap that lot in a function that takes the Cell as a parameter and returns a String either containing the InnerText or an empty string

Community
  • 1
  • 1
barrowc
  • 10,444
  • 1
  • 40
  • 53
  • I tried accessing via cellreference, but it seems when it hits the "missing" cells, it throws a 'null reference' exception. And I have at least noticed that it happens on column G of some(not all)of my rows. So accessing via cell reference seems a difficult approach or do you have a certain way it should be done? Please see my example. – Kobojunkie Oct 05 '12 at 04:36
  • Updated the answer with a way of dealing with null references – barrowc Oct 06 '12 at 20:04