20

I've been looking around for a while now and cannot seem to find out how to do this. I've got an excel sheet, which I'm reading using OpenXML. Now the normal thing would be to loop through the rows and then loop through the cells to get the values, which is fine. But along with the values I need the location of the cell, which would be in the format (rowindex, ColumnIndex). I've managed to get the rowIndex, but cant seem to figure out getting the column Index.

I actually thought this was going to be easy but apparently it isnt.

QV1
  • 529
  • 2
  • 5
  • 17
  • 2
    For future readers, I recommend [seeing this answer](http://stackoverflow.com/a/667902/906773). – Jesse Good Oct 31 '15 at 22:01
  • https://learn.microsoft.com/en-us/openspecs/office_standards/ms-oi29500/044c2ba4-098b-4b15-b960-3f7f972665df – muhmud Jan 20 '22 at 09:05

11 Answers11

32

This is slightly trickier than you might imagine because the schema allows for empty cells to be omitted.

To get the index you can use the Cell object wihch has a CellReference property that gives the reference in the format A1, B1 etc. You can use that reference to extract the column number.

As you probably know, in Excel A = 1, B = 2 etc up to Z = 26 at which point the cells are prefixed with A to give AA = 27, AB = 28 etc. Note that in the case of AA the first A has a value of 26 times the second; i.e. it is "worth" 26 whilst the second A is "worth" 1 giving a total of 27.

To work out the column index you can reverse the letters then take the value of the first letter and add it to a running total. Then take the value of the second letter and multiply it by 26, adding the total to the first number. For the third you multiply it by 26 twice and add it, for the fourth multiply it by 26 3 times and so on.

So for column ABC you would do:

C = 3
B = 2 * 26 = 52
A = 1 * 26 *26 = 676
3 + 52 + 676 = 731

In C# the following will work:

private static int? GetColumnIndex(string cellReference)
{
    if (string.IsNullOrEmpty(cellReference))
    {
        return null;
    }

    //remove digits
    string columnReference = Regex.Replace(cellReference.ToUpper(), @"[\d]", string.Empty);

    int columnNumber = -1;
    int mulitplier = 1;

    //working from the end of the letters take the ASCII code less 64 (so A = 1, B =2...etc)
    //then multiply that number by our multiplier (which starts at 1)
    //multiply our multiplier by 26 as there are 26 letters
    foreach (char c in columnReference.ToCharArray().Reverse())
    {
        columnNumber += mulitplier * ((int)c - 64);

        mulitplier = mulitplier * 26;
    }

    //the result is zero based so return columnnumber + 1 for a 1 based answer
    //this will match Excel's COLUMN function
    return columnNumber + 1;
}

Note that the CellReference is not guaranteed to be in the XML either (although I've never seen it not there). In the case where the CellReference is null the cell is placed in the leftmost available cell. The RowIndex is also not mandatory in the spec so it too can be omitted in which case the cell is placed in the highest row available. More information can be seen in this question. The answer from @BCdotWEB is correct approach in cases where the CellReference is null.

petelids
  • 12,305
  • 3
  • 47
  • 57
  • You could start with columnNumber = 0 and do not add one when returning. – Sergii Bogomolov Jul 04 '16 at 14:51
  • What would happen with a Z in the second letter? Wouldn't it be the same as A on the third? So `Z = 26*26` and `A = 1*26*26`. I'm asking since I'm trying to do the reverse that is going from cell index to cell name. – bostero2 Dec 06 '17 at 10:39
  • A `Z` in the second letter is "worth" 26. That would be added to 26 times the value of the first letter. E.g. `ZZ` would be `(26 * 26) + 26 = 702`. An `A` in the 3rd column is "worth" `1`. That would be added to 26 times the value of the second letter and 26 * 26 times the value of the first. E.g `AAA` would be `(26 * 26 * 1) + (26 * 1) + 1 = 703`. – petelids Dec 06 '17 at 14:00
  • @bostero2 BTW, if that's still not clear, create a new question and ping me on here and I'll write you up an answer on how to do the reverse. – petelids Dec 07 '17 at 08:28
  • @petelids Yeah, figured it out after posting the comment. Thanks! – bostero2 Dec 07 '17 at 11:25
12

Small is beautiful

int ColumnIndex(string reference)
{
  int ci=0;
  reference=reference.ToUpper();
  for (int ix = 0; ix < reference.Length && reference[ix] >= 'A';ix++ ) 
       ci += (ix * 26) + ((int)reference[ix] - 64);
  return ci;
}
U. Windl
  • 3,480
  • 26
  • 54
6
    public static void CellReferenceToIndex(string reference, out int row_index, out int col_index)
    {
        row_index = 0;
        col_index = 0;

        foreach(char c in reference)
        {
            if (c >= '0' && c <= '9')
            {
                row_index = row_index * 10 + (c - '0');
            }
            if (c >= 'A' && c <= 'Z')
            {
                col_index = col_index * ('Z' - 'A' + 1) + (c - 'A' + 1);
            }
        }
    }
Joe
  • 61
  • 1
  • 2
  • 1
    Providing an answer to a 5 year old question that has a handful of other answers (with many upvotes) does not seem particularly useful. – bichito Apr 03 '20 at 16:51
  • 1
    Maybe not but I do like this answer better. Easier to read and way more compact. – Neil Bostrom Oct 28 '21 at 14:28
5
    [TestCase( 1, 0, "A1" )]
    [TestCase( 2, 25, "Z2" )]
    [TestCase( 2, 38, "AM2" )]
    [TestCase( 2, (26 * 4) + 1, "DB2" )]
    [TestCase( 2, (26 * 26 * 26 * 18) + (26 * 26 * 1) + (26 * 26 * 1) + ( 26 * 1 ) + 2, "RBAC2" )]
    public void CanGetCorrectCellReference( int row, int column, string expected )
        => GetCellReference( (uint)row, (uint)column ).Value.ShouldEqual( expected );

    public static StringValue GetCellReference( uint row, uint column ) =>
        new StringValue($"{GetColumnName("",column)}{row}");

    static string GetColumnName( string prefix, uint column ) => 
        column < 26 ? $"{prefix}{(char)( 65 + column)}" : 
        GetColumnName( GetColumnName( prefix, ( column - column % 26 ) / 26 - 1 ), column % 26 );
3

To start answer , I invite you to look at this first.

As I have explained there is NO easy way to extract Row and Column. The closest you get is the extraction of CellReference of a cell which would have the form of A1 , B2 which is actualy COLUMN_ROW format.

What you can do is extract Row and Column from the CellReference. Yes this would need you to implement a method where you need to check char by charto verify for numbers and strings.

Lets say you have A11 , then when you need to index column you need to extract A which would give as column 1. Yes it's not that easy, but it's the only way unless you simply chose to count the columns when you scan/iterate through cells.

Again look at this questions answer which does the same thing.

Community
  • 1
  • 1
Kavindu Dodanduwa
  • 12,193
  • 3
  • 33
  • 46
  • Hmm.. OK, But I do need that column Index.So what I'm thinking now, is to find a way to replace the letter from the cell reference to get the column Index. – QV1 Mar 05 '15 at 10:48
  • As explained in the answer, no EASY way :) simply count as I have pointed in the linked question :) – Kavindu Dodanduwa Mar 05 '15 at 10:51
  • @QV1 Did you get the answer for the question ? Or you need more explanation :) – Kavindu Dodanduwa Mar 05 '15 at 10:57
  • I got it, Thanks, Actually, since I need just the index, creating a counter for the cell loop works fine. I'll be posting my code soon, once I'm done testing. Thanks for helping out ! – QV1 Mar 05 '15 at 10:59
  • @QV1 Fine ,, don't forget to upvote or mark the question as correct if you think the answer helps/solves the issue :) – Kavindu Dodanduwa Mar 05 '15 at 11:00
1
    Row row = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>().FirstOrDefault();
   var totalnumberOfColumns = 0;
    if (row != null)
        {
            var spans = row.Spans != null ? row.Spans.InnerText : "";
                if (spans != String.Empty)
                        {
                            //spans.Split(':')[1];
                            string[] columns = spans.Split(':');
                            startcolumnInuse = int.Parse(columns[0]);
                            endColumnInUse = int.Parse(columns[1]);
                            totalnumberOfColumns = int.Parse(columns[1]);
                        }
        }

this is to find the total number of columns present/used enter image description here

tejashiwini
  • 81
  • 1
  • 10
0

In my scenario I only needed to deal with column names (no cell numbers), and used LINQ, thought it's worth putting here for the reference.

const int AsciiTrim = 'A' - 1; //64
const int LastChar = 'Z' - AsciiTrim; //26

var colIndex = columnName
    .Reverse()
    .Select(ch => ch - AsciiTrim)
    .Select((ch, i) => ch * Math.Pow(LastChar, i))
    .Sum()
    - 1; //make zero-index based

To revert back, and for the full code and test, see this gist.

Shimmy Weitzhandler
  • 101,809
  • 122
  • 424
  • 632
0

Slightly modified GetColumnIndex function in the @petelids answer. Result will be zero-based index. If need add 1 for a one-based Index.

private static int CellReferenceToIndex(string reference)
{
    foreach (char ch in reference)
    {
        if (Char.IsLetter(ch))
        {
            int value = (int)ch - (int)'A';
            index = (index == 0) ? value : ((index + 1) * 26) + value;
        }
        else
            return index;
    }
    return index;
}
Subodh Wasankar
  • 309
  • 2
  • 6
0
private double CellReferenceToIndex(Cell cell)
    {
        // if Cell is ABC4 => position is
        // = [Aindx * (26^2)] + [BIndx * (27^1)] + [CIndx * (27^0)]
        // = [1     * (26^2)] + [2     * (27^1)] + [3     * (27^0)]

        double index = 0;
        char [] reference = cell.CellReference.ToString().ToUpper().Reverse().ToArray();
        int letterPosition = 0;
       
        foreach (char ch in reference)
        {
            if (char.IsLetter(ch))
            {
                int value = (ch - 'A') + 1; // so A is 1 not 0 
                index += value * Math.Pow(26, letterPosition++);
            }
        }
        return index;
    }
Lino
  • 71
  • 1
  • 1
  • 5
  • A modification to @Subodh Wasankar this will get the index correctly for Cols with index > 26 – Lino Feb 28 '22 at 11:59
0

Just to add a new approach to this old question, I use this as a quick method to get the column index for a cell in a row (assuming you're looping through the Cells in a Row in SheetData, as the OP indicates they were).

You can use the ElementsBefore enumerable to count the Cells before the one you're currently looping on, and since that Count is one-based and Element IEnumerables are zero-based, using the Count will give you the column index of the Cell you're currently on (essentially, ElementsBefore + 1 = the current cell's column index).

So, something like this...

            For Each r In sht.Elements(Of Row)
                For Each c In sht.Elements(Of Row).ElementAt(r.RowIndex).Elements(Of Cell)
                    Dim iColumnIndex = c.ElementsBefore.Count
                Next
            Next
WATYF
  • 409
  • 1
  • 5
  • 16
0

Some of the examples on this thread didn't work beyond Z.

When validating, it's better to perform some Unit tests to confirm the column index is calculated correctly.

Assuming the index count begins from 1, the following can be a useful reference...

Excel letter index calculation

nimblebit
  • 473
  • 3
  • 11
  • 22