11

I want to get value of a merged cell that has range from D3 to H3 using openpyxl library. As per my understanding most libraries read data from 1st cell itself. Thus the merged content is present in it but I get a none value when I read it.

Following is my code:

wb = load_workbook(work_directory_path+'/outputs/report_vap.xlsx')
ws = wb.get_sheet_by_name("Summary")
suite_path = ws.cell('D3').value
if not isinstance(suite_path, unicode):
    value=unicode(suite_path)
value=value.encode('utf8')
print "Suite Location is "+value;

The output is :

Suite Location is None

The value in cell for D3 to H3 is :

c:\users\xyz\desktop\abc\c++\events\comevents

I even tried printing all the values in the sheet but except for integer values all values returned None.

Following is the changed code:

wb = load_workbook(work_directory_path+'/outputs/report_vap.xlsx')
ws = wb.get_sheet_by_name("Summary")
for row_index in range (ws.get_highest_row()):
    for col_index in range (ws.get_highest_column()):
        print ws.cell(row=row_index, column=col_index).value
suite_path = ws.cell('A11').value
print suite_path
if not isinstance(suite_path, unicode):
   value=unicode(suite_path)
value=value.encode('utf8')
print "Suite Location is "+value;

The output is:

None


None
None
None
None
None
None
None
None
None
None
None
None
None
None


None




None


None




None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None


None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
1
1
None
None
None
None
None
None
9
1106
None
None
None
None
None
None
10
1107
None
None
None
None
None
None
None
None
None
None
Suite Location is None
Suite Location is None
12

The Excel file has following content:

Project/module ID   Project/module  Build Analysis  Language    Compiler    Source File Source File

1_1 HTMLEdit.vcxproj    Success C++ Microsoft Visual Studio 2010 ( version 10.0 )   1   1

1_2 HTMLEdit.vcxproj    Success C++ Microsoft Visual Studio 2010 ( version 10.0 )   9   1106
Total                   10  1107
APhillips
  • 1,175
  • 9
  • 17
manisha
  • 665
  • 3
  • 8
  • 15
  • Without access to the original files it's difficult to say much but there may be a problem if hyperlinks are used anywhere. openpyxl removes all cells from a merged range other than the top-left cell. – Charlie Clark Nov 25 '14 at 09:00

8 Answers8

15

I wrote this based on the latest source code from Openpyxl:

def getMergedCellVal(sheet, cell):
    rng = [s for s in sheet.merged_cells.ranges if cell.coordinate in s]
    return sheet.cell(rng[0].min_row, rng[0].min_col).value if len(rng)!=0 else cell.value
Shady
  • 216
  • 2
  • 6
13

As soon as the only answer is incorrect (there is no more cells_from_range function in openpyxl) I suggest alternative way. I tried and it worked for my case:

Input is sheet and Cell. But if you need, it can be easily modified to accept string cell representation like 'A3'.

import openpyxl


def getValueWithMergeLookup(sheet, cell):
    idx = cell.coordinate
    for range_ in sheet.merged_cell_ranges:
        merged_cells = list(openpyxl.utils.rows_from_range(range_))
        for row in merged_cells:
            if idx in row:
                # If this is a merged cell,
                # return  the first cell of the merge range
                return sheet.cell(merged_cells[0][0]).value

    return sheet.cell(idx).value
The Godfather
  • 4,235
  • 4
  • 39
  • 61
  • Thank you so much for this, you're a champion – Aralox Jul 31 '17 at 03:01
  • Work perfect but up from version 2.5 merged_cell_ranges is deprecated, see http://openpyxl.readthedocs.io/en/stable/changes.html?highlight=MultiCellRange Haven't found out how the replacement works until now, though. – A. Rabus Mar 13 '18 at 14:44
2

Here's an approximation of the function that I use for this:

from openpyxl.cell import get_column_letter
from openpyxl.worksheet import cells_from_range

def getValueWithMergeLookup(sheet, col, row):
    idx = '{0}{1}'.format(get_column_letter(col), row)
    for range_ in sheet.merged_cell_ranges:
        cells = list(cells_from_range(range_))[0]
        if idx in cells:
            # If this is a merged cell, you can look up the value 
            # in the first cell of the merge range
            return sheet.cell(cells[0]).value

    return sheet.cell(row=row, column=col).value

The only really dicey bit there is where I extract the list of cells within the range to search against. That returns a generator, so I cast it to a list (because in doesn't work on generators, apparently), which yields a tuple containing a single list element, which I extract using the 0-index.

For my purposes, this is fast enough -- I use it by iterating the cells I want to test. If you wanted to make this more performant, it might be worthwhile to invert the loop, iterating the merge ranges as your outer loop, so you only have to do that conversion once.

Tim Keating
  • 6,443
  • 4
  • 47
  • 53
  • Interesting to see someone has been able to use the API! :-) It was developed for simplicity: `if idx in sheet.merged_cells` as a quick check as to whether a cell has been merged or not. This uses generators for flexibility. – Charlie Clark Nov 25 '14 at 09:09
  • I considered that, but the merged_cells property iterates all the merged ranges, extracts the individual cell coords and then unions them all together, which strikes me as rather expensive. I essentially stole, ahem, repurposed that code from the library, except I only break up each merge range once. – Tim Keating Nov 25 '14 at 19:09
  • Yes, it's a slightly different requirement: wanting to know whether a cell has been merged or not. If there is a need to get the "leader" of a merged range then it might make sense to store that with the range in some way. Still waiting for more information on what the specification means about this. – Charlie Clark Nov 25 '14 at 19:27
2
from openpyxl import cell as xlcell, worksheet
def within_range(bounds: tuple, cell: xlcell) -> bool:
    column_start, row_start, column_end, row_end = bounds
    row = cell.row
    if row >= row_start and row <= row_end:
        column = cell.column
        if column >= column_start and column <= column_end:
            return True
    return False

def get_value_merged(sheet: worksheet, cell: xlcell) -> any:
    for merged in sheet.merged_cells:
        if within_range(merged.bounds, cell):
            return sheet.cell(merged.min_row, merged.min_col).value
    return cell.value

Should do it for current openpyxl version (2.6.3)

Adanteh
  • 41
  • 4
1

One line to get value of a merged cell:

[r for r in sheet.merged_cells.ranges if cell.coordinate in r][0].start_cell.value

Alex
  • 448
  • 6
  • 11
1

Small update of @The Godfather's answer. Worked at openpyxl 3.0.7.

This modification is for the latest version of openpyxl. (The above answer was posted on 2015 and seems only for the older than 2.5.0)

import openpyxl


def getValueWithMergeLookup(sheet, cell):
    idx = cell.coordinate

    # for range_ in sheet.merged_cell_ranges:
    # 'merged_cell_ranges' has been deprecated
    # 'merged_cells.ranges' should be used instead
    for range_ in sheet.merged_cells.ranges:

        # merged_cells = list(openpyxl.utils.rows_from_range(range_))
        # 'rows_from_range' should take a 'str' type argument
        merged_cells = list(openpyxl.utils.rows_from_range(str(range_)))

        for row in merged_cells:
            if idx in row:
                # If this is a merged cell,
                # return  the first cell of the merge range

                # return sheet.cell(merged_cells[0][0]).value
                # You can just use 'sheet[<CELL ADDRESS>]' to take a cell
                # ex) sheet["A1"].value
                return sheet[merged_cells[0][0]].value

    # return sheet.cell(idx).value
    return sheet[idx].value
sssbbbaaa
  • 204
  • 2
  • 12
  • In your answer, can you briefly highlight what the diff is compared to the original answer? Why is this change needed? For which versions of openpyxl is your answer valid? Would it make sense to edit the original answer (or add a note to it)? – avandeursen Nov 28 '21 at 20:16
0

I had to lookup many merged cells and it was quite slow. So I improved the performace with creating a dict before:

def collect_merge_dict(sheet):
    merge_dict = {}
    merge_ranges = sheet.merged_cells.ranges
    for index, merged_range in enumerate(merge_ranges):
        for col in range(merged_range.min_col, merged_range.max_col + 1):
            for row in range(merged_range.min_row, merged_range.max_row + 1):
                coord = (row, col)
                merge_dict[coord] = index

    return merge_dict

This is how I look up the cell values:

def get_merged_cell_value(sheet, merge_dict, cell):
    coord = (cell.row, cell.column)
    index = merge_dict.get(coord, -1)

    if index > -1:
        merged_range = sheet.merged_cells.ranges[index]
        return sheet.cell(merged_range.min_row, merged_range.min_col).value

    return cell.value
nlaan
  • 46
  • 3
-1
from openpyxl import * 
from openpyxl.utils import *

    def getValueWithMergeLookup(sheet, cell):
        if cell == None or sheet == None:
            return None
        for irange in sheet.merged_cell_ranges:
            min_col, min_row, max_col, max_row =range_boundaries(irange)
            if cell.row in range(min_row,max_row+1) and column_index_from_string(cell.column) in range(min_col,max_col+1):
                return sheet.cell(None,min_row,min_col).value
        return cell.value
mhasan
  • 3,703
  • 1
  • 18
  • 37
RealZ
  • 1