For your given input image :

If I use the following piece of code :
import cv2
import numpy as np
img = cv2.imread('test1.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize = 3)
lines = cv2.HoughLines(edges,1,np.pi/180,200)
for rho,theta in lines[0]:
a = np.cos(theta)
b = np.sin(theta)
x0 = a*rho
y0 = b*rho
x1 = int(x0 + 1000*(-b))
y1 = int(y0 + 1000*(a))
x2 = int(x0 - 1000*(-b))
y2 = int(y0 - 1000*(a))
cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
cv2.imwrite('houghlines3.jpg',img)
I get the following output :

See there is only one big straight line which differentiates the header and the blueprint part. You can get the starting and ending y co-ordinate of this line if you just print y1
and y2
. For this case, they are y1 : 140, y2 : 141
. Now what you need, is just crop the picture up to y pixel 141
value like this :
img = cv2.imread(path/to/original/image)
img_header = img[:141,:]
img_blueprint = img[141:, :]
cv2.imwrite("header.png", img_header)
cv2.imwrite("blueprint.png", img_blueprint)
Now, come to your problem. Here is a possible way. See the biggest straight line differentiating the header and the blueprint has got detected by three different red straight lines through the houghline transform. For these three lines, the starting y co-ordinates will be very much close like for example say 142
, 145
, 143
. You need to append all ending y co-ordinates of the straight lines (y2) in a list and cluster all of them based on adjacency with a threshold value of 5-10 pixels, take the biggest cluster, take the largest ending y co-ordinate from the cluster and crop the original image accordingly.