I am working on a skeletal formula image processor in Python as a chemistry project. It is still in its very early stages, but I've been stumped by a problem: when I run the image processing, a single pen line is picked up as several detected lines. I need a way of discriminating between these detections so that exactly one line is registered per actual pen line, and I can accurately count it as a CH3 group, as it would be in a skeletal formula.

Here is my current code:

import cv2 as cv
import numpy as np

# Load the image and run Canny edge detection on the grayscale version.
image1 = cv.imread('test2.jpeg')
gray = cv.cvtColor(image1, cv.COLOR_BGR2GRAY)
canimg = cv.Canny(gray, 50, 200)

# Probabilistic Hough transform; it returns None when no lines are found.
lines = cv.HoughLinesP(canimg, 1, np.pi / 180.0, threshold=80,
                       minLineLength=70, maxLineGap=20)
if lines is None:
    lines = []

# Draw every detected segment on the original image (blue in BGR).
for line in lines:
    x1, y1, x2, y2 = (int(v) for v in line[0])
    cv.line(image1, (x1, y1), (x2, y2), (255, 0, 0), 2)

cv.imshow('Lines Detected', image1)
cv.imshow('Canny Detection', canimg)
cv.waitKey(0)
cv.destroyAllWindows()

See the attached images, which demonstrate the problem. The close-up of one of the lines shows how multiple lines are registered for it; I need a system that counts only one line per actual pen line.

Any links, suggestions, comments or criticisms that would improve the line detection are really appreciated.

Christoph Rackwitz
benjdevacc
  • Don’t use Canny, it outlines the lines, essentially doubling them. – Cris Luengo Jul 05 '20 at 00:14
  • @CrisLuengo ah yeah good point, thanks I will try this – benjdevacc Jul 05 '20 at 08:22
  • What is your end goal after detecting the line segments? Do you need the coordinates of the endpoints? Or do you want to extract the lines as a new image? Or something else? – Tyson Jul 08 '20 at 10:43
  • @Tyson my end goal is to know the number of segments, and the coordinates of the endpoints. My initial goal is just getting to accurately know the number of segments, and as I further the project coordinates of the endpoints will be really useful to identify other aspects of the skeletal formula as I implement detection for functional groups, double bonds etc etc. – benjdevacc Jul 08 '20 at 13:59

1 Answer


Judging by the minimap in the top right, I'm assuming this line pattern continues to the left and right of what is shown in your image. I don't think it will be easy to find a robust method, but here are some ideas:

  • Slight blur -> threshold to binary image -> skeletonize -> slight dilation -> HoughLinesP -> merge line segments with similar angles and distances between them. Example: How to merge lines after HoughLinesP?
  • Same as above, but use HoughLines instead -> Find intersections -> Find central points among each group of intersection points. Example: find intersection point of two lines drawn using houghlines opencv
  • Slight blur -> threshold to binary image -> some established corner detection algorithm -> find some way to remove false positives. Example: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_features_harris/py_features_harris.html
  • Slight blur -> threshold to binary image -> erosion using custom arrow-shaped kernels that match the shape of the end points of your line segments -> find connected components -> for the components left by the top arrow-shaped kernel, take the highest point of each; for the components left by the bottom arrow-shaped kernel, take the lowest point of each. I couldn't find an example for this one.
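
The "merge line segments with similar angles and distances" step from the first idea could look roughly like the sketch below. It is not the code from the linked question; the function names and the three tolerances (angle_tol, dist_tol, gap_tol) are assumptions for illustration and would need tuning to the pen thickness. Segments are (x1, y1, x2, y2) tuples, as returned by HoughLinesP after reshaping, and are assumed to have non-zero length.

```python
import math


def _direction(seg):
    """Unit vector along a segment (assumes non-zero length)."""
    x1, y1, x2, y2 = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return (x2 - x1) / length, (y2 - y1) / length


def _similar(a, b, angle_tol, dist_tol, gap_tol):
    """True if b looks like another detection of the same pen line as a."""
    ax, ay = _direction(a)
    bx, by = _direction(b)
    # |cross product| is the sine of the angle between the two lines.
    if abs(ax * by - ay * bx) > math.sin(angle_tol):
        return False
    # Perpendicular offset of b's midpoint from the line through a.
    mx = (b[0] + b[2]) / 2 - a[0]
    my = (b[1] + b[3]) / 2 - a[1]
    if abs(mx * ay - my * ax) > dist_tol:
        return False
    # Gap between the two extents, measured along a's direction.
    a_hi = math.hypot(a[2] - a[0], a[3] - a[1])
    t1 = (b[0] - a[0]) * ax + (b[1] - a[1]) * ay
    t2 = (b[2] - a[0]) * ax + (b[3] - a[1]) * ay
    b_lo, b_hi = min(t1, t2), max(t1, t2)
    return max(b_lo - a_hi, 0 - b_hi) <= gap_tol


def _merge(group):
    """Replace a group by the segment joining its two extreme endpoints."""
    ref = max(group, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))
    dx, dy = _direction(ref)
    points = [(s[0], s[1]) for s in group] + [(s[2], s[3]) for s in group]
    start = min(points, key=lambda p: p[0] * dx + p[1] * dy)
    end = max(points, key=lambda p: p[0] * dx + p[1] * dy)
    return (start[0], start[1], end[0], end[1])


def merge_segments(segments, angle_tol=math.radians(10),
                   dist_tol=10, gap_tol=20):
    """Greedily group similar segments and return one segment per group."""
    groups = []
    for seg in segments:
        for group in groups:
            if any(_similar(m, seg, angle_tol, dist_tol, gap_tol)
                   for m in group):
                group.append(seg)
                break
        else:
            groups.append([seg])
    return [_merge(g) for g in groups]
```

The grouping is greedy, so the result can depend on the order of the input; a union-find over all similar pairs would be more thorough, at the cost of a little more code.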

Personally I would first try the erosion method using custom arrow-shaped kernels. At first glance I think it might be the easiest and most robust of the four methods.

Tyson
  • Thanks for this answer, there's a lot of useful info in here. I've already taken a look at the merge line code and implemented it a bit. I will try the method you've recommended, the erosion method; thanks for taking the time to respond. – benjdevacc Jul 09 '20 at 10:08