
I have a multi-process program (it uses multiprocessing.Pool) that I run on:

  • Windows 10 PRO x64
  • Python 3.8.2 (x64) (Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32)
  • also tried Python 3.8.0, with the same error
  • VS Code (x64) 1.43.0
  • ms-python extension for VS Code (ms-python.python-2020.2.64397)

I get this error:

Could not connect to 127.0.0.1: 63323
Could not connect to 127.0.0.1: 63323
Traceback (most recent call last):
Traceback (most recent call last):
  File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 514, in start_client
    s.connect((host, port))
ConnectionRefusedError: [WinError 10061] No connection could be established because the target computer actively rejected it
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 514, in start_client
    s.connect((host, port))
ConnectionRefusedError: [WinError 10061] No connection could be established because the target computer actively rejected it
Traceback (most recent call last):
  File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2536, in settrace
  File "<string>", line 1, in <module>
  File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2536, in settrace
Could not connect to 127.0.0.1: 63323
    _locked_settrace(
    _locked_settrace(  File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2610, in _locked_settrace
Could not connect to 127.0.0.1: 63323

In this app, main.py imports:

import functions as fnc
from multiprocessing import freeze_support

and functions.py imports:

import sys
import csv
import time
import datetime
import argparse
import itertools as it
from os import system, name
from enum import Enum, unique
from tqdm import tqdm
from math import ceil
from multiprocessing import Pool, cpu_count
import codecs

The program works fine on another PC with Python 3.8.0, so I do not understand this error.

The program only compares two files and shows the differences; it does not connect to any other server or to the internet.


The only difference is that I now use an Intel i9-9900 (8c/16t), while the second computer has an i5-7500 with only 4 cores.


EDIT

When I set the number of processes to 8 instead of 16, the program runs without error. My processor has 8 physical cores and 16 logical cores, and I use cpu_count() to get the number of CPUs, like this:

threads = 8 #cpu_count()
p = Pool(threads)

Where is the problem?


EDIT - 09/03/2020 - SOURCE CODE

main.py

import functions as fnc
from multiprocessing import freeze_support


# Run main program
if __name__ == '__main__':
    freeze_support()

    fnc.main()

functions.py

import sys
import csv
import time
import datetime
import argparse
import itertools as it
from os import system, name
from enum import Enum, unique
from tqdm import tqdm
from math import ceil
from multiprocessing import Pool, cpu_count
import codecs


# ENUM
@unique
class Type(Enum):
    TT = 1


# CLASS
class TextFormat:
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    RED = '\033[91m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'


class CrossReference:
    def __init__(self, pn, comp, comp_name, type, diff):
        self.pn = pn
        self.comp = comp
        self.comp_name = comp_name
        self.type = type
        self.diff = diff

    def __str__(self):
        return f'{self.pn} {get_red("|")} {self.comp} {get_red("|")} {self.comp_name} {get_red("|")} {self.type} {get_red("|")} {self.diff}\n'

    def __repr__(self):
        return str(self)

    def getFullRow(self):
        return self.pn + ';' + self.comp + ';' + self.comp_name + ';' + self.type + ';' + self.diff + '\n'


class CrossDuplication:
    def __init__(self, pn, comp, cnt):
        self.pn = pn
        self.comp = comp
        self.cnt = cnt

    def __str__(self):
        return f'{self.pn};{self.comp};{self.cnt}\n'

    def __repr__(self):
        return str(self)

    def __hash__(self):
        return hash(('pn', self.pn,
                 'competitor', self.comp))

    def __eq__(self, other):
        return self.pn == other.pn and self.comp == other.comp 


# FUNCTIONS
def get_formated_time(mili):
    sec = mili / 1000.0
    return str(datetime.timedelta(seconds = sec)) 

def get_green(text):    # return green text
    return(TextFormat.GREEN + str(text) + TextFormat.END)


def get_red(text):      # return red text
    return(TextFormat.RED + str(text) + TextFormat.END)


def get_yellow(text):   # return yellow text
    return(TextFormat.YELLOW + str(text) + TextFormat.END)


def get_blue(text):     # return blue text
    return(TextFormat.BLUE + str(text) + TextFormat.END)


def get_bold(text):     # return bold text format
    return(TextFormat.BOLD + str(text) + TextFormat.END)


def print_info(text):   # print info text format
    print("=== " + str(text) + " ===")


# ### LOADER ### Load Cross Reference file
def CSVCrossLoader(file_url, type):
    try:
        print(get_yellow("============ LOAD CROSS CSV DATA ==========="))

        print_info(get_green(f"Try to load data from {file_url}"))
        destination = []
        with open(file_url, encoding="utf-8-sig") as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            line_count = 0
            for row in csv_reader:
                if row[0].startswith('*'):
                    continue
                if Type[row[3]] is not type:
                    continue
                cr = CrossReference(row[0], row[1], row[2], row[3], row[4])
                destination.append(cr)
                line_count += 1
            filename = file_url.rsplit('\\', 1)
            print(
                f'Processed {get_red(line_count)} lines for {get_red(type.name)} from {filename[1]}')
            print_info(get_green(f"Data was loaded successfully"))
            return destination
    except Exception as e:
        print(e)
        print_info(get_red(f"File {file_url} could not be loaded"))
        print_info(get_red("Program End"))
        exit(0)


# ### LOADER ### Load Catalog with PN details (load only first row)
def CSVCatalogLoader(file_url):
    try:
        print(get_yellow("=========== LOAD CATALOG CSV DATA =========="))
        print_info(get_green(f"Try to load data from {file_url}"))
        destination = []
        with open(file_url, encoding="utf-8-sig") as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            line_count = 0
            for row in csv_reader:
                if row[0].startswith('*'):
                    continue
                destination.append(row[0])
                line_count += 1
            filename = file_url.rsplit('\\', 1)
            print(f'Processed {get_red(line_count)} lines from {filename[1]}')
            print_info(get_green(f"Data was loaded successfully"))
            return destination
    except Exception as e:
        print(e)
        print_info(get_red(f"File {file_url} could not be loaded"))
        print_info(get_red("Program End"))
        exit(0)



def FindDuplications(tasks):
    dlist, start, count = tasks

    duplicates = []
    for r in tqdm(dlist[start:start + count]):
        matches = [x for x in dlist if r.pn == x.pn and r.comp == x.comp]
        duplicates.append(CrossDuplication(r.pn, r.comp, len(matches)))

    return {d for d in duplicates if d.cnt > 1}


def CheckDuplications(cross_list):
    threads = cpu_count()
    tasks_per_thread = ceil(len(cross_list) / threads)

    tasks = [(cross_list, tasks_per_thread * i, tasks_per_thread) for i in range(threads)]

    p = Pool(threads)
    duplicates = p.map(FindDuplications, tasks)
    p.close()
    p.join() 

    duplicates = {item for sublist in duplicates for item in sublist}
    return duplicates   




def main():

    # Main Title of app
    print_info(get_yellow("Run app"))


    # VARIABLES
    catalog_list = []
    cross_list = []


    # Start calculate program running time
    start_time = int(round(time.time() * 1000))


    # load URL param from program input arguments
    validation_type = Type[sys.argv[1]]
    cross_ref_url = sys.argv[2]
    catalog_url = sys.argv[3]


    # Print info about the tested type
    print_info(get_blue(f"|||   Validate data for {validation_type.name}   |||"))
    print("Number of processors: ", cpu_count())
    print()


    # load data
    cross_list = CSVCrossLoader(cross_ref_url, validation_type)
    catalog_list = CSVCatalogLoader(catalog_url)

    # Check data in Cross Reference for duplications [ MULTIPROCESS ]
    duplicates = CheckDuplications(cross_list)

    # Print duration of execution script
    mili = int(int(round(time.time() * 1000)) - start_time)
    print(f'Script duration - {mili} ms | {get_formated_time(mili)}')


    # End of program
    print_info(get_yellow(""))
    print()

launch.json - VS Code config file

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "First",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "args": [
                "TT",
                "..\\url\\ref.csv",
                "..\\url\\catalog.csv"
            ]
        }
    ]
}

DATA FILE - ref.csv (example)

xxx;123;
ccc;dgd;
xxx;323;
xxx;dgd;
xxx;123;
...etc:.

DATA FILE - catalog.csv (example)

xxx;
ccc;
vvv;
fff;
xyx;
xxx;
cff;
ccc;
www;
...etc:.

The application loads 2 CSV files and finds duplicated rows in ref.csv. This file has over 100k lines; the first and second columns of each row are compared with the same columns of every other row in a for loop.
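
A minimal sketch of the duplicate check described above, for reference. It is not the code from functions.py: it swaps the nested comparison for a single pass with collections.Counter, runs in one process, and the helper name find_duplicates is hypothetical. The sample rows are the first two columns of the ref.csv example.

```python
from collections import Counter


def find_duplicates(rows):
    """Return {(pn, comp): count} for every (pn, comp) pair seen more than once."""
    # Count each (first column, second column) pair in a single pass.
    counts = Counter((row[0], row[1]) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# First two columns of the ref.csv example rows.
sample = [
    ("xxx", "123"),
    ("ccc", "dgd"),
    ("xxx", "323"),
    ("xxx", "dgd"),
    ("xxx", "123"),
]
duplicates = find_duplicates(sample)
```

Because no worker processes are started, a version like this can help separate the duplicate-finding logic from the multiprocessing and debugger behaviour when testing.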

My previous question, about how to do this with multiple threads: "For Loop for list of Objects with use Multithread in Python"


EDIT - 10/03/2020 - Third Computer

Today I tried it on my laptop (Lenovo T480s) with an Intel Core i7-8550U (4c/8t).

I ran it with threads = cpu_count(), which returns 8 on this machine, and everything works fine. The configuration is the same as on the previous two PCs; only on the Intel Core i9-9900 does the code raise the error.

I also tried these settings on the i9-9900:

threads = 8   # OK
threads = 12  # OK
threads = 14  # ERROR
threads = 16  # ERROR
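
A minimal sketch of how the worker count could be capped instead of hard-coded while testing values like the ones above. The helper name pick_worker_count and the cap of 12 are assumptions for illustration (12 is simply the largest value reported as OK here); this only sidesteps the error under the debugger and does not explain it.

```python
import os


def pick_worker_count(cap=None):
    """Return the logical CPU count, limited to `cap` when a cap is given."""
    logical = os.cpu_count() or 1  # os.cpu_count() can return None
    if cap is None:
        return logical
    # Never return fewer than one worker.
    return max(1, min(logical, cap))


# Hypothetical cap: 12 was the largest value that ran without error above.
threads = pick_worker_count(cap=12)
```

The result would then be passed to Pool(threads) in CheckDuplications in place of cpu_count().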

===========================================
Running directly in CMD or PowerShell works fine with 16 threads - OK

C:\Users\test\AppData\Local\Programs\Python\Python38\python.exe 'c:\Users\test\Documents\Work\Sources\APP\src\APP\cross-validator.py' 'TT' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\ref.csv' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\Tan\catalog.csv'

Running via VS Code debug adds one more parameter, a path to the ms-python extension's ptvsd launcher - ERROR

 ${env:PTVSD_LAUNCHER_PORT}='49376'; & 'C:\Users\test\AppData\Local\Programs\Python\Python38\python.exe' 'c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\launcher' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\cross-validator.py' 'TAN' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\ref.csv' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\Tan\catalog.csv'

Thank you for your help.

Jan Sršeň
  • There's either nothing running on that port or there's a firewall blocking the connection. – Klaus D. Feb 27 '20 at 11:42
  • OK, but why do I need a connection for this? On the second PC I do not have any server or Ethernet settings for localhost – Jan Sršeň Feb 27 '20 at 11:46
  • It looks like you are running debug mode and a connection to a debugging server is being made. Check the [docs](https://code.visualstudio.com/docs/python/debugging#_remote-debugging) if your debugger is set up properly. – Klaus D. Feb 27 '20 at 12:03
  • OK, I have now set the number of processes to a constant 8 and it works without error. I have an Intel Core i9-9900 with 8 cores and 16 threads, and it does not work with 16 threads. However, I tried the application on a third computer with an i7-8700 (6 cores, 12 threads) and everything works well there. – Jan Sršeň Feb 27 '20 at 12:15
  • @JanSršeň is it possible for you to provide a sample code which showcases what exactly you are trying to achieve? It helps to reproduce the error. – Nagaraj Tantri Mar 09 '20 at 01:15
  • @NagarajTantri I updated my post and added the source code. When I set the number of CPUs to 8 it is OK, but with 16 threads I get this error – Jan Sršeň Mar 09 '20 at 11:02
  • Is the debugging the same on both PC's vscode. The pydevd is debugging and perhaps is not running the same version or same debugging on the second pc (pydevd or ptvsd ) or could not be started. – Tyger Guzman Mar 09 '20 at 16:25
  • @TygerGuzman Yes, the Python version, Windows 10 build, VS Code version and PTVSD are all the same; the only difference is the processor. As I wrote, when I change `threads = cpu_count()` to `threads = 8` everything works fine – Jan Sršeň Mar 09 '20 at 17:20
  • My assumption is that cpu_count() is causing an error, which then fails because of differences in the debugging. If you set threads = 8 and force a different error, do you get the same "could not connect to" exception thrown? – Tyger Guzman Mar 09 '20 at 17:32
  • @TygerGuzman If I set threads = 8 I do not get any error (8 is the number of cores). When I use threads = cpu_count(), cpu_count() returns 16, because I have 8 cores with HT (2*8 = 16 threads) – Jan Sršeň Mar 09 '20 at 20:47
  • Do you get an error if you set threads=16 ? – Tyger Guzman Mar 09 '20 at 22:39
  • Probably due to multiprocessing only working with logical cores , it throws an error. There are modules that will allow you to pass a parameter of logical = true/false to only count the true cores. – Tyger Guzman Mar 09 '20 at 22:40
  • @JanSršeň appreciate the code, could also provide the sample input files and args you pass? On passing custom data, it keeps breaking and it's difficult to keep tweaking your code to reproduce this error. – Nagaraj Tantri Mar 10 '20 at 02:04
  • @TygerGuzman When I change the code to `threads = 16 #cpu_count()` I also get the error, so the cpu_count() function works fine – Jan Sršeň Mar 10 '20 at 05:56
  • @TygerGuzman please check my last update in the main post: I tried a third PC with an i7 processor that also has HT (4c/8t), and the code works fine with 8 threads – Jan Sršeň Mar 10 '20 at 07:26
  • @NagarajTantri Hi, the data is not important, and I am not able to provide it because it is sensitive company data, but it is possible to generate data with the web app **Mockaroo - Random Data Generator** – Jan Sršeň Mar 10 '20 at 07:30
  • Why not only count the logical cores : https://stackoverflow.com/questions/40217873/multiprocessing-use-only-the-physical-cores – Tyger Guzman Mar 10 '20 at 16:44
  • @TygerGuzman I understand, but the problem is not in the processor but in VS Code (the extension). Please see my last edit, where I explain the difference between running in PowerShell (16 threads are OK, no error) and running via VS Code debug – Jan Sršeň Mar 11 '20 at 07:49
  • Good luck! I am not well versed in VS studio, I use IDLE and avoid 3rd party software. – Tyger Guzman Mar 11 '20 at 17:05

1 Answer

Check your localhost settings, or check whether port 63323 is already in use by another program.
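
A minimal sketch of one way to check this from Python, using only the standard socket module. The helper name port_accepts_connections is hypothetical, and the port number has to be taken from the error message of the current run, since the ports shown in the question differ between runs.

```python
import socket


def port_accepts_connections(host, port, timeout=1.0):
    """Return True if something is listening on (host, port) and accepts a TCP connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # instead of raising ConnectionRefusedError.
        return s.connect_ex((host, port)) == 0
```

If this returns False for the port in the error message, nothing is listening there at that moment, which matches the "actively rejected" wording of WinError 10061.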

night vision