One Cycle Fourier Window optimization. My code is inefficient

Question

Good day

EDIT: What I want: From any current/voltage waveform on a Power System(PS) I want the filtered 50Hz (fundamental) RMS values magnitudes (and effectively their angles). The current as measured contains all harmonics from 100Hz to 1250Hz depending on the equipment. One cannot analyse and calculate using a wave with these harmonics your error gets so big (depending on equipment) that PS protection equipment calculates incorrect quantities. The signal attached also has MANY many other frequency components involved.

My aim: PS protection Relays are special and calculate a 20ms window in a very short time. I.m not trying to get this. I'm using external recording tech and testing what the relays see are true and they operate correctly. Thus I need to do what they do and only keep 50Hz values without any harmonic and DC.

Important expected result: Given any frequency component that MAY be in the signal I want to see the magnitude of any given harmonic (150,250 - 3rd harmonic magnitudes and 5th harmonic of the fundamental) as well as the magnitude of the DC. This will tell me what type of PS equipment possibly injects these frequencies. It is important that I provide a frequency and the answer is a vector of that frequency only with all other values filtered OUT. RMS-of-the-fundamental vs RMS differs with a factor of 4000A (50Hz only) and 4500A (with other freqs included)

This code calculates a One Cycle Fourier value (RMS) for given frequency. Sometimes called a Fourier filter I think? I use it for Power System 50Hz/0Hz/150Hz analogues analysis. (The answers have been tested and are correct fundamental RMS values. (https://users.wpi.edu/~goulet/Matlab/overlap/trigfs.html)

For a large sample the code is very slow. For 55000 data points it takes 12seconds. For 3 voltages and 3 currents this gets to be VERY slow. I look at 100s of records a day.

How do I enhance it? What Python tips and tricks/ libraries are there to append my lists/array. (Also feel free to rewrite or use the code). I use the code to pick out certain values out of a signal at given times. (which is like reading the values from a specialized program for power system analysis) Edited: with how I load the files and use them, code works with pasting it:

import matplotlib.pyplot as plt
import csv
import math
import numpy as np
import cmath

# FILES ATTACHED TO POST
filenamecfg = r"E:/Python_Practise/2019-10-21 13-54-38-482.CFG"
filename = r"E:/Python_Practise/2019-10-21 13-54-38-482.DAT"

t = []
IR = []
newIR=[]
with open(filenamecfg,'r') as csvfile1:
    cfgfile = [row for row in csv.reader(csvfile1, delimiter=',')]
    numberofchannels=int(np.array(cfgfile)[1][0])
    scaleval = float(np.array(cfgfile)[3][5])
    scalevalI = float(np.array(cfgfile)[8][5])
    samplingfreq = float(np.array(cfgfile)[numberofchannels+4][0])
    numsamples = int(np.array(cfgfile)[numberofchannels+4][1])
    freq = float(np.array(cfgfile)[numberofchannels+2][0])
    intsample = int(samplingfreq/freq)
    #TODO neeeed to get number of samples and frequency and detect 
#automatically
    #scaleval = np.array(cfgfile)[3]
    print('multiplier:',scaleval)
    print('SampFrq:',samplingfreq)
    print('NumSamples:',numsamples)
    print('Freq:',freq)


with open(filename,'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        t.append(float(row[1])/1000000) #get time from us to s
        IR.append(float(row[6]))

newIR = np.array(IR) * scalevalI
t = np.array(t)


def mag_and_theta_for_given_freq(f,IVsignal,Tsignal,samples): #samples are the sample window size you want to caclulate for (256 in my case)
    # f in hertz, IVsignal, Tsignal in numpy.array
    timegap = Tsignal[2]-Tsignal[3]
    pi = math.pi
    w = 2*pi*f
    Xr = []
    Xc = []
    Cplx = []
    mag = []
    theta = []
    #print("Calculating for frequency:",f)
    for i in range(len(IVsignal)-samples): 
        newspan = range(i,i+samples)
        timewindow = Tsignal[newspan]
        #print("this is my time: ",timewindow)
        Sig20ms = IVsignal[newspan]
        N = len(Sig20ms) #get number of samples of my current Freq
        RealI = np.multiply(Sig20ms, np.cos(w*timewindow)) #Get Real and Imaginary part of any signal for given frequency
        ImagI = -1*np.multiply(Sig20ms, np.sin(w*timewindow)) #Filters and calculates 1 WINDOW RMS (root mean square value).
        #calculate for whole signal and create a new vector. This is the RMS vector (used everywhere in power system analysis)
        Xr.append((math.sqrt(2)/N)*sum(RealI)) ### TAKES SO MUCH TIME
        Xc.append((math.sqrt(2)/N)*sum(ImagI)) ## these steps make RMS
        Cplx.append(complex(Xr[i],Xc[i]))
        mag.append(abs(Cplx[i]))
        theta.append(np.angle(Cplx[i]))#th*180/pi # this can be used to get Degrees if necessary
        #also for freq 0 (DC) id the offset is negative how do I return a negative to indicate this when i'm using MAGnitude or Absolute value
    return Cplx,mag,theta #mag[:,1]#,theta # BUT THE MAGNITUDE WILL NEVER BE zero

myZ,magn,th = mag_and_theta_for_given_freq(freq,newIR,t,intsample)

plt.plot(newIR[0:30000],'b',linewidth=0.4)#, label='CFG has been loaded!')
plt.plot(magn[0:30000],'r',linewidth=1)

plt.show()

The code as pasted runs smoothly given the files attached Regards

EDIT: Please find a test csvfile and COMTRADE TEST files here: CSV: https://drive.google.com/open?id=18zc4Ms_MtYAeTBm7tNQTcQkTnFWQ4LUu

COMTRADE https://drive.google.com/file/d/1j3mcBrljgerqIeJo7eiwWo9eDu_ocv9x/view?usp=sharing https://drive.google.com/file/d/1pwYm2yj2x8sKYQUcw3dPy_a9GrqAgFtD/view?usp=sharing

Welcome to SO, could you post a synthetic dataset (or a function generating it) and details on what are inputs and the expected outputs. Also could you describe a bit more the operation you intend to perform. Use math or post reference towards a operational definition of your goal. Cheers, — jlandercy, Nov 20 '19 at 11:48
Thank you for updating your post. Unfortunately additional information you provided are not very useful as this. Having the input is a good point, but it is not sufficient. Think about we will need to reproduce your problem and clearly understand what you aim to do. What we are missing is: how you load the data, how you feed them to the function and the most important: what is the expected output. Without having those informations, it will be difficult to investigate your problem. Please consider reading [mcve] to make your post fits the standard. — jlandercy, Nov 20 '19 at 15:08
Having this said, your code mainly relies on a for loop with a lot of indexation and scalar operations. You already imported `numpy` so you should take advantage of vectorization. That would be a good start for improvement. When you will have clearly defined what process you want to apply to you input and what is the expected output I am sure we will find a way to improve performance — jlandercy, Nov 20 '19 at 15:22
Thanks for the great response. I've given my shortened but working code. for the exact operation. It should work. — apemanx, Nov 20 '19 at 17:15
Thank you for this update we are close to the MCVE. I will assume a band pass filter centred on 50 Hz, will it be sufficient. — jlandercy, Nov 20 '19 at 17:48

jlandercy · Accepted Answer · 2019-11-21T10:55:08.380

Forewords

As I said in my previous comment:

Your code mainly relies on a for loop with a lot of indexation and scalar operations. You already have imported numpy so you should take advantage of vectorization.

This answer is a start towards your solution.

Light weight MCVE

First we create a trial signal for the MCVE:

import numpy as np

# Synthetic signal sampler: 5s sampled as 400 Hz
fs = 400 # Hz
t = 5    # s
t = np.linspace(0, t, fs*t+1)

# Synthetic Signal: Amplitude is about 325V @50Hz
A = 325 # V
f = 50  # Hz
y = A*np.sin(2*f*np.pi*t) # V

Then we can compute the RMS of this signal using the usual formulae:

# Actual definition of RMS:
yrms = np.sqrt(np.mean(y**2)) # 229.75 V

Or alternatively we can compute it using DFT (implemented as rfft in numpy.fft):

# RMS using FFT:
Y = np.fft.rfft(y)/y.size
Yrms = np.sqrt(np.real(Y[0]**2 + np.sum(Y[1:]*np.conj(Y[1:]))/2)) # 229.64 V

A demonstration of why this last formulae works can be found here. This is valid because of the Parseval Theorem implies Fourier transform does conserve Energy.

Both versions make use of vectorized functions, no need of splitting real and imaginary part to perform computation and then reassemble into a complex number.

MCVE: Windowing

I suspect you want to apply this function as a window on a long term time serie where RMS value is about to change. Then we can tackle this problem using pandas library which provides time series commodities.

import pandas as pd

We encapsulate the RMS function:

def rms(y):
    Y = 2*np.fft.rfft(y)/y.size
    return np.sqrt(np.real(Y[0]**2 + np.sum(Y[1:]*np.conj(Y[1:]))/2))

We generate a damped signal (variable RMS)

y = np.exp(-0.1*t)*A*np.sin(2*f*np.pi*t)

We wrap our trial signal into a dataframe to take advantage of the rolling or resample methods:

df = pd.DataFrame(y, index=t*pd.Timedelta('1s'), columns=['signal'])

A rolling RMS of your signal is:

df['rms'] = df.rolling(int(fs/f)).agg(rms)

A periodically sampled RMS is:

df['signal'].resample('1s').agg(rms)

The later returns:

00:00:00    2.187840e+02
00:00:01    1.979639e+02
00:00:02    1.791252e+02
00:00:03    1.620792e+02
00:00:04    1.466553e+02

Signal Conditioning

Addressing your need of keeping only fundamental harmonic (50 Hz), a straightforward solution could be a linear detrend (to remove constant and linear trend) followed by a Butterworth filter (bandpass filter).

We generate a synthetic signal with other frequencies and linear trend:

y = np.exp(-0.1*t)*A*(np.sin(2*f*np.pi*t) \
     + 0.2*np.sin(8*f*np.pi*t) + 0.1*np.sin(16*f*np.pi*t)) \
     + A/20*t + A/100

Then we condition the signal:

from scipy import signal
yd = signal.detrend(y, type='linear')
filt = signal.butter(5, [40,60], btype='band', fs=fs, output='sos', analog=False)
yfilt = signal.sosfilt(filt, yd)

Graphically it leads to:

It resumes to apply the signal conditioning before the RMS computation.

I like what you are suggesting! I just need to wrap my head around it. using the rfft... I very very much like the idea. Secondly, I must start to understand the Python implentation of "df.rolling" and .resample. I never even ran into these functions — apemanx, Nov 20 '19 at 17:17
Also this part does not filter out 50 Hz only. yrms = np.sqrt(np.mean(y**2)) # 229.75 V -- This gives a RMS yes but not the RMS of the Fundamental. How would the rfft filter the 50hz ONLY. thus any other frequency I ask for 150Hz or 0Hz or thus any harmonic specific frequency an get that out? — apemanx, Nov 20 '19 at 17:27
@apemanx I would be glad to help you further and show you how to filter but I really need you to be more precise and specific on what you mean by filter and how you intend to do it. Would you mind update your post and state mathematically or at least operationally what you want? Is it a low pass filter? Which order? What kind? — jlandercy, Nov 20 '19 at 17:32
No Problem. Will do! Thanks SO much for looking at this. Maybe bandpass filter will work but it has to be for 50 or 150 Hz and not for 49-51Hz band. Like picking only the nice apples off a tree and leaving the bad behind. ending up with only the good apples. (also when I want the bad apples only then I can take them also leaving the good) — apemanx, Nov 20 '19 at 19:22
Thanks so much. There is a lot for me to go an play with and consider as well as running a few other system recordings through it. A lot of small code snippets that I find fascinating and don't understand YET! But hey! That is why I'm here. — apemanx, Nov 21 '19 at 12:41
@apemanx, You are welcome. Yes SO is one of the Stack community to ask this kind of questions. My answer is a bit mixed because I have not directly understood what you were looking for. This is why MCVE is a necessity, and you managed to produce one, thank you. The `scipy` package contains a lot of interesting functions you should look at it if you aim to process signals. Have a good day. — jlandercy, Nov 21 '19 at 16:03

One Cycle Fourier Window optimization. My code is inefficient

1 Answers1

Forewords

Light weight MCVE

MCVE: Windowing

Signal Conditioning