Metamodel on the field function: strange results

Question

I'm trying to implement the metamodel of the field function: text and the results I am getting are strange!

The metamodel is based on the input and the output files of a system:

The input file contains 5 columns representing the 5 parameters of the system and 'nsamples' row representing the number of Latin hypercube samples.
The output file contains 'nsamples' columns and 'ntimesteps' row

The problem is:

That I'm calculating the error percentage and it's giving me a weird results: i.e. when I use another i/o files with a higher number of samples the error increase instead of decreasing! which is something unexpected,
For 3 parameters system, with degree of polynomial = 6 the error is 0.12 but for a 5 parameters system, the best value I got it for the error is 1.2 for degree =3 (for degree = 6 the error i'm getting is very huge)!

##########################################################################################

nsamples  = 50 # The number of samples
ntimesteps = 111 # the number of timestep for each simulation in the model 
num_param = 5 # number of parameters of the model 

inputs = pd.read_csv(f"samples_{nsamples}.csv", usecols=range(1, num_param+1), skipinitialspace=True)
outputs = pd.read_csv(f'output_{nsamples}.csv', skiprows=1, skipinitialspace=True, header=None )
outputs = outputs.transpose()

mesh = ot.IntervalMesher([ntimesteps - 1]).build(ot.Interval(0, ntimesteps - 1)) # construction a one-dimensional mesh

sample_X = ot.Sample(np.array(inputs.iloc[:,0:num_param]))
X = np.array(outputs.iloc[0,:])
X = X.reshape((ntimesteps,1))

Field = ot.Field(mesh,X) # field with the just first row
sample_Y = ot.ProcessSample(1,Field) # It stores a list of samples in which each value of the mesh is associated to an output value

for k in range(1,nsamples):
    Xi = np.array(outputs.iloc[k,:]).reshape(ntimesteps,1)
    New_Field = ot.Field(mesh,Xi)
    sample_Y.add(New_Field)


threshold = 1e-7
algo_Y = ot.KarhunenLoeveSVDAlgorithm(sample_Y, threshold)
algo_Y.run()
result_Y = algo_Y.getResult()
postProcessing = ot.KarhunenLoeveLifting(result_Y)
outputSampleChaos = result_Y.project(sample_Y)


degree = 6
dimension_xi_X = num_param
dimension_xi_Y = ntimesteps
enumerateFunction = ot.HyperbolicAnisotropicEnumerateFunction(dimension_xi_X, 0.8)
basis = ot.OrthogonalProductPolynomialFactory(
    [ot.StandardDistributionPolynomialFactory(ot.HistogramFactory().build(sample_X[:,i])) for i in range(dimension_xi_X)], enumerateFunction)
basisSize = enumerateFunction.getStrataCumulatedCardinal(degree)
adaptive = ot.FixedStrategy(basis, basisSize)
projection = ot.LeastSquaresStrategy(
    ot.LeastSquaresMetaModelSelectionFactory(ot.LARS(), ot.CorrectedLeaveOneOut()))
ot.ResourceMap.SetAsScalar("LeastSquaresMetaModelSelection-ErrorThreshold", 1.0e-7)
algo_chaos = ot.FunctionalChaosAlgorithm(sample_X,
outputSampleChaos,basis.getMeasure(), adaptive, projection)
algo_chaos.run()
metaModel1 = ot.PointToFieldConnection(postProcessing, algo_chaos.getResult().getMetaModel())

def metaModel(theta):
    out=metaModel1(theta)
    return(((out)[0:ntimesteps]))## Change the number of outputs
 
error = rmse_error() # this is a function that calculate the difference between the model's outputs and the metamodel in percentage
print(error)
##########################################################################################

score 0 · Answer 1 · answered Jun 10 '23 at 21:31

The code:

algo_chaos = ot.FunctionalChaosAlgorithm(sample_X,
outputSampleChaos,basis.getMeasure(), adaptive, projection)

means that we use the standardized distribution from the basis as the distribution that generated sample_X. It may work, but it is unlikely that the input distribution has a standardized distribution suitable for HistogramFactory. Assuming that the distribution of sample_X, I would rather write:

inputDistribution = ot.FunctionalChaosAlgorithm.BuildDistribution(sample_X)
algo_chaos = ot.FunctionalChaosAlgorithm(sample_X,
outputSampleChaos, inputDistribution, adaptive, projection)

The previous code will try to identify the distribution from the sample. The method is described in the doc and has an example.

Metamodel on the field function: strange results

1 Answers1