A very strange phenomenon when coding a Bayes classifier using PCA whitening and an LDF (linear discriminant function)

Question

The code uses the MNIST data set, which can be found at this site:

The code uses four files:

train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)

I first apply PCA whitening to the data, and then use an LDF to decide whether each sample belongs to c1 or c2, where c1 and c2 are the two classes (digits) being compared.
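For reference, the pipeline described above can be sketched compactly as follows. This is only a minimal sketch under stated assumptions, not the code from the question: the function names `fit_pca_whitening`, `ldf_scores` and `two_class_accuracy` are hypothetical, the data is synthetic rather than MNIST, and it uses `np.linalg.eigh` with an `eps` floor on the eigenvalues, which differs from the `np.linalg.eig` handling in the posted code. The discriminant is g_i(y) = u_i^T y - (1/2) u_i^T u_i + ln(p_i), the same form as in `decision`.

```python
import numpy as np

def fit_pca_whitening(X, eps=1e-8):
    """Return (mean, W) so that (X - mean) @ W.T is PCA-whitened."""
    mean = X.mean(axis=0)
    Xc = X - mean
    sigma = Xc.T @ Xc / X.shape[0]
    # eigh is intended for symmetric matrices: real eigenvalues,
    # orthonormal eigenvectors (returned as columns).
    eigvals, eigvecs = np.linalg.eigh(sigma)
    # Assumption: floor tiny or slightly negative eigenvalues at eps
    # instead of inverting them directly.
    scale = 1.0 / np.sqrt(np.maximum(eigvals, eps))
    W = scale[:, None] * eigvecs.T  # D^{-1/2} U^T
    return mean, W

def ldf_scores(Y, class_mean, prior):
    """g(y) = u^T y - 0.5 * u^T u + ln(prior) for each row y of Y."""
    return Y @ class_mean - 0.5 * (class_mean @ class_mean) + np.log(prior)

def two_class_accuracy(X_train, y_train, X_test, y_test, c1, c2):
    # Stack class c1 first, then c2, as the question's code does for training.
    Xtr = np.vstack([X_train[y_train == c1], X_train[y_train == c2]])
    ytr = np.concatenate([y_train[y_train == c1], y_train[y_train == c2]])
    keep = (y_test == c1) | (y_test == c2)
    Xte, yte = X_test[keep], y_test[keep]

    mean, W = fit_pca_whitening(Xtr)
    Ytr = (Xtr - mean) @ W.T
    Yte = (Xte - mean) @ W.T

    g1 = ldf_scores(Yte, Ytr[ytr == c1].mean(axis=0), np.mean(ytr == c1))
    g2 = ldf_scores(Yte, Ytr[ytr == c2].mean(axis=0), np.mean(ytr == c2))
    pred = np.where(g1 > g2, c1, c2)
    return np.mean(pred == yte)

# Synthetic stand-in data (an assumption, not MNIST): two Gaussian classes
# plus one constant column, similar to a pixel that is always blank.
rng = np.random.default_rng(0)

def make_split(n):
    X = np.vstack([rng.normal(0.0, 1.0, size=(n, 5)),
                   rng.normal(4.0, 1.0, size=(n, 5))])
    X = np.hstack([X, np.zeros((2 * n, 1))])
    y = np.array([0] * n + [1] * n)
    return X, y

X_tr, y_tr = make_split(200)
X_te, y_te = make_split(200)
acc_01 = two_class_accuracy(X_tr, y_tr, X_te, y_te, 0, 1)
acc_10 = two_class_accuracy(X_tr, y_tr, X_te, y_te, 1, 0)
print(acc_01, acc_10)
```

In this sketch, swapping the two class labels only changes the order in which the training rows are stacked, so on this toy data the two accuracies would be expected to agree. Whether the same holds for the MNIST run in the question has not been checked here.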

Strangely, when I set c1 to 0 and c2 to 1, the accuracy (`correct` in the code) is 0.8122931442080378. However, when I simply swap them, setting c1 to 1 and c2 to 0, the accuracy changes to 0.5938534278959811!

I tried other cases, such as swapping 1 and 9, or swapping 1 and 7. In those cases the accuracy does not change when the classes are swapped: it is 0.9925373134328358 and 0.9889042995839112, respectively.

Please tell me why. Thank you very much!

My code is below.

Please remember to change the data path (the `'./num/MNIST/raw'` argument in the two `load_mnist` calls).

import os
import struct
import numpy as np
import math 
def ldf(c1,c2):
  def load_mnist(path, kind):
      """Load MNIST data from `path`"""
      labels_path = os.path.join(path,
                                 '%s-labels-idx1-ubyte'
                                 % kind)
      images_path = os.path.join(path,
                                 '%s-images-idx3-ubyte'
                                 % kind)
      with open(labels_path, 'rb') as lbpath:
          magic, n = struct.unpack('>II',
                                   lbpath.read(8))
          labels = np.fromfile(lbpath,
                               dtype=np.uint8)

      with open(images_path, 'rb') as imgpath:
          magic, num, rows, cols = struct.unpack('>IIII',
                                                 imgpath.read(16))
          images = np.fromfile(imgpath,
                               dtype=np.uint8).reshape(len(labels), 784)

      return images, labels

  X_train,y_train = load_mnist('./num/MNIST/raw','train')
  X_test,y_test = load_mnist('./num/MNIST/raw','t10k')

  X_train0=X_train[y_train==c1]
  X_train1=X_train[y_train==c2]
  y_train0=y_train[y_train==c1]
  y_train1=y_train[y_train==c2]
  X_test0=X_test[y_test==c1]
  X_test1=X_test[y_test==c2]
  y_test0=y_test[y_test==c1]
  y_test1=y_test[y_test==c2]

  X_test01=np.append(X_test0,X_test1,axis=0)
  y_test01=np.append(y_test0,y_test1,axis=0)

  X_train0=[[1 if i >0 else 0 for i in j] for j in X_train0]
  X_train1=[[1 if i >0 else 0 for i in j] for j in X_train1]
  X_test01=[[1 if i >0 else 0 for i in j] for j in X_test01]
  # Binarize: every pixel value greater than 0 is set to 1
  X_train0=np.array(X_train0)
  X_train1=np.array(X_train1)
  X_test01=np.array(X_test01)

  X_train01=np.append(X_train0,X_train1,axis=0)
  y_train01=np.append(y_train0,y_train1,axis=0)
  m0=np.size(X_train0,0)
  m1=np.size(X_train1,0)
  m=np.size(X_train01,0)
  m_t=np.size(X_test01,0)
  n=np.size(X_train01,1)

  sumu=np.zeros([n,1])
  for i in X_train01:
    i=i.reshape(n,1)
    sumu+=i
  u=sumu/m

  u=u.reshape(1,n)
  # Center the data (subtract the mean)
  X_train01_c=X_train01-u
  sigma=np.dot(X_train01_c.T,X_train01_c)/m
  D,U= np.linalg.eig(sigma)
  # Transform the eigenvalue matrix
  D=np.diag(D)
  D=np.real(D)
  U=np.real(U)
  for i in range(n):
      if D[i][i]==0:
          D[i][i]==10e-10  # note: `==` only compares, so this line has no effect; `=` was probably intended
      elif D[i][i]<0:
          D[i][i]=-D[i][i]
          D[i][i]=1/(math.sqrt(D[i][i]))
      else:
          D[i][i]=1/(math.sqrt(D[i][i]))
          #D^{-1/2}
  W=np.dot(D,U.T)
  Y_train=np.dot(X_train01_c,W.T)  # PCA whitening

  Y_train0_c=Y_train[y_train01==c1]
  Y_train1_c=Y_train[y_train01==c2]
  sumu0=np.zeros([n,1])
  sumu1=np.zeros([n,1])

  for i in Y_train0_c:
    i=i.reshape(n,1)
    sumu0+=i

  u0=sumu0/m0

  for i in Y_train1_c:
    i=i.reshape(n,1)
    sumu1+=i

  u1=sumu1/m1

  p0=m0/m
  p1=m1/m

  def decision(xk):
    x=xk.reshape(n,1)
    g0=np.dot(u0.T,x)-(1/2)*np.dot(u0.T,u0)+math.log(p0)
    g1=np.dot(u1.T,x)-(1/2)*np.dot(u1.T,u1)+math.log(p1)
    if g0>g1:
      return c1
    else:
      return c2

  ldf=np.zeros(m_t)
  # Preprocess the test set
  # Center the data using the training mean
  X_test01_c=X_test01-u
  Y_test=np.dot(X_test01_c,W.T)  # PCA whitening
  for i in range(m_t):
    ldf[i]=decision(Y_test[i])

  count=0
  for i in range(m_t):
    if ldf[i]==y_test01[i]:
      count+=1
  correct=count/m_t
  return correct

ldf(0,1)

Sadly I don't know what mnist data is or where I get it to test. Could you provide a link? Incidentally you seem to be writing js style factories, but python has a (very good, and not prototype based) `class` system which you almost certainly want to use, if only so python coders can read your code better. — 2e0byo, Sep 30 '21 at 09:24
Thank you for your comment. I have added a link to the mnist data, you can see it above. — Zhonghan Wang, Oct 01 '21 at 14:35
