method_LPP(Locality preserving projections)

云计算 waitig 494℃ 百度已收录 0评论

本文是对何晓飞老师的论文Locality Preserving Projections及其代码的一些简单j介绍,论文及代码均可以在何老师主页上下载。


一、LPP简介

  • 线性投影映射
  • 最优化地保存了数据集的邻近结构
  • 与PCA可作为二选一的技术
  • 在外围空间各处均有定义(不只在训练数据点上有定义,在新的测试数据点上也能够定义)



二、LPP算法实现


设有数据集,现在要找到一个转换矩阵将这m个点映射到新的数据集空间,因此便可以用表示.

1、构建邻接图

定义图G有m个节点,如果“邻近”,则在节点与节点之间连接一条边。两个变量:
(a)neighborhoods
如果满足:<img src="http://latex.codecogs.com/gif.latex?<&space;\epsilon" title=",则节点与节点之间有边连接。

(b) nearest neighbors

如果 nearest
neighbors内,或者 nearest
neighbors内,则节点与节点之间有边连接。


注:如果数据确实是在低维流内,则上述邻接图的构建成立。一旦该邻接图构建成功,LPP会试着将其构建为最优。


2、选择权重

定义为稀疏对称阵,维数为表示顶点与顶点的边的权重,如果之间没有边,则

(a)Heat kernel

如果连接,则有:

(b)Simple-minded

如果当且仅当顶点与顶点被一条边连接时,有:

3、特征映射

计算以下问题的特征值与特征向量:

 
                                                                                                                 (1)

其中,是对角矩阵,其元素值为的列和(或者行和,因为是对称阵),

是Laplacian矩阵。的第列记作

设(1)的解为列向量,按他们的特征值大小排序<img src="http://latex.codecogs.com/gif.latex?\left&space;(&space;\lambda&space;_{0}<&space;\cdots&space;<&space;\lambda&space;_{l-1}&space;\right&space;)" title="\left ( \lambda _{0}< \cdots ,降维过程化为:

其中维向量,维矩阵,即要求的转换矩阵。

三、LPP代码实现

LPP.m

[plain] view plain copy print?

  1. function [eigvector, eigvalue] = LPP(W, options, data)  
  2. % LPP: Locality Preserving Projections  
  3. %  
  4. %       [eigvector, eigvalue] = LPP(W, options, data)  
  5. %   
  6. %             Input:  
  7. %               data       – Data matrix. Each row vector of fea is a data point.  
  8. %               W       – Affinity matrix. You can either call “constructW”  
  9. %                         to construct the W, or construct it by yourself.  
  10. %               options – Struct value in Matlab. The fields in options  
  11. %                         that can be set:  
  12. %                             
  13. %                         Please see LGE.m for other options.  
  14. %  
  15. %             Output:  
  16. %               eigvector – Each column is an embedding function, for a new  
  17. %                           data point (row vector) x,  y = x * eigvector  
  18. %                           will be the embedding result of x.  
  19. %               eigvalue  – The sorted eigvalue of LPP eigen-problem.   
  20. %   
  21. %  
  22. %    Examples:  
  23. %  
  24. %       fea = rand(50,70);  
  25. %       options = [];  
  26. %       options.Metric = ‘Euclidean’;  
  27. %       options.NeighborMode = ‘KNN’;  
  28. %       options.k = 5;  
  29. %       options.WeightMode = ‘HeatKernel’;  
  30. %       options.t = 5;  
  31. %       W = constructW(fea,options);  
  32. %       options.PCARatio = 0.99  
  33. %       [eigvector, eigvalue] = LPP(W, options, fea);  
  34. %       Y = fea * eigvector;  
  35. %           
  36. %       fea = rand(50,70);  
  37. %       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];  
  38. %       options = [];  
  39. %       options.Metric = ‘Euclidean’;  
  40. %       options.NeighborMode = ‘Supervised’;  
  41. %       options.gnd = gnd;  
  42. %       options.bLDA = 1;  
  43. %       W = constructW(fea,options);        
  44. %       options.PCARatio = 1;  
  45. %       [eigvector, eigvalue] = LPP(W, options, fea);  
  46. %       Y = fea*eigvector;  
  47. %   
  48. %   
  49. % Note: After applying some simple algebra, the smallest eigenvalue problem:  
  50. %               data^T*L*data = \lemda data^T*D*data  
  51. %      is equivalent to the largest eigenvalue problem:  
  52. %               data^T*W*data = \beta data^T*D*data  
  53. %       where L=D-W;  \lemda= 1 – \beta.  
  54. % Thus, the smallest eigenvalue problem can be transformed to a largest   
  55. % eigenvalue problem. Such tricks are adopted in this code for the   
  56. % consideration of calculation precision of Matlab.  
  57. %   
  58. %  
  59. % See also constructW, LGE  
  60. %  
  61. %Reference:  
  62. %   Xiaofei He, and Partha Niyogi, “Locality Preserving Projections”  
  63. %   Advances in Neural Information Processing Systems 16 (NIPS 2003),  
  64. %   Vancouver, Canada, 2003.  
  65. %  
  66. %   Xiaofei He, Shuicheng Yan, Yuxiao Hu, Partha Niyogi, and Hong-Jiang  
  67. %   Zhang, “Face Recognition Using Laplacianfaces”, IEEE PAMI, Vol. 27, No.  
  68. %   3, Mar. 2005.   
  69. %  
  70. %   Deng Cai, Xiaofei He and Jiawei Han, “Document Clustering Using  
  71. %   Locality Preserving Indexing” IEEE TKDE, Dec. 2005.  
  72. %  
  73. %   Deng Cai, Xiaofei He and Jiawei Han, “Using Graph Model for Face Analysis”,  
  74. %   Technical Report, UIUCDCS-R-2005-2636, UIUC, Sept. 2005  
  75. %   
  76. %   Xiaofei He, “Locality Preserving Projections”  
  77. %   PhD’s thesis, Computer Science Department, The University of Chicago,  
  78. %   2005.  
  79. %  
  80. %   version 2.1 –June/2007   
  81. %   version 2.0 –May/2007   
  82. %   version 1.1 –Feb/2006   
  83. %   version 1.0 –April/2004   
  84. %  
  85. %   Written by Deng Cai (dengcai2 AT cs.uiuc.edu)  
  86. %  
  87.   
  88. if (~exist(‘options’,’var’))  
  89.    options = [];  
  90. end  
  91.   
  92. [nSmp,nFea] = size(data);  
  93. if size(W,1) ~= nSmp  
  94.     error(‘W and data mismatch!’);  
  95. end  
  96.   
  97. %====================================================  
  98. % If data is too large, the following centering codes can be commented   
  99. % options.keepMean = 1;  
  100. %====================================================  
  101. if isfield(options,’keepMean’) && options.keepMean  
  102.     ;  
  103. else  
  104.     if issparse(data)  
  105.         data = full(data);  
  106.     end  
  107.     sampleMean = mean(data);  
  108.     data = (data – repmat(sampleMean,nSmp,1));  
  109. end  
  110. %====================================================  
  111.   
  112. D = full(sum(W,2));  
  113.   
  114. if ~isfield(options,’Regu’) || ~options.Regu  
  115.     DToPowerHalf = D.^.5;  
  116.     D_mhalf = DToPowerHalf.^-1;  
  117.   
  118.     if nSmp < 5000  
  119.         tmpD_mhalf = repmat(D_mhalf,1,nSmp);  
  120.         W = (tmpD_mhalf .* W) .* tmpD_mhalf’;  
  121.         clear tmpD_mhalf;  
  122.     else  
  123.         [i_idx,j_idx,v_idx] = find(W);  
  124.         v1_idx = zeros(size(v_idx));  
  125.         for i=1:length(v_idx)  
  126.             v1_idx(i) = v_idx(i) * D_mhalf(i_idx(i)) * D_mhalf(j_idx(i));  
  127.         end  
  128.         W = sparse(i_idx,j_idx,v1_idx);  
  129.         clear i_idx j_idx v_idx v1_idx  
  130.     end  
  131.     W = max(W,W’);  
  132.       
  133.     data = repmat(DToPowerHalf,1,nFea) .* data;  
  134.     [eigvector, eigvalue] = LGE(W, [], options, data);  
  135. else  
  136.     options.ReguAlpha = options.ReguAlpha*sum(D)/length(D);  
  137.   
  138.     D = sparse(1:nSmp,1:nSmp,D,nSmp,nSmp);  
  139.     [eigvector, eigvalue] = LGE(W, D, options, data);  
  140. end  
  141.   
  142. eigIdx = find(eigvalue < 1e-3);  
  143. eigvalue (eigIdx) = [];  
  144. eigvector(:,eigIdx) = [];  
function [eigvector, eigvalue] = LPP(W, options, data)
% LPP: Locality Preserving Projections
%
%       [eigvector, eigvalue] = LPP(W, options, data)
% 
%             Input:
%               data       - Data matrix. Each row vector of fea is a data point.
%               W       - Affinity matrix. You can either call "constructW"
%                         to construct the W, or construct it by yourself.
%               options - Struct value in Matlab. The fields in options
%                         that can be set:
%                           
%                         Please see LGE.m for other options.
%
%             Output:
%               eigvector - Each column is an embedding function, for a new
%                           data point (row vector) x,  y = x * eigvector
%                           will be the embedding result of x.
%               eigvalue  - The sorted eigvalue of LPP eigen-problem. 
% 
%
%    Examples:
%
%       fea = rand(50,70);
%       options = [];
%       options.Metric = 'Euclidean';
%       options.NeighborMode = 'KNN';
%       options.k = 5;
%       options.WeightMode = 'HeatKernel';
%       options.t = 5;
%       W = constructW(fea,options);
%       options.PCARatio = 0.99
%       [eigvector, eigvalue] = LPP(W, options, fea);
%       Y = fea * eigvector;
%         
%       fea = rand(50,70);
%       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];
%       options = [];
%       options.Metric = 'Euclidean';
%       options.NeighborMode = 'Supervised';
%       options.gnd = gnd;
%       options.bLDA = 1;
%       W = constructW(fea,options);      
%       options.PCARatio = 1;
%       [eigvector, eigvalue] = LPP(W, options, fea);
%       Y = fea*eigvector;
% 
% 
% Note: After applying some simple algebra, the smallest eigenvalue problem:
%               data^T*L*data = \lemda data^T*D*data
%      is equivalent to the largest eigenvalue problem:
%               data^T*W*data = \beta data^T*D*data
%       where L=D-W;  \lemda= 1 - \beta.
% Thus, the smallest eigenvalue problem can be transformed to a largest 
% eigenvalue problem. Such tricks are adopted in this code for the 
% consideration of calculation precision of Matlab.
% 
%
% See also constructW, LGE
%
%Reference:
%   Xiaofei He, and Partha Niyogi, "Locality Preserving Projections"
%   Advances in Neural Information Processing Systems 16 (NIPS 2003),
%   Vancouver, Canada, 2003.
%
%   Xiaofei He, Shuicheng Yan, Yuxiao Hu, Partha Niyogi, and Hong-Jiang
%   Zhang, "Face Recognition Using Laplacianfaces", IEEE PAMI, Vol. 27, No.
%   3, Mar. 2005. 
%
%   Deng Cai, Xiaofei He and Jiawei Han, "Document Clustering Using
%   Locality Preserving Indexing" IEEE TKDE, Dec. 2005.
%
%   Deng Cai, Xiaofei He and Jiawei Han, "Using Graph Model for Face Analysis",
%   Technical Report, UIUCDCS-R-2005-2636, UIUC, Sept. 2005
% 
%   Xiaofei He, "Locality Preserving Projections"
%   PhD's thesis, Computer Science Department, The University of Chicago,
%   2005.
%
%   version 2.1 --June/2007 
%   version 2.0 --May/2007 
%   version 1.1 --Feb/2006 
%   version 1.0 --April/2004 
%
%   Written by Deng Cai (dengcai2 AT cs.uiuc.edu)
%

if (~exist('options','var'))
   options = [];
end

[nSmp,nFea] = size(data);
if size(W,1) ~= nSmp
    error('W and data mismatch!');
end

%====================================================
% If data is too large, the following centering codes can be commented 
% options.keepMean = 1;
%====================================================
if isfield(options,'keepMean') && options.keepMean
    ;
else
    if issparse(data)
        data = full(data);
    end
    sampleMean = mean(data);
    data = (data - repmat(sampleMean,nSmp,1));
end
%====================================================

D = full(sum(W,2));

if ~isfield(options,'Regu') || ~options.Regu
    DToPowerHalf = D.^.5;
    D_mhalf = DToPowerHalf.^-1;

    if nSmp < 5000
        tmpD_mhalf = repmat(D_mhalf,1,nSmp);
        W = (tmpD_mhalf .* W) .* tmpD_mhalf';
        clear tmpD_mhalf;
    else
        [i_idx,j_idx,v_idx] = find(W);
        v1_idx = zeros(size(v_idx));
        for i=1:length(v_idx)
            v1_idx(i) = v_idx(i) * D_mhalf(i_idx(i)) * D_mhalf(j_idx(i));
        end
        W = sparse(i_idx,j_idx,v1_idx);
        clear i_idx j_idx v_idx v1_idx
    end
    W = max(W,W');

    data = repmat(DToPowerHalf,1,nFea) .* data;
    [eigvector, eigvalue] = LGE(W, [], options, data);
else
    options.ReguAlpha = options.ReguAlpha*sum(D)/length(D);

    D = sparse(1:nSmp,1:nSmp,D,nSmp,nSmp);
    [eigvector, eigvalue] = LGE(W, D, options, data);
end

eigIdx = find(eigvalue < 1e-3);
eigvalue (eigIdx) = [];
eigvector(:,eigIdx) = [];

constructW.m

[plain] view plain copy print?

  1. function W = constructW(fea,options)  
  2. %   Usage:  
  3. %   W = constructW(fea,options)  
  4. %  
  5. %   fea: Rows of vectors of data points. Each row is x_i  
  6. %   options: Struct value in Matlab. The fields in options that can be set:  
  7. %                    
  8. %           NeighborMode –  Indicates how to construct the graph. Choices  
  9. %                           are: [Default ‘KNN’]  
  10. %                ‘KNN’            –  k = 0  
  11. %                                       Complete graph  
  12. %                                    k > 0  
  13. %                                      Put an edge between two nodes if and  
  14. %                                      only if they are among the k nearst  
  15. %                                      neighbors of each other. You are  
  16. %                                      required to provide the parameter k in  
  17. %                                      the options. Default k=5.  
  18. %               ‘Supervised’      –  k = 0  
  19. %                                       Put an edge between two nodes if and  
  20. %                                       only if they belong to same class.   
  21. %                                    k > 0  
  22. %                                       Put an edge between two nodes if  
  23. %                                       they belong to same class and they  
  24. %                                       are among the k nearst neighbors of  
  25. %                                       each other.   
  26. %                                    Default: k=0  
  27. %                                   You are required to provide the label  
  28. %                                   information gnd in the options.  
  29. %                                                
  30. %           WeightMode   –  Indicates how to assign weights for each edge  
  31. %                           in the graph. Choices are:  
  32. %               ‘Binary’       – 0-1 weighting. Every edge receiveds weight  
  33. %                                of 1.   
  34. %               ‘HeatKernel’   – If nodes i and j are connected, put weight  
  35. %                                W_ij = exp(-norm(x_i – x_j)/2t^2). You are   
  36. %                                required to provide the parameter t. [Default One]  
  37. %               ‘Cosine’       – If nodes i and j are connected, put weight  
  38. %                                cosine(x_i,x_j).   
  39. %                 
  40. %            k         –   The parameter needed under ‘KNN’ NeighborMode.  
  41. %                          Default will be 5.  
  42. %            gnd       –   The parameter needed under ‘Supervised’  
  43. %                          NeighborMode.  Colunm vector of the label  
  44. %                          information for each data point.  
  45. %            bLDA      –   0 or 1. Only effective under ‘Supervised’  
  46. %                          NeighborMode. If 1, the graph will be constructed  
  47. %                          to make LPP exactly same as LDA. Default will be  
  48. %                          0.   
  49. %            t         –   The parameter needed under ‘HeatKernel’  
  50. %                          WeightMode. Default will be 1  
  51. %         bNormalized  –   0 or 1. Only effective under ‘Cosine’ WeightMode.  
  52. %                          Indicates whether the fea are already be  
  53. %                          normalized to 1. Default will be 0  
  54. %      bSelfConnected  –   0 or 1. Indicates whether W(i,i) == 1. Default 0  
  55. %                          if ‘Supervised’ NeighborMode & bLDA == 1,  
  56. %                          bSelfConnected will always be 1. Default 0.  
  57. %            bTrueKNN  –   0 or 1. If 1, will construct a truly kNN graph  
  58. %                          (Not symmetric!). Default will be 0. Only valid  
  59. %                          for ‘KNN’ NeighborMode  
  60. %  
  61. %  
  62. %    Examples:  
  63. %  
  64. %       fea = rand(50,15);  
  65. %       options = [];  
  66. %       options.NeighborMode = ‘KNN’;  
  67. %       options.k = 5;  
  68. %       options.WeightMode = ‘HeatKernel’;  
  69. %       options.t = 1;  
  70. %       W = constructW(fea,options);  
  71. %         
  72. %         
  73. %       fea = rand(50,15);  
  74. %       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];  
  75. %       options = [];  
  76. %       options.NeighborMode = ‘Supervised’;  
  77. %       options.gnd = gnd;  
  78. %       options.WeightMode = ‘HeatKernel’;  
  79. %       options.t = 1;  
  80. %       W = constructW(fea,options);  
  81. %         
  82. %         
  83. %       fea = rand(50,15);  
  84. %       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];  
  85. %       options = [];  
  86. %       options.NeighborMode = ‘Supervised’;  
  87. %       options.gnd = gnd;  
  88. %       options.bLDA = 1;  
  89. %       W = constructW(fea,options);        
  90. %         
  91. %  
  92. %    For more details about the different ways to construct the W, please  
  93. %    refer:  
  94. %       Deng Cai, Xiaofei He and Jiawei Han, “Document Clustering Using  
  95. %       Locality Preserving Indexing” IEEE TKDE, Dec. 2005.  
  96. %      
  97. %  
  98. %    Written by Deng Cai (dengcai2 AT cs.uiuc.edu), April/2004, Feb/2006,  
  99. %                                             May/2007  
  100. %   
  101.   
  102. bSpeed  = 1;  
  103.   
  104. if (~exist(‘options’,’var’))  
  105.    options = [];  
  106. end  
  107.   
  108. if isfield(options,’Metric’)  
  109.     warning(‘This function has been changed and the Metric is no longer be supported’);  
  110. end  
  111.   
  112.   
  113. if ~isfield(options,’bNormalized’)  
  114.     options.bNormalized = 0;  
  115. end  
  116.   
  117. %=================================================  
  118. if ~isfield(options,’NeighborMode’)  
  119.     options.NeighborMode = ‘KNN’;  
  120. end  
  121.   
  122. switch lower(options.NeighborMode)  
  123.     case {lower(‘KNN’)}  %For simplicity, we include the data point itself in the kNN  
  124.         if ~isfield(options,’k’)  
  125.             options.k = 5;  
  126.         end  
  127.     case {lower(‘Supervised’)}  
  128.         if ~isfield(options,’bLDA’)  
  129.             options.bLDA = 0;  
  130.         end  
  131.         if options.bLDA  
  132.             options.bSelfConnected = 1;  
  133.         end  
  134.         if ~isfield(options,’k’)  
  135.             options.k = 0;  
  136.         end  
  137.         if ~isfield(options,’gnd’)  
  138.             error(‘Label(gnd) should be provided under ”Supervised” NeighborMode!’);  
  139.         end  
  140.         if ~isempty(fea) && length(options.gnd) ~= size(fea,1)  
  141.             error(‘gnd doesn”t match with fea!’);  
  142.         end  
  143.     otherwise  
  144.         error(‘NeighborMode does not exist!’);  
  145. end  
  146.   
  147. %=================================================  
  148.   
  149. if ~isfield(options,’WeightMode’)  
  150.     options.WeightMode = ‘HeatKernel’;  
  151. end  
  152.   
  153. bBinary = 0;  
  154. bCosine = 0;  
  155. switch lower(options.WeightMode)  
  156.     case {lower(‘Binary’)}  
  157.         bBinary = 1;   
  158.     case {lower(‘HeatKernel’)}  
  159.         if ~isfield(options,’t’)  
  160.             nSmp = size(fea,1);  
  161.             if nSmp > 3000  
  162.                 D = EuDist2(fea(randsample(nSmp,3000),:));  
  163.             else  
  164.                 D = EuDist2(fea);  
  165.             end  
  166.             options.t = mean(mean(D));  
  167.         end  
  168.     case {lower(‘Cosine’)}  
  169.         bCosine = 1;  
  170.     otherwise  
  171.         error(‘WeightMode does not exist!’);  
  172. end  
  173.   
  174. %=================================================  
  175.   
  176. if ~isfield(options,’bSelfConnected’)  
  177.     options.bSelfConnected = 0;  
  178. end  
  179.   
  180. %=================================================  
  181.   
  182. if isfield(options,’gnd’)   
  183.     nSmp = length(options.gnd);  
  184. else  
  185.     nSmp = size(fea,1);  
  186. end  
  187. maxM = 62500000; %500M  
  188. BlockSize = floor(maxM/(nSmp*3));  
  189.   
  190.   
  191. if strcmpi(options.NeighborMode,’Supervised’)  
  192.     Label = unique(options.gnd);  
  193.     nLabel = length(Label);  
  194.     if options.bLDA  
  195.         G = zeros(nSmp,nSmp);  
  196.         for idx=1:nLabel  
  197.             classIdx = options.gnd==Label(idx);  
  198.             G(classIdx,classIdx) = 1/sum(classIdx);  
  199.         end  
  200.         W = sparse(G);  
  201.         return;  
  202.     end  
  203.       
  204.     switch lower(options.WeightMode)  
  205.         case {lower(‘Binary’)}  
  206.             if options.k > 0  
  207.                 G = zeros(nSmp*(options.k+1),3);  
  208.                 idNow = 0;  
  209.                 for i=1:nLabel  
  210.                     classIdx = find(options.gnd==Label(i));  
  211.                     D = EuDist2(fea(classIdx,:),[],0);  
  212.                     [dump idx] = sort(D,2); % sort each row  
  213.                     clear D dump;  
  214.                     idx = idx(:,1:options.k+1);  
  215.                       
  216.                     nSmpClass = length(classIdx)*(options.k+1);  
  217.                     G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);  
  218.                     G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));  
  219.                     G(idNow+1:nSmpClass+idNow,3) = 1;  
  220.                     idNow = idNow+nSmpClass;  
  221.                     clear idx  
  222.                 end  
  223.                 G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);  
  224.                 G = max(G,G’);  
  225.             else  
  226.                 G = zeros(nSmp,nSmp);  
  227.                 for i=1:nLabel  
  228.                     classIdx = find(options.gnd==Label(i));  
  229.                     G(classIdx,classIdx) = 1;  
  230.                 end  
  231.             end  
  232.               
  233.             if ~options.bSelfConnected  
  234.                 for i=1:size(G,1)  
  235.                     G(i,i) = 0;  
  236.                 end  
  237.             end  
  238.               
  239.             W = sparse(G);  
  240.         case {lower(‘HeatKernel’)}  
  241.             if options.k > 0  
  242.                 G = zeros(nSmp*(options.k+1),3);  
  243.                 idNow = 0;  
  244.                 for i=1:nLabel  
  245.                     classIdx = find(options.gnd==Label(i));  
  246.                     D = EuDist2(fea(classIdx,:),[],0);  
  247.                     [dump idx] = sort(D,2); % sort each row  
  248.                     clear D;  
  249.                     idx = idx(:,1:options.k+1);  
  250.                     dump = dump(:,1:options.k+1);  
  251.                     dump = exp(-dump/(2*options.t^2));  
  252.                       
  253.                     nSmpClass = length(classIdx)*(options.k+1);  
  254.                     G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);  
  255.                     G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));  
  256.                     G(idNow+1:nSmpClass+idNow,3) = dump(:);  
  257.                     idNow = idNow+nSmpClass;  
  258.                     clear dump idx  
  259.                 end  
  260.                 G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);  
  261.             else  
  262.                 G = zeros(nSmp,nSmp);  
  263.                 for i=1:nLabel  
  264.                     classIdx = find(options.gnd==Label(i));  
  265.                     D = EuDist2(fea(classIdx,:),[],0);  
  266.                     D = exp(-D/(2*options.t^2));  
  267.                     G(classIdx,classIdx) = D;  
  268.                 end  
  269.             end  
  270.               
  271.             if ~options.bSelfConnected  
  272.                 for i=1:size(G,1)  
  273.                     G(i,i) = 0;  
  274.                 end  
  275.             end  
  276.   
  277.             W = sparse(max(G,G’));  
  278.         case {lower(‘Cosine’)}  
  279.             if ~options.bNormalized  
  280.                 fea = NormalizeFea(fea);  
  281.             end  
  282.   
  283.             if options.k > 0  
  284.                 G = zeros(nSmp*(options.k+1),3);  
  285.                 idNow = 0;  
  286.                 for i=1:nLabel  
  287.                     classIdx = find(options.gnd==Label(i));  
  288.                     D = fea(classIdx,:)*fea(classIdx,:)’;  
  289.                     [dump idx] = sort(-D,2); % sort each row  
  290.                     clear D;  
  291.                     idx = idx(:,1:options.k+1);  
  292.                     dump = -dump(:,1:options.k+1);  
  293.                       
  294.                     nSmpClass = length(classIdx)*(options.k+1);  
  295.                     G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);  
  296.                     G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));  
  297.                     G(idNow+1:nSmpClass+idNow,3) = dump(:);  
  298.                     idNow = idNow+nSmpClass;  
  299.                     clear dump idx  
  300.                 end  
  301.                 G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);  
  302.             else  
  303.                 G = zeros(nSmp,nSmp);  
  304.                 for i=1:nLabel  
  305.                     classIdx = find(options.gnd==Label(i));  
  306.                     G(classIdx,classIdx) = fea(classIdx,:)*fea(classIdx,:)’;  
  307.                 end  
  308.             end  
  309.   
  310.             if ~options.bSelfConnected  
  311.                 for i=1:size(G,1)  
  312.                     G(i,i) = 0;  
  313.                 end  
  314.             end  
  315.   
  316.             W = sparse(max(G,G’));  
  317.         otherwise  
  318.             error(‘WeightMode does not exist!’);  
  319.     end  
  320.     return;  
  321. end  
  322.   
  323.   
  324. if bCosine && ~options.bNormalized  
  325.     Normfea = NormalizeFea(fea);  
  326. end  
  327.   
  328. if strcmpi(options.NeighborMode,’KNN’) && (options.k > 0)  
  329.     if ~(bCosine && options.bNormalized)  
  330.         G = zeros(nSmp*(options.k+1),3);  
  331.         for i = 1:ceil(nSmp/BlockSize)  
  332.             if i == ceil(nSmp/BlockSize)  
  333.                 smpIdx = (i-1)*BlockSize+1:nSmp;  
  334.                 dist = EuDist2(fea(smpIdx,:),fea,0);  
  335.   
  336.                 if bSpeed  
  337.                     nSmpNow = length(smpIdx);  
  338.                     dump = zeros(nSmpNow,options.k+1);  
  339.                     idx = dump;  
  340.                     for j = 1:options.k+1  
  341.                         [dump(:,j),idx(:,j)] = min(dist,[],2);  
  342.                         temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]’;  
  343.                         dist(temp) = 1e100;  
  344.                     end  
  345.                 else  
  346.                     [dump idx] = sort(dist,2); % sort each row  
  347.                     idx = idx(:,1:options.k+1);  
  348.                     dump = dump(:,1:options.k+1);  
  349.                 end  
  350.                   
  351.                 if ~bBinary  
  352.                     if bCosine  
  353.                         dist = Normfea(smpIdx,:)*Normfea’;  
  354.                         dist = full(dist);  
  355.                         linidx = [1:size(idx,1)]’;  
  356.                         dump = dist(sub2ind(size(dist),linidx(:,ones(1,size(idx,2))),idx));  
  357.                     else  
  358.                         dump = exp(-dump/(2*options.t^2));  
  359.                     end  
  360.                 end  
  361.                   
  362.                 G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),1) = repmat(smpIdx’,[options.k+1,1]);  
  363.                 G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),2) = idx(:);  
  364.                 if ~bBinary  
  365.                     G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = dump(:);  
  366.                 else  
  367.                     G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = 1;  
  368.                 end  
  369.             else  
  370.                 smpIdx = (i-1)*BlockSize+1:i*BlockSize;  
  371.               
  372.                 dist = EuDist2(fea(smpIdx,:),fea,0);  
  373.                   
  374.                 if bSpeed  
  375.                     nSmpNow = length(smpIdx);  
  376.                     dump = zeros(nSmpNow,options.k+1);  
  377.                     idx = dump;  
  378.                     for j = 1:options.k+1  
  379.                         [dump(:,j),idx(:,j)] = min(dist,[],2);  
  380.                         temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]’;  
  381.                         dist(temp) = 1e100;  
  382.                     end  
  383.                 else  
  384.                     [dump idx] = sort(dist,2); % sort each row  
  385.                     idx = idx(:,1:options.k+1);  
  386.                     dump = dump(:,1:options.k+1);  
  387.                 end  
  388.                   
  389.                 if ~bBinary  
  390.                     if bCosine  
  391.                         dist = Normfea(smpIdx,:)*Normfea’;  
  392.                         dist = full(dist);  
  393.                         linidx = [1:size(idx,1)]’;  
  394.                         dump = dist(sub2ind(size(dist),linidx(:,ones(1,size(idx,2))),idx));  
  395.                     else  
  396.                         dump = exp(-dump/(2*options.t^2));  
  397.                     end  
  398.                 end  
  399.                   
  400.                 G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),1) = repmat(smpIdx’,[options.k+1,1]);  
  401.                 G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),2) = idx(:);  
  402.                 if ~bBinary  
  403.                     G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = dump(:);  
  404.                 else  
  405.                     G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = 1;  
  406.                 end  
  407.             end  
  408.         end  
  409.   
  410.         W = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);  
  411.     else  
  412.         G = zeros(nSmp*(options.k+1),3);  
  413.         for i = 1:ceil(nSmp/BlockSize)  
  414.             if i == ceil(nSmp/BlockSize)  
  415.                 smpIdx = (i-1)*BlockSize+1:nSmp;  
  416.                 dist = fea(smpIdx,:)*fea’;  
  417.                 dist = full(dist);  
  418.   
  419.                 if bSpeed  
  420.                     nSmpNow = length(smpIdx);  
  421.                     dump = zeros(nSmpNow,options.k+1);  
  422.                     idx = dump;  
  423.                     for j = 1:options.k+1  
  424.                         [dump(:,j),idx(:,j)] = max(dist,[],2);  
  425.                         temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]’;  
  426.                         dist(temp) = 0;  
  427.                     end  
  428.                 else  
  429.                     [dump idx] = sort(-dist,2); % sort each row  
  430.                     idx = idx(:,1:options.k+1);  
  431.                     dump = -dump(:,1:options.k+1);  
  432.                 end  
  433.   
  434.                 G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),1) = repmat(smpIdx’,[options.k+1,1]);  
  435.                 G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),2) = idx(:);  
  436.                 G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = dump(:);  
  437.             else  
  438.                 smpIdx = (i-1)*BlockSize+1:i*BlockSize;  
  439.                 dist = fea(smpIdx,:)*fea’;  
  440.                 dist = full(dist);  
  441.                   
  442.                 if bSpeed  
  443.                     nSmpNow = length(smpIdx);  
  444.                     dump = zeros(nSmpNow,options.k+1);  
  445.                     idx = dump;  
  446.                     for j = 1:options.k+1  
  447.                         [dump(:,j),idx(:,j)] = max(dist,[],2);  
  448.                         temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]’;  
  449.                         dist(temp) = 0;  
  450.                     end  
  451.                 else  
  452.                     [dump idx] = sort(-dist,2); % sort each row  
  453.                     idx = idx(:,1:options.k+1);  
  454.                     dump = -dump(:,1:options.k+1);  
  455.                 end  
  456.   
  457.                 G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),1) = repmat(smpIdx’,[options.k+1,1]);  
  458.                 G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),2) = idx(:);  
  459.                 G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = dump(:);  
  460.             end  
  461.         end  
  462.   
  463.         W = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);  
  464.     end  
  465.       
  466.     if bBinary  
  467.         W(logical(W)) = 1;  
  468.     end  
  469.       
  470.     if isfield(options,’bSemiSupervised’) && options.bSemiSupervised  
  471.         tmpgnd = options.gnd(options.semiSplit);  
  472.           
  473.         Label = unique(tmpgnd);  
  474.         nLabel = length(Label);  
  475.         G = zeros(sum(options.semiSplit),sum(options.semiSplit));  
  476.         for idx=1:nLabel  
  477.             classIdx = tmpgnd==Label(idx);  
  478.             G(classIdx,classIdx) = 1;  
  479.         end  
  480.         Wsup = sparse(G);  
  481.         if ~isfield(options,’SameCategoryWeight’)  
  482.             options.SameCategoryWeight = 1;  
  483.         end  
  484.         W(options.semiSplit,options.semiSplit) = (Wsup>0)*options.SameCategoryWeight;  
  485.     end  
  486.       
  487.     if ~options.bSelfConnected  
  488.         W = W – diag(diag(W));  
  489.     end  
  490.   
  491.     if isfield(options,’bTrueKNN’) && options.bTrueKNN  
  492.           
  493.     else  
  494.         W = max(W,W’);  
  495.     end  
  496.       
  497.     return;  
  498. end  
  499.   
  500.   
  501. % strcmpi(options.NeighborMode,’KNN’) & (options.k == 0)  
  502. % Complete Graph  
  503.   
  504. switch lower(options.WeightMode)  
  505.     case {lower(‘Binary’)}  
  506.         error(‘Binary weight can not be used for complete graph!’);  
  507.     case {lower(‘HeatKernel’)}  
  508.         W = EuDist2(fea,[],0);  
  509.         W = exp(-W/(2*options.t^2));  
  510.     case {lower(‘Cosine’)}  
  511.         W = full(Normfea*Normfea’);  
  512.     otherwise  
  513.         error(‘WeightMode does not exist!’);  
  514. end  
  515.   
  516. if ~options.bSelfConnected  
  517.     for i=1:size(W,1)  
  518.         W(i,i) = 0;  
  519.     end  
  520. end  
  521.   
  522. W = max(W,W’);  
function W = constructW(fea,options)
%   Usage:
%   W = constructW(fea,options)
%
%   fea: Rows of vectors of data points. Each row is x_i
%   options: Struct value in Matlab. The fields in options that can be set:
%                  
%           NeighborMode -  Indicates how to construct the graph. Choices
%                           are: [Default 'KNN']
%                'KNN'            -  k = 0
%                                       Complete graph
%                                    k > 0
%                                      Put an edge between two nodes if and
%                                      only if they are among the k nearst
%                                      neighbors of each other. You are
%                                      required to provide the parameter k in
%                                      the options. Default k=5.
%               'Supervised'      -  k = 0
%                                       Put an edge between two nodes if and
%                                       only if they belong to same class. 
%                                    k > 0
%                                       Put an edge between two nodes if
%                                       they belong to same class and they
%                                       are among the k nearst neighbors of
%                                       each other. 
%                                    Default: k=0
%                                   You are required to provide the label
%                                   information gnd in the options.
%                                              
%           WeightMode   -  Indicates how to assign weights for each edge
%                           in the graph. Choices are:
%               'Binary'       - 0-1 weighting. Every edge receiveds weight
%                                of 1. 
%               'HeatKernel'   - If nodes i and j are connected, put weight
%                                W_ij = exp(-norm(x_i - x_j)/2t^2). You are 
%                                required to provide the parameter t. [Default One]
%               'Cosine'       - If nodes i and j are connected, put weight
%                                cosine(x_i,x_j). 
%               
%            k         -   The parameter needed under 'KNN' NeighborMode.
%                          Default will be 5.
%            gnd       -   The parameter needed under 'Supervised'
%                          NeighborMode.  Colunm vector of the label
%                          information for each data point.
%            bLDA      -   0 or 1. Only effective under 'Supervised'
%                          NeighborMode. If 1, the graph will be constructed
%                          to make LPP exactly same as LDA. Default will be
%                          0. 
%            t         -   The parameter needed under 'HeatKernel'
%                          WeightMode. Default will be 1
%         bNormalized  -   0 or 1. Only effective under 'Cosine' WeightMode.
%                          Indicates whether the fea are already be
%                          normalized to 1. Default will be 0
%      bSelfConnected  -   0 or 1. Indicates whether W(i,i) == 1. Default 0
%                          if 'Supervised' NeighborMode & bLDA == 1,
%                          bSelfConnected will always be 1. Default 0.
%            bTrueKNN  -   0 or 1. If 1, will construct a truly kNN graph
%                          (Not symmetric!). Default will be 0. Only valid
%                          for 'KNN' NeighborMode
%
%
%    Examples:
%
%       fea = rand(50,15);
%       options = [];
%       options.NeighborMode = 'KNN';
%       options.k = 5;
%       options.WeightMode = 'HeatKernel';
%       options.t = 1;
%       W = constructW(fea,options);
%       
%       
%       fea = rand(50,15);
%       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];
%       options = [];
%       options.NeighborMode = 'Supervised';
%       options.gnd = gnd;
%       options.WeightMode = 'HeatKernel';
%       options.t = 1;
%       W = constructW(fea,options);
%       
%       
%       fea = rand(50,15);
%       gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];
%       options = [];
%       options.NeighborMode = 'Supervised';
%       options.gnd = gnd;
%       options.bLDA = 1;
%       W = constructW(fea,options);      
%       
%
%    For more details about the different ways to construct the W, please
%    refer:
%       Deng Cai, Xiaofei He and Jiawei Han, "Document Clustering Using
%       Locality Preserving Indexing" IEEE TKDE, Dec. 2005.
%    
%
%    Written by Deng Cai (dengcai2 AT cs.uiuc.edu), April/2004, Feb/2006,
%                                             May/2007
% 

bSpeed  = 1;

if (~exist('options','var'))
   options = [];
end

if isfield(options,'Metric')
    warning('This function has been changed and the Metric is no longer be supported');
end


if ~isfield(options,'bNormalized')
    options.bNormalized = 0;
end

%=================================================
if ~isfield(options,'NeighborMode')
    options.NeighborMode = 'KNN';
end

switch lower(options.NeighborMode)
    case {lower('KNN')}  %For simplicity, we include the data point itself in the kNN
        if ~isfield(options,'k')
            options.k = 5;
        end
    case {lower('Supervised')}
        if ~isfield(options,'bLDA')
            options.bLDA = 0;
        end
        if options.bLDA
            options.bSelfConnected = 1;
        end
        if ~isfield(options,'k')
            options.k = 0;
        end
        if ~isfield(options,'gnd')
            error('Label(gnd) should be provided under ''Supervised'' NeighborMode!');
        end
        if ~isempty(fea) && length(options.gnd) ~= size(fea,1)
            error('gnd doesn''t match with fea!');
        end
    otherwise
        error('NeighborMode does not exist!');
end

%=================================================

if ~isfield(options,'WeightMode')
    options.WeightMode = 'HeatKernel';
end

bBinary = 0;
bCosine = 0;
switch lower(options.WeightMode)
    case {lower('Binary')}
        bBinary = 1; 
    case {lower('HeatKernel')}
        if ~isfield(options,'t')
            nSmp = size(fea,1);
            if nSmp > 3000
                D = EuDist2(fea(randsample(nSmp,3000),:));
            else
                D = EuDist2(fea);
            end
            options.t = mean(mean(D));
        end
    case {lower('Cosine')}
        bCosine = 1;
    otherwise
        error('WeightMode does not exist!');
end

%=================================================

if ~isfield(options,'bSelfConnected')
    options.bSelfConnected = 0;
end

%=================================================

if isfield(options,'gnd') 
    nSmp = length(options.gnd);
else
    nSmp = size(fea,1);
end
maxM = 62500000; %500M
BlockSize = floor(maxM/(nSmp*3));


if strcmpi(options.NeighborMode,'Supervised')
    Label = unique(options.gnd);
    nLabel = length(Label);
    if options.bLDA
        G = zeros(nSmp,nSmp);
        for idx=1:nLabel
            classIdx = options.gnd==Label(idx);
            G(classIdx,classIdx) = 1/sum(classIdx);
        end
        W = sparse(G);
        return;
    end

    switch lower(options.WeightMode)
        case {lower('Binary')}
            if options.k > 0
                G = zeros(nSmp*(options.k+1),3);
                idNow = 0;
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    D = EuDist2(fea(classIdx,:),[],0);
                    [dump idx] = sort(D,2); % sort each row
                    clear D dump;
                    idx = idx(:,1:options.k+1);

                    nSmpClass = length(classIdx)*(options.k+1);
                    G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);
                    G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));
                    G(idNow+1:nSmpClass+idNow,3) = 1;
                    idNow = idNow+nSmpClass;
                    clear idx
                end
                G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);
                G = max(G,G');
            else
                G = zeros(nSmp,nSmp);
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    G(classIdx,classIdx) = 1;
                end
            end

            if ~options.bSelfConnected
                for i=1:size(G,1)
                    G(i,i) = 0;
                end
            end

            W = sparse(G);
        case {lower('HeatKernel')}
            if options.k > 0
                G = zeros(nSmp*(options.k+1),3);
                idNow = 0;
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    D = EuDist2(fea(classIdx,:),[],0);
                    [dump idx] = sort(D,2); % sort each row
                    clear D;
                    idx = idx(:,1:options.k+1);
                    dump = dump(:,1:options.k+1);
                    dump = exp(-dump/(2*options.t^2));

                    nSmpClass = length(classIdx)*(options.k+1);
                    G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);
                    G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));
                    G(idNow+1:nSmpClass+idNow,3) = dump(:);
                    idNow = idNow+nSmpClass;
                    clear dump idx
                end
                G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);
            else
                G = zeros(nSmp,nSmp);
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    D = EuDist2(fea(classIdx,:),[],0);
                    D = exp(-D/(2*options.t^2));
                    G(classIdx,classIdx) = D;
                end
            end

            if ~options.bSelfConnected
                for i=1:size(G,1)
                    G(i,i) = 0;
                end
            end

            W = sparse(max(G,G'));
        case {lower('Cosine')}
            if ~options.bNormalized
                fea = NormalizeFea(fea);
            end

            if options.k > 0
                G = zeros(nSmp*(options.k+1),3);
                idNow = 0;
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    D = fea(classIdx,:)*fea(classIdx,:)';
                    [dump idx] = sort(-D,2); % sort each row
                    clear D;
                    idx = idx(:,1:options.k+1);
                    dump = -dump(:,1:options.k+1);

                    nSmpClass = length(classIdx)*(options.k+1);
                    G(idNow+1:nSmpClass+idNow,1) = repmat(classIdx,[options.k+1,1]);
                    G(idNow+1:nSmpClass+idNow,2) = classIdx(idx(:));
                    G(idNow+1:nSmpClass+idNow,3) = dump(:);
                    idNow = idNow+nSmpClass;
                    clear dump idx
                end
                G = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);
            else
                G = zeros(nSmp,nSmp);
                for i=1:nLabel
                    classIdx = find(options.gnd==Label(i));
                    G(classIdx,classIdx) = fea(classIdx,:)*fea(classIdx,:)';
                end
            end

            if ~options.bSelfConnected
                for i=1:size(G,1)
                    G(i,i) = 0;
                end
            end

            W = sparse(max(G,G'));
        otherwise
            error('WeightMode does not exist!');
    end
    return;
end


if bCosine && ~options.bNormalized
    Normfea = NormalizeFea(fea);
end

if strcmpi(options.NeighborMode,'KNN') && (options.k > 0)
    if ~(bCosine && options.bNormalized)
        G = zeros(nSmp*(options.k+1),3);
        for i = 1:ceil(nSmp/BlockSize)
            if i == ceil(nSmp/BlockSize)
                smpIdx = (i-1)*BlockSize+1:nSmp;
                dist = EuDist2(fea(smpIdx,:),fea,0);

                if bSpeed
                    nSmpNow = length(smpIdx);
                    dump = zeros(nSmpNow,options.k+1);
                    idx = dump;
                    for j = 1:options.k+1
                        [dump(:,j),idx(:,j)] = min(dist,[],2);
                        temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]';
                        dist(temp) = 1e100;
                    end
                else
                    [dump idx] = sort(dist,2); % sort each row
                    idx = idx(:,1:options.k+1);
                    dump = dump(:,1:options.k+1);
                end

                if ~bBinary
                    if bCosine
                        dist = Normfea(smpIdx,:)*Normfea';
                        dist = full(dist);
                        linidx = [1:size(idx,1)]';
                        dump = dist(sub2ind(size(dist),linidx(:,ones(1,size(idx,2))),idx));
                    else
                        dump = exp(-dump/(2*options.t^2));
                    end
                end

                G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),1) = repmat(smpIdx',[options.k+1,1]);
                G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),2) = idx(:);
                if ~bBinary
                    G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = dump(:);
                else
                    G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = 1;
                end
            else
                smpIdx = (i-1)*BlockSize+1:i*BlockSize;

                dist = EuDist2(fea(smpIdx,:),fea,0);

                if bSpeed
                    nSmpNow = length(smpIdx);
                    dump = zeros(nSmpNow,options.k+1);
                    idx = dump;
                    for j = 1:options.k+1
                        [dump(:,j),idx(:,j)] = min(dist,[],2);
                        temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]';
                        dist(temp) = 1e100;
                    end
                else
                    [dump idx] = sort(dist,2); % sort each row
                    idx = idx(:,1:options.k+1);
                    dump = dump(:,1:options.k+1);
                end

                if ~bBinary
                    if bCosine
                        dist = Normfea(smpIdx,:)*Normfea';
                        dist = full(dist);
                        linidx = [1:size(idx,1)]';
                        dump = dist(sub2ind(size(dist),linidx(:,ones(1,size(idx,2))),idx));
                    else
                        dump = exp(-dump/(2*options.t^2));
                    end
                end

                G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),1) = repmat(smpIdx',[options.k+1,1]);
                G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),2) = idx(:);
                if ~bBinary
                    G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = dump(:);
                else
                    G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = 1;
                end
            end
        end

        W = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);
    else
        G = zeros(nSmp*(options.k+1),3);
        for i = 1:ceil(nSmp/BlockSize)
            if i == ceil(nSmp/BlockSize)
                smpIdx = (i-1)*BlockSize+1:nSmp;
                dist = fea(smpIdx,:)*fea';
                dist = full(dist);

                if bSpeed
                    nSmpNow = length(smpIdx);
                    dump = zeros(nSmpNow,options.k+1);
                    idx = dump;
                    for j = 1:options.k+1
                        [dump(:,j),idx(:,j)] = max(dist,[],2);
                        temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]';
                        dist(temp) = 0;
                    end
                else
                    [dump idx] = sort(-dist,2); % sort each row
                    idx = idx(:,1:options.k+1);
                    dump = -dump(:,1:options.k+1);
                end

                G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),1) = repmat(smpIdx',[options.k+1,1]);
                G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),2) = idx(:);
                G((i-1)*BlockSize*(options.k+1)+1:nSmp*(options.k+1),3) = dump(:);
            else
                smpIdx = (i-1)*BlockSize+1:i*BlockSize;
                dist = fea(smpIdx,:)*fea';
                dist = full(dist);

                if bSpeed
                    nSmpNow = length(smpIdx);
                    dump = zeros(nSmpNow,options.k+1);
                    idx = dump;
                    for j = 1:options.k+1
                        [dump(:,j),idx(:,j)] = max(dist,[],2);
                        temp = (idx(:,j)-1)*nSmpNow+[1:nSmpNow]';
                        dist(temp) = 0;
                    end
                else
                    [dump idx] = sort(-dist,2); % sort each row
                    idx = idx(:,1:options.k+1);
                    dump = -dump(:,1:options.k+1);
                end

                G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),1) = repmat(smpIdx',[options.k+1,1]);
                G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),2) = idx(:);
                G((i-1)*BlockSize*(options.k+1)+1:i*BlockSize*(options.k+1),3) = dump(:);
            end
        end

        W = sparse(G(:,1),G(:,2),G(:,3),nSmp,nSmp);
    end

    if bBinary
        W(logical(W)) = 1;
    end

    if isfield(options,'bSemiSupervised') && options.bSemiSupervised
        tmpgnd = options.gnd(options.semiSplit);

        Label = unique(tmpgnd);
        nLabel = length(Label);
        G = zeros(sum(options.semiSplit),sum(options.semiSplit));
        for idx=1:nLabel
            classIdx = tmpgnd==Label(idx);
            G(classIdx,classIdx) = 1;
        end
        Wsup = sparse(G);
        if ~isfield(options,'SameCategoryWeight')
            options.SameCategoryWeight = 1;
        end
        W(options.semiSplit,options.semiSplit) = (Wsup>0)*options.SameCategoryWeight;
    end

    if ~options.bSelfConnected
        W = W - diag(diag(W));
    end

    if isfield(options,'bTrueKNN') && options.bTrueKNN

    else
        W = max(W,W');
    end

    return;
end


% strcmpi(options.NeighborMode,'KNN') & (options.k == 0)
% Complete Graph

switch lower(options.WeightMode)
    case {lower('Binary')}
        error('Binary weight can not be used for complete graph!');
    case {lower('HeatKernel')}
        W = EuDist2(fea,[],0);
        W = exp(-W/(2*options.t^2));
    case {lower('Cosine')}
        W = full(Normfea*Normfea');
    otherwise
        error('WeightMode does not exist!');
end

if ~options.bSelfConnected
    for i=1:size(W,1)
        W(i,i) = 0;
    end
end

W = max(W,W');


LGE.w

[plain] view plain copy print?

  1. function [eigvector, eigvalue] = LGE(W, D, options, data)  
  2. % LGE: Linear Graph Embedding  
  3. %  
  4. %       [eigvector, eigvalue] = LGE(W, D, options, data)  
  5. %   
  6. %             Input:  
  7. %               data       – data matrix. Each row vector of data is a  
  8. %                         sample vector.   
  9. %               W       – Affinity graph matrix.   
  10. %               D       – Constraint graph matrix.   
  11. %                         LGE solves the optimization problem of   
  12. %                         a* = argmax (a’data’WXa)/(a’data’DXa)   
  13. %                         Default: D = I   
  14. %  
  15. %               options – Struct value in Matlab. The fields in options  
  16. %                         that can be set:  
  17. %  
  18. %                     ReducedDim   –  The dimensionality of the reduced  
  19. %                                     subspace. If 0, all the dimensions  
  20. %                                     will be kept. Default is 30.   
  21. %  
  22. %                            Regu  –  1: regularized solution,   
  23. %                                        a* = argmax (a’X’WXa)/(a’X’DXa+ReguAlpha*I)   
  24. %                                     0: solve the sinularity problem by SVD (PCA)   
  25. %                                     Default: 0   
  26. %  
  27. %                        ReguAlpha –  The regularization parameter. Valid  
  28. %                                     when Regu==1. Default value is 0.1.   
  29. %  
  30. %                            ReguType  –  ‘Ridge’: Tikhonov regularization  
  31. %                                         ‘Custom’: User provided  
  32. %                                                   regularization matrix  
  33. %                                          Default: ‘Ridge’   
  34. %                        regularizerR  –   (nFea x nFea) regularization  
  35. %                                          matrix which should be provided  
  36. %                                          if ReguType is ‘Custom’. nFea is  
  37. %                                          the feature number of data  
  38. %                                          matrix  
  39. %  
  40. %                            PCARatio     –  The percentage of principal  
  41. %                                            component kept in the PCA  
  42. %                                            step. The percentage is  
  43. %                                            calculated based on the  
  44. %                                            eigenvalue. Default is 1  
  45. %                                            (100%, all the non-zero  
  46. %                                            eigenvalues will be kept.  
  47. %                                            If PCARatio > 1, the PCA step  
  48. %                                            will keep exactly PCARatio principle  
  49. %                                            components (does not exceed the  
  50. %                                            exact number of non-zero components).    
  51. %  
  52. %             Output:  
  53. %               eigvector – Each column is an embedding function, for a new  
  54. %                           sample vector (row vector) x,  y = x*eigvector  
  55. %                           will be the embedding result of x.  
  56. %               eigvalue  – The sorted eigvalue of the eigen-problem.  
  57. %               elapse    – Time spent on different steps   
  58. %  
  59. %    Examples:  
  60. %  
  61. % See also LPP, NPE, IsoProjection, LSDA.  
  62. %  
  63. % Reference:  
  64. %  
  65. %   1. Deng Cai, Xiaofei He, Jiawei Han, “Spectral Regression for Efficient  
  66. %   Regularized Subspace Learning”, IEEE International Conference on  
  67. %   Computer Vision (ICCV), Rio de Janeiro, Brazil, Oct. 2007.   
  68. %  
  69. %   2. Deng Cai, Xiaofei He, Yuxiao Hu, Jiawei Han, and Thomas Huang,   
  70. %   “Learning a Spatially Smooth Subspace for Face Recognition”, CVPR’2007  
  71. %   
  72. %   3. Deng Cai, Xiaofei He, Jiawei Han, “Spectral Regression: A Unified  
  73. %   Subspace Learning Framework for Content-Based Image Retrieval”, ACM  
  74. %   Multimedia 2007, Augsburg, Germany, Sep. 2007.  
  75. %  
  76. %   4. Deng Cai, “Spectral Regression: A Regression Framework for  
  77. %   Efficient Regularized Subspace Learning”, PhD Thesis, Department of  
  78. %   Computer Science, UIUC, 2009.     
  79. %  
  80. %   version 3.0 –Dec/2011   
  81. %   version 2.1 –June/2007   
  82. %   version 2.0 –May/2007   
  83. %   version 1.0 –Sep/2006   
  84. %  
  85. %   Written by Deng Cai (dengcai AT gmail.com)  
  86. %  
  87.   
  88. MAX_MATRIX_SIZE = 1600; % You can change this number according your machine computational power  
  89. EIGVECTOR_RATIO = 0.1; % You can change this number according your machine computational power  
  90.   
  91. if (~exist(‘options’,’var’))  
  92.    options = [];  
  93. end  
  94.   
  95. ReducedDim = 30;  
  96. if isfield(options,’ReducedDim’)  
  97.     ReducedDim = options.ReducedDim;  
  98. end  
  99.   
  100. if ~isfield(options,’Regu’) || ~options.Regu  
  101.     bPCA = 1;  
  102.     if ~isfield(options,’PCARatio’)  
  103.         options.PCARatio = 1;  
  104.     end  
  105. else  
  106.     bPCA = 0;  
  107.     if ~isfield(options,’ReguType’)  
  108.         options.ReguType = ‘Ridge’;  
  109.     end  
  110.     if ~isfield(options,’ReguAlpha’)  
  111.         options.ReguAlpha = 0.1;  
  112.     end  
  113. end  
  114.   
  115. bD = 1;  
  116. if ~exist(‘D’,’var’) || isempty(D)  
  117.     bD = 0;  
  118. end  
  119.   
  120.   
  121. [nSmp,nFea] = size(data);  
  122. if size(W,1) ~= nSmp  
  123.     error(‘W and data mismatch!’);  
  124. end  
  125. if bD && (size(D,1) ~= nSmp)  
  126.     error(‘D and data mismatch!’);  
  127. end  
  128.   
  129. bChol = 0;  
  130. if bPCA && (nSmp > nFea) && (options.PCARatio >= 1)  
  131.     if bD  
  132.         DPrime = data’ * D * data;  
  133.     else  
  134.         DPrime = data’ * data;  
  135.     end  
  136.     DPrime = full(DPrime);  
  137.     DPrime = max(DPrime,DPrime’);  
  138.     [R,p] = chol(DPrime);  
  139.     if p == 0  
  140.         bPCA = 0;  
  141.         bChol = 1;  
  142.     end  
  143. end  
  144.   
  145. %======================================  
  146. % SVD  
  147. %======================================  
  148.   
  149. if bPCA      
  150.     [U, S, V] = mySVD(data);  
  151.     [U, S, V] = CutonRatio(U,S,V,options);  
  152.     eigvalue_PCA = full(diag(S));  
  153.     if bD  
  154.         data = U * S;  
  155.         eigvector_PCA = V;  
  156.   
  157.         DPrime = data’ * D * data;  
  158.         DPrime = max(DPrime,DPrime’);  
  159.     else  
  160.         data = U;  
  161.         eigvector_PCA = V*spdiags(eigvalue_PCA.^-1,0,length(eigvalue_PCA),length(eigvalue_PCA));  
  162.     end  
  163. else  
  164.     if ~bChol  
  165.         if bD  
  166.             DPrime = data’*D*data;  
  167.         else  
  168.             DPrime = data’*data;  
  169.         end  
  170.   
  171.         switch lower(options.ReguType)  
  172.             case {lower(‘Ridge’)}  
  173.                 if options.ReguAlpha > 0  
  174.                     for i=1:size(DPrime,1)  
  175.                         DPrime(i,i) = DPrime(i,i) + options.ReguAlpha;  
  176.                     end  
  177.                 end  
  178.             case {lower(‘Tensor’)}  
  179.                 if options.ReguAlpha > 0  
  180.                     DPrime = DPrime + options.ReguAlpha*options.regularizerR;  
  181.                 end  
  182.             case {lower(‘Custom’)}  
  183.                 if options.ReguAlpha > 0  
  184.                     DPrime = DPrime + options.ReguAlpha*options.regularizerR;  
  185.                 end  
  186.             otherwise  
  187.                 error(‘ReguType does not exist!’);  
  188.         end  
  189.   
  190.         DPrime = max(DPrime,DPrime’);  
  191.     end  
  192. end  
  193.   
  194. WPrime = data’ * W * data;  
  195. WPrime = max(WPrime,WPrime’);  
  196.   
  197. %======================================  
  198. % Generalized Eigen  
  199. %======================================  
  200.   
  201. dimMatrix = size(WPrime,2);  
  202.   
  203. if ReducedDim > dimMatrix  
  204.     ReducedDim = dimMatrix;   
  205. end  
  206.   
  207. if isfield(options,’bEigs’)  
  208.     bEigs = options.bEigs;  
  209. else  
  210.     if (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix * EIGVECTOR_RATIO)  
  211.         bEigs = 1;  
  212.     else  
  213.         bEigs = 0;  
  214.     end  
  215. end  
  216.   
  217. if bEigs  
  218.     %disp(‘use eigs to speed up!’);  
  219.     option = struct(‘disp’,0);  
  220.     if bPCA && ~bD  
  221.         [eigvector, eigvalue] = eigs(WPrime,ReducedDim,’la’,option);  
  222.     else  
  223.         if bChol  
  224.             option.cholB = 1;  
  225.             [eigvector, eigvalue] = eigs(WPrime,R,ReducedDim,’la’,option);  
  226.         else  
  227.             [eigvector, eigvalue] = eigs(WPrime,DPrime,ReducedDim,’la’,option);  
  228.         end  
  229.     end  
  230.     eigvalue = diag(eigvalue);  
  231. else  
  232.     if bPCA && ~bD   
  233.         [eigvector, eigvalue] = eig(WPrime);  
  234.     else  
  235.         [eigvector, eigvalue] = eig(WPrime,DPrime);  
  236.     end  
  237.     eigvalue = diag(eigvalue);  
  238.       
  239.     [junk, index] = sort(-eigvalue);  
  240.     eigvalue = eigvalue(index);  
  241.     eigvector = eigvector(:,index);  
  242.   
  243.     if ReducedDim < size(eigvector,2)  
  244.         eigvector = eigvector(:, 1:ReducedDim);  
  245.         eigvalue = eigvalue(1:ReducedDim);  
  246.     end  
  247. end  
  248.   
  249. if bPCA  
  250.     eigvector = eigvector_PCA*eigvector;  
  251. end  
  252.   
  253. for i = 1:size(eigvector,2)  
  254.     eigvector(:,i) = eigvector(:,i)./norm(eigvector(:,i));  
  255. end     
  256.   
  257.   
  258. function [U, S, V] = CutonRatio(U,S,V,options)  
  259.     if  ~isfield(options, ‘PCARatio’)  
  260.         options.PCARatio = 1;  
  261.     end  
  262.   
  263.     eigvalue_PCA = full(diag(S));  
  264.     if options.PCARatio > 1  
  265.         idx = options.PCARatio;  
  266.         if idx < length(eigvalue_PCA)  
  267.             U = U(:,1:idx);  
  268.             V = V(:,1:idx);  
  269.             S = S(1:idx,1:idx);  
  270.         end  
  271.     elseif options.PCARatio < 1  
  272.         sumEig = sum(eigvalue_PCA);  
  273.         sumEig = sumEig*options.PCARatio;  
  274.         sumNow = 0;  
  275.         for idx = 1:length(eigvalue_PCA)  
  276.             sumNow = sumNow + eigvalue_PCA(idx);  
  277.             if sumNow >= sumEig  
  278.                 break;  
  279.             end  
  280.         end  
  281.         U = U(:,1:idx);  
  282.         V = V(:,1:idx);  
  283.         S = S(1:idx,1:idx);  
  284.     end  
function [eigvector, eigvalue] = LGE(W, D, options, data)
% LGE: Linear Graph Embedding
%
%       [eigvector, eigvalue] = LGE(W, D, options, data)
% 
%             Input:
%               data       - data matrix. Each row vector of data is a
%                         sample vector. 
%               W       - Affinity graph matrix. 
%               D       - Constraint graph matrix. 
%                         LGE solves the optimization problem of 
%                         a* = argmax (a'data'WXa)/(a'data'DXa) 
%                         Default: D = I 
%
%               options - Struct value in Matlab. The fields in options
%                         that can be set:
%
%                     ReducedDim   -  The dimensionality of the reduced
%                                     subspace. If 0, all the dimensions
%                                     will be kept. Default is 30. 
%
%                            Regu  -  1: regularized solution, 
%                                        a* = argmax (a'X'WXa)/(a'X'DXa+ReguAlpha*I) 
%                                     0: solve the sinularity problem by SVD (PCA) 
%                                     Default: 0 
%
%                        ReguAlpha -  The regularization parameter. Valid
%                                     when Regu==1. Default value is 0.1. 
%
%                            ReguType  -  'Ridge': Tikhonov regularization
%                                         'Custom': User provided
%                                                   regularization matrix
%                                          Default: 'Ridge' 
%                        regularizerR  -   (nFea x nFea) regularization
%                                          matrix which should be provided
%                                          if ReguType is 'Custom'. nFea is
%                                          the feature number of data
%                                          matrix
%
%                            PCARatio     -  The percentage of principal
%                                            component kept in the PCA
%                                            step. The percentage is
%                                            calculated based on the
%                                            eigenvalue. Default is 1
%                                            (100%, all the non-zero
%                                            eigenvalues will be kept.
%                                            If PCARatio > 1, the PCA step
%                                            will keep exactly PCARatio principle
%                                            components (does not exceed the
%                                            exact number of non-zero components).  
%
%             Output:
%               eigvector - Each column is an embedding function, for a new
%                           sample vector (row vector) x,  y = x*eigvector
%                           will be the embedding result of x.
%               eigvalue  - The sorted eigvalue of the eigen-problem.
%               elapse    - Time spent on different steps 
%
%    Examples:
%
% See also LPP, NPE, IsoProjection, LSDA.
%
% Reference:
%
%   1. Deng Cai, Xiaofei He, Jiawei Han, "Spectral Regression for Efficient
%   Regularized Subspace Learning", IEEE International Conference on
%   Computer Vision (ICCV), Rio de Janeiro, Brazil, Oct. 2007. 
%
%   2. Deng Cai, Xiaofei He, Yuxiao Hu, Jiawei Han, and Thomas Huang, 
%   "Learning a Spatially Smooth Subspace for Face Recognition", CVPR'2007
% 
%   3. Deng Cai, Xiaofei He, Jiawei Han, "Spectral Regression: A Unified
%   Subspace Learning Framework for Content-Based Image Retrieval", ACM
%   Multimedia 2007, Augsburg, Germany, Sep. 2007.
%
%   4. Deng Cai, "Spectral Regression: A Regression Framework for
%   Efficient Regularized Subspace Learning", PhD Thesis, Department of
%   Computer Science, UIUC, 2009.   
%
%   version 3.0 --Dec/2011 
%   version 2.1 --June/2007 
%   version 2.0 --May/2007 
%   version 1.0 --Sep/2006 
%
%   Written by Deng Cai (dengcai AT gmail.com)
%

MAX_MATRIX_SIZE = 1600; % You can change this number according your machine computational power
EIGVECTOR_RATIO = 0.1; % You can change this number according your machine computational power

if (~exist('options','var'))
   options = [];
end

ReducedDim = 30;
if isfield(options,'ReducedDim')
    ReducedDim = options.ReducedDim;
end

if ~isfield(options,'Regu') || ~options.Regu
    bPCA = 1;
    if ~isfield(options,'PCARatio')
        options.PCARatio = 1;
    end
else
    bPCA = 0;
    if ~isfield(options,'ReguType')
        options.ReguType = 'Ridge';
    end
    if ~isfield(options,'ReguAlpha')
        options.ReguAlpha = 0.1;
    end
end

bD = 1;
if ~exist('D','var') || isempty(D)
    bD = 0;
end


[nSmp,nFea] = size(data);
if size(W,1) ~= nSmp
    error('W and data mismatch!');
end
if bD && (size(D,1) ~= nSmp)
    error('D and data mismatch!');
end

bChol = 0;
if bPCA && (nSmp > nFea) && (options.PCARatio >= 1)
    if bD
        DPrime = data' * D * data;
    else
        DPrime = data' * data;
    end
    DPrime = full(DPrime);
    DPrime = max(DPrime,DPrime');
    [R,p] = chol(DPrime);
    if p == 0
        bPCA = 0;
        bChol = 1;
    end
end

%======================================
% SVD
%======================================

if bPCA    
    [U, S, V] = mySVD(data);
    [U, S, V] = CutonRatio(U,S,V,options);
    eigvalue_PCA = full(diag(S));
    if bD
        data = U * S;
        eigvector_PCA = V;

        DPrime = data' * D * data;
        DPrime = max(DPrime,DPrime');
    else
        data = U;
        eigvector_PCA = V*spdiags(eigvalue_PCA.^-1,0,length(eigvalue_PCA),length(eigvalue_PCA));
    end
else
    if ~bChol
        if bD
            DPrime = data'*D*data;
        else
            DPrime = data'*data;
        end

        switch lower(options.ReguType)
            case {lower('Ridge')}
                if options.ReguAlpha > 0
                    for i=1:size(DPrime,1)
                        DPrime(i,i) = DPrime(i,i) + options.ReguAlpha;
                    end
                end
            case {lower('Tensor')}
                if options.ReguAlpha > 0
                    DPrime = DPrime + options.ReguAlpha*options.regularizerR;
                end
            case {lower('Custom')}
                if options.ReguAlpha > 0
                    DPrime = DPrime + options.ReguAlpha*options.regularizerR;
                end
            otherwise
                error('ReguType does not exist!');
        end

        DPrime = max(DPrime,DPrime');
    end
end

WPrime = data' * W * data;
WPrime = max(WPrime,WPrime');

%======================================
% Generalized Eigen
%======================================

dimMatrix = size(WPrime,2);

if ReducedDim > dimMatrix
    ReducedDim = dimMatrix; 
end

if isfield(options,'bEigs')
    bEigs = options.bEigs;
else
    if (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix * EIGVECTOR_RATIO)
        bEigs = 1;
    else
        bEigs = 0;
    end
end

if bEigs
    %disp('use eigs to speed up!');
    option = struct('disp',0);
    if bPCA && ~bD
        [eigvector, eigvalue] = eigs(WPrime,ReducedDim,'la',option);
    else
        if bChol
            option.cholB = 1;
            [eigvector, eigvalue] = eigs(WPrime,R,ReducedDim,'la',option);
        else
            [eigvector, eigvalue] = eigs(WPrime,DPrime,ReducedDim,'la',option);
        end
    end
    eigvalue = diag(eigvalue);
else
    if bPCA && ~bD 
        [eigvector, eigvalue] = eig(WPrime);
    else
        [eigvector, eigvalue] = eig(WPrime,DPrime);
    end
    eigvalue = diag(eigvalue);

    [junk, index] = sort(-eigvalue);
    eigvalue = eigvalue(index);
    eigvector = eigvector(:,index);

    if ReducedDim < size(eigvector,2)
        eigvector = eigvector(:, 1:ReducedDim);
        eigvalue = eigvalue(1:ReducedDim);
    end
end

if bPCA
    eigvector = eigvector_PCA*eigvector;
end

for i = 1:size(eigvector,2)
    eigvector(:,i) = eigvector(:,i)./norm(eigvector(:,i));
end   


function [U, S, V] = CutonRatio(U,S,V,options)
    if  ~isfield(options, 'PCARatio')
        options.PCARatio = 1;
    end

    eigvalue_PCA = full(diag(S));
    if options.PCARatio > 1
        idx = options.PCARatio;
        if idx < length(eigvalue_PCA)
            U = U(:,1:idx);
            V = V(:,1:idx);
            S = S(1:idx,1:idx);
        end
    elseif options.PCARatio < 1
        sumEig = sum(eigvalue_PCA);
        sumEig = sumEig*options.PCARatio;
        sumNow = 0;
        for idx = 1:length(eigvalue_PCA)
            sumNow = sumNow + eigvalue_PCA(idx);
            if sumNow >= sumEig
                break;
            end
        end
        U = U(:,1:idx);
        V = V(:,1:idx);
        S = S(1:idx,1:idx);
    end

EuDist2.m

[plain] view plain copy print?

  1. function D = EuDist2(fea_a,fea_b,bSqrt)  
  2. %EUDIST2 Efficiently Compute the Euclidean Distance Matrix by Exploring the  
  3. %Matlab matrix operations.  
  4. %  
  5. %   D = EuDist(fea_a,fea_b)  
  6. %   fea_a:    nSample_a * nFeature  
  7. %   fea_b:    nSample_b * nFeature  
  8. %   D:      nSample_a * nSample_a  
  9. %       or  nSample_a * nSample_b  
  10. %  
  11. %    Examples:  
  12. %  
  13. %       a = rand(500,10);  
  14. %       b = rand(1000,10);  
  15. %  
  16. %       A = EuDist2(a); % A: 500*500  
  17. %       D = EuDist2(a,b); % D: 500*1000  
  18. %  
  19. %   version 2.1 –November/2011  
  20. %   version 2.0 –May/2009  
  21. %   version 1.0 –November/2005  
  22. %  
  23. %   Written by Deng Cai (dengcai AT gmail.com)  
  24.   
  25.   
  26. if ~exist(‘bSqrt’,’var’)  
  27.     bSqrt = 1;  
  28. end  
  29.   
  30. if (~exist(‘fea_b’,’var’)) || isempty(fea_b)  
  31.     aa = sum(fea_a.*fea_a,2);  
  32.     ab = fea_a*fea_a’;  
  33.       
  34.     if issparse(aa)  
  35.         aa = full(aa);  
  36.     end  
  37.       
  38.     D = bsxfun(@plus,aa,aa’) – 2*ab;  
  39.     D(D<0) = 0;  
  40.     if bSqrt  
  41.         D = sqrt(D);  
  42.     end  
  43.     D = max(D,D’);  
  44. else  
  45.     aa = sum(fea_a.*fea_a,2);  
  46.     bb = sum(fea_b.*fea_b,2);  
  47.     ab = fea_a*fea_b’;  
  48.   
  49.     if issparse(aa)  
  50.         aa = full(aa);  
  51.         bb = full(bb);  
  52.     end  
  53.   
  54.     D = bsxfun(@plus,aa,bb’) – 2*ab;  
  55.     D(D<0) = 0;  
  56.     if bSqrt  
  57.         D = sqrt(D);  
  58.     end  
  59. end  
function D = EuDist2(fea_a,fea_b,bSqrt)
%EUDIST2 Efficiently Compute the Euclidean Distance Matrix by Exploring the
%Matlab matrix operations.
%
%   D = EuDist(fea_a,fea_b)
%   fea_a:    nSample_a * nFeature
%   fea_b:    nSample_b * nFeature
%   D:      nSample_a * nSample_a
%       or  nSample_a * nSample_b
%
%    Examples:
%
%       a = rand(500,10);
%       b = rand(1000,10);
%
%       A = EuDist2(a); % A: 500*500
%       D = EuDist2(a,b); % D: 500*1000
%
%   version 2.1 --November/2011
%   version 2.0 --May/2009
%   version 1.0 --November/2005
%
%   Written by Deng Cai (dengcai AT gmail.com)


if ~exist('bSqrt','var')
    bSqrt = 1;
end

if (~exist('fea_b','var')) || isempty(fea_b)
    aa = sum(fea_a.*fea_a,2);
    ab = fea_a*fea_a';

    if issparse(aa)
        aa = full(aa);
    end

    D = bsxfun(@plus,aa,aa') - 2*ab;
    D(D<0) = 0;
    if bSqrt
        D = sqrt(D);
    end
    D = max(D,D');
else
    aa = sum(fea_a.*fea_a,2);
    bb = sum(fea_b.*fea_b,2);
    ab = fea_a*fea_b';

    if issparse(aa)
        aa = full(aa);
        bb = full(bb);
    end

    D = bsxfun(@plus,aa,bb') - 2*ab;
    D(D<0) = 0;
    if bSqrt
        D = sqrt(D);
    end
end

 mySVD.m

[plain] view plain copy print?

  1. function [U, S, V] = mySVD(X,ReducedDim)  
  2. %mySVD    Accelerated singular value decomposition.  
  3. %   [U,S,V] = mySVD(X) produces a diagonal matrix S, of the    
  4. %   dimension as the rank of X and with nonnegative diagonal elements in  
  5. %   decreasing order, and unitary matrices U and V so that  
  6. %   X = U*S*V’.  
  7. %  
  8. %   [U,S,V] = mySVD(X,ReducedDim) produces a diagonal matrix S, of the    
  9. %   dimension as ReducedDim and with nonnegative diagonal elements in  
  10. %   decreasing order, and unitary matrices U and V so that  
  11. %   Xhat = U*S*V’ is the best approximation (with respect to F norm) of X  
  12. %   among all the matrices with rank no larger than ReducedDim.  
  13. %  
  14. %   Based on the size of X, mySVD computes the eigvectors of X*X^T or X^T*X  
  15. %   first, and then convert them to the eigenvectors of the other.    
  16. %  
  17. %   See also SVD.  
  18. %  
  19. %   version 2.0 –Feb/2009   
  20. %   version 1.0 –April/2004   
  21. %  
  22. %   Written by Deng Cai (dengcai AT gmail.com)  
  23. %                                                     
  24.   
  25. MAX_MATRIX_SIZE = 1600; % You can change this number according your machine computational power  
  26. EIGVECTOR_RATIO = 0.1; % You can change this number according your machine computational power  
  27.   
  28.   
  29. if ~exist(‘ReducedDim’,’var’)  
  30.     ReducedDim = 0;  
  31. end  
  32.   
  33. [nSmp, mFea] = size(X);  
  34. if mFea/nSmp > 1.0713  
  35.     ddata = X*X’;  
  36.     ddata = max(ddata,ddata’);  
  37.       
  38.     dimMatrix = size(ddata,1);  
  39.     if (ReducedDim > 0) && (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix*EIGVECTOR_RATIO)  
  40.         option = struct(‘disp’,0);  
  41.         [U, eigvalue] = eigs(ddata,ReducedDim,’la’,option);  
  42.         eigvalue = diag(eigvalue);  
  43.     else  
  44.         if issparse(ddata)  
  45.             ddata = full(ddata);  
  46.         end  
  47.           
  48.         [U, eigvalue] = eig(ddata);  
  49.         eigvalue = diag(eigvalue);  
  50.         [dump, index] = sort(-eigvalue);  
  51.         eigvalue = eigvalue(index);  
  52.         U = U(:, index);  
  53.     end  
  54.     clear ddata;  
  55.       
  56.     maxEigValue = max(abs(eigvalue));  
  57.     eigIdx = find(abs(eigvalue)/maxEigValue < 1e-10);  
  58.     eigvalue(eigIdx) = [];  
  59.     U(:,eigIdx) = [];  
  60.       
  61.     if (ReducedDim > 0) && (ReducedDim < length(eigvalue))  
  62.         eigvalue = eigvalue(1:ReducedDim);  
  63.         U = U(:,1:ReducedDim);  
  64.     end  
  65.       
  66.     eigvalue_Half = eigvalue.^.5;  
  67.     S =  spdiags(eigvalue_Half,0,length(eigvalue_Half),length(eigvalue_Half));  
  68.   
  69.     if nargout >= 3  
  70.         eigvalue_MinusHalf = eigvalue_Half.^-1;  
  71.         V = X’*(U.*repmat(eigvalue_MinusHalf’,size(U,1),1));  
  72.     end  
  73. else  
  74.     ddata = X’*X;  
  75.     ddata = max(ddata,ddata’);  
  76.       
  77.     dimMatrix = size(ddata,1);  
  78.     if (ReducedDim > 0) && (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix*EIGVECTOR_RATIO)  
  79.         option = struct(‘disp’,0);  
  80.         [V, eigvalue] = eigs(ddata,ReducedDim,’la’,option);  
  81.         eigvalue = diag(eigvalue);  
  82.     else  
  83.         if issparse(ddata)  
  84.             ddata = full(ddata);  
  85.         end  
  86.           
  87.         [V, eigvalue] = eig(ddata);  
  88.         eigvalue = diag(eigvalue);  
  89.           
  90.         [dump, index] = sort(-eigvalue);  
  91.         eigvalue = eigvalue(index);  
  92.         V = V(:, index);  
  93.     end  
  94.     clear ddata;  
  95.       
  96.     maxEigValue = max(abs(eigvalue));  
  97.     eigIdx = find(abs(eigvalue)/maxEigValue < 1e-10);  
  98.     eigvalue(eigIdx) = [];  
  99.     V(:,eigIdx) = [];  
  100.       
  101.     if (ReducedDim > 0) && (ReducedDim < length(eigvalue))  
  102.         eigvalue = eigvalue(1:ReducedDim);  
  103.         V = V(:,1:ReducedDim);  
  104.     end  
  105.       
  106.     eigvalue_Half = eigvalue.^.5;  
  107.     S =  spdiags(eigvalue_Half,0,length(eigvalue_Half),length(eigvalue_Half));  
  108.       
  109.     eigvalue_MinusHalf = eigvalue_Half.^-1;  
  110.     U = X*(V.*repmat(eigvalue_MinusHalf’,size(V,1),1));  
  111. end  
function [U, S, V] = mySVD(X,ReducedDim)
%mySVD    Accelerated singular value decomposition.
%   [U,S,V] = mySVD(X) produces a diagonal matrix S, of the  
%   dimension as the rank of X and with nonnegative diagonal elements in
%   decreasing order, and unitary matrices U and V so that
%   X = U*S*V'.
%
%   [U,S,V] = mySVD(X,ReducedDim) produces a diagonal matrix S, of the  
%   dimension as ReducedDim and with nonnegative diagonal elements in
%   decreasing order, and unitary matrices U and V so that
%   Xhat = U*S*V' is the best approximation (with respect to F norm) of X
%   among all the matrices with rank no larger than ReducedDim.
%
%   Based on the size of X, mySVD computes the eigvectors of X*X^T or X^T*X
%   first, and then convert them to the eigenvectors of the other.  
%
%   See also SVD.
%
%   version 2.0 --Feb/2009 
%   version 1.0 --April/2004 
%
%   Written by Deng Cai (dengcai AT gmail.com)
%                                                   

MAX_MATRIX_SIZE = 1600; % You can change this number according your machine computational power
EIGVECTOR_RATIO = 0.1; % You can change this number according your machine computational power


if ~exist('ReducedDim','var')
    ReducedDim = 0;
end

[nSmp, mFea] = size(X);
if mFea/nSmp > 1.0713
    ddata = X*X';
    ddata = max(ddata,ddata');

    dimMatrix = size(ddata,1);
    if (ReducedDim > 0) && (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix*EIGVECTOR_RATIO)
        option = struct('disp',0);
        [U, eigvalue] = eigs(ddata,ReducedDim,'la',option);
        eigvalue = diag(eigvalue);
    else
        if issparse(ddata)
            ddata = full(ddata);
        end

        [U, eigvalue] = eig(ddata);
        eigvalue = diag(eigvalue);
        [dump, index] = sort(-eigvalue);
        eigvalue = eigvalue(index);
        U = U(:, index);
    end
    clear ddata;

    maxEigValue = max(abs(eigvalue));
    eigIdx = find(abs(eigvalue)/maxEigValue < 1e-10);
    eigvalue(eigIdx) = [];
    U(:,eigIdx) = [];

    if (ReducedDim > 0) && (ReducedDim < length(eigvalue))
        eigvalue = eigvalue(1:ReducedDim);
        U = U(:,1:ReducedDim);
    end

    eigvalue_Half = eigvalue.^.5;
    S =  spdiags(eigvalue_Half,0,length(eigvalue_Half),length(eigvalue_Half));

    if nargout >= 3
        eigvalue_MinusHalf = eigvalue_Half.^-1;
        V = X'*(U.*repmat(eigvalue_MinusHalf',size(U,1),1));
    end
else
    ddata = X'*X;
    ddata = max(ddata,ddata');

    dimMatrix = size(ddata,1);
    if (ReducedDim > 0) && (dimMatrix > MAX_MATRIX_SIZE) && (ReducedDim < dimMatrix*EIGVECTOR_RATIO)
        option = struct('disp',0);
        [V, eigvalue] = eigs(ddata,ReducedDim,'la',option);
        eigvalue = diag(eigvalue);
    else
        if issparse(ddata)
            ddata = full(ddata);
        end

        [V, eigvalue] = eig(ddata);
        eigvalue = diag(eigvalue);

        [dump, index] = sort(-eigvalue);
        eigvalue = eigvalue(index);
        V = V(:, index);
    end
    clear ddata;

    maxEigValue = max(abs(eigvalue));
    eigIdx = find(abs(eigvalue)/maxEigValue < 1e-10);
    eigvalue(eigIdx) = [];
    V(:,eigIdx) = [];

    if (ReducedDim > 0) && (ReducedDim < length(eigvalue))
        eigvalue = eigvalue(1:ReducedDim);
        V = V(:,1:ReducedDim);
    end

    eigvalue_Half = eigvalue.^.5;
    S =  spdiags(eigvalue_Half,0,length(eigvalue_Half),length(eigvalue_Half));

    eigvalue_MinusHalf = eigvalue_Half.^-1;
    U = X*(V.*repmat(eigvalue_MinusHalf',size(V,1),1));
end

NormalizeFea.m

[plain] view plain copy print?

  1. function fea = NormalizeFea(fea,row)  
  2. % if row == 1, normalize each row of fea to have unit norm;  
  3. % if row == 0, normalize each column of fea to have unit norm;  
  4. %  
  5. %   version 3.0 –Jan/2012   
  6. %   version 2.0 –Jan/2012   
  7. %   version 1.0 –Oct/2003   
  8. %  
  9. %   Written by Deng Cai (dengcai AT gmail.com)  
  10. %  
  11.   
  12. if ~exist(‘row’,’var’)  
  13.     row = 1;  
  14. end  
  15.   
  16. if row  
  17.     nSmp = size(fea,1);  
  18.     feaNorm = max(1e-14,full(sum(fea.^2,2)));  
  19.     fea = spdiags(feaNorm.^-.5,0,nSmp,nSmp)*fea;  
  20. else  
  21.     nSmp = size(fea,2);  
  22.     feaNorm = max(1e-14,full(sum(fea.^2,1))’);  
  23.     fea = fea*spdiags(feaNorm.^-.5,0,nSmp,nSmp);  
  24. end  
  25.               
  26. return;  
  27.   
  28.   
  29.   
  30.   
  31.   
  32.   
  33.   
  34. if row  
  35.     [nSmp, mFea] = size(fea);  
  36.     if issparse(fea)  
  37.         fea2 = fea’;  
  38.         feaNorm = mynorm(fea2,1);  
  39.         for i = 1:nSmp  
  40.             fea2(:,i) = fea2(:,i) ./ max(1e-10,feaNorm(i));  
  41.         end  
  42.         fea = fea2’;  
  43.     else  
  44.         feaNorm = sum(fea.^2,2).^.5;  
  45.         fea = fea./feaNorm(:,ones(1,mFea));  
  46.     end  
  47. else  
  48.     [mFea, nSmp] = size(fea);  
  49.     if issparse(fea)  
  50.         feaNorm = mynorm(fea,1);  
  51.         for i = 1:nSmp  
  52.             fea(:,i) = fea(:,i) ./ max(1e-10,feaNorm(i));  
  53.         end  
  54.     else  
  55.         feaNorm = sum(fea.^2,1).^.5;  
  56.         fea = fea./feaNorm(ones(1,mFea),:);  
  57.     end  
  58. end  
  59.               
function fea = NormalizeFea(fea,row)
% if row == 1, normalize each row of fea to have unit norm;
% if row == 0, normalize each column of fea to have unit norm;
%
%   version 3.0 --Jan/2012 
%   version 2.0 --Jan/2012 
%   version 1.0 --Oct/2003 
%
%   Written by Deng Cai (dengcai AT gmail.com)
%

if ~exist('row','var')
    row = 1;
end

if row
    nSmp = size(fea,1);
    feaNorm = max(1e-14,full(sum(fea.^2,2)));
    fea = spdiags(feaNorm.^-.5,0,nSmp,nSmp)*fea;
else
    nSmp = size(fea,2);
    feaNorm = max(1e-14,full(sum(fea.^2,1))');
    fea = fea*spdiags(feaNorm.^-.5,0,nSmp,nSmp);
end

return;







if row
    [nSmp, mFea] = size(fea);
    if issparse(fea)
        fea2 = fea';
        feaNorm = mynorm(fea2,1);
        for i = 1:nSmp
            fea2(:,i) = fea2(:,i) ./ max(1e-10,feaNorm(i));
        end
        fea = fea2';
    else
        feaNorm = sum(fea.^2,2).^.5;
        fea = fea./feaNorm(:,ones(1,mFea));
    end
else
    [mFea, nSmp] = size(fea);
    if issparse(fea)
        feaNorm = mynorm(fea,1);
        for i = 1:nSmp
            fea(:,i) = fea(:,i) ./ max(1e-10,feaNorm(i));
        end
    else
        feaNorm = sum(fea.^2,1).^.5;
        fea = fea./feaNorm(ones(1,mFea),:);
    end
end
            

Reference:

LPP(Locality
Preserving Projection),局部保留投影


本文由【waitig】发表在等英博客
本文固定链接:method_LPP(Locality preserving projections)
欢迎关注本站官方公众号,每日都有干货分享!
等英博客官方公众号
点赞 (0)分享 (0)