当前位置: 首页 > 工具软件 > TAU > 使用案例 >

matlab sen slope,科学网—Mann-Kendall Tau (aka Tau-b) with Sen's Method - 王国杰的博文

朱硕
2023-12-01

%% Mann-Kendall Tau (aka Tau-b) with Sen's Method (enhanced)

% A non-parametric monotonic trend test computing Mann-Kendall Tau, Tau-b,

% and Sen抯 Slope written in Mathworks-MATLAB implemented using matrix

% rotations.

%

% Suggested citation:

%

% Burkey, Jeff. May 2006.  A non-parametric monotonic trend test computing

%      Mann-Kendall Tau, Tau-b, and Sen抯 Slope written in Mathworks-MATLAB

%      implemented using matrix rotations. King County, Department of Natural

%      Resources and Parks, Science and Technical Services section.

%      Seattle, Washington. USA.

%      http://www.mathworks.com/matlabcentral/fileexchange/authors/23983

%

%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

%

% Important Note:

%      I have also posted a Seasonal Kendell function at Mathworks

%             sktt.m

%

%http://www.mathworks.com/matlabcentral/fileexchange/22389-seasonal-kendall

%-test-with-slope-for-serial-dependent-data

%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

%

%   revised 12/1/2008- Computed variance now takes into account ties in the

%      time index with multiple observations per index.

%      Added in confidence intervals for Sens Slope

%

% Syntax:

%     [taub tau h sig Z S sigma sen n senplot CIlower CIupper D Dall C3]

%               = ktaub(datain, alpha, wantplot)

%

% where:

%     datain = (N x 2) double

%     alpha = (scalar) double

%     wantplot is a flag

%             ~= 0 means create plot, otherwise do not plot

%

%     taub = Mann-Kendall coefficient adjusted for ties

%     tau = Mann-Kendall coefficient not adjusted for ties

%                n(n-1)/2

%     h = hypothesis test (h=1 : is significant)

%     sig = p value (two tailed)

%     Z = Z score

%     sigma = standard deviation

%     sen = sen's slope

%     plotofslope = data used to plot data and sen's slope

%     cilower = lower confidence interval for sen's slope

%     ciupper = upper confidence interval for sen's slope

%

% These next two variables are output because they are needed in the

% Seasonal Kendall function: sktt.m

%     D = denominator used for calculating Tau-b

%     Dall = denominator used for calculating Tau

%     C3 = individual seasonal slopes aggregated for Sens Seasonal Slope

%     nsigma = an assumed variance with all ties reconsidered but set to

%              equal number of positive and negative differences.

%

%

% Modifications:

% 12/1/2008 - Taub = denominator is adjusted

%             Tau  = denominator is NOT adjusted

%                    (matches USGS Kendall.exe output)

%

% 1/17/2009 - checks for anomalies and provides solutions and/or

%             notifications.  In support of this for the Seasonal Kendall

%             (which was also updated 1/17/2009), another term was added to

%             the output of this function-- nsigma.

%

% 6/15/2011 - updated plotting confidence intervals on Sen's slope.  I

%             don't recall my source on developing it, so use at your own

%             peril. That said using the 95/5 percentiles of the comptued

%             individual slopes seems to be reasonable-- I just don't have

%             a paper to back it up.

%

% 7/23/2011 - added a test for matlab version.

%

% 8/21/2013 - added citation supporting method of computing confidence

%             intervals for Sen's slope. Thanks goes to Dr. Jeff Thompson

%             for providing the source that backs up my ginned up method

%             (Line 320).

%

% When calculating trends, there are a few situations that create anomalies

% in the estimates. Foremost, when S = 0, significance will always be

% 100-percent, which is not possible. When this occurs the p-value is

% adjusted by using the computed variance, but assuming S = 1.  However,

% the output from the function will still show S=0 when the case arises.

%

% Secondly, a statistically significant slope = 0 can occur when there are

% a large number of ties in the data.  In this case, a second test is done

% assuming the number of ties is equal an even number positive and negative

% differences.  Significance is tested again, but the output is only sent

% to the screen.  The p-value returned in the function is still the

% original p-value (and the adjusted p-value for serial correlation).

%

% Kendall Tau-b is useful when there are a significant number of

% artificially induced tied values.  For example, water quality data that

% has infilled non-detects with half MDL's.  This would create a false

% number of ties simply because of the common technique of infilling an

% unknown quantity with a known estimate of a quantity.

%

% These test for anomalies and solutions are based on an unpublished paper:

%

% Anomalies and remedies in non-parametric seasonal trend tests and

% estimates by Graham McBride, National Institute of Water & Atmospheric

% Research, Hamilton New Zealand.  March 2000.

%

%

%  Note:  if data are not temporally evenly spaced, Sen's slope becomes

%  inaccurate (a future TODO).

%

% Requirements: Statistics Toolbox

%     or comment out ztest and manually determine significance

%

% This function is coded without any loops. While this is extremely fast,

% it does have some limitations to computer memory. For example, if a data

% set is around 10,000 in length, 1.0 GB RAM may or may not work.

%

% However, I would imagine if memory is a problem for someone with large

% datasets, a person could convert the necessary variables and statements

% to work with sparse matrices.  Which may slow down this function on

% smaller datasets, but the memory issue would greatly diminish.

%

% Sen's methodology was added in by Curtis DeGasperi- King County, DNRP.

%  reference: Statistical Methods for Environmental Pollution Monitoring,

%  Richard O. Gilbert 1987, ISBN: 0-442-23050-8

%

% A couple of resources for Mann-Kendall statistics.

%

%  Statistical Methods in Water Resources

%   By D.R. Helsel and R.M. Hirsch

%        http://water.usgs.gov/pubs/twri/twri4a3/

%

%  Computer Program for the Kendall Family of Trend Tests

%   by Dennis R. Helsel, David K. Mueller, and James R. Slack

%  Scientific Investigations Report 2005-5275

%  http://pubs.usgs.gov/sir/2005/5275/downloads/

%

%

% Written by Jeff Burkey

% King County, Department of Natural Resources and Parks

% 4/18/2006

% 12/4/2008 (significantly revised)

% email: jeff.burkey@kingcounty.gov

%

% [taub tau h sig Z S sigma sen n senplot CIlower CIupper D Dall C3] = ktaub(datain, alpha, wantplot)

function [taub tau h sig Z S sigma sen n senplot CIlower CIupper D Dall C3 nsigma] = ktaub(datain, alpha, wantplot)

try

%% Check MATLAB version for compatibility

% Added this version trap since the most common comment I get is

% they syntax is in error.  So far when people get this error it's

% because they are using an old version of matlab that does not

% accept some of the variable names in this function.

% 7/23/2011 - JJB

vmat = regexp(version,'d+','match');

vr = [str2double(vmat{1}) str2double(vmat{2})];

if vr(1) == 7

if vr(2) >=9 || vr(1) > 7

% Then matlab version should work

else

txt = 'Your version of matlab is %s. nYou need at least version 7.9 to run.n Some of the syntax will not parse correctly.';

warning(txt,version)

end

end

catch msg

error('Matlab version is way too old to run this function.');

end

% wantplot is a flag to create a figure or not default set to no

if exist('wantplot','var') == 0

% user didn't provide assume zero (i.e. no plot)

wantplot = 0;

end

% Data are assumed to be in long columns, hence 'sortrows'

sorted = sortrows(datain,1);

% remove any NaNs, if data are missing they should not be included as

% NaNs.

sorted(any(isnan(sorted),2),:) = [];

% return n after removing any NaNs

n = size(sorted,1);

% set to NaN if trend is not significant

senplot = NaN;

% extract out the data

row1 = sorted(:,1)';

row2 = sorted(:,2)';

clear sorted;

L1 = length(row1);

L2 = L1 - 1;

% find ties

ro1 = sort(row1)';

ro2 = sort(row2)';

[~,b] = unique(ro1);

[~,e] = unique(ro2);

clear a c d f ro1 ro2;

% correcting loss of first value using diff on with unique

if b(1,1) > 1

ta = b(1,1);

else

ta = 1;

end

if e(1,1) > 1

tb = e(1,1);

else

tb = 1;

end

bdiff = [ta; diff(b)];

ediff = [tb; diff(e)];

clear ta tb b e;

%% Determine ties used for computing adjusted variance

tp = sum(bdiff .* (bdiff - 1) .* (2 .* bdiff + 5));

uq = sum(ediff .* (ediff - 1) .* (2 .* ediff + 5));

% modified 12/1/2008

% adjustments when time index has multiple observations

d1 = 9 * L1 * (L1-1) * (L1-2);

tu1 = sum(bdiff .* (bdiff - 1) .* (bdiff - 2)) * sum(ediff .* (ediff - 1) .* (ediff - 2)) / d1;

d2 = 2 * L1 * (L1-1);

tu2 = sum(bdiff .* (bdiff - 1)) * sum(ediff .* (ediff -1)) / d2;

% ties used for adjusting denominator in Tau

t1a = (sum(bdiff .* (bdiff - 1))) / 2;

t2a = (sum(ediff .* (ediff - 1))) / 2;

% create matricies to be used for substituting values as indicies

m1 = repmat((1:L2)',[1 L2]);

m2 = repmat((2:L1)',[1 L2])';

% populate matrixes for analysis

A1 = triu(row1(m1));

A2 = triu(row1(m2));

B1 = triu(row2(m1));

B2 = triu(row2(m2));

clear m1 m2 row1 row2;

% Perform pair comparison and convert to sign

A = sign(A1 - A2);

B = sign(B1 - B2);

%% Perform operations to calculate Sen's Slope

% Median of rate of change among all data points- CLD added 5/3/2006

A3 = reshape((A1 - A2),L2*L2,1);

B3 = reshape((B1 - B2),L2*L2,1);

a = find(A3~=0);

C3 = sort(B3(a)./A3(a));

sen = median(C3);

clear A1 A2 B1 B2 A3 B3 a;

%% Evaluate concordant and discordant

% +1 = concordant

% -1 = discordant

%  0 = tie

C = A.*B;

% Compute S

S = sum(sum(C,2));

clear A B C;

% Calculate denominator with ties removed Tau-b

D = sqrt(((.5*L1*(L1-1))-t1a)*((.5*L1*(L1-1))-t2a));

% Calcuation denominator no ties removed Tau

Dall = L1 * (L1 - 1) / 2;

% (modified 12/1/2008: added tau)

tau = S / Dall;

taub = S / D;

% adjust for normal approximations and continuity

if S > 0

s = S -1;

elseif S < 0

s = S + 1;

elseif S == 0

s = 0;

elseif isnan(S)

error('ErrorTrend:ktaub', 'This function cannot process NaNs. nPlease remove data records with NaNs.n');

end

%% Test for abnormalities in data

% Certain conditions can occur that potentially invalidate this

% statistic. Conditional statements conduct some tests and provide the

% user some feedback.

if S==1

% Notify user continuity correction is setting S = 0

fprintf('nTaub Message:  When absolute value S=1,');

fprintf('n               Continuity correction is setting S = 0.');

fprintf('n               This will affect calculated significance.n');

end

% compute square-root of variance with all ties accounted for in time

% index and in observation values. - JJB 12/1/2008

sigma = sqrt(((L1*(L1-1)*(2*L1 + 5) - tp - uq) / 18) + tu1 + tu2);

% nsigma is used if slope is zero and determined significant.  It is

% hypothesized that all ties can be represented as an equal number of

% postive and negative slopes.

nsigma = sqrt(L1*(L1-1)*(2*L1+5));

Z = s / sigma;

% Estimate confidence intervals of Sen's slope

%      The next line requires STATISTICS Toolbox (norminv)

%      Zup is a 2-tail Z (i.e. alpha/2)

%

% Hollander, M. and Wolfe, D. 1973, Nonparametric statistical methods,

% Wiley, New York. Chapter 9 (Regression problems involving slope),

%      Section 3 (A distribution-free confidence interval based on the

%      Theil test; p. 207 - 208)

Zup = norminv(1-alpha/2,0,1);

Calpha = Zup * sigma;

Nprime = length(C3);

M1 = (Nprime - Calpha)/2;

M2 = (Nprime + Calpha)/2 + 1;

% 2-tail limits

CIlower = interp1q((1:Nprime),C3,M1);

CIupper = interp1q((1:Nprime),C3,M2);

%     clear M1 M2 NPrime Zup Calpha

% h = 1 : means significance

% h = 0 : means not significant (i.e. sig < z(sig))

if s==0

% Not possible to be 100% certain, force S = 1 and compute p-value

% using sigma.

[h, sig] = ztest(1,0,sigma,alpha);

fprintf('nTaub Message: S = 0. P-value cannot = 100-percent. ');

fprintf('n              P-value is adjusted using S = 1 and should be reported as p > %1.5f.n',sig);

if sen~=0

fprintf('nTaub Message: A non-zero Sens slope occurred when S =0.');

fprintf('n              This is not an error, more a notification.');

fprintf('n              This anomaly may occur because the median may be computed');

fprintf('n              on one value equal to zero and one non-zero, etc.n');

end

else

[h, sig] = ztest(s,0,sigma,alpha);

end

% Notify for Sens slope = 0 but is determined significant

if h==1 && sen==0

[hh, nsig] = ztest(s,0,nsigma,alpha);

fprintf('nTaub Message:  There was a significant trend = 0 found.n');

fprintf('               Retested with ties set to equal number of positve and negative values.n');

fprintf('               New p-value = %1.5f',nsig);

if hh==1

fprintf('.  However trend still found to be significant.n');

else

fprintf(', but trend is not found to be significant.n');

end

end

%% Plotting routine

% Below is a very simplistic plotting routine to plot the Sen slope if

% the significance is less than 0.05. Uncomment or delete at your

% leisure.

if sig<=alpha && wantplot ~= 0  %A plotting example CLD added 5/3/2006

% Revised plotting 6/14/2011 - JJB

% Plots the slopes using the median value as the focus point for

% all three slopes (Sen's and confidence slopes). Is this the

% correct method? Don't know but seems reasonable. - JJB 6/15/2011

hold on

%generate points to represent median slope

%zero time for the calculation is the first time point

vv = median(datain(:,2));

middata = datain(round(length(datain)/2),1);

slope = vv + sen*(datain(:,1)-middata);

senplot = [datain(:,1) slope];

plot(datain(:,1),datain(:,2),'o')

plot(datain(:,1),slope,'-')

% add confidence intervals

slope = vv + CIlower*(datain(:,1)-middata);

plot(datain(:,1),slope,'--');

slope = vv + CIupper*(datain(:,1)-middata);

plot(datain(:,1),slope,'--');

box on

grid on

hold off

%         pause

end

end

转载本文请联系原作者获取授权,同时请注明本文来自王国杰科学网博客。

链接地址:http://blog.sciencenet.cn/blog-569118-724657.html

上一篇:Theil Sen Trend Estimator

下一篇:remapping ERA_land soil moisture reanalysis

 类似资料: