
#### Normal Distribution

The following sections provide an overview of the normal distribution.

Background of the Normal Distribution

The normal distribution is a two-parameter family of curves. The first parameter, $\mu$, is the mean. The second, $\sigma$, is the standard deviation. The standard normal distribution (written $\Phi(x)$) sets $\mu$ to 0 and $\sigma$ to 1. $\Phi(x)$ is functionally related to the error function, erf:

$$\mathrm{erf}(x) = 2\Phi(x\sqrt{2}) - 1$$

The first use of the normal distribution was as a continuous approximation to the binomial. The usual justification for using the normal distribution for modeling is the Central Limit Theorem, which states (roughly) that the sum of independent samples from any distribution with finite mean and variance converges to the normal distribution as the sample size goes to infinity.
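The Central Limit Theorem described above is easy to see numerically. The following sketch (an illustration, not part of the original manual) sums uniform random draws and fits a normal distribution to the sums:

```matlab
% Sums of n independent Uniform(0,1) draws are approximately normal
% with mean n/2 and variance n/12 (Central Limit Theorem).
n = 20;                          % draws per sum
k = 1000;                        % number of sums
s = sum(rand(n,k))';             % k sums, as a column vector
[muhat,sigmahat] = normfit(s)    % muhat near 10, sigmahat near sqrt(20/12)
```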
Definition of the Normal Distribution

The normal pdf is

$$y = f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$
Parameter Estimation for the Normal Distribution

To use statistical parameters such as the mean and standard deviation reliably, you need a good estimator for them. The maximum likelihood estimates (MLEs) provide one such estimator. However, an MLE might be biased, which means that its expected value might not equal the parameter being estimated. For example, an MLE is biased for estimating the variance of a normal distribution. An unbiased estimator that is commonly used to estimate the parameters of the normal distribution is the minimum variance unbiased estimator (MVUE). The MVUE has the minimum variance of all unbiased estimators of a parameter. The MVUEs of the parameters $\mu$ and $\sigma^2$ for the normal distribution are the sample average and variance. The sample average is also the MLE for $\mu$. The following are two common formulas for the variance:

$$s_1^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \qquad (1)$$

$$s_2^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \qquad (2)$$

Equation 1 is the maximum likelihood estimator for $\sigma^2$, and equation 2 is the MVUE. As an example, suppose you want to estimate the mean, $\mu$, and the variance, $\sigma^2$, of the heights of all 4th grade children in the United States. The function normfit returns the MVUE for $\mu$, the square root of the MVUE for $\sigma^2$, and confidence intervals for $\mu$ and $\sigma^2$. Here is a playful example modeling the heights in inches of a randomly chosen 4th grade class.
```
height = normrnd(50,2,30,1);   % Simulate heights.
[mu,s,muci,sci] = normfit(height)

mu =
   50.2025
s =
    1.7946
muci =
   49.5210
   50.8841
sci =
    1.4292
    2.4125
```
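The two variance formulas discussed above correspond directly to MATLAB's var function, which normalizes by n-1 by default and by n when given a second argument of 1. This sketch is an illustration, not part of the original example:

```matlab
x = normrnd(50,2,30,1);              % a sample, as in the height example
n = length(x);
s1 = sum((x - mean(x)).^2)/n;        % equation 1: MLE of the variance
s2 = sum((x - mean(x)).^2)/(n-1);    % equation 2: MVUE of the variance
abs(s1 - var(x,1)) < 1e-10           % var(x,1) uses the 1/n normalization
abs(s2 - var(x))   < 1e-10           % var(x) uses the 1/(n-1) normalization
```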

Continuing the example from the preceding section, suppose you are studying a few factories, but you want information about what would happen if you built these same car models in a different factory, either one that you already have or another that you might construct. To get this information, fit the analysis of variance model, specifying a model that includes an interaction term and specifying that the factory factor is random.
```
[pvals,tbl,stats] = anovan(mileage, {factory carmod}, ...
    'model',2, 'random',1, 'varnames',{'Factory' 'Car Model'});
```
In the fixed effects version of this fit, which you get by omitting the inputs 'random',1 in the preceding code, the effect of car model is significant, with a p-value of 0.0039. But in this example, which takes into account the random variation of the effect of the variable 'Car Model' from one factory to another, the effect is still significant, but with a higher p-value of 0.0136.
F Statistics for Models with Random Effects
The F statistic in a model having random effects is defined differently than in a model having all fixed effects. In the fixed effects model, you compute the F statistic for any term by taking the ratio of the mean square for that term with the mean square for error. In a random effects model, however, some F statistics use a different mean square in the denominator. In the example described in Setting Up the Model on page 4-18, the effect of the variable 'Factory' could vary across car models. In this case, the interaction mean square takes the place of the error mean square in the F statistic. The F statistic for factory is
```
F = 1.445 / 0.02

F =
   72.2500
```
The degrees of freedom for the statistic are the degrees of freedom for the numerator (1) and denominator (2) mean squares. Therefore the p-value for the statistic is

```
pval = 1 - fcdf(F,1,2)

pval =
    0.0136
```
With random effects, the expected value of each mean square depends not only on the variance of the error term, but also on the variances contributed by the random effects. You can see these dependencies by writing the expected values as linear combinations of contributions from the various model terms. To find the coefficients of these linear combinations, enter stats.ems, which returns the ems field of the stats structure.

classified as primarily technology, the next three as financial, and the last three as retail. It seems reasonable that the stock prices for companies in the same sector might vary together as economic conditions change. Factor analysis can provide quantitative evidence that companies within each sector do experience similar week-to-week changes in stock price. In this example, you first load the data, and then call factoran, specifying a model fit with three common factors. By default, factoran computes rotated estimates of the loadings to try to make their interpretation simpler. But in this example, you specify an unrotated solution.
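The load and fitting commands themselves do not survive in this excerpt. Assuming the data are the toolbox's stockreturns sample (the variable name stocks is an assumption here), the call described above might look like:

```matlab
load stockreturns                        % assumed sample data set
[Loadings,specificVar,T,stats] = ...
    factoran(stocks,3,'rotate','none')   % three common factors, unrotated
```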
The first two factoran return arguments are the estimated loadings and the estimated specific variances. Each row of the loadings matrix represents one of the ten stocks, and each column corresponds to a common factor. With unrotated estimates, interpretation of the factors in this fit is difficult because most of the stocks contain fairly large coefficients for two or more factors.
```
Loadings =
    0.2367   -0.2354
    0.3862    0.0034
    0.2784   -0.0211
    0.1113   -0.1905
   -0.6643    0.1478
   -0.6383    0.0133
   -0.5416    0.0322
    0.1669    0.4960
    0.5293    0.5770
    0.1680    0.5524
```
Note: "Factor Rotation" on page 6-15 helps to simplify the structure in the loadings matrix, making it easier to assign meaningful interpretations to the factors.
From the estimated specific variances, you can see that the model indicates that a particular stock price varies quite a lot beyond the variation due to the common factors.
```
specificVar

specificVar =
    0.0991
    0.3431
    0.8097
    0.8559
    0.1429
    0.3691
    0.6928
    0.3162
    0.3311
    0.6544
```
A specific variance of 1 would indicate that there is no common factor component in that variable, while a specific variance of 0 would indicate that the variable is entirely determined by common factors. These data seem to fall somewhere in between. The p-value returned in the stats structure fails to reject the null hypothesis of three common factors, suggesting that this model provides a satisfactory explanation of the covariation in these data.

```
Z = linkage(Y)

Z =
    1.0000    3.0000    1.0000
    4.0000    5.0000    1.0000
    6.0000    7.0000    2.0616
    8.0000    2.0000    2.5000
```
In this output, each row identifies a link. The first two columns identify the objects that have been linked, that is, object 1, object 2, and so on. The third column contains the distance between these objects. For the sample data set of x and y coordinates, the linkage function begins by grouping together objects 1 and 3, which have the closest proximity (distance value = 1.0000). The linkage function continues by grouping objects 4 and 5, which also have a distance value of 1.0000. The third row indicates that the linkage function grouped together objects 6 and 7. If the original sample data set contained only five objects, what are objects 6 and 7? Object 6 is the newly formed binary cluster created by the grouping of objects 1 and 3. When the linkage function groups two objects together into a new cluster, it must assign the cluster a unique index value, starting with the value m+1, where m is the number of objects in the original data set. (Values 1 through m are already used by the original data set.) Object 7 is the index for the cluster formed by objects 4 and 5.
As the final cluster, the linkage function grouped object 8, the newly formed cluster made up of objects 6 and 7, with object 2 from the original data set. The following figure graphically illustrates the way linkage groups the objects into a hierarchy of clusters.
Plotting the Cluster Tree
The hierarchical, binary cluster tree created by the linkage function is most easily understood when viewed graphically. The Statistics Toolbox includes the dendrogram function that plots this hierarchical tree information as a graph, as in the following example.

```
dendrogram(Z)
```

In the figure, the numbers along the horizontal axis represent the indices of the objects in the original data set. The links between objects are represented as upside down U-shaped lines. The height of the U indicates the distance between the objects. For example, the link representing the cluster containing objects 1 and 3 has a height of 1. For more information about creating a dendrogram diagram, see the dendrogram function reference page.
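The complete workflow described above, from raw coordinates to dendrogram, can be sketched as follows; the five (x, y) points are invented for illustration, since the original sample coordinates do not appear in this excerpt:

```matlab
X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];  % hypothetical five-object data
Y = pdist(X);      % pairwise distances, a 10-element vector
Z = linkage(Y);    % 4-by-3 cluster tree (single linkage by default)
dendrogram(Z)      % plot the hierarchy; U-height = linkage distance
```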

caseread displays the Select File to Read dialog box for interactive selection of the input file.
Read the file months.dat created using the function casewrite on the next page.
```
type months.dat

January
February
March
April
May

names = caseread('months.dat')

names =
January
February
March
April
May
```

#### casewrite

Write casenames from a string matrix to a file
```
casewrite(strmat,'filename')
casewrite(strmat)
```

casewrite(strmat,'filename') writes the contents of the string matrix strmat to filename. Each row of strmat represents one casename. filename is the name of a file in the current directory, or the complete pathname of any file elsewhere. casewrite writes each name to a separate line in filename.
casewrite(strmat) displays the Select File to Write dialog box for interactive specification of the output file.
```
strmat = str2mat('January','February','March','April','May')

strmat =
January
February
March
April
May

casewrite(strmat,'months.dat')
type months.dat

January
February
March
April
May
```

#### ccdesign

Generate central composite design
```
D = ccdesign(nfactors)
D = ccdesign(nfactors,'pname1',pvalue1,'pname2',pvalue2,...)
[D,blk] = ccdesign(...)
```

D = ccdesign(nfactors) generates a central composite design for nfactors factors. The output matrix D is n-by-nfactors, where n is the number of points in the design. Each row represents one run of the design, and it contains the settings of all factors for that run. Factor values are normalized so that the cube points take values between -1 and 1.

[D,blk] = ccdesign(nfactors) requests a blocked design. The output vector blk is a vector of block numbers. Blocks are groups of runs that are to be measured under similar conditions.

[...] = ccdesign(nfactors,'pname1',pvalue1,'pname2',pvalue2,...) enables you to specify additional parameters and their values. Valid parameters are:

| Parameter | Value | Description |
|---|---|---|
| 'center' | Integer | Specific number of center points to include |
| | 'uniform' | Number of center points is selected to give uniform precision |
| | 'orthogonal' | Number of center points is selected to give an orthogonal design (default) |
| 'fraction' | Exponent of 1/2 | Fraction of full factorial for the cube portion: 0 = whole design, 1 = 1/2 fraction, 2 = 1/4 fraction |
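As a usage sketch of the parameters above (the specific values are illustrative, not from the manual):

```matlab
% Two-factor central composite design with a specific number of
% center points ('center' given an integer value).
D = ccdesign(2,'center',4);
size(D)      % n-by-2: one row per run, one column per factor
% Cube points in D are normalized to lie between -1 and 1.
```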

```
load popcorn
popcorn

popcorn =
    5.5000    4.5000    3.5000
    5.5000    4.5000    4.0000
    6.0000    4.0000    3.0000
    6.5000    5.0000    4.0000
    7.0000    5.5000    5.0000
    7.0000    5.0000    4.5000

p = friedman(popcorn,3)

p =
    0.0010
```


The small p-value of 0.001 indicates that the popcorn brand affects the yield of popcorn. This is consistent with the results from anova2. You could also test popper type by permuting the popcorn array as described in "Friedman's Test" on page 4-60 and repeating the test.
[1] Hogg, R. V., and J. Ledolter, Engineering Statistics, MacMillan Publishing Company, 1987.

[2] Hollander, M., and D. A. Wolfe, Nonparametric Statistical Methods, Wiley, 1973.
anova2, multcompare, kruskalwallis


#### frnd
Random numbers from the F distribution

```
R = frnd(V1,V2)
R = frnd(V1,V2,m)
R = frnd(V1,V2,m,n)
```

R = frnd(V1,V2) generates random numbers from the F distribution with numerator degrees of freedom V1 and denominator degrees of freedom V2. Vector or matrix inputs for V1 and V2 must have the same size, which is also the size of R. A scalar input for V1 or V2 is expanded to a constant matrix with the same dimensions as the other input.

R = frnd(V1,V2,m) generates random numbers from the F distribution with parameters V1 and V2, where m is a 1-by-2 vector that contains the row and column dimensions of R.

R = frnd(V1,V2,m,n) generates random numbers from the F distribution with parameters V1 and V2, where scalars m and n are the row and column dimensions of R.
Reproducing the Output of frnd
frnd uses the MATLAB functions rand and randn to generate random numbers. When you call frnd, you change the current states of rand and randn, and thereby alter the output of subsequent calls to frnd or any other functions that depend on rand or randn. If you want to reproduce the output of frnd, reset the states of rand and randn to the same fixed values each time you call frnd.
For an example of how to do this, and a list of the Statistics Toolbox functions that depend on rand or randn, see Reproducing the Output of Random Number Functions on page 2-10.
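A sketch of the reset described above, using the 'state' syntax of rand and randn from this toolbox era (the seed value 0 and the frnd arguments are arbitrary choices for illustration):

```matlab
rand('state',0); randn('state',0);
r1 = frnd(5,10,2,2);
rand('state',0); randn('state',0);   % reset to the same states...
r2 = frnd(5,10,2,2);                 % ...so the draw repeats
isequal(r1,r2)                       % should return 1
```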


```
n2 = frnd(2,2,[2 3])

n2 =
    0.3186    0.9727    3.0268
    0.2052  148.5816    0.2191

n3 = frnd([3;6],1,2,3)

n3 =
    0.6233    2.5848
   31.5458    4.4955
```


#### fstat

Mean and variance for the F distribution

```
[M,V] = fstat(V1,V2)
```

[M,V] = fstat(V1,V2) returns the mean and variance for the F distribution with parameters specified by V1 and V2. Vector or matrix inputs for V1 and V2 must have the same size, which is also the size of M and V. A scalar input for V1 or V2 is expanded to a constant matrix with the same dimensions as the other input.

The mean of the F distribution for values of $\nu_2$ greater than 2 is

$$\frac{\nu_2}{\nu_2 - 2}$$

The variance of the F distribution for values of $\nu_2$ greater than 4 is

$$\frac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2 - 2)^2(\nu_2 - 4)}$$

The mean of the F distribution is undefined if $\nu_2$ is less than 3. The variance is undefined for $\nu_2$ less than 5.
fstat returns NaN when the mean and variance are undefined.

```
[m,v] = fstat(1:5,1:5)

m =
       NaN       NaN    3.0000    2.0000    1.6667
v =
       NaN       NaN       NaN       NaN    8.8889
```


#### fsurfht

Interactive contour plot of a function

```
fsurfht('fun',xlims,ylims)
fsurfht('fun',xlims,ylims,p1,p2,p3,p4,p5)
```

fsurfht('fun',xlims,ylims) is an interactive contour plot of the function specified by the text variable fun. The x-axis limits are specified by xlims in the form [xmin xmax], and the y-axis limits are specified by ylims in the form [ymin ymax].

fsurfht('fun',xlims,ylims,p1,p2,p3,p4,p5) allows for five optional parameters that you can supply to the function fun.

The intersection of the vertical and horizontal reference lines on the plot defines the current x-value and y-value. You can drag these reference lines and watch the calculated z-values (at the top of the plot) update simultaneously. Alternatively, you can type the x-value and y-value into editable text fields on the x-axis and y-axis.
Plot the Gaussian likelihood function for the gas.mat data.

Create a function containing the following commands, and name it gauslike.m.
```
function z = gauslike(mu,sigma,p1)
n = length(p1);
z = ones(size(mu));
for i = 1:n
    z = z .* normpdf(p1(i),mu,sigma);
end
```
The gauslike function calls normpdf, treating the data sample as fixed and the parameters $\mu$ and $\sigma$ as variables. Assume that the gas prices are normally distributed, and plot the likelihood surface of the sample.
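The plotting command itself does not survive in this excerpt; a sketch of what it might look like follows (the axis limits are illustrative guesses, and price1 is assumed to be the price variable in gas.mat):

```matlab
load gas    % gas price sample data
% x-axis: candidate means; y-axis: candidate standard deviations.
fsurfht('gauslike',[2.36 2.44],[0.005 0.025],price1)
```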

(The default confidence level is 0.95 for 95% confidence.) The interval [yfit-dlo, yfit+dhi] is a confidence bound for the true parameter value at the specified X values.

```
[yhat,dlo,dhi] = glmval(beta,X,'link',stats,clev,N,offset,'const')
```

specifies three additional arguments that may be needed if you used certain arguments to glmfit. If you fit a binomial distribution using glmfit, specify N as the value of the binomial N parameter for the predictions. If you included an offset variable, specify offset as the new value of this variable. Use the same 'const' value ('on' or 'off') that you used with glmfit.
This example models the number of cars with poor gasoline mileage using the binomial distribution. First, use the binomial distribution with the default logit link to model the probability of having poor mileage as a function of the weight and squared weight of the cars. Then compute a vector wnew of new car weights at which you want to make predictions. Next, compute the expected number of cars, out of a total of 30 cars of each weight, that would have poor mileage. Finally, graph the predicted values and 95% confidence bounds as a function of weight.
```
w = [4100 4300]';
poor = [17 21]';
total = [17 21]';
```


```
[b2,d2,s2] = glmfit([w w.^2],[poor total],'binomial')
wnew = (3000:100:4000)';
[yfit,dlo,dhi] = glmval(b2,[wnew wnew.^2],'logit',s2,0.95,30)
errorbar(wnew,yfit,dlo,dhi);
```

glmfit, glmdemo


#### gname
Label plotted points with their case names or case numbers

```
gname(cases)
gname
h = gname(cases,line_handle)
```

gname(cases) displays a figure window and waits for you to press a mouse button or a keyboard key. The input argument cases is a character array or a cell array of strings, in which each row of the character array or each element of the cell array contains the case name of a point. Moving the mouse over the graph displays a pair of cross-hairs. If you position the cross-hairs near a point with the mouse and click once, the graph displays the case name corresponding to that point. Alternatively, you can click and drag the mouse to create a rectangle around several points. When you release the mouse button, the graph displays the labels for all points in the rectangle. Right-click a point to remove its label. When you are done labeling points, press the Enter or Escape key to stop labeling.
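A minimal usage sketch (the data and labels are invented for illustration):

```matlab
x = rand(5,1);  y = rand(5,1);
plot(x,y,'o')
gname({'Ann Arbor';'Boston';'Chicago';'Denver';'Erie'})
% Click near a point (or drag a rectangle) to label it; press Enter
% or Escape when done.
```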

```
[h,p,l,c] = lillietest(log(Weight))

h =
     0
p =
    0.13481
l =
    0.077924
c =
    0.0886
```

Now the p-value is approximately 0.13, so you do not reject the hypothesis.
[1] Conover, W. J. (1980). Practical Nonparametric Statistics. New York, Wiley.

#### hist, jbtest, kstest2


Create hierarchical cluster tree

```
Z = linkage(Y)
Z = linkage(Y,'method')
```

Z = linkage(Y) creates a hierarchical cluster tree, using the single linkage algorithm. The input matrix, Y, is a distance vector of length m(m-1)/2, where m is the number of objects in the original data set. You can generate such a vector with the pdist function. Y can also be a more general dissimilarity matrix conforming to the output format of pdist.

Z = linkage(Y,'method') computes a hierarchical cluster tree using the algorithm specified by 'method', where 'method' can be any of the following character strings that identify ways to create the cluster hierarchy. Their definitions are explained in "Mathematical Definitions" on page 12-250.

| 'method' | Algorithm |
|---|---|
| 'single' | Shortest distance (default) |
| 'complete' | Largest distance |
| 'average' | Average distance |
| 'centroid' | Centroid distance. The output Z is meaningful only if Y contains Euclidean distances. |
| 'ward' | Incremental sum of squares |

The output, Z, is an (m-1)-by-3 matrix containing cluster tree information. The leaf nodes in the cluster hierarchy are the objects in the original data set, numbered from 1 to m. They are the singleton clusters from which all higher clusters are built. Each newly formed cluster, corresponding to row i in Z, is assigned the index m+i, where m is the total number of initial leaves. Columns 1 and 2, Z(i,1:2), contain the indices of the objects that were linked in pairs to form a new cluster. This new cluster is assigned the index value m+i. There are m-1 higher clusters that correspond to the interior nodes of the hierarchical cluster tree. Column 3, Z(i,3), contains the corresponding linkage distances between the objects paired in the clusters at each row i.


For example, consider a case with 30 initial nodes. If the tenth cluster formed by the linkage function combines object 5 and object 7 and their distance is 1.5, then row 10 of Z will contain the values (5, 7, 1.5). This newly formed cluster will have the index 10+30=40. If cluster 40 shows up in a later row, that means this newly formed cluster is being combined again into some bigger cluster.

#### Mathematical Definitions

The 'method' argument is a character string that specifies the algorithm used to generate the hierarchical cluster tree information. These linkage algorithms are based on various measurements of proximity between two groups of objects. If $n_r$ is the number of objects in cluster r, $n_s$ is the number of objects in cluster s, and $x_{ri}$ is the ith object in cluster r, the definitions of these various measurements are as follows:

Single linkage, also called nearest neighbor, uses the smallest distance between objects in the two groups:

$$d(r,s) = \min\left(\mathrm{dist}(x_{ri}, x_{sj})\right), \quad i \in (1,\dots,n_r),\ j \in (1,\dots,n_s)$$

Complete linkage, also called furthest neighbor, uses the largest distance between objects in the two groups:

$$d(r,s) = \max\left(\mathrm{dist}(x_{ri}, x_{sj})\right), \quad i \in (1,\dots,n_r),\ j \in (1,\dots,n_s)$$

Average linkage uses the average distance between all pairs of objects in cluster r and cluster s:

$$d(r,s) = \frac{1}{n_r n_s} \sum_{i=1}^{n_r} \sum_{j=1}^{n_s} \mathrm{dist}(x_{ri}, x_{sj})$$
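A quick sketch of the single-linkage definition above (the points are invented for illustration): the first merge recorded by linkage always joins the pair at the smallest pairwise distance, so Z(1,3) equals min(Y).

```matlab
X = [0 0; 0 1; 5 5; 5 6.5];   % hypothetical 2-D points
Y = pdist(X);                 % pairwise Euclidean distances
Z = linkage(Y,'single');
Z(1,3) == min(Y)              % first merge uses the shortest distance
```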

cdf, nbinfit, nbininv, nbinpdf, nbinrnd, nbinstat

#### nbinfit


Parameter estimates and confidence intervals for negative binomial data

```
parmhat = nbinfit(data)
[parmhat,parmci] = nbinfit(data,alpha)
[...] = nbinfit(data,alpha,options)
```

parmhat = nbinfit(data) returns the maximum likelihood estimates (MLEs) of the parameters of the negative binomial distribution given the data in the vector data.

[parmhat,parmci] = nbinfit(data,alpha) returns MLEs and 100*(1-alpha) percent confidence intervals. By default, alpha = 0.05, which corresponds to 95% confidence intervals.

[...] = nbinfit(data,alpha,options) accepts a structure, options, that specifies control parameters for the iterative algorithm the function uses to compute maximum likelihood estimates. You can create options using the function statset. Enter statset('nbinfit') to see the names and default values of the parameters that nbinfit accepts in the options structure. See the reference page for statset for more information about these options.
Note The variance of a negative binomial distribution is greater than its mean. If the sample variance of the data in data is less than its sample mean, nbinfit cannot compute MLEs. You should use the poissfit function instead.
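A sketch of a typical fit (the parameter values 3 and 0.5 are invented for illustration):

```matlab
% Negative binomial samples are overdispersed (variance > mean),
% so nbinfit can recover the parameters.
data = nbinrnd(3,0.5,100,1);
parmhat = nbinfit(data)      % estimates of [R P], roughly [3 0.5]
% If var(data) were less than mean(data), nbinfit could not compute
% MLEs; the note above suggests poissfit for such data.
```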
nbincdf, nbininv, nbinpdf, nbinrnd, nbinstat, mle, statset

#### nbininv


Inverse of the negative binomial cumulative distribution function (cdf)

```
X = nbininv(Y,R,P)
```

X = nbininv(Y,R,P) returns the inverse of the negative binomial cdf with parameters R and P at the corresponding probabilities in Y. Since the negative binomial distribution is discrete, nbininv returns the least integer X such that the negative binomial cdf evaluated at X equals or exceeds Y. Vector or matrix inputs for Y, R, and P must have the same size, which is also the size of X. A scalar input for Y, R, or P is expanded to a constant matrix with the same dimensions as the other inputs.

The simplest motivation for the negative binomial is the case of successive random trials, each having a constant probability P of success. The number of extra trials you must perform in order to observe a given number R of successes has a negative binomial distribution. However, consistent with a more general interpretation of the negative binomial, nbininv allows R to be any positive value, including nonintegers.
How many times would you need to flip a fair coin to have a 99% probability of having observed 10 heads?
```
flips = nbininv(0.99,10,0.5) + 10

flips =
    33
```
Note that you have to flip at least 10 times to get 10 heads. That is why the second term on the right side of the equals sign is a 10.
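The result above can be checked against the cdf: 33 total flips means 23 extra trials beyond the 10 required heads, and 23 is the least count whose negative binomial cdf reaches 0.99.

```matlab
extra = nbininv(0.99,10,0.5);    % least x with nbincdf(x,10,0.5) >= 0.99
[nbincdf(extra-1,10,0.5)  nbincdf(extra,10,0.5)]
% The first value falls below 0.99; the second equals or exceeds it.
```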

Reproducing the Output of nctrnd
nctrnd uses the MATLAB functions rand and randn to generate random numbers. When you call nctrnd, you change the current states of rand and randn, and thereby alter the output of subsequent calls to nctrnd or any other functions that depend on rand or randn. If you want to reproduce the output of nctrnd, reset the states of rand and randn to the same fixed values each time you call nctrnd. For an example of how to do this, and a list of the Statistics Toolbox functions that depend on rand or randn, see "Reproducing the Output of Random Number Functions" on page 2-10.
Note The result in the following example depends on the current states of rand and randn. If you run the code in these examples, your results may differ from the answer shown here.

```
nctrnd(10,1,5,1)

ans =
    1.6576
    1.0617
    1.4491
    0.2930
    3.6297
```
nctcdf, nctinv, nctpdf, nctstat

#### nctstat


Mean and variance for the noncentral t distribution

```
[M,V] = nctstat(NU,DELTA)
```

[M,V] = nctstat(NU,DELTA) returns the mean and variance of the noncentral t pdf with NU degrees of freedom and noncentrality parameter DELTA. Vector or matrix inputs for NU and DELTA must have the same size, which is also the size of M and V. A scalar input for NU or DELTA is expanded to a constant matrix with the same dimensions as the other input.

The mean of the noncentral t distribution with parameters $\nu$ and $\delta$ is

$$\delta \left(\frac{\nu}{2}\right)^{1/2} \frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}$$

where $\nu > 1$. The variance is

$$\frac{\nu}{\nu-2}\left(1 + \delta^2\right) - \frac{\nu}{2}\,\delta^2 \left[\frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}\right]^2$$

where $\nu > 2$.
```
[m,v] = nctstat(10,1)

m =
    1.0837
v =
    1.3255
```
nctcdf, nctinv, nctpdf, nctrnd

#### ncx2cdf


Noncentral chi-square cumulative distribution function (cdf)

```
P = ncx2cdf(X,V,DELTA)
```

P = ncx2cdf(X,V,DELTA) computes the noncentral chi-square cdf at each of the values in X using the corresponding degrees of freedom in V and positive noncentrality parameters in DELTA. Vector or matrix inputs for X, V, and DELTA must have the same size, which is also the size of P. A scalar input for X, V, or DELTA is expanded to a constant matrix with the same dimensions as the other inputs.

Some texts refer to this distribution as the generalized Rayleigh, Rayleigh-Rice, or Rice distribution. The noncentral chi-square cdf is

$$F(x \mid \nu, \delta) = \sum_{j=0}^{\infty} \left( \frac{\left(\tfrac{1}{2}\delta\right)^{j}}{j!}\, e^{-\delta/2} \right) \Pr\left[\chi^2_{\nu+2j} \le x\right]$$
```
x = (0:0.1:10)';
p1 = ncx2cdf(x,4,2);
p = chi2cdf(x,4);
plot(x,p,'--',x,p1,'-')
```
[1] Johnson, N., and S. Kotz, Distributions in Statistics: Continuous Univariate Distributions-2, John Wiley and Sons, 1970, pp. 130-148.


- 'Jaccard': Percentage of nonzero coordinates that differ
- (numeric vector): A numeric distance matrix in upper triangular vector form, such as is created by pdist. X is not used in this case, and can safely be set to [].

[...] = silhouette(X,clust,distfun,p1,p2,...) accepts a distance function of the form

```
d = distfun(X0,X,p1,p2,...)
```

where X0 is a 1-by-p point, X is an n-by-p matrix of points, and p1,p2,... are optional additional arguments. The function distfun returns an n-by-1 vector d of distances between X0 and each point (row) in X. The arguments p1,p2,... are passed directly to the function distfun.
The silhouette value for each point is a measure of how similar that point is to points in its own cluster compared to points in other clusters, and ranges from -1 to +1. It is defined as
```
S(i) = (min(b(i,:),[],2) - a(i)) ./ max(a(i), min(b(i,:),[],2))
```
where a(i) is the average distance from the ith point to the other points in its cluster, and b(i,k) is the average distance from the ith point to points in another cluster k.
```
X = [randn(10,2)+ones(10,2); randn(10,2)-ones(10,2)];
cidx = kmeans(X,2,'distance','sqeuclid');
s = silhouette(X,cidx,'sqeuclid');
```

dendrogram, kmeans, linkage, pdist
[1] Kaufman L. and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, 1990

#### skewness


Sample skewness

```
y = skewness(X)
y = skewness(X,flag)
```

y = skewness(X) returns the sample skewness of X. For vectors, skewness(x) is the skewness of the elements of x. For matrices, skewness(X) is a row vector containing the sample skewness of each column.

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right. The skewness of the normal distribution (or any perfectly symmetric distribution) is zero. The skewness of a distribution is defined as

$$y = \frac{E(x-\mu)^3}{\sigma^3}$$

where $\mu$ is the mean of x and $\sigma$ is its standard deviation.

y = skewness(X,flag) specifies whether to correct for bias (flag = 0) or not (flag = 1, the default). When X represents a sample from a population, the skewness of X is biased; that is, it will tend to differ from the population skewness by a systematic amount that depends on the sample size.
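A short sketch contrasting the two flag settings and the sign convention described above (the data are invented for illustration):

```matlab
x1 = randn(1000,1);         % roughly symmetric: skewness near 0
x2 = exp(randn(1000,1));    % lognormal: strongly right-skewed (positive)
y  = [skewness(x1) skewness(x2)]
y0 = skewness(x2,0)         % bias-corrected estimate for the same data
```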

Statistics Toolbox Release Notes

#### How to Contact MathWorks

- Web: www.mathworks.com
- Newsgroup: comp.soft-sys.matlab
- Technical Support: www.mathworks.com/contact_TS.html
- suggest@mathworks.com: Product enhancement suggestions
- bugs@mathworks.com: Bug reports
- doc@mathworks.com: Documentation error reports
- service@mathworks.com: Order status, license renewals, passcodes
- info@mathworks.com: Sales, pricing, and general information
- 508-647-7000 (Phone)
- 508-647-7001 (Fax)

The MathWorks, Inc.
3 Apple Hill Drive
Natick, MA 01760-2098
For contact information about worldwide offices, see the MathWorks Web site.

Statistics Toolbox Release Notes. COPYRIGHT 2005-2011 by The MathWorks, Inc.

The software described in this document is furnished under a license agreement. The software may be used or copied only under the terms of the license agreement. No part of this manual may be photocopied or reproduced in any form without prior written consent from The MathWorks, Inc. FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation by, for, or through the federal government of the United States. By accepting delivery of the Program or Documentation, the government hereby agrees that this software or documentation qualifies as commercial computer software or commercial computer software documentation as such terms are used or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014. Accordingly, the terms and conditions of this Agreement, and only those rights specified in this Agreement, shall pertain to and govern the use, modification, reproduction, release, performance, display, and disclosure of the Program and Documentation by the federal government (or other entity acquiring for or through the federal government) and shall supersede any conflicting contractual terms or conditions. If this License fails to meet the government's needs or is inconsistent in any respect with federal procurement law, the government agrees to return the Program and Documentation, unused, to The MathWorks, Inc.

#### Contents

- Summary by Version
- Version 7.5 (R2011a) Statistics Toolbox Software
- Version 7.4 (R2010b) Statistics Toolbox Software
- Version 7.3 (R2010a) Statistics Toolbox Software
- Version 7.2 (R2009b) Statistics Toolbox Software
- Version 7.1 (R2009a) Statistics Toolbox Software
- Version 7.0 (R2008b) Statistics Toolbox Software
- Version 6.2 (R2008a) Statistics Toolbox Software
- Version 6.1 (R2007b) Statistics Toolbox Software
- Version 6.0 (R2007a) Statistics Toolbox Software
- Version 5.3 (R2006b) Statistics Toolbox Software
- Version 5.2 (R2006a) Statistics Toolbox Software
- Version 5.1 (R14SP3) Statistics Toolbox Software
- Version 5.0.2 (R14SP2) Statistics Toolbox Software
- Compatibility Summary for Statistics Toolbox Software

#### Summary by Version

This table provides quick access to what's new in each version. For clarification, see Using Release Notes on page 2.

| Version (Release) | New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems |
| --- | --- | --- | --- |
| Latest version: V7.5 (R2011a) | Yes (details) | No | Bug Reports (includes fixes) |
| V7.4 (R2010b) | Yes (details) | Yes (summary) | Bug Reports (includes fixes) |
| V7.3 (R2010a) | Yes (details) | No | Bug Reports (includes fixes) |
| V7.2 (R2009b) | Yes (details) | No | Bug Reports (includes fixes) |
| V7.1 (R2009a) | Yes (details) | No | Bug Reports (includes fixes) |
| V7.0 (R2008b) | Yes (details) | Yes (summary) | No |
| V6.2 (R2008a) | Yes (details) | Yes (summary) | Bug Reports (includes fixes) |
| V6.1 (R2007b) | Yes (details) | Yes (summary) | Bug Reports (includes fixes) |
| V6.0 (R2007a) | Yes (details) | Yes (summary) | Bug Reports (includes fixes) |
| V5.3 (R2006b) | Yes (details) | Yes (summary) | Bug Reports (includes fixes) |
| V5.2 (R2006a) | Yes (details) | No | Bug Reports (includes fixes) |
| V5.1 (R14SP3) | Yes (details) | No | No |
| V5.0.2 (R14SP2) | Yes (details) | No | Bug Reports (includes fixes) |

#### Using Release Notes

Use release notes when upgrading to a newer version to learn about:

- New features
- Changes
- Potential impact on your existing files and practices

Also review the release notes for other MathWorks products required for this product (for example, MATLAB or Simulink) to determine whether enhancements, bugs, or compatibility considerations in those products affect you. If you are upgrading from a software version other than the most recent one, review the release notes for all interim versions as well. For example, when you upgrade from V1.0 to V1.2, review the release notes for both V1.1 and V1.2.

Boosted Decision Trees for Classification and Regression
The new fitensemble function constructs ensembles of decision trees. It provides:

- Several popular boosting algorithms (AdaBoostM1, AdaBoostM2, GentleBoost, LogitBoost, and RobustBoost) for classification
- Least-squares boosting (LSBoost) for regression
- Most TreeBagger functionality for ensembles of bagged decision trees
There is also an improved interface for classification trees (ClassificationTree) and regression trees (RegressionTree), encompassing the functionality of classregtree. For details, see Ensemble Methods.
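A minimal sketch of the new fitensemble interface, using hypothetical random data (the data, method, and learner count shown here are illustrative choices, not recommendations):

```matlab
% Hypothetical data: 100 observations, 4 predictors, binary class labels
X = rand(100,4);
Y = randi([0 1],100,1);

% Boosted classification ensemble: AdaBoostM1 with 50 tree learners
ens = fitensemble(X,Y,'AdaBoostM1',50,'Tree');
label = predict(ens,X(1:5,:));   % classify the first five observations
```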
Memory and Performance Improvements in Linkage Methods
The linkage and clusterdata functions have a new savememory option. With savememory set to 'on', the functions do not build a pairwise distance matrix, so they use less memory and, depending on problem size, can also take less time. You can use the savememory option when:

- The linkage method is 'ward', 'centroid', or 'median'
- The linkage distance metric is 'euclidean' (default)

For details, see the linkage and clusterdata function reference pages.
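For example, a brief sketch under the conditions above ('ward' method, default 'euclidean' metric; the data is made up):

```matlab
X = rand(2000,3);   % raw data matrix, not a precomputed distance matrix
% With 'savememory','on', linkage avoids building the pairwise distance matrix
Z = linkage(X,'ward','euclidean','savememory','on');
T = cluster(Z,'maxclust',4);   % cut the tree into four clusters
```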
Conditional Weighted Residuals and Derivative Step Control in nlmefit and nlmefitsa
The nlmefit and nlmefitsa functions now provide the conditional weighted residuals of the fit. Use this information to assess the quality of the model; see Example: Examining Residuals for Model Verification. The statset Options structure now includes 'DerivStep', which enables you to set finite differences for gradient estimation.
Detecting Ties in k-Nearest Neighbor Search
knnsearch now optionally returns all kth nearest neighbors of points, instead of just one. The knnsearch methods for ExhaustiveSearcher and KDTreeSearcher also have this option.
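A short sketch of the tie-detection option (the point set is contrived so that two neighbors tie exactly; 'IncludeTies' is the documented name for this option):

```matlab
X = [1 1; 2 2; 2 2; 3 3];          % the duplicate row creates an exact tie
ns = ExhaustiveSearcher(X);
% With 'IncludeTies', idx and d are cell arrays listing all tied neighbors
[idx,d] = knnsearch(ns,[2 2],'K',1,'IncludeTies',true);
```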
Distribution Fitting Tool Uses fitdist Function
MATLAB functions generated with the Distribution Fitting Tool now use the fitdist function to create fitted probability distribution objects.
The generated functions return probability distribution objects as output arguments.
Speed and Accuracy Improvements in Noncentral Chi-Square CDF
ncx2cdf is now faster and more accurate for large values of the noncentrality parameter.

Perfect Separation in Binomial Regression
If the two categories in a binomial regression model (such as logit or probit) are perfectly separated, the best-fitting model is degenerate with infinite coefficients. In this case, the glmfit function is likely to exceed its iteration limit. glmfit now tries to detect this perfect separation and display a diagnostic message.

Compatibility Considerations
In the previous release, the rmse field was used by nlmefitsa for both mean squared residual and the estimated error parameter. Change your code, if necessary, to address the appropriate field in the stats structure. As described in nlmefit Support for Error Models, and nlmefitsa changes on page 8, nlmefit now calculates different bic values than in previous releases.
Surrogate Splits for Decision Trees
The new surrogate splits feature in classregtree allows for better handling of missing values, more accurate estimation of variable importance, and calculation of the predictive measure of association between variables.
New Bagged Decision Tree Properties
TreeBagger and CompactTreeBagger classes have two new properties:
NVarSplit provides the number of decision splits for each predictor variable. VarAssoc provides a measure of association between pairs of predictor variables.
Enhanced Cluster Analysis Performance
The linkage function has improved performance for the centroid, median, and single linkage methods. The linkage and pdist hierarchical cluster analysis functions support larger array dimensions with 64-bit platforms, so can handle larger problems.
Export Probability Objects with dfittool
The distribution fitting GUI (dfittool) now allows you to export fits to the MATLAB workspace as probability distribution fit objects. For more information, see Modeling Data Using the Distribution Fitting Tool.
If you load a distribution fitting session that was created with previous versions of Statistics Toolbox, you cannot save an existing fit. Fit the distribution again to enable saving.
Compute Partial Correlation of Two Variables Correcting for All Other Variables
partialcorr now accepts a new syntax, RHO = partialcorr(X), which returns the sample linear partial correlation coefficients between pairs of variables in X, controlling for the remaining variables in X. For more information, see the function reference page.
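A short sketch of the new syntax on made-up data:

```matlab
X = randn(50,3);        % three hypothetical variables
RHO = partialcorr(X);   % 3-by-3 matrix: each entry is the partial correlation
                        % of a variable pair, controlling for the third variable
```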
Specify Number of Evenly Spaced Quantiles
quantile now accepts a new syntax, Y = quantile(X,N,...), which returns quantiles at the cumulative probabilities (1:N)/(N+1), where N is a scalar positive integer value.
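For instance, assuming hypothetical data:

```matlab
X = randn(1000,1);
Y = quantile(X,4);   % quantiles at cumulative probabilities (1:4)/5,
                     % that is, 0.2, 0.4, 0.6, and 0.8
```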

Control Location and Orientation of Marginal Histograms with scatterhist
scatterhist now accepts three parameter name/value pairs that control where and how the histogram plots appear. The new parameter names are described on the function reference page.

Return Bootstrapped Statistics with bootci
bootci has a new output option that returns the bootstrapped statistic computed for each of the NBoot bootstrap replicate samples. For more information, see the function reference page.
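A minimal sketch of the new output (the data and replicate count are illustrative):

```matlab
x = randn(100,1);
% ci: confidence interval for the mean
% bootstat: the statistic computed on each of the 1000 replicate samples
[ci,bootstat] = bootci(1000,@mean,x);
```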
Version 7.3 (R2010a) Statistics Toolbox Software
This table summarizes what's new in Version 7.3 (R2010a):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: No
- Fixed Bugs and Known Problems: Bug Reports (includes fixes)
New features and changes introduced in this version are:

- Stochastic Algorithm Functionality in NLME Models on page 11
- k-Nearest Neighbor Searching on page 11
- Confidence Intervals Option in perfcurve on page 11
- Observation Weights Options in Resampling Functions on page 12
Stochastic Algorithm Functionality in NLME Models
A new stochastic algorithm for fitting NLME models is more robust with respect to starting values, enables parameter transformations, and relaxes the assumption of constant error variance. See nlmefitsa.
k-Nearest Neighbor Searching
New functions perform efficient k-nearest neighbor (kNN) searches to find the closest points to any query point. For information, see k-Nearest Neighbor Search.
Confidence Intervals Option in perfcurve
A new option in the perfcurve function computes confidence intervals for classifier performance curves.
Observation Weights Options in Resampling Functions
New options to weight resampling probabilities broaden the range of models supported by bootstrp, bootci, and perfcurve functions.
Version 7.2 (R2009b) Statistics Toolbox Software
This table summarizes whats new in Version 7.2 (R2009b): New Features and Changes Yes Details below Version Compatibility Considerations No Fixed Bugs and Known Problems Bug Reports Includes fixes
New features and changes introduced in this version are:

- New Parallel Computing Support for Certain Functions on page 13
- New Stack and Unstack Methods for Dataset Arrays on page 13
- New Support for SAS Transport (.xpt) Files on page 14
- New Output Function in nlmefit for Monitoring or Canceling Calculations on page 14
New Parallel Computing Support for Certain Functions
Statistics Toolbox now supports parallel execution for the following functions:

- bootci
- bootstrp
- crossval
- jackknife
- TreeBagger

For more information on parallel computing in the Statistics Toolbox, see Parallel Computing Support for Resampling Methods.
New Stack and Unstack Methods for Dataset Arrays
dataset.unstack converts a "tall" dataset array to an equivalent dataset array in "wide" format by unstacking a single variable in the tall dataset array into multiple variables in the wide one. dataset.stack reverses this manipulation, converting a wide dataset array to an equivalent dataset array in "tall" format by stacking multiple variables in the wide dataset array into a single variable.
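A hedged sketch of the round trip (the dataset contents and variable names are invented for illustration):

```matlab
% Tall format: one row per (Subject, Test) pair
tall = dataset({[1;1;2;2],'Subject'}, ...
               {{'A';'B';'A';'B'},'Test'}, ...
               {[10;20;11;21],'Score'});
wide  = unstack(tall,'Score','Test');   % one row per subject; columns A and B
tall2 = stack(wide,{'A','B'});          % back to tall format
```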

New Performance Curve Function
The new perfcurve function provides a graphical method to evaluate classification results, including ROC (receiver operating characteristic) and other curves.
New Probability Distribution Objects
The new probability distribution objects provide a consistent interface for working with probability distributions. They can be created directly using the ProbDistUnivParam constructor, or fit to data using the fitdist function, with an option to fit distributions by group. The objects include:

- Kernel and parametric object methods that you can use to analyze the distribution represented by the object
- Kernel and parametric object properties that you can access to determine the fit results and evaluate their accuracy

There are related enhancements in the chi2gof, histfit, kstest, probplot, and qqplot functions.
Version 7.0 (R2008b) Statistics Toolbox Software
This table summarizes what's new in Version 7.0 (R2008b):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: Yes (summary)
- Fixed Bugs and Known Problems: No
New features and changes introduced in this version are organized by these topics:

- Classification on page 18
- Data Organization on page 18
- Model Assessment on page 19
- Multivariate Methods on page 19
- Probability Distributions on page 19
- Regression Analysis on page 20
- Statistical Visualization on page 20
- Utility Functions on page 21

#### Classification

The new confusionmat function tabulates misclassifications by comparing known and predicted classes of observations.

#### Data Organization

Dataset arrays constructed by the dataset function can now be written to an external text file using the new export function. When reading external text files into a dataset array, dataset has a new 'TreatAsEmpty' parameter for specifying strings to be treated as empty.
In previous versions, dataset used eval to evaluate strings in external text files before writing them into a dataset array. As a result, strings such as '1/1/2008' were treated as numerical expressions with two divides. Now, dataset treats such expressions as strings, and writes a string variable into the dataset array whenever a column in the external file contains a string that does not represent a valid scalar value.
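A sketch of the new import/export pairing (the file name and missing-value marker here are assumptions for illustration):

```matlab
% Suppose scores.txt uses the string 'NA' to mark missing numeric values
ds = dataset('File','scores.txt','TreatAsEmpty','NA');
export(ds,'File','scores_out.txt');   % write the dataset array back to disk
```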

#### Model Assessment

The cross-validation function, crossval, has new options for directly specifying loss functions for mean-squared error or misclassification rate, without having to provide a separate function M-file.
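A minimal sketch of the direct loss-function option (the regression data and prediction function are made up):

```matlab
X = randn(100,1);
y = 2*X + randn(100,1);
% Prediction function: fit a least-squares line on the training fold,
% then predict the test fold
regf = @(Xtrain,ytrain,Xtest) Xtest*(Xtrain\ytrain);
mse = crossval('mse',X,y,'Predfun',regf);   % 10-fold cross-validated MSE
```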

#### Multivariate Methods

The procrustes function has new options for computing linear transformations without scale or reflection components.
Probability Distributions
The multivariate normal functions mvnpdf, mvncdf, and mvnrnd now accept vector specification of diagonal covariance matrices, with corresponding gains in computational efficiency. The hypergeometric distribution has been added to both the disttool and randtool graphical user interfaces.
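For example, a sketch of the vector specification (the values are illustrative):

```matlab
mu = [0 0 0];
d  = [1 2 3];                    % vector interpreted as diag([1 2 3])
p1 = mvnpdf([0 0 0],mu,d);       % diagonal shorthand
p2 = mvnpdf([0 0 0],mu,diag(d)); % full-matrix form; same value, more work
```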
The ksdensity function may give different answers for the case where there are censoring times beyond the last observed value. In this case, ksdensity tries to reduce the bias in its density estimate by folding kernel functions across a folding point so that they do not extend into the area that is completely censored. Two things have changed for this release:
1. In previous releases, the folding point was the last observed value. In this release, it is the first censoring time after the last observed value.
2. The folding procedure is applied not just when the 'function' parameter is 'pdf', but for all 'function' values.

#### Regression Analysis

The new nlmefit function fits nonlinear mixed-effects models to data with both fixed and random sources of variation. Mixed-effects models are commonly used with data over multiple groups, where measurements are correlated within groups but independent between groups.
Statistical Visualization
The boxplot function has new options for handling multiple grouping variables and extreme outliers. The lsline, gline, refline, and refcurve functions now work with scatter plots produced by the scatter function. In previous versions, these functions worked only with scatter plots produced by the plot function. The following visualization functions now have custom data cursors, displaying information such as observation numbers, group numbers, and the values of related variables: andrewsplot, biplot, ecdf, glyphplot, gplotmatrix, gscatter, normplot, parallelcoords, probplot, qqplot, scatterhist, and wblplot.
Changes to boxplot have altered a number of default behaviors:

- Box labels are now drawn as text objects rather than tick labels. Any code that customizes the box labels by changing tick marks should now set the tick locations as well as the tick labels.
- The function no longer returns a handles array with a fixed number of handles, and the order and meaning of the handles now depend on which options are selected. To locate a handle of interest, search for its 'Tag' property using findobj. 'Tag' values for box plot components are listed on the boxplot reference page.
- There are now valid handles for outliers, even when boxes have no outliers. In previous releases, the handles array returned by the function had NaN values in place of handles when boxes had no outliers. Now the 'xdata' and 'ydata' for outliers are NaN when there are no outliers.
- For small groups, the 'notch' parameter sometimes produces notches that extend outside of the box. In previous releases, the notch was truncated to the extent of the box, which could produce a misleading display. A new value of 'markers' for this parameter avoids the display issue. As a consequence, the anova1 function, which displays notched box plots for grouped data, may show notches that extend outside the boxes.

#### Utility Functions

The statistics options structure created by statset now includes a Jacobian field to specify whether or not an objective function can return the Jacobian as a second output.
Version 6.2 (R2008a) Statistics Toolbox Software
This table summarizes what's new in Version 6.2 (R2008a):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: Yes (summary)
- Fixed Bugs and Known Problems: Bug Reports (includes fixes)
New features and changes introduced in this version are organized by these topics:

- Descriptive Statistics on page 22
- Model Assessment on page 23
- Multivariate Methods on page 23
- Probability Distributions on page 23
- Regression Analysis on page 23
- Statistical Visualization on page 23
- Utility Functions on page 23

#### Descriptive Statistics

Bootstrap confidence intervals computed by bootci are now more accurate for lumpy data.
The formula for bootci confidence intervals of type 'bca' or 'cper' involves the proportion of bootstrap statistics less than the observed statistic. The formula now takes into account cases where there are many bootstrap statistics exactly equal to the observed statistic.
Two new cross-validation functions, cvpartition and crossval, partition data and assess models in regression, classification, and clustering applications.
A new sequential feature selection function, sequentialfs, selects predictor subsets that optimize user-defined prediction criteria. The new nnmf function performs nonnegative matrix factorization (NMF) for dimension reduction.
The new sobolset and haltonset functions produce quasi-random point sets for applications in Monte Carlo integration, space-filling experimental designs, and global optimization. Options allow you to skip, leap over, and scramble the points. The qrandstream function provides corresponding quasi-random number streams for intermittent sampling.
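A brief sketch of generating and scrambling a quasi-random point set (the dimension, skip count, and scramble type are illustrative choices):

```matlab
p = sobolset(2,'Skip',1000);            % 2-D Sobol set, skipping early points
p = scramble(p,'MatousekAffineOwen');   % apply a scrambling method
pts = net(p,100);                       % first 100 points in the unit square
```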
The new plsregress function performs partial least-squares regression for data with correlated predictors.
The normspec function now shades regions of a normal density curve that are either inside or outside specification limits.
The statistics options structure created by statset now includes fields for TolTypeFun and TolTypeX, to specify tolerances on objective functions and parameter values, respectively.
Version 6.1 (R2007b) Statistics Toolbox Software

This table summarizes what's new in Version 6.1 (R2007b):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: Yes (summary)
- Fixed Bugs and Known Problems: Bug Reports (includes fixes)
New features and changes introduced in this version are organized by these topics:

- Cluster Analysis on page 24
- Design of Experiments on page 25
- Hypothesis Tests on page 25
- Probability Distributions on page 25
- Regression Analysis on page 26
- Statistical Visualization on page 27

#### Cluster Analysis

The new gmdistribution class represents Gaussian mixture distributions, where random points come from different multivariate normal distributions with certain probabilities. The gmdistribution constructor creates mixture models with specified means, covariances, and mixture proportions, or by fitting a mixture model with a specified number of components to data. Methods for the class include:

- fit: Distribution fitting function
- pdf: Probability density function
- cdf: Cumulative distribution function
- random: Random number generator
- cluster: Data clustering
- posterior: Cluster posterior probabilities
- mahal: Mahalanobis distance

The cluster function for hierarchical clustering now accepts a vector of cutoff values, and returns a matrix of cluster assignments, with one column per cutoff value.
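A hedged sketch of fitting and using a mixture model (the two-cluster data is synthetic):

```matlab
X = [randn(100,2); randn(100,2) + 4];   % two well-separated groups
gm = gmdistribution.fit(X,2);           % fit a two-component Gaussian mixture
idx = cluster(gm,X);                    % hard assignment of points to components
pp = posterior(gm,X);                   % per-point component posterior probabilities
```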
The kmeans function now returns a vector of cluster indices of length n, where n is the number of rows in the input data matrix X, even when X contains NaN values. In the past, rows of X with NaN values were ignored, and the vector of cluster indices was correspondingly reduced in size. Now the vector of cluster indices contains NaN values where rows have been ignored, consistent with other toolbox functions.

#### Design of Experiments

A new option in the D-optimal design function candexch specifies fixed design points in the row-exchange algorithm. A similar feature is already available for the daugment function, which uses the coordinate-exchange algorithm.

#### Hypothesis Tests

The kstest function now uses a more accurate method to calculate the p-value for a single-sample Kolmogorov-Smirnov test.
kstest now compares the computed p-value to the desired cutoff, rather than comparing the test statistic to a table of values. Results may differ from those in previous releases, especially for small samples in two-sided tests where an asymptotic formula was used in the past.
A new fitting function, copulafit, has been added to the family of functions that describe dependencies among variables using copulas. The function fits parametric copulas to data, providing a link between models of marginal distributions and models of data correlations.

Objects from the classregtree class are intended to be compatible with the structure arrays that were produced in previous versions by the classification and regression tree functions listed above. In particular, classregtree supports dot indexing of the form t.property to obtain properties of the object t. The class also provides function-like behavior through parenthesis indexing, so that t(x) uses the tree t to classify or compute fitted values for predictors x, rather than index into t as a structure array as it did in the past. As a result, cell arrays should now be used to aggregate classregtree objects.
The new scatterhist function produces a scatterplot of 2D data and illustrates the marginal distributions of the variables by drawing histograms along the two axes. The function is also useful for viewing properties of random samples produced by functions such as copularnd, mvnrnd, and lhsdesign.

#### Other Improvements

The mvtrnd function now produces a single random sample from the multivariate t distribution if the cases input argument is absent. The zscore function, which centers and scales input data by mean and standard deviation, now returns the means and standard deviations as additional outputs.
Version 5.3 (R2006b) Statistics Toolbox Software
This table summarizes what's new in Version 5.3 (R2006b):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: Yes (summary)
- Fixed Bugs and Known Problems: Bug Reports (includes fixes)
New features and changes introduced in this version are organized by these topics:

- Demos on page 32
- Design of Experiments on page 32
- Hypothesis Tests on page 33
- Multinomial Distribution on page 33
- Regression Analysis on page 34
- Statistical Process Control on page 34
The following demo has been updated: Selecting a Sample Size, which has been modified to highlight the new sampsizepwr function.
The following visualization functions, commonly used in the design of experiments, have been added:

- interactionplot: Two-factor interaction plot for the mean
- maineffectsplot: Main effects plot for the mean
- multivarichart: Multivari chart for the mean
The following functions for hypothesis testing have been added or improved:

- jbtest: Replaces the chi-square approximation of the test statistic, which is asymptotic, with a more accurate algorithm that interpolates p-values from a table of quantiles. A new option allows you to run Monte Carlo simulations to compute p-values outside of the table.
- lillietest: Uses an improved version of Lilliefors table of quantiles, covering a wider range of sample sizes and significance levels, with more accurate values. New options allow you to test for exponential and extreme value distributions, as well as normal distributions, and to run Monte Carlo simulations to compute p-values outside of the tables.
- runstest: Adds a test for runs up and down to the existing test for runs above or below a specified value.
- sampsizepwr: New function to compute the sample size necessary for a test to have a specified power. Options are available for choosing a variety of test types.

If the significance level for a test lies outside the range of tabulated values, [0.001, 0.5], then both jbtest and lillietest now return an error. In previous versions, jbtest returned an approximate p-value and lillietest returned an error outside a smaller range, [0.01, 0.2]. Error messages suggest using the new Monte Carlo option for computing values outside the range of tabulated values.

If the data sample for a test leads to a p-value outside the range of tabulated values, then both jbtest and lillietest now return, with a warning, either the smallest or largest tabulated value. In previous versions, jbtest returned an approximate p-value and lillietest returned NaN.

#### Multinomial Distribution

The multinomial distribution has been added to the list of almost 50 probability distributions supported by the toolbox:

- mnpdf: Multinomial probability density function
- mnrnd: Multinomial random number generator
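A short sketch (the trial count and probabilities are arbitrary):

```matlab
n = 10;
p = [0.2 0.3 0.5];   % three categories
r = mnrnd(n,p);      % random counts per category; the counts sum to 10
f = mnpdf(r,p);      % probability of drawing exactly those counts
```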

#### Multinomial Regression

Support has been added for multinomial regression modeling of discrete multi-category response data, including multinomial logistic regression. The following new functions supplement the regression models in glmfit and glmval by providing for a wider range of response values:

- mnrfit: Fits a multinomial regression model to data
- mnrval: Computes predicted probabilities for the multinomial regression model
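A minimal sketch on synthetic data (the predictor and response are invented):

```matlab
X = randn(100,1);
Y = randsample(3,100,true);   % nominal responses in categories 1, 2, 3
B = mnrfit(X,Y);              % multinomial logistic regression coefficients
phat = mnrval(B,X);           % 100-by-3 matrix of predicted category probabilities
```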

#### Multivariate Regression

The new mvregress function carries out multivariate regression on data with missing response values. An option allows you to specify how missing data is handled.

#### Survival Analysis

coxphfit: A new option allows you to specify the values at which the baseline hazard is computed.
Statistical Process Control
The following new functions consolidate and expand upon existing functions for statistical process control:

- capability: Computes a wider range of probabilities and capability indices than the capable function found in previous releases
- controlchart: Displays a wider range of control charts than the ewmaplot, schart, and xbarplot functions found in previous releases
- controlrules: Supplements the new controlchart function by providing for a wider range of control rules (Western Electric and Nelson)
- gagerr: Performs a gage repeatability and reproducibility study on measurements grouped by operator and part
The capability function subsumes the capable function that appeared in previous versions of Statistics Toolbox software, and the controlchart function subsumes the functions ewmaplot, schart, and xbarplot. The older functions remain in the toolbox for backwards compatibility, but they are no longer documented or supported.

Markov Chain Monte Carlo Methods
The following functions generate random numbers from nonstandard distributions using Markov chain Monte Carlo methods:

- mhsample: Generates random numbers using the Metropolis-Hastings algorithm
- slicesample: Generates random numbers using a slice sampling algorithm
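For example, a sketch of slice sampling from a standard normal target (the target density and sample count are illustrative):

```matlab
% Draw 2000 samples starting from 0, using the density handle as the target
x = slicesample(0,2000,'pdf',@(t) normpdf(t));
% For a sample this large, the mean and variance should be near 0 and 1
```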
Pearson and Johnson Systems of Distributions
Support has been added for random number generation from the Pearson and Johnson systems of distributions:

- pearsrnd: Random numbers from a distribution in the Pearson system
- johnsrnd: Random numbers from a distribution in the Johnson system

#### Robust Regression

To supplement the robustfit function, the following functions now have options for robust fitting:

- nlinfit: Nonlinear least-squares regression
- nlparci: Confidence intervals for parameters in nonlinear regression
- nlpredci: Confidence intervals for predictions in nonlinear regression
The following control chart functions now support time-series objects:

- xbarplot: Xbar plot
- schart: Standard deviation chart
- ewmaplot: Exponentially weighted moving average plot
Version 5.1 (R14SP3) Statistics Toolbox Software
This table summarizes what's new in Version 5.1 (R14SP3):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: No
- Fixed Bugs and Known Problems: No
New features and changes introduced in this version are organized by these topics:

- Demos on page 40
- Descriptive Statistics on page 41
- Hypothesis Tests on page 41
- Probability Distributions on page 42
- Regression Analysis on page 43
- Statistical Visualization on page 43
The following demos have been added to the toolbox:

- Curve Fitting and Distribution Fitting
- Fitting a Univariate Distribution Using Cumulative Probabilities
- Fitting an Orthogonal Regression Using Principal Components Analysis
- Modelling Tail Data with the Generalized Pareto Distribution
- Pitfalls in Fitting Nonlinear Models by Transforming to Linearity
- Weighted Nonlinear Regression

The following demo has been updated: Modelling Data with the Generalized Extreme Value Distribution
The new partialcorr function computes the correlation of one set of variables while controlling for a second set of variables. The grpstats function now computes a wider variety of descriptive statistics for grouped data. Choices include the mean, standard error of the mean, number of elements, group name, standard deviation, variance, confidence interval for the mean, and confidence interval for new observations. The function also supports the computation of user-defined statistics.
Chi-Square Goodness-of-Fit Test
The new chi2gof function tests if a sample comes from a specified distribution, against the alternative that it does not come from that distribution, using a chi-square test statistic.
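A minimal sketch (the data is synthetic; by default the null hypothesis is a normal distribution with mean and variance estimated from the data):

```matlab
x = randn(100,1);
[h,p] = chi2gof(x);   % h = 0 means the test fails to reject the null
```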

Version 5.0.2 (R14SP2) Statistics Toolbox Software
This table summarizes what's new in Version 5.0.2 (R14SP2):

- New Features and Changes: Yes (details below)
- Version Compatibility Considerations: No
- Fixed Bugs and Known Problems: Bug Reports (includes fixes)
New features and changes introduced in this version:

The cophenet function now returns cophenetic distances as well as the cophenetic correlation coefficient.
Compatibility Summary for Statistics Toolbox Software
This table summarizes new features and changes that might cause incompatibilities when you upgrade from an earlier version, or when you use files on multiple versions. Details are provided in the description of the new feature or change. Where listed, see the Compatibility Considerations subheading for each named feature or change.

| Version (Release) | New Features and Changes with Version Compatibility Impact |
| --- | --- |
| Latest version: V7.5 (R2011a) | None |
| V7.4 (R2010b) | nlmefit Support for Error Models, and nlmefitsa changes on page 8; Export Probability Objects with dfittool on page 9 |
| V7.3 (R2010a) | None |
| V7.2 (R2009b) | None |
| V7.1 (R2009a) | None |
| V7.0 (R2008b) | Data Organization on page 18; Statistical Visualization on page 20 |
| V6.2 (R2008a) | Descriptive Statistics on page 22 |
| V6.1 (R2007b) | Cluster Analysis on page 24; Hypothesis Tests on page 25; Probability Distributions on page 25; Regression Analysis on page 26 |
| V6.0 (R2007a) | Multivariate Statistics on page 29; Regression Analysis on page 30 |
| V5.3 (R2006b) | Hypothesis Tests on page 33; Statistical Process Control on page 34 |
