Standard deviation

Mathematical terms
Standard deviation, a mathematical term, is the arithmetic square root of the arithmetic mean of the squared deviations from the mean (that is, of the variance), and is denoted by σ. It is also called the experimental standard deviation, and it is the most commonly used measure of statistical dispersion in probability and statistics.
The standard deviation is the arithmetic square root of the variance. It reflects the degree of dispersion of a data set: two data sets with the same mean may not have the same standard deviation.
Chinese name: Standard deviation
Foreign name: Standard Deviation
Field of application: Statistics
Subject: Mathematics
Function: Measures the degree of statistical dispersion
Related concept: Variance

Calculation formula

Relation to variance: variance = standard deviation squared.
In experiments, a single measurement inevitably carries some error, so we usually measure several times, use the mean of the measured values to represent the measured quantity, and characterize the spread of the data with an error bar whose height is ± the standard error. Here, that quantity is the standard deviation.
Coefficient of variation: $CV = \sigma / \bar{x} \times 100\%$, where $\bar{x}$ is the mean of the data.
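The relations in this section can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are assumptions made for the example, not part of the original article, and the population form (divide by n) is used.

```python
import math

def population_sd(values):
    """Square root of the mean squared deviation from the mean (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def coefficient_of_variation(values):
    """Standard deviation divided by the mean, as a fraction (times 100 gives %)."""
    mean = sum(values) / len(values)
    return population_sd(values) / mean

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the article
sd = population_sd(data)
variance = sd ** 2               # variance = standard deviation squared
cv = coefficient_of_variation(data)
print(f"sd={sd:.3f} variance={variance:.3f} cv={cv:.1%}")
```

Because the coefficient of variation divides by the mean, it is only meaningful when the mean is not close to zero.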

Nature and application

Standard deviation is the most commonly used measure of statistical dispersion in probability and statistics. It is defined as the square root of the arithmetic mean of the squared differences between each value in the population and the mean. It reflects the degree of dispersion among the individuals in a group. A measure of dispersion should, in principle, have two properties:
it is non-negative, and it has the same unit as the measured data. The standard deviation of a population or of a random variable differs from the standard deviation computed from a sample, that is, from a subset of the population.
In simple terms, the standard deviation measures how far a set of data is spread around its mean. A large standard deviation means that most values differ greatly from the mean; a small standard deviation means that the values lie close to the mean.
For example, the two sets of numbers {0,5,9,14} and {5,6,8,9} both have an average of 7, but the second set has a smaller standard deviation.
Standard deviation can be used as a measure of uncertainty. In the physical sciences, for example, when a measurement is repeated, the standard deviation of the set of measured values represents the precision of those measurements. When deciding whether measured values agree with a predicted value, the standard deviation of the measurements plays a decisive role: if the measured mean lies too far from the predicted value (judged relative to the standard deviation), the measurement and the prediction are considered to contradict each other. This is easy to understand: if the measured values all fall outside the range in which the prediction would place them, it is reasonable to question whether the prediction is correct.
In investing, standard deviation is used as a measure of the stability of returns. A higher standard deviation means that returns stray further from their past mean, so returns are less stable and the risk is higher. Conversely, a smaller standard deviation means more stable returns and lower risk.
For example, six students in each of groups A and B took the same language test. Group A scored 95, 85, 75, 65, 55 and 45; group B scored 73, 72, 71, 69, 68 and 67. Both groups have a mean of 70, but the standard deviation is about 17.08 points for group A and about 2.16 points for group B, which shows that the scores in group A are spread much more widely than those in group B.
For a population (that is, when computing the population variance), divide by n inside the square root (corresponding Excel function: STDEV.P);
for a sample (that is, when computing the sample variance), divide by (n-1) inside the square root (corresponding Excel function: STDEV.S).
Because in practice we mostly work with samples, dividing by n-1 inside the square root is the more widely used form.
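The test-score example can be checked with Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The variable names below are illustrative assumptions; the scores are the ones quoted in the text.

```python
import statistics

group_a = [95, 85, 75, 65, 55, 45]
group_b = [73, 72, 71, 69, 68, 67]

results = {}
for name, scores in (("A", group_a), ("B", group_b)):
    results[name] = {
        "mean": statistics.mean(scores),
        "population_sd": statistics.pstdev(scores),  # divide by n, like STDEV.P
        "sample_sd": statistics.stdev(scores),       # divide by n-1, like STDEV.S
    }
    print(name, results[name])
```

The figures of about 17.08 and 2.16 quoted above correspond to the population form. For only six values the sample form is noticeably larger, and the gap shrinks as n grows.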

Formula meaning

Subtract the mean from each number, square each difference, and add up the squares; divide that sum by the number of values in the set (or by that number minus one) to obtain the variance, and then take the square root of the result.
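Written as formulas, the verbal description corresponds to the two standard forms below. The symbols are conventional notation assumed here, not taken from the original page: $\mu$ and $N$ are the population mean and size, $\bar{x}$ and $n$ are the sample mean and size.

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}
\qquad\text{(population)}
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
\qquad\text{(sample)}
```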
The dark blue area is the range of values within one standard deviation of the mean. In a normal distribution, this range accounts for 68.2% of all values (the total being 1). The range within two standard deviations (dark blue and blue) accounts for 95.4%, and the range within three standard deviations (dark blue, blue and light blue) accounts for 99.7%.
This property of the standard deviation gives rise to the three-sigma rule (three-sigma guideline).
Normal distribution diagram
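The coverage figures for the normal distribution can be reproduced with the error function from Python's standard library. This is a small sketch; the helper name `fraction_within` is an assumption made for the example.

```python
import math

def fraction_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {fraction_within(k):.4f}")
```

The printed values differ slightly from figures read off a diagram, because diagrams usually round each band separately before adding them up.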

Dispersion

Standard deviation is one of the most common ways to quantify the degree of dispersion of a set of data, and it is an important indicator of precision. To talk about standard deviation, we first need to understand why it exists. We use some method to measure a quantity, but every measurement method has error, so the measured value is not the true value. The gap between the measured value and the true value is the most decisive indicator for evaluating a measurement method. But we do not know what the true value is, so how to quantify the accuracy of a measurement method becomes a problem. This is also the purpose of quality control in clinical work: to ensure that each batch of experimental results is accurate and reliable.
Although the true value of a sample cannot be known, every sample does have one. We can expect the values measured by a good method to be clustered tightly around the true value. If they are not, the distance from the true value will be large and the accuracy will be poor; it is hard to imagine a method with large dispersion producing accurate results. Dispersion is therefore the most important and most basic index for evaluating the quality of a method.
There are many ways to evaluate and quantify the dispersion of a set of data:

Range

The most direct and simplest way to evaluate the dispersion of a set of data is the maximum value minus the minimum value (that is, the range). This method is the most common in daily life; discarding the highest and lowest scores in a competition is a concrete application of the range.

Sum of squared deviations from the mean

Because errors cannot be controlled, it is unscientific to judge a set of data from only two data points, so the range is not used in more demanding fields. In fact, dispersion is the extent to which the data deviate from the mean. The differences between the data and the mean (which we call the deviations from the mean) can therefore be added up to reflect the degree of dispersion: the greater the sum, the greater the dispersion.
But because random errors follow a normal distribution, the deviations from the mean are both positive and negative, and their algebraic sum is zero. Mathematics offers two ways around the sign problem. One is to take absolute values, which gives the sum of absolute deviations from the mean. The other, and the most commonly used, is squaring, which makes every term non-negative. The sum of squared deviations from the mean is therefore an index for evaluating dispersion.
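The measures discussed so far can be compared on one of the article's own example sets, {0, 5, 9, 14}. The variable names in this sketch are illustrative assumptions.

```python
data = [0, 5, 9, 14]  # example set from the article, mean 7
mean = sum(data) / len(data)
deviations = [x - mean for x in data]

value_range = max(data) - min(data)                           # uses only two points
sum_of_deviations = sum(deviations)                           # signs cancel out
sum_of_absolute_deviations = sum(abs(d) for d in deviations)  # first fix: absolute values
sum_of_squares = sum(d ** 2 for d in deviations)              # second fix: squares

print(value_range, sum_of_deviations, sum_of_absolute_deviations, sum_of_squares)
```

The plain sum of deviations is zero for any data set, which is why it cannot serve as a measure of dispersion on its own.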

Variance

The sum of squared deviations from the mean depends on the number of samples, so it can only compare the dispersion of sets of the same size, and in practice the sets being compared rarely have the same size. To remove the influence of the number of samples and improve comparability, the sum of squared deviations from the mean is averaged. The result is what we call the variance, and it is a good indicator of dispersion.
The larger the sample size, the better it reflects the real situation, but the arithmetic mean ignores this entirely. Statistics has long taken this into account: when averaging the squared deviations of a sample, the sum is usually divided by the degrees of freedom (n-1), meaning the number of values that are free to vary. Once the mean is fixed, the last value is no longer free, so there are n-1 degrees of freedom.

Standard deviation meaning

Because the variance is in squared units, it differs too much from the measured values themselves and is hard to interpret intuitively, so we usually take the square root of the variance, and that is the standard deviation.

Coefficient of variation

Standard deviation reflects the degree of dispersion of a set of data objectively and accurately, but standard deviations are not comparable across different items, or across different samples of the same item, so the coefficient of variation (CV) is introduced for evaluating methods.
The mean and standard deviation of a set of data are often reported together. Intuitively, if the mean is taken as the center of the values, then the standard deviation is a "natural" measure of statistical dispersion.
From a geometric point of view, the standard deviation can be understood as a function of the distance from a point to a line in n-dimensional space. As a simple example, suppose a set of data has three values, $x_1, x_2, x_3$. They determine a point $P = (x_1, x_2, x_3)$ in three-dimensional space. Imagine the straight line through the origin $L = \{(r, r, r) : r \in \mathbb{R}\}$. If all three values are equal, then P lies on L, the distance from P to L is 0, and the standard deviation is also 0. If the three values are not all equal, drop the perpendicular PR from P to L, meeting L at R; every coordinate of R is then the mean of the three values: $R = (\bar{x}, \bar{x}, \bar{x})$.
Using some algebra, it is not difficult to find that the distance between P and R (that is, the distance from P to the line L) is $\sigma\sqrt{3}$. In n-dimensional space the same rule applies, with 3 replaced by n.
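The geometric picture can be checked numerically. The sketch below projects a point onto the line spanned by (1, 1, ..., 1) and compares the resulting distance with the population standard deviation times the square root of n. The data and variable names are illustrative assumptions, not from the original article.

```python
import math
import statistics

p = [2.0, 4.0, 9.0]  # the point P; illustrative values
n = len(p)

# Orthogonal projection of P onto the line spanned by d = (1, ..., 1):
# R = ((P . d) / (d . d)) * d, so every coordinate of R equals sum(P) / n.
d = [1.0] * n
scale = sum(pi * di for pi, di in zip(p, d)) / sum(di * di for di in d)
r = [scale * di for di in d]

distance_pr = math.dist(p, r)   # distance from P to the line L
sigma = statistics.pstdev(p)    # population standard deviation (divide by n)

print(r, distance_pr, sigma * math.sqrt(n))
```

The distance matches the population form of the standard deviation, because the squared distance is exactly the sum of squared deviations from the mean.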

Standard deviation, standard error

Standard deviation and standard error both belong to mathematical statistics. The two terms look similar, and both express a degree of dispersion around some reference or central value, that is, a degree of variation, but there is a big difference between them.
First, consider statistical sampling. In real life or in survey research we often cannot examine every member of the target group we are interested in. We can only select some members (a sample), analyze the data obtained with statistical principles and methods to get the sample results, and then use the sample results to infer the situation of the whole population. More than one sample can be drawn from a population, and the more samples are drawn, the closer their sample mean is to the population mean.
Standard deviation describes the degree of dispersion of the sample data. It is the square root of the sample variance and is taken relative to the mean of the sample data; results are usually written as M±SD, indicating how far the observations in a sample lie from the mean. It follows that the standard deviation is affected by extreme values. The smaller the standard deviation, the more tightly clustered the data; the larger the standard deviation, the more dispersed the data. Whether a large or a small standard deviation is desirable depends on the test. For an academic test, a large standard deviation indicates that the students' scores are widely dispersed, so the test discriminates better between students' academic levels. For a test that measures a particular psychological trait, a small standard deviation indicates that the items are homogeneous, so a smaller standard deviation is better. The standard deviation is closely related to the normal distribution: in a normal distribution, the interval within 1 standard deviation of the mean covers 68.26% of the area under the curve, and the interval within 1.96 standard deviations covers 95% of the area. This plays an important role in test score equating.
Standard error describes sampling error. Because many samples can be drawn from a population, the data of each sample are an estimate of the population data. The standard error describes how well the current sample estimates the population: it represents the error between the sample mean and the population mean. The standard error is calculated by dividing the sample standard deviation by the square root of the sample size. It follows that the standard error is strongly affected by sample size: the larger the sample, the smaller the standard error, the smaller the sampling error, and the better the sample represents the population.
For a normally distributed population, drawing a sample of size n gives a sample mean; to estimate the population mean from the sample mean, the variance or standard deviation of the sample mean must be taken into account. [1]
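A minimal sketch of the difference between the two quantities follows. It assumes the usual formula, standard error = sample standard deviation / sqrt(n); the helper name and the data are illustrative, and the larger sample is simply the smaller one repeated so that the spread stays similar while n grows.

```python
import math
import statistics

def standard_error(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

small_sample = [73, 72, 71, 69, 68, 67]
large_sample = small_sample * 4  # same values repeated: similar spread, larger n

for label, sample in (("n=6", small_sample), ("n=24", large_sample)):
    print(label, "SD =", round(statistics.stdev(sample), 3),
          "SE =", round(standard_error(sample), 3))
```

The standard deviation barely moves between the two samples, while the standard error roughly halves when the sample size is quadrupled.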

Functions

Excel provides four functions, STDEV.S, STDEV.P, STDEVA and STDEVPA, which compute, respectively, the sample standard deviation, the population standard deviation, the sample standard deviation with logical values included in the calculation, and the population standard deviation with logical values included in the calculation (Excel uses the wording "standard deviation").
The calculation methods differ as follows: sample standard deviation^2 = sum of squared deviations / (number of data - 1); population standard deviation^2 = sum of squared deviations / number of data.
The functions can be decomposed in Excel as follows:
(1) The STDEV function can be decomposed as (assuming the sample data are in a range such as A1:E10):
STDEV(A1:E10)=SQRT(DEVSQ(A1:E10)/(COUNT(A1:E10)-1))
(2) The STDEVP function can be decomposed as (assuming the population data are in a range such as A1:E10):
STDEVP(A1:E10)=SQRT(DEVSQ(A1:E10)/COUNT(A1:E10))
STDEVA and STDEVPA can be decomposed in the same way.
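The same decomposition can be mirrored outside Excel. In this sketch, `devsq` is a hypothetical Python helper written to imitate what Excel's DEVSQ returns (the sum of squared deviations from the mean), and `len` plays the role of COUNT for a list that contains only numbers.

```python
import math
import statistics

def devsq(values):
    """Sum of squared deviations from the mean, imitating Excel's DEVSQ."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

cells = [95, 85, 75, 65, 55, 45]  # stands in for a numeric range such as A1:E10

stdev_s = math.sqrt(devsq(cells) / (len(cells) - 1))  # sample form
stdev_p = math.sqrt(devsq(cells) / len(cells))        # population form

# Cross-check against the standard library implementations.
print(stdev_s, statistics.stdev(cells))
print(stdev_p, statistics.pstdev(cells))
```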

Foreign exchange terminology

Standard deviation is a statistical measure of how much the values in a set differ from their mean. It is used to assess the possible degree of price change, or volatility. The larger the standard deviation, the wider the range of price fluctuation and the more volatile the performance of stocks and other financial instruments.
In Excel, call the function "STDEV.S" to estimate the standard deviation from a sample. The standard deviation reflects the degree of dispersion relative to the mean.

Application example


Fund selection

Fund algorithm
When investing in funds, people generally pay most attention to performance, but often, after buying the fund with the best recent performance, they find that it does not perform as well as expected. This is because the selected fund is too volatile and its performance is not consistent.
Standard deviation is a tool for measuring a fund's volatility: it indicates how much the fund's value is likely to change. The larger the standard deviation, the more the fund's future net value is likely to change, the lower its stability, and the higher its risk.
For example, a fund with a one-year standard deviation of 30% could see its net value rise by 30% within a year, but could also see it fall by 30%. So if two funds have the same return, the investor should choose the one with the lower standard deviation (the same return for less risk), and if two funds have the same standard deviation, the investor should choose the one with the higher return (the same risk for a higher return). Investors are advised to weigh return and risk together when judging a fund. For example, fund A has a two-year return of 36% and a standard deviation of 18%, while fund B has a two-year return of 24% and a standard deviation of 8%. On these figures, fund A's return is higher than fund B's, but so is its risk. The return per unit of risk is 2 for fund A (36% / 18%) and 3 for fund B (24% / 8%). So although fund A looks better when judged on return alone, fund B is the better choice once returns are adjusted for risk using the standard deviation.
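The comparison of the two funds amounts to dividing each return by its standard deviation. A small sketch, using the figures quoted in the text (the dictionary layout and names are assumptions made for the example):

```python
# Two-year return and standard deviation, both in percent, as quoted in the text.
funds = {
    "A": {"return_pct": 36, "sd_pct": 18},
    "B": {"return_pct": 24, "sd_pct": 8},
}

# Return earned per unit of risk taken.
return_per_risk = {
    name: f["return_pct"] / f["sd_pct"] for name, f in funds.items()
}
best = max(return_per_risk, key=return_per_risk.get)

print(return_per_risk, "-> better risk-adjusted choice:", best)
```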
In addition, standard deviation can also be used to judge a fund's attributes. According to Morningstar, the average standard deviation is 5.14 for equity funds, 5.04 for actively allocated funds, 4.86 for conservatively allocated funds, 2.91 for ordinary bond funds, and 0.19 for money market funds. It can be seen that the more aggressive the fund, the greater its standard deviation. If an investor holds a fund whose standard deviation is above the average, its risk is higher; investors may wish to review the funds they hold while watching the Olympic Games.

Stock market analysis

Fluctuations in stock prices are the manifestation of stock market risk, so analyzing stock market risk means analyzing the fluctuations of stock market prices. Volatility represents the uncertainty of future prices, and this uncertainty is generally described by the variance or the standard deviation (Markowitz, 1952). The table below lists statistical indicators for Chinese and US stocks over several years. The data for the Chinese securities market were downloaded from the "Chanlong" software, and the data for the US stock market were taken from the ECI "WorldStockExchangeDataDisk".
Stock statistical indicators
Year | Performance, Shanghai Composite Index | Performance, Standard & Poor's index | Volatility, Shanghai Composite Index | Volatility, Standard & Poor's index
1996 | 110.93 | 16.46 | 0.2376 | 0.0573
1997 | -0.13 | 31.01 | 0.1188 | 0.0836
1998 | 8.94 | 26.67 | 0.0565 | 0.0676
1999 | 17.24 | 19.53 | 0.1512 | 0.0433
2000 | 43.86 | -10.14 | 0.097 | 0.0421
2001 | -15.34 | -13.04 | 0.0902 | 0.0732
2002 | -20.82 | -23.37 | 0.0582 | 0.1091
Through calculation, we get:
Expected value of Shanghai Composite Index performance = (110.93 - 0.13 + 8.94 + 17.24 + 43.86 - 15.34 - 20.82) / 7 ≈ 20.6685714
Expected value of Shanghai Composite Index volatility ≈ 0.115643
Expected value of Standard & Poor's performance ≈ 6.731429
Expected value of Standard & Poor's volatility ≈ 0.068029
Figure 2: Analysis chart
Then, using the standard deviation formula (formula (2)), we calculate:
Standard deviation of Shanghai Composite Index performance ≈ 45.2489073
Standard deviation of Shanghai Composite Index volatility ≈ 0.063167
Standard deviation of Standard & Poor's performance ≈ 21.70647
Standard deviation of Standard & Poor's volatility ≈ 0.023647
Because the standard deviation is an absolute quantity, the Chinese and US markets cannot be compared directly by standard deviation, but they can be compared directly by the coefficient of variation: coefficient of variation C·V = standard deviation SD ÷ mean MN × 100%. Calculation gives:
Coefficient of variation of Shanghai Composite Index performance ≈ 2.18926148
Coefficient of variation of Shanghai Composite Index volatility ≈ 0.5462
Coefficient of variation of Standard & Poor's performance ≈ 3.2247
Coefficient of variation of Standard & Poor's volatility ≈ 0.3476
The comparison shows that the coefficient of variation of Shanghai Composite Index volatility is greater than that of Standard & Poor's volatility, indicating that over the long run the Chinese stock market is relatively less stable, or in other words is a less mature stock market.
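The figures in this example can be reproduced with a short script. Two points are assumptions made explicit here: the negative signs for loss years are taken from the expected-value calculation in the text, and the sample form of the standard deviation (n-1 divisor) is used, since that is the form that reproduces the quoted values. Variable and function names are illustrative.

```python
import statistics

# Yearly values for 1996-2002; signs follow the expected-value calculation in the text.
series = {
    "Shanghai performance": [110.93, -0.13, 8.94, 17.24, 43.86, -15.34, -20.82],
    "S&P performance": [16.46, 31.01, 26.67, 19.53, -10.14, -13.04, -23.37],
    "Shanghai volatility": [0.2376, 0.1188, 0.0565, 0.1512, 0.097, 0.0902, 0.0582],
    "S&P volatility": [0.0573, 0.0836, 0.0676, 0.0433, 0.0421, 0.0732, 0.1091],
}

def summarize(values):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n-1 divisor
    return mean, sd, sd / mean

summary = {name: summarize(values) for name, values in series.items()}
for name, (mean, sd, cv) in summary.items():
    print(f"{name}: mean={mean:.6f} sd={sd:.6f} cv={cv:.4f}")
```

The coefficients of variation come out as plain ratios; multiplying by 100 expresses them as percentages.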

Application in enterprise

Analysis chart
Capital structure refers to the proportions among an enterprise's various sources of funds, and is the result of its financing activities. The optimal capital structure is the one that minimizes the enterprise's cost of capital and maximizes its value. The debt-to-equity ratio, that is, the proportion between borrowed capital and own capital, is an important variable reflecting an enterprise's capital structure. An enterprise's assets are financed by debt funds and equity funds, whose risk levels and rates of return differ. According to portfolio theory, diversifying investments can spread a certain amount of risk, so providers of funds need to decide the proportions in which to invest in debt funds and equity funds, so as to maximize their benefit while weighing risk against return. In theory, the investment proportion that maximizes the benefit of the external providers of funds, and hence the value of the enterprise, is, from the point of view of enterprise financing, the optimal capital structure.
Suppose an enterprise raises its funds by issuing bonds and stock, and both are risky assets. The rate of return on the bonds is $r_D$, with risk measured by the standard deviation $\sigma_D$; the rate of return on the stock is $r_E$, with risk $\sigma_E$; the correlation coefficient between stocks and bonds is $\rho_{DE}$, and the covariance is $\mathrm{Cov}(r_D, r_E) = \rho_{DE}\sigma_D\sigma_E$; the proportion of bonds is $w_D$, and the proportion of stocks is $(1 - w_D)$. According to portfolio theory, the expected rate of return that the enterprise's external investors obtain from investing in the enterprise is $E(r_P) = w_D E(r_D) + (1 - w_D) E(r_E)$, and the variance is $\sigma_P^2 = w_D^2\sigma_D^2 + (1 - w_D)^2\sigma_E^2 + 2 w_D (1 - w_D)\rho_{DE}\sigma_D\sigma_E$.
  1. Corporate debt funds and equity funds are perfectly positively correlated, that is, the correlation coefficient $\rho_{DE}$ is 1. The expected rate of return obtained by outside investors is $E(r_P) = w_D E(r_D) + (1 - w_D) E(r_E)$, and the standard deviation of risk is $\sigma_P = w_D\sigma_D + (1 - w_D)\sigma_E$. That is, the standard deviation of the combination equals the weighted mean of the standard deviations of its parts, so the portfolio cannot diversify away investment risk. According to portfolio theory, investors are indifferent between different proportions of such a portfolio.
  2. Corporate debt funds and equity funds are perfectly negatively correlated, that is, their correlation coefficient is -1. The expected rate of return received by the investor and its variance are, respectively, $E(r_P) = w_D E(r_D) + (1 - w_D) E(r_E)$ and $\sigma_P^2 = [w_D\sigma_D - (1 - w_D)\sigma_E]^2$. According to portfolio theory, the portfolio is efficient only if the proportion invested in stocks is greater than $\sigma_D / (\sigma_D + \sigma_E)$. For enterprise financing, this means that the financing mix is efficient when the proportion of equity funds is greater than $\sigma_D / (\sigma_D + \sigma_E)$, and when the proportion is exactly $1 - w_D = \sigma_D / (\sigma_D + \sigma_E)$, the risk of the enterprise's financing portfolio is zero.
  3. The correlation coefficient between debt funds and equity funds is greater than -1 and less than 1. In theory, an enterprise's two financing methods are highly correlated: on the one hand they bear the same systematic risk, and on the other hand they bear the same enterprise-specific risk. In practice, therefore, the correlation between an enterprise's different financing methods can be neither perfectly positive nor perfectly negative. Debt funds hold a fixed claim on the enterprise, while equity funds hold only a residual claim, so the fluctuation of debt funds is unlikely to be as large as that of equity funds. At the same time, the enterprise's risk affects its debt funds and its equity funds simultaneously, so the correlation coefficient between them cannot be negative. The correlation coefficient between an enterprise's different financing methods is generally between 0 and 1.
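The three correlation cases can be explored numerically with the standard two-asset portfolio variance formula. Everything specific in this sketch is an assumption made for illustration: the function name, the weight, and the standard deviations of 8% for debt and 18% for equity are hypothetical and do not come from the original article.

```python
import math

def portfolio_sd(w_debt, sd_debt, sd_equity, rho):
    """Standard deviation of a mix with weight w_debt on debt and 1 - w_debt on equity."""
    w_equity = 1.0 - w_debt
    variance = (
        (w_debt * sd_debt) ** 2
        + (w_equity * sd_equity) ** 2
        + 2.0 * w_debt * w_equity * rho * sd_debt * sd_equity
    )
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding errors

sd_debt, sd_equity = 0.08, 0.18  # hypothetical standard deviations
w = 0.4                          # hypothetical weight on debt

# Case 1: correlation +1, risk is the weighted mean of the two standard deviations.
weighted_mean = w * sd_debt + (1 - w) * sd_equity
sd_rho_plus_one = portfolio_sd(w, sd_debt, sd_equity, 1.0)

# Case 2: correlation -1, risk vanishes at one particular weight.
w_zero_risk = sd_equity / (sd_debt + sd_equity)
sd_rho_minus_one = portfolio_sd(w_zero_risk, sd_debt, sd_equity, -1.0)

# Case 3: correlation between 0 and 1, risk is below the weighted mean.
sd_rho_half = portfolio_sd(w, sd_debt, sd_equity, 0.5)

print(sd_rho_plus_one, weighted_mean, sd_rho_minus_one, sd_rho_half)
```

With any correlation below 1, the combined standard deviation falls below the weighted mean of the parts, which is the diversification effect the list describes.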
So in what proportion does the value of the enterprise reach its maximum? According to portfolio theory, one financing combination is superior to another when it offers an expected return at least as high with risk no greater. It can be seen that the immediate factors determining an enterprise's capital structure are mainly the rates of return and risks of the different financing methods and the correlation coefficient between them.