Measures of spread: range, variance & standard deviation (video) | Khan Academy

Video transcript

In the last video we talked about different ways to represent the central tendency or the average of a data set. What we're going to do in this video is expand that a little bit to understand how spread apart the data is as well. So let's just think about this a little bit. Let's say I have negative 10, 0, 10, 20 and 30. Let's say that's one data set right there. And let's say the other data set is 8, 9, 10, 11 and 12. Now let's calculate the arithmetic mean for both of these data sets. So let's calculate the mean. And when you go further on in statistics, you're going to understand the difference between a population and a sample. We're assuming that this is the entire population of our data. So we're going to be dealing with the population mean. We're going to be dealing with, as you'll see, the population measures of dispersion. I know these are all fancy words. In the future, you're not going to have all of the data. You're just going to have some sample of it, and you're going to try to estimate things for the entire population. So I don't want you to worry too much about that just now. But if you are going to go further in statistics, I just want to make that clarification. Now, the population mean, or the arithmetic mean of this data set right here, is negative 10 plus 0 plus 10 plus 20 plus 30, all over 5, because we have five data points. And what is this equal to? That negative 10 cancels out with that 10, and 20 plus 30 is 50, which divided by 5 is equal to 10. Now, what's the mean of this data set? 8 plus 9 plus 10 plus 11 plus 12, all of that over 5.
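The two means set up here can be checked with a few lines of Python. This is only a minimal sketch: the names `first_set`, `second_set`, and `population_mean` are made up for illustration and do not come from the video.

```python
# The two data sets from the video, each treated as a whole population.
first_set = [-10, 0, 10, 20, 30]
second_set = [8, 9, 10, 11, 12]


def population_mean(data):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(data) / len(data)


print(population_mean(first_set))   # 50 / 5
print(population_mean(second_set))  # 50 / 5
```

Both calls print 10.0, so the two sets share the same mean even though their values look quite different.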
And the way we could think about it, 8 plus 12 is 20, 9 plus 11 is another 20, so that's 40, and then adding the remaining 10 gives 50. So this, once again, is going to be 50 over 5. So this has the exact same population mean. Or if you don't want to worry about the word population or sample and all of that, both of these data sets have the exact same arithmetic mean. When you take the sum of these numbers and divide by 5, you get 10, and when you take the sum of these numbers and divide by 5, you get 10 as well. But clearly, these sets of numbers are different. You know, if you just looked at this number, you'd say, oh, maybe these sets are very similar to each other. But when you look at these two data sets, one thing might pop out at you. All of these numbers are very close to 10. I mean, the furthest number here is 2 away from 10. 12 is only 2 away from 10. Here, these numbers are further away from 10. Even the closer ones are still 10 away, and these guys are 20 away from 10. So this right here, this data set right here, is more dispersed, right? These guys are further away from our mean than these guys are from this mean. So let's think about different ways we can measure dispersion, or how far away we are from the center, on average. Now one way, and this is kind of the simplest way, is the range. You won't see it used too much, but it's a very simple way of understanding how big the spread is between the largest and the smallest number. You literally take the largest number, which is 30 in our example, and from that, you subtract the smallest number.
So 30 minus negative 10, which is equal to 40, tells us that the difference between the largest and the smallest number is 40, so we have a range of 40 for this data set. Here, the range is the largest number, 12, minus the smallest number, which is 8, which is equal to 4. So here range is actually a pretty good measure of dispersion. We say, OK, both of these guys have a mean of 10. But when I look at the range, this guy has a much larger range, so that tells me this is a more dispersed set. But range is not always going to tell you the whole picture. You might have two data sets with the exact same range where, based on how things are bunched up, they could still have very different distributions of where the numbers lie. Now, the one that you'll see used most often is called the variance. Actually, we're going to see the standard deviation in this video. That's probably what's used most often, but it has a very close relationship to the variance. So the symbol for the variance, and we're going to deal with the population variance. Once again, we're assuming that this is all of the data for our whole population, that we're not just sampling, taking a subset of the data. So the variance, its symbol is literally this sigma, this Greek letter, squared. That is the symbol for variance. And we'll see that the sigma letter by itself is the symbol for standard deviation. And that is for a reason. But anyway, the definition of variance is that you literally take each of these data points, find the difference between those data points and your mean, square them, and then take the average of those squares.
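The range calculation described above is short enough to sketch in Python. The helper name `data_range` and the third data set are assumptions added purely for illustration; `third_set` is a hypothetical example, not from the video, chosen to show how two sets can share a range while their values bunch up differently.

```python
# The two data sets from the video.
first_set = [-10, 0, 10, 20, 30]
second_set = [8, 9, 10, 11, 12]

# Hypothetical set (not from the video): same largest and smallest values
# as first_set, but with the middle values bunched up at 10.
third_set = [-10, 10, 10, 10, 30]


def data_range(data):
    """Range: the largest value minus the smallest value."""
    return max(data) - min(data)


print(data_range(first_set))   # 30 - (-10)
print(data_range(second_set))  # 12 - 8
print(data_range(third_set))   # 30 - (-10), same as first_set
```

The first and third sets both have a range of 40 even though their values are arranged differently, which is the limitation of the range that motivates looking at the variance.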
I know that sounds very complicated, but when I actually calculate it, you're going to see it's not too bad. So remember, the mean here is 10. So I take the first data point. Let me do it over here. Let me scroll down a little bit. So I take the first data point, negative 10. From that, I'm going to subtract our mean, and I'm going to square that. So I just found the difference from that first data point to the mean and squared it. And that's essentially to make it positive. Plus the second data point, 0, minus the mean (this is the mean; this is that 10 right there), squared, plus 10 minus 10 squared (that's the middle 10 right there), plus 20 minus 10 (that's the 20) squared, plus 30 minus 10 squared. So these are the squared differences between each number and the mean. This is the mean right there. I'm finding the difference between every data point and the mean, squaring them, summing them up, and then dividing by the number of data points. So I'm taking the average of these numbers, of the squared distances. So when you say it verbally, it sounds very complicated. But you're taking each number, finding the difference between that and the mean, squaring it, and taking the average of those. So I have 1, 2, 3, 4, 5 terms, divided by 5. So what is this going to be equal to? Negative 10 minus 10 is negative 20. Negative 20 squared is 400. 0 minus 10 is negative 10, squared is 100, so plus 100. 10 minus 10 squared, that's just 0 squared, which is 0. Plus 20 minus 10 is 10, squared is 100. Plus 30 minus 10, which is 20, squared is 400. All of that over 5. And what do we have here?
400 plus 100 is 500, plus another 500 is 1000. It's equal to 1000/5, which is equal to 200. So in this situation, our variance is going to be 200. That's our measure of dispersion there. And let's compare it to this data set over here. Let's compare it to the variance of this less-dispersed data set. So let me scroll over a little bit so we have some real estate, although I'm running out. Maybe I could scroll up here. There you go. Let me calculate the variance of this data set. So we already know its mean. So the variance of this data set is going to be equal to 8 minus 10 squared plus 9 minus 10 squared plus 10 minus 10 squared plus 11 minus 10 squared (let me scroll up a little bit) plus 12 minus 10 squared. Remember, that 10 is just the mean that we calculated. You have to calculate the mean first. Divided by, we have 1, 2, 3, 4, 5 squared differences. So this is going to be equal to: 8 minus 10 is negative 2, squared is positive 4. 9 minus 10 is negative 1, squared is positive 1. 10 minus 10 is 0, squared. You still get 0. 11 minus 10 is 1. Square it, you get 1. 12 minus 10 is 2. Square it, you get 4. And what is this equal to? All of that over 5. This is 10/5, which is equal to 2. So the variance here, let me make sure I got that right. Yes, we have 10/5. So the variance of this less-dispersed data set is a lot smaller. The variance of this data set right here is only 2.
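The two variance calculations above follow the same recipe, so they can be sketched as one small Python function. The function name `population_variance` is an assumption for illustration. It divides by the number of data points because the video treats each set as a whole population, not a sample.

```python
# The two data sets from the video, each treated as a whole population.
first_set = [-10, 0, 10, 20, 30]
second_set = [8, 9, 10, 11, 12]


def population_variance(data):
    """Average of the squared differences between each value and the mean."""
    mean = sum(data) / len(data)  # the mean has to be calculated first
    squared_differences = [(x - mean) ** 2 for x in data]
    return sum(squared_differences) / len(data)


print(population_variance(first_set))   # (400 + 100 + 0 + 100 + 400) / 5
print(population_variance(second_set))  # (4 + 1 + 0 + 1 + 4) / 5
```

This prints 200.0 and 2.0, matching the values worked out by hand.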
So that gives you a sense. That tells you, look, this is definitely a less-dispersed data set than that one. Now, the problem with the variance is that you're taking these numbers, you're taking the difference between them and the mean, and then you're squaring it. It gives you a bit of an arbitrary number, and if you're dealing with units, let's say these are distances, so this is negative 10 meters, 0 meters, 10 meters, this is 8 meters, and so on and so forth, then when you square it, you get your variance in terms of meters squared. It's kind of an odd set of units. So what people like to do is talk in terms of standard deviation, which is just the square root of the variance, or the square root of sigma squared. And the symbol for the standard deviation is just sigma. So now that we've figured out the variance, it's very easy to figure out the standard deviation of both of these characters. The standard deviation of this first one up here, of this first data set, is going to be the square root of 200. The square root of 200 is what? The square root of 2 times 100. This is equal to 10 times the square root of 2. That's the first data set. Now the standard deviation of the second data set is just going to be the square root of its variance, and its variance is 2, so it is the square root of 2. So the second data set has 1/10 the standard deviation of the first data set. This is 10 times the square root of 2, and this is just the square root of 2. So this is 10 times the standard deviation. And this, hopefully, will make a little bit more sense. Let's think about it. This has 10 times the standard deviation of this.
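As a rough check of the square roots above, here is a minimal Python sketch. The names `population_std_dev`, `sigma_first`, and `sigma_second` are assumptions for illustration. Because floating-point square roots are not exact, the comparison uses `math.isclose` instead of testing for exact equality.

```python
import math

# The two data sets from the video, each treated as a whole population.
first_set = [-10, 0, 10, 20, 30]
second_set = [8, 9, 10, 11, 12]


def population_std_dev(data):
    """Square root of the population variance."""
    mean = sum(data) / len(data)
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    return math.sqrt(variance)


sigma_first = population_std_dev(first_set)    # square root of 200
sigma_second = population_std_dev(second_set)  # square root of 2

# The first standard deviation should be about 10 times the second.
print(math.isclose(sigma_first, 10 * math.sqrt(2)))
print(math.isclose(sigma_first / sigma_second, 10))
```

Both checks print True. Unlike the variance, these values are in the same units as the data (meters, in the video's example).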
And let's remember how we calculated it. For the variance, we just took each data point, found how far it was from the mean, squared that, and took the average of those. Then we took the square root, really just to make the units look nice, but the end result is that the first data set has 10 times the standard deviation of the second data set. So let's look at the two data sets. This has 10 times the standard deviation, which makes sense intuitively, right? I mean, they both have a 10 in here, but look at each of these guys: 9 is only 1 away from the 10, while 0 is 10 away from the 10, 10 less. 8 is only 2 away. This guy is 20 away. So it's 10 times, on average, further away. So the standard deviation, at least in my sense, gives a much better sense of how far away, on average, we are from the mean. Anyway, hopefully, you found that useful.
