Applied statistics
Last night, while taking a break from working on my Dissertation, I toyed with the idea of (someday) buying a cheap, older laptop (Pentium II? 500MHz-ish?) and using it as a portable multitrack recorder. I'd prob'ly use either Dyne:Bolic Linux (optimized for older machines!), or the Fluxbox (minimalist desktop) installation of Agnula DeMuDi Linux (De[bian-based] Mu[ltimedia] Di[stribution]). Both are free of cost, and downloadable online.
So, I took a few minutes to grab a newspaper off the recycling pile and glance through the classifieds section. Circled two or three that seemed like good prices, and saved the clipping for When I Finish My Dissertation (TM).
But then I wondered: What is a good price?
So, I took a few more minutes to enter five characteristics -- price, processor speed, hard drive size, RAM, and the generation of Pentium -- for the thirteen ads into SPSS (a statistical software package; yes, I have a license).
A few interesting findings:
-The zero-order correlations (a number whose size, from zero to 1.00, measures how tightly related two things are; the plus or minus sign just gives the direction) between price and the other four characteristics were all significant! This is surprising, considering the really low sample size (n=13).
(Note: "Significance" is a measure of how "stable" or "trustworthy" the relationship is. If you're only looking at a handful of cases, it's hard to generate stable results. And "zero-order" just means that you're only looking at the relationship between two things; you haven't "controlled", or compensated, for anything else.)
-In addition to all being significant, they all had pretty strong correlations to the asking price: the lowest correlation (r=0.78) was for the generation of Pentium; the highest (r=0.96; remember, on a scale of 0.00 to 1.00!!!) was for the amount of RAM.
-I then ran an OLS regression to predict the asking price. (OLS regression = "Ordinary Least-Squares" regression; the software calculates the line that best fits the data, in hypothetical five-dimensional space. It squares all the distances of the points above and below the line, because the dots above the line would have positive distances, and the dots below the line would have negative distances; squaring each distance makes 'em all positive numbers; then it minimizes the grand total of all those squared distances. And it's five-dimensional [in this instance] because there's five variables.) The estimated influences of both the generation of Pentium and the amount of RAM on the asking price were significant (i.e. stable, trustworthy; p<0.05; p = 0.37-ish). Example: 500MHz, 12GB HD, 128MB RAM, Pentium III = $375
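About those correlations being significant with only thirteen ads: here's a rough sketch of the usual check, in Python. It converts each r into a t statistic and compares it to a cutoff. The cutoff (2.201) is the standard two-tailed t-table value for 11 degrees of freedom at the 0.05 level; I'm assuming a plain two-tailed test here, and the function names are just made up for illustration.

```python
import math

# Two-tailed critical t for alpha = 0.05 with df = n - 2 = 11,
# read off a standard t-table (assumption: a two-tailed test).
T_CRIT_DF11 = 2.201


def t_from_r(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def smallest_significant_r(t_crit, n):
    """Smallest |r| that reaches significance for a sample of size n."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


n = 13  # thirteen ads
for label, r in [("Pentium generation", 0.78), ("RAM", 0.96)]:
    t = t_from_r(r, n)
    print(f"{label}: r={r:.2f}, t={t:.2f}, significant={abs(t) > T_CRIT_DF11}")

print(f"cutoff for n={n}: |r| > {smallest_significant_r(T_CRIT_DF11, n):.2f}")
```

Under those assumptions the cutoff for n=13 works out to roughly |r| > 0.55, so correlations of 0.78 and 0.96 clear it comfortably. A small sample just means the relationship has to be strong to count, and these were.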
That predicted $375 is the "average" price for these specs. If the asking price in the classifieds is cheaper, then it's a bargain!!!
Which it was: $300, not $375!
So -- if I had three hundred bucks lying around -- and wasn't working on a Dissertation instead of recording -- that would be a good one to buy! It's "below the curve" -- so to speak!
If you live in Brisbane, Australia -- and buy things through the Weekend Shopper.
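For anyone who wants to try this without SPSS, here's a minimal sketch of the same idea in plain Python: fit a least-squares line to some ads, predict a price for a given set of specs, and see whether the asking price falls below it. The ads in the list are invented purely for illustration (they are NOT the thirteen from the newspaper), so the prediction won't come out to $375, and the function names are my own.

```python
def fit_ols(rows, prices):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, prices))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, specs):
    """Predicted price: intercept plus each coefficient times its spec."""
    return coefs[0] + sum(c * s for c, s in zip(coefs[1:], specs))


def is_bargain(asking, predicted):
    """An ad is 'below the line' when it asks less than the prediction."""
    return asking < predicted


# HYPOTHETICAL ads: (MHz, hard drive GB, RAM MB, Pentium generation), price.
ads = [
    ((233, 4, 32, 2), 150),
    ((300, 6, 64, 2), 220),
    ((366, 6, 64, 2), 250),
    ((400, 10, 128, 2), 330),
    ((450, 10, 64, 3), 320),
    ((500, 12, 128, 3), 300),
    ((600, 20, 128, 3), 480),
    ((700, 20, 256, 3), 650),
]
coefs = fit_ols([specs for specs, _ in ads], [price for _, price in ads])

specs, asking = (500, 12, 128, 3), 300
predicted = predict(coefs, specs)
print(f"predicted ${predicted:.0f}, asking ${asking}, "
      f"bargain={is_bargain(asking, predicted)}")
```

This only gives the fitted line and a prediction; it doesn't produce the significance tests that SPSS reports alongside each coefficient.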
And that's why statistics is cool!!! :)
--GG