Okay, so today I wanted to mess around with getting some basic stats from a sample. Nothing too fancy, just your usual mean, median, mode – that sort of thing. I figured it would be a good way to dust off my Python skills and, you know, actually do something practical.
Getting Started
First things first, I needed a sample. I didn’t want to spend ages finding a real dataset, so I just whipped up a list of numbers in Python. Something like this:
my_sample = [1, 5, 2, 8, 2, 9, 3, 2, 7, 6]
Totally random, I just punched in whatever came to mind. The important thing was having something to work with.
Calculating the Mean
Next up, the mean. Easy peasy, right? Add everything up, divide by the number of things. I did it the “long way” first, just to make sure I remembered how:
- I initialized a variable total to 0.
- Then, I looped through the my_sample list.
- In each loop, I added the current number to total.
- Finally, I divided total by the length of my_sample (which I got using len(my_sample)) and boom, there’s the mean.
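Here’s roughly what that “long way” looks like in code. This is just a minimal sketch of the steps above; the loop variable name number is my own pick, not anything special.

```python
# The sample list from earlier in the post
my_sample = [1, 5, 2, 8, 2, 9, 3, 2, 7, 6]

# Start the running total at zero
total = 0

# Add each number in the sample to the running total
for number in my_sample:
    total += number

# Divide the sum by the number of items to get the mean
mean = total / len(my_sample)
print(mean)  # 4.5
```

The numbers add up to 45 and there are 10 of them, so the mean comes out to 4.5.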
Of course, I also use NumPy to get the mean value; it’s faster and easier.
Finding the Median
Median’s a bit trickier, especially if you’re doing it by hand. You gotta sort the numbers first. I used Python’s built-in .sort() method for that, which modifies the list in place.
my_sample.sort()
After sorting, it’s all about finding the middle number. If you have an odd number of items, it’s just the one in the exact center. If it’s even (like my sample), you take the average of the two middle ones.
I used an if/else to check whether the sample has an even or odd number of items, then handled each case accordingly, and the job was done.
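The even/odd check can be sketched like this. One difference from what I described: this sketch uses sorted() instead of .sort(), so the original list stays untouched. The names ordered, n, and mid are just ones I made up for the example.

```python
# The sample list from earlier in the post
my_sample = [1, 5, 2, 8, 2, 9, 3, 2, 7, 6]

# sorted() returns a new sorted list and leaves my_sample as it was
ordered = sorted(my_sample)
n = len(ordered)
mid = n // 2

if n % 2 == 1:
    # Odd number of items: the median is the single middle value
    median = ordered[mid]
else:
    # Even number of items: average the two middle values
    median = (ordered[mid - 1] + ordered[mid]) / 2

print(median)  # 4.0
```

Sorted, the sample is [1, 2, 2, 2, 3, 5, 6, 7, 8, 9], so the two middle values are 3 and 5, and their average is 4.0.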
Dealing with the Mode
Mode… ugh, the mode. It’s the most frequent number, which can be a pain to find manually if you have a lot of data. I thought about using a dictionary to count occurrences, but then I remembered the collections module in Python. It has a Counter class that’s perfect for this.
from collections import Counter
counts = Counter(my_sample)
most_common = counts.most_common(1)
counts.most_common(1) gives you a list containing a tuple with the most frequent item and its count. Super convenient!
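To actually pull the mode out of that list, you can unpack the tuple. Here’s a small sketch; the names top, mode, frequency, and modes are my own. The last step is an extra I’m adding as an assumption about what you might want: most_common(1) only reports one value, so if several numbers tie for most frequent, you have to collect them yourself.

```python
from collections import Counter

# The sample list from earlier in the post
my_sample = [1, 5, 2, 8, 2, 9, 3, 2, 7, 6]

counts = Counter(my_sample)

# most_common(1) returns a one-element list holding an (item, count) tuple
top = counts.most_common(1)
mode, frequency = top[0]

# Collect every value that shares the highest count, in case of a tie
modes = [value for value, count in counts.items() if count == frequency]

print(mode, frequency, modes)  # 2 3 [2]
```

In this sample, 2 shows up three times and nothing else repeats, so there’s only one mode.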
Putting It All Together
At the end, I just printed out all the results: the mean, the median, and the mode. It wasn’t anything groundbreaking, but it felt good to go through the process and refresh my memory on these basic statistical concepts. Plus, it’s always satisfying to write a little bit of code that actually does something, even if it’s just crunching a few numbers.
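If you want to double-check the hand-rolled numbers, one option is Python’s standard statistics module. To be clear, this isn’t what I did above; it’s just a quick cross-check sketch that prints all three results in one go.

```python
import statistics

# The sample list from earlier in the post
my_sample = [1, 5, 2, 8, 2, 9, 3, 2, 7, 6]

# The standard library computes the same three statistics directly
mean = statistics.mean(my_sample)
median = statistics.median(my_sample)
mode = statistics.mode(my_sample)

print("Mean:", mean)      # Mean: 4.5
print("Median:", median)  # Median: 4.0
print("Mode:", mode)      # Mode: 2
```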
I think I’m going to use pandas next time; it’s just so convenient.