Showing posts with the label Statistics

Descriptive Statistics - using Python, MS Excel

The example below shows how to compute the statistical averages and variances of a numerical column in an Excel worksheet. Specifically, it computes the median, average, mean, variance and standard deviation.

The program assumes that the 2nd column of the "details" worksheet in the "marks.xlsx" workbook holds the marks obtained in Math by the students of a class.

Descriptive statistics can also be generated with the "Data Analysis" option in MS Excel. To see this option under the Data menu, check the Analysis ToolPak option under Tools --> Add-ins.

Note: NaN = Not a Number

-------
from openpyxl import load_workbook
import numpy as np

# File being read is marks.xlsx
wb = load_workbook(filename="marks.xlsx")
sheet = wb["details"]

# Number of rows and columns in the sheet
rows = sheet.max_row
columns = sheet.max_column

# One slot per data row; reading starts at row 2, so row 1 is skipped
mathmarks = np.empty(rows - 1)

# Copy the values of column 2 into the NumPy array
for i in range(2, rows + 1):
    mathmarks[i - 2] = sheet.cell(row=i, column=2).value
...
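Since the excerpt above is cut off before the statistics are computed, here is a minimal sketch of that step. It is not the post's original code: the helper name describe and the sample marks are hypothetical, and it assumes that NaN values (for example, from empty cells) should be skipped, which is why it uses NumPy's NaN-aware functions.

```python
import numpy as np


def describe(values):
    """Return NaN-aware descriptive statistics for a 1-D sequence of numbers."""
    arr = np.asarray(values, dtype=float)
    return {
        "median": float(np.nanmedian(arr)),
        "mean": float(np.nanmean(arr)),
        # Population variance and standard deviation (ddof=0, NumPy's default)
        "variance": float(np.nanvar(arr)),
        "std": float(np.nanstd(arr)),
    }


# Hypothetical marks; np.nan stands in for a missing value
sample_marks = [70.0, 80.0, np.nan, 90.0]
stats = describe(sample_marks)
print(stats)
```

The plain functions (np.median, np.mean, np.var, np.std) would return NaN for this input, because a single NaN propagates through the calculation. Pass ddof=1 to np.nanvar and np.nanstd if the sample (rather than population) variance is wanted.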

Order Statistics: Example using Python

The example below shows how to apply order statistics to a column in an Excel worksheet. Order statistics here means the minimum, maximum and range of an array or set of data.

The program assumes that the 2nd column of the "details" worksheet in the "marks.xlsx" workbook holds the marks obtained in Math by the students of a class.

-----------
from openpyxl import load_workbook
import numpy as np

# File being read is marks.xlsx
wb = load_workbook(filename='marks.xlsx')
sheet = wb['details']

# Finding the number of rows and columns in the sheet
rows = sheet.max_row
columns = sheet.max_column

# One slot per data row; reading starts at row 2, so row 1 is skipped
matharr = np.empty(rows - 1)

# Reading the values of a column into a NumPy array to compute order statistics
for i in range(2, rows + 1):
    matharr[i - 2] = sheet.cell(row=i, column=2).value

print("Minimum of the array")
print(np.amin(matharr))
print("Minimum of the array excluding NaN (Not a Number) values...
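The excerpt above stops partway through the NaN-excluding minimum. As a rough sketch of how the three order statistics might be computed once the column is in an array (not the post's original code; the helper name order_stats and the sample marks are hypothetical), NumPy's np.nanmin and np.nanmax ignore NaN values:

```python
import numpy as np


def order_stats(values):
    """Return the minimum, maximum and range of the values, ignoring NaN."""
    arr = np.asarray(values, dtype=float)
    lowest = float(np.nanmin(arr))
    highest = float(np.nanmax(arr))
    return {"min": lowest, "max": highest, "range": highest - lowest}


# Hypothetical marks; np.nan stands in for a missing value
sample_marks = [55.0, np.nan, 91.0, 68.0]

# The plain minimum propagates NaN, so it is not useful when values are missing
plain_min = np.min(np.asarray(sample_marks))

result = order_stats(sample_marks)
print(plain_min)
print(result)
```

Computing the range as maximum minus minimum of the NaN-filtered values keeps it consistent with the other two figures. np.ptp also returns the range, but it does not skip NaN.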