Statistical Analysis for Benchmarking
A guide to statistical analysis for Go performance benchmarks. It covers descriptive statistics, confidence intervals, hypothesis testing, regression, time series analysis, outlier detection, and statistical modeling of performance data.
Table of Contents
- Introduction
- Statistical Framework
- Descriptive Statistics
- Inferential Statistics
- Hypothesis Testing
- Regression Analysis
- Time Series Analysis
- Outlier Detection
- Statistical Modeling
- Best Practices
Introduction
Benchmark measurements are noisy: the same code can produce different timings from one run to the next. Statistical analysis separates real performance changes from that noise. This guide presents methods for summarizing Go benchmark data, deciding whether a difference between two runs is statistically significant, and using the results to guide optimization work.
Statistical Framework
package main
import (
"fmt"
"math"
"sync"
"time"
)
// StatisticalAnalyzer performs comprehensive statistical analysis on benchmark data
type StatisticalAnalyzer struct {
descriptive *DescriptiveAnalyzer
inferential *InferentialAnalyzer
hypothesis *HypothesisAnalyzer
regression *RegressionAnalyzer
timeSeries *TimeSeriesAnalyzer
outlierDetector *OutlierDetector
modeler *StatisticalModeler
config AnalyzerConfig
cache *AnalysisCache
metrics *AnalysisMetrics
mu sync.RWMutex
}
// AnalyzerConfig contains analyzer configuration
type AnalyzerConfig struct {
ConfidenceLevel float64
SignificanceLevel float64
MinSampleSize int
MaxSampleSize int
OutlierThreshold float64
EnableOutlierRemoval bool
EnableNormalization bool
EnableRobustStats bool
BootstrapSamples int
CacheEnabled bool
ParallelProcessing bool
MaxWorkers int
}
// BenchmarkData represents benchmark measurement data
type BenchmarkData struct {
ID string
Name string
Values []float64
Metadata DataMetadata
Timestamp time.Time
Environment Environment
Quality DataQuality
}
// DataMetadata contains metadata about benchmark data
type DataMetadata struct {
Source string
Version string
Platform string
GoVersion string
Iterations int
Duration time.Duration
MemoryUsage int64
CPUUsage float64
Tags map[string]string
}
// Environment describes the execution environment
type Environment struct {
OS string
Architecture string
CPUModel string
CPUCores int
Memory int64
LoadAverage float64
Temperature float64
PowerMode string
}
// DataQuality represents data quality metrics
type DataQuality struct {
Completeness float64
Consistency float64
Accuracy float64
Validity float64
Outliers int
MissingValues int
OverallScore float64
}
// DescriptiveAnalyzer performs descriptive statistical analysis
type DescriptiveAnalyzer struct {
config DescriptiveConfig
cache map[string]*DescriptiveStats
mu sync.RWMutex
}
// DescriptiveConfig contains descriptive analysis configuration
type DescriptiveConfig struct {
EnableRobustStats bool
EnablePercentiles bool
EnableDistribution bool
EnableCorrelation bool
PercentilePoints []float64
BinCount int
HistogramBins int
}
// DescriptiveStats contains descriptive statistics
type DescriptiveStats struct {
Count int
Mean float64
Median float64
Mode []float64
StdDev float64
Variance float64
Min float64
Max float64
Range float64
IQR float64
Percentiles map[float64]float64
Quartiles Quartiles
Moments Moments
Distribution DistributionStats
Robust RobustStats
}
// Quartiles represents quartile values
type Quartiles struct {
Q1 float64
Q2 float64 // Median
Q3 float64
}
// Moments represents statistical moments
type Moments struct {
Mean float64 // First moment
Variance float64 // Second central moment
Skewness float64 // Third standardized moment
Kurtosis float64 // Fourth standardized moment
}
// DistributionStats contains distribution characteristics
type DistributionStats struct {
Type DistributionType
Parameters map[string]float64
GoodnessOfFit float64
KSStatistic float64
ADStatistic float64
JBStatistic float64
Histogram Histogram
}
// DistributionType defines distribution types
type DistributionType int
const (
NormalDistribution DistributionType = iota
LogNormalDistribution
ExponentialDistribution
GammaDistribution
WeibullDistribution
UniformDistribution
PoissonDistribution
UnknownDistribution
)
// Histogram represents a histogram
type Histogram struct {
Bins []float64
Counts []int
Edges []float64
Width float64
}
// RobustStats contains robust statistical measures
type RobustStats struct {
TrimmedMean map[float64]float64 // Trimmed means at different levels
Winsorized map[float64]float64 // Winsorized means
MAD float64 // Median Absolute Deviation
IQRange float64 // Interquartile Range
Biweight BiweightStats
Huber HuberStats
}
// BiweightStats contains biweight statistics
type BiweightStats struct {
Location float64
Scale float64
}
// HuberStats contains Huber statistics
type HuberStats struct {
Location float64
Scale float64
K float64
}
// InferentialAnalyzer performs inferential statistical analysis
type InferentialAnalyzer struct {
config InferentialConfig
cache map[string]*InferentialResults
mu sync.RWMutex
}
// InferentialConfig contains inferential analysis configuration
type InferentialConfig struct {
ConfidenceLevel float64
BootstrapSamples int
EnableBootstrap bool
EnableJackknife bool
EnablePermutation bool
SamplingMethod SamplingMethod
}
// SamplingMethod defines sampling methods
type SamplingMethod int
const (
SimpleRandomSampling SamplingMethod = iota
StratifiedSampling
SystematicSampling
ClusterSampling
BootstrapSampling
)
// InferentialResults contains inferential analysis results
type InferentialResults struct {
PopulationMean EstimateWithCI
PopulationStdDev EstimateWithCI
PopulationVariance EstimateWithCI
Percentiles map[float64]EstimateWithCI
Bootstrap BootstrapResults
Jackknife JackknifeResults
Permutation PermutationResults
}
// EstimateWithCI represents an estimate with confidence interval
type EstimateWithCI struct {
Estimate float64
StdError float64
LowerBound float64
UpperBound float64
Confidence float64
Method EstimationMethod
}
// EstimationMethod defines estimation methods
type EstimationMethod int
const (
SampleMethod EstimationMethod = iota
BootstrapMethod
JackknifeMethod
BayesianMethod
RobustMethod
)
// BootstrapResults contains bootstrap analysis results
type BootstrapResults struct {
Samples int
Estimates []float64
Bias float64
StdError float64
Percentile ConfidenceInterval
BCa ConfidenceInterval // Bias-corrected and accelerated
StudentizedT ConfidenceInterval
}
// ConfidenceInterval represents a confidence interval
type ConfidenceInterval struct {
Lower float64
Upper float64
Confidence float64
Method CIMethod
}
// CIMethod defines confidence interval methods
type CIMethod int
const (
PercentileCI CIMethod = iota
BCaCI
StudentizedCI
NormalCI
BasicCI
)
// JackknifeResults contains jackknife analysis results
type JackknifeResults struct {
Estimates []float64
Bias float64
StdError float64
Variance float64
}
// PermutationResults contains permutation test results
type PermutationResults struct {
Permutations int
Statistics []float64
PValue float64
Observed float64
CriticalValue float64
}
// HypothesisAnalyzer performs hypothesis testing
type HypothesisAnalyzer struct {
tests map[string]HypothesisTest
config HypothesisConfig
cache map[string]*TestResults
mu sync.RWMutex
}
// HypothesisConfig contains hypothesis testing configuration
type HypothesisConfig struct {
SignificanceLevel float64
PowerAnalysis bool
EffectSize bool
MultipleComparisons bool
CorrectionMethod CorrectionMethod
EnableNonParametric bool
}
// CorrectionMethod defines multiple comparison correction methods
type CorrectionMethod int
const (
BonferroniCorrection CorrectionMethod = iota
HolmCorrection
BenjaminiHochbergCorrection
FDRCorrection
NoCorrection
)
// HypothesisTest defines hypothesis tests
type HypothesisTest interface {
Test(data1, data2 []float64, options TestOptions) (*TestResults, error)
GetType() TestType
GetAssumptions() []string
ValidateAssumptions(data []float64) AssumptionResults
}
// TestType defines test types
type TestType int
const (
OneSampleTTest TestType = iota
TwoSampleTTest
PairedTTest
WelchTTest
MannWhitneyTest
WilcoxonTest
KruskalWallisTest
FriedmanTest
ChiSquareTest
KSTest
AndersonDarlingTest
ShapiroWilkTest
)
// TestOptions contains test options
type TestOptions struct {
Alternative Alternative
Paired bool
EqualVariances bool
ConfidenceLevel float64
Exact bool
Continuity bool
}
// Alternative defines alternative hypotheses
type Alternative int
const (
TwoSided Alternative = iota
Greater
Less
)
// TestResults contains hypothesis test results
type TestResults struct {
TestType TestType
Statistic float64
PValue float64
CriticalValue float64
DegreesOfFreedom float64
ConfidenceInterval ConfidenceInterval
EffectSize EffectSizeResults
PowerAnalysis PowerResults
Decision TestDecision
Interpretation string
Assumptions AssumptionResults
}
// TestDecision represents test decision
type TestDecision int
const (
RejectNull TestDecision = iota
FailToRejectNull
Inconclusive
)
// EffectSizeResults contains effect size measurements
type EffectSizeResults struct {
CohensD float64
GlassD float64
HedgesG float64
R float64
R2 float64
EtaSquared float64
Omega2 float64
Interpretation EffectSizeInterpretation
}
// EffectSizeInterpretation defines effect size interpretations
type EffectSizeInterpretation int
const (
NegligibleEffect EffectSizeInterpretation = iota
SmallEffect
MediumEffect
LargeEffect
VeryLargeEffect
)
// PowerResults contains statistical power analysis results
type PowerResults struct {
Power float64
RequiredN int
DetectableEffect float64
TypeIIError float64
Sensitivity float64
}
// AssumptionResults contains assumption validation results
type AssumptionResults struct {
Normality NormalityTest
Homoscedasticity HomoscedasticityTest
Independence IndependenceTest
Outliers OutlierTest
Satisfied bool
Warnings []string
Recommendations []string
}
// NormalityTest contains normality test results
type NormalityTest struct {
ShapiroWilk TestResult
KolmogorovSmirnov TestResult
AndersonDarling TestResult
JarqueBera TestResult
Satisfied bool
}
// HomoscedasticityTest contains homoscedasticity test results
type HomoscedasticityTest struct {
Levene TestResult
BrownForsythe TestResult
Bartlett TestResult
FTest TestResult
Satisfied bool
}
// IndependenceTest contains independence test results
type IndependenceTest struct {
DurbinWatson TestResult
RunsTest TestResult
LjungBox TestResult
Satisfied bool
}
// OutlierTest contains outlier detection results
type OutlierTest struct {
Grubbs TestResult
Dixon TestResult
Rosner TestResult
ModifiedZ TestResult
OutlierCount int
OutlierIndices []int
Satisfied bool
}
// TestResult represents a single test result
type TestResult struct {
Statistic float64
PValue float64
Significant bool
Method string
}
// RegressionAnalyzer performs regression analysis
type RegressionAnalyzer struct {
models map[string]RegressionModel
config RegressionConfig
cache map[string]*RegressionResults
mu sync.RWMutex
}
// RegressionConfig contains regression analysis configuration
type RegressionConfig struct {
EnableLinear bool
EnableNonLinear bool
EnableRobust bool
EnableRegularized bool
CrossValidation bool
CVFolds int
FeatureSelection bool
TransformData bool
}
// RegressionModel defines regression models
type RegressionModel interface {
Fit(x, y []float64) error
Predict(x []float64) ([]float64, error)
GetCoefficients() []float64
GetStatistics() ModelStatistics
GetType() ModelType
}
// ModelType defines regression model types
type ModelType int
const (
LinearRegression ModelType = iota
PolynomialRegression
ExponentialRegression
LogarithmicRegression
PowerRegression
RobustRegression
RidgeRegression
LassoRegression
ElasticNetRegression
)
// RegressionResults contains regression analysis results
type RegressionResults struct {
Model ModelType
Coefficients []float64
Intercept float64
RSquared float64
AdjustedRSquared float64
FStatistic float64
PValue float64
StandardErrors []float64
TStatistics []float64
PValues []float64
ConfidenceIntervals []ConfidenceInterval
Residuals ResidualAnalysis
Diagnostics RegressionDiagnostics
CrossValidation CrossValidationResults
Predictions PredictionResults
}
// ModelStatistics contains model statistics
type ModelStatistics struct {
RSquared float64
AdjustedRSquared float64
RMSE float64
MAE float64
AIC float64
BIC float64
LogLikelihood float64
FStatistic float64
PValue float64
DegreesOfFreedom int
}
// ResidualAnalysis contains residual analysis results
type ResidualAnalysis struct {
Residuals []float64
Standardized []float64
Studentized []float64
Leverage []float64
CooksDistance []float64
DFFits []float64
DFBetas [][]float64
Normality NormalityTest
Homoscedasticity HomoscedasticityTest
Independence IndependenceTest
Linearity LinearityTest
}
// LinearityTest contains linearity test results
type LinearityTest struct {
RamseyRESET TestResult
HarveyCollier TestResult
RainbowTest TestResult
Satisfied bool
}
// RegressionDiagnostics contains regression diagnostics
type RegressionDiagnostics struct {
Multicollinearity MulticollinearityTest
Outliers OutlierDiagnostics
Influence InfluenceDiagnostics
Heteroscedasticity HeteroscedasticityTest
Autocorrelation AutocorrelationTest
}
// MulticollinearityTest contains multicollinearity test results
type MulticollinearityTest struct {
VIF []float64 // Variance Inflation Factor
ConditionIndex float64
Eigenvalues []float64
Satisfied bool
}
// OutlierDiagnostics contains outlier diagnostics
type OutlierDiagnostics struct {
Outliers []int
HighLeverage []int
Influential []int
CooksD []float64
Threshold float64
}
// InfluenceDiagnostics contains influence diagnostics
type InfluenceDiagnostics struct {
Hat []float64
CooksDistance []float64
DFFits []float64
DFBetas [][]float64
CovarianceRatio []float64
}
// HeteroscedasticityTest contains heteroscedasticity test results
type HeteroscedasticityTest struct {
BreuschPagan TestResult
White TestResult
Goldfeld TestResult
Satisfied bool
}
// AutocorrelationTest contains autocorrelation test results
type AutocorrelationTest struct {
DurbinWatson TestResult
LjungBox TestResult
BreuschGodfrey TestResult
Satisfied bool
}
// CrossValidationResults contains cross-validation results
type CrossValidationResults struct {
Folds int
TrainScores []float64
TestScores []float64
MeanTrain float64
MeanTest float64
StdTrain float64
StdTest float64
Overfitting bool
}
// PredictionResults contains prediction results
type PredictionResults struct {
Predictions []float64
Intervals []ConfidenceInterval
PredictionBands []ConfidenceInterval
Residuals []float64
RMSE float64
MAE float64
MAPE float64
}
// TimeSeriesAnalyzer performs time series analysis
type TimeSeriesAnalyzer struct {
models map[string]TimeSeriesModel
config TimeSeriesConfig
cache map[string]*TimeSeriesResults
mu sync.RWMutex
}
// TimeSeriesConfig contains time series analysis configuration
type TimeSeriesConfig struct {
EnableTrend bool
EnableSeasonality bool
EnableStationarity bool
EnableForecasting bool
AutoArima bool
SeasonalPeriod int
ForecastHorizon int
ConfidenceLevel float64
}
// TimeSeriesModel defines time series models
type TimeSeriesModel interface {
Fit(data []TimedValue) error
Forecast(periods int) ([]float64, []ConfidenceInterval, error)
GetComponents() TimeSeriesComponents
GetStatistics() TimeSeriesStatistics
GetType() TimeSeriesModelType
}
// TimeSeriesModelType defines time series model types
type TimeSeriesModelType int
const (
ARIMA TimeSeriesModelType = iota
SARIMA
ExponentialSmoothing
HoltWinters
StateSpace
GARCH
Prophet
)
// TimedValue represents a time-indexed value
type TimedValue struct {
Time time.Time
Value float64
}
// TimeSeriesResults contains time series analysis results
type TimeSeriesResults struct {
Model TimeSeriesModelType
Components TimeSeriesComponents
Statistics TimeSeriesStatistics
Stationarity StationarityTests
Seasonality SeasonalityTests
Forecast ForecastResults
Diagnostics TimeSeriesDiagnostics
ChangePoints []ChangePoint
}
// TimeSeriesComponents contains decomposed components
type TimeSeriesComponents struct {
Trend []float64
Seasonal []float64
Residual []float64
Level []float64
Slope []float64
Irregular []float64
}
// TimeSeriesStatistics contains time series statistics
type TimeSeriesStatistics struct {
AIC float64
BIC float64
LogLikelihood float64
RMSE float64
MAE float64
MAPE float64
MASE float64
AutoCorrelation []float64
PartialAutoCorrelation []float64
}
// StationarityTests contains stationarity test results
type StationarityTests struct {
ADF TestResult // Augmented Dickey-Fuller
KPSS TestResult // Kwiatkowski-Phillips-Schmidt-Shin
PhillipsPerron TestResult
Stationary bool
Trend bool
Drift bool
}
// SeasonalityTests contains seasonality test results
type SeasonalityTests struct {
FriedmanTest TestResult
KruskalWallis TestResult
XTest TestResult
QSSeasonal TestResult
Seasonal bool
Period int
Strength float64
}
// ForecastResults contains forecasting results
type ForecastResults struct {
Forecast []float64
ConfidenceIntervals []ConfidenceInterval
PredictionIntervals []ConfidenceInterval
Residuals []float64
Accuracy ForecastAccuracy
Backtesting BacktestResults
}
// ForecastAccuracy contains forecast accuracy metrics
type ForecastAccuracy struct {
MAE float64
MAPE float64
RMSE float64
MASE float64
SMAPE float64 // Symmetric MAPE; exported like the other fields
MSIS float64
}
// BacktestResults contains backtesting results
type BacktestResults struct {
Periods int
Accuracy []ForecastAccuracy
Average ForecastAccuracy
Stability float64
Trend TrendDirection
}
// TrendDirection defines trend directions
type TrendDirection int
const (
NoTrend TrendDirection = iota
UpTrend
DownTrend
Volatile
)
// TimeSeriesDiagnostics contains time series diagnostics
type TimeSeriesDiagnostics struct {
Residuals ResidualDiagnostics
LjungBox TestResult
JarqueBera TestResult
ArchTest TestResult
Heteroscedasticity HeteroscedasticityTest
Normality NormalityTest
}
// ResidualDiagnostics contains residual diagnostics
type ResidualDiagnostics struct {
Residuals []float64
Standardized []float64
ACF []float64
PACF []float64
QQPlot QQPlotResults
WhiteNoise bool
}
// QQPlotResults contains Q-Q plot results
type QQPlotResults struct {
Quantiles []float64
Theoretical []float64
RSquared float64
Slope float64
Intercept float64
Normality bool
}
// ChangePoint represents a change point in time series
type ChangePoint struct {
Time time.Time
Type ChangeType
Magnitude float64
Confidence float64
Method ChangePointMethod
}
// ChangeType defines change point types
type ChangeType int
const (
LevelChange ChangeType = iota
TrendChange
VarianceChange
SeasonalChange
)
// ChangePointMethod defines change point detection methods
type ChangePointMethod int
const (
CUSUM ChangePointMethod = iota
PELT
BinSeg
Segment
WindowBased
)
// OutlierDetector detects outliers in benchmark data
type OutlierDetector struct {
methods map[string]OutlierMethod
config OutlierConfig
cache map[string]*OutlierResults
mu sync.RWMutex
}
// OutlierConfig contains outlier detection configuration
type OutlierConfig struct {
EnableMultipleMethods bool
EnableRobustMethods bool
ThresholdMultiplier float64
MaxOutlierProportion float64
AutomaticThreshold bool
EnsembleVoting bool
MinAgreement int
}
// OutlierMethod defines outlier detection methods
type OutlierMethod interface {
DetectOutliers(data []float64) (*OutlierResults, error)
GetType() OutlierMethodType
GetThreshold() float64
SetThreshold(threshold float64)
}
// OutlierMethodType defines outlier method types
type OutlierMethodType int
const (
ZScoreMethod OutlierMethodType = iota
ModifiedZScoreMethod
IQRMethod
IsolationForestMethod
LocalOutlierFactorMethod
EllipticEnvelopeMethod
OneClassSVMMethod
DBSCANMethod
)
// OutlierResults contains outlier detection results
type OutlierResults struct {
Outliers []int
Scores []float64
Threshold float64
Method OutlierMethodType
Confidence []float64
Severity []OutlierSeverity
Recommendations []string
}
// OutlierSeverity defines outlier severity levels
type OutlierSeverity int
const (
MildOutlier OutlierSeverity = iota
ModerateOutlier
SevereOutlier
ExtremeOutlier
)
// StatisticalModeler builds statistical models from benchmark data
type StatisticalModeler struct {
models map[string]StatisticalModel
config ModelerConfig
cache map[string]*ModelResults
mu sync.RWMutex
}
// ModelerConfig contains modeler configuration
type ModelerConfig struct {
EnableAutoModel bool
EnableEnsemble bool
EnableValidation bool
ValidationMethod ValidationMethod
CrossValidationFolds int
BootstrapSamples int
ModelSelection ModelSelectionCriteria
FeatureEngineering bool
}
// ValidationMethod defines validation methods
type ValidationMethod int
const (
HoldoutValidation ValidationMethod = iota
CrossValidation
BootstrapValidation
TimeSeriesSplit
)
// ModelSelectionCriteria defines model selection criteria
type ModelSelectionCriteria int
const (
AICCriteria ModelSelectionCriteria = iota
BICCriteria
CrossValidationScore
AdjustedRSquared
FStatistic
)
// StatisticalModel defines statistical models
type StatisticalModel interface {
Fit(data *BenchmarkData) error
Predict(input interface{}) (interface{}, error)
Evaluate(testData *BenchmarkData) (*ModelEvaluation, error)
GetParameters() ModelParameters
GetType() StatisticalModelType
}
// StatisticalModelType defines statistical model types
type StatisticalModelType int
const (
LinearModel StatisticalModelType = iota
NonLinearModel
EnsembleModel
BayesianModel
RobustModel
TimeSeriesStatisticalModel // renamed: TimeSeriesModel is already the interface declared above
MachineLearningModel
)
// ModelResults contains statistical modeling results
type ModelResults struct {
Model StatisticalModelType
Parameters ModelParameters
Evaluation ModelEvaluation
Validation ValidationResults
FeatureImportance []FeatureImportance
Predictions PredictionResults
Diagnostics ModelDiagnostics
}
// ModelParameters contains model parameters
type ModelParameters struct {
Coefficients []float64
Intercept float64
Variance float64
Degrees int
Regularization float64
Hyperparameters map[string]interface{}
}
// ModelEvaluation contains model evaluation metrics
type ModelEvaluation struct {
TrainingMetrics EvaluationMetrics
TestingMetrics EvaluationMetrics
ValidationMetrics EvaluationMetrics
Overfitting bool
Underfitting bool
Generalization float64
}
// EvaluationMetrics contains evaluation metrics
type EvaluationMetrics struct {
MSE float64
RMSE float64
MAE float64
MAPE float64
RSquared float64
AdjRSquared float64
AIC float64
BIC float64
LogLikelihood float64
}
// ValidationResults contains validation results
type ValidationResults struct {
Method ValidationMethod
Folds int
Scores []float64
MeanScore float64
StdScore float64
Confidence ConfidenceInterval
Stability float64
}
// FeatureImportance contains feature importance information
type FeatureImportance struct {
Feature string
Importance float64
Rank int
PValue float64
Confidence ConfidenceInterval
}
// ModelDiagnostics contains model diagnostics
type ModelDiagnostics struct {
Residuals ResidualAnalysis
Assumptions AssumptionResults
Influence InfluenceDiagnostics
Multicollinearity MulticollinearityTest
Outliers OutlierDiagnostics
GoodnessOfFit GoodnessOfFitTests
}
// GoodnessOfFitTests contains goodness of fit tests
type GoodnessOfFitTests struct {
ChiSquare TestResult
KolmogorovSmirnov TestResult
AndersonDarling TestResult
CramerVonMises TestResult
Satisfied bool
}
// NewStatisticalAnalyzer creates a new statistical analyzer
func NewStatisticalAnalyzer(config AnalyzerConfig) *StatisticalAnalyzer {
return &StatisticalAnalyzer{
descriptive: NewDescriptiveAnalyzer(),
inferential: NewInferentialAnalyzer(),
hypothesis: NewHypothesisAnalyzer(),
regression: NewRegressionAnalyzer(),
timeSeries: NewTimeSeriesAnalyzer(),
outlierDetector: NewOutlierDetector(),
modeler: NewStatisticalModeler(),
config: config,
cache: NewAnalysisCache(),
metrics: &AnalysisMetrics{},
}
}
// AnalyzeData performs comprehensive statistical analysis
func (sa *StatisticalAnalyzer) AnalyzeData(data *BenchmarkData) (*AnalysisResults, error) {
sa.mu.Lock()
defer sa.mu.Unlock()
// Check cache first
if sa.config.CacheEnabled {
if cached := sa.cache.Get(data.ID); cached != nil {
return cached, nil
}
}
// Validate data quality
if err := sa.validateData(data); err != nil {
return nil, fmt.Errorf("data validation failed: %w", err)
}
// Perform descriptive analysis
descriptive, err := sa.descriptive.Analyze(data.Values)
if err != nil {
return nil, fmt.Errorf("descriptive analysis failed: %w", err)
}
// Detect and handle outliers
outliers, err := sa.outlierDetector.Detect(data.Values)
if err != nil {
return nil, fmt.Errorf("outlier detection failed: %w", err)
}
// Clean data if outlier removal is enabled
cleanData := data.Values
if sa.config.EnableOutlierRemoval && len(outliers.Outliers) > 0 {
cleanData = sa.removeOutliers(data.Values, outliers.Outliers)
}
// Perform inferential analysis
inferential, err := sa.inferential.Analyze(cleanData)
if err != nil {
return nil, fmt.Errorf("inferential analysis failed: %w", err)
}
// Create analysis results
results := &AnalysisResults{
Descriptive: descriptive,
Inferential: inferential,
Outliers: outliers,
DataQuality: data.Quality,
Metadata: data.Metadata,
Timestamp: time.Now(),
}
// Cache results
if sa.config.CacheEnabled {
sa.cache.Set(data.ID, results)
}
return results, nil
}
// CompareData performs statistical comparison between datasets
func (sa *StatisticalAnalyzer) CompareData(data1, data2 *BenchmarkData, testType TestType) (*ComparisonResults, error) {
// Validate assumptions for the chosen test
assumptions1 := sa.hypothesis.ValidateAssumptions(data1.Values, testType)
assumptions2 := sa.hypothesis.ValidateAssumptions(data2.Values, testType)
// Perform the statistical test
options := TestOptions{
Alternative: TwoSided,
ConfidenceLevel: sa.config.ConfidenceLevel,
}
testResults, err := sa.hypothesis.PerformTest(testType, data1.Values, data2.Values, options)
if err != nil {
return nil, fmt.Errorf("hypothesis test failed: %w", err)
}
// Calculate effect size
effectSize := sa.calculateEffectSize(data1.Values, data2.Values)
// Create comparison results
results := &ComparisonResults{
Test: testResults,
EffectSize: effectSize,
Assumptions1: assumptions1,
Assumptions2: assumptions2,
Recommendation: sa.generateRecommendation(testResults, effectSize),
Timestamp: time.Now(),
}
return results, nil
}
// validateData validates benchmark data
func (sa *StatisticalAnalyzer) validateData(data *BenchmarkData) error {
if len(data.Values) < sa.config.MinSampleSize {
return fmt.Errorf("insufficient sample size: %d (minimum: %d)",
len(data.Values), sa.config.MinSampleSize)
}
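// A MaxSampleSize of zero (the unset default) means no upper limit
if sa.config.MaxSampleSize > 0 && len(data.Values) > sa.config.MaxSampleSize {
return fmt.Errorf("sample size too large: %d (maximum: %d)",
len(data.Values), sa.config.MaxSampleSize)
}
// Reject NaN and infinite values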
for i, value := range data.Values {
if math.IsNaN(value) || math.IsInf(value, 0) {
return fmt.Errorf("invalid value at index %d: %f", i, value)
}
}
return nil
}
// removeOutliers removes outliers from data
func (sa *StatisticalAnalyzer) removeOutliers(data []float64, outlierIndices []int) []float64 {
if len(outlierIndices) == 0 {
return data
}
// Create a map of outlier indices for fast lookup
outlierMap := make(map[int]bool)
for _, idx := range outlierIndices {
outlierMap[idx] = true
}
// Filter out outliers
var cleaned []float64
for i, value := range data {
if !outlierMap[i] {
cleaned = append(cleaned, value)
}
}
return cleaned
}
// calculateEffectSize calculates standardized effect sizes between two datasets.
// data2 is treated as the baseline (control) group for Glass's delta.
func (sa *StatisticalAnalyzer) calculateEffectSize(data1, data2 []float64) EffectSizeResults {
// Calculate means and standard deviations
mean1 := mean(data1)
mean2 := mean(data2)
std1 := stddev(data1)
std2 := stddev(data2)
n1, n2 := float64(len(data1)), float64(len(data2))
// Effect sizes are undefined with fewer than two samples per group
if n1 < 2 || n2 < 2 {
return EffectSizeResults{Interpretation: NegligibleEffect}
}
// Pooled standard deviation
df := n1 + n2 - 2
pooledStd := math.Sqrt(((n1-1)*std1*std1 + (n2-1)*std2*std2) / df)
// Cohen's d; identical constant samples would give 0/0, so report zero there
cohensD := 0.0
if pooledStd > 0 || mean1 != mean2 {
cohensD = (mean1 - mean2) / pooledStd
}
// Glass's delta, scaled by the baseline group's standard deviation
glassD := 0.0
if std2 > 0 || mean1 != mean2 {
glassD = (mean1 - mean2) / std2
}
// Hedges' g (small-sample bias correction of Cohen's d)
hedgesG := cohensD * (1 - 3/(4*df-1))
// Interpret effect size
var interpretation EffectSizeInterpretation
absCohensD := math.Abs(cohensD)
if absCohensD < 0.2 {
interpretation = NegligibleEffect
} else if absCohensD < 0.5 {
interpretation = SmallEffect
} else if absCohensD < 0.8 {
interpretation = MediumEffect
} else if absCohensD < 1.3 {
interpretation = LargeEffect
} else {
interpretation = VeryLargeEffect
}
return EffectSizeResults{
CohensD: cohensD,
GlassD: glassD,
HedgesG: hedgesG,
Interpretation: interpretation,
}
}
// generateRecommendation generates recommendations based on analysis results
func (sa *StatisticalAnalyzer) generateRecommendation(testResults *TestResults, effectSize EffectSizeResults) string {
var recommendations []string
// Statistical significance
if testResults.Decision == RejectNull {
recommendations = append(recommendations,
fmt.Sprintf("The difference is statistically significant (p = %.4f)", testResults.PValue))
} else {
recommendations = append(recommendations,
fmt.Sprintf("No statistically significant difference found (p = %.4f)", testResults.PValue))
}
// Effect size interpretation
switch effectSize.Interpretation {
case NegligibleEffect:
recommendations = append(recommendations, "The effect size is negligible - the difference may not be practically meaningful")
case SmallEffect:
recommendations = append(recommendations, "The effect size is small - the difference is detectable but may have limited practical impact")
case MediumEffect:
recommendations = append(recommendations, "The effect size is medium - the difference is likely to have noticeable practical impact")
case LargeEffect:
recommendations = append(recommendations, "The effect size is large - the difference has substantial practical significance")
case VeryLargeEffect:
recommendations = append(recommendations, "The effect size is very large - the difference has major practical significance")
}
// Additional recommendations based on assumptions
if !testResults.Assumptions.Satisfied {
recommendations = append(recommendations, "Consider using non-parametric tests due to assumption violations")
}
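// Join the recommendations into one readable string
// (formatting the slice with %s would print it as "[a b c]")
result := ""
for i, r := range recommendations {
if i > 0 {
result += "; "
}
result += r
}
return result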
}
// Helper functions
func mean(data []float64) float64 {
if len(data) == 0 {
return 0
}
sum := 0.0
for _, value := range data {
sum += value
}
return sum / float64(len(data))
}
func stddev(data []float64) float64 {
if len(data) <= 1 {
return 0
}
m := mean(data)
sum := 0.0
for _, value := range data {
diff := value - m
sum += diff * diff
}
return math.Sqrt(sum / float64(len(data)-1))
}
// Result types
type AnalysisResults struct {
Descriptive *DescriptiveStats
Inferential *InferentialResults
Outliers *OutlierResults
DataQuality DataQuality
Metadata DataMetadata
Timestamp time.Time
}
type ComparisonResults struct {
Test *TestResults
EffectSize EffectSizeResults
Assumptions1 AssumptionResults
Assumptions2 AssumptionResults
Recommendation string
Timestamp time.Time
}
// Placeholder types and implementations
type AnalysisCache struct{}
type AnalysisMetrics struct{}
func NewDescriptiveAnalyzer() *DescriptiveAnalyzer { return &DescriptiveAnalyzer{} }
func NewInferentialAnalyzer() *InferentialAnalyzer { return &InferentialAnalyzer{} }
func NewHypothesisAnalyzer() *HypothesisAnalyzer { return &HypothesisAnalyzer{} }
func NewRegressionAnalyzer() *RegressionAnalyzer { return &RegressionAnalyzer{} }
func NewTimeSeriesAnalyzer() *TimeSeriesAnalyzer { return &TimeSeriesAnalyzer{} }
func NewOutlierDetector() *OutlierDetector { return &OutlierDetector{} }
func NewStatisticalModeler() *StatisticalModeler { return &StatisticalModeler{} }
func NewAnalysisCache() *AnalysisCache { return &AnalysisCache{} }
func (ac *AnalysisCache) Get(key string) *AnalysisResults { return nil }
func (ac *AnalysisCache) Set(key string, results *AnalysisResults) {}
// The stubs below return empty, non-nil results so that callers such as
// AnalyzeData and generateRecommendation can dereference them without panicking.
func (da *DescriptiveAnalyzer) Analyze(data []float64) (*DescriptiveStats, error) { return &DescriptiveStats{}, nil }
func (od *OutlierDetector) Detect(data []float64) (*OutlierResults, error) { return &OutlierResults{}, nil }
func (ia *InferentialAnalyzer) Analyze(data []float64) (*InferentialResults, error) { return &InferentialResults{}, nil }
func (ha *HypothesisAnalyzer) ValidateAssumptions(data []float64, testType TestType) AssumptionResults { return AssumptionResults{} }
func (ha *HypothesisAnalyzer) PerformTest(testType TestType, data1, data2 []float64, options TestOptions) (*TestResults, error) { return &TestResults{}, nil }
// Example usage
func ExampleStatisticalAnalysis() {
// Create analyzer configuration
config := AnalyzerConfig{
ConfidenceLevel: 0.95,
SignificanceLevel: 0.05,
MinSampleSize: 10,
MaxSampleSize: 10000,
OutlierThreshold: 3.0,
EnableOutlierRemoval: true,
EnableNormalization: true,
EnableRobustStats: true,
BootstrapSamples: 1000,
CacheEnabled: true,
ParallelProcessing: true,
MaxWorkers: 4,
}
// Create analyzer
analyzer := NewStatisticalAnalyzer(config)
// Create sample benchmark data
data := &BenchmarkData{
ID: "benchmark-001",
Name: "CPU Performance Test",
Values: []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2},
Metadata: DataMetadata{
Source: "go test -bench",
Version: "1.0.0",
Platform: "linux/amd64",
GoVersion: "1.21.0",
Iterations: 10,
Duration: time.Second,
},
Timestamp: time.Now(),
Quality: DataQuality{
Completeness: 1.0,
Consistency: 0.95,
Accuracy: 0.98,
Validity: 1.0,
OverallScore: 0.98,
},
}
// Perform analysis
results, err := analyzer.AnalyzeData(data)
if err != nil {
fmt.Printf("Analysis failed: %v\n", err)
return
}
fmt.Println("Statistical Analysis Results:")
// Guard each section before dereferencing it: the placeholder analyzers
// above return nil results.
if d := results.Descriptive; d != nil {
fmt.Printf("Sample size: %d\n", d.Count)
fmt.Printf("Mean: %.4f\n", d.Mean)
fmt.Printf("Median: %.4f\n", d.Median)
fmt.Printf("Standard deviation: %.4f\n", d.StdDev)
}
if inf := results.Inferential; inf != nil {
fmt.Printf("95%% CI for mean: [%.4f, %.4f]\n",
inf.PopulationMean.LowerBound,
inf.PopulationMean.UpperBound)
}
if results.Outliers != nil && len(results.Outliers.Outliers) > 0 {
fmt.Printf("Outliers detected: %d\n", len(results.Outliers.Outliers))
}
fmt.Printf("Data quality score: %.2f\n", results.DataQuality.OverallScore)
}
Descriptive Statistics
Descriptive statistics summarize a set of benchmark measurements: where the values center, how widely they spread, and what shape their distribution takes.
Central Tendency Measures
The mean, median, and mode locate the center of the measurements. Robust alternatives such as the median or a trimmed mean are less sensitive to an occasional slow run.
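As a rough illustration of robust central tendency, the following sketch computes a median and a symmetrically trimmed mean. The helper names `medianOf` and `trimmedMean` and the 25% trim fraction are assumptions made for this example; they are not part of the analyzer shown earlier.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy returns a sorted copy so the caller's slice is left untouched.
func sortedCopy(data []float64) []float64 {
	s := append([]float64(nil), data...)
	sort.Float64s(s)
	return s
}

// medianOf returns the middle value, or the average of the two middle
// values when the sample size is even. It returns 0 for empty input.
func medianOf(data []float64) float64 {
	n := len(data)
	if n == 0 {
		return 0
	}
	s := sortedCopy(data)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// trimmedMean discards floor(frac*n) values from each tail and averages
// the rest. It falls back to the median when frac is negative or when
// trimming would remove every value.
func trimmedMean(data []float64, frac float64) float64 {
	n := len(data)
	k := int(frac * float64(n))
	if n == 0 || frac < 0 || 2*k >= n {
		return medianOf(data)
	}
	s := sortedCopy(data)
	sum := 0.0
	for _, v := range s[k : n-k] {
		sum += v
	}
	return sum / float64(n-2*k)
}

func main() {
	// One slow run (9.0) among otherwise stable timings.
	timings := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 9.0}
	fmt.Printf("median: %.3f\n", medianOf(timings))
	fmt.Printf("25%% trimmed mean: %.3f\n", trimmedMean(timings, 0.25))
}
```

Both estimators ignore the single slow run that would pull the plain mean upward, which is the usual reason to report them alongside the mean.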
Variability Measures
Standard deviation and variance quantify spread around the mean. The interquartile range and other robust measures describe spread without being dominated by outliers.
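A minimal sketch of two robust spread measures follows: the interquartile range and the median absolute deviation (MAD). The function names and the linear-interpolation quantile rule are assumptions chosen for this example; other quantile conventions give slightly different values on small samples.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// quantile returns the q-th quantile (0 <= q <= 1) of an already sorted,
// non-empty slice, interpolating linearly between neighboring values.
func quantile(sorted []float64, q float64) float64 {
	pos := q * float64(len(sorted)-1)
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	return sorted[lo] + (pos-float64(lo))*(sorted[hi]-sorted[lo])
}

// sortedCopy returns a sorted copy so the caller's slice is left untouched.
func sortedCopy(data []float64) []float64 {
	s := append([]float64(nil), data...)
	sort.Float64s(s)
	return s
}

// iqr returns the interquartile range Q3 - Q1, or 0 for empty input.
func iqr(data []float64) float64 {
	if len(data) == 0 {
		return 0
	}
	s := sortedCopy(data)
	return quantile(s, 0.75) - quantile(s, 0.25)
}

// mad returns the median of the absolute deviations from the median,
// or 0 for empty input.
func mad(data []float64) float64 {
	if len(data) == 0 {
		return 0
	}
	med := quantile(sortedCopy(data), 0.5)
	dev := make([]float64, len(data))
	for i, v := range data {
		dev[i] = math.Abs(v - med)
	}
	sort.Float64s(dev)
	return quantile(dev, 0.5)
}

func main() {
	timings := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 9.0}
	fmt.Printf("IQR: %.3f\n", iqr(timings))
	fmt.Printf("MAD: %.3f\n", mad(timings))
}
```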
Distribution Analysis
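Shape analysis covers skewness (asymmetry), kurtosis (tail weight), and distribution fitting. Together these indicate whether methods that assume normality are appropriate for the data.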
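The sketch below estimates skewness and excess kurtosis from central moments. It uses the simple population (biased) moment estimators, which is an assumption of this example; the function name `shape` is likewise illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// shape returns the moment-based skewness and excess kurtosis of data.
// Both are 0 for empty or constant input, where they are undefined.
func shape(data []float64) (skew, excessKurt float64) {
	n := float64(len(data))
	if n == 0 {
		return 0, 0
	}
	sum := 0.0
	for _, v := range data {
		sum += v
	}
	mean := sum / n

	// Second, third, and fourth central moments.
	var m2, m3, m4 float64
	for _, v := range data {
		d := v - mean
		m2 += d * d
		m3 += d * d * d
		m4 += d * d * d * d
	}
	m2, m3, m4 = m2/n, m3/n, m4/n
	if m2 == 0 {
		return 0, 0
	}
	return m3 / math.Pow(m2, 1.5), m4/(m2*m2) - 3
}

func main() {
	timings := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 9.0}
	skew, kurt := shape(timings)
	fmt.Printf("skewness: %.3f, excess kurtosis: %.3f\n", skew, kurt)
}
```

A clearly positive skewness suggests a long right tail, in which case the median and rank-based tests are usually safer summaries than the mean and t-tests.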
Inferential Statistics
Inferential statistics use a sample of measurements to draw conclusions about the underlying population, such as the long-run performance of the code under test.
Confidence Intervals
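A confidence interval gives a range of plausible values for a population parameter, such as the true mean, at a chosen confidence level (0.95 in the example configuration above).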
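One common construction is the interval mean ± crit × s/√n. The sketch below assumes the caller supplies the critical value, because the Go standard library does not provide Student's t quantiles; `meanCI` is a hypothetical helper, and 2.262 is the two-sided 95% t critical value for 9 degrees of freedom (a sample of 10).

```go
package main

import (
	"fmt"
	"math"
)

// meanCI returns mean ± crit * s/sqrt(n), where s is the sample standard
// deviation (n-1 denominator). ok is false for fewer than two values.
func meanCI(data []float64, crit float64) (lo, hi float64, ok bool) {
	n := len(data)
	if n < 2 {
		return 0, 0, false
	}
	sum := 0.0
	for _, v := range data {
		sum += v
	}
	mean := sum / float64(n)

	ss := 0.0
	for _, v := range data {
		d := v - mean
		ss += d * d
	}
	// Standard error of the mean.
	se := math.Sqrt(ss/float64(n-1)) / math.Sqrt(float64(n))
	return mean - crit*se, mean + crit*se, true
}

func main() {
	// The sample values from the example above (n = 10, so df = 9).
	values := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2}
	if lo, hi, ok := meanCI(values, 2.262); ok {
		fmt.Printf("95%% CI for mean: [%.4f, %.4f]\n", lo, hi)
	}
}
```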
Bootstrap Methods
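The non-parametric bootstrap resamples the observed measurements with replacement to estimate the sampling variability of a statistic without assuming a particular distribution.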
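A minimal sketch of a percentile bootstrap interval for the mean is shown below. The function name, the fixed seed, and the percentile method itself are assumptions of this example; the 1000 resamples mirror the `BootstrapSamples` value in the example configuration.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// bootstrapMeanCI resamples data with replacement, computes the mean of each
// resample, and returns the alpha/2 and 1-alpha/2 percentiles of those means.
// It returns (0, 0) for empty data, non-positive resamples, or alpha outside (0, 1).
func bootstrapMeanCI(data []float64, resamples int, alpha float64, seed int64) (lo, hi float64) {
	n := len(data)
	if n == 0 || resamples <= 0 || alpha <= 0 || alpha >= 1 {
		return 0, 0
	}
	// A seeded generator keeps the analysis reproducible.
	rng := rand.New(rand.NewSource(seed))
	means := make([]float64, resamples)
	for i := range means {
		sum := 0.0
		for j := 0; j < n; j++ {
			sum += data[rng.Intn(n)]
		}
		means[i] = sum / float64(n)
	}
	sort.Float64s(means)

	loIdx := int(alpha / 2 * float64(resamples))
	hiIdx := int((1 - alpha/2) * float64(resamples))
	if hiIdx >= resamples {
		hiIdx = resamples - 1
	}
	return means[loIdx], means[hiIdx]
}

func main() {
	values := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2}
	lo, hi := bootstrapMeanCI(values, 1000, 0.05, 42)
	fmt.Printf("bootstrap 95%% CI for mean: [%.4f, %.4f]\n", lo, hi)
}
```

The same loop works for statistics with no convenient closed-form interval, such as the median: replace the inner mean computation with the statistic of interest.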
Sampling Distributions
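A sampling distribution describes how a statistic, such as the sample mean, varies from one sample to the next. Its spread, the standard error, shrinks as the sample size grows.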
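The simulation below is a sketch of one such property: for independent draws, the standard deviation of the sample mean is approximately σ/√n. It draws from a Uniform(0, 1) population (σ = √(1/12)), which is an assumption picked for convenience and not a model of real benchmark timings; `sdOfSampleMeans` is an illustrative name.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// sdOfSampleMeans draws reps samples of size n from Uniform(0, 1) and returns
// the standard deviation of the resulting sample means.
// It returns 0 when n < 1 or reps < 2.
func sdOfSampleMeans(n, reps int, seed int64) float64 {
	if n < 1 || reps < 2 {
		return 0
	}
	rng := rand.New(rand.NewSource(seed))
	means := make([]float64, reps)
	total := 0.0
	for i := range means {
		sum := 0.0
		for j := 0; j < n; j++ {
			sum += rng.Float64()
		}
		means[i] = sum / float64(n)
		total += means[i]
	}
	grand := total / float64(reps)
	ss := 0.0
	for _, m := range means {
		d := m - grand
		ss += d * d
	}
	return math.Sqrt(ss / float64(reps-1))
}

func main() {
	sigma := math.Sqrt(1.0 / 12.0) // standard deviation of Uniform(0, 1)
	for _, n := range []int{4, 16, 64} {
		empirical := sdOfSampleMeans(n, 4000, 1)
		theory := sigma / math.Sqrt(float64(n))
		fmt.Printf("n=%2d empirical SE=%.4f theoretical SE=%.4f\n", n, empirical, theory)
	}
}
```

Each fourfold increase in sample size should roughly halve the standard error, which is why halving the width of a confidence interval takes about four times as many benchmark runs.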
Hypothesis Testing
Hypothesis tests assess whether an observed performance difference between two sets of measurements is larger than run-to-run noise would explain.
Parametric Tests
T-tests, F-tests, and other parametric tests assume a distributional form for the data, typically normality.
Non-parametric Tests
Mann-Whitney, Wilcoxon, and other distribution-free tests work on ranks rather than raw values, so they do not assume normality.
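The following sketch computes the Mann-Whitney U statistic by direct pair counting, plus a normal-approximation z score. The pair-counting approach is quadratic and chosen for clarity, and the z score omits the tie correction to the variance; both are simplifications assumed for this example, as are the function names.

```go
package main

import (
	"fmt"
	"math"
)

// mannWhitneyU counts the pairs (x, y) with x from a and y from b where
// x > y, counting each tie as half a pair.
func mannWhitneyU(a, b []float64) float64 {
	u := 0.0
	for _, x := range a {
		for _, y := range b {
			switch {
			case x > y:
				u++
			case x == y:
				u += 0.5
			}
		}
	}
	return u
}

// mannWhitneyZ returns the normal-approximation z score for U, without a
// tie correction. ok is false if either sample is empty.
func mannWhitneyZ(a, b []float64) (z float64, ok bool) {
	n, m := float64(len(a)), float64(len(b))
	if n == 0 || m == 0 {
		return 0, false
	}
	u := mannWhitneyU(a, b)
	mu := n * m / 2
	sigma := math.Sqrt(n * m * (n + m + 1) / 12)
	return (u - mu) / sigma, true
}

func main() {
	before := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2}
	after := []float64{1.0, 0.9, 1.1, 1.0, 1.2, 1.0, 0.9, 1.1, 1.2, 1.0}
	fmt.Printf("U = %.1f\n", mannWhitneyU(before, after))
	if z, ok := mannWhitneyZ(before, after); ok {
		fmt.Printf("z = %.3f\n", z)
	}
}
```

The normal approximation is generally considered unreliable for very small samples; exact tables or a permutation approach are the usual alternatives there.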
Multiple Comparisons
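When several benchmarks are tested at once, the chance of at least one false positive grows with the number of tests. Correction methods adjust the significance threshold to control it.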
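One widely used correction is the Holm step-down procedure, sketched below. The choice of Holm over other corrections and the function name `holm` are assumptions of this example. The p-values are compared in ascending order against alpha/m, alpha/(m-1), and so on, stopping at the first one that fails.

```go
package main

import (
	"fmt"
	"sort"
)

// holm applies the Holm step-down procedure at family-wise level alpha and
// reports, for each input p-value, whether its null hypothesis is rejected.
func holm(pvals []float64, alpha float64) []bool {
	m := len(pvals)
	// order holds the indices of pvals sorted by ascending p-value.
	order := make([]int, m)
	for i := range order {
		order[i] = i
	}
	sort.Slice(order, func(i, j int) bool {
		return pvals[order[i]] < pvals[order[j]]
	})

	reject := make([]bool, m)
	for rank, idx := range order {
		// The k-th smallest p-value (0-based rank) is tested at alpha/(m-rank).
		if pvals[idx] > alpha/float64(m-rank) {
			break // this and all larger p-values are not rejected
		}
		reject[idx] = true
	}
	return reject
}

func main() {
	// Hypothetical p-values from three benchmark comparisons.
	pvals := []float64{0.01, 0.04, 0.03}
	fmt.Println(holm(pvals, 0.05))
}
```

Here the significance level of 0.05 matches the `SignificanceLevel` in the example configuration; only the first comparison survives the correction, even though all three p-values are below 0.05 individually.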
Best Practices
- Sample Size: Ensure adequate sample sizes for reliable statistical inference
- Assumption Validation: Always validate test assumptions before applying methods
- Effect Size: Report both statistical significance and practical significance
- Multiple Comparisons: Apply appropriate corrections for multiple testing
- Outlier Handling: Carefully consider outlier detection and treatment strategies
- Robust Methods: Use robust statistical methods when assumptions are violated
- Reproducibility: Ensure analyses are reproducible with proper random seeds
- Interpretation: Provide clear interpretation of statistical results
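To make the Effect Size recommendation above concrete, the sketch below computes Cohen's d with a pooled standard deviation. Cohen's d is one of several possible effect size measures, and choosing it here is an assumption of this example, as is the function name `cohensD`.

```go
package main

import (
	"fmt"
	"math"
)

// meanVar returns the mean and the sample variance (n-1 denominator).
// The caller must pass at least two values.
func meanVar(data []float64) (mean, variance float64) {
	n := float64(len(data))
	for _, v := range data {
		mean += v
	}
	mean /= n
	for _, v := range data {
		d := v - mean
		variance += d * d
	}
	return mean, variance / (n - 1)
}

// cohensD returns (mean(a) - mean(b)) divided by the pooled standard
// deviation. ok is false if either sample has fewer than two values or the
// pooled variance is zero.
func cohensD(a, b []float64) (d float64, ok bool) {
	if len(a) < 2 || len(b) < 2 {
		return 0, false
	}
	ma, va := meanVar(a)
	mb, vb := meanVar(b)
	na, nb := float64(len(a)), float64(len(b))
	pooled := ((na-1)*va + (nb-1)*vb) / (na + nb - 2)
	if pooled == 0 {
		return 0, false
	}
	return (ma - mb) / math.Sqrt(pooled), true
}

func main() {
	before := []float64{1.2, 1.1, 1.3, 1.0, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2}
	after := []float64{1.0, 0.9, 1.1, 1.0, 1.2, 1.0, 0.9, 1.1, 1.2, 1.0}
	if d, ok := cohensD(before, after); ok {
		fmt.Printf("Cohen's d = %.3f\n", d)
	}
}
```

Reporting d next to the test result separates "is the difference real?" from "is the difference large enough to matter?".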
Summary
Statistical analysis provides the foundation for evidence-based performance optimization:
- Descriptive Analysis: Comprehensive characterization of performance data
- Inferential Analysis: Population inferences from sample measurements
- Hypothesis Testing: Rigorous testing of performance hypotheses
- Effect Size Analysis: Quantification of practical significance
- Outlier Detection: Identification and handling of anomalous measurements
- Model Building: Statistical models for performance prediction
These techniques enable data-driven performance optimization decisions with proper statistical rigor and confidence.