Migration Guide - Switching to Real Simple Stats
Complete guide for migrating from other statistical libraries to Real Simple Stats.
📚 Overview
This guide helps you transition from:
R - Statistical programming language
SciPy - Python scientific computing
statsmodels - Python statistical models
SPSS - Commercial statistical software
Excel - Spreadsheet analysis
🔄 From R to Real Simple Stats
Philosophy Differences
| Aspect | R | Real Simple Stats |
|---|---|---|
| Syntax | | |
| Data structures | data.frames, vectors | Lists, NumPy arrays |
| Output | Complex objects | Simple dicts/tuples |
| Installation | | |
Common Function Translations
Descriptive Statistics
| R | Real Simple Stats |
|---|---|
| mean(x) | rss.mean(x) |
| sd(x) | rss.sample_std_dev(x) |
| median(x) | rss.median(x) |
Example Migration:
# R code
data <- c(1, 2, 3, 4, 5)
mean_val <- mean(data)
sd_val <- sd(data)
# Python equivalent
import real_simple_stats as rss
data = [1, 2, 3, 4, 5]
mean_val = rss.mean(data)
sd_val = rss.sample_std_dev(data)
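R's sd() divides by n - 1 (the sample standard deviation), which is why the translation above uses rss.sample_std_dev rather than a population formula. The sketch below is only a cross-check built on Python's standard statistics module; it does not call Real Simple Stats, and the variable names are illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]

# Sample standard deviation (divides by n - 1), the convention R's sd() follows.
sample_sd = statistics.stdev(data)

# Population standard deviation (divides by n), shown for contrast.
population_sd = statistics.pstdev(data)

print(f"sample SD:     {sample_sd:.4f}")      # sqrt(2.5)
print(f"population SD: {population_sd:.4f}")  # sqrt(2.0)
```

If a migrated value matches the second number instead of the first, the wrong divisor is in play.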
Hypothesis Tests
| R | Real Simple Stats |
|---|---|
| t.test(x, y) | rss.two_sample_t_test(x, y) |
| chisq.test(obs, exp) | rss.chi_square_statistic(obs, exp) |
Example Migration:
# R code
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(28, 30, 35, 38, 40)
result <- t.test(group1, group2)
print(result$p.value)
# Python equivalent
import real_simple_stats as rss
group1 = [23, 25, 28, 30, 32]
group2 = [28, 30, 35, 38, 40]
t_stat, p_value = rss.two_sample_t_test(group1, group2)
print(p_value)
Regression
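| R | Real Simple Stats |
|---|---|
| lm(y ~ x) | rss.linear_regression(x, y) |
| predict(model, newdata) | rss.regression_equation(x, slope, intercept) |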
Example Migration:
# R code
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
model <- lm(y ~ x)
summary(model)
# Python equivalent
import real_simple_stats as rss
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_value, p_value, std_err = rss.linear_regression(x, y)
r_squared = r_value ** 2
print(f"Slope: {slope:.3f}")
print(f"Intercept: {intercept:.3f}")
print(f"R²: {r_squared:.3f}")
print(f"p-value: {p_value:.4f}")
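To confirm that a migrated regression reports the same coefficients as summary(model) in R, the least-squares formulas can be evaluated by hand. This is a minimal sketch in plain Python; simple_ols is a hypothetical helper written for this check, not part of Real Simple Stats.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def simple_ols(x, y):
    """Return (slope, intercept, r) for a least-squares line through paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = simple_ols(x, y)
print(f"Slope: {slope:.3f}")          # 0.600
print(f"Intercept: {intercept:.3f}")  # 2.200
print(f"R²: {r ** 2:.3f}")            # 0.600
```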
Distributions
| R | Real Simple Stats |
|---|---|
| pnorm(x, mean, sd) | rss.normal_cdf(x, mu, sigma) |
| qnorm(p, mean, sd) | rss.normal_ppf(p, mu, sigma) |
Key Differences
Return Values:
# R returns complex object
result <- t.test(x, y)
result$statistic
result$p.value
result$conf.int
# Python returns tuple
t_stat, p_value = rss.two_sample_t_test(x, y)
# Simpler, but less information
Data Frames:
# R uses data frames natively
df <- data.frame(x=c(1,2,3), y=c(4,5,6))
cor(df$x, df$y)
# Python uses lists or pandas
import pandas as pd
df = pd.DataFrame({'x': [1,2,3], 'y': [4,5,6]})
rss.pearson_correlation(df['x'].tolist(), df['y'].tolist())
Missing Values:
# R skips NA values when na.rm=TRUE is set
mean(c(1, 2, NA, 4), na.rm=TRUE)
# Python requires manual handling
data = [1, 2, None, 4]
clean_data = [x for x in data if x is not None]
rss.mean(clean_data)
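Data loaded through pandas usually marks missing cells as NaN rather than None, so a filter that only checks for None will let them through. The sketch below shows one way to drop both; drop_missing is a hypothetical helper, and statistics.mean stands in for rss.mean so the block runs without the library installed.

```python
import math
import statistics


def drop_missing(values):
    """Return a new list without None or float NaN entries."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


data = [1, 2, None, 4, float("nan")]
clean_data = drop_missing(data)

print(clean_data)                   # [1, 2, 4]
print(statistics.mean(clean_data))  # stand-in for rss.mean(clean_data)
```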
From SciPy to Real Simple Stats
Function Translations
Descriptive Statistics
| SciPy/NumPy | Real Simple Stats |
|---|---|
| np.mean(x) | rss.mean(x) |
| np.median(x) | rss.median(x) |
| np.std(x, ddof=1) | rss.sample_std_dev(x) |
Hypothesis Tests
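| SciPy | Real Simple Stats |
|---|---|
| stats.ttest_1samp(data, 30) | rss.one_sample_t_test(data, mu0=30) |
| stats.ttest_ind(group1, group2) | rss.two_sample_t_test(group1, group2) |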
Example Migration:
# SciPy code
from scipy import stats
import numpy as np
data = [23, 25, 28, 30, 32]
t_stat, p_value = stats.ttest_1samp(data, 30)
# Real Simple Stats equivalent
import real_simple_stats as rss
data = [23, 25, 28, 30, 32]
t_stat, p_value = rss.one_sample_t_test(data, mu0=30)
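Both calls should report the same t statistic, and that number is easy to reproduce by hand as a sanity check. A minimal sketch using only the standard library follows; it computes the statistic and degrees of freedom but not the p-value, and it does not call SciPy or Real Simple Stats.

```python
import math
import statistics

data = [23, 25, 28, 30, 32]
mu0 = 30  # hypothesized population mean

n = len(data)
# Standard error of the mean, based on the sample standard deviation (n - 1).
standard_error = statistics.stdev(data) / math.sqrt(n)
t_stat = (statistics.mean(data) - mu0) / standard_error
degrees_of_freedom = n - 1

print(f"t({degrees_of_freedom}) = {t_stat:.3f}")
```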
Distributions
| SciPy | Real Simple Stats |
|---|---|
| norm.cdf(x, mu, sigma) | rss.normal_cdf(x, mu, sigma) |
| norm.ppf(p, mu, sigma) | rss.normal_ppf(p, mu, sigma) |
Regression
| SciPy | Real Simple Stats |
|---|---|
| stats.linregress(x, y) | rss.linear_regression(x, y) |
Example Migration:
# SciPy code
from scipy import stats
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
# Real Simple Stats equivalent (identical!)
import real_simple_stats as rss
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_value, p_value, std_err = rss.linear_regression(x, y)
Key Advantages of Real Simple Stats
Simpler imports:
# SciPy
from scipy import stats
from scipy.stats import norm, binom
import numpy as np
# Real Simple Stats
import real_simple_stats as rss
Clearer function names:
# SciPy
stats.ttest_ind(group1, group2)
# Real Simple Stats (more descriptive)
rss.two_sample_t_test(group1, group2)
Educational focus:
# Real Simple Stats has better docstrings
help(rss.two_sample_t_test)
# Includes: explanation, formula, interpretation
From statsmodels to Real Simple Stats
Function Translations
| statsmodels | Real Simple Stats |
|---|---|
| sm.OLS(y, sm.add_constant(X)).fit() | rss.multiple_regression(X, y, include_intercept=True) |
Example Migration:
# statsmodels code
import statsmodels.api as sm
import numpy as np
# Predictor columns must not be perfectly collinear with each other or with the constant
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [2, 4, 5, 4, 5]
X_with_const = sm.add_constant(X)
model = sm.OLS(y, X_with_const).fit()
print(model.summary())
# Real Simple Stats equivalent
import real_simple_stats as rss
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [2, 4, 5, 4, 5]
result = rss.multiple_regression(X, y, include_intercept=True)
print(f"R² = {result['r_squared']:.3f}")
print(f"Coefficients: {result['coefficients']}")
print(f"Intercept: {result['intercept']}")
When to Use Each
Use statsmodels when:
Need detailed regression diagnostics
Require time series models (ARIMA, VAR)
Need generalized linear models (GLM)
Want comprehensive statistical tests
Use Real Simple Stats when:
Learning statistics
Quick exploratory analysis
Teaching or presentations
Simple regression/correlation
💼 From SPSS to Real Simple Stats
Common SPSS Tasks
Task 1: Descriptive Statistics
SPSS:
DESCRIPTIVES VARIABLES=score
/STATISTICS=MEAN STDDEV MIN MAX.
Real Simple Stats:
import real_simple_stats as rss
score = [85, 90, 78, 92, 88]
print(f"Mean: {rss.mean(score)}")
print(f"Std Dev: {rss.sample_std_dev(score)}")
print(f"Min: {min(score)}")
print(f"Max: {max(score)}")
Task 2: Independent t-test
SPSS:
T-TEST GROUPS=group(1 2)
/VARIABLES=score.
Real Simple Stats:
import real_simple_stats as rss
group1 = [85, 90, 78, 92, 88]
group2 = [75, 80, 72, 82, 78]
t_stat, p_value = rss.two_sample_t_test(group1, group2)
d = rss.cohens_d(group1, group2)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print(f"Cohen's d = {d:.3f}")
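SPSS does not always print an effect size next to the t-test, so it helps to know what value to expect when checking the migrated output. The sketch below uses one common definition of Cohen's d (mean difference divided by the pooled sample standard deviation). Whether rss.cohens_d uses exactly this formula is an assumption to verify, and pooled_cohens_d is a hypothetical helper.

```python
import math
import statistics


def pooled_cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled sample SD."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


group1 = [85, 90, 78, 92, 88]
group2 = [75, 80, 72, 82, 78]
d_check = pooled_cohens_d(group1, group2)
print(f"Cohen's d = {d_check:.3f}")
```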
Task 3: Correlation
SPSS:
CORRELATIONS
/VARIABLES=height weight.
Real Simple Stats:
import real_simple_stats as rss
height = [65, 70, 68, 72, 66]
weight = [150, 180, 165, 190, 155]
r = rss.pearson_correlation(height, weight)
print(f"r = {r:.3f}")
Task 4: Linear Regression
SPSS:
REGRESSION
/DEPENDENT score
/METHOD=ENTER hours_studied.
Real Simple Stats:
import real_simple_stats as rss
hours_studied = [1, 2, 3, 4, 5]
score = [55, 65, 70, 80, 85]
slope, intercept, r_value, p_value, std_err = rss.linear_regression(
    hours_studied, score
)
print(f"Equation: score = {slope:.2f} * hours + {intercept:.2f}")
print(f"R² = {r_value**2:.3f}")
print(f"p = {p_value:.4f}")
Advantages of Real Simple Stats over SPSS
Free and open-source (SPSS is expensive)
Reproducible (code vs. clicking)
Automatable (scripts vs. manual)
Portable (runs anywhere Python runs)
Integrates with Python ecosystem (pandas, matplotlib, etc.)
From Excel to Real Simple Stats
Common Excel Functions
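| Excel | Real Simple Stats |
|---|---|
| =AVERAGE(A1:A10) | rss.mean(data) |
| =MEDIAN(A1:A10) | rss.median(data) |
| =STDEV.S(A1:A10) | rss.sample_std_dev(data) |
| =CORREL(A1:A10, B1:B10) | rss.pearson_correlation(x, y) |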
Example Migration: Data Analysis
Excel Workflow:
Enter data in columns A and B
Click Data → Data Analysis → t-Test
Select ranges
Click OK
View output
Real Simple Stats Workflow:
import real_simple_stats as rss
import pandas as pd
# Read Excel file
df = pd.read_excel('data.xlsx')
# Perform t-test
t_stat, p_value = rss.two_sample_t_test(
    df['Group1'].tolist(),
    df['Group2'].tolist()
)
# Calculate effect size
d = rss.cohens_d(
    df['Group1'].tolist(),
    df['Group2'].tolist()
)
# Report results
print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.4f}")
print(f"Cohen's d: {d:.3f}")
Advantages over Excel
Reproducibility: Code can be re-run
Scalability: Handle large datasets
Automation: Process multiple files
Version control: Track changes with Git
Advanced statistics: More methods available
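The automation point can be sketched as a loop over exported files. The example below assumes each workbook has been saved as CSV with a numeric column of interest. The function names, folder name, and column name are all hypothetical, and the standard statistics module stands in for the Real Simple Stats calls.

```python
import csv
import statistics
from pathlib import Path


def summarize_column(csv_path, column):
    """Read one numeric column from a CSV file and return (n, mean, sample SD)."""
    with open(csv_path, newline="") as handle:
        # Skip rows where the cell is empty or missing.
        values = [
            float(row[column])
            for row in csv.DictReader(handle)
            if row.get(column)
        ]
    return len(values), statistics.mean(values), statistics.stdev(values)


def summarize_folder(folder, column):
    """Apply summarize_column to every .csv file in a folder, keyed by file name."""
    return {
        path.name: summarize_column(path, column)
        for path in sorted(Path(folder).glob("*.csv"))
    }
```

A call such as summarize_folder("exports", "score") would then return one summary tuple per file, replacing a repeated point-and-click workflow.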
🔄 Complete Migration Example
Scenario: Comparing Two Groups
R Code:
# Load data
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(28, 30, 35, 38, 40)
# Descriptive statistics
mean1 <- mean(group1)
mean2 <- mean(group2)
sd1 <- sd(group1)
sd2 <- sd(group2)
# t-test
result <- t.test(group1, group2)
# Effect size (requires package)
library(effsize)
d <- cohen.d(group1, group2)
# Report
cat(sprintf("Group 1: M=%.2f, SD=%.2f\n", mean1, sd1))
cat(sprintf("Group 2: M=%.2f, SD=%.2f\n", mean2, sd2))
cat(sprintf("t(%.0f)=%.2f, p=%.3f\n",
            result$parameter, result$statistic, result$p.value))
cat(sprintf("Cohen's d=%.2f\n", d$estimate))
Real Simple Stats Code:
import real_simple_stats as rss
# Load data
group1 = [23, 25, 28, 30, 32]
group2 = [28, 30, 35, 38, 40]
# Descriptive statistics
mean1 = rss.mean(group1)
mean2 = rss.mean(group2)
sd1 = rss.sample_std_dev(group1)
sd2 = rss.sample_std_dev(group2)
# t-test
t_stat, p_value = rss.two_sample_t_test(group1, group2)
# Effect size
d = rss.cohens_d(group1, group2)
interpretation = rss.interpret_effect_size(d, 'd')
# Report
print(f"Group 1: M={mean1:.2f}, SD={sd1:.2f}")
print(f"Group 2: M={mean2:.2f}, SD={sd2:.2f}")
print(f"t({len(group1)+len(group2)-2})={t_stat:.2f}, p={p_value:.3f}")
print(f"Cohen's d={d:.2f} ({interpretation})")
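One detail to check when comparing the two reports: R's t.test() runs Welch's unequal-variance test unless var.equal = TRUE is passed, so result$parameter is a fractional Welch degrees of freedom, while the Python report prints n1 + n2 - 2. This guide does not say which variant rss.two_sample_t_test computes, so treat that as something to confirm. The sketch below computes both degrees of freedom with the standard library only.

```python
import statistics

group1 = [23, 25, 28, 30, 32]
group2 = [28, 30, 35, 38, 40]

n1, n2 = len(group1), len(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)

# Pooled (Student) test: assumes equal variances in both groups.
pooled_df = n1 + n2 - 2

# Welch test (R's t.test default): Welch-Satterthwaite approximation.
a, b = v1 / n1, v2 / n2
welch_df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(f"pooled df = {pooled_df}")
print(f"Welch df  = {welch_df:.2f}")
```

With equal group sizes the t statistic is the same under both variants; only the degrees of freedom, and therefore the p-value, differ.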
📋 Migration Checklist
Before Migration
[ ] Identify which functions you use most
[ ] Check if Real Simple Stats supports them
[ ] Review API Comparison
[ ] Test with sample data
During Migration
[ ] Install Real Simple Stats: pip install real-simple-stats
[ ] Convert data structures (data.frames → lists/arrays)
[ ] Translate function calls
[ ] Verify results match original
[ ] Update documentation/comments
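The "verify results" step can be automated with a small tolerance check. The helper below is a hypothetical sketch, not part of Real Simple Stats. The default relative tolerance is an assumption, chosen because values copied from R or SPSS output are usually rounded to a handful of significant digits.

```python
import math


def assert_matches(name, original, migrated, rel_tol=1e-6, abs_tol=1e-9):
    """Raise AssertionError if a migrated value drifts from the original tool's value."""
    if not math.isclose(original, migrated, rel_tol=rel_tol, abs_tol=abs_tol):
        raise AssertionError(
            f"{name}: original={original!r}, migrated={migrated!r}"
        )


# The reference value would be copied from the original tool's output;
# here it is the mean of 1..5, which is 3.
assert_matches("mean", 3.0, sum([1, 2, 3, 4, 5]) / 5)
```

Loosen rel_tol when the reference values come from printed output with few decimals.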
After Migration
[ ] Run tests to ensure correctness
[ ] Update analysis scripts
[ ] Train team members
[ ] Document any limitations
Quick Reference Card
Most Common Translations
# Descriptive Statistics
mean(x) → rss.mean(x)
sd(x) / np.std(x, ddof=1) → rss.sample_std_dev(x)
median(x) → rss.median(x)
cor(x, y) → rss.pearson_correlation(x, y)
# Hypothesis Tests
t.test(x, y) → rss.two_sample_t_test(x, y)
chisq.test(obs, exp) → rss.chi_square_statistic(obs, exp)
# Regression
lm(y ~ x) → rss.linear_regression(x, y)
predict(model, newdata) → rss.regression_equation(x, slope, intercept)
# Distributions
pnorm(x, mean, sd) → rss.normal_cdf(x, mu, sigma)
qnorm(p, mean, sd) → rss.normal_ppf(p, mu, sigma)
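When checking the distribution translations, Python's standard library offers an independent reference for the normal CDF and its inverse. The sketch below uses statistics.NormalDist (Python 3.8+) and does not call Real Simple Stats; the values it prints are what pnorm and qnorm return for the same arguments.

```python
from statistics import NormalDist

mu, sigma = 0.0, 1.0
standard_normal = NormalDist(mu, sigma)

p = standard_normal.cdf(1.96)       # counterpart of pnorm(1.96, 0, 1)
x = standard_normal.inv_cdf(0.975)  # counterpart of qnorm(0.975, 0, 1)

print(f"P(X <= 1.96) = {p:.4f}")       # 0.9750
print(f"97.5th percentile = {x:.4f}")  # 1.9600
```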
Tips for Successful Migration
Start small: Migrate one analysis at a time
Verify results: Compare outputs with original software
Use version control: Track changes with Git
Document differences: Note any discrepancies
Leverage Python ecosystem: Combine with pandas, matplotlib
Ask for help: Use GitHub issues
Additional Resources
API Comparison: Detailed function mapping
Examples: Interactive tutorials
FAQ: Common questions
Troubleshooting: Error solutions
Need help migrating? Open an issue with your use case!
Last Updated: 2025
Version: 0.3.0