Contributing to Real Simple Stats
=================================

We welcome contributions to Real Simple Stats! This guide will help you get started with contributing to the project.

Code Quality Standards
----------------------

We maintain high code quality standards. All contributions must meet these requirements:

Code Style
~~~~~~~~~~

* **Formatting**: Code is automatically formatted with Black (88 character line length)
* **Linting**: Must pass Flake8 linting with our configuration
* **Type Hints**: All functions must have comprehensive type annotations
* **Docstrings**: All public functions must have Google-style docstrings

Example of a properly formatted function::

    from typing import List


    def calculate_mean(values: List[float]) -> float:
        """Calculate the arithmetic mean of a list of values.

        Args:
            values: List of numeric values to calculate mean for.
                Must contain at least one value.

        Returns:
            The arithmetic mean of the input values.

        Raises:
            ValueError: If the input list is empty.

        Example:
            >>> calculate_mean([1, 2, 3, 4, 5])
            3.0
        """
        if not values:
            raise ValueError("Cannot calculate mean of empty list")
        return sum(values) / len(values)

Testing Requirements
~~~~~~~~~~~~~~~~~~~~

* **Test Coverage**: New code should maintain or improve test coverage
* **Test Types**: Include unit tests for all new functions
* **Edge Cases**: Test error conditions and edge cases
* **Documentation**: Test examples in docstrings should work

Example test structure::

    import pytest


    def test_calculate_mean():
        """Test mean calculation with various inputs."""
        # Test normal case
        assert calculate_mean([1, 2, 3, 4, 5]) == 3.0

        # Test edge cases
        assert calculate_mean([5]) == 5.0
        assert calculate_mean([1.5, 2.5]) == 2.0

        # Test error conditions
        with pytest.raises(ValueError):
            calculate_mean([])

Quality Checks
~~~~~~~~~~~~~~

Before submitting, ensure all quality checks pass::

    make format-check  # Check code formatting
    make lint          # Check code style
    make type-check    # Check type annotations
    make test          # Run all tests
    make test-cov      # Run tests with coverage report

Or run everything at once::

    make quality

Types of Contributions
----------------------

Bug Reports
~~~~~~~~~~~

When reporting bugs, please include:

* **Clear description** of the issue
* **Steps to reproduce** the problem
* **Expected vs actual behavior**
* **Environment details** (Python version, OS, package version)
* **Minimal code example** that demonstrates the issue

Feature Requests
~~~~~~~~~~~~~~~~

For new features, please:

* **Check existing issues** to avoid duplicates
* **Describe the use case** and why it's needed
* **Provide examples** of how it would be used
* **Consider implementation complexity**

Code Contributions
~~~~~~~~~~~~~~~~~~

We welcome various types of code contributions:

**New Statistical Functions**

* Implement additional statistical tests
* Add new probability distributions
* Extend descriptive statistics

**Performance Improvements**

* Optimize existing algorithms
* Add vectorized operations
* Improve memory efficiency

**Documentation**

* Improve existing documentation
* Add examples and tutorials
* Fix typos and clarify explanations

**Testing**

* Increase test coverage
* Add integration tests
* Improve test quality

**Infrastructure**

* Improve build processes
* Enhance CI/CD pipelines
* Update development tools

Coding Guidelines
-----------------

Function Design
~~~~~~~~~~~~~~~

* **Single Responsibility**: Each function should do one thing well
* **Clear Naming**: Use descriptive names that explain what the function does
* **Input Validation**: Validate inputs and provide clear error messages
* **Educational Value**: Include mathematical explanations in docstrings

Statistical Accuracy
~~~~~~~~~~~~~~~~~~~~

* **Verify Formulas**: Ensure statistical formulas are mathematically correct
* **Test Against Known Values**: Compare results with established statistical software
* **Handle Edge Cases**: Consider what happens with small samples, extreme values, etc.
* **Document Assumptions**: Clearly state any assumptions made by the function

Error Handling
~~~~~~~~~~~~~~

* **Meaningful Messages**: Error messages should help users understand what went wrong
* **Appropriate Exceptions**: Use standard Python exceptions (ValueError, TypeError, etc.)
* **Input Validation**: Check inputs early and provide clear feedback

Example::

    import numpy as np

    if not isinstance(values, (list, tuple, np.ndarray)):
        raise TypeError("Values must be a list, tuple, or numpy array")

    if len(values) == 0:
        raise ValueError("Cannot calculate statistics for empty dataset")

    if not all(isinstance(x, (int, float)) for x in values):
        raise ValueError("All values must be numeric (int or float)")

Documentation Standards
-----------------------

Docstring Format
~~~~~~~~~~~~~~~~

We use Google-style docstrings::

    def function_name(param1: Type1, param2: Type2) -> ReturnType:
        """Brief description of what the function does.

        Longer description if needed, explaining the mathematical
        background or implementation details.

        Args:
            param1: Description of first parameter.
            param2: Description of second parameter.

        Returns:
            Description of return value.

        Raises:
            ExceptionType: Description of when this exception is raised.

        Example:
            >>> function_name(arg1, arg2)
            expected_output

        Note:
            Any additional notes about usage or mathematical background.
        """

Code Comments
~~~~~~~~~~~~~

* **Explain Why**: Comments should explain why something is done, not what is done
* **Mathematical Context**: Explain statistical concepts and formulas
* **Complex Logic**: Break down complex calculations with comments

Release Process
---------------

Version Numbers
~~~~~~~~~~~~~~~

We follow semantic versioning (MAJOR.MINOR.PATCH):

* **MAJOR**: Breaking changes to the API
* **MINOR**: New features, backward compatible
* **PATCH**: Bug fixes, backward compatible

Changelog
~~~~~~~~~

All changes are documented in the changelog with:

* **Added**: New features
* **Changed**: Changes in existing functionality
* **Deprecated**: Soon-to-be removed features
* **Removed**: Removed features
* **Fixed**: Bug fixes
* **Security**: Security improvements

Getting Help
------------

If you need help with contributing:

* **Check Documentation**: Read through this guide and the API documentation
* **Ask Questions**: Open a GitHub issue with the "question" label
* **Join Discussions**: Participate in GitHub discussions
* **Review Examples**: Look at existing code for patterns and style

Communication
-------------

* **Be Respectful**: Follow our code of conduct
* **Be Patient**: Maintainers review contributions in their spare time
* **Be Descriptive**: Provide clear descriptions in issues and pull requests
* **Be Collaborative**: We're all working together to improve the project

Recognition
-----------

Contributors are recognized in:

* **README**: Major contributors listed
* **Changelog**: Contributors credited for their changes
* **Documentation**: Authors acknowledged in relevant sections

Thank you for contributing to Real Simple Stats!
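
Appendix: Sketch of a Contribution
----------------------------------

The Statistical Accuracy and Error Handling guidelines above can be illustrated with a small sketch. The function ``calculate_sample_variance`` and the helper ``matches_reference`` are hypothetical names chosen for this example and are not part of Real Simple Stats. The sketch assumes that Python's standard ``statistics`` module is an acceptable reference for the "Test Against Known Values" check; a real contribution might compare against other established software instead.

```python
import math
import statistics
from typing import Sequence


def calculate_sample_variance(values: Sequence[float]) -> float:
    """Calculate the sample variance of a sequence of values.

    Hypothetical example function, not part of Real Simple Stats.
    Uses the n - 1 denominator (Bessel's correction).

    Args:
        values: Numeric values. Must contain at least two values.

    Returns:
        The sample variance of the input values.

    Raises:
        ValueError: If fewer than two values are given.
        TypeError: If any value is not an int or float.
    """
    # Check inputs early so the caller gets clear feedback.
    if len(values) < 2:
        raise ValueError("Sample variance requires at least two values")
    for x in values:
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(x, bool) or not isinstance(x, (int, float)):
            raise TypeError("All values must be numeric (int or float)")

    mean = sum(values) / len(values)
    # Sum of squared deviations divided by n - 1.
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


def matches_reference(values: Sequence[float]) -> bool:
    """Compare the result with the standard library's sample variance."""
    expected = statistics.variance(values)
    return math.isclose(calculate_sample_variance(values), expected)
```

Comparing with ``math.isclose`` rather than ``==`` avoids spurious failures from floating-point rounding when the two implementations sum in a different order.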