Contributing to Real Simple Stats
==================================

We welcome contributions to Real Simple Stats! This guide explains the standards and process for contributing to the project.


Code Quality Standards
----------------------

We maintain high code quality standards. All contributions must meet these requirements:

Code Style
~~~~~~~~~~

* **Formatting**: Code is automatically formatted with Black (88-character line length)
* **Linting**: Must pass Flake8 linting with our configuration
* **Type Hints**: All functions must have comprehensive type annotations
* **Docstrings**: All public functions must have Google-style docstrings

Example of a properly formatted function::

    from typing import List


    def calculate_mean(values: List[float]) -> float:
        """Calculate the arithmetic mean of a list of values.

        Args:
            values: List of numeric values to calculate the mean for.
                Must contain at least one value.

        Returns:
            The arithmetic mean of the input values.

        Raises:
            ValueError: If the input list is empty.

        Example:
            >>> calculate_mean([1, 2, 3, 4, 5])
            3.0
        """
        if not values:
            raise ValueError("Cannot calculate mean of empty list")
        return sum(values) / len(values)

Testing Requirements
~~~~~~~~~~~~~~~~~~~~

* **Test Coverage**: New code should maintain or improve test coverage
* **Test Types**: Include unit tests for all new functions
* **Edge Cases**: Test error conditions and edge cases
* **Documentation**: Examples in docstrings must run and produce the output shown

Example test structure::

    import pytest


    def test_calculate_mean():
        """Test mean calculation with various inputs."""
        # Test normal case
        assert calculate_mean([1, 2, 3, 4, 5]) == 3.0

        # Test edge cases: single value and non-integer inputs
        assert calculate_mean([5]) == 5.0
        assert calculate_mean([1.5, 2.5]) == 2.0

        # Test error conditions: an empty list must raise ValueError
        with pytest.raises(ValueError):
            calculate_mean([])
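
The requirement that docstring examples work can be checked mechanically. The following is a minimal sketch, not the project's actual test setup: it redefines ``calculate_mean`` so the snippet is self-contained, and ``count_doctest_failures`` is a hypothetical helper that uses the standard-library ``doctest`` module to run the docstring examples.

```python
import doctest
from typing import List


def calculate_mean(values: List[float]) -> float:
    """Calculate the arithmetic mean of a list of values.

    Example:
        >>> calculate_mean([1, 2, 3, 4, 5])
        3.0
    """
    if not values:
        raise ValueError("Cannot calculate mean of empty list")
    return sum(values) / len(values)


def count_doctest_failures() -> int:
    """Hypothetical helper: run calculate_mean's doctests, return failures."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failures = 0
    # Pass globs explicitly so the examples can resolve calculate_mean
    # regardless of how this snippet is executed.
    for test in finder.find(
        calculate_mean,
        name="calculate_mean",
        globs={"calculate_mean": calculate_mean},
    ):
        failures += runner.run(test).failed
    return failures
```

A result of ``0`` means every docstring example produced the output shown. pytest can also collect docstring examples through its ``--doctest-modules`` option; whether the project's ``make test`` target enables it is not stated in this guide.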

Quality Checks
~~~~~~~~~~~~~~

Before submitting, ensure all quality checks pass::

    make format-check  # Check code formatting
    make lint          # Check code style
    make type-check    # Check type annotations
    make test          # Run all tests
    make test-cov      # Run tests with coverage report

Or run everything at once::

    make quality

Types of Contributions
----------------------

Bug Reports
~~~~~~~~~~~

When reporting bugs, please include:

* **Clear description** of the issue
* **Steps to reproduce** the problem
* **Expected vs actual behavior**
* **Environment details** (Python version, OS, package version)
* **Minimal code example** that demonstrates the issue

Feature Requests
~~~~~~~~~~~~~~~~

For new features, please:

* **Check existing issues** to avoid duplicates
* **Describe the use case** and why it's needed
* **Provide examples** of how it would be used
* **Consider implementation complexity**

Code Contributions
~~~~~~~~~~~~~~~~~~

We welcome various types of code contributions:

**New Statistical Functions**
    * Implement additional statistical tests
    * Add new probability distributions
    * Extend descriptive statistics

**Performance Improvements**
    * Optimize existing algorithms
    * Add vectorized operations
    * Improve memory efficiency

**Documentation**
    * Improve existing documentation
    * Add examples and tutorials
    * Fix typos and clarify explanations

**Testing**
    * Increase test coverage
    * Add integration tests
    * Improve test quality

**Infrastructure**
    * Improve build processes
    * Enhance CI/CD pipelines
    * Update development tools

Coding Guidelines
-----------------

Function Design
~~~~~~~~~~~~~~~

* **Single Responsibility**: Each function should do one thing well
* **Clear Naming**: Use descriptive names that explain what the function does
* **Input Validation**: Validate inputs and provide clear error messages
* **Educational Value**: Include mathematical explanations in docstrings

Statistical Accuracy
~~~~~~~~~~~~~~~~~~~~

* **Verify Formulas**: Ensure statistical formulas are mathematically correct
* **Test Against Known Values**: Compare results with established statistical software
* **Handle Edge Cases**: Consider what happens with small samples, extreme values, etc.
* **Document Assumptions**: Clearly state any assumptions made by the function
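
One way to apply these checks is to compare a new function against an independent reference implementation. The sketch below is illustrative only: ``sample_variance`` is a hypothetical function, not part of the Real Simple Stats API, and the standard-library ``statistics.variance`` stands in for established statistical software.

```python
import math
import statistics
from typing import Sequence


def sample_variance(values: Sequence[float]) -> float:
    """Hypothetical example: unbiased sample variance.

    Computes s^2 = sum((x - mean)^2) / (n - 1).

    Assumes the values are a sample rather than a full population,
    which is why the denominator is n - 1 and not n.
    """
    n = len(values)
    if n < 2:
        # Edge case: the n - 1 denominator is zero for a single observation.
        raise ValueError("Sample variance requires at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def test_sample_variance_matches_reference() -> None:
    """Compare against the standard library as an independent reference."""
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    # Use a tolerance: floating-point results rarely match bit for bit.
    assert math.isclose(
        sample_variance(data), statistics.variance(data), rel_tol=1e-12
    )
```

The docstring states the sample-versus-population assumption explicitly, and the small-sample case raises a clear error instead of dividing by zero.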

Error Handling
~~~~~~~~~~~~~~

* **Meaningful Messages**: Error messages should help users understand what went wrong
* **Appropriate Exceptions**: Use standard Python exceptions (ValueError, TypeError, etc.)
* **Input Validation**: Check inputs early and provide clear feedback

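Example::

    import numbers

    import numpy as np

    # Check the container type first so later checks can assume a sequence
    if not isinstance(values, (list, tuple, np.ndarray)):
        raise TypeError("Values must be a list, tuple, or numpy array")

    if len(values) == 0:
        raise ValueError("Cannot calculate statistics for empty dataset")

    # numbers.Real also accepts NumPy scalars such as np.int64 and np.float64
    if not all(isinstance(x, numbers.Real) for x in values):
        raise ValueError("All values must be numeric (int or float)")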

Documentation Standards
-----------------------

Docstring Format
~~~~~~~~~~~~~~~~

We use Google-style docstrings::

    def function_name(param1: Type1, param2: Type2) -> ReturnType:
        """Brief description of what the function does.

        Longer description if needed, explaining the mathematical
        background or implementation details.

        Args:
            param1: Description of first parameter.
            param2: Description of second parameter.

        Returns:
            Description of return value.

        Raises:
            ExceptionType: Description of when this exception is raised.

        Example:
            >>> function_name(arg1, arg2)
            expected_output

        Note:
            Any additional notes about usage or mathematical background.
        """

Code Comments
~~~~~~~~~~~~~

* **Explain Why**: Comments should explain why something is done, not what is done
* **Mathematical Context**: Explain statistical concepts and formulas
* **Complex Logic**: Break down complex calculations with comments
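
As a short illustration of these points, the sketch below comments on the reasoning behind each step, not on the syntax. ``standard_error`` is a hypothetical function used only for this example and is not taken from the Real Simple Stats codebase.

```python
import math
from typing import Sequence


def standard_error(values: Sequence[float]) -> float:
    """Hypothetical example: standard error of the mean."""
    n = len(values)
    if n < 2:
        raise ValueError("Standard error requires at least two values")
    mean = sum(values) / n
    # Divide by n - 1 (Bessel's correction): the mean is estimated from the
    # same sample, which uses up one degree of freedom.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    # SE = s / sqrt(n), written as sqrt(s^2 / n) to take one square root.
    return math.sqrt(variance / n)
```

A comment such as ``# divide by n - 1`` would only restate the code; naming Bessel's correction tells the reader why the denominator is not ``n``.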

Release Process
---------------

Version Numbers
~~~~~~~~~~~~~~~

We follow semantic versioning (MAJOR.MINOR.PATCH):

* **MAJOR**: Breaking changes to the API
* **MINOR**: New features, backward compatible
* **PATCH**: Bug fixes, backward compatible
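
The bump rules can be sketched in a few lines. ``bump_version`` is a hypothetical helper written for this guide, not a tool the project ships, and it assumes plain ``MAJOR.MINOR.PATCH`` strings without pre-release or build metadata.

```python
def bump_version(version: str, part: str) -> str:
    """Hypothetical helper: return the version after bumping one part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset the lower-order parts to zero.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible feature: reset only the patch number.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")
```

For example, a bug fix released on top of ``1.4.2`` becomes ``1.4.3``, while a breaking API change becomes ``2.0.0``.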

Changelog
~~~~~~~~~

All changes are documented in the changelog under one of these categories:

* **Added**: New features
* **Changed**: Changes in existing functionality
* **Deprecated**: Soon-to-be removed features
* **Removed**: Removed features
* **Fixed**: Bug fixes
* **Security**: Security improvements

Getting Help
------------

If you need help with contributing:

* **Check Documentation**: Read through this guide and the API documentation
* **Ask Questions**: Open a GitHub issue with the "question" label
* **Join Discussions**: Participate in GitHub discussions
* **Review Examples**: Look at existing code for patterns and style

Communication
-------------

* **Be Respectful**: Follow our code of conduct
* **Be Patient**: Maintainers review contributions in their spare time
* **Be Descriptive**: Provide clear descriptions in issues and pull requests
* **Be Collaborative**: We're all working together to improve the project

Recognition
-----------

Contributors are recognized in:

* **README**: Major contributors listed
* **Changelog**: Contributors credited for their changes
* **Documentation**: Authors acknowledged in relevant sections

Thank you for contributing to Real Simple Stats!