|
|
- =====================
- Contributing to SciPy
- =====================
-
- This document aims to give an overview of how to contribute to SciPy. It
- tries to answer commonly asked questions, and provide some insight into how the
- community process works in practice. Readers who are familiar with the SciPy
- community and are experienced Python coders may want to jump straight to the
- `git workflow`_ documentation.
-
- There are a lot of ways you can contribute:
-
- - Contributing new code
- - Fixing bugs and other maintenance work
- - Improving the documentation
- - Reviewing open pull requests
- - Triaging issues
- - Working on the `scipy.org`_ website
- - Answering questions and participating on the scipy-dev and scipy-user
- `mailing lists`_.
-
- Contributing new code
- =====================
-
- If you have been working with the scientific Python toolstack for a while, you
- probably have some code lying around of which you think "this could be useful
- for others too". Perhaps it's a good idea then to contribute it to SciPy or
- another open source project. The first question to ask is then, where does
- this code belong? That question is hard to answer here, so we start with a
- more specific one: *what code is suitable for putting into SciPy?*
- Almost all of the new code added to scipy has in common that it's potentially
- useful in multiple scientific domains and it fits in the scope of existing
- scipy submodules. In principle new submodules can be added too, but this is
- far less common. For code that is specific to a single application, there may
- be an existing project that can use the code. Some scikits (`scikit-learn`_,
- `scikit-image`_, `statsmodels`_, etc.) are good examples here; they have a
- narrower focus and because of that more domain-specific code than SciPy.
-
- Now if you have code that you would like to see included in SciPy, how do you
- go about it? After checking that your code can be distributed in SciPy under a
- compatible license (see FAQ for details), the first step is to discuss on the
- scipy-dev mailing list. All new features, as well as changes to existing code,
- are discussed and decided on there. You can, and probably should, already
- start this discussion before your code is finished.
-
- Assuming the outcome of the discussion on the mailing list is positive and you
- have a function or piece of code that does what you need it to do, what next?
- Before code is added to SciPy, it at least has to have good documentation, unit
- tests and correct code style.
-
- 1. Unit tests
- In principle you should aim to create unit tests that exercise all the code
- that you are adding. This gives some degree of confidence that your code
- runs correctly, also on Python versions and hardware or OSes that you don't
- have available yourself. An extensive description of how to write unit
- tests is given in the NumPy `testing guidelines`_.
-
- 2. Documentation
- Clear and complete documentation is essential in order for users to be able
- to find and understand the code. Documentation for individual functions
- and classes -- which includes at least a basic description, type and
- meaning of all parameters and returns values, and usage examples in
- `doctest`_ format -- is put in docstrings. Those docstrings can be read
- within the interpreter, and are compiled into a reference guide in html and
- pdf format. Higher-level documentation for key (areas of) functionality is
- provided in tutorial format and/or in module docstrings. A guide on how to
- write documentation is given in `how to document`_.
-
- 3. Code style
- Uniformity of style in which code is written is important to others trying
- to understand the code. SciPy follows the standard Python guidelines for
- code style, `PEP8`_. In order to check that your code conforms to PEP8,
- you can use the `pep8 package`_ style checker. Most IDEs and text editors
- have settings that can help you follow PEP8, for example by translating
- tabs by four spaces. Using `pyflakes`_ to check your code is also a good
- idea.
-
- At the end of this document a checklist is given that may help to check if your
- code fulfills all requirements for inclusion in SciPy.
-
- Another question you may have is: *where exactly do I put my code*? To answer
- this, it is useful to understand how the SciPy public API (application
- programming interface) is defined. For most modules the API is two levels
- deep, which means your new function should appear as
- ``scipy.submodule.my_new_func``. ``my_new_func`` can be put in an existing or
- new file under ``/scipy/<submodule>/``, its name is added to the ``__all__``
- list in that file (which lists all public functions in the file), and those
- public functions are then imported in ``/scipy/<submodule>/__init__.py``. Any
- private functions/classes should have a leading underscore (``_``) in their
- name. A more detailed description of what the public API of SciPy is, is given
- in `SciPy API`_.
-
- Once you think your code is ready for inclusion in SciPy, you can send a pull
- request (PR) on Github. We won't go into the details of how to work with git
- here, this is described well in the `git workflow`_ section of the NumPy
- documentation and on the `Github help pages`_. When you send the PR for a new
- feature, be sure to also mention this on the scipy-dev mailing list. This can
- prompt interested people to help review your PR. Assuming that you already got
- positive feedback before on the general idea of your code/feature, the purpose
- of the code review is to ensure that the code is correct, efficient and meets
- the requirements outlined above. In many cases the code review happens
- relatively quickly, but it's possible that it stalls. If you have addressed
- all feedback already given, it's perfectly fine to ask on the mailing list
- again for review (after a reasonable amount of time, say a couple of weeks, has
- passed). Once the review is completed, the PR is merged into the "master"
- branch of SciPy.
-
- The above describes the requirements and process for adding code to SciPy. It
- doesn't yet answer the question though how decisions are made exactly. The
- basic answer is: decisions are made by consensus, by everyone who chooses to
- participate in the discussion on the mailing list. This includes developers,
- other users and yourself. Aiming for consensus in the discussion is important
- -- SciPy is a project by and for the scientific Python community. In those
- rare cases that agreement cannot be reached, the maintainers of the module
- in question can decide the issue.
-
-
- Contributing by helping maintain existing code
- ==============================================
-
- The previous section talked specifically about adding new functionality to
- SciPy. A large part of that discussion also applies to maintenance of existing
- code. Maintenance means fixing bugs, improving code quality or style,
- documenting existing functionality better, adding missing unit tests, keeping
- build scripts up-to-date, etc. The SciPy `issue list`_ contains all
- reported bugs, build/documentation issues, etc. Fixing issues
- helps improve the overall quality of SciPy, and is also a good way
- of getting familiar with the project. You may also want to fix a bug because
- you ran into it and need the function in question to work correctly.
-
- The discussion on code style and unit testing above applies equally to bug
- fixes. It is usually best to start by writing a unit test that shows the
- problem, i.e. it should pass but doesn't. Once you have that, you can fix the
- code so that the test does pass. That should be enough to send a PR for this
- issue. Unlike when adding new code, discussing this on the mailing list may
- not be necessary - if the old behavior of the code is clearly incorrect, no one
- will object to having it fixed. It may be necessary to add some warning or
- deprecation message for the changed behavior. This should be part of the
- review process.
-
-
- Reviewing pull requests
- =======================
-
- Reviewing open pull requests (PRs) is very welcome, and a valuable way to help
- increase the speed at which the project moves forward. If you have specific
- knowledge/experience in a particular area (say "optimization algorithms" or
- "special functions") then reviewing PRs in that area is especially valuable -
- sometimes PRs with technical code have to wait for a long time to get merged
- due to a shortage of appropriate reviewers.
-
- We encourage everyone to get involved in the review process; it's also a
- great way to get familiar with the code base. Reviewers should ask
- themselves some or all of the following questions:
-
- - Was this change adequately discussed (relevant for new features and changes
- in existing behavior)?
- - Is the feature scientifically sound? Algorithms may be known to work based on
- literature; otherwise, closer look at correctness is valuable.
- - Is the intended behavior clear under all conditions (e.g. unexpected inputs
- like empty arrays or nan/inf values)?
- - Does the code meet the quality, test and documentation expectation outline
- under `Contributing new code`_?
-
- If we do not know you yet, consider introducing yourself.
-
-
- Other ways to contribute
- ========================
-
- There are many ways to contribute other than contributing code.
-
- Triaging issues (investigating bug reports for validity and possible actions to
- take) is also a useful activity. SciPy has many hundreds of open issues;
- closing invalid ones and correctly labeling valid ones (ideally with some first
- thoughts in a comment) allows prioritizing maintenance work and finding related
- issues easily when working on an existing function or submodule.
-
- Participating in discussions on the scipy-user and scipy-dev `mailing lists`_ is
- a contribution in itself. Everyone who writes to those lists with a problem or
- an idea would like to get responses, and writing such responses makes the
- project and community function better and appear more welcoming.
-
- The `scipy.org`_ website contains a lot of information on both SciPy the
- project and SciPy the community, and it can always use a new pair of hands.
- The sources for the website live in their own separate repo:
- https://github.com/scipy/scipy.org
-
-
- Recommended development setup
- =============================
-
- Since Scipy contains parts written in C, C++, and Fortran that need to be
- compiled before use, make sure you have the necessary compilers and Python
- development headers installed. Having compiled code also means that importing
- Scipy from the development sources needs some additional steps, which are
- explained below.
-
- First fork a copy of the main Scipy repository in Github onto your own
- account and then create your local repository via::
-
- $ git clone git@github.com:YOURUSERNAME/scipy.git scipy
- $ cd scipy
- $ git remote add upstream git://github.com/scipy/scipy.git
-
- To build the development version of Scipy and run tests, spawn
- interactive shells with the Python import paths properly set up etc.,
- do one of::
-
- $ python runtests.py -v
- $ python runtests.py -v -s optimize
- $ python runtests.py -v -t scipy.special.tests.test_basic::test_xlogy
- $ python runtests.py --ipython
- $ python runtests.py --python somescript.py
- $ python runtests.py --bench
-
- This builds Scipy first, so the first time it may take some time. If
- you specify ``-n``, the tests are run against the version of Scipy (if
- any) found on current PYTHONPATH. *Note: if you run into a build issue,
- more detailed build documentation can be found in :doc:`building/index` and at
- https://github.com/scipy/scipy/tree/master/doc/source/building*
-
- Using ``runtests.py`` is the recommended approach to running tests.
- There are also a number of alternatives to it, for example in-place
- build or installing to a virtualenv. See the FAQ below for details.
-
- Some of the tests in Scipy are very slow and need to be separately
- enabled. See the FAQ below for details.
-
-
- SciPy structure
- ===============
-
- All SciPy modules should follow the following conventions. In the
- following, a *SciPy module* is defined as a Python package, say
- ``yyy``, that is located in the scipy/ directory.
-
- * Ideally, each SciPy module should be as self-contained as possible.
- That is, it should have minimal dependencies on other packages or
- modules. Even dependencies on other SciPy modules should be kept to
- a minimum. A dependency on NumPy is of course assumed.
-
- * Directory ``yyy/`` contains:
-
- - A file ``setup.py`` that defines
- ``configuration(parent_package='',top_path=None)`` function
- for `numpy.distutils`.
-
- - A directory ``tests/`` that contains files ``test_<name>.py``
- corresponding to modules ``yyy/<name>{.py,.so,/}``.
-
- * Private modules should be prefixed with an underscore ``_``,
- for instance ``yyy/_somemodule.py``.
-
- * User-visible functions should have good documentation following
- the Numpy documentation style, see `how to document`_
-
- * The ``__init__.py`` of the module should contain the main reference
- documentation in its docstring. This is connected to the Sphinx
- documentation under ``doc/`` via Sphinx's automodule directive.
-
- The reference documentation should first give a categorized list of
- the contents of the module using ``autosummary::`` directives, and
- after that explain points essential for understanding the use of the
- module.
-
- Tutorial-style documentation with extensive examples should be
- separate, and put under ``doc/source/tutorial/``
-
- See the existing Scipy submodules for guidance.
-
- For further details on Numpy distutils, see:
-
- https://github.com/numpy/numpy/blob/master/doc/DISTUTILS.rst.txt
-
-
- Useful links, FAQ, checklist
- ============================
-
- Checklist before submitting a PR
- --------------------------------
-
- - Are there unit tests with good code coverage?
- - Do all public function have docstrings including examples?
- - Is the code style correct (PEP8, pyflakes)
- - Is the commit message `formatted correctly`_?
- - Is the new functionality tagged with ``.. versionadded:: X.Y.Z`` (with
- X.Y.Z the version number of the next release - can be found in setup.py)?
- - Is the new functionality mentioned in the release notes of the next
- release?
- - Is the new functionality added to the reference guide?
- - In case of larger additions, is there a tutorial or more extensive
- module-level description?
- - In case compiled code is added, is it integrated correctly via setup.py
- (and preferably also Bento configuration files - bento.info and bscript)?
- - If you are a first-time contributor, did you add yourself to THANKS.txt?
- Please note that this is perfectly normal and desirable - the aim is to
- give every single contributor credit, and if you don't add yourself it's
- simply extra work for the reviewer (or worse, the reviewer may forget).
- - Did you check that the code can be distributed under a BSD license?
-
-
- Useful SciPy documents
- ----------------------
-
- - The `how to document`_ guidelines
- - NumPy/SciPy `testing guidelines`_
- - `SciPy API`_
- - The `SciPy Roadmap`_
- - NumPy/SciPy `git workflow`_
- - How to submit a good `bug report`_
-
-
- FAQ
- ---
-
- *I based my code on existing Matlab/R/... code I found online, is this OK?*
-
- It depends. SciPy is distributed under a BSD license, so if the code that you
- based your code on is also BSD licensed or has a BSD-compatible license (e.g.
- MIT, PSF) then it's OK. Code which is GPL or Apache licensed, has no
- clear license, requires citation or is free for academic use only can't be
- included in SciPy. Therefore if you copied existing code with such a license
- or made a direct translation to Python of it, your code can't be included.
- If you're unsure, please ask on the scipy-dev mailing list.
-
- *Why is SciPy under the BSD license and not, say, the GPL?*
-
- Like Python, SciPy uses a "permissive" open source license, which allows
- proprietary re-use. While this allows companies to use and modify the software
- without giving anything back, it is felt that the larger user base results in
- more contributions overall, and companies often publish their modifications
- anyway, without being required to. See John Hunter's `BSD pitch`_.
-
-
- *How do I set up a development version of SciPy in parallel to a released
- version that I use to do my job/research?*
-
- One simple way to achieve this is to install the released version in
- site-packages, by using a binary installer or pip for example, and set
- up the development version in a virtualenv. First install
- `virtualenv`_ (optionally use `virtualenvwrapper`_), then create your
- virtualenv (named scipy-dev here) with::
-
- $ virtualenv scipy-dev
-
- Now, whenever you want to switch to the virtual environment, you can use the
- command ``source scipy-dev/bin/activate``, and ``deactivate`` to exit from the
- virtual environment and back to your previous shell. With scipy-dev
- activated, install first Scipy's dependencies::
-
- $ pip install Numpy pytest Cython
-
- After that, you can install a development version of Scipy, for example via::
-
- $ python setup.py install
-
- The installation goes to the virtual environment.
-
-
- *How do I set up an in-place build for development*
-
- For development, you can set up an in-place build so that changes made to
- ``.py`` files have effect without rebuild. First, run::
-
- $ python setup.py build_ext -i
-
- Then you need to point your PYTHONPATH environment variable to this directory.
- Some IDEs (Spyder for example) have utilities to manage PYTHONPATH. On Linux
- and OSX, you can run the command::
-
- $ export PYTHONPATH=$PWD
-
- and on Windows
-
- $ set PYTHONPATH=/path/to/scipy
-
- Now editing a Python source file in SciPy allows you to immediately
- test and use your changes (in ``.py`` files), by simply restarting the
- interpreter.
-
-
- *Are there any video examples for installing from source, setting up a
- development environment, etc...?*
-
- Currently, there are two video demonstrations for Anaconda Python on macOS:
-
- `Anaconda SciPy Dev Part I (macOS)`_ is a four-minute
- overview of installing Anaconda, building SciPy from source, and testing
- changes made to SciPy from the Spyder IDE.
-
- `Anaconda SciPy Dev Part II (macOS)`_ shows how to use
- a virtual environment to easily switch between the "pre-built version" of SciPy
- installed with Anaconda and your "source-built version" of SciPy created
- according to Part I.
-
-
- *Are there any video examples of the basic development workflow?*
-
- `SciPy Development Workflow`_ is a five-minute example of fixing a bug and
- submitting a pull request. While it's intended as a followup to
- `Anaconda SciPy Dev Part I (macOS)`_ and `Anaconda SciPy Dev Part II (macOS)`_,
- the process is similar for other development setups.
-
-
- *Can I use a programming language other than Python to speed up my code?*
-
- Yes. The languages used in SciPy are Python, Cython, C, C++ and Fortran. All
- of these have their pros and cons. If Python really doesn't offer enough
- performance, one of those languages can be used. Important concerns when
- using compiled languages are maintainability and portability. For
- maintainability, Cython is clearly preferred over C/C++/Fortran. Cython and C
- are more portable than C++/Fortran. A lot of the existing C and Fortran code
- in SciPy is older, battle-tested code that was only wrapped in (but not
- specifically written for) Python/SciPy. Therefore the basic advice is: use
- Cython. If there's specific reasons why C/C++/Fortran should be preferred,
- please discuss those reasons first.
-
-
- *How do I debug code written in C/C++/Fortran inside Scipy?*
-
- The easiest way to do this is to first write a Python script that
- invokes the C code whose execution you want to debug. For instance
- ``mytest.py``::
-
- from scipy.special import hyp2f1
- print(hyp2f1(5.0, 1.0, -1.8, 0.95))
-
- Now, you can run::
-
- gdb --args python runtests.py -g --python mytest.py
-
- If you didn't compile with debug symbols enabled before, remove the
- ``build`` directory first. While in the debugger::
-
- (gdb) break cephes_hyp2f1
- (gdb) run
-
- The execution will now stop at the corresponding C function and you
- can step through it as usual. Instead of plain ``gdb`` you can of
- course use your favourite alternative debugger; run it on the
- ``python`` binary with arguments ``runtests.py -g --python mytest.py``.
-
-
- *How do I enable additional tests in Scipy?*
-
- Some of the tests in Scipy's test suite are very slow and not enabled
- by default. You can run the full suite via::
-
- $ python runtests.py -g -m full
-
- This invokes the test suite ``import scipy; scipy.test("full")``,
- enabling also slow tests.
-
- There is an additional level of very slow tests (several minutes),
- which are disabled also in this case. They can be enabled by setting
- the environment variable ``SCIPY_XSLOW=1`` before running the test
- suite.
-
-
- .. _scikit-learn: http://scikit-learn.org
-
- .. _scikit-image: http://scikit-image.org/
-
- .. _statsmodels: http://statsmodels.sourceforge.net/
-
- .. _testing guidelines: https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt
-
- .. _formatted correctly: http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#writing-the-commit-message
-
- .. _how to document: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
-
- .. _bug report: http://scipy.org/bug-report.html
-
- .. _PEP8: http://www.python.org/dev/peps/pep-0008/
-
- .. _pep8 package: http://pypi.python.org/pypi/pep8
-
- .. _pyflakes: http://pypi.python.org/pypi/pyflakes
-
- .. _SciPy API: https://docs.scipy.org/doc/scipy/reference/api.html
-
- .. _SciPy Roadmap: https://scipy.github.io/devdocs/roadmap.html
-
- .. _git workflow: http://docs.scipy.org/doc/numpy/dev/gitwash/index.html
-
- .. _Github help pages: https://help.github.com/articles/set-up-git/
-
- .. _issue list: https://github.com/scipy/scipy/issues
-
- .. _Github: https://github.com/scipy/scipy
-
- .. _scipy.org: https://scipy.org/
-
- .. _scipy.github.com: http://scipy.github.com/
-
- .. _scipy.org-new: https://github.com/scipy/scipy.org-new
-
- .. _documentation wiki: https://docs.scipy.org/scipy/Front%20Page/
-
- .. _SciPy Central: http://scipy-central.org/
-
- .. _doctest: http://www.doughellmann.com/PyMOTW/doctest/
-
- .. _virtualenv: http://www.virtualenv.org/
-
- .. _virtualenvwrapper: http://www.doughellmann.com/projects/virtualenvwrapper/
-
- .. _bsd pitch: http://nipy.sourceforge.net/nipy/stable/faq/johns_bsd_pitch.html
-
- .. _Pytest: https://pytest.org/
-
- .. _mailing lists: https://www.scipy.org/scipylib/mailing-lists.html
-
- .. _Anaconda SciPy Dev Part I (macOS): https://youtu.be/1rPOSNd0ULI
-
- .. _Anaconda SciPy Dev Part II (macOS): https://youtu.be/Faz29u5xIZc
-
- .. _SciPy Development Workflow: https://youtu.be/HgU01gJbzMY
|