|
=====================
Contributing to SciPy
=====================

This document aims to give an overview of how to contribute to SciPy. It
tries to answer commonly asked questions and to provide some insight into how
the community process works in practice. Readers who are familiar with the
SciPy community and are experienced Python coders may want to jump straight
to the `git workflow`_ documentation.

There are many ways you can contribute:

- Contributing new code
- Fixing bugs and other maintenance work
- Improving the documentation
- Reviewing open pull requests
- Triaging issues
- Working on the `scipy.org`_ website
- Answering questions and participating on the scipy-dev and scipy-user
  `mailing lists`_.
|
|
|
|
Contributing new code
=====================

If you have been working with the scientific Python toolstack for a while, you
probably have some code lying around of which you think "this could be useful
for others too". Perhaps it's a good idea then to contribute it to SciPy or
another open source project. The first question to ask is then, where does
this code belong? That question is hard to answer here, so we start with a
more specific one: *what code is suitable for putting into SciPy?*
Almost all of the new code added to SciPy has in common that it's potentially
useful in multiple scientific domains and it fits in the scope of existing
SciPy submodules. In principle new submodules can be added too, but this is
far less common. For code that is specific to a single application, there may
be an existing project that can use the code. Some scikits (`scikit-learn`_,
`scikit-image`_, `statsmodels`_, etc.) are good examples here; they have a
narrower focus and because of that more domain-specific code than SciPy.

Now if you have code that you would like to see included in SciPy, how do you
go about it? After checking that your code can be distributed in SciPy under a
compatible license (see FAQ for details), the first step is to discuss it on
the scipy-dev mailing list. All new features, as well as changes to existing
code, are discussed and decided on there. You can, and probably should, start
this discussion before your code is finished.

Assuming the outcome of the discussion on the mailing list is positive and you
have a function or piece of code that does what you need it to do, what next?
Before code is added to SciPy, it at least has to have good documentation, unit
tests and correct code style.

1. Unit tests
    In principle you should aim to create unit tests that exercise all the code
    that you are adding. This gives some degree of confidence that your code
    runs correctly, also on Python versions and hardware or OSes that you don't
    have available yourself. An extensive description of how to write unit
    tests is given in the NumPy `testing guidelines`_.

2. Documentation
    Clear and complete documentation is essential in order for users to be able
    to find and understand the code. Documentation for individual functions
    and classes -- which includes at least a basic description, type and
    meaning of all parameters and return values, and usage examples in
    `doctest`_ format -- is put in docstrings. Those docstrings can be read
    within the interpreter, and are compiled into a reference guide in html and
    pdf format. Higher-level documentation for key (areas of) functionality is
    provided in tutorial format and/or in module docstrings. A guide on how to
    write documentation is given in `how to document`_.

3. Code style
    Uniformity of style in which code is written is important to others trying
    to understand the code. SciPy follows the standard Python guidelines for
    code style, `PEP8`_. In order to check that your code conforms to PEP8,
    you can use the `pep8 package`_ style checker. Most IDEs and text editors
    have settings that can help you follow PEP8, for example by converting
    tabs to four spaces. Using `pyflakes`_ to check your code is also a good
    idea.
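The three requirements above can be illustrated with a small, self-contained
sketch. The function ``rms`` and its tests are hypothetical and are not part of
SciPy; the example uses only the standard library so that it runs anywhere,
whereas tests in SciPy itself would normally be written for pytest and the
NumPy testing utilities described in the `testing guidelines`_.

```python
import doctest
import math


def rms(values):
    """
    Compute the root mean square of a sequence of numbers.

    Parameters
    ----------
    values : sequence of float
        Input data. Must contain at least one element.

    Returns
    -------
    float
        The square root of the mean of the squared values.

    Examples
    --------
    >>> rms([1.0, 7.0])
    5.0
    """
    values = list(values)
    if not values:
        raise ValueError("rms requires at least one value")
    return math.sqrt(sum(v * v for v in values) / len(values))


def test_rms_known_value():
    # Hand-computed case: sqrt((1 + 49) / 2) == 5.
    assert math.isclose(rms([1.0, 7.0]), 5.0)


def test_rms_empty_input_raises():
    # The documented error for empty input is part of the tested behavior.
    try:
        rms([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")


if __name__ == "__main__":
    test_rms_known_value()
    test_rms_empty_input_raises()
    # Also execute the docstring example; failures would be printed.
    doctest.run_docstring_examples(rms, {"rms": rms})
```

The docstring example doubles as a doctest, so the documentation and the
tested behavior cannot silently drift apart.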
|
|
|
|
At the end of this document a checklist is given that may help to check if your
code fulfills all requirements for inclusion in SciPy.

Another question you may have is: *where exactly do I put my code*? To answer
this, it is useful to understand how the SciPy public API (application
programming interface) is defined. For most modules the API is two levels
deep, which means your new function should appear as
``scipy.submodule.my_new_func``. ``my_new_func`` can be put in an existing or
new file under ``/scipy/<submodule>/``, its name is added to the ``__all__``
list in that file (which lists all public functions in the file), and those
public functions are then imported in ``/scipy/<submodule>/__init__.py``. Any
private functions/classes should have a leading underscore (``_``) in their
name. A more detailed description of the public API of SciPy is given in
`SciPy API`_.
|
|
|
|
Once you think your code is ready for inclusion in SciPy, you can send a pull
request (PR) on GitHub. We won't go into the details of how to work with git
here; this is described well in the `git workflow`_ section of the NumPy
documentation and on the `Github help pages`_. When you send the PR for a new
feature, be sure to also mention this on the scipy-dev mailing list. This can
prompt interested people to help review your PR. Assuming that you already got
positive feedback before on the general idea of your code/feature, the purpose
of the code review is to ensure that the code is correct, efficient and meets
the requirements outlined above. In many cases the code review happens
relatively quickly, but it's possible that it stalls. If you have addressed
all feedback already given, it's perfectly fine to ask on the mailing list
again for review (after a reasonable amount of time, say a couple of weeks, has
passed). Once the review is completed, the PR is merged into the "master"
branch of SciPy.

The above describes the requirements and process for adding code to SciPy. It
doesn't yet answer the question of how exactly decisions are made. The
basic answer is: decisions are made by consensus, by everyone who chooses to
participate in the discussion on the mailing list. This includes developers,
other users and yourself. Aiming for consensus in the discussion is important
-- SciPy is a project by and for the scientific Python community. In those
rare cases where agreement cannot be reached, the maintainers of the module
in question can decide the issue.
|
|
|
|
|
|
Contributing by helping maintain existing code
==============================================

The previous section talked specifically about adding new functionality to
SciPy. A large part of that discussion also applies to maintenance of existing
code. Maintenance means fixing bugs, improving code quality or style,
documenting existing functionality better, adding missing unit tests, keeping
build scripts up-to-date, etc. The SciPy `issue list`_ contains all
reported bugs, build/documentation issues, etc. Fixing issues
helps improve the overall quality of SciPy, and is also a good way
of getting familiar with the project. You may also want to fix a bug because
you ran into it and need the function in question to work correctly.

The discussion on code style and unit testing above applies equally to bug
fixes. It is usually best to start by writing a unit test that shows the
problem, i.e. a test that should pass but doesn't. Once you have that, you can
fix the code so that the test does pass. That should be enough to send a PR for
this issue. Unlike when adding new code, discussing this on the mailing list
may not be necessary - if the old behavior of the code is clearly incorrect, no
one will object to having it fixed. It may be necessary to add some warning or
deprecation message for the changed behavior. This should be part of the
review process.
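The test-first approach to a bug fix can be sketched as follows. Both
``running_total_buggy`` and ``running_total_fixed`` are hypothetical functions
invented for this illustration; the point is only that the same check fails
before the fix and passes afterwards.

```python
def running_total_buggy(values):
    """Cumulative sums of ``values`` (hypothetical, with an off-by-one bug)."""
    totals, running = [], 0
    for v in values[:-1]:  # Bug: the last element is never processed.
        running += v
        totals.append(running)
    return totals


def running_total_fixed(values):
    """Cumulative sums of ``values`` after the fix."""
    totals, running = [], 0
    for v in values:
        running += v
        totals.append(running)
    return totals


def check_running_total(func):
    # Regression check written before touching the implementation.
    assert func([1, 2, 3]) == [1, 3, 6]
    assert func([]) == []


if __name__ == "__main__":
    try:
        check_running_total(running_total_buggy)
    except AssertionError:
        print("check fails on the buggy version, as intended")
    check_running_total(running_total_fixed)
    print("check passes on the fixed version")
```

Keeping the check in the test suite after the fix guards against the same bug
being reintroduced later.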
|
|
|
|
|
|
Reviewing pull requests
=======================

Reviewing open pull requests (PRs) is very welcome, and a valuable way to help
increase the speed at which the project moves forward. If you have specific
knowledge/experience in a particular area (say "optimization algorithms" or
"special functions") then reviewing PRs in that area is especially valuable -
sometimes PRs with technical code have to wait for a long time to get merged
due to a shortage of appropriate reviewers.

We encourage everyone to get involved in the review process; it's also a
great way to get familiar with the code base. Reviewers should ask
themselves some or all of the following questions:

- Was this change adequately discussed (relevant for new features and changes
  in existing behavior)?
- Is the feature scientifically sound? Algorithms may be known to work based on
  literature; otherwise, a closer look at correctness is valuable.
- Is the intended behavior clear under all conditions (e.g. unexpected inputs
  like empty arrays or nan/inf values)?
- Does the code meet the quality, test and documentation expectations outlined
  under `Contributing new code`_?
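The question about unexpected inputs is often easiest to settle with explicit
tests. The sketch below uses a hypothetical function ``mean_or_nan`` and plain
Python lists instead of arrays; the chosen behavior (returning nan for empty
input) is an assumption made for the example, not a SciPy convention.

```python
import math


def mean_or_nan(values):
    """Arithmetic mean with explicitly defined edge-case behavior."""
    values = list(values)
    if not values:
        # Empty input: this sketch defines the result as nan.
        return math.nan
    return sum(values) / len(values)


def test_empty_input():
    assert math.isnan(mean_or_nan([]))


def test_nan_propagates():
    assert math.isnan(mean_or_nan([1.0, math.nan]))


def test_inf_input():
    assert mean_or_nan([1.0, math.inf]) == math.inf
    # inf + (-inf) is nan under IEEE 754 arithmetic.
    assert math.isnan(mean_or_nan([math.inf, -math.inf]))


if __name__ == "__main__":
    test_empty_input()
    test_nan_propagates()
    test_inf_input()
    print("edge-case behavior is pinned down by tests")
```

A reviewer who finds no tests of this kind in a PR has a concrete, easy
request to make of the author.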
|
|
|
|
If we do not know you yet, consider introducing yourself.


Other ways to contribute
========================

There are many ways to contribute other than contributing code.

Triaging issues (investigating bug reports for validity and possible actions to
take) is also a useful activity. SciPy has many hundreds of open issues;
closing invalid ones and correctly labeling valid ones (ideally with some first
thoughts in a comment) allows prioritizing maintenance work and finding related
issues easily when working on an existing function or submodule.

Participating in discussions on the scipy-user and scipy-dev `mailing lists`_ is
a contribution in itself. Everyone who writes to those lists with a problem or
an idea would like to get responses, and writing such responses makes the
project and community function better and appear more welcoming.

The `scipy.org`_ website contains a lot of information on both SciPy the
project and SciPy the community, and it can always use a new pair of hands.
The sources for the website live in their own separate repo:
https://github.com/scipy/scipy.org
|
|
|
|
|
|
Recommended development setup
=============================

Since SciPy contains parts written in C, C++, and Fortran that need to be
compiled before use, make sure you have the necessary compilers and Python
development headers installed. Having compiled code also means that importing
SciPy from the development sources needs some additional steps, which are
explained below.

First fork a copy of the main SciPy repository on GitHub onto your own
account and then create your local repository via::

    $ git clone git@github.com:YOURUSERNAME/scipy.git scipy
    $ cd scipy
    $ git remote add upstream git://github.com/scipy/scipy.git

To build the development version of SciPy and run tests, spawn
interactive shells with the Python import paths properly set up, etc.,
do one of::

    $ python runtests.py -v
    $ python runtests.py -v -s optimize
    $ python runtests.py -v -t scipy.special.tests.test_basic::test_xlogy
    $ python runtests.py --ipython
    $ python runtests.py --python somescript.py
    $ python runtests.py --bench

This builds SciPy first, so the first time it may take some time. If
you specify ``-n``, the tests are run against the version of SciPy (if
any) found on the current PYTHONPATH. Note that if you run into a build issue,
more detailed build documentation can be found in :doc:`building/index` and at
https://github.com/scipy/scipy/tree/master/doc/source/building

Using ``runtests.py`` is the recommended approach to running tests.
There are also a number of alternatives to it, for example an in-place
build or installing to a virtualenv. See the FAQ below for details.

Some of the tests in SciPy are very slow and need to be separately
enabled. See the FAQ below for details.
|
|
|
|
|
|
SciPy structure
===============

All SciPy modules should follow the conventions below. Here, a
*SciPy module* is defined as a Python package, say ``yyy``, that is
located in the scipy/ directory.

* Ideally, each SciPy module should be as self-contained as possible.
  That is, it should have minimal dependencies on other packages or
  modules. Even dependencies on other SciPy modules should be kept to
  a minimum. A dependency on NumPy is of course assumed.

* Directory ``yyy/`` contains:

  - A file ``setup.py`` that defines a
    ``configuration(parent_package='',top_path=None)`` function
    for `numpy.distutils`.

  - A directory ``tests/`` that contains files ``test_<name>.py``
    corresponding to modules ``yyy/<name>{.py,.so,/}``.

* Private modules should be prefixed with an underscore ``_``,
  for instance ``yyy/_somemodule.py``.

* User-visible functions should have good documentation following
  the NumPy documentation style; see `how to document`_.

* The ``__init__.py`` of the module should contain the main reference
  documentation in its docstring. This is connected to the Sphinx
  documentation under ``doc/`` via Sphinx's automodule directive.

  The reference documentation should first give a categorized list of
  the contents of the module using ``autosummary::`` directives, and
  after that explain points essential for understanding the use of the
  module.

  Tutorial-style documentation with extensive examples should be
  separate, and put under ``doc/source/tutorial/``.
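A minimal sketch of the ``setup.py`` mentioned in the list above could look
like this. It assumes a NumPy installation that still provides
``numpy.distutils``; the package name ``yyy`` is the placeholder used in the
text, and shipping the ``tests`` directory via ``add_data_dir`` is one common
arrangement rather than a requirement stated here.

```python
def configuration(parent_package='', top_path=None):
    # Imported lazily, so that merely importing this file does not
    # require numpy.distutils to be available.
    from numpy.distutils.misc_util import Configuration

    # 'yyy' is the placeholder module name used in the text above.
    config = Configuration('yyy', parent_package, top_path)

    # Install the tests/ directory together with the package.
    config.add_data_dir('tests')
    return config
```

Existing SciPy submodules typically end such a file with an
``if __name__ == '__main__':`` block that passes the result of
``configuration(top_path='').todict()`` to ``numpy.distutils.core.setup``;
consult those files for the exact pattern.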
|
|
|
|
See the existing SciPy submodules for guidance.

For further details on NumPy distutils, see:

  https://github.com/numpy/numpy/blob/master/doc/DISTUTILS.rst.txt
|
|
|
|
|
|
Useful links, FAQ, checklist
============================

Checklist before submitting a PR
--------------------------------

- Are there unit tests with good code coverage?
- Do all public functions have docstrings including examples?
- Is the code style correct (PEP8, pyflakes)?
- Is the commit message `formatted correctly`_?
- Is the new functionality tagged with ``.. versionadded:: X.Y.Z`` (with
  X.Y.Z the version number of the next release - can be found in setup.py)?
- Is the new functionality mentioned in the release notes of the next
  release?
- Is the new functionality added to the reference guide?
- In case of larger additions, is there a tutorial or more extensive
  module-level description?
- In case compiled code is added, is it integrated correctly via setup.py
  (and preferably also Bento configuration files - bento.info and bscript)?
- If you are a first-time contributor, did you add yourself to THANKS.txt?
  Please note that this is perfectly normal and desirable - the aim is to
  give every single contributor credit, and if you don't add yourself it's
  simply extra work for the reviewer (or worse, the reviewer may forget).
- Did you check that the code can be distributed under a BSD license?


Useful SciPy documents
----------------------

- The `how to document`_ guidelines
- NumPy/SciPy `testing guidelines`_
- `SciPy API`_
- The `SciPy Roadmap`_
- NumPy/SciPy `git workflow`_
- How to submit a good `bug report`_
|
|
|
|
|
|
FAQ
---

*I based my code on existing Matlab/R/... code I found online, is this OK?*

It depends. SciPy is distributed under a BSD license, so if the code that you
based your code on is also BSD licensed or has a BSD-compatible license (e.g.
MIT, PSF) then it's OK. Code which is GPL or Apache licensed, has no
clear license, requires citation or is free for academic use only can't be
included in SciPy. Therefore if you copied existing code with such a license
or made a direct translation of it to Python, your code can't be included.
If you're unsure, please ask on the scipy-dev mailing list.

*Why is SciPy under the BSD license and not, say, the GPL?*

Like Python, SciPy uses a "permissive" open source license, which allows
proprietary re-use. While this allows companies to use and modify the software
without giving anything back, it is felt that the larger user base results in
more contributions overall, and companies often publish their modifications
anyway, without being required to. See John Hunter's `BSD pitch`_.


*How do I set up a development version of SciPy in parallel to a released
version that I use to do my job/research?*

One simple way to achieve this is to install the released version in
site-packages, by using a binary installer or pip for example, and set
up the development version in a virtualenv. First install
`virtualenv`_ (optionally use `virtualenvwrapper`_), then create your
virtualenv (named scipy-dev here) with::

    $ virtualenv scipy-dev

Now, whenever you want to switch to the virtual environment, you can use the
command ``source scipy-dev/bin/activate``, and ``deactivate`` to exit from the
virtual environment and return to your previous shell. With scipy-dev
activated, first install SciPy's dependencies::

    $ pip install numpy pytest cython

After that, you can install a development version of SciPy, for example via::

    $ python setup.py install

The installation goes to the virtual environment.
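Before installing, it can be useful to confirm that the interpreter you are
about to use really belongs to the virtual environment. The following
standard-library sketch is one way to check; the function name
``in_virtual_environment`` is an assumption made for this example.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter appears to be in a virtualenv."""
    # Older virtualenv releases set ``sys.real_prefix``. The standard
    # library ``venv`` module and newer virtualenv releases instead make
    # ``sys.prefix`` differ from ``sys.base_prefix``.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in virtual environment:", in_virtual_environment())
```

Running this script with scipy-dev activated should report an interpreter
path inside the ``scipy-dev`` directory.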
|
|
|
|
|
|
*How do I set up an in-place build for development?*

For development, you can set up an in-place build so that changes made to
``.py`` files take effect without a rebuild. First, run::

    $ python setup.py build_ext -i

Then you need to point your PYTHONPATH environment variable to this directory.
Some IDEs (Spyder for example) have utilities to manage PYTHONPATH. On Linux
and OSX, you can run the command::

    $ export PYTHONPATH=$PWD

and on Windows::

    $ set PYTHONPATH=/path/to/scipy

Now editing a Python source file in SciPy allows you to immediately
test and use your changes (in ``.py`` files), by simply restarting the
interpreter.
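When a released version and an in-place build coexist, it is easy to import
the wrong one by accident. A small standard-library sketch such as the
following reports where a package would be imported from without importing it;
the helper name ``locate_package`` is an assumption made for this example.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The package is not importable with the current import paths.
        return None
    return spec.origin


if __name__ == "__main__":
    print("scipy would be imported from:", locate_package("scipy"))
```

With PYTHONPATH pointing at the in-place build, the reported path should lie
inside your development checkout rather than in site-packages.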
|
|
|
|
|
|
*Are there any video examples for installing from source, setting up a
development environment, etc...?*

Currently, there are two video demonstrations for Anaconda Python on macOS:

`Anaconda SciPy Dev Part I (macOS)`_ is a four-minute
overview of installing Anaconda, building SciPy from source, and testing
changes made to SciPy from the Spyder IDE.

`Anaconda SciPy Dev Part II (macOS)`_ shows how to use
a virtual environment to easily switch between the "pre-built version" of SciPy
installed with Anaconda and your "source-built version" of SciPy created
according to Part I.


*Are there any video examples of the basic development workflow?*

`SciPy Development Workflow`_ is a five-minute example of fixing a bug and
submitting a pull request. While it's intended as a followup to
`Anaconda SciPy Dev Part I (macOS)`_ and `Anaconda SciPy Dev Part II (macOS)`_,
the process is similar for other development setups.


*Can I use a programming language other than Python to speed up my code?*

Yes. The languages used in SciPy are Python, Cython, C, C++ and Fortran. All
of these have their pros and cons. If Python really doesn't offer enough
performance, one of those languages can be used. Important concerns when
using compiled languages are maintainability and portability. For
maintainability, Cython is clearly preferred over C/C++/Fortran. Cython and C
are more portable than C++/Fortran. A lot of the existing C and Fortran code
in SciPy is older, battle-tested code that was only wrapped in (but not
specifically written for) Python/SciPy. Therefore the basic advice is: use
Cython. If there are specific reasons why C/C++/Fortran should be preferred,
please discuss those reasons first.
|
|
|
|
|
|
*How do I debug code written in C/C++/Fortran inside SciPy?*

The easiest way to do this is to first write a Python script that
invokes the C code whose execution you want to debug. For instance
``mytest.py``::

    from scipy.special import hyp2f1
    print(hyp2f1(5.0, 1.0, -1.8, 0.95))

Now, you can run::

    gdb --args python runtests.py -g --python mytest.py

If you didn't compile with debug symbols enabled before, remove the
``build`` directory first. While in the debugger::

    (gdb) break cephes_hyp2f1
    (gdb) run

The execution will now stop at the corresponding C function and you
can step through it as usual. Instead of plain ``gdb`` you can of
course use your favourite alternative debugger; run it on the
``python`` binary with arguments ``runtests.py -g --python mytest.py``.


*How do I enable additional tests in SciPy?*

Some of the tests in SciPy's test suite are very slow and not enabled
by default. You can run the full suite via::

    $ python runtests.py -g -m full

This invokes the test suite ``import scipy; scipy.test("full")``,
also enabling slow tests.

There is an additional level of very slow tests (several minutes),
which are also disabled in this case. They can be enabled by setting
the environment variable ``SCIPY_XSLOW=1`` before running the test
suite.
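To illustrate how an environment variable can gate very slow tests, here is a
generic sketch built on the standard library ``unittest`` module. It is not a
description of SciPy's own test machinery, which may work differently; the
names ``xslow_enabled``, ``xslow`` and ``ExampleTests`` are hypothetical.

```python
import os
import unittest


def xslow_enabled(environ=None):
    """Return True if SCIPY_XSLOW is set to a non-zero integer."""
    environ = os.environ if environ is None else environ
    try:
        return int(environ.get("SCIPY_XSLOW", "0")) != 0
    except ValueError:
        # Values such as "yes" are treated as "not enabled" in this sketch.
        return False


# Hypothetical decorator: skip the test unless SCIPY_XSLOW is enabled.
xslow = unittest.skipUnless(xslow_enabled(), "set SCIPY_XSLOW=1 to run")


class ExampleTests(unittest.TestCase):
    @xslow
    def test_very_slow_case(self):
        # Stand-in for a computation that would take several minutes.
        self.assertEqual(sum(range(10)), 45)

    def test_fast_case(self):
        self.assertEqual(1 + 1, 2)
```

Because the variable is read when the module is imported, it has to be set
before the test run starts, as described above.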
|
|
|
|
|
|
.. _scikit-learn: http://scikit-learn.org
.. _scikit-image: http://scikit-image.org/
.. _statsmodels: http://statsmodels.sourceforge.net/
.. _testing guidelines: https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt
.. _formatted correctly: http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#writing-the-commit-message
.. _how to document: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
.. _bug report: http://scipy.org/bug-report.html
.. _PEP8: http://www.python.org/dev/peps/pep-0008/
.. _pep8 package: http://pypi.python.org/pypi/pep8
.. _pyflakes: http://pypi.python.org/pypi/pyflakes
.. _SciPy API: https://docs.scipy.org/doc/scipy/reference/api.html
.. _SciPy Roadmap: https://scipy.github.io/devdocs/roadmap.html
.. _git workflow: http://docs.scipy.org/doc/numpy/dev/gitwash/index.html
.. _Github help pages: https://help.github.com/articles/set-up-git/
.. _issue list: https://github.com/scipy/scipy/issues
.. _Github: https://github.com/scipy/scipy
.. _scipy.org: https://scipy.org/
.. _scipy.github.com: http://scipy.github.com/
.. _scipy.org-new: https://github.com/scipy/scipy.org-new
.. _documentation wiki: https://docs.scipy.org/scipy/Front%20Page/
.. _SciPy Central: http://scipy-central.org/
.. _doctest: http://www.doughellmann.com/PyMOTW/doctest/
.. _virtualenv: http://www.virtualenv.org/
.. _virtualenvwrapper: http://www.doughellmann.com/projects/virtualenvwrapper/
.. _bsd pitch: http://nipy.sourceforge.net/nipy/stable/faq/johns_bsd_pitch.html
.. _Pytest: https://pytest.org/
.. _mailing lists: https://www.scipy.org/scipylib/mailing-lists.html
.. _Anaconda SciPy Dev Part I (macOS): https://youtu.be/1rPOSNd0ULI
.. _Anaconda SciPy Dev Part II (macOS): https://youtu.be/Faz29u5xIZc
.. _SciPy Development Workflow: https://youtu.be/HgU01gJbzMY
|