Python setuptools’ MANIFEST.in explained

TL;DR: don’t use MANIFEST.in when packaging in Python using setuptools; use the setuptools_scm package instead.
In the mess that is Python packaging using setuptools, some things are actually best understood in their historical context. One of those things is the file MANIFEST.in.
A world without VCS
Remember the times before the general adoption of version control systems? This is the world MANIFEST.in was devised in.
In that world, imagine you want to distribute your Python project in the form of a package of source files for others to use. That is, you want to make a source distribution, a.k.a. sdist.
Such a distribution will be based on the directory you’re developing the project in. However, because this is your local working directory, it will contain all kinds of files that are not part of the project’s source but of your own development process, such as the configuration of your IDE or files left behind by your text editor. Which of those files should be included in your distribution?
Of course, distutils can guess, and in fact it does guess. Some such guesses are “all Python source files implied by the py_modules and packages options”, README.txt and setup.py. However, at some point guessing will be insufficient. This is where MANIFEST.in comes in: simply specify which files to include using various inclusion and exclusion patterns.
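To make the idea of ordered inclusion and exclusion patterns concrete, here is a toy model in Python. It is not how distutils implements MANIFEST.in: the function `select_files`, the file names and the rules are all made up for illustration, the default guesses (such as setup.py) are not modeled, and `fnmatch` lets `*` cross directory separators, which the real `include` command does not. Under those assumptions, the three rules below roughly correspond to a MANIFEST.in containing `include README.txt`, `recursive-include mypkg *` and `global-exclude *.pyc`.

```python
import fnmatch


def select_files(candidates, rules):
    """Toy model of MANIFEST.in: apply (action, pattern) rules in order."""
    selected = set()
    for action, pattern in rules:
        matches = {p for p in candidates if fnmatch.fnmatchcase(p, pattern)}
        if action == "include":
            selected |= matches
        elif action == "exclude":
            selected -= matches
        else:
            raise ValueError(f"unknown action: {action!r}")
    return sorted(selected)


# Hypothetical working directory: project files mixed with local clutter.
working_dir = [
    "setup.py",
    "README.txt",
    "mypkg/__init__.py",
    "mypkg/__init__.pyc",
    "mypkg/data/table.csv",
    ".idea/workspace.xml",
]

# Later rules can undo earlier ones, so order matters.
rules = [
    ("include", "README.txt"),
    ("include", "mypkg/*"),
    ("exclude", "*.pyc"),
]

print(select_files(working_dir, rules))
# ['README.txt', 'mypkg/__init__.py', 'mypkg/data/table.csv']
# (setup.py is absent only because this toy skips the default guesses)
```

The IDE configuration and the compiled .pyc file stay out of the result, which is exactly the kind of decision MANIFEST.in was meant to encode.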
In the context of building (using setup.py build or some derivative thereof such as setup.py bdist_wheel) we are faced with more or less the same problem: which data files in the project should be included in the build? Thus, MANIFEST.in was used as the answer to that question too, as long as you set the parameter include_package_data to True.
Enter the VCS
In 2019, we no longer live in a world without version control systems. If you have your project under source control, the question “which files constitute the source code of this project?” (as opposed to “which files are local to a particular developer’s environment?”) is already answered: the files under source control constitute the source code.
This is also the position setuptools seems to take, since it introduced automatic inclusion of files under source control in the then-popular SVN in 2005. This behavior was later generalized to support arbitrary version control systems using plugins, and the SVN-specific implementation was eventually removed.
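As a rough sketch of what such a plugin looks like: setuptools discovers file finders through the `setuptools.file_finders` entry point group, and each finder is a callable that takes a directory name and yields the paths of the files it knows about. The function below is an illustration only, not the code setuptools_scm ships; the name `find_files_for_git` is made up, and it assumes a `git` executable on the PATH.

```python
import os
import subprocess


def find_files_for_git(dirname):
    """Yield paths of files tracked by git under dirname (illustrative only)."""
    try:
        result = subprocess.run(
            ["git", "ls-files"],
            cwd=dirname or ".",
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, or dirname is not inside a repository: report nothing.
        return
    for relative_path in result.stdout.splitlines():
        # git reports paths relative to dirname; make them relative to the caller.
        yield os.path.join(dirname, relative_path) if dirname else relative_path
```

A package providing such a finder would advertise it in its own setup.py with something like `entry_points={"setuptools.file_finders": ["git_sketch = my_finder_module:find_files_for_git"]}`, where `git_sketch` and `my_finder_module` are hypothetical names.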
If you read this in 2019, your VCS is most likely either git or hg, in which case the package setuptools_scm provides the plugins you need. Add the following incantation to setup.py to ensure that files you have under source control are packaged:
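from setuptools import setup

setup(
    # ... your other setup() arguments ...
    use_scm_version=True,  # derive the version number from SCM tags
    setup_requires=['setuptools_scm'],  # makes the SCM plugins available at build time
    # ...
)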
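If you want to check that the incantation did what you expected, one option is to build an sdist and look inside the resulting archive. The helper below is a minimal sketch using only the standard library; the name `list_sdist_files` is made up, and it assumes the sdist is a gzip-compressed tarball.

```python
import tarfile


def list_sdist_files(sdist_path):
    """Return the sorted names of the regular files inside a .tar.gz sdist."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


# Usage, with a hypothetical archive name:
#     for name in list_sdist_files("dist/example-0.1.tar.gz"):
#         print(name)
```

Every file under source control should show up in that listing, and your editor’s leftovers should not.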
Caveats
So, can we do away with MANIFEST.in altogether? Almost… a few final caveats:
- If you don’t use an SCM, you’ll still need MANIFEST.in. Counterpoint: you should really-really-really use an SCM, even for the smallest of projects.
- setuptools_scm has more functionality than determining which files to include in your package. It also derives the version number for the package from the SCM’s tags. There is no package that provides the former functionality but not the latter. Counterpoint: you should really tag versions in your version management tool, and you should really not wish to duplicate this behavior manually elsewhere, so there is no need for a package which provides just one of these behaviors.
- There may be cases in which your concept of a “source distribution” differs from “all the stuff under source control”. Common examples are tests, debugging tools, shell scripts, documentation, etc. The counterpoint is that you really shouldn’t try to make such a distinction. This is the position Jason R. Coombs takes:
  “In my opinion the sdist is meant to be more than just a copy of the Python functionality, but is meant to be a distributable copy of the source code. I would expect someone to be able to download the sdist, extract it, and develop on the project much like they would if cloning the repo.”
Note that this final “problem” applies exclusively to source distributions: builds (and wheels) are automatically limited to files living under the packages specified by the packages parameter.
NOTE: This article was previously hosted on my now-defunct personal blog ‘remarkablyrestrained.com’. It’s here for archival purposes and because I still think it’s relevant.