Paperless isn’t a very complicated app, but there are a few components, so some basic documentation is in order. If you follow along in this document and still have trouble, please open an issue on GitHub so I can fill in the gaps.
Download¶
The source is currently only available via GitHub, so grab it from there, either by using git:
$ git clone https://github.com/the-paperless-project/paperless.git
$ cd paperless
or just download the zip archive and go that route:
$ cd <directory where you want to run Paperless>
$ wget https://github.com/the-paperless-project/paperless/archive/master.zip
$ unzip master.zip
$ cd paperless-master
Installation & Configuration¶
You can go multiple routes with setting up and running Paperless:
- The bare metal route
- The docker route
- A suggested linux containers route
The docker route is quick & easy.
The bare metal route is a bit more complicated to set up but makes it easier should you want to contribute some code back.
The Linux containers route is quick, but makes a lot of assumptions about the set-up. On the other hand, the script could be used to install on a base Debian or Ubuntu server.
Standard (Bare Metal)¶
1. Install the requirements as per the requirements page.
2. Within the extract of master.zip go to the src directory.
3. Copy ../paperless.conf.example to /etc/paperless.conf and open it in your favourite editor. As this file contains passwords, it should only be readable by the users root and paperless! Set the values for:
   - PAPERLESS_CONSUMPTION_DIR: this is where your documents will be dumped to be consumed by Paperless.
   - PAPERLESS_OCR_THREADS: this is the number of threads the OCR process will spawn to process document pages in parallel.
   - PAPERLESS_PASSPHRASE: this is only required if you want to use GPG to encrypt your document files. This is the passphrase Paperless uses to encrypt/decrypt the original documents. Don’t worry about defining this if you don’t want to use encryption (the default).
   Note also that if you’re using the runserver as mentioned below, you should make sure that PAPERLESS_DEBUG="true" is set, or that the line is just commented out, as this is the default.
4. Initialise the SQLite database with ./manage.py migrate.
5. Collect the static files for the webserver with ./manage.py collectstatic.
6. Create a user for your Paperless instance with ./manage.py createsuperuser. Follow the prompts to create your user.
7. Start the webserver with ./manage.py runserver <IP>:<PORT>. If no specific IP or port is given, the default is 127.0.0.1:8000, also known as http://localhost:8000/.
8. You should now be able to visit your (empty) installation at your Paperless webserver (http://127.0.0.1:8000, or whatever you chose before). You can login with the user/pass you created in step 6.
9. In a separate window, change to the src directory in this repo again, but this time, you should start the consumer script with ./manage.py document_consumer.
10. Scan something or put a file into the directory you set as PAPERLESS_CONSUMPTION_DIR.
11. Wait a few minutes.
12. Visit the document list on your webserver, and it should be there, indexed and downloadable.
Caution
This installation is not secure. Once everything is working, head over to Making Things a Little more Permanent.
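The steps above ask that the configuration file be readable only by root and paperless. A minimal sketch of one way to do that follows. The function name secure_paperless_conf is hypothetical and not part of Paperless, and the sketch assumes a paperless group exists that contains the paperless user.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Paperless): restrict a config file
# to its owner plus one group.
# Usage: secure_paperless_conf <conf-path> <group>
secure_paperless_conf() {
    conf_path=$1
    group_name=$2

    # Hand group ownership to the given group (name or numeric id).
    chgrp "$group_name" "$conf_path" || return 1

    # 640: owner may read and write, group may read, others get nothing.
    chmod 640 "$conf_path"
}
```

Run as root against the real file, this would be called as secure_paperless_conf /etc/paperless.conf paperless. If root owns the file, the result is that only root and members of the paperless group can read it.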
Docker Method¶
Install Docker.
Caution
As mentioned earlier, this guide assumes that you use Docker natively under Linux. If you are using Docker Machine under Mac OS X or Windows, you will have to adapt IP addresses, volume-mounting, command execution and maybe more.
Install docker-compose. [1]
Caution
If you want to use the included docker-compose.yml.example file, you need to have at least Docker version 1.12.0 and docker-compose version 1.9.0.
See the Docker installation guide on how to install the current version of Docker for your operating system or Linux distribution of choice. To get an up-to-date version of docker-compose, follow the docker-compose installation guide if your package repository doesn’t include it.
Create a copy of docker-compose.yml.example as docker-compose.yml and a copy of docker-compose.env.example as docker-compose.env. You’ll be editing both these files: taking a copy ensures that you can git pull to receive updates without risking merge conflicts with your modified versions of the configuration files.
Modify docker-compose.yml to your preferences, following the instructions in comments in the file. The only change that is a hard requirement is to specify where the consumption directory should mount. [2]
Caution
If you are using NFS mounts for the consume directory, you also need to change the command to turn off inotify, as it doesn’t work with NFS:
command: ["document_consumer", "--no-inotify"]
Modify docker-compose.env and adapt the following environment variables:
PAPERLESS_PASSPHRASE
This is the passphrase Paperless uses to encrypt/decrypt the original document. If you aren’t planning on using GPG encryption, you can just leave this undefined.
PAPERLESS_OCR_THREADS
This is the number of threads the OCR process will spawn to process document pages in parallel. If the variable is not set, Python determines the core-count of your CPU and uses that value.
PAPERLESS_OCR_LANGUAGES
If you want the OCR to recognize other languages in addition to the default English, set this parameter to a space-separated list of three-letter language codes as defined by ISO 639-2/T. For a list of available languages, including their three-letter codes, see the Alpine package list.
USERMAP_UID and USERMAP_GID
If you want to mount the consumption volume (directory /consume within the containers) to a host directory (which you probably want to do), access rights might be an issue. The default user and group paperless in the containers have an id of 1000. The containers will enforce that the owning group of the consumption directory will be paperless to be able to delete consumed documents. If your host system has a group with an ID of 1000 and you don’t want this group to have access rights to the consumption directory, you can use USERMAP_GID to change the id in the container and thus the one of the consumption directory. Furthermore, you can change the id of the default user as well using USERMAP_UID.
PAPERLESS_USE_SSL
If you want Paperless to use SSL for the user interface, set this variable to true. You also need to copy your certificate and key to the data directory, named ssl.cert and ssl.key. This is not an ideal solution and, if possible, a reverse proxy with nginx is preferred.
Run docker-compose up -d. This will create and start the necessary containers.
To be able to login, you will need a super user. To create it, execute the following command:
$ docker-compose run --rm webserver createsuperuser
This will prompt you to set a username (default paperless), an optional e-mail address and finally a password.
The default docker-compose.yml exports the webserver on your local port 8000. If you haven’t adapted this, you should now be able to visit your Paperless webserver at http://127.0.0.1:8000 (or https://127.0.0.1:8000 if you enabled SSL). You can login with the user and password you just created.
Add files to the consumption directory the way you prefer to. Following are two possible options:
1. Mount the consumption directory to a local host path by modifying your docker-compose.yml:
diff --git a/docker-compose.yml b/docker-compose.yml
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -17,9 +18,8 @@ services:
         volumes:
             - paperless-data:/usr/src/paperless/data
             - paperless-media:/usr/src/paperless/media
-            - /consume
+            - /local/path/you/choose:/consume
Danger
While the consumption container will ensure at startup that it can delete a consumed file from a host-mounted directory, it might not be able to read the document in the first place if the access rights to the file are incorrect.
Make sure that the documents you put into the consumption directory will either be readable by everyone (chmod o+r file.pdf) or readable by the default user or group id 1000 (or the one you have set with USERMAP_UID or USERMAP_GID respectively).
2. Use docker cp to copy your files directly into the container:
$ # Identify your containers
$ docker-compose ps
        Name                       Command                State     Ports
-------------------------------------------------------------------------
paperless_consumer_1    /sbin/docker-entrypoint.sh ...   Exit 0
paperless_webserver_1   /sbin/docker-entrypoint.sh ...   Exit 0
$ docker cp /path/to/your/file.pdf paperless_consumer_1:/consume
docker cp is a one-shot command, just like cp. This means that every time you want to consume a new document, you will have to execute docker cp again. You can of course automate this process, but option 1 is generally the preferred one.
Danger
docker cp will change the owning user and group of a copied file to the acting user at the destination, which will be root.
You therefore need to ensure that the documents you want to copy into the container are readable by everyone (chmod o+r file.pdf) before copying them.
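The access-rights notes above come down to two checks: which ids own the host directory you mount as /consume, and whether each document is readable by others. The following is a rough sketch of both. The function names are hypothetical and not part of Paperless, the use of GNU stat assumes Linux, and reusing the directory's current owner and group for USERMAP_UID and USERMAP_GID is only one possible choice.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Paperless.

# Print USERMAP_UID/USERMAP_GID lines that match the current owner and
# group of a host directory, for pasting into docker-compose.env.
# Assumes GNU stat (stat -c), as found on Linux.
usermap_for_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    printf 'USERMAP_UID=%s\n' "$(stat -c '%u' "$dir")"
    printf 'USERMAP_GID=%s\n' "$(stat -c '%g' "$dir")"
}

# Make a single document readable by everyone before it is placed in
# (or copied to) the consumption directory.
make_world_readable() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    chmod o+r "$file"
}
```

For example, usermap_for_dir /local/path/you/choose prints two lines you can compare against the default id of 1000, and make_world_readable /path/to/your/file.pdf applies the chmod o+r from the notes above.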
[1] You of course don’t have to use docker-compose, but it simplifies deployment immensely. If you know your way around Docker, feel free to tinker around without using compose!
[2] If you’re upgrading your docker-compose images from version 1.1.0 or earlier, you might need to change in the docker-compose.yml file the image: pitkley/paperless directive in both the webserver and consumer sections to build: ./ as per the newer docker-compose.yml.example file.
Making Things a Little more Permanent¶
Once you’ve tested things and are happy with the workflow, you should secure the installation and automate the process of starting the webserver and consumer.
Using a Real Webserver¶
The default is to use Django’s development server, as that’s easy and does the job well enough on a home network. However, it is heavily discouraged to use it for more than that.
If you want to do things right you should use a real webserver capable of handling more than one thread. You will also have to let the webserver serve the static files (CSS, JavaScript) from the directory configured in PAPERLESS_STATICDIR. The default static files directory is ../static.
For that you need to activate your virtual environment and collect the static files with the command:
$ cd <paperless directory>/src
$ ./manage.py collectstatic
Apache¶
This is a configuration supplied by steckerhalter on GitHub. It uses Apache and mod_wsgi, with a Paperless installation in /home/paperless/:
<VirtualHost *:80>
    ServerName example.com

    Alias /static/ /home/paperless/paperless/static/
    <Directory /home/paperless/paperless/static>
        Require all granted
    </Directory>

    WSGIScriptAlias / /home/paperless/paperless/src/paperless/wsgi.py
    WSGIDaemonProcess example.com user=paperless group=paperless threads=5 python-path=/home/paperless/paperless/src:/home/paperless/.env/lib/python3.6/site-packages
    WSGIProcessGroup example.com

    <Directory /home/paperless/paperless/src/paperless>
        <Files wsgi.py>
            Require all granted
        </Files>
    </Directory>
</VirtualHost>
Nginx + Gunicorn¶
If you’re using Nginx, the most common setup is to combine it with a Python-based server like Gunicorn so that Nginx is acting as a proxy. Below is a copy of a simple Nginx configuration fragment making use of a gunicorn instance listening on localhost port 8000.
server {
    listen 80;

    index index.html index.htm index.php;
    access_log /var/log/nginx/paperless_access.log;
    error_log /var/log/nginx/paperless_error.log;

    location /static {
        autoindex on;
        alias <path-to-paperless-static-directory>;
    }

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://127.0.0.1:8000;
    }
}
The gunicorn server can be started with the command:
$ <path-to-paperless-virtual-environment>/bin/gunicorn --pythonpath=<path-to-paperless>/src paperless.wsgi -w 2
Standard (Bare Metal + Systemd)¶
If you’re running on a bare metal system that’s using Systemd, you can use the service unit files in the scripts directory to set this up.
- You’ll need to create a group and user called paperless (without login).
- Set up Paperless in a place that this new user can read and write to.
- Ensure /etc/paperless.conf is readable by the paperless user.
- Copy the service files from the scripts directory to /etc/systemd/system.
$ cp /path/to/paperless/scripts/paperless-consumer.service /etc/systemd/system/
$ cp /path/to/paperless/scripts/paperless-webserver.service /etc/systemd/system/
- Edit each service file to point the ExecStart line to the proper location of your paperless install, referencing the appropriate Python binary. For example: ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer.
- Start and enable (so they start on boot) the services.
$ systemctl enable paperless-consumer
$ systemctl enable paperless-webserver
$ systemctl start paperless-consumer
$ systemctl start paperless-webserver
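Editing the ExecStart line by hand is easy to get wrong, so here is a rough sketch of doing it with sed. The function set_exec_start is hypothetical and not part of Paperless. It only covers the manage.py form shown in the example above, and it assumes the paths contain no "|", "&" or backslash characters.

```shell
#!/bin/sh
# Hypothetical helper, not part of Paperless: rewrite the ExecStart line of
# a unit file that has already been copied into place.
# Usage: set_exec_start <unit-file> <python-binary> <paperless-dir> <command>
set_exec_start() {
    unit_file=$1
    python_bin=$2
    paperless_dir=$3
    command_name=$4
    new_line="ExecStart=$python_bin $paperless_dir/src/manage.py $command_name"

    # Write to a temporary file first so a failing sed never truncates the unit.
    tmp_file=$(mktemp) || return 1

    # '|' is the sed delimiter, so the slashes in the paths need no escaping.
    if sed "s|^ExecStart=.*|$new_line|" "$unit_file" > "$tmp_file"; then
        # cat into the original keeps its owner and permissions intact.
        cat "$tmp_file" > "$unit_file"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp_file"
    return "$status"
}
```

A call such as set_exec_start /etc/systemd/system/paperless-consumer.service /path/to/python3 /path/to/paperless document_consumer would produce the ExecStart line from the example above. After changing a unit file, systemctl daemon-reload makes systemd pick up the edit.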
Standard (Bare Metal + Upstart)¶
Ubuntu 14.04 and earlier use the Upstart init system to start services during the boot process. To configure Upstart to run Paperless automatically after restarting your system:
Change to the directory where Upstart’s configuration files are kept:
cd /etc/init
Create a new file:
sudo nano paperless-server.conf
In the newly-created file enter:
start on (local-filesystems and net-device-up IFACE=eth0)
stop on shutdown

respawn
respawn limit 10 5

script
  exec <path to paperless virtual environment>/bin/gunicorn --pythonpath=<path to paperless>/src paperless.wsgi -w 2
end script
Note that you’ll need to replace the path placeholders with the path to your virtual environment and to your Paperless installation directory.
If you are using a network interface other than eth0, you will have to change IFACE=eth0. For example, if you are connected via WiFi, you will likely need to replace eth0 above with wlan0. To see all interfaces, run ifconfig -a.
Save the file.
Create a new file:
sudo nano paperless-consumer.conf
In the newly-created file enter:
start on (local-filesystems and net-device-up IFACE=eth0)
stop on shutdown

respawn
respawn limit 10 5

script
  exec <path to paperless virtual environment>/bin/python <path to paperless>/src/manage.py document_consumer
end script
Replace the path placeholders and eth0 with the appropriate values and save the file.
These two configuration files together will start both the Paperless webserver and document consumer processes when the file system and the specified network interface are available after boot. Furthermore, if either process ever exits unexpectedly, Upstart will try to restart it a maximum of 10 times within a 5 second period.
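If you are unsure which interface name belongs in IFACE=, one option is to look at the default route instead of reading through the full interface list. The sketch below assumes the iproute2 ip tool is installed. The function default_iface is hypothetical and simply parses ip route output passed on standard input.

```shell
#!/bin/sh
# Hypothetical helper, not part of Paperless: read `ip route` output on
# stdin and print the interface name of the first default route.
default_iface() {
    awk '$1 == "default" {
        # The interface name is the field that follows the "dev" keyword.
        for (i = 2; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Usage on a live system (assumes the iproute2 `ip` command):
#   ip route show default | default_iface
```

Given the line "default via 192.168.1.1 dev wlan0 proto dhcp metric 600" it prints wlan0, which is the value to use in place of eth0 in both Upstart files. If there is no default route it prints nothing.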
Docker¶
If you’re using Docker, you can set a restart-policy in the docker-compose.yml to have the containers automatically start with the Docker daemon.
Suggested way for Linux Container Method¶
This method makes some rigid assumptions for the best set-up:
- Ubuntu LTS as the container
- Apache as the webserver
- proftpd as the FTP server
- ftpupload as the FTP user
- paperless as the main user for the website
- http://paperless.lan is the desired LAN URL
- LXC set to give IP addresses on your LAN
This could also be used as an install on a base Debian/Ubuntu server, if the above assumptions are acceptable.
- Install lxc
- Launch the paperless container
$ lxc launch ubuntu: paperless
- Run the install script within the container
$ lxc exec paperless -- sh -c "wget https://raw.githubusercontent.com/the-paperless-project/paperless/master/docs/examples/lxc/lxc-install.sh && /bin/bash lxc-install.sh --email"
The script will ask you for an ftpupload password, as well as the super-user for the Paperless web front-end. After around 10 minutes, http://paperless.lan is ready, as is ftp://paperless.lan with user ftpupload.
See the Installation recording.