python3 Dockerfile, based on CKAN's #242

Open: wants to merge 18 commits into base: master
27 changes: 27 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
name: Tests
on: [push, pull_request]
jobs:
  test:
    strategy:
      matrix:
        python-version: [2.7, 3.6, 3.7, 3.8, 3.9]
      fail-fast: false
    name: Python ${{ matrix.python-version }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install requirements (Python 2)
        if: ${{ matrix.python-version == '2.7' }}
        run: pip install -r requirements-dev-py2.txt && pip install .
      - name: Install requirements (Python 3)
        if: ${{ matrix.python-version != '2.7' }}
        run: pip install -r requirements-dev.txt && pip install .
      - name: Run tests
        run: pytest --cov=datapusher --cov-append --cov-report=xml --disable-warnings tests
      - name: Upload coverage report to codecov
        uses: codecov/codecov-action@v1
        with:
          file: ./coverage.xml
46 changes: 46 additions & 0 deletions Dockerfile
@@ -0,0 +1,46 @@
FROM debian:buster

# Install required system packages
RUN apt-get -q -y update \
    && DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade \
    && apt-get -q -y install \
        python3-dev \
        python3-pip \
        python3-virtualenv \
        zlib1g-dev \
        libxml2-dev \
        libxslt1-dev \
        libffi-dev \
        # else error https://stackoverflow.com/questions/14547631/python-locale-error-unsupported-locale-setting
        locales \
        postgresql-client \
        build-essential \
        git \
        vim \
        wget \
    && apt-get -q clean \
    && rm -rf /var/lib/apt/lists/*


RUN python3 -m virtualenv --python=python3 /venv
ENV PATH="/venv/bin:$PATH"

# else error https://stackoverflow.com/questions/59633558/python-based-dockerfile-throws-locale-error-unsupported-locale-setting
ENV LC_ALL=C

# NO else https://github.com/ckan/datapusher/issues/132
#datapusher | File "/venv/src/datapusher/jobs.py", line 158, in check_response
#datapusher | request_url=request_url, response=response.text)
#datapusher | datapusher.jobs.HTTPError: <unprintable HTTPError object>
#ENV DATAPUSHER_SSL_VERIFY=true

# Setup Datapusher
ADD . /venv/src/
RUN pip install -U pip && \
    cd /venv/src/ && \
    pip install --upgrade --no-cache-dir -r requirements.txt && \
    pip install --upgrade --no-cache-dir -r requirements-dev.txt && \
    #pip install -e .
    python setup.py develop

CMD [ "python", "/venv/src/datapusher/main.py", "/venv/src/deployment/datapusher_settings.py"]
47 changes: 24 additions & 23 deletions README.md
@@ -1,5 +1,4 @@
[![Build Status](https://travis-ci.org/ckan/datapusher.png?branch=master)](https://travis-ci.org/ckan/datapusher)
[![Coverage Status](https://coveralls.io/repos/ckan/datapusher/badge.png?branch=master)](https://coveralls.io/r/ckan/datapusher?branch=master)
[![Tests](https://github.com/ckan/datapusher/actions/workflows/test.yml/badge.svg)](https://github.com/ckan/datapusher/actions/workflows/test.yml)
[![Latest Version](https://img.shields.io/pypi/v/datapusher.svg)](https://pypi.python.org/pypi/datapusher/)
[![Downloads](https://img.shields.io/pypi/dm/datapusher.svg)](https://pypi.python.org/pypi/datapusher/)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/datapusher.svg)](https://pypi.python.org/pypi/datapusher/)
@@ -67,7 +66,7 @@ If you need to change the host or port, copy `deployment/datapusher_settings.py`

To run the tests:

nosetests
pytest

## Production deployment

@@ -85,24 +84,24 @@ probably need to set up Nginx as a reverse proxy in front of it and something li
Supervisor to keep the process up.


# Install requirements for the DataPusher
sudo apt install python3-venv python3-dev build-essential
sudo apt-get install python-dev python-virtualenv build-essential libxslt1-dev libxml2-dev git libffi-dev
# Install requirements for the DataPusher
sudo apt install python3-venv python3-dev build-essential
sudo apt-get install python-dev python-virtualenv build-essential libxslt1-dev libxml2-dev git libffi-dev

# Create a virtualenv for datapusher
# Create a virtualenv for datapusher
sudo python3 -m venv /usr/lib/ckan/datapusher

# Create a source directory and switch to it
sudo mkdir /usr/lib/ckan/datapusher/src
cd /usr/lib/ckan/datapusher/src
# Create a source directory and switch to it
sudo mkdir /usr/lib/ckan/datapusher/src
cd /usr/lib/ckan/datapusher/src

# Clone the source (you should target the latest tagged version)
sudo git clone -b 0.0.17 https://github.com/ckan/datapusher.git
# Clone the source (you should target the latest tagged version)
sudo git clone -b 0.0.17 https://github.com/ckan/datapusher.git

# Install the DataPusher and its requirements
cd datapusher
sudo /usr/lib/ckan/datapusher/bin/pip install -r requirements.txt
sudo /usr/lib/ckan/datapusher/bin/python setup.py develop
# Install the DataPusher and its requirements
cd datapusher
sudo /usr/lib/ckan/datapusher/bin/pip install -r requirements.txt
sudo /usr/lib/ckan/datapusher/bin/python setup.py develop

# Create a user to run the web service (if necessary)
sudo addgroup www-data
@@ -132,8 +131,8 @@ The default DataPusher configuration uses SQLite as the backend for the jobs dat
sudo -u postgres createuser -S -D -R -P datapusher_jobs
sudo -u postgres createdb -O datapusher_jobs datapusher_jobs -E utf-8

# Run this in the virtualenv where DataPusher is installed
pip install psycopg2
# Run this in the virtualenv where DataPusher is installed
pip install psycopg2

# Edit SQLALCHEMY_DATABASE_URI in datapusher_settings.py accordingly
# eg SQLALCHEMY_DATABASE_URI=postgresql://datapusher_jobs:YOURPASSWORD@localhost/datapusher_jobs
@@ -143,9 +142,9 @@ The default DataPusher configuration uses SQLite as the backend for the jobs dat

```
# ... rest of datapusher-uwsgi.ini
workers = 3
threads = 3
lazy-apps = true
workers = 3
threads = 3
lazy-apps = true
```

## Configuring
@@ -184,12 +183,14 @@ Here's a summary of the options available.
| SSL_VERIFY | False | Do not validate SSL certificates when requesting the data file (*Warning*: Do not use this setting in production) |
| TYPES | [messytables.StringType, messytables.DecimalType, messytables.IntegerType, messytables.DateUtilType] | [Messytables][] types used internally, can be modified to customize the type guessing |
| TYPE_MAPPING | {'String': 'text', 'Integer': 'numeric', 'Decimal': 'numeric', 'DateUtil': 'timestamp'} | Internal Messytables type mapping |
| LOG_FILE | `/tmp/ckan_service.log` | Where to write the logs. Use an empty string to disable |
| STDERR | `True` | Log to stderr? |


Most of the configuration options above can be also provided as environment variables prepending the name with `DATAPUSHER_`, eg `DATAPUSHER_SQLALCHEMY_DATABASE_URI`, `DATAPUSHER_PORT`, etc.
Most of the configuration options above can also be provided as environment variables by prepending `DATAPUSHER_` to the name, e.g. `DATAPUSHER_SQLALCHEMY_DATABASE_URI`, `DATAPUSHER_PORT`, etc. In the specific case of `DATAPUSHER_STDERR`, the possible values are `1` and `0`.


By default DataPusher uses SQLite as the database backend for the jobs information. This is fine for local development and sites with low activity, but for sites that need more performance should use Postgres as the backend for the jobs database (eg `SQLALCHEMY_DATABASE_URI=postgresql://datapusher_jobs:YOURPASSWORD@localhost/datapusher_jobs`. See also [High Availability Setup](#high-availability-setup). If SQLite is used, is probably a good idea to store the database in a location other than `/tmp`. This will prevent the database being dropped, causing out of sync errors in the CKAN side. A good place to store it is the CKAN storage folder (if DataPusher is installed in the same server), generally in `/var/lib/ckan/`.
By default, DataPusher uses SQLite as the database backend for jobs information. This is fine for local development and sites with low activity, but for sites that need more performance, Postgres should be used as the backend for the jobs database (e.g. `SQLALCHEMY_DATABASE_URI=postgresql://datapusher_jobs:YOURPASSWORD@localhost/datapusher_jobs`). See also [High Availability Setup](#high-availability-setup). If SQLite is used, it's probably a good idea to store the database in a location other than `/tmp`. This will prevent the database from being dropped, which causes out-of-sync errors on the CKAN side. A good place to store it is the CKAN storage folder (if DataPusher is installed on the same server), generally in `/var/lib/ckan/`.


## Usage
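The README guidance on choosing a jobs database can be sketched as a small selection function. This is only an illustration: `jobs_database_uri` and the default file name `datapusher_jobs.db` are hypothetical and not part of DataPusher. It assumes the `DATAPUSHER_SQLALCHEMY_DATABASE_URI` variable named in the README and falls back to a SQLite file under `/var/lib/ckan/` rather than `/tmp`.

```python
import os


def jobs_database_uri(environ=os.environ,
                      sqlite_path='/var/lib/ckan/datapusher_jobs.db'):
    """Return the jobs database URI (hypothetical helper, not DataPusher API).

    Prefers DATAPUSHER_SQLALCHEMY_DATABASE_URI when it is set; otherwise
    falls back to a SQLite file outside /tmp. The file name is an assumption.
    """
    uri = environ.get('DATAPUSHER_SQLALCHEMY_DATABASE_URI')
    if uri:
        return uri
    # SQLAlchemy SQLite URLs use 'sqlite:///' followed by the path, so an
    # absolute path yields four slashes in total.
    return 'sqlite:///' + sqlite_path


# Postgres for busier sites, as in the README example.
postgres_env = {
    'DATAPUSHER_SQLALCHEMY_DATABASE_URI':
        'postgresql://datapusher_jobs:YOURPASSWORD@localhost/datapusher_jobs',
}
print(jobs_database_uri(environ=postgres_env))

# No variable set: SQLite in the CKAN storage folder.
print(jobs_database_uri(environ={}))
```

Passing `environ` explicitly keeps the sketch easy to test without touching the real process environment.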
2 changes: 1 addition & 1 deletion datapusher/__init__.py
@@ -1 +1 @@
__version__ = '0.0.17'
__version__ = '0.0.18'
5 changes: 5 additions & 0 deletions deployment/datapusher-uwsgi.ini
@@ -12,3 +12,8 @@ max-requests = 5000
vacuum = true
callable = application
buffer-size = 32768

## see High Availability Setup
#workers = 3
#threads = 3
#lazy-apps = true
3 changes: 2 additions & 1 deletion deployment/datapusher_settings.py
@@ -29,4 +29,5 @@
SSL_VERIFY = os.environ.get('DATAPUSHER_SSL_VERIFY', True)

# logging
#LOG_FILE = '/tmp/ckan_service.log'
LOG_FILE = os.environ.get('DATAPUSHER_LOG_FILE', '/tmp/ckan_service.log')
STDERR = bool(int(os.environ.get('DATAPUSHER_STDERR', '1')))
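A minimal sketch of how the `bool(int(...))` pattern on the `STDERR` line behaves, next to a plain `os.environ.get(name, True)` lookup like the one on the `SSL_VERIFY` line. The helper `env_flag` is hypothetical and only wraps the same expression so it can be exercised with a dictionary instead of the real environment. How DataPusher consumes these settings downstream is not shown in this diff and is not assumed here.

```python
import os


def env_flag(name, default, environ=os.environ):
    """Parse a '1'/'0' style environment variable into a bool.

    Hypothetical helper mirroring the bool(int(...)) pattern used for
    STDERR; it is not part of DataPusher.
    """
    return bool(int(environ.get(name, default)))


# Unset falls back to the default '1'; an explicit '0' disables the flag.
stderr_default = env_flag('DATAPUSHER_STDERR', '1', environ={})
stderr_off = env_flag('DATAPUSHER_STDERR', '1',
                      environ={'DATAPUSHER_STDERR': '0'})

# A plain .get(name, True) returns the raw string when the variable is set,
# and any non-empty string is truthy in Python, including 'False'.
raw_value = {'DATAPUSHER_SSL_VERIFY': 'False'}.get('DATAPUSHER_SSL_VERIFY', True)

print(stderr_default, stderr_off, bool(raw_value))
```

Values other than integers (for example `true`) make `int()` raise `ValueError`, which is consistent with the README note that only `1` and `0` are accepted for `DATAPUSHER_STDERR`.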
4 changes: 4 additions & 0 deletions requirements-dev-py2.txt
@@ -0,0 +1,4 @@
-r requirements.txt
httpretty==0.9.4
pytest
pytest-cov
5 changes: 3 additions & 2 deletions requirements-dev.txt
@@ -1,3 +1,4 @@
-r requirements.txt
httpretty==0.9.4
nose
httpretty==1.1.4
pytest
pytest-cov
4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,6 +1,6 @@
argparse
ckanserviceprovider==0.0.10
ckanserviceprovider==1.0.0
html5lib==1.0.1
messytables==0.15.2
certifi
requests[security]==2.24.0
requests[security]==2.27.1
2 changes: 1 addition & 1 deletion setup.py
@@ -46,7 +46,7 @@
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',

'Programming Language :: Python :: 3.9',
],

# What does your project relate to?