DrSnowbird/openrefine

OpenRefine (v3.4) + OpenJDK Java 8 JDK + Maven 3.6 + Python 3.6/2.7 + pip 20 + node 15 + npm 7 + Gradle 6

** UPDATE **

Use OpenJDK from now on!!

Components:

Run (recommended for easy-start)

./run.sh

The script runs the following docker run command for you:

sudo docker run --rm -it --name=openrefine --restart=no \
    -e USER_ID=1000 -e GROUP_ID=1000 \
    -e OPENREFINE_VM_MAX_MEM=8192M \
    -e OPENREFINE_VER=3.1 \
    -v $HOME/data-docker/openrefine/data:/home/developer/data \
    -v $HOME/data-docker/openrefine/workspace:/home/developer/workspace \
    -p 3333:3333 \
    openkbs/openrefine

Note that the above command creates $HOME/data-docker/openrefine/data and $HOME/data-docker/openrefine/workspace on the host if they do not exist, and maps them onto /home/developer/data and /home/developer/workspace in the container, so you can edit projects and data directly from those host directories.
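If you would rather create the host directories yourself before the first run (when Docker creates a missing bind-mount directory, it is typically owned by root), a minimal sketch follows. The function name prepare_openrefine_dirs is hypothetical and not part of this repository; the default path is taken from the docker run command above.

```shell
# Hypothetical helper (not part of this repository): pre-create the host
# directories that the docker run command above bind-mounts.
# Usage: prepare_openrefine_dirs [BASE_DIR]
prepare_openrefine_dirs() {
    # Default base directory matches the -v options shown above.
    base="${1:-$HOME/data-docker/openrefine}"
    # -p creates parent directories and is a no-op if they already exist.
    mkdir -p "$base/data" "$base/workspace"
    # Print the base directory so callers can reuse it.
    printf '%s\n' "$base"
}
```

Calling prepare_openrefine_dirs with no argument before ./run.sh uses the default location; pass a different base directory only if you also change the -v options.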

Or run the command manually:

mkdir ./data
docker run -d --name my-openrefine -v $PWD/data:/data -i -t openkbs/openrefine

Use the image as a base to build add-on applications

FROM openkbs/openrefine

Default Build (locally)

./build.sh

Build (build your own image tag)

For example, to build the image as "my/openrefine":

docker build -t my/openrefine .

To run your own image as a container named, for example, some-openrefine:

mkdir ./data
docker run -d --name some-openrefine -v $PWD/data:/data -i -t my/openrefine

Change the default Java max memory used by OpenRefine (useful for processing large datasets)

Edit the docker.env file and set OPENREFINE_VM_MAX_MEM; for example, the default of 8192M (8 GB of memory):
  
OPENREFINE_VM_MAX_MEM=8192M
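If you prefer to script that edit, the sketch below rewrites the setting in place. It assumes docker.env consists of plain KEY=VALUE lines, as the example above suggests; the function name set_openrefine_max_mem is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): set
# OPENREFINE_VM_MAX_MEM in an env file of KEY=VALUE lines.
# Usage: set_openrefine_max_mem ENV_FILE VALUE
set_openrefine_max_mem() {
    env_file="$1"
    new_value="$2"
    tmp_file="$(mktemp)"
    # Copy every line except an existing OPENREFINE_VM_MAX_MEM setting.
    # grep exits non-zero when it selects no lines, hence the "|| true".
    grep -v '^OPENREFINE_VM_MAX_MEM=' "$env_file" > "$tmp_file" || true
    # Append the new setting.
    printf 'OPENREFINE_VM_MAX_MEM=%s\n' "$new_value" >> "$tmp_file"
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}
```

For example, set_openrefine_max_mem docker.env 4096M would lower the limit to 4 GB; the new value should take effect the next time the container is started.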

Shell into the Docker instance

docker exec -it some-openrefine /bin/bash

Use OpenRefine (with your web browser)

Web UI:

  http://<ip_address>:3333/
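OpenRefine can take a little while to start, so it may help to wait until the web UI answers before opening the browser. The sketch below polls the URL with curl. The function name wait_for_openrefine, the default URL http://localhost:3333/, and the retry settings are assumptions for illustration; substitute your own <ip_address>.

```shell
# Hypothetical helper (not part of this repository): poll the OpenRefine
# web UI until it responds or the attempts run out.
# Usage: wait_for_openrefine [URL] [ATTEMPTS] [DELAY_SECONDS]
wait_for_openrefine() {
    url="${1:-http://localhost:3333/}"   # assumed default; adjust the host
    attempts="${2:-30}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP errors (status >= 400).
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A typical use would be: wait_for_openrefine "http://localhost:3333/" && echo "OpenRefine is up".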

Releases information

About

OpenRefine Docker for Data ETL/ELT
