Weka Data Mining with Singularity

By Staff

Sep 26, 2018 | Blog, How To Guides

Weka is a widely used suite of machine learning algorithms for data mining, written in Java. We’ve developed a Singularity container so that your Weka environment and data can be moved between systems on demand, with all the benefits of the Singularity Image Format (SIF).

Recipe (save it as weka.def):

BootStrap: docker
From: ubuntu:16.04

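%post
    # Install the tools needed to fetch and unpack Weka, plus a Java 8 runtime
    apt-get -y update
    apt-get -y install curl
    apt-get -y install unzip
    apt-get install -y openjdk-8-jre
    # Download the Weka 3.8.3 release and unpack it to /weka-3-8-3
    curl -sSL "https://prdownloads.sourceforge.net/weka/weka-3-8-3.zip" > weka.zip
    unzip weka.zip -d / && rm -f weka.zip*
    # Put weka.jar on the Java classpath for commands run in the container
    echo 'export CLASSPATH=/weka-3-8-3/weka.jar' >> /environment
    # Remove cached package files to keep the image small
    apt-get clean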

To build the Weka container, we run:

$ sudo singularity build weka.sif weka.def

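The container needs no further setup. The basic usage pattern is shown below, where weka.classifiers.object is a placeholder for the fully qualified name of the classifier class you want to run: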

$ singularity exec weka.sif java weka.classifiers.object
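If you run many classifiers, a small wrapper can save typing. The following is a minimal sketch, not part of Weka or Singularity: the function name weka_run and the variables WEKA_SIF and WEKA_DRY_RUN are hypothetical names chosen for this example. It assumes the image is called weka.sif unless WEKA_SIF says otherwise.

```shell
# Hypothetical helper for this post; weka_run, WEKA_SIF and WEKA_DRY_RUN
# are illustrative names, not part of Weka or Singularity.
WEKA_SIF="${WEKA_SIF:-weka.sif}"

weka_run() {
    # First argument: class name relative to weka.classifiers,
    # for example bayes.BayesNet. Remaining arguments go to Weka unchanged.
    weka_class="weka.classifiers.$1"
    shift
    set -- singularity exec "$WEKA_SIF" java "$weka_class" "$@"
    if [ "${WEKA_DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command instead of executing it
        echo "$*"
    else
        "$@"
    fi
}
```

With WEKA_DRY_RUN=1 set, calling weka_run bayes.BayesNet -t /weka-3-8-3/data/iris.arff prints the full singularity command without running it, which is handy for checking arguments before a long job.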

Toy datasets are included with this install. Let’s test one of them with the multilayer perceptron classifier:

$ singularity exec weka.sif java weka.classifiers.functions.MultilayerPerceptron \
    -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a -t /weka-3-8-3/data/breast-cancer.arff
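For reference, here is the same command rebuilt one option at a time, with each flag annotated according to the MultilayerPerceptron option descriptions. This is only a sketch that prints the command rather than running it. The variable name mlp_cmd is made up for the example. Because only a training file (-t) is given and no separate test file, Weka evaluates the model with its default 10-fold cross-validation.

```shell
# Sketch: rebuild the MultilayerPerceptron command option by option.
# mlp_cmd is an illustrative variable name; nothing is executed here.
set --                  # start from an empty argument list
set -- "$@" -L 0.3      # learning rate
set -- "$@" -M 0.2      # momentum
set -- "$@" -N 500      # number of training epochs
set -- "$@" -V 0        # percentage of the data held out as a validation set
set -- "$@" -S 0        # random seed
set -- "$@" -E 20       # consecutive validation errors tolerated before stopping
set -- "$@" -H a        # hidden layers; 'a' means (attributes + classes) / 2
set -- "$@" -t /weka-3-8-3/data/breast-cancer.arff   # training file

mlp_cmd="singularity exec weka.sif java weka.classifiers.functions.MultilayerPerceptron $*"
echo "$mlp_cmd"
```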

Weka also ships with many other classifiers. Try BayesNet, for example:

$ singularity exec weka.sif java weka.classifiers.bayes.BayesNet -t /weka-3-8-3/data/iris.arff -D \
  -Q weka.classifiers.bayes.net.search.local.K2 -- -P 2 -S ENTROPY \
  -E weka.classifiers.bayes.net.estimate.SimpleEstimator -- -A 1.0
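To compare several classifiers on the same dataset, you can loop over their class names. The sketch below runs each one with its default options on the bundled iris data. BayesNet and MultilayerPerceptron are the classifiers used above, and trees.J48 is another classifier that ships with Weka. The variable names are illustrative. If singularity or weka.sif is not available, the loop only prints what it would run.

```shell
# Sketch: run several classifiers with default options on one dataset.
# Variable names (classifiers, dataset, planned) are illustrative only.
classifiers="bayes.BayesNet functions.MultilayerPerceptron trees.J48"
dataset=/weka-3-8-3/data/iris.arff
planned=0

for c in $classifiers; do
    planned=$((planned + 1))
    if command -v singularity >/dev/null 2>&1 && [ -f weka.sif ]; then
        echo "== weka.classifiers.$c =="
        singularity exec weka.sif java "weka.classifiers.$c" -t "$dataset"
    else
        # Fallback when the container runtime or image is missing
        echo "would run: singularity exec weka.sif java weka.classifiers.$c -t $dataset"
    fi
done
```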

Of course, for real work you’ll want to use your own data. Add the -B flag to bind your data directory into the container. Note that binding over /weka-3-8-3/data hides the bundled toy datasets for that run:

$ singularity exec -B path/to/data:/weka-3-8-3/data weka.sif java weka.classifiers.functions.[function here] \
    -t /weka-3-8-3/data/yourdataset/file.arff [args]
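If you do not have an ARFF file at hand, the sketch below writes a tiny one and binds it in. Everything about the dataset is made up for illustration: the directory under the temporary folder, the attribute names, and the values. NaiveBayes is used as the classifier, and -x 2 requests two-fold cross-validation because the default of 10 folds needs at least 10 instances. When singularity or weka.sif is missing, the script only reports the bind specification it would use.

```shell
# Sketch: create a toy ARFF file and bind its parent directory into the container.
# Directory, attribute names and values are hypothetical.
data_dir="${TMPDIR:-/tmp}/weka-demo/yourdataset"
mkdir -p "$data_dir"

cat > "$data_dir/file.arff" <<'EOF'
@relation toy
@attribute width numeric
@attribute height numeric
@attribute class {small,large}
@data
1.0,2.0,small
1.5,1.8,small
5.0,6.1,large
5.5,5.9,large
EOF

# Bind the parent directory so the file appears inside the container at
# /weka-3-8-3/data/yourdataset/file.arff
bind_spec="$(dirname "$data_dir"):/weka-3-8-3/data"

if command -v singularity >/dev/null 2>&1 && [ -f weka.sif ]; then
    singularity exec -B "$bind_spec" weka.sif java weka.classifiers.bayes.NaiveBayes \
        -x 2 -t /weka-3-8-3/data/yourdataset/file.arff
else
    echo "singularity or weka.sif not found; would bind $bind_spec"
fi
```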

For more information about Weka, visit its home page.
