Examples of how three implementations of Scheme work with the JVM. I look at JScheme, SISC, and Kawa.
Each is tried on two examples:
- hello world - simplest access to static member and method
- data analysis - using Apache Commons CSV and Math libraries to read a CSV file, process some statistics, and cluster the data. (Replicates K-Means Example - Descriptive Statistics - Reading CSV Files.)
Takeaway:
- JScheme's dot notation is very tidy, and keeps things close to Java notation.
- SISC is cumbersome. Unless I'm doing something wrong, everything you use from the Java side must be "acknowledged" on the Scheme side, and all values must be converted.
- Kawa is by far the most powerful and flexible: classes can be defined in Kawa, and instances of them passed back to Java functions.
JScheme was created by Peter Norvig for Java 1.1 (see http://norvig.com/jscheme.html), and taken over by Ken Anderson and Tim Hickey in 1998. The last released version, 7.2, was modified around 2002.
JScheme implements R4RS, except that:
- strings are immutable (JVM limitation)
- continuations are limited to being escape procedures
Using System.out means getting hold of the static object out, and then calling the println method on it:
(.println System.out$ "Hello")
Calling a static method is direct:
(display (Math.sqrt 10.0))
Note that JScheme knows about the java.lang namespace.
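The two calls above use only part of JScheme's dot notation. As a rough sketch, assuming the file is loaded by jscheme.REPL as in the rest of this section, the remaining forms (constructor, instance method, static field) look like this. The variable name names is only for illustration.

```scheme
;; JScheme dot-notation sketch (illustrative only)
(import "java.util.*")

(define names (ArrayList.))             ; trailing dot: constructor
(.add names "Kawa")                     ; leading dot: instance method
(.add names "SISC")
(display (.size names)) (newline)

(display (Math.sqrt 16.0)) (newline)    ; dot in the middle: static method
(display Integer.MAX_VALUE$) (newline)  ; trailing $: static field
```

The same forms appear in the data-analysis file: BufferedReader. is a constructor call and CSVFormat.RFC4180$ is a static field.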
The full file "csv-jscheme.scm" is:
;; Reading CSV files from JScheme

(import "java.io.*") ; <1>
(import "org.apache.commons.csv.*")
(import "org.apache.commons.math3.stat.descriptive.*")

;; converts a CSV record to a list, with the first four entries
;; converted to real numbers
;; returns empty list if invalid
(define (csvrecord->list record) ; <2>
  (if (= (.size record) 5)
    (let ((sepal-length (string->number (.get record 0)))
          (sepal-width (string->number (.get record 1)))
          (petal-length (string->number (.get record 2)))
          (petal-width (string->number (.get record 3))))
      (if (and (number? sepal-length)
               (number? sepal-width)
               (number? petal-length)
               (number? petal-width))
        (list sepal-length sepal-width petal-length petal-width (.get record 4))
        ())) ; return empty list if first four entries are not numbers
    ())) ; return empty list if record is not of correct size

;; reads CSV data for Iris dataset from file, and converts to list of lists
(define (read-data filename)
  (tryCatch ; <3>
    (let* ((input-reader (BufferedReader. (FileReader. filename))) ; <4>
           (records (.iterator (.parse CSVFormat.RFC4180$ input-reader)))) ; <5>
      (let loop ((result '()))
        (if (.hasNext records)
          (let ((item (csvrecord->list (.next records)))) ; <6>
            (loop (if (null? item) ; ignore empty list
                    result
                    (cons item result))))
          (reverse result))))
    (lambda (exn) ; <7>
      (display "Error: in reading data from ")
      (display filename)
      (newline)
      (System.exit -1))))

;; return an instance of DescriptiveStatistics filled with values using
;; given accessor function on the dataset
(define (statistics dataset accessor-fn)
  (let ((ds (DescriptiveStatistics.)))
    (for-each (lambda (instance)
                (.addValue ds (accessor-fn instance)))
              dataset)
    ds))

;; given an attribute name and DescriptiveStatistics instance,
;; display some interesting information
(define (display-statistics name ds)
  (display name) (newline)
  (display "-- minimum: ") (display (.getMin ds)) (newline)
  (display "-- maximum: ") (display (.getMax ds)) (newline)
  (display "-- mean: ") (display (.getMean ds)) (newline)
  (display "-- stddev: ") (display (.getStandardDeviation ds)) (newline))

(define dataset (read-data "iris.data"))
(display "Size of dataset: ") (display (length dataset)) (newline)
(display-statistics "Sepal length" (statistics dataset car))
(display-statistics "Sepal width" (statistics dataset cadr))
(display-statistics "Petal length" (statistics dataset caddr))
(display-statistics "Petal width" (statistics dataset cadddr))
- Imports the Java libraries.
- csvrecord->list follows the Java version very closely, except we do not have exceptions, and use () as an error value.
- tryCatch runs an expression, catching any exceptions.
- Opens a buffered reader, using very natural syntax.
- Gets at the csv-records, using an iterator.
- Use .next, the Java iterator in action.
- Handles any exception.
Running the script gives the expected result, as long as all the jar libraries are included on the classpath:
> java -cp "commons-csv-1.9.0.jar;commons-math3-3.6.1.jar;jscheme-7.2.jar" jscheme.REPL .\csv-jscheme.scm
Size of dataset: 150
Sepal length
-- minimum: 4.3
-- maximum: 7.9
-- mean: 5.843333333333334
-- stddev: 0.8280661279778628
Sepal width
-- minimum: 2.0
-- maximum: 4.4
-- mean: 3.054
-- stddev: 0.43359431136217386
Petal length
-- minimum: 1.0
-- maximum: 6.9
-- mean: 3.758666666666666
-- stddev: 1.7644204199522626
Petal width
-- minimum: 0.1
-- maximum: 2.5
-- mean: 1.1986666666666665
-- stddev: 0.763160741700841
NOTE: The last step, creating a Clusterable object, is not possible purely in JScheme. A separate Java class would have to be created, compiled, and included separately.
SISC was implemented by Scott Miller and Matthias Radestock over the period 2002 to 2007.
SISC implements R5RS.
Java classes and methods must be "acknowledged" by the Scheme side.
(import s2j) ; <1>
(define-java-class <java.lang.System>) ; <2>
(define println (generic-java-method '|println|)) ; <3>
(define getout (generic-java-field-accessor '|out|)) ; <4>
(println (getout (java-null <java.lang.System>)) (->jstring "hello")) ; <5>
- The scheme-java bridge library.
- Pick out the java.lang.System class.
- Identify println as a generic method.
- And out as a field accessor.
- Finally, call println. Notice a null instance of java.lang.System is needed for getout, and the string must be converted into a Java string.
Similarly, to access sqrt:
(define-java-class <java.lang.Math>)
(define msqrt (generic-java-method '|sqrt|))
(msqrt (java-null <java.lang.Math>) (->jdouble 10.0))
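The value that comes back from msqrt is still a Java object. A minimal sketch of the round trip, assuming the same s2j conversions that the full data-analysis file uses (->jdouble going in, ->number coming out), might look like this. The name root is only for illustration.

```scheme
;; SISC sketch: convert the argument to Java, and the result back to Scheme
(import s2j)

(define-java-class <java.lang.Math>)
(define msqrt (generic-java-method '|sqrt|))

;; root holds a Java double, not a Scheme number
(define root (msqrt (java-null <java.lang.Math>) (->jdouble 10.0)))

;; ->number brings it back to the Scheme side before use
(display (->number root))
(newline)
```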
;; CSV file reading and analysis using SISC

(import s2j) ; <1>

(define-java-class <buffered-reader> |java.io.BufferedReader|) ; <2>
(define-java-class <file-reader> |java.io.FileReader|)
(define-java-class <jl-system> |java.lang.System|)
(define-java-class <csv-format> |org.apache.commons.csv.CSVFormat|)
(define-java-class <descriptive-statistics> |org.apache.commons.math3.stat.descriptive.DescriptiveStatistics|)

(define get (generic-java-method '|get|))
(define add-value (generic-java-method '|addValue|))
(define get-min (generic-java-method '|getMin|))
(define get-max (generic-java-method '|getMax|))
(define get-mean (generic-java-method '|getMean|))
(define get-standard-deviation (generic-java-method '|getStandardDeviation|))
(define exit (generic-java-method '|exit|))
(define iterator (generic-java-method '|iterator|))
(define has-next (generic-java-method '|hasNext|))
(define next (generic-java-method '|next|))
(define parse (generic-java-method '|parse|))
(define size (generic-java-method '|size|))
(define getrfc (generic-java-field-accessor '|RFC4180|))

;; converts a CSV record to a list, with the first four entries
;; converted to real numbers
;; returns empty list if invalid
(define (csvrecord->list record)
  (if (= (->number (size record)) 5) ; <3>
    (let ((sepal-length (string->number (->string (get record (->jint 0))))) ; <4>
          (sepal-width (string->number (->string (get record (->jint 1)))))
          (petal-length (string->number (->string (get record (->jint 2)))))
          (petal-width (string->number (->string (get record (->jint 3))))))
      (if (and (number? sepal-length)
               (number? sepal-width)
               (number? petal-length)
               (number? petal-width))
        (list sepal-length sepal-width petal-length petal-width (->string (get record (->jint 4))))
        ())) ; return empty list if first four entries are not numbers
    ())) ; return empty list if record is not of correct size

;; reads CSV data for Iris dataset from file, and converts to list of lists
(define (read-data filename)
  (with-failure-continuation ; <5>
    (lambda (m e)
      (display "Error: in reading data from ")
      (display filename)
      (newline)
      (exit (java-null <jl-system>) (->jint -1)))
    (lambda ()
      (let* ((input-reader (java-new <buffered-reader> (java-new <file-reader> (->jstring filename))))
             (records (iterator (parse (getrfc (java-null <csv-format>)) input-reader))))
        (let loop ((result '()))
          (if (->boolean (has-next records))
            (let ((item (csvrecord->list (next records))))
              (loop (if (null? item) ; ignore empty list
                      result
                      (cons item result))))
            (reverse result)))))))

;; return an instance of DescriptiveStatistics filled with values using
;; given accessor function on the dataset
(define (statistics dataset accessor-fn)
  (let ((ds (java-new <descriptive-statistics>)))
    (for-each (lambda (instance)
                (add-value ds (->jdouble (accessor-fn instance))))
              dataset)
    ds))

;; given an attribute name and DescriptiveStatistics instance,
;; display some interesting information
(define (display-statistics name ds)
  (display name) (newline)
  (display "-- minimum: ") (display (->number (get-min ds))) (newline)
  (display "-- maximum: ") (display (->number (get-max ds))) (newline)
  (display "-- mean: ") (display (->number (get-mean ds))) (newline)
  (display "-- stddev: ") (display (->number (get-standard-deviation ds))) (newline))

(define dataset (read-data "iris.data"))
(display "Size of dataset: ") (display (length dataset)) (newline)
(display-statistics "Sepal length" (statistics dataset car))
(display-statistics "Sepal width" (statistics dataset cadr))
(display-statistics "Petal length" (statistics dataset caddr))
(display-statistics "Petal width" (statistics dataset cadddr))
- A Scheme-to-Java bridge library.
- Every Java class and method must be "acknowledged" on the Scheme side: this does have the advantage of providing them with Scheme-natural names.
- Each type needs to be converted, e.g. primitives are different in Scheme and Java.
- ... strings too.
- Error handling - catches any Java exceptions.
TODO: So far, I could not complete the cluster part of the implementation due to an error I could not resolve.
Kawa has been continuously developed and maintained for over 25 years by Per Bothner. It implements several related languages and features, and importantly covers R7RS-small. The last version is 3.1.1. The only restriction is that:
- tail-call optimisation is a compile-time flag, and usually off
The structure here is class name, then static member, then method, joined by colons:
(java.lang.System:out:println "Hello from Kawa")
So, for example, using sqrt from the Math class would be:
(display (java.lang.Math:sqrt 10.0))
Kawa needs the full namespace, although names can be aliased:
(define-alias jlMath java.lang.Math)
(display (jlMath:sqrt 10.0))
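The colon notation also covers constructors and instance methods, which the data-analysis file relies on. A small sketch, assuming a plain Kawa session with nothing beyond the JDK on the classpath, could look like this. The variable name names is only for illustration.

```scheme
;; Kawa sketch: alias a class, construct it, and call instance methods
(define-alias ArrayList java.util.ArrayList)

;; calling the class name like a function constructs an instance
(define names ::ArrayList (ArrayList))

;; object:method calls an instance method
(names:add "JScheme")
(names:add "SISC")

(display (names:size))
(newline)
```

Declaring the type with ::ArrayList is optional, but it should let Kawa resolve add and size at compile time rather than warn about unknown members.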
The full file "csv-kawa.scm" is:
;; CSV file reading and analysis using Kawa scheme

(import (class java.io ; <1>
               BufferedReader FileReader)
        (class org.apache.commons.csv CSVFormat)
        (class org.apache.commons.math3.stat.descriptive DescriptiveStatistics)
        (class org.apache.commons.math3.ml.clustering Clusterable KMeansPlusPlusClusterer))

(define-simple-class IrisInstance (Clusterable) ; <2>
  (sepal-length)
  (sepal-width)
  (petal-length)
  (petal-width)
  (label)
  (point)
  ((getPoint) ; <3>
   point))

;; converts a CSV record to a IrisInstance instance, with the first four fields
;; converted to real numbers
;; returns empty list if invalid
(define (csvrecord->iris-instance record) ; <4>
  (if (= (record:size) 5)
    (let ((sepal-length (string->number (record:get 0)))
          (sepal-width (string->number (record:get 1)))
          (petal-length (string->number (record:get 2)))
          (petal-width (string->number (record:get 3))))
      (if (and (number? sepal-length)
               (number? sepal-width)
               (number? petal-length)
               (number? petal-width))
        (make IrisInstance ; <5>
              sepal-length: sepal-length
              sepal-width: sepal-width
              petal-length: petal-length
              petal-width: petal-width
              label: (record:get 4)
              point: (double[] sepal-length sepal-width petal-length petal-width))
        ())) ; return empty list if first four entries are not numbers
    ())) ; return empty list if record is not of correct size

;; reads CSV data for Iris dataset from file, and converts to list of IrisInstance
(define (read-data filename)
  (with-exception-handler ; <6>
    (lambda (exn) ; <7>
      (display "Error: in reading data from ")
      (display filename)
      (newline)
      (display exn)
      (java.lang.System:exit -1))
    (lambda ()
      (let* ((input-reader (BufferedReader (FileReader filename))) ; <8>
             (records ((CSVFormat:RFC4180:parse input-reader):iterator))) ; <9>
        (let loop ((result '()))
          (if (records:hasNext)
            (let ((item (csvrecord->iris-instance (records:next)))) ; <10>
              (loop (if (null? item) ; ignore empty list
                      result
                      (cons item result))))
            (reverse result)))))))

;; return an instance of DescriptiveStatistics filled with values using
;; given accessor function on the dataset
(define (statistics dataset accessor-fn)
  (let ((ds (DescriptiveStatistics)))
    (for-each (lambda (instance)
                (ds:addValue (accessor-fn instance)))
              dataset)
    ds))

;; given an attribute name and DescriptiveStatistics instance,
;; display some interesting information
(define (display-statistics name ds)
  (display name) (newline)
  (display "-- minimum: ") (display (ds:getMin)) (newline)
  (display "-- maximum: ") (display (ds:getMax)) (newline)
  (display "-- mean: ") (display (ds:getMean)) (newline)
  (display "-- stddev: ") (display (ds:getStandardDeviation)) (newline))

(define dataset (read-data "iris.data"))
(display "Size of dataset: ") (display (length dataset)) (newline)
(display-statistics "Sepal length" (statistics dataset (lambda (instance) instance:sepal-length))) ; <11>
(display-statistics "Sepal width" (statistics dataset (lambda (instance) instance:sepal-width)))
(display-statistics "Petal length" (statistics dataset (lambda (instance) instance:petal-length)))
(display-statistics "Petal width" (statistics dataset (lambda (instance) instance:petal-width)))

(let* ((model (KMeansPlusPlusClusterer 3))
       (clusters (model:cluster dataset)) ; <12>
       (iterator (clusters:iterator)))
  (let loop ()
    (if (iterator:hasNext)
      (let ((cluster (iterator:next)))
        (display "Cluster: ")
        (display ((cluster:getCenter):getPoint))
        (newline)
        (display "Cluster has: ")
        (display ((cluster:getPoints):size))
        (display " points")
        (newline)
        (loop)))))
- Imports the Java libraries.
- The IrisInstance class implements the Clusterable interface, required by the library.
- The interface requires a getPoint method, which returns a double[].
- csvrecord->iris-instance follows the Java version very closely, except we do not have exceptions, and use () as an error value.
- Making an instance of IrisInstance requires passing in values for all the fields, including the double[] for the point.
- with-exception-handler runs an expression, catching any exceptions.
- Handles any exception.
- Opens a buffered reader around a file reader.
- Gets at the csv-records, using an iterator.
- Use :next, the Java iterator in action.
- The accessor function for the dataset is a field lookup.
- Instances of the class can be passed to the clustering algorithm.
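The same pattern, a class defined in Kawa and handed to a Java method, can be tried without the Commons Math jar. The following is only a sketch under that assumption: ByLength and words are hypothetical names, and the Java side is the standard java.util.Collections:sort.

```scheme
;; Kawa sketch: a Kawa-defined class implementing a Java interface
;; ByLength is a hypothetical comparator ordering strings by length
(define-simple-class ByLength (java.util.Comparator)
  ((compare a b) ::int
   (- (string-length a) (string-length b))))

(define words ::java.util.ArrayList (java.util.ArrayList))
(words:add "scheme")
(words:add "jvm")
(words:add "kawa")

;; the Java library method accepts the Kawa-defined instance directly
(java.util.Collections:sort words (ByLength))

;; the shortest word is now first
(display (words:get 0))
(newline)
```

As with IrisInstance and Clusterable, the Java method only sees an object that implements the interface it expects.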
The code is run as follows, including the libraries. Notice the --no-warn-unknown-member flag - this makes Kawa output a bit quieter.
> java -cp "kawa.jar;commons-csv-1.9.0.jar;commons-math3-3.6.1.jar" kawa.repl --no-warn-unknown-member .\csv-kawa.scm
Size of dataset: 150
Sepal length
-- minimum: 4.3
-- maximum: 7.9
-- mean: 5.843333333333334
-- stddev: 0.8280661279778628
Sepal width
-- minimum: 2.0
-- maximum: 4.4
-- mean: 3.054
-- stddev: 0.43359431136217386
Petal length
-- minimum: 1.0
-- maximum: 6.9
-- mean: 3.758666666666666
-- stddev: 1.7644204199522626
Petal width
-- minimum: 0.1
-- maximum: 2.5
-- mean: 1.1986666666666665
-- stddev: 0.763160741700841
Cluster: [5.005999999999999 3.4180000000000006 1.464 0.2439999999999999]
Cluster has: 50 points
Cluster: [6.853846153846153 3.0769230769230766 5.715384615384615 2.053846153846153]
Cluster has: 39 points
Cluster: [5.88360655737705 2.740983606557377 4.388524590163935 1.4344262295081966]
Cluster has: 61 points