Minimizing Docker Pharo

11 August 2019

As tweeted: Docker image of Pharo VM, 7.0.3-based app Pharo image, changes/sources files, Ubuntu 16.04, 484MB. With Pharo VM Alpine Docker image, 299MB. Build app Pharo image from minimal, run without changes/sources, Alpine Pharo VM, 83MB!

And here's the Docker container resident set size upon switching from the 484MB Docker image to the Alpine-based Docker image:

Pharo Docker RSS

Norbert Hartl and I are collaborating on minimizing Dockerized Pharo. All are welcome to join.

PoC: Alpine Linux Minimal Stateless Pharo

20 July 2019

Alpine Linux is a security-oriented, lightweight distro based on musl-libc and BusyBox. Official Docker images of Alpine Linux are about 5MB each.

I've successfully built the pharo.cog.spur.minheadless OpenSmalltalk VM on Alpine Linux. Dockerizing the built VM files (without any Pharo image, changes, or sources files) produces a Docker image weighing in at 12.5MB.

% sudo docker images | egrep "samadhiweb.*alpine"
samadhiweb/pharo7vm                    alpine              299420ff0e03        21 minutes ago      12.5MB
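
The Dockerfile behind this image is essentially "copy the built VM onto the Alpine base". A sketch only; the builder paths, package name, and VM binary name below are assumptions, not the actual build file:

```dockerfile
# Sketch: VM-only image. Paths and names are assumptions.
FROM alpine:3.10
# The minheadless VM needs libstdc++ beyond what musl provides.
RUN apk add --no-cache libstdc++
# Copy the VM built from pharo.cog.spur.minheadless on an Alpine builder.
COPY build/vm/ /opt/pharo7vm/
ENTRYPOINT ["/opt/pharo7vm/pharo"]
```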

Pharo provides minimal images that contain basic Pharo packages without the integrated GUI, useful for building server-side applications.

From the Pharo 7.0.3 minimal image, I've built a "stateless" image containing FFI, Fuel, and UDBC-SQLite. "Stateless" means the image can be executed by the Pharo VM without the changes and sources files. Here are the sizes of the base minimal image and my SQLite image:

% ls -l udbcsqlite.image Pharo7.0.3-0-metacello-64bit-0903ade.image 
-rw-rw-r-- 1 pierce pierce 13863032 Apr 12 22:56 Pharo7.0.3-0-metacello-64bit-0903ade.image
-rw-rw-r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
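
The "stateless" save can be reconstructed roughly as follows (a sketch: this is my reading of how Pharo's null changes/sources handlers are used for this purpose, not necessarily the exact recipe behind udbcsqlite.image):

```smalltalk
"Sketch: make the image runnable without its .changes/.sources files.
NoChangesLog stops writes to the changes file; NoPharoFilesOpener
stops startup from looking for the sources file."
NoChangesLog install.
NoPharoFilesOpener install.
Smalltalk saveAs: 'udbcsqlite'
```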

Below, I run the image statelessly on my Ubuntu laptop. Note that this uses the regular Pharo VM, not the Alpine Linux one.

% mkdir stateless
% cd stateless
% ln ../udbcsqlite.image
% chmod 444 udbcsqlite.image
% cat > <<EOF
~/pkg/pharo7vm/gofaro -vm-display-none udbcsqlite.image test --junit-xml-output "UDBC-Tests-SQLite-Base"
EOF
% chmod a+x
% ls -l
total 16744
-rwxr-xr-x 1 pierce pierce      116 Jul 20 15:25*
-r--r--r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
% ./
Running tests in 1 Packages
71 run, 71 passes, 0 failures, 0 errors.
% ls -l 
total 16764
-rw-r--r-- 1 pierce pierce     6360 Jul 20 15:26 progress.log
-rwxr-xr-x 1 pierce pierce      116 Jul 20 15:25*
-r--r--r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
-rw-r--r-- 1 pierce pierce    11072 Jul 20 15:26 UDBC-Tests-SQLite-Base-Test.xml

Dockerizing udbcsqlite.image together with the aforementioned Alpine Linux Pharo VM produces a Docker image that is 46.8 MB in size.

% sudo docker images | egrep "samadhiweb.*alpine"
samadhiweb/p7minsqlite                 alpine              3a57853099d0        44 minutes ago      46.8MB
samadhiweb/pharo7vm                    alpine              299420ff0e03        About an hour ago   12.5MB
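
The application image is then layered on top of the VM image. A sketch (the paths and the VM binary name are assumptions):

```dockerfile
# Sketch: app image on top of the Alpine VM image. Names are assumptions.
FROM samadhiweb/pharo7vm:alpine
WORKDIR /opt/app
# Only the image file is shipped: no .changes, no .sources.
COPY udbcsqlite.image .
CMD ["/opt/pharo7vm/pharo", "-vm-display-none", "udbcsqlite.image", \
     "test", "--junit-xml-output", "UDBC-Tests-SQLite-Base"]
```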

Run the Docker image:

% sudo docker run --ulimit rtprio=2 samadhiweb/p7minsqlite:alpine
Running tests in 1 Packages
71 run, 71 passes, 0 failures, 0 errors.

For comparison and contrast, here are the sizes of the regular Pharo 7.0.3 image, changes and sources files:

% ls -l Pharo7.0*-0903ade.* 
-rw-rw-r-- 1 pierce pierce      190 Apr 12 22:57 Pharo7.0.3-0-64bit-0903ade.changes
-rw-rw-r-- 1 pierce pierce 52455648 Apr 12 22:57 Pharo7.0.3-0-64bit-0903ade.image
-rw-rw-r-- 2 pierce pierce 34333231 Apr 12 22:55 Pharo7.0-32bit-0903ade.sources


Gitea

14 July 2019

Gitea is an open source GitHub lookalike written in Go. Building Gitea from source is straightforward; the output is a single executable, gitea. Pre-built binaries and Docker images are also available.

Once configured appropriately, gitea runs an HTTP server, which provides the GitHub-ish user interface, and a built-in SSH server, which is used for synchronizing Git repositories.
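
The relevant settings live in Gitea's app.ini under [server]; for example, to serve HTTP locally and enable the built-in SSH server (the values here are illustrative, not my actual configuration):

```ini
; Illustrative values only.
[server]
DOMAIN           = localhost
HTTP_PORT        = 3000
; Use Gitea's built-in SSH server rather than the system sshd.
START_SSH_SERVER = true
SSH_PORT         = 2222
```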

In Pharo, Iceberg works with a locally running Gitea just as it works with GitHub.

I've been using Monticello for version control of my little tools. Monticello works when requirements are simple, but some of the tools have grown enough to need a VCS with good branching and merging capabilities.

Telemon: Pharo metrics for Telegraf

3 July 2019

In my previous post on the TIG monitoring stack, I mentioned that Telegraf supports a large number of input plugins. One of these is the generic HTTP plugin, which collects metrics from one or more HTTP(S) endpoints that produce output in one of the supported input data formats.

I've implemented Telemon, a Pharo package that produces Pharo VM and application-specific metrics compatible with the Telegraf HTTP input plugin.
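
On the Telegraf side, pointing the HTTP input plugin at the Pharo endpoint takes a few lines of configuration (the URL and path here are illustrative):

```toml
# Poll the Pharo metrics endpoint. URL is illustrative.
[[inputs.http]]
  urls = ["http://localhost:8080/telemon"]
  # Telemon emits InfluxDB line protocol, which Telegraf parses natively.
  data_format = "influx"
```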

Telemon works as a Zinc ZnServer delegate. It produces metrics in the InfluxDB line protocol format. By default, Telemon produces the metrics generated by VirtualMachine>>statisticsReport and its output looks like this:

TmMetricsDelegate new renderInfluxDB
"pharo uptime=1452854,oldSpace=155813664,youngSpace=2395408,memory=164765696,memoryFree=160273136,fullGCs=3,fullGCTime=477,incrGCs=9585,incrGCTime=9656,tenureCount=610024"

As per the InfluxDB line protocol, 'pharo' is the name of the measurement, and the items in key-value format form the field set.
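
A line-protocol record splits at the first unescaped space: measurement plus optional tags on the left, the field set on the right, and an optional timestamp at the end. A quick shell illustration, using abridged sample values from the output above:

```shell
# Split a line-protocol record into measurement+tags and field set.
# Sample values abridged from the renderInfluxDB output above.
line='pharo,host=telemon-1 uptime=1452854,memory=164765696'
measurement_and_tags=${line%% *}   # everything before the first space
fields=${line#* }                  # everything after the first space
echo "measurement+tags: $measurement_and_tags"
echo "fields: $fields"
```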

To add a tag to the measurement:

| tm |
tm := TmMetricsDelegate new. 
tm tags at: 'host' put: 'telemon-1'.
tm renderInfluxDB
"pharo,host=telemon-1 uptime=2023314,oldSpace=139036448,youngSpace=5649200,memory=147988480,memoryFree=140242128,fullGCs=4,fullGCTime=660,incrGCs=14291,incrGCTime=12899,tenureCount=696589"

Above, the tag set consists of "host=telemon-1".

Here's another invocation that adds two user-specified metrics but no tag.

| tm |
tm := TmMetricsDelegate new. 
tm fields 
  at: 'meaning' put: [ 42 ];
  at: 'newMeaning' put: [ 84 ].
tm renderInfluxDB
"pharo uptime=2548014,oldSpace=139036448,youngSpace=3651736,memory=147988480,memoryFree=142239592,fullGCs=4,fullGCTime=660,incrGCs=18503,incrGCTime=16632,tenureCount=747211,meaning=42,newMeaning=84"

Note that the field values are Smalltalk blocks that will be evaluated dynamically.

When I was reading the specifications for Telegraf's plugins, the InfluxDB line protocol, etc., it all felt rather dry. I imagine this short post is the same so far for the reader who isn't familiar with how the TIG components work together. So here are teaser screenshots of the Grafana panels for the Pharo VM and blog-specific metrics for this blog, which I will write about in the next post.

This Grafana panel shows a blog-specific metric named 'zEntity Count'.

Grafana Pharo app metrics

This next panel shows the blog-specific metric 'zEntity Memory' together with the VM metric 'Used Memory', which is the difference between the 'memory' and 'memoryFree' fields.

Grafana Pharo VM and app metrics
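
Rather than storing a derived field, the 'Used Memory' series can be computed in the Grafana query itself; an InfluxQL sketch (the measurement and field names come from above, the rest is illustrative):

```sql
-- Illustrative InfluxQL: derive used memory from the two stored fields.
SELECT "memory" - "memoryFree" AS "usedMemory" FROM "pharo" WHERE $timeFilter
```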

This blog runs in a Docker container. The final panel below shows the resident set size (RSS) of the container as reported by the Docker engine.

Grafana Pharo Docker metrics