As tweeted: Docker image of Pharo VM, 7.0.3-based app Pharo image, changes/sources files, Ubuntu 16.04, 484MB. With Pharo VM Alpine Docker image, 299MB. Build app Pharo image from minimal, run without changes/sources, Alpine Pharo VM, 83MB!
And here's the Docker container resident set size upon switching from the 484MB Docker image to the Alpine-based Docker image:
Norbert Hartl and I are collaborating on minimizing Dockerized Pharo. All are welcome to join.
I've successfully built the pharo.cog.spur.minheadless OpenSmalltalk VM on Alpine Linux. Dockerizing the Pharo VM files plus a built-from-source libsqlite3.so (without Pharo image/changes/etc) produces a Docker image weighing in at 12.5MB.
% sudo docker images | egrep "samadhiweb.*alpine"
samadhiweb/pharo7vm   alpine   299420ff0e03   21 minutes ago   12.5MB
Pharo provides minimal images that contain basic Pharo packages without the integrated GUI, useful for building server-side applications.
From the Pharo 7.0.3 minimal image, I've built a "stateless" image containing FFI, Fuel, and UDBC-SQLite. "Stateless" means the image can be executed by the Pharo VM without the changes and sources files. Here are the sizes of the base minimal image and my SQLite image:
% ls -l udbcsqlite.image Pharo7.0.3-0-metacello-64bit-0903ade.image
-rw-rw-r-- 1 pierce pierce 13863032 Apr 12 22:56 Pharo7.0.3-0-metacello-64bit-0903ade.image
-rw-rw-r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
Below, I run the image statelessly on my Ubuntu laptop. Note that this uses the regular Pharo VM, not the Alpine Linux one.
% mkdir stateless
% cd stateless
% ln ../udbcsqlite.image
% chmod 444 udbcsqlite.image
% cat > runtests.sh <<EOF
#!/bin/sh
~/pkg/pharo7vm/gofaro -vm-display-none udbcsqlite.image test --junit-xml-output "UDBC-Tests-SQLite-Base"
EOF
% chmod a+x runtests.sh
% ls -l
total 16744
-rwxr-xr-x 1 pierce pierce      116 Jul 20 15:25 runtests.sh*
-r--r--r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
%
%
% ./runtests.sh
Running tests in 1 Packages
71 run, 71 passes, 0 failures, 0 errors.
% ls -l
total 16764
-rw-r--r-- 1 pierce pierce     6360 Jul 20 15:26 progress.log
-rwxr-xr-x 1 pierce pierce      116 Jul 20 15:25 runtests.sh*
-r--r--r-- 3 pierce pierce 17140552 Jul 20 14:11 udbcsqlite.image
-rw-r--r-- 1 pierce pierce    11072 Jul 20 15:26 UDBC-Tests-SQLite-Base-Test.xml
udbcsqlite.image together with the aforementioned Alpine Linux Pharo VM produces a Docker image that is 46.8MB in size.
% sudo docker images | egrep "samadhiweb.*alpine"
samadhiweb/p7minsqlite   alpine   3a57853099d0   44 minutes ago      46.8MB
samadhiweb/pharo7vm      alpine   299420ff0e03   About an hour ago   12.5MB
Run the Docker image:
% sudo docker run --ulimit rtprio=2 samadhiweb/p7minsqlite:alpine
Running tests in 1 Packages
71 run, 71 passes, 0 failures, 0 errors.
For comparison and contrast, here are the sizes of the regular Pharo 7.0.3 image, changes and sources files:
% ls -l Pharo7.0*-0903ade.*
-rw-rw-r-- 1 pierce pierce      190 Apr 12 22:57 Pharo7.0.3-0-64bit-0903ade.changes
-rw-rw-r-- 1 pierce pierce 52455648 Apr 12 22:57 Pharo7.0.3-0-64bit-0903ade.image
-rw-rw-r-- 2 pierce pierce 34333231 Apr 12 22:55 Pharo7.0-32bit-0903ade.sources
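To put the byte counts in this post side by side, here is a small sketch in Python (used purely for illustration; the variable names are mine). It sums the regular image, changes and sources files and compares the total with the stateless udbcsqlite.image. It compares file sizes on disk only and says nothing about memory use at run time.

```python
# Byte counts copied from the ls -l listings in this post.
REGULAR_FILES = {
    "Pharo7.0.3-0-64bit-0903ade.changes": 190,
    "Pharo7.0.3-0-64bit-0903ade.image": 52455648,
    "Pharo7.0-32bit-0903ade.sources": 34333231,
}
STATELESS_IMAGE = 17140552  # udbcsqlite.image, runs without changes/sources

regular_total = sum(REGULAR_FILES.values())
saved = regular_total - STATELESS_IMAGE

print(f"regular image + changes + sources: {regular_total} bytes")
print(f"stateless image:                   {STATELESS_IMAGE} bytes")
print(f"difference:                        {saved} bytes")
print(f"stateless / regular:               {STATELESS_IMAGE / regular_total:.1%}")
```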
In my previous post on the TIG monitoring stack, I mentioned that Telegraf supports a large number of input plugins. One of these is the generic HTTP plugin that collects from one or more HTTP(S) endpoints producing metrics in supported input data formats.
I've implemented Telemon, a Pharo package that allows producing Pharo VM and application-specific metrics compatible with the Telegraf HTTP input plugin.
Telemon works as a Zinc ZnServer delegate. It produces metrics in the InfluxDB line protocol format.
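To illustrate the shape of such an endpoint outside of Pharo, here is a minimal Python sketch of an HTTP handler that answers every GET with one line of InfluxDB line protocol. It is not Telemon's code: the names `MetricsHandler`, `collect_fields`, `render_line` and `make_server` are hypothetical, and the single `uptime` field is a stand-in for real VM statistics.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START = time.monotonic()


def collect_fields():
    """Stand-in for real VM statistics: process uptime in milliseconds."""
    return {"uptime": int((time.monotonic() - START) * 1000)}


def render_line(measurement, fields):
    """Render 'measurement key=value,key=value' (no tags, no timestamp)."""
    field_set = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement} {field_set}"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Build the metrics line at request time and send it as plain text.
        body = (render_line("pharo", collect_fields()) + "\n").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; no per-request logging


def make_server(port=0):
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(8080).serve_forever()` would serve the line until interrupted (the port number is arbitrary). A Telegraf HTTP input pointed at that URL and set to parse the influx data format should then be able to poll it.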
By default, Telemon produces the metrics generated by VirtualMachine>>statisticsReport, and its output looks like this:
TmMetricsDelegate new renderInfluxDB
"pharo uptime=1452854,oldSpace=155813664,youngSpace=2395408,memory=164765696,memoryFree=160273136,fullGCs=3,fullGCTime=477,incrGCs=9585,incrGCTime=9656,tenureCount=610024"
As per the InfluxDB line protocol, 'pharo' is the name of the measurement, and the items in key-value format form the field set.
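To make that structure concrete, here is a small Python sketch that splits such a line into its measurement, optional tag set and field set. It simplifies the line protocol: it ignores escaping, quoted string values and the optional trailing timestamp, and `parse_line` is a name made up for this example.

```python
def parse_line(line):
    """Split 'measurement[,tag=v...] field=v[,field=v...]' into its parts.

    Simplified: no escaped commas or spaces, no string fields, no timestamp.
    """
    head, field_part = line.strip().split(" ", 1)
    measurement, *tag_items = head.split(",")
    tags = dict(item.split("=", 1) for item in tag_items)
    fields = {}
    for item in field_part.split(","):
        key, value = item.split("=", 1)
        # Numbers without a type suffix are floats in the line protocol.
        fields[key] = float(value)
    return measurement, tags, fields


# Abbreviated from the sample output above.
sample = "pharo uptime=1452854,oldSpace=155813664,fullGCs=3"
measurement, tags, fields = parse_line(sample)
print(measurement)        # pharo
print(tags)               # {}
print(fields["fullGCs"])  # 3.0
```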
To add a tag to the measurement:
| tm |
tm := TmMetricsDelegate new.
tm tags at: 'host' put: 'telemon-1'.
tm renderInfluxDB
"pharo,host=telemon-1 uptime=2023314,oldSpace=139036448,youngSpace=5649200,memory=147988480,memoryFree=140242128,fullGCs=4,fullGCTime=660,incrGCs=14291,incrGCTime=12899,tenureCount=696589"
Above, the tag set consists of "host=telemon-1".
Here's another invocation that adds two user-specified metrics but no tag.
| tm |
tm := TmMetricsDelegate new.
tm fields
    at: 'meaning' put: [ 42 ];
    at: 'newMeaning' put: [ 84 ].
tm renderInfluxDB
"pharo uptime=2548014,oldSpace=139036448,youngSpace=3651736,memory=147988480,memoryFree=142239592,fullGCs=4,fullGCTime=660,incrGCs=18503,incrGCTime=16632,tenureCount=747211,meaning=42,newMeaning=84"
Note that the field values are Smalltalk blocks that will be evaluated dynamically.
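The same idea can be sketched in Python, with zero-argument callables standing in for Smalltalk blocks. This illustrates the technique and is not Telemon's implementation; the class name `Metrics` and its attributes are invented here.

```python
class Metrics:
    """Toy renderer: tags are plain strings, fields are zero-argument callables."""

    def __init__(self, measurement="pharo"):
        self.measurement = measurement
        self.tags = {}
        self.fields = {}

    def render(self):
        # Tags, if any, are appended to the measurement name with commas.
        head = ",".join(
            [self.measurement] + [f"{k}={v}" for k, v in self.tags.items()]
        )
        # Each field callable is evaluated now, at render time.
        field_set = ",".join(f"{k}={fn()}" for k, fn in self.fields.items())
        return f"{head} {field_set}"


counter = {"hits": 0}

m = Metrics()
m.tags["host"] = "telemon-1"
m.fields["meaning"] = lambda: 42
m.fields["hits"] = lambda: counter["hits"]

print(m.render())  # pharo,host=telemon-1 meaning=42,hits=0
counter["hits"] += 5
print(m.render())  # pharo,host=telemon-1 meaning=42,hits=5
```

Because each value is computed on every render, a metric registered once keeps reporting the current state of the application.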
When I was reading the specifications for Telegraf's plugins, the InfluxDB line protocol, etc., it all felt rather dry. I imagine this short post is the same so far for the reader who isn't familiar with how the TIG components work together. So here are teaser screenshots of the Grafana panels for the Pharo VM and blog-specific metrics for this blog, which I will write about in the next post.
This Grafana panel shows a blog-specific metric named 'zEntity Count'.
This next panel shows the blog-specific metric 'zEntity Memory' together with the VM metric 'Used Memory' which is the difference between the 'memory' and 'memoryFree' fields.
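As a worked example, taking the 'memory' and 'memoryFree' values from the first sample output earlier in this post, the derived 'Used Memory' value comes to about 4.5 MB. In Python, for illustration only (the function name is mine, and the actual Grafana query is not shown in this post):

```python
def used_memory(fields):
    """Used memory in bytes: total VM memory minus free memory."""
    return fields["memory"] - fields["memoryFree"]


# Values copied from the sample renderInfluxDB output in this post.
sample = {"memory": 164765696, "memoryFree": 160273136}
used = used_memory(sample)
print(used)                    # 4492560
print(f"{used / 1e6:.1f} MB")  # 4.5 MB
```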
This blog runs in a Docker container. The final panel below shows the resident set size (RSS) of the container as reported by the Docker engine.
I've set up the open source TIG stack to monitor the services running on these servers. TIG = Telegraf + InfluxDB + Grafana.
Telegraf is a server agent for collecting and reporting metrics. It comes with a large number of input, processing and output plugins. Telegraf has built-in support for Docker.
InfluxDB is a time series database.
Grafana is a feature-rich metrics dashboard supporting a variety of backends including InfluxDB.
Each of the above runs in a Docker container. Architecturally, Telegraf stores the metrics data that it collects into InfluxDB. Grafana generates visualizations from the data that it reads from InfluxDB.
Here are the CPU and memory visualizations for this blog, running on Pharo 7 within a Docker container. The data is as collected by Telegraf via querying the host's Docker engine.
The following comes to mind:
While Pharo is running on the server, historically I've kept its GUI running via RFBServer. I haven't had to VNC in for a long time now though. Running Pharo in true headless mode may reduce Pharo's CPU usage.
In terms of memory, ~10% usage by a single application is a lot on a small server. Currently this blog stores everything in memory once loaded/rendered. But with the blog's low volume, there really isn't a need to cache; all items can be read from disk and rendered on demand.
Only one way to find out - modify software, collect data, review.
This blog is now on HTTPS.
Caddy is an open source HTTP/2 web server. caddy-docker-proxy is a plugin for Caddy enabling Docker integration - when an appropriately configured Docker container or service is brought up, caddy-docker-proxy generates a Caddy site specification entry for it and reloads Caddy. With Caddy's built-in Let's Encrypt functionality, this allows the new container/service to run over HTTPS seamlessly.
Below is my docker-compose.yml for Caddy. I built Caddy with the caddy-docker-proxy plugin from source and named the resulting Docker image samadhiweb/caddy. The Docker network caddynet is the private network for Caddy and the services it is proxying. The Docker volume caddy-data is for persistence of data such as cryptographic keys and certificates.
Here's the docker-compose.yml snippet for the blog engine:
Of interest are the caddy.* labels from which caddy-docker-proxy generates the following in-memory Caddy site entry:
Also note the ulimits section, which sets the suggested limits for the Pharo VM heartbeat thread. These limits must be set in the docker-compose file or on the docker command line - copying a prepared file into /etc/security/limits.d/pharo.conf does not work when run in a Docker container.
Recently there were discussions and blog posts on Docker for Pharo and Gemstone/S. This is my report after spending an afternoon on the subject.
First, some links:
This blog is implemented in Pharo and is the natural choice for my Docker example application. I already have a Smalltalk snippet to load this blog's code and its dependencies into a pristine Pharo image, so I'll be using that. Also, as a matter of course, I build the Pharo VM from source, and my VM installation also contains self-built shared libraries like libsqlite.so and libshacrypt.so.
Outside of Docker, prepare a custom Pharo image:
gofaro is a simple shell script whose purpose is to make sure the Pharo VM loads my custom shared libraries, co-located with the standard VM files, at run time:
loadSCMS1.st looks like this:
Before describing my Dockerfile, here are my conventions for inside the Docker container:
Starting with Ubuntu 18.04, install libfreetype6. The other lines are copied from Torsten's tutorial.
Next, install the Pharo VM.
Now copy over the prepared Pharo image.
Finally, set the Docker container running. Here we create a UID/GID pair to run the application. Said UID owns the mutable Pharo files in /pkg/image and also the /pkg/image directory itself, in case the application needs to create other files such as SQLite databases.
runSCMS1.st runs the blog application. In my current non-Dockerized installation, the runSCMS1.st-equivalent snippet is in a workspace; for Docker, to become DevOps/agile/CI/CD buzzwords-compliant, this snippet is run from the command line. This is one Dockerization adaptation I had to make to my application.
Now we build the Docker image.
The Docker image has been created, but it is not ready to run yet, because the web content is not in the image. I'll put the content in a Docker volume. Below, the first -v mounts my host's content directory into /tmp/webcontent in the container; the second -v mounts the volume smdw-content into /pkg/cms in the container; I'm running the busybox image to get a shell prompt; and within the container I copy the web content from the source to the destination.
Finally, run the Docker image, taking care to mount the volume smdw-content, now with this blog's content:
Verified with a web browser. This works on my computer. :-)