World Cup 2018 with Glorp

10 June 2018

I last wrote about football.db in a Jul 2014 blog post. Four years have gone by, and the World Cup is here again. This time around, I've started building a Glorp descriptor system for the World Cup data.

football.db's data is described in YAML files which are assembled into SQLite databases using tools written in Ruby. From the constructed 2014 and 2018 World Cup databases, I've created SQL dumps and placed them in the repo. To get an SQLite database from the 2018 SQL dump file:

% sqlite3 wc2018.db < wc2018.sql

In its current state, footballdb-Glorp allows querying the initial World Cup first round group membership.

| login sess |
login := Login new
  database: UDBCSQLite3Platform new;
  host: '';
  port: '';
  username: '';
  password: '';
  databaseName: '/tmp/wc2018.db';
sess := OFDescriptor sessionForLogin: login.
sess login.
[   Transcript clear.	
    (sess read: OFTournamentTeam) do: [ :ea |
      Transcript show: ea group title , ' - ', ea team title; cr ]
] ensure: [ sess logout ].

This Glorp descriptor may not be completed in time for this World Cup which is starting in a few days, but will be in time for the next one for sure! :-) Load it thusly:

Metacello new 
  repository: 'github://PierceNg/footballdb-Glorp:master/repo';
  baseline: 'OpenFootball';

SQLite Boolean

20 May 2018

From its documentation, SQLite does not have a separate Boolean storage class. Instead, Boolean values are stored as integers 0 (false) and 1 (true).

I've added Boolean handling to UDBC-SQLite. When writing to an SQLite database from Pharo, true is written as integer 1 and false as integer 0. SQLite uses dynamic typing, and any column in an SQLite database, except an INTEGER PRIMARY KEY column, may be used to store a value of any type, irrespective of the column's type declaration. As such, when writing Boolean values to a database, UDBC-SQLite does not check the database's type declaration.

When reading an SQLite database, UDBC-SQLite does check a column's type declaration: If a column is Boolean, UDBC-SQLite reads 1 as true, 0 as false, NULL as nil, and any other integer values raises an exception. I've encountered real world data where the string "t" means true and "f" means false for a Boolean column, so UDBC-SQLite handles these cases too.

Glorp has been similarly updated. Loading GlorpSQLite, from my development fork for now, installs both UDBC-SQLite and Glorp:

Metacello new
  repository: 'github://PierceNg/glorp-sqlite3';
  baseline: 'GlorpSQLite';

All Glorp unit tests should pass. Tested on Linux using fresh 32- and 64-bit 60540 images.

SQLite Loadable Extensions

4 March 2018

SQLite has the ability to load extensions at run-time. I've now implemented this functionality in UDBC-SQLite.

An SQLite extension is built as a .so/dylib/dll shared library file. Let's use SQLite's rot13 extension as our example. The source file rot13.c is located in the SQLite source code's ext/misc directory. To build the rot13 extension, also download the amalgamation. Unzip the amalgamation and copy rot13.c into its directory. Build the extension:

% gcc -fPIC -shared -o -I. rot13.c

Verify that the extension works:

% sqlite3
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select rot13('qwerty');
Error: no such function: rot13
sqlite> .load "./"
sqlite> select rot13('qwerty');
sqlite> select rot13('djregl');

For use with Pharo, copy into the Pharo VM directory where all the other .so files are.

Next steps are done in Pharo. For the purpose of this blog post, I downloaded a fresh Pharo 60536 64-bit image. Start the image and install GlorpSQLite from the Catalog browser, which installs the latest UDBC-SQLite. (This also installs Glorp, of course. If you run Glorp's unit tests from Test Runner you should get all 890 tests passed.)

In a playground, run this snippet and Transcript should show, first, the text "no such function: rot13" and then the rot13 outputs.

| db rs r1 r2 |
Transcript clear.
db := UDBCSQLite3Connection openOn: ':memory:'.
[   [ db execute: 'select rot13(''qwerty'')' ]
      on: UDBCSQLite3Error
      do: [ ex: | Transcript show: ex messageText; cr ].
    db enableExtensions;          "<== New stuff."
      loadExtension: ''.  "<== New stuff."
    rs := db execute: 'select rot13(''qwerty'') as r1'.
    r1 := rs next at: 'r1'.
    Transcript show: r1; cr. 
    rs := db execute: 'select rot13(?) as r2' with: (Array with: r1).
    r2 := rs next at: 'r2'.
    Transcript show: r2; cr.
] ensure: [ db close ]

Note the messages #enableExtensions and #loadExtension:. For security reasons, extension loading is disabled by default.

Tested on Linux latest Pharo 64-bit 60536.

Glorp SQLite on Pharo 6

1 February 2017

I've updated ConfigurationOfGlorp for Pharo 6 and added catalog methods to ConfigurationOfGlorpSQLite.

Take a fresh Pharo 60365 image, the latest as of yesterday's download. Launch it, open the Catalog Browser, and install GlorpSQLite from there.

Run the Glorp tests in TestRunner. The result should be green, with all 889 tests passed and 12 tests skipped. The database file is sodbxtestu.db in your image directory.

(On Pharo 5, Glorp runs and passes total of 953 tests. Something to look into.)


31 January 2017

I've created ConfigurationOfGlorpSQLite on STH.

Take a fresh Pharo 5 image. Make a script like the following and run it on the image:

% cat
pharo $1.image config $MCREPO ConfigurationOfGlorpSQLite --install=stable

% ./ Pharo-50757

When done, fire up the image again, and run the Glorp tests in TestRunner. The result should be green, with all 953 tests passed and 12 tests skipped. The database file is sodbxtestu.db in your image directory.

Glorp-SQLite3 for Pharo 5

6 June 2016

I'm pleased to announce the release of Glorp-SQLite3 for Pharo 5.

Developed and tested on Linux. Known working on Windows 7. Your Pharo 5 VM needs to be able to find or the Windows equivalent.

Take a fresh Pharo 5 image. Run the following:

Gofer it
    smalltalkhubUser: 'TorstenBergmann' project: 'UDBC';
(Smalltalk at: #ConfigurationOfUDBC) loadBleedingEdge.

Gofer it
    smalltalkhubUser: 'DBXTalk' project: 'Glorp';
    configurationOf: 'Glorp';
#ConfigurationOfGlorp asClass project stableVersion load.

Gofer it
    smalltalkhubUser: 'DBXTalk' project: 'Glorp';
    package: 'Glorp-SQLite3';

GlorpSQLite3CIConfiguration new configureSqlite3.
GlorpDemoTablePopulatorResource invalidateSetup.

Run the Glorp tests in TestRunner. All tests should pass, with 12 tests skipped. The database file is sodbxtestu.db in your image directory.

SQLite and JSON

24 April 2016

SQLite 3.9.0, released in Oct 2015, introduced the json1 extension that provides functions to manage JSON content stored in an SQLite database. The extension source code is included in the SQLite source amalgamation and is enabled by the -DSQLITE_ENABLE_JSON1 option.

The following assumes that the json1 functionality is available. As sample data, I use the stocks data set from Here's what the wrapped first line of the JSON data file looks like partially:

{ "_id" : { "$oid" : "52853800bb1177ca391c17ff" }, "Ticker" : "A", "Profit
Margin" : 0.137, "Institutional Ownership" : 0.847, "EPS growth past 5
years" : 0.158, "Total Debt/Equity" : 0.5600000000000001, "Current Ratio" :
3, "Return on Assets" : 0.089, "Sector" : "Healthcare", "P/S" : 2.54,
"Change from Open" : -0.0148, "Performance (YTD)" : 0.2605, "Performance
(Week)" : 0.0031, "Quick Ratio" : 2.3, "Insider Transactions" : -0.1352,

To load the file into an SQLite database using NBSQLite3:

| jr db |
jr := (FileSystem / 'tmp' / 'stocks.json') asFileReference readStream.
jr ascii.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   db execute: 'create table s (d json)'.
    db execute: 'begin'.
    [ jr atEnd ] whileFalse: [ 
        db execute: 'insert into s values (json(?))' "<== Note."
            with: (Array with: jr nextLine) ] 
    db execute: 'commit'.
] ensure: [ db close ]
jr close.

Note the invocation of the json() SQL function, which verifies that its argument is a valid JSON string and returns a minified version of that JSON string.

Let's inspect the created database with the sqlite3 shell tool:

$ wc -l stocks.json
6756 stocks.json
$ sqlite3 s.db
SQLite version 3.9.2 2015-11-02 18:31:45
Enter ".help" for usage hints.
sqlite> select count(*) from s;
sqlite> .schema
CREATE TABLE s (d json);

The json_extract(X, P1, P2, ...) function extracts and returns one or more values from the JSON structure X:

sqlite> select json_extract(d, '$.Sector', '$.Company') 
   ...> from s 
   ...> where json_extract(d, '$.Ticker') = 'AAPL';
["Consumer Goods","Apple Inc."]

The json_each() table-valued function walks the JSON value provided as its argument and returns one row for each element. The schema for the table returned by json_each() is as follows:

CREATE TABLE json_tree(
    key ANY,             -- key for current element relative to its parent
    value ANY,           -- value for the current element
    type TEXT,           -- 'object','array','string','integer', etc.
    atom ANY,            -- value for primitive types, null for array & object
    id INTEGER           -- integer ID for this element
    parent INTEGER,      -- integer ID for the parent of this element
    fullkey TEXT,        -- full path describing the current element
    path TEXT,           -- path to the container of the current row
    json JSON HIDDEN,    -- 1st input parameter: the raw JSON
    root TEXT HIDDEN     -- 2nd input parameter: the PATH at which to start

Here's json_each() in action:

sqlite> select key, value 
   ...> from s, json_each(s.d) 
   ...> where json_extract(d, '$.Ticker') = 'AAPL'
   ...> limit 10;
Profit Margin|0.217
Institutional Ownership|0.621
EPS growth past 5 years|0.424
Total Debt/Equity|0.14
Current Ratio|1.7
Return on Assets|0.186
Sector|Consumer Goods

Let's run the same query in Pharo:

| q db v |
q := 'select key, value from s, json_each(s.d) where json_extract(d, "$.Ticker") = "AAPL" limit 10'.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   (db execute: q) rows do: [ :r |
        Transcript show: (r at: 'key') asString, ' = '.
        v := r at: 'value'.
        Transcript show: v asString, ' (', v class asString, ')'; cr ]
] ensure: [ db close ]

Transcript displays thusly. As we can see, SQLite and NBSQLite3 handle numerical values in the JSON data properly.

_id = {"$oid":"52853800bb1177ca391c1809"} (ByteString)
Ticker = AAPL (ByteString)
Profit Margin = 0.217 (Float)
Institutional Ownership = 0.621 (Float)
EPS growth past 5 years = 0.424 (Float)
Total Debt/Equity = 0.14 (Float)
Current Ratio = 1.7 (Float)
Return on Assets = 0.186 (Float)
Sector = Consumer Goods (ByteString)
P/S = 2.75 (Float)

Notice that the first row is well-formed JSON. If we use json_tree() instead of json_row(), that JSON structure will be returned as a key/value row as well.

Finally, let's read the first JSON record using NeoJSON:

| q db  |
q := 'select json_extract(d, "$") as data from s limit 1'.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   (NeoJSONReader on: ((db execute: q) rows first at: 'data') readStream)
        next explore
] ensure: [ db close ].

This is powerful stuff. Well worth a look.

NBSQLite3 on Dolphin Smalltalk 7

31 December 2015

Dolphin Smalltalk 7 has been released as open source. Many thanks to Object Arts.

I spent an afternoon updating the mostly unused Windows partition on my laptop, installing various development tools, and playing with Dolphin 7. As a proof of concept, NBSQLite3 now runs on Dolphin 7. :-)

Happy new year!

NBSQLite3 with SQLcipher

24 December 2015

SQLcipher "is an open source extension to SQLite that provides transparent 256-bit AES encryption of database files." SQLcipher provides the same API as the SQLite Encryption Extension (SEE) by D Richard Hipp, the original developer of SQLite.

I've added SQLcipher/SEE's API to NBSQLite3. By convention, on Unix, SQLcipher produces a shared library named, while the SQLite shared library is named NBSQLite3 switches between the two libraries based on the messages #beSQLcipher and #beSQLite to the NBSQLite3FFI class.

Here's a demonstration code snippet using the keying pragma in SQL:

| dbf db rs row |

Transcript open; clear.	
NBSQLite3FFI beSQLcipher.
dbf := FileReference newTempFilePrefix: 'cipher-' suffix: '.cdb'.

db := NBSQLite3Connection openOn: dbf fullName.
Transcript show: 'Creating an encrypted database with some data.'; cr.
[   db basicExecute: 'pragma key = "test"'.
    db basicExecute: 'create table if not exists x (xk integer primary key, iv integer, tv text);'.
    db beginTransaction.
    [  rs := db execute: 'insert into x values (NULL, ?, ?)' with: #(1 'two') ] ensure: [ rs close ].
    db commitTransaction
] ensure: [ db close ].

db := NBSQLite3Connection openOn: dbf fullName.
Transcript show: 'Opening the encrypted database.'; cr.
[   db basicExecute: 'pragma key = "test"'.
    [   rs := db execute: 'select * from x'.
        row := rs next.
        Transcript show: (row at: 'xk'); cr.
        Transcript show: (row at: 'iv'); cr.
        Transcript show: (row at: 'tv'); cr.
    ] ensure: [ rs close ]
] ensure: [ db close ].

dbf delete.

NBSQLcipherExample class>>examplePragma contains a longer version of the above snippet that includes reopening the encrypted database file without the keying pragma and using the SQLite library.

Tested on Linux Mint. Code updated in Smalltalk Hub. Some refactoring to be expected, because the above snippet using the keying pragma is the only test I've done.

I've placed a copy of here; it is built on my Linux Mint 17.x laptop from the source here, linking in LibreSSL 2.2.4's libcrypto.a statically.

% openssl sha256
SHA256( 441cbc559a4f38a018121c6d86caa0cf0fb2c5b2a57c353cc09a4e048ec8ebe8

% ldd =>  (0xf77da000) => /lib/i386-linux-gnu/ (0xf7569000) => /lib/i386-linux-gnu/ (0xf754d000) => /lib/i386-linux-gnu/ (0xf739e000)
    /lib/ (0xf77dd000)

For good measure, I've also put up a copy of sqlcipher built at the same time. It requires readline.

% openssl sha256 sqlcipher
SHA256(sqlcipher)= 4ccb3cf2064d41675406a55c8404a8877a40541dd9830009f4c0e203468e3d7b

% ldd sqlcipher =>  (0xf770a000) => /lib/i386-linux-gnu/ (0xf76a7000) => /lib/i386-linux-gnu/ (0xf76a2000) => /lib/i386-linux-gnu/ (0xf7685000) => /lib/i386-linux-gnu/ (0xf74d7000) => /lib/i386-linux-gnu/ (0xf74b5000)
    /lib/ (0xf770d000)

Scripting with Pharo

17 May 2015

A couple of links to set the scene, one from 2004, the other from 2005. Their titles relate to this post, although their content doesn't really.

While Pharo is an immersive interactive programming environment, it can also be used for operating system command line-oriented scripting. Take a freshly downloaded standard Pharo 4 image, here renamed p4cli.image:

% pharo -vm-display-null p4cli.image --list
Currently installed Command Line Handlers:
Fuel            Loads fuel files
config          Install and inspect Metacello Configurations from the command line
save            Rename the image and changes file
update          Load updates
printVersion    Print image version
st              Loads and executes .st source files
test            A command line test runner
clean           Run image cleanup
eval            Directly evaluates passed in one line scripts
Let's see how to use NBSQLite3's online backup functionality to back up a live SQLite database through a Pharo script.

Firstly, install the current bleedingEdge version of NBSQLite3.

% pharo -vm-display-null p4cli.image config \ \
      ConfigurationOfNBSQLite3 --install=bleedingEdge
'Installing ConfigurationOfNBSQLite3 bleedingEdge'

Loading 1.2 of ConfigurationOfNBSQLite3...
Fetched -> NBSQLite3-Core-PierceNg.7 --- ...
...finished 1.2%
Now make a script of the backup code snippet seen in my previous blog post, changing the database file names. In this case, I am using the script to make a copy of my server's firewall log database, which is live and constantly updated by the logging daemon. The script is named ''.

| srcDB dstDB backup |
srcDB := NBSQLite3Connection openOn: '/var/log/ulog/ulogd.db'.
dstDB := NBSQLite3Connection openOn: '/tmp/ulogd-backup.db'.
[   backup := NBSQLite3Backup new.
    srcDB asBackupSource: backup.
    dstDB asBackupDestination: backup.
    backup prepare.
    [ [ backup completed ] whileFalse: [ backup step ] ] ensure: [ backup finish ]
] ensure: [ srcDB close. dstDB close. ]

Run the script under Unix "time":

% time pharo -vm-display-null p4cli.image st --quit
pharo -vm-display-null p4cli.image st --quit 
0.26s user 0.10s system 53% cpu 0.675 total

% ls -l /var/log/ulog/ulogd.db* /tmp/*.db
-rw-r--r-- 1 pierce pierce 11636736 May 17 20:49 /tmp/ulogd-backup.db
-rw-r--r-- 1 ulog   ulog   11643904 May 17 20:50 /var/log/ulog/ulogd.db

The database files aren't identical, because the logging daemon has written yet more data since the backup was taken.

NBSQLite3 Online Backup

16 May 2015

SQLite provides an online backup API that operates incrementally, so that the source database file is locked only when it is actually read from, allowing other applications to use the source database while the backup is in progress.

I've added the backup API to NBSQLite3. Here's a demonstration code snippet using OpenFootball's World Cup 2014 database:

| srcDB dstDB backup |
srcDB := NBSQLite3Connection openOn: '/tmp/wc2014.db'.
dstDB := NBSQLite3Connection openOn: '/tmp/backup.db'.
[   backup := NBSQLite3Backup new.
    srcDB asBackupSource: backup.
    dstDB asBackupDestination: backup.
    backup prepare.
    [ [ backup completed ] whileFalse: [ backup step ] ] ensure: [ backup finish ]
] ensure: [ srcDB close. dstDB close. ]

Let's look at the two databases:

% pharo -vm-display-null p4cli.image config \
% ls -l *.db
-rw-r--r-- 1 pierce pierce 1101824 May 16 10:58 backup.db
-rw-r--r-- 1 pierce pierce 1101824 Jul 22  2014 wc2014.db

% openssl md5 backup.db wc2014.db
MD5(backup.db)= 408494059b67fc3c7d39b672ce7e525e
MD5(wc2014.db)= 28ab0865e87920ca2f2b1f0de710c622

Ok, the two files have the same size, but aren't bit-wise identical. Now try sqldiff that comes with SQLite:

% pharo -vm-display-null p4cli.image config \
% sqldiff --help
Usage: sqldiff [options] DB1 DB2
Output SQL text that would transform DB1 into DB2. 

% sqldiff wc2014.db backup.db

No output, meaning the two files are identical structurally and content-wise, according to sqldiff. Let's make sure:

% sqlite3 backup.db
sqlite> select count(*) from goals;
sqlite> select max(id) from goals;
sqlite> delete from goals where id > 168;
sqlite> select max(id) from goals;
sqlite> ^D

% sqldiff wc2014.db backup.db
DELETE FROM goals WHERE id=169;
DELETE FROM goals WHERE id=170;
DELETE FROM goals WHERE id=171;
The output shows that sqldiff and the SQLite library are in cahoots to make two database files appear identical even when they are not. Haha, no, I mean, the output gives us some assurance that the backup API operated as advertised and created a proper backup of the database.

Now, the demonstration code snippet is somewhat contrived, because it backed up a totally quiescent source database. Next step is to verify backing up a database in active use.

Code updated in Smalltalk Hub. Unit tests and ConfigurationOf update to come.

NBSQLite3 Date Time Handling

9 April 2015

SQLite does not support date/time as a built-in data type. Instead, SQLite provides date and time functions to deal with ISO 8601-format strings, floating point values representing Julian days, and Unix epoch time integers.

I've added date/time handling to NBSQLite3, so that storing a DateAndTime object results in an ISO 8601 string in the database, and reading such a string from the database, provided the column type is declared as DateTime, yields a DateAndTime object.

Here's an example using the low-level API:

| dt db stmt |

dt := DateAndTime year: 2015 month: 4 day: 1.

db := NBSQLite3BaseConnection openOn: ':memory:'.
    db basicExecute: 'create table x (xv datetime)'.
    db basicExecute: 'begin'.
    stmt := db prepare: 'insert into x values (?)'.
    stmt at: 1 putDateTime: dt.
    stmt step. stmt finalize.
    db basicExecute: 'commit'.

    stmt := db prepare: 'select xv from x'.
    stmt basicExecute: [ :row |
        | v |
        v := row dateTimeAt: 0.
        Transcript show: v year asString; cr.
        Transcript show: v month asString; cr.
        Transcript show: v dayOfMonth asString; cr ].
    stmt finalize.
] ensure: [ db close ]

And here's the example using the high-level API:

| dt db rs v |

dt := DateAndTime year: 2015 month: 4 day: 1.

db := NBSQLite3Connection openOn: ':memory:'.
    db basicExecute: 'create table x (xv datetime)'.
    db basicExecute: 'begin'.
    db execute: 'insert into x values (?)' with: (Array with: dt).
    db basicExecute: 'commit'.

    rs := db execute: 'select xv from x'.
    v := (rs next) at: 'xv'.
    Transcript show: v class asString; cr.
    Transcript show: v year asString; cr.
    Transcript show: v month asString; cr.
    Transcript show: v dayOfMonth asString; cr.
    rs close
] ensure: [ db close ]

Glorp with NBSQLite3 on Pharo 4

3 April 2015

I've integrated NBSQLite3 into Glorp on the current Pharo 4.0 v40592 beta image.

Firstly, install NBSQLite3 (TorstenBergmann.7) and then Glorp (TorstenBergmann.42) from the config browser.

Then, using the Monticello browser, open the NBSQLite3 repository and load the packages NBSQLite3-Glorp and NBSQLite3-Test-Glorp. Next, open the Glorp repository and load the packages Glorp-PierceNg.97.mcz and GlorpTests-PierceNg.44.mcz.

In a workspace/playground, execute the following:

GlorpDemoTablePopulatorResource invalidateSetup.
    defaultLogin: GlorpDatabaseLoginResource defaultNBSQLite3LocalLogin

Open Test Runner and run the Glorp tests.

Tested on Linux Mint 17 and OSX Mavericks. 2 fewer tests passed on Linux.

Curiously, #testLargeBlob, #testLargeClob and #testLargeText passed on the Pharo 3 image that I wrote this code on.

The database file created by the tests is sodbxtest.db.

% sqlite3 sodbxtest.db
sqlite> .tables
AIRLINE                  GALLERY_LINK             PERISHABLE_ITEM        
AIRLINE_MEAL             GLORP_IMAGE              PERSON                 
ATTACHMENT               GLORP_IMAGE_FILE         POULTRY                
ATTACHMENTBYTES          GLORP_JOB                PUBLISHER_EMP          
BANK_ACCT                GLORP_OWNER              PUBLISHER_TITLE        
BANK_TRANS               GLORP_SLAVE              PUBLISHER_TITLE2       
BOOK                     GLORP_TAG                PUB_EMP_LINK           
BOOK_CUSTOMER            GLORP_TAGS               PUB_TITLES_STOCK       
CUSTOMER_BOOK_LINK       GR_ADDRESS               RESERVATION            
DEFAULTABLE_THING        GR_CUSTOMER              STUFF                  
DOCUMENT                 GR_FOLDER                TAX                    
EMAIL_ADDRESS            GR_MESSAGE               TRANSFORMED_TIME       
EMPLOYEE                 GR_PUBLISHER             TREE_NODE              
ENCYC                    GR_THINGONE              TREE_NODE_LINK         
FKADDRESS                GR_USER                  VIDEO_PURCHASE_LINK    
FKCONTACT                IMAGETAGS                VIDEO_RENTAL           
FLIGHT                   ITINERARY                VIDEO_STORE            
FLIGHT_PASS              NONPERISHABLE_ITEM       WAREHOUSE              
FREQUENT_FLYER           OFFICE                   WAREHOUSE_ITEM_LINK    
GALLERY                  PASSENGER                WORKING_STIFF          

NBSQLite3 on STH

15 January 2015

Torsten Bergmann refactored NBSQLite3 and moved it to PharoExtras on SmalltalkHub. Here's the announcement to Pharo-Dev.

Thanks Torsten!

Glorp with NBSQLite3

5 October 2014

Made really good progress with NBSQLite3 for Glorp.

On the failed tests:

#testReadTimestamp - SQLite3 does not
support date/time/timestamp types natively. More investigation needed.

#testDateCast - The test itself says, "This test will fail on SQLite which
has no DB Date type; see test comment."

#testDeleteAndReinsertTwiceWhenRemovingFromAnExclusiveCollection - This test
is expected to result in a primary key conflict. Without studying the code,
my conjecture is that the test's failing has to do with SQLite's primary
key handling.

#testInt8 - Integer overflow on 32-bit SQLite, I think.

#testPreparedStatementsAreReused - PostgresV2 doesn't support prepared
statements. NBSQLite3 does. Mayhaps this is why? More investigation needed.

#testUpdatingComment - Test passes on OSX Mavericks but fails on Linux Mint
17. No idea why.

#testUpdatingOtherThing - Test passes on OSX Mavericks but fails on Linux
Mint 17. No idea why.

The entire GlorpOptimisticLockingTest and GlorpTimestampTest suites are skipped, because some of the tests fail, and foobars the Pharo-SQLite interface, causing many subsequent tests to fail, requiring restart of the Pharo image. Still need to look into these.

Glorp with NBSQLite3

27 September 2014

Making good progress with NBSQLite3 for Glorp.

Glorp with NBSQLite3

24 September 2014

Sven van Caekenberghe has written a very nice tutorial implementing a Reddit clone in Pharo using Seaside, Glorp and PostgreSQL. Sven also makes available a prebuilt image containing the application.

Seeing that the image contains Glorp working with the PostgresV2 driver, I set about integrating NBSQLite3 with Glorp. After about an afternoon's work, I now have working with Glorp+NBSQLite3.

$ sqlite3 reddit.db
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from reddit_links;
1||Pharo Smalltalk|2014-09-22 22:46:53|1
2||Planet Smalltalk|2014-09-22 22:47:18|1
6||The World of Smalltalk|2014-09-22 22:58:50|0

There is still much to be done to get Glorp fully working with NBSQLite3: Some tests apparently expect Glorp proxies, but are getting OrderedCollections, and one particular test rendered my Linux X session non-responsive to mouse and keyboard, except for screen brightness key chords!

NBSQLite3 - Callback from C to Pharo Smalltalk

2 August 2014

NBSQLite3 now does callback from C to Smalltalk. As a start, the tracing API, which "is invoked at various times when an SQL statement is being run".

| db cb tk s |
db := NBSQLite3Connection openOn: ':memory:'.
[ cb := NBSQLite3TraceCallback on: [ :appdata :sqlText |
    Transcript show: 'Trace #', appdata readString, ': '.
    Transcript show: sqlText; cr ].
  tk := NBExternalAddress fromString: '42'.
  db traceUsing: cb with: tk.
  db basicExecute: 'create table x (xk integer, xv integer, tv text)'.
  s := db prepare: 'insert into x values (NULL, ?, ?)'.
  1 to: 10 do: [ :x |
      s at: 1 putInteger: x * x.
      s at: 2 putString: x asString, ' * ', x asString.
      s step. s clearBindings. s reset. ].		
  s finalize.
  db basicExecute: 'drop table x'.
  tk finalize 
] ensure: [ db close ]

The Transcript output looks thusly:

Trace #42: create table x (xk integer, xv integer, tv text)
Trace #42: insert into x values (NULL, 1, '1 * 1')
Trace #42: insert into x values (NULL, 4, '2 * 2')
Trace #42: insert into x values (NULL, 9, '3 * 3')
Trace #42: insert into x values (NULL, 16, '4 * 4')
Trace #42: insert into x values (NULL, 25, '5 * 5')
Trace #42: insert into x values (NULL, 36, '6 * 6')
Trace #42: insert into x values (NULL, 49, '7 * 7')
Trace #42: insert into x values (NULL, 64, '8 * 8')
Trace #42: insert into x values (NULL, 81, '9 * 9')
Trace #42: insert into x values (NULL, 100, '10 * 10')
Trace #42: drop table x

Test incorporated and passes on OSX Mountain Lion and Linux Mint 17 for Pharo 3. Get the source on SS3.


26 July 2014

I've been playing with the OpenFootball data using ROE. ROE provides RAArrayRelation, which shares a common superclass with RASQLRelation. As its name implies, RAArrayRelation uses an in-memory array object and does not involve any underlying SQL database.

In my OpenFootball data crunching, I tried JOINing instances of both. When the left side of the JOIN is the RASQLiteRelation instance, the operation threw an SQLite exception "no such table" because ROE expected the right side to also be an RASQLiteRelation on the same underlying database. When the left side is the RAArrayRelation instance, the operation sends #tuples to the right side, which resulted in a DNU. I've tried a few implementations of RASQLiteRelation>>tuples, but so far no success. Even if I get a working implementation, the implication of #tuples is that all matching rows from the SQLite database are loaded into memory, which is not always performant or feasible.

The reason for using an RAArrayRelation is to create a new relation unlike any existing one. The alternative is to create an RASQLiteRelation instance on a new, empty table. However, ROE does not support creating tables in the underlying database. Of course, I can use the given database connection to CREATE TABLE via direct SQL, but that doesn't feel satisfying.

Committed ROE with NBSQLite3 to STH

17 July 2014

I've committed my integration of ROE with NBSQLite3 to SmalltalkHub.

Tested on Pharo 3 on OSX Mountain Lion and Linux Mint 17 based on Ubuntu 14.04. All 23 tests of RATestSQLiteSemantics pass.

World Cup 2014 with ROE

14 July 2014

football.db is a "free open public domain football (soccer) database". Instructions to construct an SQLite data file from the data are on the website. The database consists of 40 tables; its schema looks normalized. With the World Cup 2014 and dependent data, the file weighs in at ~1MB.

Let's see what we can do with the data using ROE.

Each league/tournament is given an event ID; the World Cup's event_id is 8.

| conn groups teams wcTeams |
conn := NBSQLite3BaseConnection on: ('/home/pierce/data/openfootball/sport.db').
conn open.
groups := (((RASQLiteRelation name: 'groups' connection: conn)
    where: #event_id equals: 8) 
	    projectAll: #(id title)) 
	    renameAll: #(id title) to: #(gid gTitle).
teams := (RASQLiteRelation name: 'teams' connection: conn) 
    renameAll: #(id title) to: #(tid tTitle).
wcTeams := ((RASQLiteRelation name: 'groups_teams' connection: conn) * groups
    whereEqual: #(group_id gid)).
((wcTeams * teams whereEqual: #(team_id tid)) 
    projectAll: #(gTitle tTitle)) explore.

The above code snippet results in a RAProjection instance. We can see its content by the following (pretty-printed by hand):

(self collect: [ :ea | ea values asArray ]) asArray 
#(#('Group A' 'Brazil') 
  #('Group A' 'Croatia') 
  #('Group A' 'Mexico') 
  #('Group A' 'Cameroon') 
  #('Group B' 'Spain') 
  #('Group B' 'Netherlands') 
  #('Group B' 'Chile') 
  #('Group B' 'Australia') 
  #('Group C' 'Colombia') 
  #('Group C' 'Greece') 
  #('Group C' 'Côte d''Ivoire') 
  #('Group C' 'Japan') 
  #('Group D' 'Uruguay') 
  #('Group D' 'Costa Rica') 
  #('Group D' 'England') 
  #('Group D' 'Italy') 
  #('Group E' 'Switzerland') 
  #('Group E' 'Ecuador') 
  #('Group E' 'France') 
  #('Group E' 'Honduras') 
  #('Group F' 'Argentina') 
  #('Group F' 'Bosnia-Herzegovina') 
  #('Group F' 'Iran') 
  #('Group F' 'Nigeria') 
  #('Group G' 'Germany') 
  #('Group G' 'Portugal') 
  #('Group G' 'Ghana') 
  #('Group G' 'United States') 
  #('Group H' 'Belgium') 
  #('Group H' 'Algeria') 
  #('Group H' 'Russia') 
  #('Group H' 'South Korea'))

(Ivory Coast's proper name in the output doesn't look right in Pharo. The SQLite database is correct though.)


3 July 2014

I've started a page on NBSQLite3. Comments welcome.

ROE with NBSQLite3

30 March 2014

ROE is "a library for manipulating relational queries as first class Smalltalk expressions, and generating SQL from them."

Not bad for an afternoon's hacking, now ROE works with NBSQLite3 and all tests pass.

(Yes, I notice these screenshots are of rather poor quality.)

NBSQLite3 - NativeBoost SQLite for Pharo

1 March 2014

I like SQLite. SQLite is small, fast and reliable. As mentioned in this old blog post, I was extending the Squeaksource SQLite FFI wrapper to support parameter binding. Since then, Torsten Bergmann has also picked it up and now tends to it on SmalltalkHub.

Taking the chance to learn NativeBoost, with hints from Masashi Umezawa's PunQLite, I took my SQLite FFI changes and created NBSQLite3. NBSQLite3's small test suite runs successfully on Pharo 2.0 (OSX, Linux) and the current Pharo 3.0 beta on OSX.

Here's a multithreading example:

| db db2 sem blocking blocked |

db := NBSQLite3BaseConnection on: 'file::memory:?cache=shared'.
db2 := NBSQLite3BaseConnection on: 'file::memory:?cache=shared'.
sem := Semaphore new.	

Transcript open; clear.	

blocking := [ :x :sema |
    Transcript show: 'Thread-1: begin exclusive...'; cr; flush.
    [ x open.
      x execute: 'begin exclusive'.
      sema wait.
      Transcript show: 'Thread-1: releasing exclusive lock.'; cr; flush.
      x execute: 'rollback' 
    ] ensure: [ x close ] 
] newProcessWith: (Array with: db with: sem).

blocking resume.
Processor yield.

blocked := [ :x |
    Transcript show: 'Thread-2: create table ...'; cr; flush.
    [[ x open. x execute: 'create table if not exists x (xk integer)' ]
         on: NBSQLite3Locked do: [ :ex | Transcript show: ex messageText; cr; flush ]
    ] ensure: [ x close ]
] newProcessWith: (Array with: db2).

blocked resume.
Processor yield.

sem signal.

To run this example, evaluate "NBSQLite3Example multithreading". The output looks thusly:

Thread-1: begin exclusive...
Thread-2: create table ...
database schema is locked: main
Thread-1: releasing exclusive lock.

The SQLite shared library is compiled with -DSQLITE_USE_URI=1 and -DSQLITE_THREADSAFE=2.

StackOverflow SQLite Full Text Search

25 November 2012

In the previous entry, I wrote about parsing the StackOverflow XML dump files.

This entry is about setting up SQLite full text search for the StackOverflow posts data. In Smalltalk, set up the database schema:

| schema db |

schema := 'create table posts (Id integer primary key, ...)'.

db := SQLiteConnection fileNamed: 'so.db'.
db open.
db executeQuery: schema.
db close.	

The key SAX parsing callback method becomes, in essence, as follows:

startElement: aQualifiedName attributes: aDict
    | stmt |

   "db is the instance variable with the database connection."
    stmt := db prepare: 'insert into posts values (?,...,?)'.

    aQualifiedName = 'row' ifTrue: [ 
        stmt at: 1 putInteger: (aDict at: 'Id') asInteger.
        stmt at: 20 putInteger: (aDict at: 'FavoriteCount' ifAbsent: [-1]) asInteger.
        stmt step. stmt clearBindings. stmt reset ].

    stmt finalize.

On my computer, running through the 6+ million posts took about an hour. The resultant SQLite database file is about 6.5GB in size. Here are the comparative file sizes:

-rw-r--r--  1 pierce  staff   7569879502 Sep  7  2011 posts.xml
-rw-r--r--  1 pierce  staff   6819280896 Nov 24 20:31 so.db

Next, let's set up full text search on this database:

% sqlite3 so.db
sqlite> create virtual table sposts using fts4 (id, body, title);
sqlite> insert into sposts select id, body, title from posts;
sqlite> create virtual table stags using fts4 (id, tags);
sqlite> insert into stags select id, tags from posts;

I didn't time how long the INSERTs took. Setting up full text search boosted the database's file size:

-rw-r--r--  1 pierce  staff   6819280896 Nov 24 20:31 so.db (before FTS)
-rw-r--r--  1 pierce  staff  16491300864 Nov 25 21:34 so.db (after FTS)

Now search the database using the sqlite3 shell tool:

% sqlite3 so.db
sqlite> select count(*) from stags where tags match 'smalltalk';
sqlite> select max(id) from stags where tags match 'smalltalk';
sqlite> select body, tags from posts where id = 7260027;
<p>what's the best server side programming language for web development
if we keep very good performance and a lot of database interaction of
the script with server in mind. i am already working on a project with
php, i'm fearing that my application might have a lot of traffic, i
just want to use only the server side language which is best in terms
of performance, and please also guide me any major step i should take
to prevent my application from crashing when it will face a huge
traffic, what major step i might need to take to stop myself crying
after launching my applicaition, i heard that twitter use ruby, how's
ruby's performance? my concern is only and only performance and
database interaction, thanks in advance.</p>

SQLite Parameter Binding

28 April 2011

I like SQLite. SQLite is small, fast and reliable. There is an SQLite FFI wrapper on Squeaksource which is very simple to use. Let's start with this database:

$ sqlite3 /tmp/test.db
SQLite version 3.7.6
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table a (k integer primary key, tv text, iv integer);
sqlite> insert into a values (1, 'wah wah wah', 7);
sqlite> insert into a values (2, 'quick brown fox', 42);

Run the following snippet in a workspace:

| db |
Transcript clear.
db := SQLiteConnection fileNamed: '/tmp/test.db'.
db withResultForQuery: 'select * from a;' do: [ :result |
    result do: [ :row | Transcript show: row asString; cr ]].
db close.

The output is shown in the Transcript thus:

a Dictionary('iv'->'7' 'k'->'1' 'tv'->'wah wah wah' )
a Dictionary('iv'->'42' 'k'->'2' 'tv'->'quick browm fox' )

The underlying SQLite API used by withResultForQuery:do: is sqlite3_get_table, which is "a legacy interface preserved for backwards compatibility". Using this API, at the C level, NULL values are NULL pointers, and all other values are in their UTF-8 zero-terminated string representation. Not very interesting at the Smalltalk level, either, as seen from the Transcript.

SQLite also supports "prepared statements", aka "parameter binding". A prepared statement is compiled once (using sqlite3_prepare, into an SQLite VDBE virtual machine program), then executed multiple times (using sqlite3_step). Parameters are bound using sqlite3_bind_XXX calls, and values are retrieved using sqlite3_column_XXX calls. Parameter binding defends against SQL injection attacks.

I'm extending the SQLite FFI wrapper to support some of the parameter binding APIs. (Well, until enough to do my stuff.) Here's an example code snippet:

| db stmt |
Transcript clear.
db := SQLiteConnection fileNamed: '/tmp/test.db'.
db open.
[ stmt := db prepare: 'select * from a;'.
  stmt isNil ifFalse: [  
      | value |
      stmt execute: [ :row |
          value := row stringAt: 1.
          Transcript show: value, ' (', value class asString, ')'; cr.
          value := row integerAt: 2.
          Transcript show: value asString, ' (', value class asString, ')'; cr ].
      stmt finalize ]]
ensure: [ db close ]

The Transcript displays the following:

wah wah wah (ByteString)
7 (SmallInteger)
the quick brown fox (ByteString)
42 (SmallInteger)