Liquid: RDF endpoint for FluidDB

Related posts:

  1. Liquid: RDF meandering in FluidDB
  2. Temporary storage for Meandre’s distributed flow execution
  3. Efficient serialization for Java (and beyond)

A while ago I wrote some thoughts about how to map RDF to and from FluidDB. There I explored how you could map RDF onto FluidDB, and how to get it back. That got me thinking about how to get a simple endpoint you could query for RDF. Imagine that you could pull FluidDB data as RDF; then you would get all the flexibility of SPARQL for free. With this idea in mind I went and grabbed Meandre and the JFLuidDB library started by Ross Jones, and built a few components.

The main goal was to be able to get an object, list its tags, and express the result in RDF. FluidDB helps the mapping since objects are uniquely identified by URIs. For instance, the object 5ff74371-455b-4299-83f9-ba13ae898ad1 (FluidDB relies on version 4 UUIDs of the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx) is uniquely identified by http://sandbox.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1 (a URL of the form http://sandbox.fluidinfo.com/objects/xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx) if you are using the sandbox, or by http://fluiddb.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1 if you are using the main instance. The same goes for tags. The tag fluiddb/about is uniquely identified by the URI http://sandbox.fluidinfo.com/tags/fluiddb/about, or http://fluiddb.fluidinfo.com/tags/fluiddb/about.
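To make the URI scheme concrete, here is a minimal Python sketch of it. The two base URLs come from the paragraph above; the helper names object_uri and tag_uri are my own illustration and are not part of FluidDB or JFLuidDB.

```python
import uuid

# Base URLs taken from the text above; the helper names are hypothetical.
SANDBOX = "http://sandbox.fluidinfo.com"
MAIN = "http://fluiddb.fluidinfo.com"


def object_uri(object_id: str, base: str = SANDBOX) -> str:
    """Return the URI of a FluidDB object, rejecting ids that are not version 4 UUIDs."""
    parsed = uuid.UUID(object_id)  # raises ValueError on malformed input
    if parsed.version != 4:
        raise ValueError(f"not a version 4 UUID: {object_id}")
    return f"{base}/objects/{parsed}"


def tag_uri(tag_path: str, base: str = SANDBOX) -> str:
    """Return the URI of a tag given its path, e.g. 'fluiddb/about'."""
    return f"{base}/tags/{tag_path.strip('/')}"


print(object_uri("5ff74371-455b-4299-83f9-ba13ae898ad1"))
print(tag_uri("fluiddb/about", base=MAIN))
```

Passing base=MAIN switches from the sandbox to the main instance without touching the rest of the code.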

A simple RDF description for an object

Once you get the object back, the basic RDF translation of object a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db should look like the listing below, in Turtle notation.

<http://sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute fluiddb/default/tags/permission/update/policy"^^<http://www.w3.org/2001/XMLSchema#string> .

I will break the example above into its three main pieces (the id, the about, and the tags) and explain each one. The basic construct is simple. First, a triple marks the object as a FluidDB object.

<http://sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/>   
.

Then, if the object had an about value associated with it on creation, another triple gets generated and added, as shown below. To be consistent, I suggest reusing DC description, since that is what the about of an object tends to indicate.

<http://sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db>
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute fluiddb/default/tags/permission/update/policy"^^<http://www.w3.org/2001/XMLSchema#string> 
.

Finally, if there are tags associated with the object, a bag gets created, and all the URIs describing the tags get pushed into the bag, as shown below.

<http://sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description>
.
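The three pieces described above (the type triple, the optional DC description, and the bag of tags) can be sketched in a few lines of Python. This is only an illustration of the mapping, not the code of the Meandre component: the function names are hypothetical, and the output is written one triple per line in N-Triples, which is a subset of Turtle, instead of the abbreviated form used in the listings.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_DESCRIPTION = "http://purl.org/dc/elements/1.1/description"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for a string literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def object_to_triples(base, object_id, about=None, tag_paths=()):
    """Build (subject, predicate, object) terms for one FluidDB object."""
    subject = f"<{base}/objects/{object_id}>"
    # 1. Mark the subject as a FluidDB object.
    triples = [(subject, f"<{RDF}type>", f"<{base}/objects/>")]
    # 2. The about value, if any, becomes a DC description.
    if about is not None:
        literal = f'"{escape_literal(about)}"^^<{XSD_STRING}>'
        triples.append((subject, f"<{DC_DESCRIPTION}>", literal))
    # 3. Tags, if any, become numbered members of an rdf:Bag.
    if tag_paths:
        triples.append((subject, f"<{RDF}type>", f"<{RDF}Bag>"))
        for index, path in enumerate(tag_paths, start=1):
            triples.append((subject, f"<{RDF}_{index}>", f"<{base}/tags/{path}>"))
    return triples


def to_ntriples(triples):
    """Serialize one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = object_to_triples(
    "http://sandbox.fluidinfo.com",
    "a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db",
    about="Object for the attribute fluiddb/default/tags/permission/update/policy",
    tag_paths=["fluiddb/about", "fluiddb/tags/path", "fluiddb/tags/description"],
)
print(to_ntriples(triples))
```

An object with no about value and no tags yields just the single type triple, which matches the "if" wording in the description above.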

Creating an RDF endpoint

Armed with the mapping above, building the endpoint should be easy: query for objects, collect the information for each object, and generate the final RDF. Using Meandre and JFLuidDB I wrote a few components that allow the simple creation of such an endpoint, as illustrated by the picture below.

Meandre FluidDB RDF endpoint

The basic mechanism is simple. Just push the query into the Query for objects component. This component streams the uuid of each matched object to Read object, which pulls the object information. The object is then passed to Object to RDF model, which generates the RDF snippet shown in the example above for each object pushed. All the RDF fragments are then reduced into a single model by the Wrapped models reducer component. The resulting RDF model gets serialized into text using the Turtle notation, and finally the serialized text is printed to the console. The equivalent code can be expressed as a ZigZag script:

#
# Imports eliminated for clarity
#

#
# Create the component aliases
#
alias  as OBJECT_TO_RDF
alias  as PRINT_OBJECT
alias  as QUERY_FOR_OBJECTS
alias  as READS_THE_REQUESTED_OBJECT
alias  as WRAPPED_MODELS_REDUCER
alias  as MODEL_TO_RDF_TEXT
alias  as PUSH_STRING

#
# Create the component instances
#
push_query_string = PUSH_STRING()
wrapped_models_reducer = WRAPPED_MODELS_REDUCER()
query_for_objects = QUERY_FOR_OBJECTS()
reads_object = READS_THE_REQUESTED_OBJECT()
model_to_rdf_text = MODEL_TO_RDF_TEXT()
print_rdf_text = PRINT_OBJECT()
object_to_rdf_model = OBJECT_TO_RDF()

#
# Set component properties
#
push_query_string.message = "has fluiddb/tags/path"
query_for_objects.fluiddb_url = "http://sandbox.fluidinfo.com"
reads_object.fluiddb_url = "http://sandbox.fluidinfo.com"
model_to_rdf_text.rdf_dialect = "TTL"

#
# Create the flow by connecting the components
#
@query_for_objects_outputs = query_for_objects()
@model_to_rdf_text_outputs = model_to_rdf_text()
@push_query_string_outputs = push_query_string()
@object_to_rdf_model_outputs = object_to_rdf_model()
@reads_object_outputs = reads_object()
@wrapped_models_reducer_outputs = wrapped_models_reducer()

query_for_objects(text: push_query_string_outputs.text)
model_to_rdf_text(model: wrapped_models_reducer_outputs.model)
object_to_rdf_model(object: reads_object_outputs.object)
reads_object(uuid: query_for_objects_outputs.uuid)[+200!]
print_rdf_text(object: model_to_rdf_text_outputs.text)
wrapped_models_reducer(model: object_to_rdf_model_outputs.model)

The only interesting element in the script is the [+200!] entry, which creates 200 parallel copies of the Read object component that concurrently hit FluidDB to pull the data, trying to minimize the latency. The script can be compiled into a MAU and run. The output of the execution looks like the following:
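For readers who do not know ZigZag, the effect of running many copies of one component at once is roughly what a thread pool gives you in other languages. The sketch below is only an analogy in Python, not how Meandre implements [+200!]. The read_object function is a stand-in that sleeps instead of calling FluidDB, and the 0.05 s latency is an arbitrary value chosen for the illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_object(object_id, latency_s=0.05):
    """Stand-in for one FluidDB round trip; sleeps instead of calling the server."""
    time.sleep(latency_s)
    return {"id": object_id, "tags": ["fluiddb/about"]}


def read_all(object_ids, workers=200):
    """Fetch every object with up to `workers` concurrent calls, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_object, object_ids))


ids = [f"object-{n}" for n in range(20)]
start = time.perf_counter()
objects = read_all(ids)
elapsed = time.perf_counter() - start
print(f"read {len(objects)} objects in {elapsed:.2f}s")
```

Because the simulated calls overlap, the total time should stay close to a single round trip instead of growing with the number of objects. Concurrency on the client only hides latency, though; it does not reduce the number of calls the server has to answer.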

$ java -jar zzre-1.4.7.jar pull-test.mau 
Meandre MAU Executor [1.0.1vcli/1.4.7]
All rights reserved by DITA, NCSA, UofI (2007-2009)
THIS SOFTWARE IS PROVIDED UNDER University of Illinois/NCSA OPEN SOURCE LICENSE.
 
Executing MAU file pull-test.mau
Creating temp dir pull-test.mau.run
Creating temp dir pull-test.mau.public_resources
 
Preparing flow: meandre://seasr.org/zigzag/1253813636945/4416962494019783033/flow/pull-test-mau/
2009-09-24 12:34:38.480::INFO:  jetty-6.1.x
2009-09-24 12:34:38.495::INFO:  Started SocketConnector@0.0.0.0:1715
Preparation completed correctly
 
Execution started at: 2009-09-24T12:34:38
----------------------------------------------------------------------------
<http://sandbox.fluidinfo.com/objects/a24b4a18-5483-47c6-9b62-0955210c7ebd>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253772095.82845-0.944567286499904"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253622685.3231461-0.437099602163897316"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/67e52346-527e-4bb7-b8f3-05fa8a8ae35b>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253620190.69175-0.861614257420541"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/8a65a184-03d9-4881-95df-02fa0561a86f>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute fluiddb/namespaces/permission/update/exceptions"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/335b44e9-a72f-479d-ad60-3661a35231ba>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253776141.95577-0.284175700598524"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/3bbf1cc6-731c-4e56-a664-adeb5484334f>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute fluiddb/namespaces/permission/delete/policy"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/aba5adcf-fd44-40ab-b702-9cc635650bc3>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253614713.757-0.604769721717702"^^<http://www.w3.org/2001/XMLSchema#string> .
 
<http://sandbox.fluidinfo.com/objects/f61ceb3b-33df-4356-8e7d-c56d3d0ae338>
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
              <http://sandbox.fluidinfo.com/objects/> , <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
              <http://sandbox.fluidinfo.com/tags/fluiddb/about> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/path> ;
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3>
              <http://sandbox.fluidinfo.com/tags/fluiddb/tags/description> ;
      <http://purl.org/dc/elements/1.1/description>
              "Object for the attribute test/Net::FluidDB-name-1253615887.80879-0.0437609496034099"^^<http://www.w3.org/2001/XMLSchema#string> .
 
...

That’s it! A first RDF dump of the query!

The not so great news

The current FluidDB API does not provide any method to pull data from more than one object at once. That basically means that a call to the server needs to be processed for each uuid, which is a huge latency overhead. The FluidDB guys know about it and they are scratching their heads on how to provide a “multi get”. A full trace of the output can be found in this FluidDB RDF endpoint trace.

Such a call is crucial for any RDF endpoint. Above I left out one basic element of the output, the time measurements. That part looks like:

Flow execution statistics

Flow unique execution ID : meandre://seasr.org/zigzag/1253813636945/4416962494019783033/flow/pull-test-mau/8D8E354A/1253813678323/1493255769/
Flow state               : ended
Started at               : Thu Sep 24 12:34:38 CDT 2009
Last update              : Thu Sep 24 12:37:28 CDT 2009
Total run time (ms)      : 170144

Basically 170s to pull only 238 objects, where all the time is spent round tripping to FluidDB.
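As a quick back-of-the-envelope check, the two figures above (170144 ms and 238 objects) give the average cost per object. The snippet only does that arithmetic. Note that the result is wall-clock time amortized over all objects, not the duration of a single round trip, since the reads were issued concurrently.

```python
total_ms = 170144   # "Total run time (ms)" from the statistics above
objects = 238       # number of objects pulled

per_object_ms = total_ms / objects
objects_per_second = objects / (total_ms / 1000)

print(f"{per_object_ms:.0f} ms per object")             # 715 ms per object
print(f"{objects_per_second:.2f} objects per second")   # 1.40 objects per second
```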

Getting there

This basically means that such high latency would not allow efficient interactive usage of the endpoint. However, this exercise was useful to prove that simple RDF endpoints for FluidDB are possible and would greatly boost the flexibility of interaction with FluidDB. The current form of the endpoint may still have value if you are not in a hurry, allowing you to run SPARQL queries against FluidDB data and get the best of both worlds.

The code used

If you are interested in running the code, you will need Meandre and the components I put together for the experiment, which you can get from http://github.com/xllora/liquid.
