Apache Jena TDB is presented as faster, more scalable, and better supported than Jena SDB, a non-native system that relies on an RDBMS. TDB is, for instance, the persistence layer of the Fuseki SPARQL server. Its architecture is built around three components: a node table, triple and quad indexes, and a prefix table. The node table stores the dictionary and follows the two-mapping approach presented in Chapter 4. In practice, the string-to-id mapping is implemented with B+trees and the id-to-string mapping with a sequential file. A large cache is dedicated to fast data retrieval during query processing. The triple and quad indexes are stored in specialized structures whose entries hold three and four node-table identifiers, respectively. B+trees are used to persist these indexes. The system supports SPARQL Update operations, which are handled as ACID transactions (at the serializable isolation level) through write-ahead logging (WAL). Write transactions are therefore first recorded in a journal and then applied to the database when resources permit. A benefit of this approach is that read transactions require no locking. Finally, Jena TDB provides a bulk loader, which does not support transactions. Its other features, such as some security aspects and several APIs, make it a solution to consider in a production setting.
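The node table and the indexes described above can be illustrated with a minimal in-memory sketch. This is not Jena's code or API: the class names NodeTable and TripleIndexes are hypothetical, a Python dict stands in for the B+tree, a list stands in for the sequential file, sorted lists stand in for the B+tree indexes, and the three orderings (SPO, POS, OSP) are an assumption chosen for illustration.

```python
import bisect


class NodeTable:
    """Toy dictionary with the two mappings: term -> id and id -> term."""

    def __init__(self):
        self._term_to_id = {}  # stand-in for the B+tree (string-to-id)
        self._id_to_term = []  # stand-in for the sequential file (id-to-string)

    def encode(self, term):
        """Return the id of a term, allocating a new id on first sight."""
        node_id = self._term_to_id.get(term)
        if node_id is None:
            node_id = len(self._id_to_term)  # id = position in the "file"
            self._id_to_term.append(term)
            self._term_to_id[term] = node_id
        return node_id

    def lookup(self, term):
        """Return the id of a known term, or None if it was never stored."""
        return self._term_to_id.get(term)

    def decode(self, node_id):
        return self._id_to_term[node_id]


class TripleIndexes:
    """Toy triple indexes: each entry holds three node-table identifiers."""

    # Assumed orderings; each maps index position -> position in (s, p, o).
    ORDERS = {"SPO": (0, 1, 2), "POS": (1, 2, 0), "OSP": (2, 0, 1)}

    def __init__(self, nodes):
        self.nodes = nodes
        self._indexes = {name: [] for name in self.ORDERS}

    def add(self, s, p, o):
        ids = tuple(self.nodes.encode(t) for t in (s, p, o))
        for name, order in self.ORDERS.items():
            key = tuple(ids[i] for i in order)
            index = self._indexes[name]
            pos = bisect.bisect_left(index, key)
            if pos == len(index) or index[pos] != key:  # skip duplicates
                index.insert(pos, key)  # keep the list sorted

    def match_predicate(self, p):
        """Find all triples with predicate p via a prefix scan on POS."""
        p_id = self.nodes.lookup(p)
        if p_id is None:
            return []
        index = self._indexes["POS"]
        pos = bisect.bisect_left(index, (p_id,))
        results = []
        while pos < len(index) and index[pos][0] == p_id:
            _, o_id, s_id = index[pos]
            results.append(tuple(self.nodes.decode(i) for i in (s_id, p_id, o_id)))
            pos += 1
        return results


store = TripleIndexes(NodeTable())
store.add("ex:alice", "ex:knows", "ex:bob")
store.add("ex:bob", "ex:knows", "ex:carol")
store.add("ex:alice", "ex:age", "42")
print(store.match_predicate("ex:knows"))
```

The point of the sketch is that the indexes never store strings: they hold only identifiers, and a query decodes them through the node table once the matching entries have been found.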
1. Olivier Curé and Guillaume Blin. RDF Database Systems. Morgan Kaufmann, 2014.