Greek Syntax, OpenText, and Logos Bible Software

I introduced a series of posts on upcoming Greek Syntax tools last week. This is the second post in that series (the first after the intro, so you haven’t missed anything).

We have two different data sets that will be made available. If you’re at either the ETS or SBL conferences in November, you can see them demo’d. To keep my sanity (and yours) I’ll only discuss one data set at a time.

This first series of posts will discuss the Syntactically Analyzed Greek New Testament, as implemented within Logos Bible Software.

Interested in utilizing syntax within your study of the New Testament? Read on!

The first thing to consider is that, for the most part, there are two primary components that make up what we call “grammar”: morphology and syntax.

For years, New Testament students and scholars have been able to take morphology into account when searching/concording the Greek New Testament. That is thanks to the pioneering efforts of the GRAMCORD Institute, the Packard Humanities Institute, CCAT (and James Tauber), Timothy and Barbara Friberg, and Dr. Maurice Robinson. (No omissions are intended; these are the morphologies that come to mind as I write this.)

This is foundational work, and it is necessary and useful. But it is only half of the equation. Over the years, these morphological databases have been updated to indicate or allude to things of a syntactic nature. For example, roles of conjunctions have been tagged, though a conjunction's role is entirely context dependent and not wholly deducible from its form. In morphologies, prepositions have been given notation as to case (even though prepositions don’t have a case, morphologically speaking) to allow searching for particular usage (preposition + noun). This is all good and helpful; it only proves that taking additional syntax layers into account when searching is necessary.

Morphologies have been encroaching on the area of syntax slowly and steadily. Some folks even describe the latest form of many morphological databases as “Morpho-syntactic”. They’ve begun to meld syntax into the word-level tagging.

And that’s the problem. In most cases, it all still comes down to the word. Tags are loaded onto a word. Search engines and query dialogs are expanded to allow all sorts of very cool (and very useful) things at the word level.

But it’s still all at the word. There are no easy ways to reliably bound a search to a clause. Searches can be bound to a sentence, given a punctuated edition of a Greek New Testament. They can be constrained to within N words of another instance of something, but even that can be a crude measure, leading to false positives or missed hits depending on the value used for N.
We can’t say things like “I want to find where [noun] is in the subject of the clause”.
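
To make the N-words problem concrete, here is a minimal Python sketch. It is not Logos code: the token list, the clause boundaries, and both function names are invented for illustration. It contrasts a proximity search with a search bounded by clause spans, assuming those spans were available.

```python
# Hypothetical data: a flat token list plus clause spans given as
# (start, end) index pairs, end exclusive. Both are invented for illustration.
tokens = ["a", "b", "c", "d", "e", "f", "g", "h"]
clauses = [(0, 3), (3, 8)]  # clause 1 = a b c, clause 2 = d e f g h


def within_n_words(tokens, first, second, n):
    """Return (i, j) index pairs where `second` occurs within n words of `first`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        lo, hi = max(0, i - n), min(len(tokens), i + n + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] == second:
                hits.append((i, j))
    return hits


def within_same_clause(tokens, clauses, first, second):
    """Return (i, j) index pairs where both terms fall inside the same clause span."""
    hits = []
    for start, end in clauses:
        for i in range(start, end):
            if tokens[i] != first:
                continue
            for j in range(start, end):
                if j != i and tokens[j] == second:
                    hits.append((i, j))
    return hits
```

With this toy data, "c" and "d" sit next to each other but in different clauses, so the proximity search with N = 1 reports a false positive that the clause-bounded search does not. Going the other way, "d" and "h" share a clause but are four words apart, so a proximity search with N = 2 misses them.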

Even more, we can’t start to take discourse structures into account. We can’t search for Verb-Subject-Object order in a clause easily or with any real precision, and even less so if we want to locate particular terms within those syntactic structures.

Or at least, we couldn’t. But the folks at OpenText started working on this problem a few years back, and now they’re through the New Testament. And we’re working on implementing their work in Logos Bible Software.

The OpenText material consists of three levels of tagging. These are:

  • Word Level: This is reflective of what is in standard morphologies. It includes form-based morphological tagging and lexical forms for dictionary/lexicon lookup. In a novel twist, the OpenText folks have also included potential semantic domains (from Louw & Nida) tagged at the word level.
  • Word Group Level: A word group is a group of one or more words. Frequently, word groups consist of only one word. Word groups can be like phrases; they are units of meaning consisting of one or more words.
  • Clause Level: In the clause model, clauses contain clause components. Clause components may contain embedded clauses or word groups.
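
One way to picture these three levels is as nested data types. The Python sketch below is only an illustration under my own assumptions: the class names, the field names, and the `words_in` helper are hypothetical and do not reflect the actual annotation format or anything inside Logos.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Word:
    surface: str                      # the word as it appears in the text
    lemma: str = ""                   # lexical form for dictionary/lexicon lookup
    morphology: str = ""              # form-based morphological tag
    domains: List[str] = field(default_factory=list)  # potential semantic domains


@dataclass
class WordGroup:
    words: List[Word]                 # one or more words


@dataclass
class ClauseComponent:
    # A component may contain word groups or embedded clauses.
    children: List[Union["WordGroup", "Clause"]]


@dataclass
class Clause:
    components: List[ClauseComponent]


def words_in(node) -> List[str]:
    """Collect surface forms under any node, in document order."""
    if isinstance(node, Word):
        return [node.surface]
    if isinstance(node, WordGroup):
        kids = node.words
    elif isinstance(node, ClauseComponent):
        kids = node.children
    elif isinstance(node, Clause):
        kids = node.components
    else:
        raise TypeError(f"unexpected node type: {type(node).__name__}")
    out: List[str] = []
    for kid in kids:
        out.extend(words_in(kid))
    return out
```

The point of the nesting is that a query can ask for everything inside a clause or a component without counting words, and an embedded clause comes along for free because `words_in` recurses into it.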

So if we have a sentence like:

Word word word word word word word word word.

It could be represented in the OpenText material (using brackets to delimit the clause and clause components, and curly braces for the word groups) like this:

[[{Word} {word word}] [{word word word} {word}] [{word word}]]
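
If you want to play with this bracket notation, here is a small Python sketch that reads it into nested lists. This is just a toy for the shorthand used in this post, not a reader for the real annotation files, and the function name `parse_brackets` is my own invention. Each bracketed unit becomes a list, and each word group becomes a tuple of words.

```python
import re


def parse_brackets(text):
    """Parse the shorthand into nested lists.

    Each [...] becomes a list; each {...} becomes a tuple of words.
    Returns the list of top-level bracketed units.
    """
    # Split into the four delimiters and runs of non-delimiter, non-space text.
    tokens = re.findall(r"[\[\]{}]|[^\s\[\]{}]+", text)
    stack = [[]]      # stack[0] collects the top-level units
    group = None      # words of the word group currently open, if any
    for tok in tokens:
        if tok == "[":
            stack.append([])
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            done = stack.pop()
            stack[-1].append(done)
        elif tok == "{":
            if group is not None:
                raise ValueError("word groups cannot nest in this shorthand")
            group = []
        elif tok == "}":
            if group is None:
                raise ValueError("unbalanced '}'")
            stack[-1].append(tuple(group))
            group = None
        else:
            if group is None:
                raise ValueError("word outside a word group: " + tok)
            group.append(tok)
    if len(stack) != 1 or group is not None:
        raise ValueError("unclosed bracket or brace")
    return stack[0]


example = "[[{Word} {word word}] [{word word word} {word}] [{word word}]]"
clause = parse_brackets(example)[0]   # the single outer clause
```

Here `clause` holds three clause components, and the word groups inside them add up to the nine words of the sample sentence.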

As you can see, taking syntax into consideration moves us above the word level (OpenText’s Base annotation) into structures that consist of groups of words that are in relationship to each other. This is where we are headed with syntactic databases: additional layers of data are encoded on top of the word level, namely word groups, clause components and clauses. This allows groups of words to be dealt with as whole units, while still allowing individual words to play roles within those units.

In my next post I’ll talk specifically about word groups in the OpenText annotation and some of the heretofore unavailable functionality these little groupings allow. Here’s a teaser: One immediately recognizable application is the concept of finer search boundaries.


  1. Dave Phillips says

    I can’t wait for this to be released! Since you are working with OpenText, and are including potential Semantic Domain tags, is there any chance that you’ll build a function to generate semantic domain maps? I would LOVE to see this ability in Logos!