Differences in Syntax Searches and Morphology Searches

Rubén Gómez, in his Bible Software Review Weblog, gives us an example of Graphical Searches in different software applications.

He uses H. Van Dyke Parunak’s article on “Computers and Biblical Studies” in Anchor Yale Bible Dictionary as a basis. The article (Vol 1 p. 1118) says:

Particularly powerful patterns are possible in a language that allows one to ask (for example) for all verbs that occur within three words of the phrase “in Christ,” without intervening verbs. A high proportion of the targets matching such a pattern will be clauses in which the prepositional phrase in fact modifies the verb.
Freedman, D. N. (1996, c2008). The Anchor Yale Bible Dictionary (1:1118). New Haven, CT: Yale.

The Anchor Yale Bible Dictionary (ABD) was published in 1992. At that time, Parunak’s underlying target result — clauses in which the prepositional phrase translated “in Christ” modifies the verb of the clause (or, better stated, locating references to the kinds of action done “in Christ”) — could only be approximated using morphological searching criteria: “for all verbs that occur within three words of the phrase ‘in Christ,’ without intervening verbs”.

But what Parunak’s target result really demands is a search that is sensitive to syntax, not just morphology and word proximity. What about when more than three words occur between the verb and the preposition? What if the prepositional phrase isn’t contiguous?
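To make the limitation concrete, here is a minimal Python sketch of the morphology-plus-proximity approach Parunak describes. It is not how any particular Bible software implements such a search. The `(form, part of speech)` token format, the function name, and the definition of "within three words" (at most three positions from the nearest edge of the phrase) are all assumptions made for illustration.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (surface form, part of speech); a hypothetical minimal tagging

EN = "ἐν"
CHRISTO = "Χριστῷ"


def verbs_near_phrase(tokens: List[Token], phrase: List[str], window: int = 3) -> List[int]:
    """Return indexes of verbs within `window` words of `phrase`, with no verb in between."""
    forms = [form for form, _ in tokens]
    n = len(phrase)
    hits = []
    for start in range(len(tokens) - n + 1):
        if forms[start:start + n] != phrase:
            continue  # the phrase must be contiguous, one limit of this approach
        end = start + n - 1
        # Scan left from the phrase start, then right from the phrase end.
        for step, edge in ((-1, start), (1, end)):
            for dist in range(1, window + 1):
                i = edge + step * dist
                if not 0 <= i < len(tokens):
                    break
                if tokens[i][1] == "verb":
                    hits.append(i)
                    break  # stop at the first verb: "no intervening verbs"
    return sorted(set(hits))


# Eph 1.3 (partial): eight words separate the participle from the phrase.
eph_1_3 = [
    ("ὁ", "article"), ("εὐλογήσας", "verb"), ("ἡμᾶς", "pronoun"),
    (EN, "preposition"), ("πάσῃ", "adjective"), ("εὐλογίᾳ", "noun"),
    ("πνευματικῇ", "adjective"), (EN, "preposition"), ("τοῖς", "article"),
    ("ἐπουρανίοις", "adjective"), (EN, "preposition"), (CHRISTO, "noun"),
]
```

With the default window of three, `verbs_near_phrase(eph_1_3, [EN, CHRISTO])` finds nothing, because the verb sits well outside the window. Widening the window would catch it, but at the cost of matching verbs that have nothing to do with the phrase.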

Syntax searches in Logos Bible Software 3 have no such limitations.
(Note: this post has been updated, see the bottom Update section and, of course, comments for further thoughts on syntax and morphology)


Stating the Search With Syntax Terminology
Using Logos Bible Software 3 and the OpenText.org Syntactically Analyzed Greek New Testament, we can more precisely search for Parunak’s target result: clauses that have verbs directly modified by ἐν Χριστῷ.

First, let’s talk about terms and think about how Parunak’s target result can be translated into terms used by the OpenText.org analysis. As he notes, this involves finding verbs that are modified by a prepositional phrase.

In OpenText.org, this means we are looking for a Predicator, which is the verbal element of the clause.

This also means we’re looking for an Adjunct. Adjuncts note circumstances of the action (predicator) of the clause, and include things like prepositional phrases. Inside of the Adjunct, we need to find a Word Group that contains a Head Term that contains a Modifier that is a Specifier. The specifier is a modifier that is usually either a preposition or a definite article; it provides further specificity by modifying the head word of the word group. Thus we want a specifier that is the word ἐν and is a preposition. Additionally, we want the head word of the word group to be Χριστῷ.
So, in terms of the OpenText.org annotation, we want to find where a Predicator is modified by an Adjunct, and where that adjunct has a word group that has a head term containing ἐν Χριστῷ, where ἐν directly modifies Χριστῷ via specification.

In other words, something that looks like this:

You may notice that Parunak’s morphological formulation of his target specifies “no intervening verbs”. This restriction is not necessary from the point of view of the syntactic search because we’re finding something much more specific. We’re finding verbs that are modified by prepositional phrases. Since the syntactic relationships are the thing being annotated and queried, we’re able to be more specific. We are not bound to a loose approximation of the search target but can instead positively state the intended target: clauses with verbs modified by “in Christ”. This is why we can include the Anything attribute. Since the search is constrained to the clause, and since the search specifies the relation between clausal components, we can gloss over items between the two clausal components. We’re searching for relationships, not for word-level annotations that may or may not imply relationships.

You may also note that the search looks for hits in Predicator-Adjunct order. But notice that there is an “OR” block and the reverse order (Adjunct-Predicator) is also specified. By the way, adding the reversed set of search criteria was easily accomplished via a single copy-paste, then re-ordering the lines with the up/down arrows in the interface.
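For readers who think in code, the logic of the query can be sketched roughly as follows. This is only an illustration: the `Word`, `Component`, and `Clause` classes are a hypothetical, much-simplified data model, not the OpenText.org schema or the Logos search engine, and the Romans 8.2 clause is hand-built rather than taken from the database.

```python
from dataclasses import dataclass, field
from typing import List, Optional

EN = "ἐν"
CHRISTOS = "Χριστός"


@dataclass
class Word:
    form: str
    lemma: str
    pos: str
    relation: Optional[str] = None  # how this word modifies its head, e.g. "specifier"
    modifiers: List["Word"] = field(default_factory=list)


@dataclass
class Component:
    role: str          # "P" = predicator, "A" = adjunct (other roles omitted)
    heads: List[Word]  # head terms of the component's word groups


@dataclass
class Clause:
    ref: str
    components: List[Component]


def is_in_christ(head: Word) -> bool:
    """Head term is Χριστός, specified by the preposition ἐν."""
    return head.lemma == CHRISTOS and any(
        m.relation == "specifier" and m.pos == "preposition" and m.lemma == EN
        for m in head.modifiers
    )


def predicators_modified_in_christ(clause: Clause) -> List[str]:
    """Return the clause's predicator forms if one of its adjuncts is ἐν Χριστῷ."""
    has_adjunct = any(
        c.role == "A" and any(is_in_christ(h) for h in c.heads)
        for c in clause.components
    )
    if not has_adjunct:
        return []
    return [h.form for c in clause.components if c.role == "P" for h in c.heads]


# Hand-built, simplified stand-in for the relevant part of Romans 8.2.
romans_8_2 = Clause("Rom 8.2", [
    Component("A", [Word("Χριστῷ", CHRISTOS, "noun",
                         modifiers=[Word("ἐν", EN, "preposition", relation="specifier")])]),
    Component("P", [Word("ἠλευθέρωσέν", "ἐλευθερόω", "verb")]),
])
```

Because the check looks at the clause's components as a set, it matches Predicator-Adjunct and Adjunct-Predicator order alike and ignores whatever stands between them. That is the job done by the “OR” block and the Anything attribute in the actual query.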

Syntax Search Results
Our OpenText.org syntax search locates 45 instances in the New Testament that match the specified criteria. Here’s a glimpse at some of the hits. Note that the predicators show up in orange so you can see the different verbs used in the structure. Note also the English text (from the reverse interlinear), with the English text corresponding to the Greek syntactic structure highlighted:

In looking through the results, it’s important to note that several things are found by our syntax search that wouldn’t be found by the morphologically-defined search described in ABD. Romans 8.2 (the second hit in the above graphic) is a good example. In that case, the predicator comes after the prepositional phrase: ἐν Χριστῷ Ἰησοῦ ἠλευθέρωσέν. Below is the Syntax Visualization of the verse with the adjunct (A) and the predicator (P) highlighted.

Another example of something a morphologically-stated query would miss is Eph 1.3, where eight words intervene between the verb (a participle) and the prepositional phrase: ὁ εὐλογήσας ἡμᾶς ἐν πάσῃ εὐλογίᾳ πνευματικῇ ἐν τοῖς ἐπουρανίοις ἐν Χριστῷ.

And another example of how bounding to word-level constraints might miss things is found in 1Co 4.15. Here, the prepositional phrase has a post-positive conjunction. In addition, the verb comes after the prepositional phrase in this instance as well, and there are even six words between Χριστῷ and the predicator: ἐν γὰρ Χριστῷ Ἰησοῦ διὰ τοῦ εὐαγγελίου ἐγὼ ὑμᾶς ἐγέννησα. This breaks most assumptions of the morphologically-stated search but still qualifies as part of the intended target, clauses where the verb is modified by ἐν Χριστῷ.

Constraining Search Results Even Further
This last example brings up a further question: Does the target only include prepositional phrases of “in Christ”, or would phrases like “in Christ Jesus” or even “in the blood of Christ” also be desired? The current syntax search considers these things valid, but it doesn’t need to. We can limit to explicitly ἐν Χριστῷ if we’d like.

Updating the search to take this limit into account restricts the search hits to 14. Note that these do not include prepositional phrases of “in Christ Jesus” (or any similar constructions) but only ἐν Χριστῷ. The amended query is below.

Note that the modifier now specifies that it is the First item in the head term, the word inside the modifier specifies that it is the Only thing inside of the modifier, and the last word specifies that it is the Last thing in the head term. Also, I’ve inserted an Anything block that has a length of zero, effectively stating that the preposition and its object must be next to each other.

And here are the updated results:

Of course, this removes some of the nice exceptions we located above, but if one is only interested in the prepositional phrase ἐν Χριστῷ (and not things like ἐν Χριστῷ Ἰησοῦ) modifying a verb in its clause, then this list of fourteen items may very well meet those criteria.
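The difference between the loose and the tightened head-term criteria can be sketched in a few lines of Python. As before, this is an assumption-laden illustration and not the Logos query engine: a head term is modeled here simply as a list of `(lemma, part of speech)` pairs in text order, and the loose check ignores the specifier relationship that the real query also tests.

```python
from typing import List, Tuple

EN = "ἐν"
CHRISTOS = "Χριστός"

HeadTerm = List[Tuple[str, str]]  # (lemma, part of speech) in text order; hypothetical


def contains_in_christ(head_term: HeadTerm) -> bool:
    """Loose: the preposition ἐν and Χριστός both occur somewhere in the head term."""
    return (EN, "preposition") in head_term and any(
        lemma == CHRISTOS for lemma, _ in head_term
    )


def is_exactly_in_christ(head_term: HeadTerm) -> bool:
    """Strict: ἐν is first, Χριστός is last, and nothing stands between them."""
    return (
        len(head_term) == 2
        and head_term[0] == (EN, "preposition")
        and head_term[1][0] == CHRISTOS
    )


plain = [(EN, "preposition"), (CHRISTOS, "noun")]          # ἐν Χριστῷ
with_jesus = plain + [("Ἰησοῦς", "noun")]                  # ἐν Χριστῷ Ἰησοῦ
```

Both checks accept `plain`, but only the loose one accepts `with_jesus`. Requiring the preposition to be first, its object to be last, and nothing in between plays the role of the First, Only, and Last settings and the zero-length Anything block.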

The Benefits of Syntactic Searching
This is why we’re so excited about using syntax in searches here at Logos. It allows an exegete to search using precise syntactic criteria (e.g., clauses with verbs modified by a prepositional phrase of ἐν Χριστῷ) instead of loosely approximating syntactic criteria using word-level morphological features plus proximity.

It is good to be reminded that Parunak’s Anchor Yale Bible Dictionary article from 14 years ago pointed towards this kind of searching with this very example. It took us 14 years, but we’ve moved forward and can now start to ask even better questions of the text and get even better results.

Update
Rubén Gómez links to this article in an update to his post with some further comments about the blurred relationship between morphology and syntax as it is implemented in analyzed editions of the Hebrew Bible and Greek New Testament. He’s right, of course. That’s why all of the syntactic databases in Logos Bible Software 3 include morphological tagging at the word level.

Morphological information is very useful and will continue to be useful. We’re using that as a foundation and building additional layers of syntactic information at the clause and phrase level, resulting in a tool that can take both morphological and syntactic criteria into account when asking questions of the text.

The above example does exactly this when it searches a Clause (syntax) for a relationship between Predicator (syntax) and Adjunct (syntax), but within the adjunct specifies morphological criteria, a Preposition (morphology) with dictionary form of ἐν (morphology) which modifies via Specification (syntax) the Head Term (syntax) which has a dictionary form of Χριστός (morphology).

Thanks again to Rubén for spurring these thoughts and interacting with us. Hopefully this post has been helpful in illustrating how asking questions of the text using both syntactic and morphological criteria can be done with the new Logos Bible Software 3.

Comments

  1. Philip Gons says:

    This is truly revolutionary; this is going to take original language studies to a new level! While a definite improvement to the days of morphology and proximity in many ways, one possible limitation I see is that the user is limited by the decisions of the people who decided what is modifying what. What happens when they aren’t sure which way to go with a clause or phrase? The database is limited to one choice, right? If they cannot specify that it could legitimately be x *or* y, then there could possibly be some other examples that get ruled out because they were forced to choose between two equally valid options. So there may be some examples that even a syntax search misses (that possibly a morphology/proximity search might catch). I guess a solution to this would be (1) to develop a way for elements to be tagged two or more different ways like “probable” and “possible,” and/or (2) to have multiple syntax databases produced by different people so that the user will possibly get slightly different results and hopefully catch a few more examples in the one database that are missed in the other, and vice versa. While there can be some variety with reference to decisions about morphology, it seems that morphology is a little more objective than syntax, so this wasn’t really much of a problem before. Am I thinking correctly here?

  2. Thanks for the comments, Phil.
    With any sort of tagging there is the possibility of subjectivity. While the vast majority of annotation decisions will be made as a result of the syntactic theories one applies to the text, there are always touchy areas that require judgment and thought.
    This is one reason why Logos supports multiple syntax databases that apply different theories to the text. For the Greek New Testament, one can consult the in-development Lexham Syntactic Greek New Testament or even the Lexham Clausal Outlines of the Greek New Testament for second opinions on how something is annotated. The OpenText.org annotation isn’t the only source.
    Additionally, because we realize there may be more than one way to look at a passage, as we create the Lexham Syntactic Greek New Testament we are annotating multiple views of particular words, clauses and phrases. The current Lexham SGNT implementation in Logos 3 supports multiple annotations at the word level; we’re still considering how best to implement the data that shows multiple annotations at the clause level. As that resource develops, users will be able to consult alternate annotations in the same database.
    Hope it helps.

  3. Philip Gons says:

    Excellent! This definitely helps. I was aware of the Lexham database, but I didn’t know that it supported multiple annotations. That is great!
    1. Does the opentext material support multiple annotations? If not, are there plans to implement something like this?
    2. Is Logos open to a third or fourth syntax database in the future, or is two the limit?
    Keep up the great work!

  4. Hi Phil.
    1. You can get more information on OpenText.org at their web site. Their annotation currently supports one analysis of the text, but that analysis is at several levels (word, word group, and clause). I can’t speak as to how their future work will progress.
    2. On syntax databases, the sky is the limit. We have no innate technological restrictions that would prevent us from offering more syntax databases.
    Thanks!

  5. David Hooton says:

    The one weakness, for me, is still the actual search syntax:-
    1. The OR is at the query level rather than the term level; thus involving multiple searches for alternative conditions.
    2. Term exclusions are not catered for. I’m thinking of the Filter exclusion in Graphical Query Editor, because “Must not be Present” operates at the wrong level. Whilst I can specify NOT “the”, I cannot exclude the definite article morphology term.
    This limits the specification of the Granville Sharp Rule.
    With regard to subjectivity, the OpenText version of the Rule allows for a limitation of nouns according to LN domains. But choosing domain 12 (Supernatural Beings) excludes occurrences of “Christ”, who is relegated to domain 93 (Names of Persons and Places).

  6. Hi Dave.
    Quick responses to a few of the issues you raise:
    1. True, the ‘OR’ is at the object level and while it helps, it does make searching for alternates at all sorts of levels difficult. This is one of the challenges of syntax searching. How to ‘OR’ at different levels within syntactic structure? The current syntax searching implementation reflects where we’ve started in addressing that complication, not where we have finished. Though note, if you’re looking for words at a given point, you can select multiple words in the text, lexeme or gloss boxes in the word portion of the syntax search dialog. This is the easiest way to search for (word or word or word).
    2. Regarding morph exclusions specifically, one could search for all parts of speech but the article. Not the easiest, I realize, but it is possible, at least at that level. At present, it is best to think in positive criteria to specify a syntax query.
    3. Regarding one’s specifying criteria for hits that may or may not reflect Granville Sharp’s famous rule, I hope to blog on that in the near future, so y’all keep your eyes peeled.
    4. Regarding subjectivity of OpenText.org material, your qualm is really with Louw & Nida’s organization, not OpenText.org’s application of it. L&N treat XRISTOS sans article and with IHSOUS as a proper name. XRISTOS as a title (“Messiah”) is also in domain 53, which may be more appropriate (than domain 93) to include when looking for, say, Christological implications of Granville Sharp’s first rule. OpenText.org have annotated each instance of XRISTOS with both domains, so searching for domain 53 will find all instances.
    Thanks!

  7. David Hooton says:

    2. Specifying all but the article works well, but I would have a difficult time reconstructing that from the resulting (positive) expression!
    3. I look forward to your blog.
    4. I meant that to illustrate the subjectivity that inevitably conflicts with a specific need. I would have classified Christ in Domain 12, and might well *not* have classified lord/Lord in Domain 12!

  8. David Bradford says:

    Thanks for the very helpful article. Here’s my first question. I looked around on the Logos site for the answer but never saw a clear answer. I own the Scholar’s pack upgraded to v. 3 of the DLS engine. Do I have the syntactical database required to do what you’ve demonstrated in this article? In Greek as well as Hebrew?

  9. David, if you did the free, online update or ordered the $5 update disc, the answer is “no”. The syntax resources, reverse interlinear Bibles, and a number of other new resources are part of the paid upgrade, available from: http://www.logos.com/upgrade