I honestly didn’t mean to immediately write another post that refers to another blog, but this one is just too cool.
It has long been noticed that links are not uniformly distributed in many networks, and in many cases the distribution of links follows a power law in which only a few web pages (or bloggers) get the lion’s share of the links (see, e.g., Clay Shirky, “Power Laws, Weblogs, and Inequality” [Feb. 8, 2003]).
Can the same phenomenon be observed in Biblical citations? Clearly, some verses are much more popular (e.g. John 3:16) than others, but can the power law still be seen?
This is an interesting question to ask, though the specific findings may well depend on the corpus of texts being searched. Oddly enough, some time ago we explored a feature that answers this kind of question using the resources inside Logos Bible Software. We even wrote a prototype report that does it using a “brute force” approach just to see what happened. We haven’t made it a priority to refine and speed up the report, though we may return to the concept in the future.
Stephen’s post reminded me of the prototype, so I asked Bob and he pointed me to it. Our implementation is a little different; we take three variables and then run the report. First we take a collection of resources; then we take a range of references; then we specify a pericope set.
The report searches the collection of resources for Bible references within a specified range, then “maps” the results onto pericopes. This provides results that correspond to meaningful textual units.
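To make the "mapping" step concrete, here is a minimal sketch in Python. It is not the prototype's actual code. The data layout, the function names, and the rule that a reference counts toward every pericope it overlaps are all assumptions made for illustration, and the sample references are invented.

```python
from collections import Counter

# Hypothetical pericope set for Galatians 1:
# (chapter, first verse, last verse, title).
PERICOPES = [
    (1, 1, 5, "Greeting"),
    (1, 6, 10, "No Other Gospel"),
    (1, 11, 24, "Paul Called by God"),
]


def pericopes_for(chapter, first, last):
    """Return the titles of every pericope that overlaps the verse range."""
    return [
        title
        for (ch, lo, hi, title) in PERICOPES
        if ch == chapter and first <= hi and last >= lo
    ]


def count_hits(references):
    """Tally hits per pericope for (chapter, first verse, last verse) tuples.

    Assumption: a reference that spans two pericopes counts once for each.
    """
    counts = Counter()
    for chapter, first, last in references:
        for title in pericopes_for(chapter, first, last):
            counts[title] += 1
    return counts


# Invented references, standing in for what a search of the collection finds.
refs = [(1, 1, 1), (1, 8, 12), (1, 15, 15), (1, 16, 16)]
counts = count_hits(refs)

# Print pericopes sorted by hit count, highest first.
for title, hits in counts.most_common():
    print(f"{title}: {hits} hits")
```

Grouping by pericope rather than by single verse is what lets a citation of one verse and a citation of a whole passage land in the same bucket.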
For the example below, I used a collection that consisted of the New Testament volumes of the International Critical Commentary (ICC). I specified a range of “Galatians” and also specified the ESV Pericope Set.
Here’s what the report comes up with. This is sorted by hit count. So, at least in the ICC NT, these are the popular citations of Galatians, grouped by pericope:
- Galatians 1:11-24: Paul Called by God (264 hits in 199 articles)
- Galatians 2:1-10: Paul Accepted by the Apostles (241 hits in 174 articles)
- Galatians 3:15-29: The Law and the Promise (186 hits in 142 articles)
- Galatians 5:1-15: Christ Has Set Us Free (168 hits in 131 articles)
- Galatians 2:15-21: Justified by Faith (150 hits in 125 articles)
- Galatians 5:16-26: Walk by the Spirit (146 hits in 108 articles)
- Galatians 4:8-20: Paul’s Concern for the Galatians (144 hits in 111 articles)
- Galatians 6:1-10: Bear One Another’s Burdens (126 hits in 98 articles)
- Galatians 1:1-5: Greeting (113 hits in 82 articles)
- Galatians 4:1-7: Sons and Heirs (112 hits in 84 articles)
- Galatians 6:11-18: Final Warning and Benediction (112 hits in 88 articles)
- Galatians 1:6-10: No Other Gospel (104 hits in 78 articles)
- Galatians 3:1-9: By Faith, or by Works of the Law? (94 hits in 64 articles)
- Galatians 2:11-14: Paul Opposes Peter (73 hits in 63 articles)
- Galatians 3:10-14: The Righteous Shall Live by Faith (68 hits in 51 articles)
- Galatians 4:21-31: Example of Hagar and Sarah (67 hits in 56 articles)
So, when looking across the 30 volumes of ICC that cover the New Testament, and restricting our focus to Galatians, we see that the most frequently cited portion of Galatians is 1:11-24…with 2:1-10 a pretty close second. After that, the hit count drops off pretty fast.
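For anyone who wants to poke at the power law question with these numbers, one rough check is to fit a straight line to log(hits) against log(rank). The sketch below does that with the sixteen hit counts listed above. The function name is made up for illustration, and sixteen data points from a single book are far too few to settle anything, so treat the slope as a curiosity rather than a finding.

```python
import math

# Hit counts per pericope, copied from the report above.
hits = [264, 241, 186, 168, 150, 146, 144, 126,
        113, 112, 112, 104, 94, 73, 68, 67]


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Counts must already be sorted from most to least frequent.
    A pure power law gives a straight line on these axes.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope = loglog_slope(sorted(hits, reverse=True))
print(f"log-log slope: {slope:.2f}")
```

A slope alone does not prove a power law. Plotting the points and looking at how well the line fits would be the natural next step.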
It’s worth noting a couple of differences between what we’re doing and what Stephen did.
Stephen’s search (using Google) pulled from a corpus that consists primarily of web pages, with some Word docs and PDFs included. The web corpus will tend to reflect a broader usage pattern than that found in Logos Bible Software, which is primarily copyrighted, published material produced by professional scholars and authors. For these purposes, one is not superior to the other…but different samples could be expected to produce different results.
Another difference comes to light in the comments section of Stephen’s post. As Stephen readily acknowledges, searching Google for “Gal 2:1” is a pretty blunt instrument. It fails to consider verse ranges, alternate notation schemes, or even occurrences where the author bothers to spell out all of G-a-l-a-t-i-a-n-s.
Bible references inside Logos books, on the other hand, have been encoded in such a way that Gal 2:1, Gal 2:1-10, Galatians 2.1 and even “verse 1” (given proper context) all count as hits for Galatians 2:1.
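As a toy illustration of why normalized references matter, the Python sketch below reduces a few notations to one canonical form and then asks whether a given verse falls inside it. This is not how Logos encodes references. The alias table, the regular expression, and the function names are assumptions for illustration only, and the sketch does not attempt context-dependent cases like “verse 1”.

```python
import re

# Hypothetical alias table; a real one would cover every book and abbreviation.
BOOK_ALIASES = {"gal": "Galatians", "galatians": "Galatians"}

# Book name, chapter, ":" or "." separator, verse, optional "-last verse".
REF_PATTERN = re.compile(
    r"\b(?P<book>[A-Za-z]+)\.?\s+(?P<chapter>\d+)[:.](?P<first>\d+)"
    r"(?:\s*[-–]\s*(?P<last>\d+))?"
)


def parse_reference(text):
    """Return (book, chapter, first verse, last verse), or None if not found."""
    for match in REF_PATTERN.finditer(text):
        book = BOOK_ALIASES.get(match.group("book").lower())
        if book is None:
            continue  # Looks like a reference, but the book is not in the table.
        first = int(match.group("first"))
        last = int(match.group("last") or first)
        return (book, int(match.group("chapter")), first, last)
    return None


def cites(text, book, chapter, verse):
    """True if the reference in text covers the given verse."""
    ref = parse_reference(text)
    if ref is None:
        return False
    ref_book, ref_chapter, first, last = ref
    return ref_book == book and ref_chapter == chapter and first <= verse <= last


for sample in ["Gal 2:1", "Gal 2:1-10", "Galatians 2.1", "Gal 2:5-10"]:
    print(sample, "->", cites(sample, "Galatians", 2, 1))
```

A plain string search for “Gal 2:1” would catch only the first two samples, and the second only by accident of notation.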
Corpus studies have their own literature and science. Perhaps someday we’ll introduce features that allow you to run comparisons between various corpora to see how they differ. With 5,000+ books digitized, tagged and available for Logos Bible Software, this kind of thing starts to be a real possibility. But for the moment, it’s a nice diversion.