And, Or, Not: Adventures in Boolean Searching

Colloquium: Thursday, March 5, 14:00 to 16:00 in room A 2.35 (inside the CBS Library at Solbjerg Plads)

This Thursday we’re going both “old school” and “hard-core” in the Craft Colloquium. We’ll be talking about Boolean searches of the Library’s databases, i.e., searches that partition the articles that are indexed into distinct, if often overlapping, “sets” of search results, and then connect them with logical operators like “and”, “or” and “not”.

Consider an imaginary database that indexes every article in the management literature. Each entry will identify the title, author and journal (including date and issue number), and will provide abstract and keywords. Many databases today will also include the reference list of the article and will, additionally, be able to identify all the articles on the reference lists of which this particular article appears. These are all “facts” about the article. And we can group the articles into sets according to those facts.

Now, there will be millions of articles, thousands of authors and perhaps hundreds of journals. There will be a set of all articles published in the Administrative Science Quarterly and another set of articles published in the Academy of Management Journal. Probably many thousands in each journal. Since an article is only published in one place (republications get a separate entry in the databases) the sets are completely distinct from each other. There is no overlap. In Boolean terms, this means that, while there are thousands of articles that are published in ASQ or AMJ, there are exactly zero articles that are published in ASQ and AMJ.

Unlike titles, keywords and authors are not unique to articles. This means that the set of articles that are about finance and organizations is not necessarily empty. Nor is the set of articles that are written by Jones and Smith (since, in addition to the papers they’ve written separately, they may have written some together). But the set of articles that are about finance or organizations and the set of articles that are written by Jones or Smith will often be much larger. (There will be rare cases of author “teams” that never publish separately.)
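For those who like to see the logic spelled out, here is a minimal sketch in Python, using its built-in sets. The four articles, their ids and the helper functions `in_journal`, `by_author` and `about` are all invented for illustration; a real database will have its own search syntax.

```python
# Hypothetical mini-index: article id -> "facts" about the article (invented data).
articles = {
    1: {"journal": "ASQ", "authors": {"Smith"}, "keywords": {"finance"}},
    2: {"journal": "AMJ", "authors": {"Jones"}, "keywords": {"organizations"}},
    3: {"journal": "AMJ", "authors": {"Smith", "Jones"},
        "keywords": {"finance", "organizations"}},
    4: {"journal": "ASQ", "authors": {"Jones"}, "keywords": {"organizations"}},
}

def in_journal(name):
    """Set of ids of articles published in the given journal."""
    return {i for i, a in articles.items() if a["journal"] == name}

def by_author(name):
    """Set of ids of articles with the given author."""
    return {i for i, a in articles.items() if name in a["authors"]}

def about(keyword):
    """Set of ids of articles tagged with the given keyword."""
    return {i for i, a in articles.items() if keyword in a["keywords"]}

# "or" is set union (|), "and" is set intersection (&).
asq_or_amj = in_journal("ASQ") | in_journal("AMJ")    # every article
asq_and_amj = in_journal("ASQ") & in_journal("AMJ")   # empty: one journal each

smith_and_jones = by_author("Smith") & by_author("Jones")  # co-authored only
smith_or_jones = by_author("Smith") | by_author("Jones")   # either author

finance_and_orgs = about("finance") & about("organizations")
finance_or_orgs = about("finance") | about("organizations")

print(asq_and_amj, smith_and_jones, smith_or_jones)
```

In this toy index the journal sets never overlap, while the author and keyword sets do, which is exactly why “and” returns something for Smith and Jones but nothing for ASQ and AMJ.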

The ramifications of this basic logical approach to searching are, of course, endless. And Liv will help us work through the possibilities in our session. What is the set of all articles that are written by Smith and published in AMJ? Notice the difference between this and the completely unwieldy set of articles that are written by Smith or published in AMJ. Or, what is the set of articles that are written by Smith or written by Jones and published in the AMJ? Here we have to pause. Consider the difference between the set of articles that are written by Smith (and published anywhere) or written by Jones and published in the AMJ, and the set of articles that are written by either Smith or Jones and published in the AMJ. Here Boolean notation will often use brackets to ensure that the right result is produced.
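The effect of the brackets can be sketched with a few Python sets. The article ids below are made up (1 is Smith in ASQ, 2 is Jones in AMJ, 3 is Smith and Jones together in AMJ, 4 is Jones in ASQ, 5 is Smith in AMJ). Note that Python happens to evaluate `&` before `|`; this is a fact about Python, not about any particular library database, so in a real search it is safer to write the brackets yourself.

```python
# Hypothetical result sets (article ids are invented for illustration).
smith = {1, 3, 5}   # articles with Smith as an author
jones = {2, 3, 4}   # articles with Jones as an author
amj = {2, 3, 5}     # articles published in AMJ

# Smith and AMJ: a narrow search.
smith_and_amj = smith & amj

# Smith or AMJ: the unwieldy one, everything by Smith plus everything in AMJ.
smith_or_amj = smith | amj

# Reading 1: Smith (published anywhere) or (Jones and AMJ).
reading_1 = smith | (jones & amj)

# Reading 2: (Smith or Jones) and AMJ.
reading_2 = (smith | jones) & amj

# For contrast: co-authored by Smith and Jones, and published in AMJ.
coauthored_in_amj = (smith & jones) & amj

# Without brackets, Python applies & first, so this matches reading 1.
unbracketed = smith | jones & amj

print(sorted(reading_1), sorted(reading_2), sorted(coauthored_in_amj))
```

The same three search terms give three different result sets here, depending only on how they are grouped.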

The basic logic is simple. But its application can be very complicated. If you want to try it, please come on Thursday. As always, bring examples from your own research that we can work with. That will ensure maximum relevance.

We’ve also decided on what our themes will be up to Easter. On March 12, we’ll talk about the problem of language, i.e., the status of English in scholarship and the difficulties associated with alternatives. On March 19, we’ll open the floor for whatever issues people want to talk about, either because we’ve left some hanging from a previous session, or because we haven’t gotten to them yet. This will also be a kind of brainstorming session for our topics for after Easter. We’d like to have a full calendar of topics for April and May.
