Some notes on James C. Scott: Against the Grain

James C. Scott is one of our most prominent anthropologists.  He has had a long career researching the life-ways of peasant culture and the various methods that traditional farmers employ to resist the encroachments of the state.  His earliest field-work (1970's) was with rice cultivators in Burma and in Viet-Nam.  Now in his eighties, he has produced a book, Against the Grain, which amounts to a distillation of his entire scholarly experience.[1]

In sum, this book is an examination of the formation of the earliest states in the Fertile Crescent around Uruk.  He attempts to find similarities between these proto-states and other state societies.  He believes he has found certain consistent factors in state formation, the primary one being the presence and cultivation of grain.  He clearly feels that grain is the culprit behind what he sees as the coercive tax-gathering state.  Remove cereals, restore a broader ecological range of food alternatives and the state must fall.  This is his thesis: cultivation choices are the major factor in determining political forms.

It seems, therefore, that we could look to Scott for a description of how states come into being and, indeed, he is always on the edge of providing such an explanation.
But he never actually does.

If we examine the way in which he uses the word ‘state’ we see the problem.

‘…virtually everywhere, it seems, [the] early state battens itself onto this new source of sustenance.’[2] 

The source of sustenance in question is that clutch of concentrated cereal-oriented Mesopotamian riverine settlements that have formed on rich alluvial land.

‘The state form colonizes this nucleus as its productive base, scales it up, intensifies it, …’[3]  

The ‘nucleus’ here is, again, the settlements already alluded to.

Do you see?  The state ‘battens itself’ onto pre-existing agricultural settlements.  Or the state ‘colonizes’ an agro-nucleus.  The state ‘scales it up’, ‘intensifies it’.  Scott really should be more forthcoming with his pronouns.  In his mind the state is an external actor.  Like a ravening lion it seizes on these innocent settlements as though they were so many sheep.  But we are not told what this actor is or where it comes from.  They are elites.  That’s all he knows.

The reason for this curious omission is that Dr. Scott doesn’t really care how the state came to be.  He is an anarchist.  He opposes the State.  For him the State is literally a bunch of gangsters[4] who, given the right opportunity, will convert settlements into states that they might more easily rob them.  Against the Grain is not an anthropological but a moral case against those early state-forming elites who subjected mankind to misery in perpetuity in order that they could batten and grow rich.

But who were these elites?  How did they come to gain control over free peasants and convert them into slaves of the grain?  Why were they able to maintain their power?  And what happened on those frequent occasions when the elites lost their power and the State disintegrated?  Why did all this happen?

We are not told.

We are given the after-picture.  Scott suggests that there are characteristics from which, when we find them – or most of them – together, we may infer the presence of a state.  Scott explains here:

“…, ‘stateness’, …, is an institutional continuum, less an either/or proposition than a judgment of more or less.  A polity with a king, specialized administrative staff, social hierarchy, a monumental center, city walls, and tax collection and distribution is certainly a ‘state’ in the strong sense of the term.  Such states come into existence …”[5]

All these things might very well be found in conjunction with states and much ink has been spilt debating which of these is ‘essential’ for inferring the existence of a state.  

This only answers a ‘what’ question.

To the question of ‘why did the state originate as a model for human association?’ there is no answer.   For Scott it’s all just a criminal conspiracy. 

As I mentioned above, the most important requirement for state formation, according to Scott, is the presence and cultivation of cereals: wheat, barley, oats, rice, rye.  These are the culprits.  As he says

‘…only the cereal grains can serve as a basis for taxation: visible, divisible, assessable, storable, transportable, and “rationable.”’[6]

and ...

‘The fact that cereal grains grow above ground and ripen at roughly the same time makes the job of any would-be taxman that much easier.  If the army or the tax officials arrive at the right time, they can cut, thresh, and confiscate the entire harvest in one operation.  For a hostile army, cereal grains make a scorched-earth policy that much simpler; they can burn the harvest-ready grain fields and reduce the cultivators to flight or starvation.  Better yet, a tax collector or enemy can simply wait until the crop has been threshed and stored and confiscate the entire contents of the granary.’[7]

And even though many societies grow tubers – potatoes (Solanum tuberosum), taro (many varieties but basically Colocasia esculenta), sweet potato (Ipomoea batatas), yams (Dioscorea spp.) – such crops cannot be the foundation of state societies because they are not suitable for tax collectors.  Scott gives these reasons:

“Such crops ripen in a year but may be safely left in the ground for an additional year or two.  They can be dug up as needed and the remainder stored where they grew, underground.  If an army or tax collectors want your tubers,  they will have to dig them up tuber by tuber, as the farmer does, and then they will have a cartload of potatoes which is far less valuable (either calorically or at the market) than a cartload of wheat, and is also more likely to spoil quickly.”[8]

And this: "History records no cassava states, no sago, yam, taro, plantain, breadfruit, or sweet potato states."[9]

And: “It follows, I think, that state formation becomes possible only when there are few alternatives to a diet dominated by domestic grains.”[10]

So Scott’s thesis is that the natural characteristics of the several tubers make them unsuitable for the tax collector and that, as a result, tubers cannot be the basis for a state.  He claims priority for this thesis.[11]

Scott’s thesis is interesting, deeply researched, it appears to explain a great deal …

and it is stone-cold false. 

Are there state societies (or even hierarchical societies) based on tuber crops? 

Of course there are.

Scott’s inability to see this follows directly from his lack of interest in explaining the rise of state societies in the first place.    Instead of an explanation involving people's decisions and how they adapt their associative practices to specific environments, he substitutes a crude deterministic theory about crops and their putative socio-political effects.[12]  He trims off the troubling anomalies and … presto-change-o … we have a new theory of states.  In order to support his deterministic theory of state formation he tries to fit the crop to the tax-collector and is forced to make ludicrous assumptions about how tax collectors really go about their work. 

But tax-collectors don’t work the way Scott thinks they do:

‘Hawaiian scholars David Malo and Samuel Kamakau described the system of annual Makahiki tax collection, when an image of Lono was carried in a clock-wise procession around the island.  As the Lono image entered each ahupua’a territory, two poles were set in the ground; the distance between the poles was close for a small ahupua’a, farther apart for a large territory.  The gap had to be filled, from pole to pole, with baskets of sweet potato, calabashes of poi, pigs, dogs, bark-cloth, fishnets, fine mats, and other offerings.  If the konohiki or ahupua’a manager failed to make sure that his people filled the space between the upright sticks, then the chiefs attending the Lono image would order the territory to be plundered.  As Kamakau drily remarked, “Only when the keepers [of the image]  were satisfied with the tribute given did they stop this plundering.”' [13]

The first thing that strikes us when we compare Scott’s simplified ideas about tax collection and Kamakau’s description of the Hawaiian system is how limited Scott’s conception is.  It's not just that tubers were the basis of Hawaiian taxation.  The Hawaiian system levies contributions from the entire ecosystem; not just products of the ground but hand-manufactures as well as products from the forest and ocean.  And there is another division in the Hawaiian system which is invisible in Scott.  The Hawaiians levy not only necessities but also items of high status: feathers from the ‘o’o and mamo birds as well as fine handicrafts in the form of malos, woven mats, skirts, and a wide variety of tapa cloths.

This is what taxation looks like in an actual society, not in the abstract state of Scott’s imaginings.

In the light of this his image of tax collectors digging up their own tubers looks strange and perverse.  Why would Scott ever have thought such a thing?

And what about Scott’s idea that tubers cannot be the foundation of a state because they cannot be stored?  Taro, yams, and sweet potatoes were all subject to appropriation during the Makahiki.   Taro (Colocasia esculenta) was the primary food crop in ancient Hawaii.  The plant (either wet-land or dry-land) produces a tuber (a corm) after 9 months to 2 years of cultivation. When the large corm is harvested the tough rind is scraped off and the rest is baked in a Hawaiian oven (imu).  After cooking the meat is ground and mashed (with a minimum of water) into a thick paste that the Hawaiians called pai’ai.   To prepare it for eating a portion of pai’ai would be mixed with water to a desired consistency.  This poi was then consumed.  Pai'ai could be stored in calabashes, wrapped in ti (Cordyline fruticosa) leaves, or placed in baskets.  It is in the form of pai’ai, which keeps indefinitely, that taro would have been presented to the royal tax-gatherers.[14]

So Scott’s idea that tubers cannot form the basis for taxation or a state is simply false.

Nor can Scott and his supporters claim that ancient Hawaii was not a state.  In fact Hawaii, though ethnographers often overlook it, is of extreme interest.  It was one of only five or six places in the world where, with no models to follow, a true monarchy developed.[15]  More than that, unlike other primary kingships, Hawaii’s system survived until contact by modern explorers in the late eighteenth century.  And its customs, moeurs, and folk-ways could still be researched and voluminously recorded in the early nineteenth century by young western-educated Hawaiian students.  It’s no exaggeration to say that we learn more about the process of earliest state formation from Hawaii than from any other culture complex.

A shame that this did not come to Scott’s attention.


Pleiades Data - Does Crowd Sourcing for Toponymy Actually Work?

In a previous post I discussed the magnitude of the errors we could expect in a positioning system limited to two-place decimal fractions.  I suggested that the average error that we could expect in such a system at latitudes approximately 35° N would be about 400 m.  I thought that this imprecision would bar such a system from serious work in toponymy and I asked what product would use a system so limited in mathematical range and so obviously unsuitable for the purpose.

The Pleiades Project is an attempt to adapt crowd-sourcing to the field of toponymy.  They have received generous financial support from the National Endowment for the Humanities to the amount of $1,140,780.  You can read about their grants here, here, here, and here.

What has the world of scholarship received for all this money?

Recently I was checking the positions for a random sample of Classical and Hellenistic locations in Greece.  The locations were derived from Pleiades.  This list was not selected by me but by a colleague.  Out of 45 locations 17 (38%) had significant errors.

The smallest error was 357 m. and the largest was 3775 m.  The average error or arithmetic mean (in the erroneous part of the sample) was 1,514.7 m.   The median error was 1200 m.   I present no standard deviation because the errors are not normally distributed. In fact it appears as though the error distribution in Pleiades data might be bimodal. This suggests that there is more than one underlying cause for the Pleiades system’s inaccuracies.   

From my error worksheet.  Y-dimension is error in m.

The 13 less erroneous locations may derive from crowd contributions.  The uppermost 4 positions  (Passaron, Skotoussa, Messene, Antigonia) may be remnants of the original digitization of the Barrington Atlas data – however that was accomplished.  In other words I am suggesting that it appears as though crowd-sourcing tends to smooth out but not eliminate the original digitizing errors.  I emphasize that these are suppositions on my part.  But, clearly, the complicated history of Pleiades' data generation has left a signature in the error results.

Here is a link to my worksheet.  Occasional references in that worksheet to sites as 'Fnnn' or 'Cnnn' may be resolved at the site

If these results are upheld by others then I would suggest that Pleiades is not an appropriate component of any scholarly work.  If a system with a two-place fractional component has an average error of nearly 400 m. then the average error of Pleiades data of more than 1500 m. suggests that Pleiades data - at some point in its generation - never had an accuracy better than 0.5 to 1.5 fractional places (10^-0.5 to 10^-1.5).
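The comparison between the two-place system and the Pleiades errors can be made concrete with a rough calculation.  What follows is only a sketch under a stated assumption: that positional error grows by a factor of ten for every decimal place lost, so that the ~388 m. mean error of a two-place system can serve as a yardstick.  The function name `implied_places` is hypothetical, introduced purely for illustration.

```python
import math

# Yardstick from the two-place analysis: mean aliasing error near 35 N.
REFERENCE_PLACES = 2
REFERENCE_ERROR_M = 388.0


def implied_places(observed_error_m: float) -> float:
    """Decimal places implied by an observed error.

    Assumption: error scales by a factor of 10 per decimal place,
    relative to the two-place yardstick above.
    """
    return REFERENCE_PLACES - math.log10(observed_error_m / REFERENCE_ERROR_M)


# Error figures reported for the erroneous part of the sample.
mean_places = implied_places(1514.7)    # mean error
median_places = implied_places(1200.0)  # median error
worst_places = implied_places(3775.0)   # largest error

print(round(mean_places, 2), round(median_places, 2), round(worst_places, 2))
```

On this assumption the mean and median errors correspond to roughly one and a half decimal places and the worst error to about one place, which falls inside the 0.5 to 1.5 range suggested above.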

I estimate that it takes at least 2 hours of research to reliably establish a location from Bronze Age or later sites in Greece.  I do not know how many data points Pleiades claims but if it is, for example, 10000 points then it would require an effort of about 20000 man hours to complete a reliability review for Pleiades.  At 2200 man hours in a man year that would require about 9 man years to complete.  This is an order of magnitude estimate.
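The arithmetic behind this estimate is simple enough to set down so that others can substitute their own figures.  The count of 10000 points is, as stated, only an example and not a figure claimed by Pleiades; the names below are hypothetical.

```python
HOURS_PER_SITE = 2            # estimated research time per location
ASSUMED_SITE_COUNT = 10_000   # example figure only, not a Pleiades claim
HOURS_PER_MAN_YEAR = 2_200


def review_effort(site_count: int, hours_per_site: float = HOURS_PER_SITE):
    """Return (total hours, man years) for a full reliability review."""
    total = site_count * hours_per_site
    return total, total / HOURS_PER_MAN_YEAR


total_hours, man_years = review_effort(ASSUMED_SITE_COUNT)
print(total_hours, round(man_years, 1))  # prints 20000 9.1
```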

Crowd-sourcing in toponymy studies does not appear to work.

This defective data of Pleiades casts a shadow downstream - for example in such derivative products as Pelagios/Peripleo.

If Pleiades cannot undertake a good faith reliability study it should be rejected by the scholarly community.

Fact Computing, Part 2

In a previous post I said that Pelagios Commons/Peripleo does not have its roots in the world of scholarship but in the world of computer science - specifically the ideas of Tim Berners-Lee.(1)  Now let's concentrate more specifically on what Pelagios Commons/Peripleo really does.

The first thing that must be clearly understood is that Pelagios Commons/Peripleo creates no scholarly content and has nothing whatsoever to do with any Classical scholarship.  It is strictly a computer science construct and, with a different database, would be perfectly at home in the world of migration tracking, chemical research, or anything else.

Peripleo is simply a front-end site or data aggregator of a very common type.

Pelagios Commons links large amounts of data produced by other non-related sites and entities and subsumes them under a common format.  It then exploits this umbrella format in order to write its own front-end viewing tools (Peripleo).

Its business model is exactly like that of Huffington Post and any one of hundreds of similar sites.  Through an agreement with providers it reproduces their work tout court.  They say that these unpaid contributors are members of a ‘Community’ but this ‘Community’ is nothing more than the stable of content providers who give away to Pelagios the fruits of their labors.  The most amusing statement on the Pelagios website strenuously denies this plainly obvious fact:


Pelagios provides no original scholarly content.

Pelagios exclusively displays content provided by others.

Pelagios forces their providers to reduce their own work into a Pelagios format in order for Pelagios' software to display it.

Peripleo implements numerous search options.

What else can Pelagios/Peripleo be but an aggregator/search portal?  In fact, if you go to their Peripleo splash page they clearly say 'Peripleo is a search engine ...'.  The fact that their content providers cooperate in the theft of their own labors does not change the essential nature of the arrangement (This was true for Huffington Post which disguised its essential nature until the moment it went public).  The content providers are said by Pelagios to be members of a ‘Community’.  From my many years as a professional computer scientist I can assure my readers that this type of dishonest rebranding is quite common everywhere in the online world.  The first step in any internet grift is to give it a name that expresses the opposite of what it really is.

That it is the contributors who are to do all the work is also obvious from the tools that Pelagios Commons provides:

Recogito   This is an ‘online platform for collaborative document annotation’.  But it is not the staff of Pelagios Commons that’s going to do this annotation (how could they?).  It is the contributor, the member of the ‘Community’ who creates this content.

Their Cookbook  makes it easy to see who it is who does all the work for Pelagios (hint: not the Pelagios staff themselves).  In every case the contributors are responsible for massaging all their data into a form that Pelagios can accept.   This is a cost to the contributor of many hours of uncompensated labor.  Pelagios should disguise this aspect better than they do.

The following picture should make these several relationships clearer.

I have claimed that the Pelagios Commons enterprise creates no content.  Strictly speaking that is not quite true.  In fact, Pelagios Commons has achieved the Holy Grail of academia: it is a perpetual motion machine for producing conference papers and web presentations.  If you inspect the list to which I’ve linked you will quickly see who it is who specifically benefits from the Pelagios Commons enterprise.


Casting doubt on the Pelagios enterprise is not to deny that some sort of digital structuring of the data that we have from Mediterranean societies of antiquity would be useful.  It would be useful.  But how is that goal to be attained?

The data that comes to us (or is generated by us) relative to antiquity is of the most heterogeneous forms.  Locations, building plans, daily customs, food stuffs and their hypothesized yields, clothing, trade, etc.  Everything of human interest falls within the purview of scholars of antiquity.  This is a classic data fusion problem.  Data fusion problems arise in environments where a number of sensors of different types provide data of interest that is to be presented in a uniform view.  Such problems arise in the cockpits of fighter aircraft and in very many environmental studies where, again, different sensors (or the same types of sensors with different capabilities) are used to gather data which is then to be united, combined or fused into a single point of view.

Pelagios Commons dimly recognizes that this is the real problem.  But they have performed this task backwards.  They start from the assumption that Linked Data is the solution to everything.  Upon that ideology  they built a product which is useful for no one.  That’s the essential problem.  The site really isn’t good for anything because it started ideologically.   It did not start by asking what it is that scholars of ancient societies really need in the form of digital support.

How should the social data from ancient Mediterranean societies be fused?  But, before that, what does it mean, from the digital point of view, to support such scholars?  Particularly in view of the fact that the scholars in such fields have radically differing interests.


1) Pelagios Commons here links directly to a discussion of Tim Berners-Lee's idea of Linked Data here.

Release of Database 52 to

The announcement reads as follows:

Release of DB 52. MAP_Rev_52__02_05_18. Twenty seven new sites. Various corrections and emendations. Search table files updated.

Geographic Positions with Two-Digit Fractions

I’m going to keep this simple:

Let’s pretend that the earth is flat (1); for our purposes it clarifies nothing to introduce spherical trig.  

Here you see part of a grid that marks locations.  The horizontal lines are latitude lines.  They measure distances north and south.  The vertical lines are longitude lines.  They measure distances east and west.  

Now let’s say that the locational precision in our system is two decimal places.  That means that no position on our flat earth can be described with numbers more precise than one one hundredth of a degree.  All lat/lon values must be in one of these forms:


Our system forbids anything else.  

So let’s say that the lines in our grid representation are exactly two fractional decimal places apart – a line (lat or lon) every one one hundredth (0.01) of a degree.

I’ve put in some suggested numbers for us to work with.  The first thing that you must notice is that unless a location is sitting on one of the vertices (that’s where the lines actually cross) it is NOT ACCURATELY REPRESENTABLE in our system.

Let’s look at an example.  Let us suppose that there is a place called, oh I don’t know, let’s make up some unusual name like ‘Prophitis Elias’.  

Now PE is a disadvantaged site because it happens to be located exactly halfway between all the vertices, at position 34.345 N and 22.835 E.  Our system only allows us two decimal places of representation and so the position of PE is rendered as 34.35, 22.84. (2)   That shifts the position of PE onto the vertex at that position.  This is a deliberately introduced mistake (called ‘aliasing’) that tries to keep the town of PE in the system.  But it’s still a falsification which arises out of the constraints of our system and so it’s up to users to determine how serious this aliasing is.
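The snapping of PE onto its nearest vertex can be sketched in a few lines of Python.  This is only an illustration: the function name `snap_to_grid` is hypothetical, and half-up rounding is an assumption about how a two-place system treats a value lying exactly halfway between grid lines.

```python
from decimal import Decimal, ROUND_HALF_UP

# Grid spacing of the two-place system: one one-hundredth of a degree.
GRID = Decimal("0.01")


def snap_to_grid(coordinate: str) -> Decimal:
    """Round a coordinate (given as text) to the nearest grid line."""
    return Decimal(coordinate).quantize(GRID, rounding=ROUND_HALF_UP)


pe_lat = snap_to_grid("34.345")
pe_lon = snap_to_grid("22.835")
print(pe_lat, pe_lon)  # prints 34.35 22.84
```

The coordinates are passed as text rather than as floating-point numbers because binary floating point cannot represent a value like 34.345 exactly, and a halfway case may then fall on either side of the line.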

How serious is it?  How far is PE from its true position under this kind of aliasing?

I’m going to introduce some simplifying assumptions.   I calculate that the circumference of the earth at  35.34 N is 20,291.484 miles [3] and so one one hundredth of a degree longitude at that latitude is 0.56365 miles or 2976.0851 feet.

One one hundredth of a degree in latitude is 3652.14666 feet.  Since our town of PE is sitting exactly equal distances from the vertices the error would be the hypotenuse (A) of the triangle shown here

The aliasing error for the town of PE would be 1826.0733 feet in latitude (NS) and 1488.043 feet in longitude (EW).  These numbers,  1826.1 and 1488.0, are half of the distances mentioned just above because PE is half-way from all the vertices. The actual error (A) is merely the hypotenuse of the resulting right triangle.  Solving for A by the Pythagorean theorem gives us the maximum aliasing error in this system which is 2355.592 feet or 717.984 m.   So the maximum error in this system is almost  ¾ of a kilometer.  I emphasize that this is true only for this latitude; these errors would grow smaller towards the poles and larger towards the equator.
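For anyone who wants to check the arithmetic, here is the same calculation as a short Python sketch.  It uses only the figures given above (the circumference of the parallel and the length of one one-hundredth of a degree of latitude) and keeps the flat-earth simplification; the variable names are mine.

```python
import math

FEET_PER_MILE = 5280
METRES_PER_FOOT = 0.3048

# Figures taken from the text above.
PARALLEL_CIRCUMFERENCE_MILES = 20291.484  # circumference at the chosen latitude
LAT_STEP_FT = 3652.14666                  # 0.01 degree of latitude, in feet

# 0.01 degree of longitude along that parallel, in feet.
lon_step_ft = PARALLEL_CIRCUMFERENCE_MILES / 360 / 100 * FEET_PER_MILE

# PE sits halfway between grid lines in both directions.
half_lat_ft = LAT_STEP_FT / 2
half_lon_ft = lon_step_ft / 2

# The maximum aliasing error is the hypotenuse of the two half-steps.
max_error_ft = math.hypot(half_lat_ft, half_lon_ft)
max_error_m = max_error_ft * METRES_PER_FOOT

print(round(max_error_ft, 1), round(max_error_m, 1))
```

The result agrees with the 2355.6 feet (about 718 m.) given above to within rounding.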

More important: what is the average error?  How far off will we be in the usual case?

The average error has to be less than the maximum error but how much less?  Is it half?
To answer this question I created a simulation and ran 1,000,000 trials several times.  It turns out that the error is not quite normally distributed (a little skewed to the low end) and the average (the arithmetic mean) size of the error converges on ~ 1272 feet (388 m.)
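A simulation of this kind can be sketched as follows.  This is not the original program, only a minimal reconstruction under stated assumptions: true positions are uniformly distributed within a grid cell, the cell dimensions are the ones computed above, and the error is the straight-line distance to the nearest vertex.  The function names and the fixed seed are hypothetical choices.

```python
import math
import random

METRES_PER_FOOT = 0.3048

# Cell dimensions for 0.01 degree, in feet, taken from the text above.
LAT_STEP_FT = 3652.14666
LON_STEP_FT = 2976.0851


def aliasing_error_ft(rng: random.Random) -> float:
    """Distance from a random true position to its nearest grid vertex.

    Rounding moves a point to the nearest grid line in each direction,
    so each offset is uniform between zero and half a grid step.
    """
    north_south = rng.uniform(0, LAT_STEP_FT / 2)
    east_west = rng.uniform(0, LON_STEP_FT / 2)
    return math.hypot(north_south, east_west)


def mean_error_ft(trials: int, seed: int = 0) -> float:
    """Arithmetic mean of the aliasing error over many random positions."""
    rng = random.Random(seed)
    return sum(aliasing_error_ft(rng) for _ in range(trials)) / trials


mean_ft = mean_error_ft(200_000)
print(round(mean_ft), "ft, about", round(mean_ft * METRES_PER_FOOT), "m")
```

With a couple of hundred thousand trials the mean settles close to the 1272 feet reported above; more trials only tighten the estimate.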

Here's a bar chart of 5,000,000 runs.  Each cell from L to R represents an additional 73 m. in error.

To make a long story short – In a positional system limited to two decimal places in the fraction you can expect an aliasing error of nearly 400 m.

Who would create such a limited cockamamie geographical positioning system and expect it to be used for precise work such as describing the position of, oh I don't know, say something like Bronze Age find sites?

Who indeed?


Fact Computing

In this life, we want nothing but Facts, sir; nothing but Facts!
Mr. Gradgrind in Hard Times

Recently a friend suggested to me that was a natural candidate for Pelagios Commons.

Sites of classical learning being assimilated into Pelagios Commons/Peripleo

‘Absolutely not’, I replied.

And then found myself tongue-tied because I couldn’t adequately express why not.  So that’s what this blog post is about. 

Pelagios Commons is an effort to bring together a large number of websites (themselves all concerned with various aspects of Classical learning) under a single umbrella and, in some sense, merge them into a common resource.  Pelagios is, in computer parlance, a 'front end' or 'concentrator'.  They perform no original scholarship; they take the results and research of various other sites, smash them into homogeneous factlets, and then spew them out again through their results engine which is called Peripleo.

Since they contribute nothing to scholarship itself, what is it that Pelagios Commons is really trying to accomplish?  They are trying to demonstrate that Classical Studies, in all its forms, is suitable for, and can be represented by, something called 'Open Linked Data'.
What is ‘Open Linked Data’?
Wikipedia defines ‘Linked Data’ as:
‘In computing, linked data ... is a method of publishing structured data so that it can be interlinked and become more useful through semantic queries.’[2]
(emphasis is mine)

(‘Open Data’ is merely linked data which has no copyright or other usage restrictions.)
‘Open Linked Data’ might be visualized as an enormous network of little fact nuggets.  The ‘interlinking’ mechanism is, of course, the internet but the underlying goal is given in the definition – it is to facilitate ‘Semantic Queries’ and to make the entire congeries of facts ‘more useful’.
It’s the ‘Semantic Queries’ part which gives it away – Open Linked Data is, conceptually, part of the Semantic Web of Tim Berners-Lee.  Now I’m opposed to the idea of any scholar's wasting their time on Semantic Web efforts and have blogged about it here.  I strongly urge my readers to go back and read that entire post but I reproduce the conclusion here:
“And no automation can replace scholarship.  By scholarship I mean the several activities of gathering evidence, organizing, patient collation, reflection, judgment and the expression of these activities in the form of essays, books, diagrams and, yes, even in the form of web sites or blogs.  There is no grand slam against reality; no Tower to the Heavens that we can build that will let us storm the citadel of knowledge.   We have to patiently scrape away at the matrix of the Unknown with our small intellects in order to see it more plainly.”

I oppose the Semantic Web because it is a totalizing (and trivializing) view of knowledge that is inappropriate for fields in the Social Sciences such as Classical Studies.
For example, as the proprietor of I take my field to be the locations of Bronze Age sites.  So far, there are about 2400 such sites in my database and it seems as though site locations would be prima facie appropriate for an Open Linked Data approach.  But it turns out that there is ambiguity about nearly every one of those sites and many of them - as they stand in the DB right now - are likely to be wrong.  My database is not a database of sites so much as it is a database of scholarly arguments about the location and meanings of Bronze Age sites.  In other words, a database of ambiguities, many of which can never be resolved.   And if this is true of relatively straight-forward things like lat/lon pairs how much more true must it be of other Social Science topics in Classical Studies?  Where will we find the semantic web approach to Slavery in ancient Greece and Rome?  Where will we find the Open Linked Data representation of the efflorescence in Fifth-century Athens?  Where will we find the automated web-linked explanation of the collapse at the end of LH III?  Oh, but I forget.  There are available simple fact-based answers, suitable for the Semantic Web on all these topics.  For example:
Slavery was a bad but necessary thing for Greece and Rome in an age with no petroleum, electricity, or engines.  Classical Athens experienced an efflorescence in the Fifth-century because of the indomitable Will of its people and their love of Freedom.  Mycenaean civilization collapsed at the end of LH III because some Invaders from the Sea destroyed everything.
But I exalt myself onto a plane where I do not belong.  Tolstoy was here long before me.  The famous chapter 1 in Epilogue 2 of War and Peace is perfectly ready for semantic net representation. [2]
Open Linked Data is a trivializing approach to knowledge.
Ready for your close-up, Mr. Gradgrind?
