Data and scripts used for my PhD thesis in Digital Humanities, "Character Networks and Centrality", defended on 2014/09/19 and 2014/12/12 at the University of Lausanne.
The thesis can be downloaded here: http://infosciences.epfl.ch/record/203889?ln=en
The first time you use the script, you have the choice of either using the pre-compiled data or compiling it yourself.
In the first case, use main0.R as it is.
In the second case, open main0.R and un-comment all five "source" entries. This will load the initial data (the index, see below); read, sort, and turn it into a bipartite graph; project the whole to a unipartite graph; sort clusters; etc. You won't need to do this again, and can afterwards keep main0.R in its current state.
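The distinction between the two workflows can be sketched as follows; the script names below are hypothetical placeholders, since the actual "source" entries are listed in main0.R itself:

```r
# Option 1: use the pre-compiled data shipped with the repository
# (run main0.R as distributed; the five source() calls stay commented out).

# Option 2: rebuild everything from the index files by un-commenting
# the five source() entries in main0.R. The file names below are
# hypothetical placeholders, not the actual script names:
# source("read_index.R")       # read names.txt and apparitions_par_pages.txt
# source("build_bipartite.R")  # build the character-page bipartite graph
# source("project_graph.R")    # project it to a unipartite character graph
# source("sort_clusters.R")    # sort clusters
# source("postprocess.R")      # remaining preprocessing steps
```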
In our case here, the data files are names.txt, which lists all characters along with a unique ID, and apparitions_par_pages.txt, in which each line is one occurrence. Its columns are:
- The character unique ID
- The volume
- The page
- "n" if the occurrence is inside a footnote.