“Betweenness Centrality” as an Indicator of the
“Interdisciplinarity” of Scientific Journals
Journal of the American Society for Information Science and Technology (forthcoming)
Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR), University of Amsterdam
Kloveniersburgwal 48, 1012 CX Amsterdam, The Netherlands
loet@leydesdorff.net; http://www.leydesdorff.net
Abstract
In addition to science citation indicators of journals like impact and immediacy, social network analysis provides a set of centrality measures like degree, betweenness, and closeness centrality. These measures are first analyzed for the entire set of 7,379 journals included in the Journal Citation Reports of the Science Citation Index and the Social Sciences Citation Index 2004, and then also in relation to local citation environments which can be considered as proxies of specialties and disciplines. Betweenness centrality is shown to be an indicator of the interdisciplinarity of journals, but only in local citation environments and after normalization because otherwise the influence of degree centrality (size) overshadows the betweenness-centrality measure. The indicator is applied to a variety of citation environments, including policy-relevant ones like biotechnology and nanotechnology. The values of the indicator remain sensitive to the delineations of the set because of the indicator’s local character. Maps showing interdisciplinarity of journals in terms of betweenness centrality can be drawn using information about journal citation environments which is available online.
Keywords: centrality, betweenness, interdisciplinarity, journal, citation, indicator
1. Introduction
Ever since Garfield (1972; Garfield & Sher, 1963) proposed impact factors as indicators for the quality of journals in evaluation practices, this measure has been heavily debated. Impact factors were designed with the purpose of making evaluation possible (e.g., Linton, 2006). Other indicators (e.g., Price’s [1970] immediacy index) were also incorporated into the Journal Citation Reports of the Science Citation Index, but were coupled less directly to library policies and science policy evaluations (Moed, 2005; Monastersky, 2005; Bensman, forthcoming).
Soon after the introduction of the Science Citation Index, it became clear that publication and citation practices are field-dependent (Price, 1970; Carpenter & Narin, 1973; Gilbert, 1977; Narin, 1976). Hirst (1978), therefore, suggested constructing discipline-specific impact factors, but their operationalization in terms of discipline-specific journal sets has remained a problem. Should such sets be defined with reference to the groups of researchers under evaluation (Moed et al., 1985) or rather in terms of the aggregated citation patterns among journals (Pinski & Narin, 1976; Garfield, 1998)? How can one disentangle the notion of hierarchy among journals and the juxtaposition of groups of journals in the various disciplines (Leydesdorff, 2006)?
Furthermore, as Price (1965) noted, different types of journal publications within similar fields can be expected to vary also in terms of their citation patterns. Within each field, some journals follow developments at the research front (e.g., in the form of letters), while other journals (e.g., review journals) have a longer-term scope. Thirdly, journals differ in terms of their “interdisciplinarity,” with Nature and Science as the prime examples (Narin et al., 1972), while others include sections of both general interest and disciplinary affiliations (e.g., PNAS and the Lancet). In addition to the “multidisciplinarity” or “interdisciplinarity” of journals at a general level, “interdisciplinarity” can also occur at the very specialized interface between established fields of science, as in the case of biotechnology and nanotechnology.
Three indicators of journals were codified in the ISI databases: impact factors, immediacy indices, and the so-called subject categories. These indicators are based on the Journal Citation Reports, which offer aggregated citation data among journals. However, the subject categorization of the ISI has remained the least objective among these indicators because the indicator is not citation-based. The ISI staff assigns journals to subjects on the basis of a number of criteria, among which are the journal’s title, its citation patterns, etc. (McVeigh, personal communication, 9 March 2006).
An unambiguous categorization of the journal set in terms of subject matters seems impossible because of the fuzziness of the subsets (Bensman, 2001). In addition to intellectual categories, journals belong to nations, publishing houses, and often to more than a single discipline (Leydesdorff & Bensman, 2006). The potential “interdisciplinarity” of journals makes it difficult to compare journals as units of analysis within a specific reference group of “disciplinary” journals.
“Interdisciplinarity” is often a policy objective, while new developments may take place at the borders of disciplines (Caswill, 2006; Zitt, 2005). New developments may lead to new journal sets or be accommodated within existing ones (Leydesdorff et al., 1994). For example, recent developments in nanotechnology have evolved at interfaces among applied physics, chemistry, and the material sciences. The delineation of a journal set in nanotechnology is therefore no easy task, while in the meantime a much more discrete set of journals in biotechnology has evolved. Existing classifications may have to be revised and innovated from the perspective of hindsight (Leydesdorff, 2002). The U.S. Patent and Trademark Office, for example, has launched a project to reclassify its existing database using “nanotechnology” as a new category at the level of individual patents.
Reclassification at the level of individual articles would mean changing the (controlled) keywords with hindsight (Lewison & Cunningham, 1988). However, this is unnecessary since scientific articles are organized into journals by a strong selection process of submission and peer review. The recursive selection processes lead to very strong structures and correspondingly skewed distributions. Garfield (1972, at p. 476) argued that a multidisciplinary core for all of science comprises no more than a thousand journals.
The citation structures among journals are updated each year because of changes in citation practices. However, in the case of “interdisciplinary” developments the classification may be more ambiguous because different traditions and standards are interfaced. Cross-links (e.g., citations) provide inroads for change in an otherwise (nearly) decomposable system (Simon, 1973).
The development of a measure of interdisciplinarity at the level of journals derived from this destabilizing effect on citation structures could be extremely useful as an early-warning indicator of new developments. In a previous attempt to develop such indicators, Leydesdorff et al. (1994) were able to show that new developments can be traced in terms of deviant being-cited patterns in various groups of neighboring journals. However, the opposite effect, namely that this deviant pattern also indicates new developments, could not be shown (Leydesdorff, 1994; Van den Besselaar & Heimeriks, 2001). Cross-links may have other functions as well. Like most research in the bibliometric field, these analyses of interdisciplinarity were based on the assumption that journals can be grouped either using the ISI subject categories (e.g., Leeuwen & Tijssen, 2000; Morillo et al., 2003) or on the basis of clustering citation matrices (Doreian & Farraro, 1985; Leydesdorff, 1986; Tijssen et al., 1987).
Before one can delineate groups of journals in “interdisciplinary” fields, one would need an indicator of “interdisciplinarity” at the level of individual journals. To what extent do articles in a specific journal feed into or draw upon different intellectual traditions? The focus on the position of individual agents in networks—in this case journals—has been developed in social network analysis more than in scientometrics (Otte & Rousseau, 2002).
2. Centrality Measures in Social Network Analysis
Social network analysis has developed as a specialty in parallel with scientometrics since the late 1970s. In a ground-laying piece, Freeman (1977) developed a set of measures of centrality based on betweenness. Freeman stated that “betweenness” as a structural property of communication was elaborated in the literature as the first measure of centrality (Bavelas, 1948; Schimbel, 1953). In a follow-up paper, Freeman (1978) gradually elaborated four concepts of centrality in a social network, which have since been further developed (Hanneman & Riddle, 2005; De Nooy et al., 2005): centrality in terms of degree, betweenness, closeness, and eigenvectors.
These measures and their further elaboration into relevant statistics were conveniently combined in the software package UCINet that Freeman and his collaborators have developed since the 1980s (Bonacich, 1987; Borgatti et al., 2002; Otte & Rousseau, 2002). A number of visualization programs for networks like Pajek and Mage interface with UCINet. The visualization and the statistics have become increasingly integrated.
Centrality in terms of degree is easiest to grasp because it is the number of relations a given node maintains. Degree can further be differentiated in terms of “indegree” and “outdegree,” that is, incoming or outgoing relations. In the case of a citation matrix, the total number of references provided by a textual unit of analysis (e.g., an article or a journal) can then be considered as its outdegree, and instances of its being cited as the indegree. Degree centrality is often normalized as a percentage of the degrees in a network.
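As a minimal sketch of these definitions, the following assumes a small hypothetical citation matrix in which rows are citing journals and columns are cited journals. The journal names and counts are invented for illustration. Ignoring the main diagonal and normalizing by the maximum possible degree (n - 1) are assumptions of this sketch and not necessarily the conventions of Pajek or UCINet.

```python
# Hypothetical toy data: cites[i][j] = number of times journal i cites journal j.
journals = ["A", "B", "C", "D"]
cites = [
    [5, 2, 0, 1],
    [3, 9, 4, 0],
    [0, 1, 7, 0],
    [2, 0, 0, 3],
]

def degrees(matrix):
    """Return (indegree, outdegree) lists after dichotomizing the matrix.

    A relation i -> j exists when matrix[i][j] > 0. The main diagonal
    (within-journal citations) is ignored; this is an assumption of the sketch.
    """
    n = len(matrix)
    outdeg = [sum(1 for j in range(n) if j != i and matrix[i][j] > 0)
              for i in range(n)]
    indeg = [sum(1 for i in range(n) if i != j and matrix[i][j] > 0)
             for j in range(n)]
    return indeg, outdeg

indeg, outdeg = degrees(cites)

# Normalize as a proportion of the maximum possible degree (n - 1).
n = len(journals)
indeg_norm = [d / (n - 1) for d in indeg]
```

In this toy matrix, journals A and B are each cited by two other journals and C and D by one each, so the indegree separates the two pairs even though the raw citation counts differ widely.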
“Betweenness” is a measure of how often a node (vertex) is located on the shortest path (geodesic) between other nodes in the network. It thus measures the degree to which the node under study can function as a point of control in the communication. If a node with a high level of betweenness were to be deleted from a network, the network would fall apart into otherwise coherent clusters. Unlike degree, which is a count, betweenness is normalized by definition as the proportion of all geodesics that include the vertex under study. If $g_{ij}$ is defined as the number of geodesic paths between i and j, and $g_{ikj}$ is the number of these geodesics that pass through k, k’s betweenness centrality is defined as (Farrall, 2005):
$$C_B(k) = \sum_{i} \sum_{j} \frac{g_{ikj}}{g_{ij}}, \qquad i \neq j \neq k$$
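The definition of betweenness given above can be illustrated with a short, self-contained sketch. It uses Brandes' breadth-first algorithm for an undirected, unweighted graph, which is a swapped-in technique and not necessarily the procedure implemented in Pajek or UCINet. The toy graph (two triangles joined through a vertex `k`) and the normalization by the number of vertex pairs, (n - 1)(n - 2) / 2, are assumptions made for illustration.

```python
from collections import deque

def betweenness(adj):
    """Normalized betweenness for an undirected, unweighted graph (n > 2).

    adj maps each vertex to a list of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting geodesics (sigma) to each vertex.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate the share of geodesics from s that pass through each vertex.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    pairs = (n - 1) * (n - 2) / 2
    # Each unordered pair was visited from both ends, hence the division by 2.
    return {v: score[v] / 2 / pairs for v in adj}

# Hypothetical graph: two dense groups (a, b, c) and (d, e, f) related only via k.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "k"],
    "k": ["c", "d"],
    "d": ["k", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
bc = betweenness(graph)
```

Vertex `k` has the lowest degree among the connecting vertices but the highest betweenness, because every geodesic between the two groups passes through it.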
“Closeness centrality” is also defined as a proportion. First, the distances of a vertex from all other vertices in the network are summed. Normalization is achieved by defining closeness centrality as the number of other vertices divided by this sum (De Nooy et al., 2005, p. 127). Because of this normalization, closeness centrality provides a global measure about the position of a vertex in the network, while betweenness centrality is defined with reference to the local position of a vertex.
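A minimal sketch of closeness as described here: the number of other vertices divided by the sum of the geodesic distances to them. The toy graphs are hypothetical. Returning `None` when some vertex cannot be reached is a choice made for this sketch, reflecting that the measure is not defined for a network that is not fully connected.

```python
from collections import deque

def closeness(adj, v):
    """Closeness of v: (n - 1) divided by the sum of distances to all others.

    Returns None when the graph has fewer than two vertices or when some
    vertex cannot be reached from v.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(adj) < 2 or len(dist) < len(adj):
        return None
    return (len(adj) - 1) / sum(dist.values())

# Hypothetical path a - b - c: the middle vertex is closest to all others.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
middle = closeness(path, "b")
end = closeness(path, "a")

# Hypothetical disconnected graph: closeness is left undefined.
split = {"a": ["b"], "b": ["a"], "c": []}
undefined = closeness(split, "a")
```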
Eigenvector analysis brings us back to approaches that are familiar from multivariate analysis. Principal component and factor analysis decompose a matrix in terms of the latent eigenvectors which determine the positions of nodes in a network, while graph analysis begins with the vectors of observable relations among nodes (Burt, 1982). How can these be grouped bottom-up using algorithms? For example, core-periphery relations can be made visible using graph-analytical techniques, but not by using factor-analytical ones (Wagner & Leydesdorff, 2005).
Betweenness is a relational measure. One can expect that a journal which is “between” will load on different factors because it does not belong to one of the dense groups, but relates them. The factor loadings of such journals may depend heavily on the factor-analytic model (e.g., the number of factors to be extracted by the analyst). For example, one might expect inter-factorial complexity among the factor loadings in the case of inter- or multidisciplinary journals (Van den Besselaar & Heimeriks, 2001; Leydesdorff, 2004). Closeness is less dependent on relations between individual vertices because a vertex can be close to two (or more) densely connected clusters. Closeness can thus be expected to provide us with a measure of “multidisciplinarity” within a set while betweenness may provide us with a measure of specific “interdisciplinarity” at interfaces.
3. Size, impact, and centrality
While the impact factor and the immediacy index are corrected for size (because the number of publications in the previous two years and the current year, respectively, is used in the denominator; cf. Bensman, forthcoming), centrality measures are sensitive to size. A further complication, therefore, is the possibility of spurious correlations between different centrality measures. Large journals (e.g., Nature) which one would expect to be “multidisciplinary” rather than “interdisciplinary,” might generate a high betweenness centrality because of their high degree centrality.
Normalization of the matrix for the size of patterns of citation can suppress this effect (Bonacich, personal communication, 22 May 2006). Fortunately, there is increasing consensus that normalization in terms of the cosine and using the vector-space model provides the best option in the case of sparse citation matrices (Ahlgren et al., 2003; Chen, 2006; Salton & McGill, 1983). Using the cosine for the visualization, a threshold has to be set because the cosine between citation patterns of locally related journals will almost never be equal to zero. However, the algorithms for computing centrality first dichotomize this matrix.
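The two steps mentioned here, normalizing with the cosine and then dichotomizing at a threshold, can be sketched as follows. The citation counts are invented, and the default threshold of 0.2 is only an illustrative assumption; the function names are hypothetical.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two citation vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def dichotomize(vectors, threshold=0.2):
    """Binary adjacency matrix: 1 where the cosine reaches the threshold, else 0."""
    n = len(vectors)
    return [
        [1 if i != j and cosine(vectors[i], vectors[j]) >= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical being-cited patterns of three journals.
patterns = [
    [40, 0, 10],  # large journal
    [8, 0, 2],    # one fifth the size, same citation pattern
    [0, 5, 0],    # unrelated pattern
]
adjacency = dichotomize(patterns)
similarity = cosine(patterns[0], patterns[1])
```

The first two journals differ by a factor of five in size but are linked after normalization, because the cosine compares the shape of the patterns and not their volume.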
[Figure 1: betweenness centrality of journals in the citation impact environment of Social Networks]
Actually, when I was working with visualizations of cosine-based journal maps (Leydesdorff, forthcoming-a, forthcoming-b), it occurred to me that the interdisciplinarity of journals corresponds with their visible position in the vector space. Figure 1, for example, shows the citation impact environment of Social Networks. Among the 54 journals citing Social Networks more than once in 2004,[1] this journal is on the shortest path between vertices in 15% of the possible cases, followed by the Journal of Mathematical Sociology with a value of 11% on betweenness centrality. The other journals have considerably lower values. The visual pattern of connecting different subgroups also follows the intuitive expectation of “interdisciplinarity” among these journals.

Figure 2: Betweenness centrality of Social Networks in its citation environment before normalization with the cosine.
Figure 2 contrasts this finding with the betweenness centrality in the unnormalized networks. Social Networks is still the journal with the largest betweenness value (0.07), but the Journal of Mathematical Sociology now has a score of 0.01. This is even lower than the corresponding value for the American Sociological Review (0.03). The latter is a much larger journal with a distinct disciplinary affiliation (that is, sociology). In sum, the visualization using unnormalized citation data can be expected to show neither the cluster structure in the data nor betweenness centrality among groups of nodes. One needs a normalization in terms of similarity patterns (using a similarity coefficient like the Pearson correlation or the cosine) to observe the latent structures in this data.
The research question of this paper is to address the phenomenon of betweenness centrality in the vector space systematically. I will first study the different centrality measures in the non-normalized matrix, then in the cosine-normalized one, and finally in a few applications, including some with obvious policy relevance (nanotechnology and biotechnology).
4. Methods and Materials
The data was harvested from CD-Rom versions of the Journal Citation Reports of the Science Citation Index and the Social Sciences Citation Index 2004. These two databases cover 5,968 and 1,712 journals, respectively. Since 301 journals are covered by both databases, a citation matrix can be constructed among (5,968 + 1,712 – 301) = 7,379 journals. Seven journals are not processed by the ISI in the “citing” dimension, but we shall focus below on the “cited” dimension of this matrix. This focus enables us to compare the centrality measures directly with well-established science citation indicators like impact factors, immediacy, etc.
Among the 7,379 vectors of the matrix representing the cited “patterns,” similarities were calculated using the cosine. Salton’s cosine is defined as the cosine of the angle enclosed between two vectors x and y as follows (Salton & McGill, 1983):
$$\text{Cosine}(x,y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
The cosine is very similar to the Pearson correlation coefficient, except that the latter measure normalizes the values of the variables with reference to the arithmetic mean (Jones & Furnas, 1987). The cosine normalizes with reference to the geometrical mean. Unlike the Pearson correlation coefficient, the cosine is non-metric and does not presume normality of the distribution (Ahlgren et al., 2003). An additional advantage of this measure is its further elaboration into the so-called vector-space model for the visualization (Chen, 2006).
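The relation between the two coefficients can be made concrete with a small sketch: the Pearson correlation equals the cosine computed after subtracting the arithmetic mean from each vector. The vectors below are invented, and the helper names are hypothetical. The sketch assumes non-constant, non-zero vectors, since otherwise a norm in the denominator is zero.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def pearson(x, y):
    """Pearson r computed as the cosine of the mean-centered vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])

x = [1, 2, 3]
y = [2, 3, 4]  # the same profile shifted by a constant

r = pearson(x, y)  # centering removes the shift
c = cosine(x, y)   # the cosine is sensitive to the shift
```

Here the Pearson correlation treats the two profiles as identical, while the cosine remains slightly below one because it is computed from the origin and not from the mean.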
Note that the two matrices (that is, the matrix of citation data and the matrix of cosine values) are very different: the cosine matrix is a symmetrical matrix with unity on the main diagonal, while citation matrices are asymmetrical transaction matrices, usually with outliers (within-journal “self”-citations) on the main diagonal (Price, 1981). The topography of the vector space spanned by the cosine values is accordingly different from the topography of the multi-dimensional space spanned by the vectors of citation values.
Subsets can be extracted from the database in order to measure the relations among journals that are citing a specific journal. I shall call these subsets the local citation impact environments of the journal under study. Betweenness centrality and other centrality measures will differ within these local citation environments from their values in the global set because any two journals within a local set can also be related through the mediation of journals outside the subset.
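Extracting such a local environment can be sketched as selecting the journals that cite a seed journal at least a given number of times and keeping only the citation relations among them. The matrix, the journal names, the function name, and the default cut-off of two citations are assumptions made for illustration. Always including the seed journal itself is a further assumption of this sketch.

```python
def local_environment(cites, journals, seed, min_citations=2):
    """Return (names, submatrix) for the journals citing `seed` often enough.

    cites[i][j] is the number of times journal i cites journal j.
    The seed journal is always kept (an assumption of this sketch).
    """
    s = journals.index(seed)
    members = [i for i in range(len(journals))
               if i == s or cites[i][s] >= min_citations]
    names = [journals[i] for i in members]
    sub = [[cites[i][j] for j in members] for i in members]
    return names, sub

# Hypothetical global matrix for four journals.
journals = ["A", "B", "C", "D"]
cites = [
    [5, 2, 0, 1],
    [3, 9, 4, 0],
    [1, 1, 7, 0],
    [0, 0, 2, 3],
]
names, sub = local_environment(cites, journals, "A")
```

In this toy example the environment of A contains only A and B. Journal C cites A once and is dropped, so any path between A and B that runs through C in the global set is no longer visible in the local one.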
For the computation of centrality measures I use exclusively the methods available within the Pajek environment. This allows for a one-to-one correspondence between the visualizations and the algorithmic results. (The normalizations are sometimes slightly different between UCINet and Pajek.) Although UCINet is faster and richer in providing various computational options, Pajek is currently able to analyze centrality in asymmetrical matrices in both directions. Given our interest in asymmetrical citation matrices, this can be an advantage. The analysis focuses on degree centrality, betweenness centrality, and closeness centrality because eigenvector analysis is used in Pajek only as a means for the visualization. When displaying the citation impact environments (Leydesdorff, forthcoming-a and forthcoming-b), I shall use the vertical size for the relative citation contributions of journals in a specific environment, and the horizontal size for the same measure, but after correction for within-journal citations.
5. Centrality at the level of the Journal Citation Reports
5.1 The asymmetrical citation matrix
The asymmetrical citation matrix contains two structures, one in the “cited” and another in the “citing” dimension of the matrix. Pajek provides options to compute the three centrality measures (degree, betweenness, and closeness) in both directions. Thus, six indicators can be measured across the file. The values on these six indicators can be compared with more traditional science citation indicators like “impact,” “immediacy,” and “total citations.” (The values of the six [two times three] centrality measures for the 7,379 journals are available online at http://www.leydesdorff.net/jcr04/centrality/index.htm .)
Rotated Component Matrix(a)

| | Component 1 | Component 2 | Component 3 |
|---|---|---|---|
| Number of issues | .924 | | .185 |
| Total number of references (citing) | .909 | .210 | .237 |
| Within journal “self”-citations | .815 | .152 | |
| Betweenness (citing) | .740 | | .103 |
| Total number of citations (cited) | .672 | .639 | |
| Immediacy | | .806 | .267 |
| Impact | | .802 | .295 |
| **Indegree (cited)** | .405 | .713 | .381 |
| **Betweenness (cited)** | .261 | .691 | -.240 |
| **Closeness (cited)** | | | .776 |
| Closeness (citing) | .190 | .413 | .663 |
| Outdegree (citing) | .498 | .356 | .633 |

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a Rotation converged in 5 iterations.
Table 1: Three-factor solution of the matrix of 7,379 journals versus six centrality measures and a number of science (citation) indicators.
Table 1 shows the rotated three-factor solution for the matrix of 7,379 journals versus the various science indicators and centrality measures as variables. Three factors explain 73.5% of the variance. Factor One (46.9%) can be designated as indicating the size of journals, Factor Two (16.4%) registers the effects of citations (“impact,” etc.), and Factor Three (10.3%) seems to indicate the reach of a communication through citation. The strong relation between immediacy and impact has previously been noted by Yue et al. (2004). The further elaboration of the relation between centrality measures and science citation indicators would lead me beyond the scope of this study.
In Table 1, the three indicators on which we will now focus our attention are shown in boldface. First, one can note the difference in sign for “betweenness centrality” and “closeness centrality” on the third factor, but as expected, this negative correlation is overshadowed by the commonality between “betweenness centrality” and “indegree” on the first two factors.
Correlations

| | | Indegree | Betweenness cited | Closeness cited |
|---|---|---|---|---|
| Indegree | Pearson Correlation | 1 | .509(**) | .651(**) |
| | Sig. (2-tailed) | | .000 | .000 |
| | N | 7379 | 7379 | 7379 |
| Betweenness cited | Pearson Correlation | .509(**) | 1 | .210(**) |
| | Sig. (2-tailed) | .000 | | .000 |
| | N | 7379 | 7379 | 7379 |
| Closeness cited | Pearson Correlation | .651(**) | .210(**) | 1 |
| | Sig. (2-tailed) | .000 | .000 | |
| | N | 7379 | 7379 | 7379 |

** Correlation is significant at the 0.01 level (2-tailed).
Table 2: Correlations among the centrality measures in the cited dimension (N = 7,379).
Table 2 provides the correlation coefficients among the three centrality measures. Because of the large N (= 7,379) all correlations are significant. However, the correlation between closeness and betweenness is considerably lower (r = 0.21; p < 0.01) than the other correlations (r > 0.5; p < 0.01).
| Journal | Indegree | Journal | Betweenness | Journal | Closeness |
|---|---|---|---|---|---|
| Science | 4904 | Science | 0.098921 | Science | 0.538172 |
| Nature | 4555 | Nature | 0.067541 | Nature | 0.522138 |
| P Natl Acad Sci USA | 3776 | P Natl Acad Sci USA | 0.039714 | P Natl Acad Sci USA | 0.490666 |
| Lancet | 2834 | Lancet | 0.013324 | Lancet | 0.456274 |
| New Engl J Med | 2780 | JAMA-J Am Med Assoc | 0.011943 | New Engl J Med | 0.453366 |
| J Biol Chem | 2674 | New Engl J Med | 0.011665 | JAMA-J Am Med Assoc | 0.442401 |
| JAMA-J Am Med Assoc | 2510 | Brit Med J | 0.009516 | Ann NY Acad Sci | 0.441714 |
| Ann NY Acad Sci | 2375 | J Am Stat Assoc | 0.009486 | J Biol Chem | 0.440729 |
| Brit Med J | 2228 | Ann NY Acad Sci | 0.008139 | Brit Med J | 0.433717 |
| Biochem Bioph Res Co | 2075 | J Biol Chem | 0.007159 | Biochem Bioph Res Co | 0.420714 |
Table 3: Top-10 journals on three network indicators of centrality in the being-cited direction.
Table 3 shows the ten journals with the highest values on these three indicators. The set for “indegree” overlaps completely with that for “closeness,” and these two sets differ by only a single journal from the list for “betweenness”: the Journal of the American Statistical Association is included in the latter set, while Biochemical and Biophysical Research Communications is not. In other words, the three measures may indicate different dimensions, but they do not discriminate sufficiently among one another to provide us with a measure of “interdisciplinarity” or “multidisciplinarity” at the level of the file.
5.2 The centrality measures in the vector space
Let us turn now to the vector space of these 7,379 vectors, while continuing to focus on the cited dimension. Closeness centrality cannot be computed in the vector space since the network is not fully connected. Betweenness centrality and degree correlate at r = 0.69 (p < 0.01). Table 4 provides the top ten journals on these two indicators.
| Journal | Degree | Journal | Betweenness |
|---|---|---|---|
| Science | 0.979534 | Science | 0.2860 |
| Nature | 0.958254 | Nature | 0.2106 |
| Sci Am | 0.950935 | Sci Am | 0.1946 |
| J Am Stat Assoc | 0.942667 | J Am Stat Assoc | 0.1785 |
| Ann NY Acad Sci | 0.935484 | Brit Med J | 0.1471 |
| P Natl Acad Sci USA | 0.928707 | Lancet | 0.1469 |
| Lancet | 0.925047 | Ann NY Acad Sci | 0.1409 |
| Biometrika | 0.921523 | Am Econ Rev | 0.1366 |
| New Engl J Med | 0.910952 | P Natl Acad Sci USA | 0.1363 |
| JAMA-J Am Med Assoc | 0.898075 | Biometrika | 0.1350 |
Table 4: Top-10 journals in the vector space (being-cited direction).
Eight of the ten journals occur on both lists, and the order of the top four is the same. There are important differences from the top-10 lists provided in Table 3. However, it is no longer clear what we are measuring. Both measures correlate, for example, at the level of r = 0.47 (p < 0.01) with the impact factor, but in themselves they do not have a clear interpretation other than the fact that Science and Nature have the highest centrality at the global level, no matter how one measures the indicator.
6. The local citation impact environments
6.1. Social Networks as an example
Let us return to our example of the journal Social Networks for a more precise understanding of what centrality measures may mean in local citation environments. Social Networks is included in the Social Sciences Citation Index, but it relates also to journals which are included in the Science Citation Index. In the combined set, Social Networks is cited by 54 journals (as against 40 in the Social Sciences Citation Index). Figure 3 provides the visualization of these journals with the cosine as the similarity measure. The vertical and horizontal axes of the vertices are proportional to the citation impact in this environment with and without within-journal citations, respectively.
Eleven journals are grouped in the bottom right corner because they are isolates in this context. Social Networks, and to a lesser extent the Journal of Mathematical Sociology, are central in relating major clusters such as two groups of social-science journals (sociology and management science), a physics group, and a group of computer-science journals and statistics. However, the contribution of the two centrally positioned journals to the citation impact in this network is extremely small: only 0.41% for Social Networks and 1.07% for the Journal of Mathematical Sociology.
Figure 3: Citation impacts of fifty-four journals which cited Social Networks more than once in 2004 (N = 7,379; cosine ≥ 0.2).
Visual inspection of Figure 3 suggests that these two journals (Social Networks and the Journal of Mathematical Sociology) are central in relating the various clusters. Using betweenness as a measure, Pajek enables us to draw the vectors for the various measures of centrality and to display the vertices in terms of the values of these vectors. In Figure 1 above, “betweenness centrality” was thus used as the indicator in this same environment.
Correlations

| | | Degree | Betweenness | Closeness | Local impact |
|---|---|---|---|---|---|
| Degree | Pearson Correlation | 1 | .724(**) | .877(**) | -.009 |
| | Sig. (2-tailed) | | .000 | .000 | .949 |
| | N | 54 | 54 | 54 | 54 |
| Betweenness | Pearson Correlation | .724(**) | 1 | .542(**) | -.035 |
| | Sig. (2-tailed) | .000 | | .000 | .801 |
| | N | 54 | 54 | 54 | 54 |
| Closeness | Pearson Correlation | .877(**) | .542(**) | 1 | -.001 |
| | Sig. (2-tailed) | .000 | .000 | | .991 |
| | N | 54 | 54 | 54 | 54 |
Local impact |
Pearson Correlation |
-.009 |
| ||