A user-friendly method for generating overlay maps (2012 update)
(appendix 1 to:
· Ismael Rafols, Alan L. Porter, and Loet Leydesdorff, “Science overlay maps: a new tool for research policy and library management,” Journal of the American Society for Information Science & Technology 61(9) (2010) 1871-1887; <pdf-version>;
· Loet Leydesdorff, Stephen Carley, and Ismael Rafols, “Global Maps of Science based on the new Web-of-Science Categories,” Scientometrics (in press).)
We follow the method introduced in “Science overlay maps: a new tool for research policy and library management” (cf. Leydesdorff & Rafols, 2009; Rafols & Meyer, 2010) to create overlay maps on the basis of a global map of science. The steps described below rely on access to the Web of Science and the files available in our mapping kit. The objective is to obtain the set of Web-of-Science Categories (WCs) for a given set of articles, provide this to network software, and output overlay information that can be added to a suitable basemap. The procedures for using Pajek and/or VOSviewer are described below.
First, the analyst has to conduct a search in the Thomson Reuters Web of Science (www.isiknowledge.com). This initial step is crucial and should be done carefully: author names, for example, can be retrieved with different initials; addresses are sometimes inaccurate; and only some document types may be of interest (e.g., only the so-called citable items: articles, proceedings papers, and reviews). Once the analyst has chosen a set of documents from searches at the Web of Science, one can click the tab Analyze results at the top right of the results page. On the new webpage, the selected document set can then be analysed along various criteria (top left-hand tab). Choosing Web of Science Categories produces a list with the number of documents in each Category in Web of Science version 5. (This field was called “ISI Subject Categories” in version 4. Note that the abbreviation “SC” for subject categories is still available, but these are differently composed.) The resulting list can be downloaded into a file with the default name analyze.txt.
This file “analyze.txt”—make sure that the file has this name!—can be transformed by the mini-programme WC10.exe to WC10.vec for upload into Pajek as a vector, and to the files vos4.csv, vos6.csv, and vos19.csv for use in VOSviewer (with 4, 6 or 19 base colors for the clusters, respectively).
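For readers who wish to see what such a transformation involves, the following Python sketch is offered as an illustration only; it is not the code of WC10.exe. It assumes that analyze.txt is tab-separated, with the category name in the first column and the record count in the second, and that names can be matched case-insensitively. The order of the categories in the basemap has to be supplied by the user (the argument basemap_order is hypothetical, as are both function names). The Pajek vector format itself consists of a line "*Vertices n" followed by one value per vertex.

```python
import csv
import io


def read_category_counts(text):
    """Parse rows of the form 'category<TAB>count<TAB>...' into a dict.

    Assumption: the layout of analyze.txt is tab-separated as described
    in the lead-in; rows whose second column is not a whole number
    (header, footer, blank lines) are skipped.
    """
    counts = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 2:
            continue
        name, raw = row[0].strip(), row[1].strip()
        if not raw.isdigit():
            continue
        counts[name.upper()] = int(raw)
    return counts


def to_pajek_vector(counts, basemap_order):
    """Return the text of a Pajek vector file.

    basemap_order (hypothetical) lists the category names in the order
    of the vertices in the basemap; categories without documents get 0.
    """
    lines = ["*Vertices %d" % len(basemap_order)]
    for name in basemap_order:
        lines.append(str(counts.get(name.upper(), 0)))
    return "\n".join(lines) + "\n"
```

With an invented three-category basemap, to_pajek_vector(read_category_counts(text), ["ECONOMICS", "HISTORY", "PHYSICS, APPLIED"]) would yield a vector of three values, with a zero for any category that is absent from the search results.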
One can download and install the freeware programme Pajek for network analysis and visualization. After saving the basemap map10.paj to disk and opening Pajek, press F1 and read the basemap. Then, go to the main menu File>Vector>Read to upload the file “WC10.vec” prepared above. Selecting Draw>Draw-Partition-Vector from the menu (or pressing Ctrl-Q) generates the overlay map. At this stage, the size of the nodes will often need adjustment, which can be done by selecting Options>Size of Vertices in the new draw window. To obtain the standard colour settings, load the file Colour_Settings.ini via Options>Ini File>Load in the main Pajek window. Ctrl-L and Ctrl-D display and delete, respectively, the labels of each WC. Clicking on a node allows one to move the WC to another position. The image can be exported by selecting Export>2D>Bitmap in the menu of the Draw window. (See also here for improving the picture.) A further optional step is to label the map in terms of factors by importing this image into PowerPoint and labelling groups of clusters, as shown in the file basemaps.ppt.
(An alternative procedure for more experienced users is to download the records of a document set found in the Web of Science. This can be done by first adding the desired documents to the Marked list (bottom bar), and then going to Marked list (top bar) and downloading the documents in Tagged Field format after selecting Subject category as one of the fields to download. The downloaded file should be renamed data.txt and used as input to the programme ISI.exe. One of the outputs of ISI.exe is the file wc10.vec, which can be used in Pajek as explained above. The advantage of this procedure is that ISI.exe also produces other files with information on fields such as authors or journals that may be of interest. Feel free to contact the authors in case of difficulty.)
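To illustrate what counting categories from a tagged download involves, a minimal Python sketch follows. It is not the code of ISI.exe. It assumes the usual layout of the Tagged Field format: each field starts with a two-letter tag, continuation lines are indented, the tag “WC” holds the Web-of-Science Categories separated by semicolons, and “ER” closes a record. The function name count_categories is hypothetical.

```python
from collections import Counter


def count_categories(tagged_text):
    """Count how many records carry each Web-of-Science Category.

    Assumptions (see lead-in): two-letter field tags at the start of a
    line, indented continuation lines, categories in the 'WC' field
    separated by ';', and 'ER' marking the end of a record.
    """
    counts = Counter()
    current_tag = None
    wc_parts = []
    for line in tagged_text.splitlines():
        tag = line[:2]
        if tag.strip():
            # A non-blank tag opens a new field.
            current_tag = tag
        if current_tag == "WC" and line[3:].strip():
            # Collect the WC field, including its continuation lines.
            wc_parts.append(line[3:].strip())
        if tag == "ER":
            # End of record: split the collected field into categories.
            for category in " ".join(wc_parts).split(";"):
                if category.strip():
                    counts[category.strip()] += 1
            wc_parts = []
            current_tag = None
    return counts
```

The resulting counts could then be written as a Pajek vector in the order of the categories in the basemap.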
Four macro fields (December 2010; February 2012)
In addition to the default colouring based on distinguishing 19 factors, we added a second partition to the basemap (map10.paj) with four groups as macro-fields: biomedical, environmental, physical, and social sciences. This clustering is based on the algorithm in VOSviewer (Waltman et al., 2010). Note that this attribution of four colours can also be used with the .vec file based on the 224 Categories as described above; one only has to change the choice of the partition and to redraw using Ctrl-Q or Draw>Draw-Partition-Vector. Similarly, one can use a clustering based on the six-factor solution.
The easiest way to generate a science map is to use the visualisation programme VOSviewer. Click on the ‘Open’ tab in VOSviewer. The program WC10.exe generates three files which can be opened in VOSviewer as the so-called “map-files”: vos4.csv, vos6.csv, and vos19.csv. These files are (as above) based on 4, 6, or 19 clusters with different colors. (The extension “csv” stands for “comma-separated values”; the files can be edited both in Excel and using a text editor.) One is advised to consult the VOSviewer manual (in the left pane of the program after installation) for further options such as different colouring. For experienced users, the network file is available from here (cf. Leydesdorff & Rafols, in press).
Extension for GEPHI
Clement Levallois < CLevallois@rsm.nl > kindly provided an Excel file with a macro <gephi.xlsm> that generates the corresponding input file for GEPHI (as an alternative to Pajek for the visualization). Save this file under the name gephi.xlsm by right-clicking on the hyperlink.
Both procedures (ISI.exe or WC10.exe) also provide a file wc10.dbf. This file can be used as input to the computation of the Stirling-Rao diversity measure using the instructions provided here.
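As an indication of what this measure computes, the following Python sketch evaluates the diversity as the sum of p_i * p_j * d_ij over all pairs of different categories i and j, where p_i is the share of documents in category i and d_ij = 1 - s_ij is derived from the cosine similarity s_ij between the categories (cf. Rafols & Meyer, 2010). Summing over ordered pairs (so that each pair is counted twice) is an assumption of this sketch; the linked instructions may use a different convention, and the function name rao_stirling is hypothetical.

```python
def rao_stirling(counts, similarity):
    """Diversity = sum over i != j of p_i * p_j * (1 - s_ij).

    counts      list of document counts per category
    similarity  square matrix (list of lists) of cosine similarities,
                in the same category order as counts
    Assumption: ordered pairs are summed, so each pair counts twice.
    """
    total = float(sum(counts))
    if total == 0:
        raise ValueError("no documents in any category")
    shares = [c / total for c in counts]
    n = len(shares)
    return sum(
        shares[i] * shares[j] * (1.0 - similarity[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
```

For two equally sized categories with a cosine similarity of 0.2, this gives 2 * 0.5 * 0.5 * 0.8 = 0.4; a set confined to a single category has a diversity of zero.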
For the installation for previous years (2007, 2008, and 2009) one can click here.
The full materials (citation matrix, cosine matrices, and rotated factor matrices) for 2010 are available from here.
Leydesdorff, L. & Rafols, I. (2009). A Global Map of Science Based on the ISI Subject Categories. Journal of the American Society for Information Science & Technology, 60(2), 348-362.
Leydesdorff, L. & Rafols, I. (in press). Interactive Overlays: A New Method for Generating Global Journal Maps from Web-of-Science Data. Journal of Informetrics.
Rafols, I. & Meyer, M. (2010). Diversity and Network Coherence as indicators of interdisciplinarity: case studies in bionanoscience. Scientometrics, 82(2), 263-287.
Rafols, I., Porter, A. L., & Leydesdorff, L. (2010). Science overlay maps: a new tool for research policy and library management. Journal of the American Society for Information Science & Technology, 61(9), 1871-1887.
Waltman, L., van Eck, N. J., & Noyons, E. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629-635.
Amsterdam, February 8, 2012
Alternatively, one can first convert “analyze.txt” into a plain-text file with the same name (“analyze.txt”) using Word or http://www.fileformat.info/convert/text/utf2utf.htm, and then use wc10_win32.exe.
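Such a conversion can also be scripted. The Python sketch below is one possible approach and makes several assumptions that are not stated above: that the downloaded file is encoded as UTF-8 or UTF-16 (recognised by its byte-order mark), and that plain ASCII is an acceptable target, with any other character replaced by a question mark. The function name to_plain_ascii is hypothetical.

```python
import codecs


def to_plain_ascii(raw):
    """Re-encode the bytes of a downloaded file as plain ASCII.

    Assumptions: the input is UTF-8 or UTF-16; a byte-order mark, if
    present, identifies the encoding; characters outside ASCII may be
    replaced by '?'.
    """
    if raw.startswith(codecs.BOM_UTF8):
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    elif raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The utf-16 codec reads the byte-order mark and removes it.
        text = raw.decode("utf-16")
    else:
        text = raw.decode("utf-8", errors="replace")
    return text.encode("ascii", errors="replace")
```

One would read analyze.txt in binary mode, pass the bytes through this function, and write the result back under the same name.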