Large Networks among Institutional Addresses: InstPlus.EXE
This website provides a routine that generates a large network of institutional addresses based on co-authorship relations in a set of papers downloaded from the Web of Science (WoS, v5). The number of institutions in the input is not limited. The output is written to a file “mtrx.net” in the Pajek format (edgelist).
The routine takes as input a download from WoS in the tagged format. This file must be named “data.txt” and be available in the same folder (see http://www.leydesdorff.net/software/isi for further explanation). The first subfield of each address under C1 is whole-number counted. For example:
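For orientation, a Pajek edgelist of this kind can be sketched in Python as follows. This is an illustration only and not the code of InstPlus.exe: the function name `to_pajek`, the alphabetical numbering of vertices, and the exact line layout are assumptions; only the `*Vertices` / `*Edges` structure is the usual Pajek convention. The counts are those of the worked example further down this page.

```python
def to_pajek(relations):
    """Format {(name_a, name_b): weight} as a Pajek edgelist string.

    Hypothetical helper for illustration; the real "mtrx.net" may differ in
    vertex order and spacing.
    """
    # Number the vertices from 1, in alphabetical order (an assumption).
    labels = sorted({name for pair in relations for name in pair})
    index = {name: i for i, name in enumerate(labels, start=1)}

    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[name]} "{name}"' for name in labels]
    lines.append("*Edges")
    for (a, b), weight in sorted(relations.items()):
        lines.append(f"{index[a]} {index[b]} {weight}")
    return "\n".join(lines) + "\n"


# Whole-number counts taken from the worked example on this page.
example = {
    ("Hanyang Univ", "Hanyang Univ"): 1,
    ("Hanyang Univ", "Hunan Univ Arts & Sci"): 2,
    ("Hanyang Univ", "Tianjin Univ"): 2,
    ("Hunan Univ Arts & Sci", "Tianjin Univ"): 1,
}
print(to_pajek(example))
```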
C1 [Zhuang, Yongbing; Seong, Jong Geun; Do, Yu Seong; Jo, Hye Jin; Wang, Gang; Guiver, Michael D.; Lee, Young Moo] Hanyang Univ, Coll Engn, Dept Energy Engn, Seoul 133791, South Korea.
[Zhuang, Yongbing] Hunan Univ Arts & Sci, Coll Chem & Chem Engn, Changde 415000, Hunan, Peoples R China.
[Lee, Moon Joo] Hanyang Univ, Coll Engn, Sch Chem Engn, Seoul 133791, South Korea.
[Guiver, Michael D.] Tianjin Univ, Sch Mech Engn, State Key Lab Engines, Tianjin 300072, Peoples R China.
RP Guiver, MD (reprint author), Hanyang Univ, Seoul, South Korea.
“Hanyang Univ” is counted in the first and third address, but not in the RP (reprint) address. Consequently, there is one loop for “Hanyang Univ”, and there are two relations between “Hanyang Univ” and “Hunan Univ Arts & Sci” (and two between “Hanyang Univ” and “Tianjin Univ”). Between “Hunan Univ Arts & Sci” and “Tianjin Univ” only a single relation is counted.
The RP field is not included because, since 2008 (WoS v.5), this field has mainly been an extension of one of the addresses under C1. Before 2008, RP was used unsystematically as an extension or as an additional address (Costas & Iribarren-Maestro, 2007).
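The counting in this example can be reproduced with a minimal Python sketch. It is not the source of InstPlus.exe; the function names `institutions` and `whole_count` are hypothetical, and the sketch assumes that the “C1” tag has already been stripped and that each address occupies one line. One relation is counted for every pair of address lines in a record, so that two lines with the same institution yield a loop.

```python
import re
from collections import Counter
from itertools import combinations


def institutions(c1_lines):
    """Return the first subfield (the institution) of each C1 address line."""
    names = []
    for line in c1_lines:
        # Drop the optional "[Author; Author]" prefix in front of the address.
        address = re.sub(r"^\s*\[[^\]]*\]\s*", "", line)
        names.append(address.split(",")[0].strip())
    return names


def whole_count(names):
    """Count one relation for every pair of address lines in a record."""
    relations = Counter()
    for a, b in combinations(names, 2):
        # Sort the pair so that (A, B) and (B, A) are the same undirected edge.
        relations[tuple(sorted((a, b)))] += 1
    return relations


c1 = [
    "[Zhuang, Yongbing; Seong, Jong Geun; Do, Yu Seong; Jo, Hye Jin; Wang, Gang; "
    "Guiver, Michael D.; Lee, Young Moo] Hanyang Univ, Coll Engn, Dept Energy Engn, "
    "Seoul 133791, South Korea.",
    "[Zhuang, Yongbing] Hunan Univ Arts & Sci, Coll Chem & Chem Engn, "
    "Changde 415000, Hunan, Peoples R China.",
    "[Lee, Moon Joo] Hanyang Univ, Coll Engn, Sch Chem Engn, Seoul 133791, South Korea.",
    "[Guiver, Michael D.] Tianjin Univ, Sch Mech Engn, State Key Lab Engines, "
    "Tianjin 300072, Peoples R China.",
]

for pair, count in sorted(whole_count(institutions(c1)).items()):
    print(pair, count)
```

The RP line is deliberately left out of `c1`, in line with the routine's treatment of that field.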
August 30, 2015.
A further routine was added to InstPlus.exe that generates a second matrix “fmtrx.net” containing the same information as “mtrx.net”, but fractionally counted. In the above example, four addresses would be counted and each would be attributed 1/4 of a point. The co-occurrence contribution between each two of these addresses would in this case be (¼ * ¼) * 2 = 1/8. (The product is multiplied by two because the file is written as edges, each of which subsumes the two mutual arcs.)
Note that the program does not disambiguate addresses, but processes them as provided in the input file. For example, “Gong Li Hosp” and “Gongli Hosp” are counted as two different addresses. The user is advised to disambiguate the address information before processing.
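The arithmetic of the fractional count can be sketched in Python as follows. The function name `fractional_count` is hypothetical, and two points are assumptions rather than documented behaviour of InstPlus.exe: contributions of repeated pairs are summed as in the whole-number count, and the loop is weighted in the same way as any other pair.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def fractional_count(names):
    """Fractionally count the relations among the addresses of one record."""
    relations = Counter()
    n = len(names)
    if n < 2:
        return relations  # a single address has no co-occurrences
    share = Fraction(1, n)            # each address is attributed 1/n
    contribution = share * share * 2  # times two: one edge covers both arcs
    for a, b in combinations(names, 2):
        relations[tuple(sorted((a, b)))] += contribution
    return relations


# First subfields of the four C1 addresses in the example above.
names = ["Hanyang Univ", "Hunan Univ Arts & Sci", "Hanyang Univ", "Tianjin Univ"]

for pair, weight in sorted(fractional_count(names).items()):
    print(pair, weight)
```

Under these assumptions, each pair of addresses contributes 1/8, so that “Hanyang Univ” and “Hunan Univ Arts & Sci”, which co-occur in two pairs, would receive 1/4 in total.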
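One possible way to harmonize name variants before processing is a user-maintained lookup table applied to the first subfield of each address line. The Python sketch below is only a suggestion and not part of InstPlus.exe; the names `ALIASES`, `ADDRESS`, and `harmonize` are hypothetical, the choice of “Gongli Hosp” as the preferred form is arbitrary, and the address line in the usage example is invented for illustration.

```python
import re

# Hypothetical lookup table: variant spelling -> preferred spelling.
ALIASES = {"Gong Li Hosp": "Gongli Hosp"}

# Optional "C1 " tag, optional "[authors]" prefix, then the first subfield.
ADDRESS = re.compile(
    r"^(?P<prefix>(?:C1\s)?\s*(?:\[[^\]]*\]\s*)?)(?P<inst>[^,]+)(?P<rest>.*)$"
)


def harmonize(line, aliases=ALIASES):
    """Replace the institution in one address line by its preferred spelling."""
    match = ADDRESS.match(line)
    if match is None:
        return line  # leave lines that do not look like an address untouched
    inst = match.group("inst").strip()
    return match.group("prefix") + aliases.get(inst, inst) + match.group("rest")


# Invented address line, for illustration only.
print(harmonize("[Author, A.] Gong Li Hosp, Dept A, City B."))
```

Names that are not in the table are passed through unchanged, so the table only needs to list the variants that the user wants to merge.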
September 23, 2015.
Costas, R., & Iribarren-Maestro, I. (2007). Variations in content and format of ISI databases in their different versions: The case of the Science Citation Index in CD-ROM and the Web of Science. Scientometrics, 72(2), 167-183.