ISI.exe is freely available for academic usage.
The input file must be saved as a so-called marked list in the tagged format from the Science Citation Index (or the Social Sciences Citation Index or the Arts & Humanities Citation Index) at the Web of Science (v5). Do not use the default filename “savedrecs.txt”; save the file as “data.txt” instead.
The program produces several output files in dBase format. Core.dbf contains the unique information per record; Au.dbf the author names; af.dbf the author names with full first names; cs.dbf the address information (including the response address if the normal addresses are missing; Costas & Iribarren-Maestro, Scientometrics, 2007); cr.dbf the cited references; and csau.dbf the correspondence between authors and addresses, insofar as this is available in the input file.
These files can be read into Excel and/or SPSS for further processing. They can also be used in MS Access for relational database management. If the abstracts were also downloaded, an additional file “abstract.txt” is generated which contains the same information for the abstract field in tabular (delimited) format. A delimiter “|” is inserted between the number of the document and the text of the abstract; this delimiter can be used, for example, when importing the file into Excel.
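As an illustration of how the “|” delimiter in “abstract.txt” can be used outside Excel, the following is a minimal Python sketch. It assumes that each line holds one document number, then “|”, then the abstract; the function name parse_abstract_lines is hypothetical and not part of the program. The line is split at the first “|” only, so that a “|” occurring inside an abstract is kept.

```python
def parse_abstract_lines(lines):
    """Split 'number|abstract' lines into (number, abstract) pairs.

    Assumption: one record per line, with the document number before
    the first '|'. Blank lines and lines without '|' are skipped.
    """
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip empty lines
        number, sep, abstract = line.partition("|")  # split at first '|' only
        if not sep:
            continue  # no delimiter found; not a record line
        records.append((number.strip(), abstract.strip()))
    return records
```

The function accepts any iterable of lines, for example an opened file object for “abstract.txt”; the encoding to use when opening that file depends on the download and is not specified here.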
The program is based on DOS-legacy software, but has been recompiled using Win32 software. It is best run in a Command Box (C-prompt) under Windows. The program and the input file have to be in the same folder. The output files are written into this same folder. Please note that existing files from a previous run are overwritten by the program. Save the output elsewhere if you wish to continue working with these materials. When run from the C-prompt, the program provides error messages in case something goes wrong.
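Because output from a previous run is overwritten, one may wish to copy the results to another folder before running the program again. The following Python sketch shows one way to do this. The function name backup_outputs and the choice of file patterns (the .dbf files and “abstract.txt”) are assumptions made for this illustration; adjust them to the files you want to keep.

```python
import shutil
from pathlib import Path


def backup_outputs(work_dir, backup_dir, patterns=("*.dbf", "abstract.txt")):
    """Copy output files from work_dir to backup_dir; return the names copied.

    Assumption: the files worth keeping match the given patterns.
    The input file data.txt is deliberately not included.
    """
    work = Path(work_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the backup folder if needed
    copied = []
    for pattern in patterns:
        for path in sorted(work.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, dest / path.name)  # keeps timestamps
                copied.append(path.name)
    return copied
```

Calling backup_outputs(".", "run1") from the folder that contains the program would, under these assumptions, place copies of the output files in a subfolder named “run1”.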
If you use this software, an appropriate reference for its source would be: Leydesdorff, L. (1989). Words and Co-Words as Indicators of Intellectual Organization. Research Policy, 18, 209-223.
BibCoupl additionally produces files for the analysis of bibliographic coupling within the set. The files with the extension “.dat” (cosine.dat and coocc.dat) are in DL-format (ASCII) and can be read directly into Pajek for visualization (Pajek is freely available at http://vlado.fmf.uni-lj.si/pub/networks/pajek/ ). A number of additional databases are coproduced.
Similar programs are available for Full Text and Co-Word Analysis. For reading abstracts directly into FullText, use the variant of this program called ISI2Abs.exe. This routine additionally writes each available abstract to a separate file (text1.txt, text2.txt, etc.), using sequential numbering.
Bibliographic coupling using the journal fields instead of documents can be done with BibJourn.EXE.
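As a rough illustration of what bibliographic coupling measures, the following Python sketch counts the cited references that two documents share and normalizes this count with the cosine. This is a sketch under stated assumptions, not the routine used by BibCoupl: the function name coupling is hypothetical, each reference is counted once per document, and the cosine is computed for binary (present/absent) reference vectors.

```python
from itertools import combinations
from math import sqrt


def coupling(cited_refs):
    """Return {(doc_a, doc_b): (shared, cosine)} for pairs sharing references.

    cited_refs maps a document id to an iterable of cited-reference strings.
    Assumption: binary vectors, so cosine = shared / sqrt(n_a * n_b).
    """
    ref_sets = {doc: set(refs) for doc, refs in cited_refs.items()}
    result = {}
    for a, b in combinations(sorted(ref_sets), 2):
        shared = len(ref_sets[a] & ref_sets[b])  # co-occurrence count
        if shared == 0:
            continue  # uncoupled pairs are left out
        cosine = shared / sqrt(len(ref_sets[a]) * len(ref_sets[b]))
        result[(a, b)] = (shared, cosine)
    return result
```

For coupling at the journal level instead of the document level, the same function could be applied with journal names as keys and the pooled cited references of each journal as values; whether this matches the output of BibJourn.EXE is not claimed here.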
Merging downloaded ISI data files (tagged format from marked list) into a single file “data.txt”
(Suggested by Hamid Jamali, July 2, 2009.)
To use the software with ISI data, you need to download the data from the marked list of ISI in tagged format. ISI allows you to download up to 500 records at a time. If you want to download more than 500 records, you need to download them in several batches of 500 and then combine them into a single file. Here is a simple method for doing that.
1. Put all the files in a single folder and name them in the order in which their contents should be joined, for example 01.txt, 02.txt, 03.txt, and so on. The content of 02 will then be appended to the end of 01, the content of 03 to the end of 01 and 02 combined, and so on.
2. Open each file with Notepad. Delete "EF" (which marks the end of the ISI data file) from the end of every file except the last one, and delete "FN ISI Export Format VR 1.0" (which marks the beginning of the ISI data file) from the beginning of every file except the first one. Take care not to leave extra paragraph marks (Enter) in the files.
3. In Windows, choose Run from the Start menu, type cmd, and run it. Then use CD (the change directory command) to go to the folder where you have saved the data, and run “copy *.txt data.txt”. This command reads the content of all txt files in the folder and copies them into a single file named data.txt.
Be aware that at the beginning of each file ISI inserts two invisible characters. The first lines of each download file should therefore not be removed.
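The manual steps above can also be sketched in Python. This is a minimal sketch, not part of the program; the function name merge_tagged_exports is hypothetical. It assumes that the header consists of leading lines beginning with “FN ” or “VR ”, that “EF” stands alone on the last non-empty line, and that the invisible characters at the start of a file are a byte-order mark. The beginning of the first file is left untouched.

```python
def merge_tagged_exports(texts):
    """Join the contents of several tagged export files into one text.

    texts: file contents as strings, in the order they should be joined.
    Assumptions: header lines start with 'FN ' or 'VR '; the end marker
    is a line containing only 'EF'; a leading '\ufeff' is a byte-order mark.
    """
    merged = []
    last = len(texts) - 1
    for i, text in enumerate(texts):
        lines = text.splitlines()
        if i > 0:
            # All files except the first: drop the invisible mark and the header.
            if lines and lines[0].startswith("\ufeff"):
                lines[0] = lines[0].lstrip("\ufeff")
            while lines and lines[0].startswith(("FN ", "VR ")):
                lines.pop(0)
        if i < last:
            # All files except the last: drop blank lines after EF, then EF itself.
            while lines and not lines[-1].strip():
                lines.pop()
            if lines and lines[-1].strip() == "EF":
                lines.pop()
        merged.extend(lines)
    return "\n".join(merged) + "\n"
```

The result can be written to “data.txt”. How the downloaded files should be opened (in particular, with which encoding) depends on the download and is left to the user; reading them so that the first characters of the first file are preserved is in line with the note above.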