Hands-on class: practicing science & technology indicators
During this class, we explore a science-based technology in terms of science and technology indicators. The example below is “renewable energy”, and we focus on the Netherlands. However, if you are more interested in another technology or another country, please feel free to explore the issues with different search terms. The exploration is done in two stages: first exploratively, and then more systematically.
1. Exploration of patent databases
We have already encountered the use of patent statistics in one of the first sessions of this class (Sahal’s (1981) Pythagorean Perspective). Patent statistics are often based on the database of the US Patent and Trademark Office (USPTO) because this database is believed to provide a window on the rest of the world: most companies will also patent important inventions in the USA (Jaffe & Trajtenberg, 2002). However, in parallel to the USPTO database, there are the database of the European Patent Office (EPO) and the national patent databases (Sheu et al., 2006), among them that of the Nederlands Octrooicentrum. Additionally, there is an international database at the World Intellectual Property Organization (WIPO).
Patent databases are official registrations and thus the sites are freely accessible. However, they are not all equally easy to use for research purposes.
Let’s first turn to the USPTO database at http://patft.uspto.gov/netahtml/PTO/search-adv.htm. (This database can also be accessed at http://www.google.com/patents.) Click in the left column on Patents > Search > Advanced Search. Search with the following string: ttl/"renewable energy" (title includes “renewable energy”). If the search is entered correctly, it should return about 21 records. Study some of the records. Try breaking the search down into more specific components (e.g., ttl/"geothermal energy") and compare the results.
Extend your search to find inventors and/or assignees specifically located in the Netherlands using the corresponding search strings. Do not be disappointed by zero hits: the database covers only inventions patented in the USA. Try a few other countries or, in the case of the USA, use US states as address fields. Try using different search criteria and terms. The USPTO itself provides statistics by country and by (US) patent class at http://www.uspto.gov/web/offices/ac/ido/oeip/taf/reports.htm .
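The search strings used above can be composed systematically. The following is a minimal sketch in Python, not part of the USPTO site. The helper names (field_query, combine) are hypothetical, and the field codes icn (inventor country), acn (assignee country), and is (inventor state) are assumptions that should be checked against the field-code list on the Advanced Search page.

```python
def field_query(field: str, term: str) -> str:
    """Return one field-restricted term, e.g. ttl/"renewable energy"."""
    # Multi-word phrases are quoted so that they are searched as a phrase.
    value = f'"{term}"' if " " in term else term
    return f"{field}/{value}"


def combine(*parts: str, operator: str = "AND") -> str:
    """Join several field-restricted terms with a Boolean operator."""
    return f" {operator} ".join(parts)


title = field_query("ttl", "renewable energy")

# Field codes other than ttl are assumptions; verify them on the search page.
queries = {
    "title only": title,
    "inventor in NL": combine(title, field_query("icn", "NL")),
    "assignee in NL": combine(title, field_query("acn", "NL")),
    "inventor in California": combine(title, field_query("is", "CA")),
}

for label, query in queries.items():
    print(f"{label}: {query}")
```

Each printed string can be pasted into the Advanced Search box. Keeping the queries in one table makes it easier to record the number of hits per query when comparing databases.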
Let’s repeat the exercise at the European Patent Office and World Intellectual Property Organization databases.
The EPO database can be found at http://ep.espacenet.com/ . Always use the advanced options for bibliometric searching. This time we can find approximately 250 records with “renewable energy” in the title (using the search option “Worldwide”). Can you explain the difference? Would you know a way to refine your searches at the EPO? (I was not able to find it.) Explore also the option of “Classification search”.
The WIPO database can be found at http://www.wipo.int . On the search page for patents you can find the “Field codes” (that is, the searchable terms) on the upper right side. Repeat the searches which you did above for the USPTO database and compare the results.
What are your conclusions from comparing these three databases? Write one paragraph about this as part of the paper to be handed in at the end of the class (see bottom). Consider also the advantages and disadvantages of using http://www.freepatentsonline.com/search.html as an alternative. How many patents can you find for ttl/"renewable energy" using this database? Explain the difference.
2. Exploration of the Science Citation Index
Using an IP address at the University of Amsterdam, you have access to the Science Citation Index via the so-called Web-of-Science. The database can be found in the digital library of the university (http://www.uba.uva.nl ) or directly accessed at http://apps.isiknowledge.com/WOS_GeneralSearch_input.do?product=WOS&search_mode=GeneralSearch&SID=P1AoBnfoMDhGn9ggA1L&preferencesSaved=&highlighted_tab=WOS .
Type “renewable energy” in the Search box and evaluate your findings.
Let us now be more precise. Select the “Web of Science” as product, use “advanced search” and refine your search so that you have more control over the selection. Cross search uses an index which is generated automatically. Can we do better? For example, if one clicks among the field tag information next to “SO= Publication Name”, one obtains access to journal titles and can delimit the search domain as follows:
After composing the search, click on OK and continue by combining this set with the papers from the Netherlands. You should find 25 or more papers. Browse through these papers. Use the various analytical options provided by the database at the right side of the window for browsing (“Sort by”, “Analyze Results”, “Citation Reports”, etc.). In a new window, briefly repeat the exercise while again expanding the search criteria to include specific types of renewable energy. What happens? Keep both windows open, as we will come back to them shortly.
Try to analyze your data. Does this lead to follow-up questions? Would you be able to raise a research question about our topic of study (“renewable energy”)? How would you proceed systematically to research this area? What would be relevant dimensions and why? Can you write a one-page research proposal about this subject for the next time in class?
In this section we examine ways to analyze the data we have collected in terms of systems of relations. Why do we need relations? Authors may be related to different titles and titles to different authors. Thus, networks of relations can be spanned. A common measure of such relations is the extent to which papers cite the same previous papers. This is called bibliographic coupling. Conversely, co-citation is the configuration in which papers are cited together by the same later papers, rather than citing the same earlier ones.
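The difference between the two measures can be illustrated with a few lines of Python. This is a minimal sketch on hypothetical toy data (three citing papers A, B, C and three cited papers X, Y, Z); the function names are chosen for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of papers it cites.
cites = {
    "A": {"X", "Y", "Z"},
    "B": {"Y", "Z"},
    "C": {"X"},
}


def bibliographic_coupling(cites):
    """Count the shared references for each pair of citing papers."""
    strength = {}
    for p, q in combinations(sorted(cites), 2):
        shared = len(cites[p] & cites[q])
        if shared:
            strength[(p, q)] = shared
    return strength


def co_citation(cites):
    """Count in how many citing papers each pair of cited papers co-occurs."""
    counts = Counter()
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return dict(counts)


coupling = bibliographic_coupling(cites)
cocited = co_citation(cites)
print("bibliographic coupling:", coupling)
print("co-citation:", cocited)
```

Bibliographic coupling relates the citing papers (A and B share two references), whereas co-citation relates the cited papers (Y and Z are cited together twice).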
3.1. at the level of articles
At the bottom right side of the screen, you will have noted the option “Output Records”. Here you can save records (at a maximum of 500 at a time) for further processing. Two freeware programs are available at http://www.leydesdorff.net/software/isi/index.htm which allow the user to organize these files for relational database management. The simpler of the two programs is ISI.EXE, which operates on your saved records and organizes them so that they can be read into MS Access. (Each table can also be read into Excel or SPSS, but these programs do not allow the user to make relations between the tables.) BibCouple.exe allows you to generate a file called “cosine.dat” which will enable you to visualize this network directly. (The file matrix.dbf becomes available for statistical analysis using SPSS.)
Cosine.dat can be used in Pajek, a common freeware network visualization program. Download it at http://vlado.fmf.uni-lj.si/pub/networks/pajek/ and install it at C:\pajek (or the desktop). Run your set with BibCouple.exe, open Pajek, and read the file “cosine.dat” into the program (by using File > Network > Read). Go to “Draw” and draw the figure. Explore the options, for example “Layout” > Energy > Kamada-Kawai.
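To see what such a Pajek input file contains, the following minimal sketch computes cosine values between papers on the basis of their reference lists and writes them out in Pajek's network format (a *Vertices block followed by an *Edges block). It is an illustration under assumptions, not the implementation of BibCouple.exe: the toy data, the function names, and the treatment of reference lists as binary sets are all hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy data: each paper maps to the set of references it cites.
cites = {
    "A": {"X", "Y", "Z"},
    "B": {"Y", "Z"},
    "C": {"X"},
}


def cosine(refs_a, refs_b):
    """Cosine between two binary reference vectors, given as sets."""
    if not refs_a or not refs_b:
        return 0.0
    return len(refs_a & refs_b) / math.sqrt(len(refs_a) * len(refs_b))


def to_pajek(cites, threshold=0.0):
    """Return the text of a Pajek network with cosine values as edge weights."""
    labels = sorted(cites)
    index = {label: i for i, label in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[label]} "{label}"' for label in labels]
    lines.append("*Edges")
    for a, b in combinations(labels, 2):
        value = cosine(cites[a], cites[b])
        if value > threshold:  # pairs without shared references are left out
            lines.append(f"{index[a]} {index[b]} {value:.4f}")
    return "\n".join(lines) + "\n"


net = to_pajek(cites)
print(net)
```

The resulting text can be saved under any file name and read into Pajek with File > Network > Read. Raising the threshold argument removes weak links and often makes the picture easier to read.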
3.2. at the level of journals
At http://www.leydesdorff.net/jcr07/cited you can find the files with similar cosine values for all journals, ready for use in Pajek. Find the file for the journal with the title Renewable Energy, save the file, and read it into Pajek. Draw the picture; explore the options. Use Kamada-Kawai (Layout --> Energy --> Kamada-Kawai) to arrange the nodes more systematically. Also, try Options --> Size --> of Vertices defined on input file. The vertical size of the nodes displays each node’s relative impact among this set of journals. (The horizontal size corrects for the “within journal self-citations.”) Embellish your picture using the various color and size options.
In another research project we created the following picture for the patent portfolio of China in 2005.
This picture is based on the International Classification Codes retrieved at the WIPO database using 1128 patents. The file contains 83 classification codes of which 65 are related. The pattern is shown in this picture. The nodes are sized in accordance with the logarithm of the number of patents in the corresponding category.
Try to generate this picture using the input file which is available at http://www.leydesdorff.net/wipo/china.txt . This is an input file for Pajek.
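The counts behind such a map can be sketched as follows. This is a minimal illustration on hypothetical data (four patents with invented sets of classification codes), not the procedure used to produce china.txt. The base of the logarithm and the offset of 1, which keeps codes with a single patent visible, are assumptions.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data: one set of classification codes per patent.
patents = [
    {"H01L", "G06F"},
    {"H01L", "H04L"},
    {"G06F", "H04L", "H01L"},
    {"A61K"},
]

# Number of patents per classification code.
code_counts = Counter(code for codes in patents for code in codes)

# Co-classification: how often two codes are assigned to the same patent.
links = Counter()
for codes in patents:
    for pair in combinations(sorted(codes), 2):
        links[pair] += 1

# Codes with at least one link are "related"; the others remain isolated.
related = {code for pair in links for code in pair}
isolated = set(code_counts) - related

# Node size: logarithm of the patent count (base 10 and "+ 1" are assumptions).
sizes = {code: math.log10(n) + 1 for code, n in code_counts.items()}

print("links:", dict(links))
print("related:", sorted(related), "isolated:", sorted(isolated))
print("sizes:", {code: round(size, 2) for code, size in sizes.items()})
```

The distinction between related and isolated codes corresponds to the distinction made above between the codes in the file and those that are related.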
3.4. co-word analysis
The production of such input files for Pajek requires a number of in-between steps. At http://www.leydesdorff.net/uspto/uspto.exe a program can be found which allows you to sort a number of saved patents of the USPTO database into files. The patents have to be named p1.htm, p2.htm, p3.htm, etc.; the program prompts for the number of patents. Once the patents are processed, you can open the table with the titles and use this table, for example, for co-word analysis using a program at http://www.leydesdorff.net/software/ti/index.htm . A co-occurrence map of words may provide us with information about the semantic organization of a text, a document set, or even a discourse (Leydesdorff & Hellsten, 2005).
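The principle of co-word analysis can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of ti.exe: the titles, the stopword list, and the choice to count each word pair at most once per title are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# A small, hypothetical stopword list; real analyses use longer lists.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

# Hypothetical titles standing in for the table of saved patent titles.
titles = [
    "Renewable energy storage system",
    "System for storage of wind energy",
    "Wind turbine control",
]


def words(title):
    """Return the set of lower-cased content words in a title."""
    tokens = re.findall(r"[a-z]+", title.lower())
    return {token for token in tokens if token not in STOPWORDS}


# Count, for each pair of words, the number of titles in which both occur.
cooccurrence = Counter()
for title in titles:
    for pair in combinations(sorted(words(title)), 2):
        cooccurrence[pair] += 1

for pair, count in cooccurrence.most_common(5):
    print(pair, count)
```

The pairs and their counts can serve as weighted links between words, in the same way as the cosine values between papers in section 3.1.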
One can use this program (ti.exe) also on the files which were saved from the Web-of-Science. Alternatively, one may have a set of documents (e.g., from other research) and use the corresponding facility for analyzing full texts (fulltext.exe) for the exploration of the semantic structures in the set.
Please, hand in for next week:
A.) One paragraph about your conclusions from comparing the three patent databases and
B.) A one-page research question.
Jaffe, A. B., & Trajtenberg, M. (2002). Patents, Citations, and Innovations: A Window on the Knowledge Economy. Cambridge, MA/London: MIT Press.
Leydesdorff, L., & Hellsten, I. (2005). Metaphors and Diaphors in Science Communication: Mapping the Case of ‘Stem-Cell Research’. Science Communication, 27(1), 64-99.
Sahal, D. (1981). Alternative Conceptions of Technology. Research Policy, 10, 2-24.
Sheu, M., Veefkind, V., Verbandt, Y., Galan, E. M., Absalom, R., & Förster, W. (2006). Mapping nanotechnology patents: The EPO approach. World Patent Information, 28, 204-211.