Patents and patent citations
Patents and patent citations have been used by many authors to shed light on the innovative processes and products resulting from years of research and development within a firm or institutional setting. Patents and patent citations contain a wealth of data that can help a researcher analyse the various inputs and outputs of the firms and institutions to which the patents were granted. But first, in order to recognise what data is important to a researcher, we must look at what exactly a patent is.
Patents are, in very basic terms, the right to appropriate returns from research (Reitzig 2004). They, in effect, exclude other firms from practicing or producing the same processes and products. A patent delineates a piece of knowledge by placing the knowledge set out in the claims and descriptions of the patent document into a legal realm where it is protected by law against infringement. In order for a patent to be granted, the knowledge contained within the claim must be novel, inventive, industrially applicable, and useful. The United States Patent and Trademark Office (USPTO) gives this definition of what a patent is:
The right conferred by the patent grant is, in the language of the statute and of the grant itself, “the right to exclude others from making, using, offering for sale, or selling” the invention in the United States or “importing” the invention into the United States. What is granted is not the right to make, use, offer for sale, sell or import, but the right to exclude others from making, using, offering for sale, selling or importing the invention. Once a patent is issued, the patentee must enforce the patent without aid of the USPTO. 
In addition to national patent offices, Europe has a European Patent Office (EPO). The World Intellectual Property Organization (WIPO) can issue so-called PCT patents. PCT stands for Patent Cooperation Treaty. To harmonise patent processes across the world, the OECD states that a patent is a member of a triadic patent family if and only if it is filed at the European Patent Office (EPO) and the Japanese Patent Office (JPO), and is granted by the US Patent & Trademark Office (USPTO) (Eurostat, 2006).
Patents contain vast amounts of technical data, including information on the assignee and the assignee’s country amongst many other variables. The data is supplied on an entirely voluntary basis, which makes patents important if only for the information contained within them (Hall 2000). When considering the usefulness of patents in an analysis, it is important to note that the sheer number of patents is of limited importance because the value of individual patents varies widely. A simple patent count may be used to gauge a firm’s R&D spending during a period, but numerous studies have shown that simple patent counts are not good indicators of much more than that (Trajtenberg 1990). Patents have been used to illustrate the value of a technology, but with limited success because of the variance in the economic importance and value derived from the patents themselves (Trajtenberg 1990). Valuable data may be gathered from the patent itself: not only from the information pertaining to the art, but also from its references to other patents, which provide a wider sense of the state of the art in a specific technology and of the innovation within the specific field (Archibugi and Pianta 1996). In a study of the perceived value of patents, Harhoff et al. (Harhoff, Narin et al. 1999) found that the more often a patent was cited, the greater its economic worth, which leads us to discuss patent citations.
Patent citations work in much the same way as citations in academic papers, except that they are not based on a purely voluntary scheme (as with academic papers, where you only cite authors when you use some of their ideas etc.): patent citations are added not only by the applicants but also by the examiners of the patent application. Patent citations are determined by the examiner who, with the help of the data supplied by the applicant and their attorney, determines which citations are relevant (Leydesdorff 2006a). With these citations one can map, just as we did with the author and journal citations, the progress, in a sense, of the knowledge contained within the patent document. Trajtenberg (1990) argued that the number of citations of an individual patent was important and included a quote from the Office of Technology Assessment and Forecast in 1976 to demonstrate this.
…During the examination process, the examiner searches the pertinent portion of the “classified” patent file. His purpose is to identify any prior disclosures of technology …which might anticipate the claimed invention and preclude the issuance of a patent; which might be similar to the claimed invention and limit the scope of the patent protection….;or which, generally, reveal the state of the art of the technology to which the invention is directed…If such documents are found they are made known to the inventor and are “cited” in any patent which matures from the application…Thus, the number of times a patent document is cited may be a measure of its technological significance.
The number of citations a patent has can also be seen to be linked to the market value of the company owning the patent and the value of the technology (Hall, Jaffe et al. 2005). What we see is that it’s not just which patents cite yours, but also how many, that may determine the eventual value of your patent and, accordingly, your product.
Of course, it’s not only citations within patents that can help an analysis; various other data contained within the patent documents also shed light on the subject you are investigating.
Almost all nations provide online access to their national patent databases. The European Patent Office provides an advanced search engine at http://ep.espacenet.com/advancedSearch?locale=en_EP which allows you to search worldwide. The World Intellectual Property Organization (WIPO) provides the so-called PCT patents online at http://www.wipo.int/pctdb/en/ . Only the USPTO database also contains the citation information. Note that the number of citations of a patent can increase day by day. Thus, it is important to note the date on which you access the site.
Go to the USPTO database online at www.uspto.gov and in the left column, click on “patents”, then on “search patents”. On the left hand side of the screen, click on “advanced search” and it will take you to a basic search screen. The “query” box is where you would input various searches. Remember though that it is not a simple word search such as with Google, but the USPTO uses field codes to help narrow your search. The explanations for the various field codes can be found below the query box and if you click on any of them it will give you a more detailed description of what is involved.
Let’s do a basic search:
In the query box type “ttl/computer”. This will provide results for all patents that have the word “computer” in the title. There should be more than 25,000 search results. The first result is the newest patent granted with the search word in the title. Now change the search to “computer interface” as the patent title and see what you get. You need to add quotation marks around a phrase.
Let’s say we’re interested in touch activated computer interfaces. If we modify the search to ttl/“touch activated computer interface” we get 0 results. But if we now add another operator term, spec, to the search terms as follows,
ttl/"computer interface" and spec/touch
we get 36+ results. Adding different search terms allows us to delve deeper into each patent. The “spec” term tells the search to look for the word “touch” in the description and specifications of the patent, while the phrase “computer interface” must still appear in the title. If we were to use the word “or” instead of “and” we would get over 117 000 results. This is because “and” returns only the patents that satisfy both conditions, whereas “or” returns every patent that satisfies at least one of them.
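The difference between “and” and “or” can be illustrated with sets. The following is a minimal Python sketch; the patent numbers are made up for illustration and are not real search results.

```python
# Hypothetical patent numbers; these are illustrative, not real USPTO results.
title_hits = {101, 102, 103}       # patents matching ttl/"computer interface"
spec_hits = {103, 104, 105, 106}   # patents matching spec/touch

# "and" keeps only the patents found by both conditions (set intersection).
both = title_hits & spec_hits

# "or" keeps the patents found by either condition (set union).
either = title_hits | spec_hits

print(sorted(both))
print(sorted(either))
```

Because every patent that satisfies both conditions also satisfies at least one of them, an “or” search can never return fewer results than the corresponding “and” search.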
A patent can be broken down into many sections:
These sections relate directly to the knowledge content within the patents (the “what” part), and other sections relate more to the “who”, “where”, “when” of the patent, such as which company the patent is granted to (AN), in which country the patentee is based (ACN), what the inventor’s name is (IN) and so on. Refer to the help section as described earlier for more examples.
In the help section click on “How to use the advanced search page” and you will see some examples of the nested quick expressions or logic operators and how they work.
Some search logic operators include:
Have a look through the “help” section on the advanced search page, and click on “tips on field searching” to familiarise yourself with some of the search language involved, and how to correctly use the nested quick expressions.
For practice, search for patents issued between January 2000 and September 2006 with the title containing the word LED but not related to flashlights that use LEDs. You should get 885 patents. Remember what your content search terms as well as your operator search terms are. Wild card operators in the USPTO database are signified by a $.
Isd/200001$->200609$ and ttl/(LED andnot flashlight) andnot spec/flashlight andnot aclm/flashlight
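A query like this can also be assembled programmatically, which helps when the same exclusion has to be repeated across several fields. The following Python sketch is only an illustration: the function name build_query is hypothetical and is not part of the USPTO site or of the download programs described later. It simply reproduces the query string shown above.

```python
def build_query(date_range, title_term, excluded_term, excluded_fields):
    """Compose a field-code query that excludes one term from several fields.

    build_query is a hypothetical helper written for this example only.
    """
    # Start with the issue-date range and the title condition.
    parts = [
        f"Isd/{date_range}",
        f"and ttl/({title_term} andnot {excluded_term})",
    ]
    # Exclude the unwanted term from each additional field code.
    for code in excluded_fields:
        parts.append(f"andnot {code}/{excluded_term}")
    return " ".join(parts)


query = build_query("200001$->200609$", "LED", "flashlight", ["spec", "aclm"])
print(query)
```

The resulting string can be pasted into the query box in the same way as a hand-typed query.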
Now that you have some of the basics down, and you have narrowed the list of patents that you think are relevant to the research and analyses you want to perform, you can download the relevant patents to your computer. The greatest benefit of having an automated download is that you do not have to click on each patent, save it as html, and remember in what order you saved them. Of course, if you need only ten patents you could do that, but if you want to download 900 patents then you will regret not using an automated program.
To do this, first define your search in the advanced search page. Proceed only when you are happy with your search terms and the expected results. Sometimes you may have to cast your search in wider terms to be sure you have collected everything that is relevant, because it is easier to delete what you don’t need in your analysis than to repeat the search process to find all that you need.
Once your results have been returned, you will see a total number of patents returned.
My search terms were:
ttl/((blu-ray or bluray) andnot (hd-dvd or hddvd)) or abst/((blu-ray or bluray) andnot (hd-dvd or hddvd)) or spec/((blu-ray or bluray) andnot (hd-dvd or hddvd))
It returned 94 patents. (Of course, you can use other search terms.) Click on the button “next 50 hits” and then click on the 4th or 5th patent down the list. Now that you have the patent in front of you, have a look through and read the text, just to see what a patent looks like. We have the title, abstract, inventor name, assignee name, US class, references, claims, description and so on, all of which can be used as search terms as mentioned earlier.
Copy the URL of the patent into a text file and read it. Here is the URL of one of the patents I asked you to search for earlier regarding LEDs.
If you examine it, you can see how the USPTO patent results come about. You can see the operator terms, the search words, the patent number and, if you look at the highlighted sections, which result on which page your patent was. In this case, it was result number 64 on page 2. These two terms, along with your search terms, are what direct the automated download program. The download program, uspto1.exe, uses Visual Basic code to send requests to the USPTO database using the search terms above. Because it knows how many results are displayed on one page, it also tells the database in essence to “turn the page” when R=50 or any multiple of 50. That way, the program clicks the “next 50 hits” button so you don’t have to.
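The page-turning logic described above can be sketched in a few lines of Python. This is not the actual code of uspto1.exe, which is written in Visual Basic and is not reproduced here; the function names are hypothetical, and the sketch assumes only that the results list shows 50 hits per page.

```python
RESULTS_PER_PAGE = 50  # assumption taken from the "next 50 hits" button


def page_of(result_number):
    """Return the results page (counting from 1) that shows this result."""
    return (result_number - 1) // RESULTS_PER_PAGE + 1


def is_last_on_page(result_number):
    """True when R is 50 or a multiple of 50, i.e. the page must be turned next."""
    return result_number % RESULTS_PER_PAGE == 0


print(page_of(64))           # 2: result number 64 sits on page 2
print(is_last_on_page(64))   # False
print(is_last_on_page(100))  # True
```

Result number 64 falls on page 2, which agrees with the example URL discussed above.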
Downloading the Patents
Open a new folder and place the uspto1.exe file into it. Run uspto1.exe and paste the same URL that you looked at earlier into the indicated space. (The program works only with an address like the above one, that is, provided for patent numbers larger than the first fifty. The first fifty contain a different format.) Enter the number of patents and click run.
The program will now start the download from the first patent to the indicated number. Once you have your patents in the designated folder, they appear as html files with the name p1.htm, p2.htm, p3.htm etc. Additionally, a file searches.txt is generated (or appended) which registers the search strings sent to the database. (The routines access the Internet using the Microsoft Internet protocol in the file MSINet.OCX. If this file was not yet installed when installing another program, an error message may be generated by Windows since the file is not installed with the original installation of Windows. This error can be solved by following the instructions at http://www.leydesdorff.net/software/patentmaps/ocx.htm.)
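After a large download it can be worth checking that no file is missing from the sequence p1.htm, p2.htm, and so on. The following Python sketch shows one way to do this; the function names are hypothetical and the folder listing is made up for the example.

```python
def expected_names(count):
    """File names the download should produce for `count` patents."""
    return [f"p{i}.htm" for i in range(1, count + 1)]


def missing_downloads(existing_names, count):
    """Return the expected file names that are absent from the listing."""
    present = set(existing_names)
    return [name for name in expected_names(count) if name not in present]


# Hypothetical folder listing in which the third download failed.
listing = ["p1.htm", "p2.htm", "p4.htm", "searches.txt"]
print(missing_downloads(listing, 4))
```

In practice the listing would come from the download folder itself, for example from os.listdir applied to that folder.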
Analysis of Patents
To analyse the patents we open the program uspto2.exe in the same folder as your downloaded patents. When prompted enter how many patents are to be analysed.
This program will search the html files for key words, such as assignee and title, and convert them into dBase files, which are accessible in both Excel and Access. They will be saved in the same folder as your patents and will look like this:
We will be using Access from this point on, so open Access on your computer. Open a new file and click on “blank database”. It will ask you to save it. Save it under whatever name you choose. The next screen will give you a smaller window with tables, queries, forms etc in the left column. Go to file, then get external data then import. Navigate to the files that uspto2.exe produced and double click on each one. Make sure to click only on the .DBF files, not the .DBT files. Once these have been imported, you will see them under the “tables” section in the smaller window. If you click on any of these, it will bring up the table related to them.
Look at all of them and see what data each table holds. For example, clicking on the TI file brings up the patent number, the title, year, date, abstract, application number etc. (One can use the titles for drawing a semantic map using ti.exe.) If you click on the USCLASS table, it brings up the technological class in which the patent was granted. (With a bit of creativity in database management, you can also export the classes so that you can draw a cosine-based map among them.) The numbers alongside signify whether it is the original class (1) or a cross-reference class (2-). If you right-click on any “1”, then click “filter by selection”, it only shows records with 1 in that field, thereby showing the original classes of all the patents.
As there are different tables, each containing different fields of interest, we need to link them to make sense of them. To do this, go to the main window (F11) and click on the “relationships” button on the main menu. A smaller window should open that asks you which tables you want to add together. Highlight each table you want, then click “add”. In our case, let’s say we want to see the assignee name, the patent number, what primary class the patent is in, what country the assignee calls home and what year the patent was issued. So highlight TI, ASS, USCLASS and INV, then click “add”. Each table now pops up in the working window. You can see that each has a scroll-down list of the fields it contains.
Since we want to show what class, country, assignee and year our patents belong to, we need to link the tables using a unique identifier. Our unique identifier is the order in which the patent was downloaded, as it is the same for all the tables. So find the “nr” in each table and click and drag it to the “nr” in a different table. It will ask you to create a relationship. Click yes. An example is as shown:
Figure 1. Relationship window in Access
Once you’ve done that, close the relationship window and save it when prompted.
Now click on Queries in the main window, and create query in design window. It will now show the same window as for the relationship one. Highlight the tables that have the relevant data then click add. The tables will appear in a grey window and then click and drag the relevant sections from each table that you want to appear.
Figure 2. Query window with fields of interest.
Once you have that, click on the red exclamation mark at the top of the screen to display all your results.
From the results that come up, you can see that some patents are shown more than once, but this is due to the patent belonging to many different US classes, so right-click on the 1 in the CLASSNR field and exclude the other results. You may still see some duplicate records but these are due to the way Access treats each record. To change this, go to the query work window by clicking on the view icon in the main menu (it looks like a pencil and protractor). Then right click on the grey area, then properties and change unique values to “yes”. Then click the red exclamation mark again.
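For readers who prefer to see the query as code, the same linking, filtering and de-duplication can be sketched with Python’s built-in sqlite3 module instead of Access. The table names TI, ASS and USCLASS and the fields nr, PATNR and CLASSNR follow the text above; placing PATNR in the TI table, the other column names and all data values are assumptions made for illustration and may differ from what uspto2.exe actually produces.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Column names other than nr, PATNR and CLASSNR are hypothetical.
    CREATE TABLE TI (nr INTEGER, PATNR TEXT, PATYEAR INTEGER);
    CREATE TABLE ASS (nr INTEGER, ASSIGNEE TEXT, COUNTRY TEXT);
    CREATE TABLE USCLASS (nr INTEGER, CLASSNR INTEGER, CLASSCODE TEXT);

    INSERT INTO TI VALUES (1, 'P-0001', 2005), (2, 'P-0002', 2006);
    INSERT INTO ASS VALUES (1, 'Example Corp', 'US'), (2, 'Sample BV', 'NL');
    INSERT INTO USCLASS VALUES (1, 1, '345'), (1, 2, '715'), (2, 1, '369');
""")

# Link the tables on nr, keep only the original class (CLASSNR = 1) and
# drop duplicate rows, which mirrors setting "unique values" to yes.
rows = con.execute("""
    SELECT DISTINCT TI.PATNR, ASS.ASSIGNEE, ASS.COUNTRY,
                    USCLASS.CLASSCODE, TI.PATYEAR
    FROM TI
    JOIN ASS ON ASS.nr = TI.nr
    JOIN USCLASS ON USCLASS.nr = TI.nr
    WHERE USCLASS.CLASSNR = 1
    ORDER BY TI.PATNR
""").fetchall()

for row in rows:
    print(row)
```

Each patent appears once in the output, with its original class only, even though the first patent also has a cross-reference class in the USCLASS table.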
Using Access and the tables that you imported from the results of the uspto2.exe, construct a query table for the patents that you would like to reference. Be sure to include the PATNR column as this contains all the unique patent numbers. Copy and paste this list of numbers into a text document called “list.txt” (remember how many patents you copied across). Place the list and the Patref0.exe program into the same folder, run the patref0.exe program and enter how many patents you would like processed. Patref0.exe generates a set of files p1.htm, p2.htm, etc. on your disk.
Patref1.exe can be used to extract from these files another list, called “list2.txt”, which can be used as input for Patref2.exe. Patref2.exe accesses the USPTO databases and downloads the patents citing the patents listed in list2.txt. This program acts in much the same way as clicking the “referenced by” button in each patent document shown in Figure 3, but it does so automatically for the whole set of citing patents. If one is interested only in the number of citations, one can find this information in the file citing.dbf which is generated by patent1.exe; it is then not necessary to download the citing patents in a next step.
Figure 3. Patent document and its patent references
When you do click the button it takes you to another page with either a “0 results” message, a list of patents that cite the one in question, such as with Figure 4, or to a single patent if there is only one citing patent. Patref1.exe will produce the text document called “list2.txt” that contains the URLs of all the cited patents.
Copy and paste list2.txt into another folder which contains patref2.exe. It might be a good idea to paste list2.txt into Excel beforehand to get the exact number of URLs contained in the list. Run patref2.exe and enter how many patents when prompted. Patref2.exe will then start downloading the citing patents into that folder in the same way as uspto1.exe did: p1.htm, p2.htm, etc.
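As an alternative to pasting the list into Excel, the number of entries can be counted with a short Python sketch. It assumes that list2.txt holds one URL per line, which is an assumption about the file layout; the function name and the example addresses are hypothetical.

```python
def count_entries(text):
    """Count the non-empty lines in the contents of a list file."""
    return sum(1 for line in text.splitlines() if line.strip())


# Made-up contents standing in for list2.txt (one URL per line is assumed).
sample = "http://example.invalid/a\nhttp://example.invalid/b\n\nhttp://example.invalid/c\n"
print(count_entries(sample))
```

For a real file, the text would be read first, for example with open("list2.txt").read(), and the resulting count entered when patref2.exe prompts for the number of patents.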
Figure 4. List of citing patents
Once you have the citing patents in the folder, run uspto2.exe on these patents and you will get the same dBase files (TI, ASS, INV, etc.) for the citing patents. These can be imported into Access and opened in the same way as before. It is a good idea to open another Access database and name it “citing”. It also helps to change the names of the TI, ASS, INV, etc. tables in each database (the original and the citing) to avoid confusing them, as you will be combining them in the next step.
Matching Original Patents with Citing Patents
Now that we have the original patents from our initial search on the USPTO database and the citing patents, we need to match them to each other to make the most of the data within. To do this, it may be wise to open a new Access database (you can combine the two but it becomes confusing if you do). Name this database as you choose. Import the .DBF files from each set of steps you followed above, so that in the tables window in Access you have ASS original.dbf, TI original.dbf etc., CITING.dbf, and ASS citing.dbf, TI citing.dbf etc.
Open a new relationship window and add the various files as you did before. Be sure to include the relevant tables. As there are now two sets of files with the unique identifier nr (1, 2, 3 etc.), we cannot link them this way, because the same number refers to different patents in the original and the citing set. To link them we use the CITING table, as it contains the patent numbers, which are unique within each set of original and citing patents. Figure 5 shows the correct links between the different patents.
Figure 5. Relationship window to link the original patents and the citing patents.
From Figure 5 one can see that the different tables produced by uspto2.exe are linked by NR in each table but these are then linked by the CITING table and this link is between the patent numbers of each group.
Once you have saved this relationship, if you open a new query, and add tables and information you need, the relationships will appear in your query as well.
With this relationship, you can examine certain patents in each class, find their assignees and nationality and compare them with the citing patents, which classes they belong to, which assignees and nationalities and so on.
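The linking of the two sets through the CITING table can also be sketched with Python’s built-in sqlite3 module. This is only an illustration of the relationships in Figure 5: the table names with the suffixes _ORIGINAL and _CITING stand for the renamed tables, the layout of the CITING table (one row per citation, with a cited and a citing patent number) is an assumption, and all column names other than nr and PATNR, as well as all data values, are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Renamed copies of the tables, one set per group of patents.
    CREATE TABLE TI_ORIGINAL (nr INTEGER, PATNR TEXT);
    CREATE TABLE ASS_ORIGINAL (nr INTEGER, ASSIGNEE TEXT);
    CREATE TABLE TI_CITING (nr INTEGER, PATNR TEXT);
    CREATE TABLE ASS_CITING (nr INTEGER, ASSIGNEE TEXT);
    -- Hypothetical layout of the CITING table: one row per citation.
    CREATE TABLE CITING (CITED_PATNR TEXT, CITING_PATNR TEXT);

    INSERT INTO TI_ORIGINAL VALUES (1, 'A-1'), (2, 'A-2');
    INSERT INTO ASS_ORIGINAL VALUES (1, 'Example Corp'), (2, 'Sample BV');
    INSERT INTO TI_CITING VALUES (1, 'B-1'), (2, 'B-2');
    INSERT INTO ASS_CITING VALUES (1, 'Third Co'), (2, 'Fourth Ltd');
    INSERT INTO CITING VALUES ('A-1', 'B-1'), ('A-1', 'B-2'), ('A-2', 'B-2');
""")

# Within each set the tables are linked by nr; the two sets are linked
# only through the patent numbers held in the CITING table.
pairs = con.execute("""
    SELECT o.PATNR, ao.ASSIGNEE, c.PATNR, ac.ASSIGNEE
    FROM CITING AS cit
    JOIN TI_ORIGINAL AS o ON o.PATNR = cit.CITED_PATNR
    JOIN ASS_ORIGINAL AS ao ON ao.nr = o.nr
    JOIN TI_CITING AS c ON c.PATNR = cit.CITING_PATNR
    JOIN ASS_CITING AS ac ON ac.nr = c.nr
    ORDER BY o.PATNR, c.PATNR
""").fetchall()

for pair in pairs:
    print(pair)
```

Note that nr 1 refers to patent A-1 in the original set and to patent B-1 in the citing set, which is why nr cannot be used to link the two sets directly.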
Thomas Gurney & Loet Leydesdorff,
Amsterdam, June 2007
Archibugi, D. and M. Pianta (1996). “Measuring technological change through patents and innovation surveys.” Technovation 16(9): 451-468.
Hall, B. H., A. Jaffe, et al. (2005). “Market Value and Patent Citations.” RAND Journal of Economics 36(1): 16-38.
Hall, B. H., A. B. Jaffe, and M. Trajtenberg (2000). Market Value and Patent Citations: A First Look. Cambridge, Mass., USA: National Bureau of Economic Research.
Harhoff, D., F. Narin, et al. (1999). “Citation Frequency and the Value of Patented Inventions.” Review of Economics and Statistics 81(3): 511-515.
Leydesdorff, L. (2006). “Patent Classifications as Indicators of Cognitive Structures.” (in preparation).
Reitzig, M. (2004). “The private values of 'thickets' and 'fences': towards an updated picture of the use of patents across industries.” Economics of Innovation and New Technology 13(5): 457-476.
Trajtenberg, M. (1990). “A Penny for Your Quotes: Patent Citations and the Value of Innovations.” RAND Journal of Economics 21(1): 172-187.