Ph.D. workshops, Summer 2004: Data for replication


Among the data necessary to replicate Baker and Wurgler is the list of IPOs between 1968 and 1999 that they matched to Compustat. A relatively clean list in SAS format is available in the following path: /projects/nwu/pledesma. Do not circulate any of the files listed on this page outside Kellogg.

There are two versions of the file: bakerwurgler2.sas7bdat and ipolist.sas7bdat.

The first one (bakerwurgler2.sas7bdat) includes the following variables:

  • The matching CRSP PERMNO, the CUSIP, and the company name (COMNAM, from CRSP).
  • The IPO date.
  • st_date and end_date: starting and ending dates of the series in CRSP, from CRSP's DSFNAMES file.
  • The Compustat GVKEY.
  • st_assets and end_assets: starting and ending fiscal years of the total assets series in Compustat.
  • st_mkval and end_mkval: starting and ending fiscal years of the market capitalization data in Compustat (defined as data25*data199).
  • numday: the number of days elapsed between the IPO date and the inclusion in CRSP (numday=st_date-date).
  • numyrs: the number of years elapsed between the IPO year and the availability of market capitalization data in Compustat (numyrs=st_mkval-year of IPO).
  • yrsassets and yrsmkval: the maximum number of years for which there is assets data and market capitalization data, respectively.

These variables should be helpful in screening firms with missing data on assets.
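
As a minimal sketch of such a screen, the DATA step below drops firms that appear to lack Compustat coverage. It assumes that st_assets is missing when no total assets series exists, and the one-year cutoff on numyrs is purely illustrative; neither assumption is documented on this page. The libref path is the one used in the LIBNAME example on this page.

   libname ipos '/projects/nwu/pledesma/corpfin';

   data screened;
     set ipos.bakerwurgler2;
     /* Assumption: st_assets is missing when there is no assets series */
     if missing(st_assets) then delete;
     /* Hypothetical cutoff: market cap data starts more than one
        fiscal year after the IPO year */
     if numyrs > 1 then delete;
   run;

Adjust or remove either condition to match the sample definition you are replicating.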

The second file, ipolist.sas7bdat, includes only two series: GVKEY and the corresponding IPO date. This file includes repeated GVKEYs for CUSIPs that appeared twice in SDC with different dates (see below).
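
Because repeated GVKEYs can duplicate firm-years when the list is joined to Compustat, it may help to list them first. The query below is a sketch that relies only on the gvkey variable; the output table name dupgvkeys and the column n_dates are hypothetical.

   libname ipos '/projects/nwu/pledesma/corpfin';

   proc sql;
     /* One row per GVKEY that appears more than once in the list */
     create table dupgvkeys as
     select gvkey, count(*) as n_dates
     from ipos.ipolist
     group by gvkey
     having count(*) > 1;
   quit;

How to handle these GVKEYs (for example, which of the two dates to keep) is a choice left to the replication design.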

To access either file in your program, add a LIBNAME statement pointing to that directory. For example:

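   libname ipos '/projects/nwu/pledesma/corpfin';

   /* Assumes the COMP libref for Compustat is already assigned */
   proc sql;
     /* i.* keeps gvkey and the IPO date from the IPO list */
     create table iposample as
     select i.*, c.yeara, c.dnum,
            c.data6, c.data8   /* add other Compustat items as needed */
     from ipos.ipolist as i, comp.compann as c
     where i.gvkey = c.gvkey
           and c.yeara between 1968 and 1999
           and c.dnum not between 6000 and 6999;
   quit;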

Construction of the IPO lists

The lists of IPOs and their corresponding GVKEY were constructed based on two spreadsheets kindly provided by Malcolm Baker, which contain lists of CUSIP numbers and dates:

  • ripo.xls: Includes the IPO dates and 8-digit CUSIP numbers provided by Jay Ritter between 1968 and 1995.
  • ipo.xls: Based on SDC Platinum, includes IPO dates and 6-digit CUSIP numbers between 1970 and 1999.

To match the CUSIP numbers to Compustat GVKEYs, the CUSIPs in each list were matched to the historical CUSIP variable in CRSP (NCUSIP) from the daily stock names file to retrieve the CRSP PERMNO. The PERMNO was then matched to the historical CRSP PERMNO link to Compustat (NPERMNO) from the Merged CRSP-Compustat CSTLINK file. Any CUSIP numbers in the SDC list that also appeared in the Ritter list were discarded. Also discarded were duplicate GVKEY-IPO date combinations and IPOs in 1999. However, 31 GVKEYs still have two IPO dates.
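
The matching chain described above (CUSIP to NCUSIP to PERMNO to NPERMNO to GVKEY) can be sketched in a single PROC SQL step. This is an illustration, not the code actually used to build the lists. The input table work.sdclist with variables cusip6 and ipodate, the librefs CRSP and CCM, and the output name sdc_gvkey are all hypothetical; only NCUSIP, PERMNO, NPERMNO and GVKEY come from the description.

   proc sql;
     /* SELECT DISTINCT drops duplicate GVKEY-IPO date combinations */
     create table sdc_gvkey as
     select distinct l.gvkey, s.ipodate
     from work.sdclist as s,      /* hypothetical: 6-digit CUSIPs and IPO dates */
          crsp.dsfnames as n,     /* CRSP daily stock names file */
          ccm.cstlink as l        /* Merged CRSP-Compustat link file */
     where substr(n.ncusip, 1, 6) = s.cusip6
           and n.permno = l.npermno;
   quit;

For the Ritter list, which carries 8-digit CUSIPs, the comparison would use the full NCUSIP rather than its first six characters. The sketch ignores the date ranges over which a name or link record is valid, so it may return more matches than a date-aware merge.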

© 2001-2010 Kellogg School of Management, Northwestern University