Knowledge base
- DATA
STOCK
1. Why does the history for PowerShares QQQ Trust only extend to April 12, 2007?
3. What is the meaning behind an exchange code of 0?
7. Are there advantages to using CRSP’s PERMNO instead of CUSIP number?
8. How are Shares Outstanding data collected and maintained?
9. What are ADRs and how are the shares outstanding computed?
10. What are equity securities and are they all included in the CRSP database?
12. Why isn't AX (Archipelago Holdings Inc) covered in the CRSP Stock Database?
1. Why does the history for PowerShares QQQ Trust only extend to April 12, 2007?
PowerShares QQQ Trust is an exchange traded fund; there are some special considerations regarding its association with another ETF security, NASDAQ 100 Trust Series I.
- PowerShares QQQ Trust begins its history on April 12, 2007 and the NASDAQ 100 Trust Series ends its history the previous day, April 11, 2007.
- The final distribution recorded for the NASDAQ 100 Trust Series lists PowerShares QQQ as its acquirer.
- By CRSP’s methodology, the security-level identifier, PERMNO, cannot be associated with multiple company-level identifiers, PERMCO. Thus, PowerShares QQQ cannot maintain the same security level identifier as the NASDAQ 100 Trust Series due to the securities having different PERMCO identifiers.
- To merge the return histories of both securities, replace the initial null return value for PowerShares QQQ Trust with the delisting return of the NASDAQ 100 Trust Series I.
Example:
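As a minimal sketch of the merge described above, the following Python function joins two return series by filling the successor's initial null return with the predecessor's delisting return. The function name, the list-based representation, and all numeric values are hypothetical and are not taken from CRSP data or software.

```python
from typing import List, Optional


def splice_returns(
    old_returns: List[float],
    old_delisting_return: float,
    new_returns: List[Optional[float]],
) -> List[Optional[float]]:
    """Join two return series across a PERMNO change.

    The successor's first return is null because it has no prior price
    under its own PERMNO; the predecessor's delisting return fills it.
    """
    if not new_returns or new_returns[0] is not None:
        raise ValueError("expected the successor series to start with a null return")
    return list(old_returns) + [old_delisting_return] + list(new_returns[1:])


# Hypothetical values for illustration only; these are not CRSP data.
nasdaq_100_trust = [0.004, -0.002]  # returns through April 11, 2007
delisting_return = 0.001            # delisting return of the older security
powershares_qqq = [None, 0.003]     # returns from April 12, 2007 onward

merged = splice_returns(nasdaq_100_trust, delisting_return, powershares_qqq)
```

The sketch refuses to run when the successor series does not start with a null value, since that would suggest the two histories are not aligned the way the answer above assumes.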
2. Why do more NYSE-listed securities have missing closing bid & ask data during the period from 20001002 to 20010518?
In researching the bid and ask quotes, CRSP found a number of questionable data points. Registered market makers, usually off the primary listed exchange, periodically posted noncompetitive quotes that were captured as the closing bid and ask. These quotes had wide spreads, not typical of the day’s trading activity. In many cases, spreads were intentionally set as wide as possible with bids close to a penny and asks close to double the market price. Alternate quotes that more closely represented the trading activity were not immediately available, so a missing value is reported pending further review by CRSP. Roughly 5% of all bid and ask data points are missing. Over one-third of these occur between October 2000 and May 2001.
3. What is the meaning behind an exchange code of 0?
If an issue leaves an exchange that is covered by CRSP (NYSE, AMEX, or NASDAQ) and later returns, the gap is marked in the Name History Array with an Exchange Code of 0. During this time, event data is not tracked and time series data is filled in with missing values.
CRSP will resume coverage of the security if it has primary listing on NYSE, AMEX, or NASDAQ and otherwise fulfills our universe requirements for the stock database. New PERMNO or PERMCO may be assigned, depending on events that took place during the off-CRSP period.
EXAMPLE:
PERMNO 83404
Namedt    Enddt     Cusip     Ticker  Company Name      CLS  SH  Ex  SIC
19960430  19990616  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
19990617  20000611  71892810          PHYSIOMETRIX INC       11  0   3840
20000612  20010823  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
20010824  20050729  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
This security ceased trading on NASDAQ on June 16, 1999. It resumed trading on NASDAQ on June 12, 2000. The interval between these dates is represented by a name line with Exchange Code set to zero, and the Ticker field blank.
4. When a security is delisted and then begins trading again at a later date, does CRSP pick it up again or is it lost forever?
If an issue leaves an exchange that is covered by CRSP (NYSE, AMEX, or NASDAQ) and later returns, the gap is marked in the Name History Array with an Exchange Code of 0. During this time, event data is not tracked and time series data is filled in with missing values.
CRSP will resume coverage of the security if it has primary listing on NYSE, AMEX, or NASDAQ and otherwise fulfills our universe requirements for the stock database. New PERMNO or PERMCO may be assigned, depending on events that took place during the off-CRSP period.
EXAMPLE:
PERMNO 83404
Namedt    Enddt     Cusip     Ticker  Company Name      CLS  SH  Ex  SIC
19960430  19990616  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
19990617  20000611  71892810          PHYSIOMETRIX INC       11  0   3840
20000612  20010823  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
20010824  20050729  71892810  PHYX    PHYSIOMETRIX INC       11  3   3840
This security ceased trading on NASDAQ on June 16, 1999. It resumed trading on NASDAQ on June 12, 2000. The interval between these dates is represented by a name line with Exchange Code set to zero, and the Ticker field blank.
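One way to locate such off-exchange intervals programmatically is to scan a security's name history for rows whose Exchange Code is 0. The sketch below uses the PERMNO 83404 rows from the example; the dictionary layout and the field names (namedt, enddt, ticker, exchcd) are assumptions made for illustration, not a CRSP file format.

```python
from typing import Dict, List, Tuple

# Name history for PERMNO 83404, copied from the example above.
# The dict keys are hypothetical field names chosen for this sketch.
name_history: List[Dict[str, object]] = [
    {"namedt": "19960430", "enddt": "19990616", "ticker": "PHYX", "exchcd": 3},
    {"namedt": "19990617", "enddt": "20000611", "ticker": "", "exchcd": 0},
    {"namedt": "20000612", "enddt": "20010823", "ticker": "PHYX", "exchcd": 3},
    {"namedt": "20010824", "enddt": "20050729", "ticker": "PHYX", "exchcd": 3},
]


def off_exchange_gaps(rows: List[Dict[str, object]]) -> List[Tuple[str, str]]:
    """Return (start, end) date pairs for name lines with Exchange Code 0."""
    return [(str(r["namedt"]), str(r["enddt"])) for r in rows if r["exchcd"] == 0]


gaps = off_exchange_gaps(name_history)
```

Time series values that fall inside any returned interval can then be treated as missing by construction rather than as data errors.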
5. What is when-issued trading? How are issues trading when-issued handled in the CRSP stock database?
Securities trade on a when-issued basis when officially they have yet to be issued, but they trade as if they have been. These transactions are formally settled after the securities have been issued.
INCLUDED IN THE SUBSCRIBER VERSION OF THE STOCK DATABASE
"Reorganization" - All of the shares of an established security are trading when-issued due to some type of restructuring
EXCLUDED FROM THE SUBSCRIBER VERSION OF THE STOCK DATABASE
"Leading prices" - All of the shares of a security are trading when-issued at the start of its trading history
"Additional" - A security issues more shares that trade when-issued, during the same time that the established shares of the security continue to trade regular-way
"Ex-distribution" - A portion of a security's shares trade when-issued and without entitlement to a particular distribution, during the same time that the rest of its shares trade regular way, with the entitlement to the distribution
6. What do the characters in the CUSIP number represent? Why do some values in the CUSIP field begin with a letter? Can type of security or share class be determined using only CUSIP number?
The CUSIP number consists of nine characters. The first six characters uniquely identify the issuer and have been assigned to issuers in approximate alphabetical sequence. The seventh and eighth characters identify the issue. The ninth character is used as a check digit and is not stored in the CRSP US Stock Databases. The CRSP/Compustat Merged Database (CCM) displays all nine characters of CUSIP number for covered securities.
CINS (CUSIP International Numbering System) identifies securities issued outside of the US or Canada that trade internationally. ADRs are issued in the US, so they are not in this category. Values for CINS begin with a letter indicating country or region.
While the seventh and eighth characters of the CUSIP number are used to distinguish different issues of the same issuer, there is no absolute scheme for determining type of security or share class solely on their values.
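Because CRSP stores only the first eight characters, it can be useful to split a CUSIP into its parts and to recompute the ninth character when matching against nine-character sources. The sketch below assumes the widely published CUSIP check digit rule (letters mapped to 10 through 35, every second character doubled, digits of each value summed, result taken modulo 10); the function names are hypothetical.

```python
from typing import Dict, Optional

SPECIAL_VALUES = {"*": 36, "@": 37, "#": 38}


def cusip_check_digit(cusip8: str) -> str:
    """Compute the ninth (check) character from the first eight characters."""
    if len(cusip8) != 8:
        raise ValueError("expected exactly eight characters")
    total = 0
    for position, char in enumerate(cusip8.upper(), start=1):
        if char in "0123456789":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char in SPECIAL_VALUES:
            value = SPECIAL_VALUES[char]
        else:
            raise ValueError(f"invalid CUSIP character: {char!r}")
        if position % 2 == 0:  # every second character is doubled
            value *= 2
        total += value // 10 + value % 10
    return str((10 - total % 10) % 10)


def split_cusip(cusip: str) -> Dict[str, Optional[str]]:
    """Split an eight- or nine-character CUSIP into issuer, issue, and check."""
    if len(cusip) not in (8, 9):
        raise ValueError("expected eight or nine characters")
    return {
        "issuer": cusip[:6],
        "issue": cusip[6:8],
        "check": cusip[8] if len(cusip) == 9 else None,
    }


parts = split_cusip("71892810")          # eight-character value as stored by CRSP
ninth = cusip_check_digit("71892810")    # check character implied by the rule
```

As the answer above notes, the issue characters separate issues of one issuer but do not reliably encode security type or share class, so the sketch makes no attempt to interpret them.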
7. Are there advantages to using CRSP’s PERMNO instead of CUSIP number?
There are three situations that can cause missed links or mismatches when using CUSIPs from different sources:
1. CUSIPs can change over time for a security due to name changes or capital changes. If a database only contains the most recent CUSIP, or reassigns CUSIPs after trading stops, a backtesting universe identified by CUSIP will continue to drop links to the security data over time.
2. CUSIP allows a mechanism for third parties to assign an unofficial CUSIP to a security that is otherwise unassigned. These CUSIPs contain a 9 in the 4th and 5th and/or 7th digit. If different third parties select different dummy CUSIPs, the link between them can be missed or wrong. This is an issue only before 1968, before CUSIP existed, or in a few cases where foreign issues on US exchanges were assigned ISIN but never CUSIP.
3. CUSIP provides for the possibility of reusing a CUSIP, or equivalently, it may continue a CUSIP after a corporate event that could be considered significant enough to produce a new company or issue. There is only one known case, PepsiAmericas in 2000 and 2001, when this occurred.
CRSP’s name history for a security tracks changes in its CUSIP number, accessible by the unique identifier, PERMNO. On CRSP, a PERMNO can be associated with more than one eight-digit CUSIP number over time, but an eight-digit CUSIP number can only be associated with one PERMNO. CRSP never changes or drops CUSIPs that were ever active, so backtesting universes identified by CUSIP always link to the correct data. In the third situation mentioned above, CRSP will assign a dummy CUSIP to the older security to preserve the uniqueness of CUSIP to PERMNO.
8. How are Shares Outstanding data collected and maintained?
CRSP primarily relies on vendors for shares data. Shares Outstanding values are reviewed by CRSP researchers when a security is added to our database. Subsequent shares are run through an extensive set of CRSP filters which look primarily for magnitude jumps and drops in share values and reversals of values. Depending on the source, a lag of one or two months may be seen between when changes in Shares Outstanding values occur and when they are reflected in our sources.
9. What are ADRs and how are the shares outstanding computed?
Companies incorporated in foreign countries and trading on foreign exchanges can be traded in the U.S. stock market as American Depositary Receipts (ADRs).
An investment bank will buy shares of a non-US-traded stock in a foreign market and issue ADR shares on the US market. A depositary bank handles the issuance and cancellation of ADR certificates.
The depositary bank sets the ratio of US ADRs per home country share. This ratio can be less than or greater than 1 in order to get the ADR in a comparable trading range with US equities. CRSP does not have a variable for the ratio of underlying ordinary shares to ADR shares. The shares outstanding values for ADRs listed in the CRSP stock database are ADR shares traded in the U.S. and not the underlying common shares.
10. What are equity securities and are they all included in the CRSP database?
In general, any security that represents ownership interest in an entity is equity. All the securities listed on CRSP are equity securities, but not all equity securities are listed on CRSP.
The following sharecodes are included on the subscriber version of the CRSP database:
10s - Ordinary Common Shares (domestic), Capital Stock, Global Registered Shares
20s - Certificates, Americus Trusts (exclusively sharecode 23)
30s - ADRs, ADSs (American Depositary Receipts, American Depositary Shares), New York Registry Shares
40s - SBIs (Shares of Beneficial Interest)
70s - Units (Units of Beneficial Interest, Units of Limited Partnership Interest, Royalty Trusts, Trust Units, Depository Units), ETFs (exclusively sharecode 73)
The following sharecodes are maintained internally by CRSP, but are excluded from the subscriber version of the CRSP database:
50s - Warrants
60s - Rights
80s - Preferred Shares, Capital Income Securities (with interest rate)
90s - Bundled units, such as a common packaged with a warrant or right
Daily open prices are available for securities traded on NYSE, AMEX, and NASDAQ exchanges beginning June 15, 1992. They represent the first trade after market opens. For NYSE, additional daily open prices are available between December 1925 and June 1962.
If a security went public on a CRSP-covered exchange under the circumstances described above, its IPO data should be available.
12. Why isn't AX (Archipelago Holdings Inc) covered in the CRSP Stock Database?
Archipelago Holdings is an unusual case. It has unlisted trading privileges on AMEX, but its primary listing is on the Pacific Stock Exchange. Per CRSP convention, a security needs to have a primary listing on NYSE, AMEX, or NASDAQ for coverage in the stock database.
INDICES
2. What is the coding scheme for the CRSP stand-alone index files?
1. If I create a portfolio based on a list of PERMNOs and one or more of the securities associated with those PERMNOs were delisted prior to the end of the given period, how are the returns for the remainder of the period calculated?
Once delisted, an issue is given no weight in the portfolio. The returns reflect the value- or equal-weighted average of the returns of the remaining securities.
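The weighting rule can be sketched as a weighted average taken over only those securities that still have a return on a given date. In the sketch below, a return of None stands for a delisted security; the function name, the dictionary inputs, and the weights are hypothetical. Passing equal weights gives the equal-weighted case.

```python
from typing import Dict, Optional


def portfolio_return(
    returns: Dict[int, Optional[float]],
    weights: Dict[int, float],
) -> float:
    """Weighted average return over securities that are still listed.

    A security whose return is None (delisted) receives no weight, so the
    remaining weights are effectively renormalized to sum to one.
    """
    active = [permno for permno, ret in returns.items() if ret is not None]
    total_weight = sum(weights[permno] for permno in active)
    if total_weight == 0:
        raise ValueError("no active securities with positive weight")
    return sum(weights[p] * returns[p] for p in active) / total_weight


# Hypothetical weights; the single return value is the 19950106 figure
# reported for the remaining security in the example that follows.
weights = {10106: 5.0, 10107: 95.0}
after_delisting = portfolio_return({10106: None, 10107: 0.016771}, weights)
```

With only one security left, the portfolio return equals that security's return regardless of the weights, which is why the two portfolios in the example agree from 19950106 onward.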
EXAMPLE:
We create two portfolios. Port1 has two PERMNOs (10106 and 10107) and Port2 contains only one PERMNO 10107. PERMNO 10106 was delisted on 19950105.
PERMNO  Begdt-Enddt
10106   19860312-19950105
10107   19860313-20071130
Beginning 19950106, the first day on which PERMNO 10106 is no longer in the portfolio, the returns on Port1 match the returns on Port2.
                 Weight       Usdcnt  Ret
port1  19950103  35565600.00  2       -0.015290
port1  19950104  35021812.50  2        0.007232
port1  19950105  35275100.00  2       -0.016467
port1  19950106  34642125.00  1        0.016771
port1  19950109  35223125.00  1       -0.006186
port1  19950110  35005250.00  1        0.012448
                 Weight       Usdcnt  Ret
Port2  19950103  35513625.00  1       -0.015337
Port2  19950104  34968937.50  1        0.007269
Port2  19950105  35223125.00  1       -0.016495
Port2  19950106  34642125.00  1        0.016771
Port2  19950109  35223125.00  1       -0.006186
Port2  19950110  35005250.00  1        0.012448
2. What is the coding scheme for the CRSP stand-alone index files?
The following coding scheme allows users to determine file names in the stand-alone indices files. For example, the file DSIX.xls is the Excel spreadsheet containing the daily indices for the NYSE, AMEX, and NASDAQ exchanges.
First character represents the frequency
D = Daily
M = Monthly
Q = Quarterly
A = Annual
Second and third characters represent the data
SI = Indices built on Market Capitalization Deciles
SS = Indices built on Standard Deviation Deciles
SB = Indices built on Beta Deciles
Fourth character represents the exchange
A = New York
B = American
C = New York + American
O = Nasdaq
X = New York + American + Nasdaq
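The coding scheme above can be applied mechanically. The following sketch decodes a four-character file name into its frequency, data type, and exchange; the function name and the handling of the file extension are assumptions made for illustration.

```python
from typing import Tuple

FREQUENCY = {"D": "Daily", "M": "Monthly", "Q": "Quarterly", "A": "Annual"}
DATA = {
    "SI": "Indices built on Market Capitalization Deciles",
    "SS": "Indices built on Standard Deviation Deciles",
    "SB": "Indices built on Beta Deciles",
}
EXCHANGE = {
    "A": "New York",
    "B": "American",
    "C": "New York + American",
    "O": "Nasdaq",
    "X": "New York + American + Nasdaq",
}


def decode_index_file_name(file_name: str) -> Tuple[str, str, str]:
    """Translate a name such as 'DSIX.xls' using the coding scheme above."""
    stem = file_name.split(".")[0].upper()
    if len(stem) != 4:
        raise ValueError("expected a four-character file name")
    try:
        return FREQUENCY[stem[0]], DATA[stem[1:3]], EXCHANGE[stem[3]]
    except KeyError as err:
        raise ValueError(f"unknown code in file name: {err}") from None


decoded = decode_index_file_name("DSIX.xls")
```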
COMPUSTAT
1. What do the 6-digit, 8xxxxx series GVKEYs represent?
These represent records for Canadian companies, presented in Canadian dollars.
The CRSP Stock Database does not include trading that takes place on Canadian exchanges. If a Canadian company trades on a U.S. exchange, there might be another GVKEY that links directly to a CRSP security.
Example
Compton Petroleum Corp, a Canadian company, has two records in Compustat. GVKEY 863673 reports Compustat data in Canadian dollars, while GVKEY 63673 reports Compustat data in US dollars.
863673
Selected Data Items: 12
12 - Sales (Net)
      Fiscal   Data Item 12
Year  Yearend  SalesNet
2005  12       425.1620
63673
Selected Data Items: 12
12 - Sales (Net)
      Fiscal   Data Item 12
Year  Yearend  SalesNet
2005  12       364.7580
Compton Petroleum Corp has a record in the CRSP Stock Database (PERMNO 91043) for its common stock trading on NYSE.
GVKEY 863673 is associated with PERMNO 91043 using linktype LX (Secondary link. Links to foreign company trading on a foreign exchange also with an issue trading on a US exchange. Care must be taken to avoid double-counting data if using a secondary link.)
863673
LNKBEGDT  LNKENDDT  PERMNO  PERMCO  LINKTYPE  LINKFLAG
19950101  20051205  0       0       NR        XXX
20051206  99999999  91043   50017   LX        BBB
GVKEY 63673 is associated with PERMNO 91043 using linktype LC (Issue link that is a “standard” link. Company and price match.)
63673
LNKBEGDT  LNKENDDT  PERMNO  PERMCO  LINKTYPE  LINKFLAG
20051206  99999999  91043   50017   LC        BBB
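To avoid double-counting when both GVKEYs link to the same PERMNO, one possible approach is to keep a single link per PERMNO and date, preferring the standard link over the secondary one. The sketch below uses the link rows from the example; the priority table, the field names, and the function name are assumptions made for illustration and are not a CRSP-recommended procedure.

```python
from typing import Dict, List, Optional

# Assumed preference order: a standard link (LC) outranks a secondary link (LX).
LINKTYPE_PRIORITY = {"LC": 0, "LX": 1}

# Link history rows copied from the example above; keys are hypothetical.
links: List[Dict[str, object]] = [
    {"gvkey": "863673", "lnkbegdt": 19950101, "lnkenddt": 20051205,
     "permno": 0, "linktype": "NR"},
    {"gvkey": "863673", "lnkbegdt": 20051206, "lnkenddt": 99999999,
     "permno": 91043, "linktype": "LX"},
    {"gvkey": "63673", "lnkbegdt": 20051206, "lnkenddt": 99999999,
     "permno": 91043, "linktype": "LC"},
]


def preferred_gvkey(rows: List[Dict[str, object]], permno: int, date: int) -> Optional[str]:
    """Pick one GVKEY for a PERMNO on a date, preferring standard links."""
    candidates = [
        r for r in rows
        if r["permno"] == permno
        and r["lnkbegdt"] <= date <= r["lnkenddt"]
        and r["linktype"] in LINKTYPE_PRIORITY
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: LINKTYPE_PRIORITY[r["linktype"]])
    return str(best["gvkey"])


chosen = preferred_gvkey(links, 91043, 20060131)
```

Under this assumed ordering the US dollar record is selected for PERMNO 91043, and dates before the link begins return no match.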
MUTUAL FUNDS
1. There are many cases when actual_12b1 equals 0.9999, but max_12b1 is much less than 0.9999. Why?
4. What is the unit of measurement for mgmt_fee?
5. How are CL, CS and CM dis_type codes defined?
6. Why do some mutual funds have mixed taxable and untaxed dividends?
7. How can I access CRSP Mutual Fund data?
8. What does a turnover value of 0 represent?
9. How are returns calculated for Mutual Funds?
1. There are many cases when actual_12b1 equals 0.9999, but max_12b1 is much less than 0.9999. Why?
The value 0.9999 signifies that the fund does not currently charge a 12b-1 fee, even though it may have a maximum value it would allow if it were to institute one.
2. It appears that exp_ratio can be less than the mgmt_fee, even though exp_ratio includes the mgmt_fee. Why?
The expense ratio includes the management expense. What you do not see are any waivers that have been applied to bring the expense ratio down to a certain point, which is how the reported expense ratio can fall below the management fee. Currently CRSP does not track waivers.
3. The fiscal_yearend date always precedes begdt. If the fees are valid for the fiscal year ending on fiscal_yearend, what is the purpose of begdt and enddt?
The begin and end dates signify the dates for which the information in that row is valid. It is similar to the header table in the stock database. Any time one of the fields changes, a new record is created with the new information and a new set of begin and end dates. For instance, if a fund reports identical fee values for three months and then changes one of them, there would be one record with a three-month range between the begin and end dates, and another record with a one-month range.
4. What is the unit of measurement for mgmt_fee?
The management fees are represented in percent format, so that 0.5 represents 0.5%. Management fees are always a function of assets under management and not NAVs. For instance, if you have $100,000 invested in a mutual fund with a mgmt_fee of 0.29, over the course of a year you would pay 100,000 * 0.29% = $290 in management fees. To look at the total fees collected by a fund, multiply the Total Net Assets by the management fee. This is what the fund expects to receive in fees annually, not every month.
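The percent convention is easy to misapply, so a short sketch of the arithmetic may help. The function name is hypothetical; the only assumption is that mgmt_fee is expressed in percent, as described above.

```python
def annual_management_fee(assets: float, mgmt_fee_percent: float) -> float:
    """Annual fee in dollars for a mgmt_fee expressed in percent (0.5 means 0.5%)."""
    return assets * mgmt_fee_percent / 100.0


# The $100,000 example from the answer above, with a fee value of 0.29.
investor_fee = annual_management_fee(100_000, 0.29)
```

The same function applies to a fund's Total Net Assets to estimate the fees the fund expects to collect over a year.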
5. How are CL, CS and CM dis_type codes defined?
- CL=Long-Term Cap Gain
- CS=Short-Term Cap Gain
- CM=Mid-Term Cap Gain
The length of time associated with a particular term is defined by the tax code. Most often the short-term gain is for an investment sold at a profit that was held for less than a year. A long-term capital gain is defined as an investment sold at a profit that was held for more than a year. Mid-term capital gain is effective from July 29, 1997, through Dec. 31, 1997, for investments held for more than one year but less than 18 months. In this period a long-term capital gain is declared on those investments held for more than 18 months.
6. Why do some mutual funds have mixed taxable and untaxed dividends?
Tax-exempt funds may have a small portion of a taxable dividend. For tax-exempt funds, generally speaking, most if not all income dividends are exempt from federal income tax. It is possible that a tax-exempt fund could earn taxable interest income, in which case it would be distributed to shareholders as "taxable" income dividends. In these cases, a fund may be coded as having both DT and DU dis_type codes.
- DT=Taxable Dividends
- DU=Untaxed Dividends
7. How can I access CRSP Mutual Fund data?
The CRSP Survivor-Bias Free US Mutual Fund Database is provided in either SAS Data Sets or ASCII files. All SAS data sets may be accessed with SAS software version 8 or above. Almost all of the ASCII files are too large to be opened and parsed in Excel. Rather than using Excel or Microsoft Access, CRSP recommends that our subscribers load the ASCII data into a commercial-grade relational database or statistical package. CRSP includes generic create and load procedures with the ASCII data. These procedures, found in the file mfdb_create_load_procedure.txt, may be used as a starting point for loading the data into the relational database of your choice.
When accessing the Mutual Funds data files, note that the Fund Header table is the central table for the database. It contains the most recent information for active and delisted funds. Researchers may find it useful to link information in this table to the other tables where information is categorized by data type, such as holdings, returns, and NAVs.
8. What does a turnover value of 0 represent?
The values for turnover represent a percentage for the fund. For example, a value of 0.54 represents 54% turnover, while a value of 1.39 represents 139% turnover.
Almost all the funds with a 0 turnover ratio are money market funds (100% cash allocation). A few funds that have a passive management style may also show very low or zero turnover.
9. How are returns calculated for Mutual Funds?
Mutual Fund returns are calculated as the change in NAV including reinvested dividends from one period to the next. NAVs are net of all management expenses and 12b-1 fees. Front and rear load fees, which are dependent on investment/divestment levels and the time the investment is held, are not included in the calculation.
The Dividends table in the Mutual Fund Database contains the NAV value at reinvestment. This value is used in the returns calculation.
When the first letter of the Distribution Type (dis_type) is "C" (Capital Gains) or "D" (Ordinary Dividends) returns on NAV are calculated as:
(Ending NAV / Beginning NAV) * (1 + dis_amt / reinvest_nav) - 1
where dis_amt = distribution amount.
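The capital gains and dividend case can be written as a small function. This is a sketch of the formula as stated above for a single distribution in the period; the function name and the sample NAV values are hypothetical, and the treatment of periods with several distributions is not covered.

```python
from typing import Optional


def nav_return(
    beginning_nav: float,
    ending_nav: float,
    dis_amt: float = 0.0,
    reinvest_nav: Optional[float] = None,
) -> float:
    """Return on NAV with one distribution reinvested at reinvest_nav."""
    growth = ending_nav / beginning_nav
    if dis_amt:
        if not reinvest_nav:
            raise ValueError("reinvest_nav is required when dis_amt is nonzero")
        growth *= 1 + dis_amt / reinvest_nav
    return growth - 1


# Hypothetical values: NAV moves from 10.00 to 10.50 and a 0.20
# distribution is reinvested at a NAV of 10.40.
example_return = nav_return(10.0, 10.5, dis_amt=0.2, reinvest_nav=10.4)
```

With no distribution the function reduces to the plain change in NAV.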
When the first letter of the Distribution Type (dis_type) is "S" (Split), distribution amount is 0 and the NAV returns are calculated as:
(Ending NAV / Beginning NAV) * (1 + spl_new / spl_old) - 1
where spl_new is the new number of shares, and spl_old is the pre-split number of shares.
ZIMAN
1. Why are the values of Concentration Ratio (conratio) and Herfindahl-Hirschman Index (hhi) the same for the equal-weighted and value-weighted indices in CRSP/Ziman Real Estate Data Series?
Concentration ratio (conratio) and Herfindahl-Hirschman Index (hhi) are not dependent on the weighting of the stocks in a portfolio. They are dependent on the market cap of the different REITs that are eligible for the portfolio. The EW and VW indices are created from the same universe of stocks, so the conratio and hhi values are the same.
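A short sketch shows why index weighting does not enter these measures: both are functions of market capitalizations alone. The definitions below are the common textbook ones (sum of squared market-cap shares, and the combined share of the largest firms); the exact scaling and the number of firms used in the CRSP/Ziman conratio are not stated here, so top_n and the sample values are assumptions.

```python
from typing import Sequence


def hhi(market_caps: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index as the sum of squared market-cap shares."""
    total = sum(market_caps)
    return sum((cap / total) ** 2 for cap in market_caps)


def concentration_ratio(market_caps: Sequence[float], top_n: int) -> float:
    """Combined market-cap share of the top_n largest firms."""
    total = sum(market_caps)
    largest = sorted(market_caps, reverse=True)[:top_n]
    return sum(largest) / total


# Hypothetical market caps for three eligible REITs.
caps = [50.0, 30.0, 20.0]
universe_hhi = hhi(caps)
top2_share = concentration_ratio(caps, top_n=2)
```

Neither function takes index weights as an input, so an equal-weighted and a value-weighted index built from the same universe produce identical values.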
- INSTALLATION
1. What are silent installs and can I use them for installing CRSP data?
2. Can I run CRSPAccess on a network?
1. What are silent installs and can I use them for installing CRSP data?
CRSP databases may be installed in a non-interactive mode. Using a response file to answer the prompts generated by the InstallShield wizard, the install can be automated and run behind the scenes. The response file is a simple text file that contains the answers to the InstallShield prompts. This may be used with supported Windows, Sun, and Linux operating systems.
While CRSP encourages you to use the silent option for typical installs with the default settings, you may edit response files for your own CRSP data file locations.
After copying the response file onto your computer, the command needed to run the silent install at the command prompt is:
Sun & Linux:
/cdrom/databasename/setupsolaris.bin -options /home/.../Responsefilename.txt
Windows:
setupwin32.exe -options \pathname\responsefilename.txt
A sample response file looks like the following:
################################################################################
#
# InstallShield Options File Targeted to FAZ
#
# Wizard name: Install
# Wizard source: assembly.dat
# Created on: Fri Mar 04 11:38:24 CST 2006
# Created by: InstallShield Options File Generator
#
# This file contains values that were specified during a recent execution of
# Install. It can be used to configure Install with the options specified below
# when the wizard is run with the "-options" command line option. Read each
# setting's documentation for information on how to change its value.
#
# A common use of an options file is to run the wizard in silent mode. This lets
# the options file author specify wizard settings without having to run the
# wizard in graphical or console mode. To use this options file for silent mode
# execution, use the following command line arguments when running the wizard:
#
# -options "silentFAZ.txt"
#
################################################################################
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian)
#
# The Wizard Value Command Line Option enables you to set a "global" wizard property
# on the command line or in a response/options file. For example, use a Wizard
# Value Command Line Option to set the expected response from the end user during
# installation/uninstallation to the Replace Option property of a component.
#
# Global Wizard Property      Description
# ----------------------      -----------
# replaceExistingResponse     Stores the end user response to whether they want
#                             to replace a file that currently exists on their
#                             system with the one being installed.
#
# replaceNewerResponse        Stores the end user response to whether they want to
#                             replace a file that currently exists on their system
#                             with the one being installed if the existing file is
#                             newer than the file being installed.
#
# removeExistingResponse      Stores the end user response to whether they want
#                             to remove a file that currently exists on their system.
#
# removeModifiedResponse      Stores the end user response to whether they want to
#                             remove a file that has been modified since installation.
# Possible Values:
# * yesToAll
# * yes
# * noToAll
# * no
#
# overwriteJVM                Determines whether to overwrite the "_jvm" directory,
#                             if it already exists on the target system. The JVM
#                             Resolution bean looks for the value of this property
#                             which, if set to "no" or "cancel," prevents the
#                             directory from being overwritten. Possible values are
#                             as follows:
# * yes
# * no
# * cancel
#
# Example
# -G replaceExistingResponse=yesToAll
#
# -G replaceExistingResponse=yesToAll
# -G overwriteJVM=yes
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian) Install
# Location
#
# Specifies to install or uninstall the product in silent mode, where the
# installation/uninstallation is performed with no user interaction.
-silent
################################################################################
#
# Custom Dialog: License
#
# The initial state of the License panel. The accept and reject option states
# are stored as Variables and must be set with -V
# -V LICENSE_ACCEPT_BUTTON="true"
################################################################################
#
# Custom Dialog: License
# The initial state of the License panel. The accept and reject option states
# are stored as Variables and must be set with -V
# -V LICENSE_REJECT_BUTTON="false"
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian) Install
# Location
#
# The install location of the product. Specify a valid directory into which the
# product should be installed. If the directory contains spaces, enclose it in
# double-quotes. For example, to install the product to C:\Program Files\My
# Product, use
#
# -P installLocation="C:\Program Files\My Product"
#
# For unix installations
-P installLocation="/home/testcds/crspdata/"
# For Windows installations
-W SetWinProdInstallLoc_setid.value="C:\silentwindata2"
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian) Install
# Location
#
# Override the Typical action of creating the crsp.kshrc file in the user.home area.
# For example, to not install the crsp.kshrc file at all, use
#
# -P UnixCrspKshrc_fid.active="false"
# -P UnixCrspKshrc_fid.active="false"
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian) Install
# Location
#
# The relative install location of the maz database in the product.
# Specify a valid directory into which the
# product should be installed. If the directory contains spaces, enclose it in
# double-quotes. For example, "maz" will create a sub-directory of the current
# product installation location use
# -P maz_fid.installLocation="maz"
################################################################################
#
# Daily and Monthly Stock(200604), Solaris (IEEE Big Endian) Install
# Location
#
# The relative install location of the daz database in the product.
# Specify a valid directory into which the
# product should be installed. If the directory contains spaces, enclose it in
# double-quotes. For example, "daz" will create a sub-directory of the current
# product installation location use
# -P daz_fid.installLocation="daz"
################################################################################
#
# Custom Dialog: InstallType
#
# The Installation Type to be used when installing the product. Stored as a
# Variable and must be set with -V.
# -V IS_SELECTED_INSTALLATION_TYPE=typical
################################################################################
2. Can I run CRSPAccess on a network?
CRSPAccess can be installed on a file server on a Windows XP operating system. Data, programs, and libraries can be installed onto a file server machine that can be accessed by users. A separate client installation program, client_environment.exe, is provided to configure workstations. Configuring a workstation involves installing program shortcuts to CRSP programs and setting environment variables on the client workstation.
Client_environment.exe can be set to run on a user or system level. At the user level, it sets the environment variables on a computer for the current user. On the system level, it sets the environment variables for all users of that machine and requires administrator privileges to run.
To Install:
- Install the software and data onto a mapped network drive. (For example: H:\CRSP292 for software, H:\crspdata for data).
- From the ACCBIN folder in the mapped network drive, copy Client_environment.exe onto each workstation that will be used to access CRSP data and software.
- At each workstation, run Client_Environment.exe and specify the locations of CRSPAccess (CAGS or CMGS) and the data on the file server.
- Click on DO on the lower left corner of the client_environment.exe box.
- The CRSPAccess shortcuts can be viewed from your start menu.
Please note that at this point in time, CRSP cannot support configurations for client servers. Users may find success with client server installations, but because of the large number of iterations that CRSP cannot possibly replicate, we do not feel that we can adequately support these configurations.
3. When I try to install my CRSP database on DVD, I get an error message that the setuplinux.bin or setupsolaris.bin files are read only and cannot be executed.
CRSP creates DVDs with a "hsfs" default. If you get the "read only" error message, most likely your DVD reader is mounted as "udfs" and will need to be remounted as "hsfs" in order for the CRSP DVD to be read.
Your system administrator should be able to remount the drive with little effort.
To diagnose and confirm that this mounting difference is the problem, execute the following commands:
- Load the DVD onto your drive on Unix or Linux.
- Type the following command:
> df -n (if it's Unix)
> df -T (if it's Linux)
- After typing the appropriate command and pressing Enter, the drive that mounts the DVD should list as "hsfs". If it lists as "udfs", the drive needs to be remounted as "hsfs".
- CRSPSift
1. What data can I access using CRSPSIFT?
2. What items are available for Indices and Portfolios when using TsQuery?
3. What are the options for installing CRSPSift for Single and Multiple users?
4. How difficult is it to move from CRSPAccess utilities to the new CRSPSIFT software?
5. What do I need to run CRSPSIFT?
1. What data can I access using CRSPSIFT?
This first phase of the CRSPSIFT software, CRSP’s new Windows interface, provides access tools for the 1925 and 1962 CRSP US Stock and Indices Databases, and the CRSP/Compustat Merged Database (for Compustat subscribers). Subsequent phases of the software will provide support for additional CRSP databases.
2. What items are available for Indices and Portfolios when using TsQuery?
When using TsQuery to extract data for Indices and Portfolios, users will see the full range of items in the drop-down menu boxes, but only a subset of these items returns data. Lists of available data items follow:
Indices Daily and Monthly items:
- Capitalization, End of Period
- Capitalization, End of Previous Period
- Date
- Index Count Total
- Index Count Used
- Index Level of Returns
- Index Level of Returns on Income
- Index Level of Returns Without Dividends
- INDNO
- INDCO
- Returns
- Returns, Cumulative
- Returns on Income
- Returns on Income, Cumulative
- Returns Without Dividends
- Returns Without Dividends, Cumulative
Portfolio Daily and Monthly Items:
- Index Count Total
- Index Count Used
- Returns
- Weight Summation for the Members of a Portfolio
3. What are the options for installing CRSPSift for Single and Multiple users?
The first version of the CRSPSift software, available for general release in January 2007, was designed, tested and supported for installation on local disk for use with data installed either on a local disk or on a network drive.
CRSP plans for future versions of CRSPSift to have fully-supported options that include a range of network installations for both software and data, and efficient installations for multi-users on single computers.
Introduction of the CRSPSift software has raised questions among some of our subscribers regarding how the software can be installed under current license agreements as well as what installations can technically be executed.
Subscription Agreements
Simply stated, current subscription agreements pertain to CRSPSift for data access just as they do for CRSPAccess utilities. Note that depending upon your subscription classification, restrictions may apply to the number of seats and site licenses allowed per your agreement. Regardless of where and how CRSPSift, CRSPAccess, and CRSP data are installed, it is important that sufficient security exists to permit access to only eligible users, as defined in your subscription agreement.
Technical Installation Notes for CRSPSift Version 1.1
The most common installation requests that we receive from our subscribers are outlined below. Information applies in all cases when the CRSP databases are installed either on the local disk or a network file server drive with either read/write or read-only access. Note that when the data are installed on a shared network drive, if there is heavy concurrent use, network bandwidth limitations may prevent effective usage.
Single User on Single Computer - Single Software Install
CRSPSift has been extensively tested and is fully supported with the software installed on a local disk with either user or administrator privileges.
Multiple Users on a Public Computer – Single Software Install
Also tested and supported by CRSP, this configuration is often utilized in a lab environment and is typically done with Administrator privileges. An additional step must be taken after completing the regular installation of CRSPSift: a shortcut must be created for the SIFT.exe file in the “\CRSPSift\bin\” directory. The shortcut can be created either by the User or by an Administrator. Regardless of the privileges used when installing CRSPSift, Data Environments that are created are shared by all users.
Administrators:
If the shortcut is created with administrator privileges, it should be copied and pasted into the “C:\Documents and Settings\All Users\Desktop\” directory. All users will then see the CRSPSift shortcut on their desktop and can start CRSPSift by double-clicking the icon, but will not be able to edit or delete the shortcut. Optionally, the shortcut could be moved to “C:\Documents and Settings\All Users\Start Menu\Programs” to place the shortcut on the Start menu.
All users can create and edit Data Environments. CRSP recommends that users override default locations for storing output files and saved queries and instead, save to a location where they can ensure that they do not run the risk of their work being inadvertently overwritten or accessed by others.
Users:
If the shortcut is created at the user level, each user creates the shortcut and copies it to their own desktop. Data Environments are shared and, as in the case above, users should exercise care to write output to safe locations.
Multiple Users on Single Computer – Multiple Software Installs
With User privileges, CRSPSift can be installed on a single computer under separate user accounts. This holds an advantage over a single install with a shortcut for All Users in that the risk of files in a shared directory being overwritten is eliminated. This may be appropriate for some lab settings where a limited number of installations are needed or where a unique data environment per user is desired. With this configuration, CRSP recommends that rather than installing in the default CRSPSift folder, each user override the default location and install to c:\documents and settings\USERNAME\CRSPSift1.1. Regardless of where CRSPSift installations are placed, all users can point to either the same or different copies of the data.
Single Network Install for Multiple Users
Version 1.1 of CRSPSift does not allow network installations. This is because the default machine-level settings grant FullTrust only to the local machine on which the software is installed. Microsoft provides a Code Access Security Policy tool, caspol.exe, with its .NET Framework software to change these defaults.
WARNING: CRSP strongly recommends that those responsible for your network security be consulted before executing this command.
DISCLAIMER: CRSP has not extensively tested CRSPSift Version 1.1 when running from a network drive. Therefore, this is not a fully supported configuration.
In addition to security and support implications, consideration should be given to the behavior of the CRSPSift software when users are accessing it simultaneously or one at a time.
Executing the Code Access Security Policy Tool
The following command grants FullTrust to the network location of CRSPSift. It requires machine administrator privileges and must be executed on each computer from which access is desired.
C:\> c:\windows\microsoft.net\framework\v2.0.50727\caspol -m -ag 1.2 -url \\Myserver\MyShare\MyPath\CRSPSift\* FullTrust {enter}
Microsoft ® .NET Framework CasPol 2.0.50727.42
Copyright © Microsoft Corporation. All rights reserved.
The operation you are performing will alter security policy.
Are you sure you want to perform this operation? (yes/no)
yes {enter}
Added union code group with "-url" membership condition to the Machine level.
CRSP values your feedback on which installation configurations best serve your needs and what you would like to see in future releases of CRSPSift. Please write to support@crsp.chicagogsb.edu or contact us at 312.263.6400, select option 2.
4. How difficult is it to move from CRSPAccess utilities to the new CRSPSIFT software?
The first phase of CRSPSIFT was created specifically with existing subscribers in mind. Our intent has been to create a smooth transition for users from the existing CRSPAccess tools to CRSPSIFT.
Users may copy existing request files created for use with CRSPAccess ts_print, into the TsQuery module of CRSPSIFT using the Direct Edit function. These request files may be run and saved as new TsQuery files. Alternately, if a user creates a new query file in TsQuery and would like to execute the request directly using CRSPAccess, the query file may be saved as an old ts_print request file and run in the old tools.
Command line utilities, stkprint, indprint, and cstprint, have new counterparts in CRSPSIFT that run in a clean interface. The new utilities are intuitive, yet provide the option for users to apply old command line syntax, for those who really like to be in the driver’s seat! Users can save and recall query files using all utilities, preventing the need to remember long, complex syntax.
5. What do I need to run CRSPSIFT?
- CRSPSIFT runs on Windows XP, Service Pack 2
- Microsoft .Net Framework 2.0 and Macromedia Flash Player 8 are also required.
- Installation of the software searches for these components. If they are not already installed, they are available on the CRSPSIFT installation CD. Administrator privileges are required to install these two components.
- CRSPSIFT software requires roughly 300 MB.
- CRSP 1925 or 1962 US Stock or Stock & Indices Databases, and the CRSP\Compustat Merged Database (for Compustat subscribers only)
- CRSPAccess
1. Why does the ts_print version label in CRSPAccess 2.97 read 2.95?
2. How can I retrieve adjusted stock data from CRSP?
3. What returns can be accessed for the S&P 500 Index?
4. How do I access S&P 500 constituents?
1. Why does the ts_print version label in CRSPAccess 2.97 read 2.95?
The label in the upper left corner of the interface reads 2.95. There were no changes to the interface for this release, and the unchanged label serves to reinforce that the interface is being phased out. To confirm that you are using CRSPAccess 2.97, open the version.txt file located in the \acclib\ folder of the software.
The Windows interface for ts_print is included for the last time in the 2.97 version of CRSPAccess. The interface does not provide filter options and indices references for the newly added NYSE Arca data, but will support INDNOs and request files that access these data.
CRSPSift for Windows supports all CRSPAccess functionality and provides an easy-to-use interface for all utilities as well as some new features. CRSP encourages Windows subscribers to install and explore CRSPSift if you have not yet done so.
The ts_print command line utility remains current and supported on all platforms: Windows, Linux and Sun Solaris.
2. How can I retrieve adjusted stock data from CRSP?
CRSP data are stored in their unadjusted form. CRSP prices, dividends, shares, and volume data can be adjusted for split events and distributions to make the data directly comparable at different times during the history of a security.
stkprint
Using the stkprint utilities, the user selects an adjustment base date as the anchor date from which the data are normalized. All data on this date are unadjusted, and other data are converted based on the split events between the base date and the date of each observation. All events are applied on the ex-distribution date.
Price and dividend data can be adjusted for just stock splits and stock dividends, or for all distributions, including splits and dividends, spin-offs, stock distributions and rights.
ts_print
Adjusted items may also be accessed using ts_print, though it doesn't offer quite as much flexibility as the stkprint utilities. ts_print provides adjusted prices, shares, volumes, and dividends. Values are adjusted for the entire range of the security's coverage.
Example:
Microsoft's calendar year-end trading prices between 1986 and 2005 are represented below.
- Columns 1-3 are accessed from mstkprint, using the split adjustment base date at three different points in time. The command /djyyyymmdd is used, where the date is the base date used for the adjustment, followed by a 0 to adjust for splits only, or 1 to adjust for all price factors.
- Column 1 adjusts data for the full time range, from December 31, 2005 and back.
- Column 2 adjusts data from 1986 forward. Since Microsoft didn't begin trading until 1986, the time series represented removes all splits for the full time range.
- Column 3 represents data that were split adjusted using a base date of December 31, 1995.
- Columns 4 & 5 contain price data accessed using ts_print and represent the two available options
- Column 4 adjusts data for the full time period, equivalent to using the first base date as December 31, 2005 - in column 1.
- Column 5 displays unadjusted data for the full time period.
          mstkprint                                 ts_print
Date      /dj200512310  /dj198603310  /dj199512310  adjusted price  unadjusted price
          (1)           (2)           (3)           (4)             (5)
19861231  0.168         48.250        2.681         0.168           48.250
19871231  0.377         108.500       6.028         0.377           54.250
19881230  0.370         106.500       5.917         0.370           53.250
19891229  0.604         174.000       9.667         0.604           87.000
19901231  1.045         301.000       16.722        1.045           75.250
19911231  2.318         667.500       37.083        2.318           111.250
19921231  2.668         768.375       42.688        2.668           85.375
19931231  2.520         725.625       40.313        2.520           80.625
19941230  3.820         1100.250      61.125        3.820           61.125
19951229  5.484         1579.500      87.750        5.484           87.750
19961231  10.328        2974.500      165.250       10.328          82.625
19971231  16.156        4653.000      258.500       16.156          129.250
19981231  34.672        9985.500      554.750       34.672          138.688
19991231  58.375        16812.000     934.000       58.375          116.750
20001229  21.688        6246.000      347.000       21.688          43.375
20011231  33.125        9540.000      530.000       33.125          66.250
20021231  25.850        7444.800      413.600       25.850          51.700
20031231  27.370        7882.560      437.920       27.370          27.370
20041231  26.720        7695.360      427.520       26.720          26.720
20051230  26.150        7531.200      418.400       26.150          26.150
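The relationship between the columns can be checked with a little arithmetic. The Python sketch below is an illustration only, not CRSP's implementation: it assumes a price is restated for a base date by multiplying by the cumulative split factor in effect on the price date and dividing by the factor in effect on the base date. The factors used (1, 4, 18, 288) are not read from CRSP distribution data; they are implied by the Microsoft figures above (the /dj198603310 price divided by the unadjusted price). The names `UNADJUSTED`, `CUM_FACTOR`, and `split_adjust` are hypothetical, and the year-end trading dates stand in for the base dates.

```python
# Unadjusted Microsoft year-end prices, copied from the figures above.
UNADJUSTED = {
    "19861231": 48.250,
    "19901231": 75.250,
    "19951229": 87.750,
    "20051230": 26.150,
}

# Cumulative split factor since 1986, implied by the figures above
# (price adjusted to the 1986 base divided by the unadjusted price).
# These are assumptions for illustration, not CRSP factor data.
CUM_FACTOR = {
    "19861231": 1.0,
    "19901231": 4.0,
    "19951229": 18.0,
    "20051230": 288.0,
}


def split_adjust(date, base_date):
    """Restate the price on `date` in the share units of `base_date`."""
    return UNADJUSTED[date] * CUM_FACTOR[date] / CUM_FACTOR[base_date]


def adjusted_series(base_date):
    """Split-adjusted prices for every sample date, rounded to 3 places."""
    return {d: round(split_adjust(d, base_date), 3) for d in UNADJUSTED}


if __name__ == "__main__":
    for base in ("20051230", "19861231", "19951229"):
        print(base, adjusted_series(base))
```

On the base date itself the two factors cancel, which is why the price on the base date is always the unadjusted price.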
3. What returns can be accessed for the S&P 500 Index?
The S&P 500 Composite Index is a value-weighted index created by Standard and Poor's. Since March 1957, the index has contained 500 securities. Prior to that time, the index was called the S&P 90 and contained 90 securities. These have been combined into a single time series. S&P Composite levels are collected from public sources such as the Dow Jones News Service, the Wall Street Journal and the Standard and Poor's Statistical Service.
S&P 500 Composite Index levels and returns exclude dividends. As a result, the Return with Dividends (ret) variable returns a -88, or missing return code. S&P 500 Composite index levels, returns without dividends (retx), and cumulative returns without dividends (cumaret) can be accessed in the CRSP Stock & Indices databases.
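When working with these items outside the CRSP tools, the missing code and the price-only returns need slightly different handling. The Python sketch below is a rough illustration under stated assumptions: it treats -88 as the missing code named above, and it assumes the conventional definitions of a return without dividends (the ratio of consecutive index levels minus one) and of a cumulative return (compounded period returns). The function names and the index levels are made up for the example and are not CRSP data.

```python
MISSING_RETURN = -88.0  # missing return code named in the text above


def clean(values):
    """Replace the -88 missing code with None so it is never averaged."""
    return [None if v == MISSING_RETURN else v for v in values]


def returns_without_dividends(levels):
    """Period returns from index levels; the first period has no return."""
    returns = [None]
    for prev, cur in zip(levels, levels[1:]):
        returns.append(cur / prev - 1.0)
    return returns


def cumulative(returns):
    """Compound period returns, skipping missing periods."""
    total = 1.0
    out = []
    for r in returns:
        if r is not None:
            total *= 1.0 + r
        out.append(total - 1.0)
    return out


if __name__ == "__main__":
    levels = [100.0, 102.0, 99.96]          # made-up index levels
    retx = returns_without_dividends(levels)
    print(retx)
    print(cumulative(retx))
    print(clean([-88.0, -88.0, -88.0]))      # a total return series of missing codes
```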
In the CRSP Stock & Indices databases and the Supplemental Indices stand-alone files, the CRSP Value-Weighted S&P 500 and CRSP Equal-Weighted S&P 500 Indices are calculated by CRSP and provide total returns, cumulative total returns, and membership data for the aggregated S&P 500 constituents.
INDNO    Index Name                   Total Return, Cum TR  Return w/o Div, Cum TR w/o Div  Database
                                      (ret, cumtret)        (retx, cumaret)
1000500  CRSP S&P 500 Value-Weighted  YES                   YES                             Stock & Indices only; Supplemental Indices
1000501  CRSP S&P 500 Equal-Weighted  YES                   YES                             Stock & Indices only; Supplemental Indices
1000502  S&P 500 Composite            NO                    YES                             Stock; Stock & Indices
4. How do I access S&P 500 constituents?
The S&P 500 historical constituent list can be generated by using the stkprint utilities. The ability to create it requires CRSPAccess with either the Daily or Monthly Stock Product and the Indices Product.
The basic commands using dstkprint or mstkprint on the command line are below. Since no date range is specified, default dates for a recent time window are displayed and used.
dstkprint /g16 /fs /sq /of dsp500.list
CRSP 1925 Daily US Stock & Indices, data ending 20051230
Using default dates 20050930 - 20051231
mstkprint /g16 /fs /sq /of msp500.list
CRSP 1925 Monthly US Stock & Indices, data ending 20051230
Using default dates 20050930 - 20051231
A date range can be specified, as in the examples below:
dstkprint /g16 /fs /sq /of dsp500.list /dt1925-20051031
mstkprint /g16 /fs /sq /of msp500.list /dt1925-20051
