Decennial Surname Identifiers

These data contain disambiguated, person-level surname identifiers that group together identical or similar surnames. We create a unique identifier for each surname and attach it to individual-level records in the 2000 and 2010 decennial data. Three versions of the surname identifier are available: one that groups together only exactly identical surnames, one that groups together surnames based upon SAS DQ95 match codes, and one that groups together surnames based upon SAS DQ90 match codes. SAS DQ match codes blend Soundex and other string normalization algorithms to create hash codes that group together similar pieces of text. DQ95 match codes are stricter than DQ90 match codes: two surnames must be more similar to share a DQ95 code than to share a DQ90 code. The surname identifiers are longitudinally consistent between the 2000 and 2010 decennial data, meaning a given identifier refers to the same surname in both the 2000 and 2010 surname ID files. With these data, users can determine whether any two individual records in the 2000 and/or 2010 decennial census data share the same surname.
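As a rough illustration of how an exact-match identifier differs from a match-code identifier, and of what longitudinal consistency means, the sketch below assigns integer identifiers to a few made-up surnames. It is not the Census Bureau's procedure. SAS DQ match codes are not reproduced here; plain American Soundex stands in as the looser grouping key, and the function names (`soundex`, `exact_key`, `assign_ids`) and sample surnames are hypothetical.

```python
# Letter-to-digit table for American Soundex (a stand-in for SAS DQ match codes).
_SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = [letters[0]]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return "".join(encoded)[:4].ljust(4, "0")


def exact_key(name: str) -> str:
    """Grouping key for the exact-match version: the normalized surname itself."""
    return name.strip().upper()


def assign_ids(surnames, key, id_by_key):
    """Map each surname to an integer ID, reusing IDs already in id_by_key.

    Passing the same id_by_key dict for every file is what keeps the IDs
    consistent across files in this sketch.
    """
    ids = []
    for surname in surnames:
        k = key(surname)
        if k not in id_by_key:
            id_by_key[k] = len(id_by_key) + 1
        ids.append(id_by_key[k])
    return ids


# Hypothetical surnames standing in for records in two decennial files.
names_2000 = ["Smith", "Smyth", "Garcia"]
names_2010 = ["Garcia", "Smith", "Nguyen"]

exact_map, loose_map = {}, {}
exact_2000 = assign_ids(names_2000, exact_key, exact_map)
exact_2010 = assign_ids(names_2010, exact_key, exact_map)
loose_2000 = assign_ids(names_2000, soundex, loose_map)
loose_2010 = assign_ids(names_2010, soundex, loose_map)
```

In this toy example "Smith" and "Smyth" receive different exact-match identifiers but the same Soundex-based identifier, and "Garcia" keeps one identifier in both years. A user holding only the identifiers could therefore test whether two records share a surname without seeing the surname itself.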

Note: Users accessing these surname identifiers must not have surname PII in their project space.

Data and Resources

This dataset has no data

Additional Info

Field Value
Record creation timestamp 2024-10-11 15:01 UTC
Record last modification timestamp 2026-06-26 19:39 UTC
Agency dataset ID 2155
Alternative title(s)
Program
Program Title
Preferred citation
Source(s) Census Bureau
Other Source(s)
URL(s) for general survey info
Authorizer(s) Census Bureau
Other Authorizer(s)
Funder(s)/Sponsor(s)
Other Funder(s)/Sponsor(s)
Additional information about restricted dataset
DOI
Earliest Year of Data Available 2000
Most Recent Year of Data Available 2010
Smallest geographic unit Address
Smallest geographic unit (other)
Universe
1. Surnames in the 2000 and 2010 Decennial Census
Spatial coverage
1. National (50 States, DC, Puerto Rico)
Classification Demographics
Unit of observation Individual
Unit of observation (other)
Sample

We create a unique identifier for each surname and attach it to individual-level records in the 2000 and 2010 decennial data. Three versions of the surname identifier are available: one that groups together only exactly identical surnames, one that groups together surnames based upon SAS DQ95 match codes, and one that groups together surnames based upon SAS DQ90 match codes.

Frequency of data collection Every ten years
Frequency of data collection (other)
Method of data collection Other (specify)
Method of data collection (other)
1. Census Constructed Data
Reference date
Reference date (other)
Data collection notes
Number of cases
Number of variables
Linkage capabilities
Linkage variables
• Address
• Master address file identification number (MAFID)
Where to apply https://www.census.gov/about/adrm/fsrdc.html
Fees

Possible seat fee dependent on consortium membership. URL for more information: https://www.census.gov/about/adrm/fsrdc/about/fsrdc-network-fees.html

Access modality FSRDC
Usage restrictions

Users accessing these surname identifiers must not have surname PII in their project space. Census Bureau restricted-use data must be used for statistical purposes and cannot be used for purposes of enforcing laws or regulations or for profit.

Public Use File available
Public-use version
Summary of differences
Geography differs from restricted-use file (RUF)
Variable detail differs from restricted-use file (RUF)
FTI flag
Trust level Trust Level 4
Provisioned by state
Provisioned by other geographic unit
Provisioned by month
Provisioned year(s)
1. 2000
2. 2010
Provisioned by other time unit
Provisioning units other than time or geography
Can non-citizens apply Yes, if residency requirements are met
Variable selection requirement
Supporting documentation
Documentation for application