1. Attachments are working again! Check out this thread for more details and to report any other bugs.

How to analyze traffic accident data ... FARS

Discussion in 'Gen 2 Prius Main Forum' started by bwilson4web, May 10, 2009.

  1. bwilson4web

    bwilson4web BMW i3 and Model 3

    Joined:
    Nov 25, 2005
    27,068
    15,372
    0
    Location:
    Huntsville AL
    Vehicle:
    2017 Prius Prime
    Model:
    Prime Plus


    Working with the National Highway Traffic Safety Administration (NHTSA) database, the Fatality Accident Reporting System (FARS) and excel, I've figured out how to analyze the data to extract detailed records. For example, in 2007, there were these known accidents involving pedestrians and Prius:
    Column 1 Column 2 Column 3 Column 4
    0 ST mm/dd/yyyy Vehicles Prius Model
    1 CA 1/5/2007 1 2004
    2 CA 1/24/2007 1 2006
    3 FL 1/26/2007 1 2007
    4 FL 9/12/2007 3 2006
    5 TX 10/18/2007 2 2005
    6 AZ 10/21/2007 2 2005


    There were 5,432 records in the accident file giving a raw rate of 0.11% of all accidents with involving both a pedestrian and a Prius. In this rough report, there were three accidents involving more than one vehicle and pedestrian. A more rigorous analysis would take just the single vehicle pedestrian accidents.

    GET THE ACCIDENT DATA
    1. Start with NHTSA Encyclopedia
    2. Click on "FARS FTP" link at the top
    3. Select the directory for the year of interest
    4. Download the ZIP file of the data
    5. Open file into a drectory
    You should now see a list of files:
    ACCIDENT.DBF
    miacc.DBF
    midrvacc.DBF
    miper.DBF
    person.DBF
    vehicle.DBF
    vehnit.DBF​
    DATA ANALYSIS

    If you DBase or Microsoft Access, these files should load fine and you can begin to analyze the data. However, I have a Macintosh and the built-in databases were unable to do anything with these files.

    In my case, a Macintosh with excel, which will "Open" any DBF file and each will load fully except for "person.DBF", which is too large to fit in one spreadsheet. But I only needed:
    ACCIDENT.DBF - all accidents for the year
    vehicle.DBF - all vehicles​
    The first row will have the name of each field per column and each subsequent row has the values. Since it loads them as text and floating point numbers even though there are no floating point fields, I select the spreadsheet and format all numbers without any fractions.

    "ACCIDENT.DBF"

    This file lists each accident for the year including a column for "PED", pedestrians. So I simply deleted all non-pedestrian accidents and found 5,452 records. A stricter file would keep records with just one pedestrian and one vehicle, ignoring the multi-vehicle accidents that also involve a pedestrian. Now this file does not include vehicle information but it does include "ST_CASE" (state case number) that is a universal index in all of the files.

    "vehicle.DBF"

    This file lists all vehicles involved in an accident and has a column for "ST_CASE" but in the wrong location. I make a copy of that column and do an 'insert paste' into the first column of the "vehicle.DBF" file. This will be the key needed to lookup each vehicle. But you will also need a number for each column. Going to the bottom of the spreadsheet, I add another row that is a counter for each column with the "A" column being "1."

    Adding Vehicle Information To ACCIDENT.DBF Columns

    Select a column in "ACCIDENT.DBF" and insert the function "vlookup". This function needs:
    1. ST_CASE cell - using an absolute reference for the column and relative row, identify the cell with the state case number.
    2. vehicle.DBF table - using just the data fields, not the title row nor column count row, identify the whole spreadsheet data as a table. On a Mac, I click on the upper left data cell and then SHIFT-click on the lower right.
    3. vehicle field "MAKE_MOD" - this field includes both the manufacturer and the last thee digits are the model number. The number 49046 is the Toyota Prius.
    Add as many additional vehicle fields as desired in additional columns.

    UNDERSTANDING THE DATA FIELDS

    Using Google, find a copy of "Analytic Reference Guide 1975 to 2006" (FARS06.pdf) for a text description of the data fields.

    WHAT TO DO NEXT?

    It would be great if we could find a gas-only vehicle with known sales figures and a similar sales pattern to the Prius to compare and contrast. For example, the Scion and Prius might give some insights if one or the other shows an abnormally high rate of pedestrian accidents versus their sales numbers. Anyone interested in helping with say the Scion sales figures?

    Comments?
    Questions?
    Suggestions?

    Bob Wilson

    ps. I wrote a quick "c" hack to pass through the "person.DBF" file and generate a comma delimited, text file. The tricky part was dealing with the binary 'endian' but once solved, it works well enough. This is not a 'general purpose' DBase file program, no run-time arguments, everything is hard-coded as needed to get the data. But if you have an interest, send me a PM and I'll forward the source.
     
  2. dogfriend

    dogfriend Human - Animal Hybrid

    Joined:
    Feb 26, 2007
    7,512
    1,185
    0
    Location:
    Carmichael, CA
    Vehicle:
    2007 Prius
  3. jayman

    jayman Senior Member

    Joined:
    Oct 21, 2004
    13,439
    639
    0
    Location:
    Winnipeg Manitoba
    Vehicle:
    2004 Prius
    Oh Bob, there you go again throwing hard facts and figures into a highly emotional debate. A certain talk radio host with a penchant for boner pills and prescription narcotics will be out to get you
     
  4. bwilson4web

    bwilson4web BMW i3 and Model 3

    Joined:
    Nov 25, 2005
    27,068
    15,372
    0
    Location:
    Huntsville AL
    Vehicle:
    2017 Prius Prime
    Model:
    Prime Plus
    Looking at the 2007 accident data, the NHTSA report DOT HS 810 993 covering 2007 fatal accidents, and the Prius sales history, I did a 'back of the envelope' analysis to compare "miles per pedestrian deaths" for all vehicles and the Prius:
    • 554,000,000 miles per pedestrian death, all vehicles (~4,700 pedestrians NHTSA)
    • 1,023,000,000 miles per pedestrian death, Prius (~7 deaths)
    Now there are a lot of variables that are not accounted for such as vehicle weight (lighter ones are less deadly,) speed (urban vs rural), and other factors. Also, we are looking at very small numbers of deaths, 7 associated with the 6 Prius pedestrian accidents. But I wanted to look at a worst case scenario. What is the worst case rate of Prius-pedestrian accidents?

    The model I used started with the Prius sales per year minus 5% loss per year and 15,000 miles per vehicle per year:

    Column 1 Column 2 Column 3 Column 4 Column 5 Column 6
    0 Year Sales Running in 2007 miles/year
    1 2000 5 562 3 884 58 260 000
    2 2001 15 556 11 435 171 525 000
    3 2002 20 119 15 567 233 505 000
    4 2003 24 627 20 058 300 870 000
    5 2004 53 991 46 290 694 350 000
    6 2005 107 897 97 377 1 460 655 000
    7 2006 106 971 101 622 1 524 330 000
    8 2007 181 221 181 221 2 718 315 000


    477,000 Prius on USA roads in 2007
    7,162,000,000 Prius vehicle miles in 2007
    6 pedestrian-Prius accidents in 2007
    7 pedestrians in the accidents (actual pedestrian fatalities may be less)

    1,023,116,000 miles per Prius vehicle per pedestrian death in 2007 (approximate)

    1.64 fatalities per 100 million vehicle miles in 2007 (NHTSA report DOT HS 810 993)
    60,975,610 miles per all vehicles per fatality in 2007
    11% the number of pedestrian deaths of all fatalities

    554,324,000 miles per all vehicle per pedestrian death in 2007

    This 'back of the envelope' data can not be used as an exact metric of relative risk. Variables such as vehicle weight, urban vs. rural, and other aspects have not been factored. A proper multi-variable analysis is what the National Highway Traffic Safety Administration should do. But based upon these and Dr. Christopher Hogan's 2008 report, the Prius appears to have no detectable, higher risk to pedestrians.

    Bob Wilson
     
  5. ChapmanF

    ChapmanF Senior Member

    Joined:
    Mar 30, 2008
    23,073
    14,982
    0
    Location:
    Indiana, USA
    Vehicle:
    2010 Prius
    Model:
    IV
    This is timely - I just noticed the hybrid noisemaker question has a bit in the May 9 - 16 issue of The Economist. I didn't have time to read it when I spotted it, and now I'll be all curious how they covered it until I have the issue in my hands again with time to read it.

    -Chap
     
  6. bwilson4web

    bwilson4web BMW i3 and Model 3

    Joined:
    Nov 25, 2005
    27,068
    15,372
    0
    Location:
    Huntsville AL
    Vehicle:
    2017 Prius Prime
    Model:
    Prime Plus
    You may want to scan the comments from the online version and the letters to the editor in the next print version. I have been pleased at the majority of negative comments these article receive. Folks genuinely see this as nonsense, laughable, even without the facts and data. Then adding the facts and data and even the supporters of this nonsense fade away. But that has nothing to do with how Congress thinks.

    Now I want to share this trace of my letter to my congressman from the postal service:
    Current Track & Confirm e-mail information provided by the U.S. Postal Service.

    Label Number: 0407 2085 9470 1206 1598

    Service Type: Delivery Confirmation(TM)

    Shipment Activity Location Date & Time
    --------------------------------------------------------------------------------
    Processed HUNTSVILLE AL 35813 02/17/09 6:25pm

    Acceptance HUNTSVILLE AL 35802 02/17/09 8:19am

    --------------------------------------------------------------------------------

    USPS has not verified the validity of any email addresses submitted via its
    online Track & Confirm tool.
    As near as I can tell, my letter has never been delivered. But this makes sense because of the anthrax scare. It appears there is a postal mail handling facility and it seem to delay and/or lose some mail or at least my letter to my Congressman. What this means is we need to use alternative methods.

    My plan is to put together a web site with all of the supporting facts and data. This allows us to write a one page letter with a summary in opposition and cite specific web pages with the backup facts and data. This will also work for Letters to the Editor and other communications.

    Bob Wilson
     
  7. CharlesJ

    CharlesJ Member

    Joined:
    Dec 13, 2008
    467
    29
    38
    Location:
    Monterey
    Vehicle:
    2008 Prius
    Have you also emailed it to him? No delay there:D
     
  8. hobbit

    hobbit Senior Member

    Joined:
    Mar 23, 2005
    4,089
    468
    0
    Location:
    Bahstahn
    Vehicle:
    2004 Prius
    Model:
    N/A
    Any rough stats on how many of those incidents were caused and/or
    aggravated by vehicles following too closely? I'd love to extract
    some numbers on that for another project I'm thinking about.
    .
    _H*
     
  9. bwilson4web

    bwilson4web BMW i3 and Model 3

    Joined:
    Nov 25, 2005
    27,068
    15,372
    0
    Location:
    Huntsville AL
    Vehicle:
    2017 Prius Prime
    Model:
    Prime Plus
    I couldn't find anything coded that way. See if you can google up a copy of "FARS07CVMAN.PDF," that covered the encoding.

    Bob Wilson
     
  10. ChapmanF

    ChapmanF Senior Member

    Joined:
    Mar 30, 2008
    23,073
    14,982
    0
    Location:
    Indiana, USA
    Vehicle:
    2010 Prius
    Model:
    IV
    Ah, figures you'd already have seen it.

    Also figures your comment seemed better focused than most. Funny how many commenters either didn't get that Rosenblum's work does show a perceptibility difference, or didn't get that the accident data show that it doesn't correspond to any pedestrian safety difference. That last is the only policy point that needs making (the question before making new policy should be "is there a problem?" not "is there some interesting difference?"). Too bad it can get so easily lost in a bunch of comments that are sort of all over the map.

    -Chap