Field Sobriety Tests Explained
Why to NEVER Take a Field Sobriety Test – Debunking an Avalanche of Misinformation and Deception About Field Sobriety Tests
By: William C. Head – Atlanta DUI Attorney
© 2016, William C. Head – All rights reserved
No Field Sobriety Test in Use Today by Police is Correlated to Driving Impairment – PERIOD.
The emergence of the Internet has led to a regurgitation of false and misleading information about the three “field sobriety test” exercises. Almost all of the misinformation is derived from law enforcement agencies connected to the federal government, state governments, or some sycophantic “research” project funded — directly or indirectly — by our federal tax dollars. Apparently trusting that the Government would NEVER deceive its citizens, well-meaning private and non-profit companies inadvertently post garbage about field sobriety tests on their websites.
When I searched Google for high-ranking websites, I saw the “AAA DUI Justice Link” ranking near the top of the results. Under the guise of increasing traffic safety, this fawning driving safety organization spits out incorrect information about the accuracy of field sobriety tests, and posts hand-fed drivel from NHTSA and MADD statistics. These numbers about the reliability of field sobriety tests are inaccurate and clearly biased beyond doubt. The Government has altered field sobriety test guidelines and modified important field sobriety test procedures so often that 95% of all DUI arrests are made by police officers with outdated training.
I did find a disclaimer at the very bottom of the field sobriety test page published by AAA, on which they predictably misnamed these evaluations “Standard Field Sobriety Tests.” That disclaimer reads as follows:
The information and content provided on this site has been collected from various third party sources, and does not necessarily represent the opinions or judgments of AAA. AAA is not responsible for, and makes no warranties or representations as to, the accuracy of the information, which is provided “as is.”
Unfortunately, this important disclaimer is buried in the fine print (it was actually posted in a 7.5 font size, but I have enlarged it here), and the 40+ errors in the tripe published by AAA in citing proven government lies and deceptions about the claimed reliability of the 3 optional field sobriety test exercises are simply inexcusable. Below, I will clarify and dispel many of these errors about NHTSA’s field sobriety tests.
Just like the other bloated driving safety Website that has been highlighted in this article, FINDLAW has posted totally INCORRECT information about the reliability of the so-called “field sobriety tests.” When we were children, we would chant, “LIAR, LIAR pants on FIRE!”
I now denounce FINDLAW for being so quick to assume that a government agency is telling the truth about police sobriety tests that they authorized, paid for, and put in print, at TAXPAYER COST, that cannot be scientifically proven to be true.
At this link: http://dui.findlaw.com/dui-arrests/field-sobriety-tests.html
[hyperlink intentionally removed in order to not perpetuate the misinformation],
the world’s largest legal website (their claim, not mine) spits out false information about roadside sobriety tests. Not a single peer-reviewed study or article has ever verified the unscientific “validation studies done in the mid-1990s.”
FINDLAW has done a great disservice by using their monolithic size to get to the top of search results for “field sobriety tests,” and posting flawed information that can and will MISLEAD thousands of accused drunk drivers who are seeking scientifically correct Internet data about sobriety “tests.”
Looking at their information, the only question I now have, is “who copied from whom,” as I compare their very similar posting to the one displayed at the AAA site I have already identified above as putting out the same drivel.
FINDLAW has had some copywriter summarize and regurgitate propaganda from NHTSA, or some other government misinformation site, talking about the sobriety field tests as being capable of identifying impaired drivers. POPPYCOCK! When your initial study was flawed from the outset, and all other data flows from flawed data, nothing good can be expected. These evaluations should be called “false incrimination tests.”
The Findlaw site summarizes the NHTSA 3-test battery as follows:
The Standardized Field Sobriety Test (SFST) endorsed by the National Highway Traffic and Safety Administration (NHTSA) consists of the horizontal gaze nystagmus (HGN), walk-and-turn (WAT) and one-leg stand (OLS).
Then, Findlaw goes on to explain the mechanics of each of the three “standardized” police sobriety tests, but not one word is mentioned about the NUMEROUS true research scientists who have taken NHTSA to task over acting as though these roadside exercises are more than “party games” that can lead to false DUI-DWI arrests.
Next, Findlaw repeats the long-discredited mid-1990s validation studies that were created by the federal government in an effort to CLAIM (falsely) that police officers, a decade or so after the original standardized manual was published, had suddenly developed “superpowers” and were now able to accurately GUESS which roadside drunk driving subjects were “over the 0.08 limit” (the new standard that was being used in several states at the time, and which is now the standard in every US jurisdiction). This stands in contrast to the far worse statistics recorded 15 years earlier, when officers had been unable to even come close to these alleged numbers.
Plus, the earlier effort to GUESS who was “over the legal limit” was attempted (under controlled conditions). They were tasked with GUESSING which drivers were 0.10 or more. Here is the completely misleading and scientifically-unsupported information, being “sold” by FINDLAW:
Taken as a whole, the three components of the SFST accurately indicate alcohol impairment in 91 percent of all cases and 94 percent of cases if explanations for some of the false positives are accepted, according to a 1998 study cited by the NHTSA.
Plus, I must point out the pernicious and highly dangerous misconception, believed by almost all Americans, that they MUST attempt the optional, voluntary and self-incriminatory field sobriety tests requested by a police officer standing at the driver’s window of their vehicle, or else they will go to jail. Having asked thousands of prospective jurors about this concept, I have found that over 90% of citizens on jury duty BELIEVE they must attempt these bogus tests. This is the biggest lie of all, and officers get special semantics training on how to verbally coax, coerce, trick and berate citizens into engaging in these roadside “party games.” The attempt to “pass” these evaluations and the roadside recording of whatever is said by the citizen will later be the centerpiece of the criminal prosecution against the unsuspecting, accused intoxicated driver.
The truth is that a failed field sobriety test usually can be explained by one of these 5 flawed actions or defective field sobriety test procedures:
- The officer instructs the sobriety tests incorrectly.
- The person asked to perform the field sobriety test is too old or out of shape to EVER “pass” sobriety tests.
- The surface conditions, lighting conditions or weather conditions impact the detained person’s ability to execute field sobriety test exercises.
- The officer’s scoring method is subjective and hyper-technical so as to assure failure of the sobriety tests.
- The inherent flaws in the scoring protocol cause the subject to fail.
Since the inception of SFST “research,” hundreds of millions of taxpayer dollars have been committed to this project, after the first field testing manual was released for distribution to police in 1984. It was called the National Highway Traffic Safety Administration’s “Standardized Field Sobriety Tests.” Most of the taxpayer money spent by the federal government since 1975 went to Dr. Burns and her cronies. Ironically, these evaluations are NOT reliable to identify impaired drivers and have never been correlated to prove “driving impairment.” From that starting point, let’s learn more about how this boondoggle has operated to the detriment of millions of innocent drivers on America’s highways over the past 30 plus years.
To Begin, Let’s Review how Current DUI Laws are Set Up across America to count on the Field Sobriety Tests to help make their case
First, every state in the Union allows drivers age 21 and over to DRINK AND DRIVE. Thus, driving under the influence of alcohol is a “crime of degree,” wherein a person can consume a little alcohol and be legal to drive, but the same person can drink one more drink and become a criminal. An alcohol-based DWI-DUI is generally proven two ways: (1) by being too impaired to drive, due to overconsumption, or (2) by being “over the legal limit.” Some states require a prosecutor to elect which type of DUI-alcohol to pursue, but most states allow the prosecutor to try to prove DUI by any number of theories, in the alternative.
Second, almost every state has passed statutes called administrative license suspension laws or administrative license revocation laws. Under these laws, a driver who is arrested for DUI-DWI is given his or her “implied consent” rights, which explain the administrative sanctions for either REFUSING to be tested for alcohol (or drug) levels or SUBMITTING to testing and having a BAC level over the legal limit. Depending on the state, such a driver can face loss of driving privileges, restricted driving privileges, a financial penalty or even some jail time. In many states, the TOTAL loss of all driving privileges for REFUSING to be tested can be a powerful negotiating tool for police officers to use to coerce a driver into entering a guilty plea to a driving while impaired charge. In other words, the arresting officer may be willing to withdraw the administrative suspension or revocation, if the accused drunk driver agrees to enter a guilty plea to the underlying criminal charge. Many citizens with excellent drunken driving cases capitulate and enter a guilty plea, rather than lose their right to drive, with at least restricted privileges.
Third, a threshold determination of “probable cause to arrest” for DUI is a legal issue in many cases. Any officer arresting for driving while intoxicated knows that he or she must be able to back up the arrest decision with tangible proof of “impairment.” This proof typically is gathered from THREE sources: (1) What bad driving conduct was observed (e.g., swerving across lane lines, running a red light)? (2) What are the manifestations of an impaired driver, like slurred speech, unsteadiness on his or her feet, an admission of consumption of alcohol (or drugs)? & (3) How did the driver perform, if he or she attempted to perform ANY field sobriety tests — (which are 100% voluntary and optional but this little “secret” is seldom disclosed by an investigating officer BEFORE you are arrested)?
By doing NO field sobriety test exercises, and by keeping your mouth shut, most drivers protect themselves from needless self-incrimination. Think of this three-prong “probable cause” stool, and ask how this stool would STAND with just one leg, or two legs, and not three?
When you do not KNOW your legal rights, you can’t EXERCISE your legal rights. So, over 1 million clueless American drivers attempt to perform the non-scientific, non-standardized and not-much-better-than-flipping-a-coin field sobriety tests, and end up in jail. Be smart and JUST SAY NO to all roadside field tests.
The THREE so-called “Standardized” Field Sobriety Tests – The Eye Test (HGN), the Walk the Line Test (WAT) and the One Leg Stand Test (OLS)
Burns and her other researchers originally collected data about currently-used police roadside tests for intoxication. Three of these tests, when administered in a standardized manner, were claimed to create “a highly accurate and reliable battery of tests for distinguishing BACs above 0.10.” These are the CLAIMED levels of the three evaluations, but none of their work was exposed to peer review, and the statistics gathered from these federally-funded studies fall far short of being complete. Therefore, full review of HOW BAD these methods and results really were cannot be precisely quantified.
Horizontal Gaze Nystagmus (HGN) – [the EYE test]
Walk-and-Turn (WAT) – [the walk-the-line test]
One-Leg Stand (OLS) – [the balance test]
NHTSA’s manual reported laboratory test data (from Burns and company) and reported the following numbers, after the 1981 study was completed:
HGN, by itself, was 77% accurate
WAT, by itself, was 68% accurate
OLS, by itself, was 65% accurate
This was the LAST time any controlled investigation was done by NHTSA, regardless of claims in later manuals of accuracy levels of 90% or more. Other researchers who have reviewed these reported accuracy numbers by NHTSA have pointed out the failure of the investigation team to FIRST establish how well totally sober persons, at various ages, could perform the evaluations. This is called “establishing norms” in the field of clinical psychology. Moreover, the reported statistics omitted some of the false positive statistics and did not account for inter-rater variations which put these numbers far lower. The term “inter-rater” means that different officers tested the same subject and came up with different numbers. These statistics were collected in the two controlled studies (1977 and 1981) by Dr. Burns and her field sobriety test observers.
What is the origin of the Use of Field Sobriety Tests now used by DUI-DWI officers?
In the early days of law enforcement of motor vehicle traffic, no systematic methods for screening suspected drunken drivers existed. Meaningless “evaluation” methods were used, such as having a person blow into an officer’s hat, or requesting that the suspect pick up coins that the officer had tossed on the ground. The common instruction for the coin pickup test was to pick up each coin in the order of denomination (i.e., the nickel first, then the dime and then the quarter).
In the early 1970s, a Ph.D. candidate in California, who was interested in “testing and measurement,” saw an opportunity to write her thesis about the horrors of drunken driving. As her thesis was being finished, she began to write letters and give lectures about the NEED for better, standardized roadside testing protocols for all police officers to use. This outspoken proponent was Dr. Marcelline Burns of California. She pointed out the lack of uniformity of field sobriety testing across America, and posited that it SHOULD be possible to create field tests that were administered and graded in a standardized, controlled and systematic fashion, so that the motoring public would be given “fair” tests of their sobriety before being arrested. The stated GOAL was “to make our highways safer” by removing drunk drivers through better police training on the proper methods of screening suspected drunk drivers for alcohol impairment through roadside testing.
Prompted by the challenge suggested by newly-minted psychology-researcher Burns, in 1974, the federal government put out an RFP (request for proposal — bids), asking “research” scientists who wanted to bid for this federal money to submit their ideas for developing a standardized group of “roadside sobriety evaluations” that would enable officers across America — regardless of State or agency — to identify the suspected impaired drivers who should be arrested for drunken driving.
Because alcohol was (and still is) the most common type of driving impairment by far, the study was funded to create some “standardized” roadside exercises that would target drunken drivers only, not drugged drivers. These evaluations were to be easy for officers to do, and not be the type of tests that would put the officer at risk for his or her safety.
Dr. Burns’ supervising Psychology professor, Dr. Herb Moskowitz, encouraged her to see if she could get someone at NHTSA interested in her ideas. A seasoned psychology professor, Moskowitz also knew that federal grants were available to “researchers,” if the proper public safety presentation caught the attention of the right people in charge of federal highway funds. The big selling point of this proposal was, plain and simply stated, that more DUI-DWI convictions could be obtained by “standardizing” a few agility and mental acuity exercises to be utilized by investigating officers.
Burns’ Initial Field Sobriety Test investigation was to determine which field tests officers were already using
Thus, the quest to create standardized police sobriety tests was on in the mid-1970s, before NHTSA requested proposals to bid on contracts for conducting research to identify the best “field sobriety tests” that an officer could use at roadside. Marcelline Burns, a research psychologist and the director of the Southern California Research Institute (SCRI), submitted a technical and cost proposal with her group, and SCRI was awarded the preliminary contract by NHTSA to do the initial research in 1975.
So, beginning in 1975, federal studies were sponsored by the Department of Transportation (DOT) through the National Highway Traffic Safety Administration (NHTSA), which is a branch of the DOT. The initial contracts were all made with SCRI to determine which of the field sobriety tests were the most defensible for identifying drivers who were at or above the legal limit. In other words, SCRI initially was not asked to develop new field sobriety tests, but to evaluate the ones already being used by officers around the USA. Several other federal contracts were later funded for follow-up work on developing a new set of rules for screening and evaluating drinking drivers by way of field sobriety tests.
Dr. Burns’ group’s first study was reported to NHTSA in 1977, but the research work by Burns for the 1977 study actually began in 1975. Her methodology was to conduct a literature search of related police field test materials to see what types of roadside tests were being “taught.” As part of this phase, she rode with police officers in various jurisdictions across the USA to develop a list of currently-used tests. Eventually, she came up with sixteen (16) tests that she identified as potential field sobriety tests. Then, the next phase of this research was to look at the six (6) best field sobriety tests, in order to evaluate these for their reliability, safety and ease of being instructed and scored at the roadside.
The 1977 Laboratory Study of Six Field Sobriety Tests conducted by Marcelline Burns and her Cohorts
Using those tests, SCRI conducted some initial, limited research on police sobriety tests with a small group of people, and (after evaluating safety and ease-of-instruction criteria), narrowed the longer list of 16 possible field tests down to six (6) tests to be evaluated in the 1977 study. The goal here was to see which field sobriety test evaluations proved to have the best correlation to identifying drivers whose BAC level was 0.10 grams percent or more. Like with ALL other NHTSA-sponsored field sobriety test studies, NONE were ever correlated to DRIVING “impairment.”
This first controlled study involved 238 drinking subjects and ten (10) police officers. The study took about one year to finish. The upshot of the 1977 SCRI study was the recommendation by Burns and company for the use of three (3) of the six (6) “finalist” roadside tests, namely, the walk-and-turn evaluation, the one-leg stand evaluation, and the horizontal gaze nystagmus “eye” test evaluation, or the HGN test. (Following terminology used at that time, the HGN test was referred to in the SCRI report as the “alcohol gaze nystagmus test.”)
The other tests used in the study were the frequently-used finger-to-nose touching test while eyes were closed, the finger count test, where a person uses his or her thumb to touch the fingertips of the same hand, and count forward and backward “1-2-3-4 and then 4-3-2-1”, and the tracing of a circle on paper test. A modified Romberg test, the alphabet recital test and subtraction tests were also interchangeably used during these preliminary phases of the 1977 study, since these police sobriety tests had been popular with police officers.
The California residents selected as subjects for the research were licensed drivers and were also alcohol consumers. No teetotalers were tested. Significantly, the test subjects were not screened for prescribed drugs, nor illegal drugs, and the physical conditions of the subjects such as height, weight and medical limitations were not documented. The participants were instructed to fast (not eat) for four hours or more before they were given measured doses of alcohol. The purpose here was to be able to dose the subjects with measured amounts of alcohol and reach a predictable peak alcohol BAC level at about the same time. But, the participants in the field sobriety tests did not know the amount of alcohol that they were consuming. Each study participant, after consuming alcohol, was given a portable breath test on a preliminary breathalyzer device to determine their BAC level, and then each participant was subjected to the six field sobriety tests mentioned above.
The 1977 study concluded that the modified Romberg test and the finger-to-nose test merely reflected the presence of alcohol, but did not increase the predictive ability of this field sobriety testing. In other words, the finger-to-nose and Romberg tests did not add anything to the predictability of a subject’s level of intoxication. It is also interesting to note that in this 1977 study NONE of these other field sobriety tests were recommended by Burns and company for use as roadside sobriety tests. That recommendation to NOT USE them as roadside evaluations covers the finger count exercise, the finger-to-nose while eyes closed exercise, the modified Romberg test (where a test subject is asked to lean his or her head back while standing erect, close his or her eyes, and then silently estimate the passage of 30 seconds), the regular A to Z alphabet recitation test and the circle tracing on paper test. The alphabet test that was evaluated is NOT the bastardized, non-consistent, confusing version given by many law enforcement officers who make up their own “tests,” such as saying, “I want you to recite the alphabet starting with the letter E (echo) to the letter V (victor), without singing it.”
Nevertheless, these extra, useless and misleading so-called “tests” are sometimes requested by police officers to be performed by unsuspecting citizens as “add-ons” in order to gather even more possible unfavorable performances on audio and video, so that a jury will convict the person. Rogue police officers know that they get away with such antics in many states, and neither state police “standards” nor prosecutors handling the cases will admit the uselessness and unproven reliability of these evaluations.
Astonishingly, the finger count sobriety test evaluation that was eliminated early by Burns and company as being UNRELIABLE has been resurrected in 2009 by park rangers and nautical “police” patrolling our lakes and waterways and park lands, and the bogus “finger count” evaluation is now used as one of the field sobriety test procedures to determine which boaters will be arrested for BUI. The field tests used to arrest people for boating under the influence are (with the exception of HGN testing) the castoff field sobriety tests that Dr. Burns and her team abandoned over 35 years ago. Plus, the HGN test has never been validated on a floating surface, with a moving horizon and an equally moving and shifting field sobriety test “administrator,” such as a Department of Natural Resources officer standing on a rocking and rolling watercraft.
Police officers seeking convictions rather than justice have no qualms about requesting frightened drivers to try to perform these unscientific sobriety tests, and our dysfunctional court system in America ALLOWS it in most states. These facts alone point to the wisdom of NOT ever attempting ANY field sobriety tests, on land or on the water. Plus, guess who was paid to conduct the 1993 research on BUI tests that claimed that these sobriety tests could be performed while seated? Dr. Burns and her company! Ironically, these “police sobriety tests” were declared unreliable by Burns in the earlier, controlled studies.
The 1981 Study and getting new data on the Three Standardized Field Sobriety Tests
The 1977 sobriety test study recommended further review (or “we need more federal money”), and NHTSA once more awarded the Southern California Research Institute a contract to conduct new testing in order to try to shore up and bolster the dismally poor standardization numbers. Significantly, instead of six types of field sobriety tests, the study was now winnowed down to the three SFST evaluations first released in 1984, and which are still being used today: the HGN (horizontal gaze nystagmus), the WAT (walk and turn) and the OLS (one leg stand) roadside field evaluations. However, the manuals used by police to learn field sobriety test procedures have been altered and re-written about a DOZEN times, most recently in October of 2015.
This second “laboratory” study involved some changes in the AGES of the test subjects and some changes in the array for the bracketing of test subjects (by age groups). Plus, big changes were made to the number of totally alcohol-free test subjects, and to the amount of ethanol that the dosed subjects received. These changes helped the researchers get BETTER “reliability” numbers in their 1981 study, by simply being smarter about setting up the testing. So, while the three field sobriety tests settled upon MAY be the “best available” roadside agility and psycho-physical sobriety tests, by altering the array of dosed subjects, the age bracketing and the number of alcohol-free test subjects, the two studies cannot be said to be demographically identical. See this discussion below from one of Allen Trapp’s seminar speeches.
So, in the 1981 study only the three test battery was used. The 1981 study, like the 1977 study, was done only in a laboratory setting, except for a handful of experiments conducted at the end of the study. Burns states that the law enforcement officers again made their decisions to arrest or not to arrest based on the prediction that the subject’s BAC was over or under a 0.10. A total of 296 subjects were involved in the 1981 study.
Furthermore, some additional “divided attention” scoring components were added to the field sobriety test guidelines in midstream during the 1981 study. Again, the 1977 protocols and methodology and the 1981 “adjustments” make the two laboratory studies like comparing “apples to oranges” in some aspects. For example, Burns describes a divided attention component of the walk and turn sobriety test as the portion of the test wherein the subject is requested to stand with one foot in front of the other on the line, while listening to the instructions. This is also referred to as the “instructional phase”.
The standardization aspect of the 1981 field sobriety test study was to establish consistency in the administration guidelines, the instructions, the demonstrations, and the scoring. The objective was to ensure that if an officer in Florida does the three-test field sobriety test battery, and an officer in Oregon also does the identical FST tests, the two officers should do it the same way and reach the same conclusions (assuming the same dosed subject was being evaluated). The order in which the tests were given was considered by Dr. Burns to be irrelevant. Yet, all police training manuals call for the administration of the HGN evaluation FIRST, then the WAT and then the OLS.
In the 1981 study, out of 118 decisions by the officers to arrest after administering field sobriety tests, 32 percent of them were wrong. (Source: 1981 report, page 27, Table 8). This is only slightly better than the 1977 study, which had a 47 percent error rate of false arrests. Also, in the 1981 study, 18 percent of the subjects who had no alcohol in their system were misjudged by the officers to be “over the legal limit” (Source: 1981 report, page 22, Table 4). Dr. Burns attempted to explain the horrific reliability numbers from the earlier studies. Without admitting the shoddiness of her methodology, she wrote that, because she and her team made no effort to screen the test subjects for drugs, these people may have been impaired by substances other than alcohol.
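To put the 1981 percentage into head counts, the short Python sketch below converts the figures quoted above into approximate numbers of people. The function name and the rounding to whole decisions are my own assumptions for illustration only; the 1981 report itself remains the authority on the exact counts.

```python
def approx_count(total, percent):
    """Return the approximate number of cases that a percentage of a total represents."""
    # Rounding to a whole number of decisions is an assumption made for illustration.
    return round(total * percent / 100)

# Figures quoted in this article from the 1981 report (page 27, Table 8).
arrest_decisions_1981 = 118
wrong_percent_1981 = 32

wrong_arrests = approx_count(arrest_decisions_1981, wrong_percent_1981)
correct_arrests = arrest_decisions_1981 - wrong_arrests

print(f"Roughly {wrong_arrests} of {arrest_decisions_1981} arrest decisions were wrong")
print(f"Roughly {correct_arrests} of {arrest_decisions_1981} arrest decisions were right")
```

In other words, if the quoted percentage is applied to the quoted total, nearly four out of every ten or so people "arrested" in the laboratory were below the 0.10 level the officers were trying to predict.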
This pathetic attempt to bolster her three sobriety test battery is a clear admission of sloppiness in setting up and proceeding with a federally-funded project with far-reaching implications for the public. Maybe the federal government should request reimbursement of these wasted funds, and invalidate the entire study, since none of the subjects, including those who had ingested alcohol, had been screened for drugs before drinking. Also omitted from the boasts of “reliability” were the false positives rendered by officers in the 1981 study, where these officers incorrectly judged 31 percent of the people tested at a 0.05 BAC to be impaired, when most states have enacted statutes that provide that an adult driver whose BAC level is 0.05 or less is presumed to NOT be impaired.
Dr. Greg Kane, MD, of Colorado has summarized the deception and intentional misinformation that has been spawned by these so-called “studies:”
- Studies arranged by NHTSA
- Research contractors picked by NHTSA
- Research contractors paid by NHTSA
- No financial disclosure
- Research results published by NHTSA in-house, without outside peer review
- No review by an independent biostatistician. As far as I can tell, no review by any statistician
In the USA governments convict people of DUI crimes using pseudo-medical “science” based on secret data that was not originally subject to outside peer review and cannot now be examined by the defendant—or anyone else.
Source: From Dr. Greg Kane, MD at his website: www.sfst.us
Another very important point needs to be made. Apart from the 1977 and 1981 studies, none of the field sobriety validation studies, field testing or reports later done by Burns and others for NHTSA was conducted under controlled (laboratory) supervision. This is very significant, since REAL scientists give little or no credibility to any “reports,” validation studies or other published data that falsely claim to have been done under scientific standards. See some of these critics’ reports at the end of this article.
Specifically, the purpose of undertaking the later so-called “validation studies” was to try to accommodate critics who pointed out that the numbers obtained in the 1977 and 1981 laboratory work were meant to predict those who were at 0.10 grams percent or higher. The legal BAC limit was being lowered in many states to 0.08 grams percent. Under President Bill Clinton’s administration, lowering the limit became mandatory for states that wanted to obtain certain valuable federal highway funds. So, NHTSA needed to “create” some proof that officers could be as accurate or more accurate at predicting a 0.08 BAC driver. By 2005, every state had lowered its BAC level to 0.08 for adult (age 21 and over) drivers.
Reality check about the Bogus “Validation Studies” – Just Use your Common Sense
Start with this “common sense” CONCEPT. If you were going to test your marksmanship with a rifle, and the target was being placed at 50 yards away, do you think that you would get MORE shots within the RINGS of the target that was 8 inches in diameter or within the rings of a BIGGER target that was 10 inches in diameter? Does this REALLY take much analysis? Not for anyone looking for the truth.
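For readers who want the arithmetic behind the target comparison, here is a minimal Python sketch. It assumes both targets are simple circles measured by diameter, which is my simplification of the analogy, and the function name is made up for this illustration.

```python
import math

def circle_area(diameter_inches):
    """Area of a circular target, in square inches, given its diameter."""
    radius = diameter_inches / 2
    return math.pi * radius ** 2

small_target = circle_area(8)   # the 8-inch target
large_target = circle_area(10)  # the 10-inch target
ratio = large_target / small_target

print(f"8-inch target:  {small_target:.1f} square inches")
print(f"10-inch target: {large_target:.1f} square inches")
print(f"The larger target offers {ratio:.2f} times the area to hit")
```

Under that assumption, the 10-inch target has a little over one and a half times the area of the 8-inch target, so the same shooter has far more room to score a "hit."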
Anyone not brain dead would know that the shooter would have more “hits” within the rings of the target when the area of the target is LARGER. The same exact principle applies to trying to identify drinking subjects at a LOWER BAC level than the previous higher “over the limit” BAC level used in 1977 and 1981. Yet, between 1995 and 1998, NHTSA sanctioned these bogus “validation” studies to be conducted in three states: Colorado, Florida and California. No mandatory laboratory oversight and independent observation was utilized for these police sobriety tests. No double-blind testing was done. Officers were told that their “arrests” were being monitored to determine how often they made the RIGHT decision. Consequently, the officers tended to ONLY arrest the people who were obliterated by alcohol. In one of the three studies, the median BAC level was about double the legal limit of 0.08 grams percent. In another study, the police were permitted to use hand-held breathalyzers, which was specifically forbidden by the stated rules pertaining to these validation studies. Guess who got these contracts and oversaw the “validations?” Dr. Marcelline Burns.
The “findings” of the police officers who KNEW they were being studied yielded reliability numbers in the 90% or better range. So, (putting logic aside) on the more difficult “target,” the officers in the three “validation studies” were darn near perfect. Because these three taxpayer-funded reports were sanctioned by the Government, this passes as being “scientific.” Anyone who believes this should be trying to buy the Brooklyn Bridge, or some Jack-and-the-Beanstalk beans. Other REAL scientists who reviewed the raw numbers from the field sobriety test validation studies have pointed out the “loaded deck” given to the officers who were part of the testing. Burns and the other apologists claim that these incredible and “fixed” numbers were the result of highly trained DUI task force officers making the arrests. Complete hogwash! Yet, police trainers across America have been brainwashing officers into believing that, due to their training, they can perform field tests at nearly 100% reliability.
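To see why a sample loaded with heavily intoxicated subjects inflates an accuracy figure, consider a minimal Python sketch. The BAC readings below are invented purely for illustration and are NOT data from any NHTSA study; the function name is likewise hypothetical. The sketch scores a deliberately mindless rule (arrest everyone) against the 0.08 limit.

```python
LEGAL_LIMIT = 0.08  # grams percent, the adult limit discussed in the text


def always_arrest_accuracy(bac_readings, limit=LEGAL_LIMIT):
    """Share of arrest decisions that are 'correct' if EVERY subject is arrested."""
    correct = sum(1 for bac in bac_readings if bac >= limit)
    return correct / len(bac_readings)


# Hypothetical samples, made up for this illustration only.
heavy_sample = [0.16, 0.18, 0.15, 0.20, 0.17, 0.14, 0.19, 0.16, 0.06, 0.21]
mixed_sample = [0.03, 0.05, 0.07, 0.09, 0.10, 0.06, 0.12, 0.04, 0.08, 0.02]

print("heavy sample:", always_arrest_accuracy(heavy_sample))
print("mixed sample:", always_arrest_accuracy(mixed_sample))
```

On the invented heavy-drinking sample, arresting everyone is “right” 9 times out of 10. On the invented mixed sample, the same rule is right only 4 times out of 10. The rule did not change; only the mix of subjects did.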
Allen Trapp, a brilliant Carrollton, Georgia DUI trial attorney who passed away in September of 2015, did a comprehensive analysis of the “cooked” numbers of the so-called “validation studies” for one of the author’s seminars about 10 years ago. These important points made by Trapp show specifically how NHTSA’s lack of oversight and complete abandonment of the scientific method allowed this fraud to be perpetrated on the American driving public:
However, the most interesting statistics from the 1981 study as discussed by Cole and Nowaczyk, (The Champion, August 1995) involve the “dosing differential” of the subjects tested. Most of the subjects (78 percent) were dosed with either high BAC (about 0.15) or low BAC (0.05 and below). (Source: 1981 report, page 15, Table 4) These should have been easy decisions since it should, as a practical matter, be easy for the officers to score an individual as being above a 0.10 BAC when they are 0.15 BAC and above. The same would be true of someone 0.05 and below. NHTSA claims an overall accuracy rate of 0.80 when using the three-test battery, however, this overall accuracy rate of .80 is questionable when over two-thirds (78 percent) should be considered “gimmies” (either dosed high or low, hence the “dosing differential”). In other words, the data of the individuals dosed between 0.05 and 0.15 would undoubtedly have an accuracy rate of much less, however, that data is unavailable. Cole and Nowaczyk opine that one factor in determining the “improval” of the false arrest numbers (47 percent in 1977 down to 32 percent in 1981) could be due in part to the dosing differential.
The number of subjects dosed in the mid-range (0.05 to 0.15) went down from 27 percent (Source: 1977 report, page 19, Figure 5) in the 1977 study to 22 percent in the 1981 study. In other words, only 22 percent of the subjects in the 1981 study were in the more difficult to determine range of between 0.05 to 0.15 BAC. The 1981 study claims a “reliability study” as part of the research in 1981. Reliability basically refers to consistency, or the ability to get the same results each time. The reliability portion consisted of asking 145 of the subjects back for retesting two weeks after the original study. The “reliability factor” was a 0.77. This “reliability correlation coefficient” is based on a scale from almost zero to a 1.00. It is interesting to note that a correlation coefficient of 0.9 or above is expected for academic reading tests such as the SAT. This inter-rater reliability coefficient dropped to 0.57 (Source: Page 35, Table 14) when done by different officers. So, when different officers tested the same subjects at the same dose level, the reliability level was very pathetic, and far below scientific acceptability. Dr. Spurgeon Cole states that the scientific community expects reliability coefficients to be in the high 0.80s or 0.90s for a test to be scientifically reliable. This statistic is quite significant and is one of the reasons that judges should not allow an officer to testify that the accused failed the particular test.
The age and gender of the subjects used in the 1981 project, as with the 1977 study, are highly significant when considering any interpretation of the results. In the 1981 study a whopping 80% of the subjects were between the ages of 21 and 34. Again, as with the 1977 study about two thirds of them were male. (Source: 1981 report, page 14, Table 2) The use of a predominately male population in their twenties means that we should question the applicability of the test results to the population as a whole.
Source: Demolishing Police Testimony about SFST Reliability and Accuracy, Allen M. Trapp. Jr., 2006
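The arithmetic behind the “dosing differential” can be sketched in a few lines of Python. The overall accuracy (0.80) and the share of easy cases (78 percent) come from the figures quoted above. The accuracy on the easy cases is NOT reported anywhere in the quoted material, so the values tried below are assumptions chosen only for illustration, and the function name is hypothetical.

```python
def implied_midrange_accuracy(overall, easy_share, easy_accuracy):
    """Solve overall = easy_share * easy_accuracy + mid_share * mid_accuracy
    for mid_accuracy, where mid_share = 1 - easy_share."""
    mid_share = 1.0 - easy_share
    return (overall - easy_share * easy_accuracy) / mid_share


OVERALL_ACCURACY = 0.80  # claimed for the three-test battery (1981 study)
EASY_SHARE = 0.78        # subjects dosed high (about 0.15) or low (0.05 and below)

# ASSUMED accuracies on the easy cases; the real figure is unavailable.
for assumed_easy_accuracy in (0.85, 0.90, 0.95):
    mid = implied_midrange_accuracy(OVERALL_ACCURACY, EASY_SHARE, assumed_easy_accuracy)
    print(f"easy cases {assumed_easy_accuracy:.0%} -> implied mid-range {mid:.1%}")
```

If the officers were right on, say, 90 percent of the easy cases (an assumed figure), the reported 0.80 overall would imply roughly 45 percent accuracy on the mid-range subjects. The better the officers did on the “gimmies,” the worse they must have done on the hard cases for the overall number to come out at 0.80.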
This is the introduction written by Burns for the Florida “study” (which is undated):
During the years 1975 – 1981, a battery of field sobriety tests was developed under funding by the National Highway Traffic Safety Administration (NHTSA), U.S. Department of Transportation (Burns and Moskowitz, 1977; Tharp, Burns, and Moskowitz, 1981). The tests include Walk-and-Turn (WAT), One-Leg Stand (OLS), and Horizontal Gaze Nystagmus (HGN). NHTSA subsequently developed a training curriculum for the three-test field sobriety test battery, and initiated training programs nationwide. Traffic officers in all 50 states now have been trained to administer the Standardized Field Sobriety Tests (SFSTs) to individuals suspected of impaired driving and to score their performance of the tests.
At the time the SFSTs were developed, the statutory blood alcohol concentration (BAC) for driving was 0.10% throughout the United States. The limit now has been lowered in a number of states to 0.08% for the general driving population. “Zero tolerance” is in effect in some jurisdictions for drivers under age 21, and commercial drivers risk losing their licenses at a BAC of 0.04%. It is likely that additional states will enact stricter statutory limits for driving. In light of these changes, a re-examination of the battery was undertaken by McKnight et al. (1995). They reported that the test battery is valid for detection of low BACs and that no other measures or observations offer greater validity for BACs of 0.08% and higher.
Despite Dismal Reliability, the Field Sobriety Test “Battery” was put in Print and Distributed multiple times in the NHTSA Field Sobriety Test Manuals.
Although six field sobriety tests were used and were a part of the 1977 NHTSA study, none were selected as being indicators of anything, let alone as indicators of alcohol intoxication. Some interesting statistics came out of the 1977 study. Of primary significance was the error rate of the 10 officers involved in the study. Their error rate (false positives, meaning arrest decision was made, yet the person was under a 0.10 BAC) was an astounding 47 percent! You have to read the report, because you will not find this in a NHTSA training manual used by law enforcement in giving police sobriety tests. That is to say, in the 1977 study the officers made the decision to “arrest” a total of 101 people. Of those people “arrested”, 47 percent had a BAC under 0.10 percent. (Source: 1977 report, page 25) This high false positive percentage was totally unacceptable, even according to the author(s) of the study. Marcelline Burns first tried to attribute the high error rate to the inexperience of the officers used in the study.
If this were true, it would seem inexplicable that Burns would again use inexperienced officers in the 1981 NHTSA study. It is significant that approximately 80% of the subjects used in the 1977 study were in their twenties, and about two thirds of the test subjects were male. (Source: 1977 report, page 18, Figure 4) Physical dexterity clearly wanes as people age, with rare exceptions for those who are fanatics about staying fit.
So, the stated task of NHTSA was to find suspected impaired drivers who put public safety at risk. The target group was impaired from imbibing too much alcohol. The study did not test or even consider estimating percentages of drugged drivers who “failed” the field sobriety tests. NHTSA focused only on those test subjects who were dosed with alcohol.
NHTSA, a division of the United States Department of Transportation, eventually approved a group of scientists at the Southern California Research Institute (SCRI) based in Los Angeles, California. Not surprisingly, the winning bid was awarded to Dr. Burns. The primary authors of the final field sobriety test study and report were Dr. Marcelline Burns, Ph.D. and Dr. Herbert Moskowitz, Ph.D., both of whom operated the Southern California Research Institute.
The STANDARDIZED Field Sobriety Tests are NO LONGER STANDARDIZED.
Ironically, the creators of the standardized field sobriety tests similarly posted a disclaimer (like the AAA disclaimer above) within their training manuals — in all of them released for police training from 1984 through 2006. In 2013, NHTSA abandoned entirely this all-important “standardization” disclaimer. As can be seen below, backlash from prosecutors, DRE officers and standardized field sobriety test instructors has caused NHTSA to put it back in the manual.
The 2013 version of the NHTSA SFST manual TOTALLY deleted this critically-important warning to officers that if they DON’T follow every field sobriety testing protocol and SFST screening procedure meticulously, then the validity of all field sobriety test exercises is COMPROMISED. So, thousands of police officers were defectively trained on this manual for two years. It is of the utmost importance that the reader understand that this paragraph was the ONLY “standardization” language in the entire NHTSA SFST manual. After criminal defense attorneys’ cross-examination of arresting officers at trial pointed out the deception and duplicity involved in removing the BOLD TYPEFACE admonition from the field sobriety test manual, NHTSA saw its flagrant and inexcusable error and issued a new sobriety test manual in October of 2015. Significantly, the “admonition” has been taken out of BOLD print and buried in Section VIII, on page 13 of the newest field sobriety test manual.
Thus, these official, so-called “tests” are anything BUT tests, when you compare these evaluations to highly reliable, important testing such as SAT, ACT, IQ and other truly standardized testing. Imagine that you are going to take the SAT, or the MCAT or the LSAT, and the time-keeper shortens your group’s time to finish, decides to skip the lunch break, and does not control loud and disruptive students who are throwing spit balls across the rows of seats, while one test subject has a boom box that is belting out rap music at 80 decibels.
REAL “scientific” education competency tests are standardized against a “norm,” and are repeatable and reliable, to better than 90% repeatability. When you take an IQ test, thousands of sample tests and controlled test monitoring and administration go into the effort, because we all know that a score of 100 is the benchmark for having an “average” IQ. Without doing the crucial establishment of norms, how would a person ever trust that their IQ test was yielding accurate information? Yet, NOT A ONE of the field sobriety tests was exposed to comparison to any “norms,” and none of the sobriety tests come close to achieving these lofty percentages when predicting who is above the legal limit and who is not. In fact, due to the very nature of what the 3-test battery of sobriety tests seeks to score, none of these police sobriety tests will EVER approach 90% or greater repeatability.
The Standardization Paragraph that was REMOVED in 2013
Below is the now-omitted “standardized” language. It was always printed in BOLD print (the only bold print in the entire manual) and in ALL CAPS until the release of the 2013 Standardized Field Sobriety Test participant guide:
IT IS NECESSARY TO EMPHASIZE THIS VALIDATION APPLIES ONLY WHEN:
- THE TESTS ARE ADMINISTERED IN THE PRESCRIBED, STANDARDIZED MANNER
- THE STANDARDIZED CLUES ARE USED TO ASSESS THE SUSPECT’S PERFORMANCE
- THE STANDARDIZED CRITERIA ARE EMPLOYED TO INTERPRET THAT PERFORMANCE.
- IF ANY ONE OF THE STANDARDIZED FIELD SOBRIETY TEST ELEMENTS IS CHANGED, THE VALIDITY IS COMPROMISED.
It took Research from OUTSIDE the USA to Point out the Flaws in NHTSA’s SFSTs
A British study reported in 2009 highlighted the folly of Burns’ oversights and failure to establish “norms” for various age groups. In an article published in Accident Analysis and Prevention, vol. 41, p. 412 to 418 (2009), three medical researchers (Dixon, Clark and Tiplady) found the following statistics when comparing reliability of FIT (field impairment tests like WAT and OLS) and RITA (roadside impairment testing apparatus, an analytical device like our portable breath testers):
One hundred and twenty two healthy volunteers aged 18–70 years took part in this two-period crossover evaluation. The volunteers received a dose of alcohol and placebo, in the form of a drink, on separate days. Doses were calculated to produce blood alcohol concentrations of 90 mg/100 ml and RITA and FIT testing was carried out between 30 and 75 min post-drink. FIT was found to have a diagnostic accuracy of 62.7%. However, there was a substantial age effect for FIT scores, with volunteers aged over 40 showing failure rates on placebo similar to the failure rates on alcohol of younger volunteers. The accuracy of RITA was between 66 and 70%, not significantly higher than that of FIT. However, RITA did not show a marked age effect. Advantageously, this could result in fewer false positives being recorded if RITA were deployed at the roadside. Horizontal gaze nystagmus (HGN) was also investigated and posted an accuracy of 74%. The inclusion of HGN as one component of a UK roadside impairment test battery warrants further exploration with other drugs.
Title: Evaluation of Roadside Impairment Test device using Alcohol
Like being at a Carnival, you can now understand the “Shell Game” played by NHTSA
A carnival barker will try to lure you over to his table and let you wager a few dollars and try to guess which walnut shell of the three in front of you is covering the pea underneath it.
It looks like an easy bet to win. But, you don’t know that the carnival barker has rigged the game. The same principle applies to the NHTSA “standardized” field sobriety tests, except instead of losing a small bet at a carnival, you go to jail, arrested for DUI. After the “standardization” paragraph was “lifted” in 2013, NHTSA had to correct this planned deception, because someone told the folks in Washington that this crucial paragraph was the ONLY provision in the manual to provide the AURA of these field sobriety tests being “scientific” in any way.
Knowing the foregoing information, you clearly can see the “shell game” of the now “RE-Standardized” field sobriety tests. Over the course of three decades, the seven steps that led NHTSA to abandon the myth of standardization in 2013, and then to restore it in 2015, followed this path:
(1) First, the “researchers” get the federal grant money to “study” possible tests to be used at the roadside. In today’s dollars, the total would be in excess of $5 Million;
(2) The “researchers” never present their “research” paper for publication or submit it to experts in the field of testing and measurement in order to seek any kind of peer review. They did not even publish the full data collected during the field tests, to enable other scientists to either agree or disagree with their methodology and to prove that they followed the “scientific method.” In fact, we now know that the Burns and Moskowitz team DID NOT follow proper “testing and measurement” methodology, because no “norms” were ever established for totally unimpaired individuals, much less did they seek to establish common-sense brackets for variables like age, obesity, being free from prescribed medications and other important variables. So, by virtue of these omissions, a perfectly fit 18 year old male athlete is treated the same as a 60 year old woman who is 70 pounds overweight, and suffering from the usual insults brought on by her advanced age and sedentary lifestyle;
(3) Let NHTSA spend millions more to roll out the standardized field sobriety tests and utilize hundreds of millions of federal and state tax dollars to train millions of police officers, telling each of them that these psycho-physical evaluations are the best thing since sliced bread, when it comes to identifying drunk drivers;
(4) Along the way over the next two decades, get MORE federal tax money for more bogus “validation studies” and “robustness studies” for field sobriety tests and other drivel to be printed up and distributed to police officers who regularly perform field sobriety testing;
(5) Arrest 30 million PLUS citizens using these bogus sobriety tests as the primary basis of the officers’ arrest decisions, all over America;
(6) Yielding to pressure from judicial decisions that pointed out the complete farce that the field sobriety test battery represents, NHTSA and IACP altered the training manuals in 2013 to delete the all-important standardization paragraph. None of the roadside field sobriety tests are truly scientific tests, due to the multiple variables related to test subjects (age, physical health, agility, prescribed medications, flow of adrenaline during police encounter), uncontrollable physical conditions surrounding the evaluations such as weather, lighting and slope & grade of the roadway or sidewalk, as well as defective administration and instructions by police in virtually 100% of the actual video-recorded instances of these police tests. In the past two decades, word of the unreliability and optional nature of these field sobriety tests has begun to be picked up by the general public. This has caused an increasing number of juries to acquit accused DUI drivers when they learn (from expert witness testimony introduced by skilled drunk driving defense attorneys) that the field sobriety test “3-test battery” is pure poppycock and conjecture. As this release of new information has evolved, the entire bogus claim of “standardization” being the KEY to the reported statistics and percentages of reliability was unceremoniously removed in 2013 by NHTSA and IACP from the then-existing training manuals. Thousands of new officers and hundreds of older officers seeking retraining were taught the “standardized” field sobriety tests without this key portion of the training manual.
(7) In October of 2015, after loud and continuous complaints from DRE officers, SFST Instructors, criminal defense attorneys, judges, prosecutors and others, NHTSA had to “eat crow” and redo the manuals to add back the standardization language. Instead of doing the RIGHT thing and putting it in BOLD PRINT (as all other field sobriety test manuals had done before 2013), and adding it at the END of Section VIII, NHTSA put it in regular typeface and buried it on page 13, so that officers being trained will not focus on the importance of proper field sobriety test procedures being followed.
This “shell game” had to be changed because scores of DUI lawyer specialists around the country have taken the full police training courses from NHTSA-trained field sobriety instructors, starting in July of 1994 in Atlanta, GA. Now, dedicated DUI attorneys like the author have taken the police training courses multiple times (both the practitioner course and the instructor course) to be able to run circles around the police-trained officers who are regurgitating information from the course that will not withstand cross-examination.
The best DUI lawyers know when the officer is performing the evaluations incorrectly, and these attorneys have been using cross-examination to neutralize the bogus tests, in the eyes of the jury. So, when you follow the federal money, the proponents of the tests made a bundle off foisting the tests on the American public. American taxpayers paid out hundreds of millions of dollars to train officers on the pseudo-tests. Courts and local governments racked up billions of dollars in fines, and poor old John Q. Public still THINKS that he has to attempt to perform the optional, voluntary “fixed” tests. Until word is spread to NEVER do the field sobriety tests, the cycle of dishonesty and deception surrounding the so-called roadside sobriety tests will continue.
How Court Decisions and Appeals have helped Debunk the False Claims of the Field Sobriety Test Proponents
One of the most important cases in the last two decades in analyzing and debunking the so-called “field sobriety test” battery is United States v. Horn, 185 F. Supp.2d 530 (D. Md., 2002), which dealt with a case that occurred on a military base (and was therefore handled by a federal judge).
The exhaustive review of the reliability of field sobriety test evidence led the federal court to rule as follows:
Horn has filed a motion in limine to exclude the evidence of his performance on the field sobriety tests, asserting that it is inadmissible under newly revised Fed.R.Evid. 702 and the Daubert/Kumho Tire decisions. The Government has filed an opposition, and Horn has filed a reply. In addition, a two day evidentiary hearing was held, pursuant to Fed.R.Evid. 104(a), on November 19 and 20, 2001, and additional testimonial and documentary evidence was received, which is discussed in detail below. At the conclusion of this hearing, the following ruling was made from the bench, the Court also announcing its intention subsequently to issue a written opinion on this case of first impression:
(1) The results of properly conducted SFSTs may be considered to determine whether probable cause exists to charge a driver with driving while intoxicated (“DWI”) or under the influence of alcohol (“DUI”);
(2) The results of the SFSTs, either individually or collectively, are not admissible for the purpose of proving the specific blood alcohol content (“BAC”) of a driver charged with DWI/DUI;
(3) There is a well-recognized, but by no means exclusive, causal connection between the ingestion of alcohol and the detectable presence of exaggerated horizontal gaze nystagmus in a person’s eyes, which may be judicially noticed by the Court pursuant to Fed.R.Evid. 201, proved by expert testimony or otherwise;
(4) A police officer trained and qualified to perform SFSTs may testify with respect to his or her observations of a subject’s performance of these tests, if properly administered, to include the observation of nystagmus, and these observations are admissible as circumstantial evidence that the defendant was driving while intoxicated or under the influence. In so doing, however, the officer may not use value-added descriptive language to characterize the subject’s performance of the SFSTs, such as saying that the subject “failed the test” or “exhibited” a certain number of “standardized clues” during the test;
(5) If the Government introduces evidence that a defendant exhibited nystagmus when the officer performed the horizontal gaze nystagmus test, the defendant may bring out either during cross examination of the prosecution witnesses or by asking the Court to take judicial notice of the fact that there are many causes of nystagmus other than alcohol ingestion; and
(6) If otherwise admissible under Fed.R.Evid. 701, a police officer may give lay opinion testimony that a defendant was driving while intoxicated or under the influence of alcohol. In doing so, however, the officer may not bolster the lay opinion testimony by reference to any scientific, technical or specialized information learned from law enforcement or traffic safety instruction, but must confine his or her testimony to helpful firsthand observations of the defendant.
This federal judge in Horn took two days of scientific evidence from DUI expert witnesses in the Maryland case before ruling on the scientific reliability of the standardized field sobriety tests. Multiple REAL experts in the fields of testing and measurement and medicine were called upon to testify about the statistics yielded in the NHTSA studies.
One of those experts was an expert in the field of “testing and measurement,” Dr. Spurgeon Cole. In his testimony and published writings, the former Clemson University clinical psychology professor was highly critical of the claimed reliability of the SFSTs if used to prove the precise level of a suspect’s alcohol intoxication or impairment. His 1994 article “Field Sobriety Tests: Are They Designed for Failure?” (co-authored with Professor Ron Nowaczyk) published in the journal Perceptual and Motor Skills, analyzed the 1977 Report, the 1981 Final Report, and the 1983 Field Evaluation report published by NHTSA regarding the SFSTs.
The Cole and Nowaczyk study observed the following:
(1) 47% of the subjects tested in the 1977 NHTSA laboratory study who would have been arrested by the testing officers for driving while intoxicated (BAC of 0.10 or greater) actually had BACs below 0.10;
(2) in the 1981 Final Report, 32% of the participants in the lab study were incorrectly judged by the testing officers as having BACs of 0.10 or greater; and
(3) the accepted reliability coefficient for standardized clinical tests is .85 or higher, yet the reliability coefficients for the SFSTs, as reported in the NHTSA studies, ranged from .61 to .72 for the individual tests and .77 for individuals that were tested on two different occasions while dosed to the exact same BAC. More alarmingly, inter-rater reliability rates (where different officers score each subject) ranged from .34 to .60, with an overall median rate of .57.
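The gap between those reported coefficients and the .85 benchmark can be laid out in a short Python sketch. Every number is taken from the Cole and Nowaczyk summary above; the dictionary labels and the function name are made up for this illustration.

```python
ACCEPTED_MINIMUM = 0.85  # accepted reliability coefficient for standardized clinical tests

# Coefficients as reported in the summary above (labels are illustrative only).
reported = {
    "individual tests (low end)": 0.61,
    "individual tests (high end)": 0.72,
    "same subject, two occasions": 0.77,
    "inter-rater (low end)": 0.34,
    "inter-rater (high end)": 0.60,
    "inter-rater (median)": 0.57,
}


def shortfalls(coefficients, minimum=ACCEPTED_MINIMUM):
    """Return how far each coefficient falls below the accepted minimum."""
    return {
        name: round(minimum - value, 2)
        for name, value in coefficients.items()
        if value < minimum
    }


for name, gap in shortfalls(reported).items():
    print(f"{name}: {gap:.2f} below the {ACCEPTED_MINIMUM} benchmark")
```

Not one of the six reported figures reaches the benchmark, and the median inter-rater figure misses it by 0.28.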
In most states, like Georgia, no such scrutiny has ever been given to the field sobriety test “battery.” To the contrary, in Georgia, even where the arresting officer ADMITS doing the field sobriety tests incorrectly, the Georgia Court of Appeals will not uphold the trial court’s exclusion of these “voodoo” evaluations. State v. Pierce, 266 Ga. App. 233, 596 S.E.2d 725 (2004). The most shoddy field sobriety test, even the pseudo-scientific HGN evaluation, is routinely allowed to be heard by a jury for whatever weight and credibility the jury wants to give it. This not only abrogates the prior rulings of the Georgia Court of Appeals (e.g., State v. Pastorini, 222 Ga. App. 316, 474 S.E.2d 122 (1996)), but completely ignores the concept of proper “scientific evidence.”
Compare legal decisions from other states, such as Tennessee in State v. Murphy, 953 S.W.2d 200 (Tenn. Supreme Court, 1997); Ohio in State v. Homan, 89 Ohio St. 3d 421, 732 N.E.2d 952 (2000); and New Mexico in State v. Lasworth, 131 N.M. 739, 42 P.3d 844 (N.M. Ct. App. 2001). These are three opinions written by appellate courts that have no vested interest in seeing that every DUI charge is prosecuted to the maximum degree, regardless of the unscientific, bogus procedures behind the tests.
Numerous other states, including Texas, Alabama and Mississippi, do not permit HGN evidence to be admitted at trial. Other states admit it ONLY if an expert lays a proper foundation showing that this psycho-physical field sobriety test was done correctly, and following good scientific procedures. State v. Murphy, 953 S.W.2d 200 (Tenn. 1997). Skilled DUI attorneys who specialize in DUI defense call these the “NHTSA SFSTs.”
How should people faced with a DUI investigation cope with the inexplicable differences in judicial interpretation of the fairness of field sobriety testing? NEVER take any roadside evaluations, since these are 100% optional and voluntary. Simple solution: JUST SAY NO. Any knowledgeable DUI lawyer will tell anyone who listens that taking bogus field sobriety tests that can lead to both your arrest and possible conviction is insane. If you submitted to sobriety tests, and are facing trial, seek out the best DUI lawyer you can, and let him or her bring in a DUI expert witness to educate your jury.
Important Scientific Analysis and Articles Reviewing the Reliability of Standardized Field Sobriety Tests
Statistical Evaluation of Standardized Field Sobriety Tests, by Hlastala, Polissar & Oberman, Journal of Forensic Science, vol. 50, issue 3, 2005
Standardized Field Sobriety Tests (SFSTs) are used as qualitative indicators of impairment by alcohol in individuals suspected of DUI. Stuster and Burns authored a report on this testing and presented the SFSTs as being 91% accurate in predicting Blood Alcohol Concentration (BAC) as lying at or above 0.08%. Their conclusions regarding accuracy are heavily weighted by the large number of subjects with very high BAC levels. This present study re-analyzes the original data with a more complete statistical evaluation. Our evaluation indicates that the accuracy of the SFSTs depends on the BAC level and is much poorer than that indicated by Stuster and Burns. While the SFSTs may be usable for evaluating suspects for BAC, the means of evaluation must be significantly modified to represent the large degree of variability of BAC in relation to SFST test scores. The tests are likely to be mainly useful in identifying subjects with a BAC substantially greater than 0.08%. Given the moderate to high correlation of the tests with BAC, there is potential for improved application of the test after further development, including a more diverse sample of BAC levels, adjustment of the scoring system and a statistically-based method for using the SFST to predict a BAC greater than 0.08%.
Steven J. Rubenzer, The Standardized Field Sobriety Tests: A Review of Scientific and Legal Issues, Law and Human Behavior, American Psychology, vol. 32, issue 4, August 2008
“[T]he research that supports their (field sobriety test) use is limited, important confounding variables have not been thoroughly studied, reliability is mediocre, and their developers and prosecution-oriented publications have oversold the tests.”
“The theory that alcohol affects SFST performance is clearly subject to falsification if BAC is the primary criterion, and there are numerous studies that correlate SFST performance and BAC level. The proposition that SFSTs are related to driving impairment is also falsifiable but more difficult to test. Whereas impairment on a closed driving course might readily be correlated with SFST performance, some significant performance deficits occur only in response to rare events or in interaction with other vehicles or drivers (e.g., road rage). The theory that SFST performance is related to driving performance is falsifiable, but as yet untested.”
Cole and Nowaczyk, Field Sobriety Tests: Are They Designed for Failure? Perceptual and Motor Skills, vol. 79, August 1994
Field sobriety tests have been used by law enforcement officers to identify alcohol-impaired drivers. Yet in 1981 Tharp, Burns, and Moskowitz found that 32% of individuals in a laboratory setting who were judged to have an alcohol level above the legal limit actually were below the level. In the 1977 study, the number was 47% improperly arrested. In the Cole and Nowaczyk study, two groups of seven law enforcement officers averaging over 12 years of experience in DUI arrests each viewed videotapes of 21 sober individuals attempting to perform a variety of field sobriety tests or normal-abilities tests, e.g., reciting one’s address and phone number or walking in a normal manner. Officers judged a significantly larger number of the individuals as impaired when they performed the field sobriety tests than when they performed the normal-abilities tests. The need to reevaluate the predictive validity of field sobriety tests is discussed.
The Horizontal Gaze Nystagmus Test: Fraudulent Science in the American Courts, J.L. Booker, vol. 44, No. 3, p. 133-139 (2004) Science and Justice: The Journal of Forensic Science Society
Bypassing the usual scientific review process and touted through the good offices of the federal agency responsible for traffic safety, it was rushed into use as a law enforcement procedure, and was soon adopted and protected from scientific criticism by courts throughout the United States. In fact, research findings, training manuals and other relevant documents were often held as secrets by the state. Still, the protective certification of its practitioners and the immunity afforded by judicial notice failed to silence all the critics of this deeply flawed procedure. Responding to criticism, the sponsors of the test traveled the path documented in this paper that led from mere (if that word can ever truly apply to a matter of such gravity) carelessness in research through self-serving puffery and finally into deliberate fraud – always at the expense of the citizen accused.
In 1998 the integrity of the statistical evaluation of the original research upon which the validity of the tests rested was unfavorably reviewed. In 2001 new research indicated that the Horizontal Gaze Nystagmus (HGN), the cornerstone of the test battery was fundamentally flawed and that the HGN test was improperly conducted by more than 95% of the police officers who used it to examine drivers suspected of driving while intoxicated (DWI). This summary critique demonstrates that it is scientifically meretricious and that the United States Department of Transportation indulged in deliberate fraud in order to mislead the law enforcement and legal communities into believing the test was scientifically meritorious and overvaluing its worth in the context of criminal evidence…
Other Booker publications:
Booker, J.L., The Field Test Paradox – Voice for the Defense, 1996, vol. 25, pages 8-10.
Booker, J.L., The Application of the ‘Known and Potential Rate of Error’ Criterion to the Standardized Battery of Field Sobriety Tests – Voice for the Defense, 1998, vol. 27, pages 24-27.
Booker, J.L., End Position Nystagmus as an Indicator of Ethanol Intoxication – Science and Justice, 2001, vol. 41, pages 113-116.
Horizontal Gaze Nystagmus: A Review of Vision Science and Application Issues, Steven J. Rubenzer & Scott Stevenson, Journal of Forensic Sciences, vol. 55, issue 2, pages 394-409 (2010)
Summary of the findings:
The Horizontal Gaze Nystagmus (HGN) test is one component of the Standardized Field Sobriety Test battery. This article reviews the literature on smooth pursuit eye movement and gaze nystagmus with a focus on normative responses, the influence of alcohol on these behaviors, and stimulus conditions similar to those used in the HGN sobriety test. Factors such as age, stimulus and background conditions, medical conditions, prescription medications, and psychiatric disorder were found to affect the smooth pursuit phase of HGN. Much less literature is available for gaze nystagmus, but onset of nystagmus may occur in some sober subjects at 45 degrees or less. We conclude that the HGN test used by police is limited by a large variability in the underlying normative behavior. The screening methods used and the inconsistent testing environments that are encountered at the roadside often affect the reliability of reported results. In addition, the original NHTSA-funded studies in 1977 and 1981 suffer from a lack of rigorous validation in their laboratory settings.
From Dr. Greg Kane, MD at his website: www.sfst.us
Simple, but it doesn’t work. This is one of those times the simple and obvious answer is wrong. It turns out:
- There are two different kinds of medical tests: direct and indirect.
- Each kind of test has its own formula for accuracy. If you mix formulas, if you use the wrong formula for your kind of test, the answer you get will be wrong.
- For the SFST, NHTSA uses the wrong formula.
- Using the wrong formula, the accuracy NHTSA calculates for the SFST is spectacularly wrong.
Dr. Kane, on his website, explains NHTSA’s “funny math” with a highly understandable grid and breakdown of what “accuracies” really mean:
Accuracy is more complicated than you think
Do the SFST 100 times and you’ll get the correct answer ACCURACY percent of the time. That’s what NHTSA teaches DUI officers. But it doesn’t work. Accuracy is more complicated than common sense makes you think.
Here’s a Field Sobriety Test accuracy table from NHTSA’s original “scientific” FST validation project, Psychophysical Tests for DWI Arrest 1977. I’ve changed the labels to make the thing easier to read; you can get the original at PDFs. In this project each person tested had two measurements made: blood alcohol and a Field Sobriety Test. The question is, “what percent of the time did the FST measurements correctly predict the alcohol measurements?” The answer to that question will be the accuracy of the SFST.
The table sorts people by test result. Look under the pink label FST coordination test. People who failed the FST were counted in the Fail ↓ column. People who passed were counted in the Pass ↓ column. Over on the side, people whose measured alcohol was high went in the ALCOHOL high → row. People whose alcohol was low went in the ALCOHOL low → row. Tables like this set up True Positive, True Negative, False Positive and False Negative results in a way that makes it easy to answer important questions about this FST.
When people were guilty, how accurate was this FST? Look at the row ALCOHOL high→. Follow the red arrow across to the “% Correct Decisions” column. See the red circle around 84? In this study when people had a high alcohol level, the FST gave the correct answer 84% of the time. When people were guilty, the accuracy of this FST was 84%.
When this FST said people were guilty, how accurate was that prediction? Look at the column Fail ↓. Follow the orange arrow down to the “% Correct Decisions” row. See the orange circle around 53? When people failed this FST, the test was correct only 53% of the time. When this test said people were guilty, the accuracy of the test was 53%—a coin toss.
Wait, wait, wait! Those two accuracies are both about people who were guilty. How come they’re different— 84%, 53%? The answer is, the two accuracies answer questions that are subtly different. One is about people who are guilty. The other is about people the test says are guilty. Those groups are subtly different. They count different groups of people. So the accuracies are different. And notice that although the difference in what groups count is subtle, the difference in accuracy—84% vs. a coin toss—is dramatic.
When people were innocent, how accurate was this FST? Blue circle, 73%.
When this FST said people were innocent, how accurate was that prediction? Green circle, 93%.
How often did this FST give the correct answer? Pink circle, 76%.
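All five circled figures come from the same four cells of a pass/fail table. They differ only in which row or column total goes in the denominator. The short Python sketch below illustrates that arithmetic. The four counts in it are hypothetical, chosen only so the results round to the percentages quoted above; they are not the actual cell counts from the 1977 report. The function name fst_accuracies is likewise invented for this illustration.

```python
def fst_accuracies(tp, fn, fp, tn):
    """Return five 'accuracy' figures for a 2x2 test table, as whole percents.

    tp: alcohol high, failed the FST (true positive)
    fn: alcohol high, passed the FST (false negative)
    fp: alcohol low,  failed the FST (false positive)
    tn: alcohol low,  passed the FST (true negative)
    """
    def pct(part, whole):
        return round(100 * part / whole)

    return {
        # Row "ALCOHOL high": of the people who were guilty, how many failed?
        "people were guilty": pct(tp, tp + fn),
        # Column "Fail": of the people who failed, how many were guilty?
        "test said guilty": pct(tp, tp + fp),
        # Row "ALCOHOL low": of the people who were innocent, how many passed?
        "people were innocent": pct(tn, tn + fp),
        # Column "Pass": of the people who passed, how many were innocent?
        "test said innocent": pct(tn, tn + fn),
        # Whole table: how often did the test match the alcohol measurement?
        "overall": pct(tp + tn, tp + fn + fp + tn),
    }


# HYPOTHETICAL counts, picked only to reproduce the quoted percentages.
# These are NOT the real numbers from the 1977 study.
table = fst_accuracies(tp=84, fn=16, fp=75, tn=201)

for question, percent in table.items():
    print(f"{question}: {percent}%")
```

With these illustrative counts the five results round to 84, 53, 73, 93 and 76. Doubling only the low-alcohol counts (fp and tn) leaves the 84% untouched but pulls the "test said guilty" figure down, which is why the two numbers cannot be used interchangeably.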
Dr. Kane explains the “incorrect math” used by NHTSA on this page: http://sfst.us/Accuracy.html