<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7197809943968080149</id><updated>2011-12-05T10:12:09.762-08:00</updated><category term='“In Classroom of Future'/><category term='lee cronbach'/><category term='directors of research evaluation'/><category term='ARRA'/><category term='data systems'/><category term='FERA'/><category term='RFP'/><category term='productive collaborations'/><category term='focusing evaluations'/><category term='program effectiveness'/><category term='Roberto and Brad'/><category term='technology infrastructure'/><category term='AYP'/><category term='rigorous research'/><category term='INSTITUTE OF EDUCATION SCIENCES'/><category term='SIIA'/><category term='&quot;Enhancing Education Through Technology&quot;'/><category term='Mark Lipsey'/><category term='ISTE'/><category term='technology curricula'/><category term='EXPERIMENTAL DESIGNS'/><category term='statistical analysis'/><category term='rational economic model'/><category term='RELEVANCE'/><category term='EETT'/><category term='Data Driven School Improvement'/><category term='dissemination model'/><category term='Experimental control'/><category term='incentive projects'/><category term='Bob Slavin'/><category term='rigorous evidence'/><category term='&quot;new york times&quot;'/><category term='smaller scale evaluations'/><category term='social policy'/><category term='Education Week'/><category term='best evidence encyclopedia'/><category term='empirical education'/><category term='Software and Information Industry Association'/><category term='educational decision-makers'/><category term='Dr. 
Chris Dede'/><category term='Compliance anxiety'/><category term='product evaluation'/><category term='ed-tech'/><category term='multi-year longitudinal data'/><category term='D3M'/><category term='R and E'/><category term='education technology'/><category term='RIGOR'/><category term='value-added modeling'/><category term='FERPA'/><category term='program implementations'/><category term='Stagnant Scores”'/><category term='VAM'/><category term='scientifically-based evidence'/><category term='data-driven decision making'/><category term='experimental program'/><category term='funding-achievement puzzle'/><category term='Randomized control'/><category term='product effectiveness research'/><category term='local relevance'/><category term='“Grading the Digital School”'/><category term='academic growth'/><category term='Arne Duncan'/><category term='REL'/><category term='value-added” analyses'/><category term='local research'/><category term='scientific evidence'/><category term='innovation'/><category term='&quot;Ensure transparency and accountability&quot;'/><category term='statistical techniques'/><category term='IES'/><category term='Barack Obama'/><category term='data-driven decision-making research'/><category term='Miller-McKeon'/><category term='local program evaluations'/><category term='evidence-based policy decisions'/><category term='John Merrow'/><category term='Scientifically Based'/><category term='&quot;Matt Richtel&quot;'/><category term='SELECTION BIAS'/><category term='local evaluation'/><category term='American education policy'/><category term='program implementation'/><category term='educational reform'/><category term='state education agencies'/><category term='credible research'/><category term='NCLB'/><category term='relevant'/><category term='professional development programs'/><category term='evaluation of state and local education programs and policies'/><category term='what works Clearinghouse'/><category term='statistical 
calculations'/><category term='continuous improvement'/><category term='i3'/><category term='Office of Educational Technology'/><category term='scientific research'/><category term='Margaret Honey'/><category term='Ellen Mandinach'/><category term='Data warehouses'/><category term='Lugar-Bingaman'/><category term='OMB'/><category term='Regional Education Labs'/><category term='Title IID'/><category term='Brookings Institution'/><category term='Citizens for Responsibility and Ethics in Washington (CREW)'/><category term='Redrock Reports'/><category term='Kathleen Kennedy Manzo'/><category term='methodological rigor'/><category term='methodological standards'/><category term='John Easton'/><category term='rigorous program evaluation'/><category term='OEEI'/><category term='Jim Goodnight'/><category term='random assignment experiments'/><category term='usability'/><category term='rigorous evaluations'/><category term='Department of Education'/><category term='H.S. Bloom'/><category term='&quot;measurable value of technology&quot;'/><category term='Regional Educaion Labs'/><category term='generalization'/><category term='cross-validation model'/><category term='educational innovation'/><category term='research organization'/><category term='modal effect'/><category term='generalizability'/><category term='RANDOMIZATION'/><category term='randomized experiments'/><category term='effectiveness'/><category term='reauthorization of ESEA'/><category term='Stimulus funds'/><category term='Consortium on Chicago School Research'/><category term='Reading First program'/><category term='statistical technologies'/><category term='achievement gaps'/><category term='Nina Rees'/><category term='SETDA'/><category term='new administration'/><category term='RANDOMIZING UNITS'/><category term='school technology'/><category term='Blueprint for Reform'/><category term='Keith Krueger'/><category term='CoSN'/><category term='Miller-McKeon Draft'/><category term='Economic Policy 
Institute'/><category term='reform agenda'/><category term='Debra Viadero'/><category term='Jim  Shelton'/><category term='Peter R. Orszag'/><category term='The Experimenting Society'/><category term='stimulus plan'/><category term='Senators Lugar and Bingaman'/><category term='Hanushek'/><category term='merit pay'/><category term='Denis Newman'/><category term='Scientifically Based Research'/><category term='INTERPRETING RESULTS'/><category term='statistical technique'/><category term='DEFINING RIGOR'/><category term='VAM calculations'/><title type='text'>Empirical Education</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>34</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1785119190778573344</id><published>2011-12-05T09:55:00.000-08:00</published><updated>2011-12-05T10:12:09.798-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scientifically Based Research'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='SIIA'/><category 
scheme='http://www.blogger.com/atom/ns#' term='product effectiveness research'/><category scheme='http://www.blogger.com/atom/ns#' term='Software and Information Industry Association'/><category scheme='http://www.blogger.com/atom/ns#' term='Bob Slavin'/><title type='text'>Need for Product Evaluations Continues to Grow</title><content type='html'>&lt;!--[if gte mso 9]&gt;&lt;xml&gt;  &lt;w:latentstyles deflockedstate="false" defunhidewhenused="true" defsemihidden="true" defqformat="false" defpriority="99" latentstylecount="267"&gt;   &lt;w:lsdexception locked="false" priority="0" semihidden="false" unhidewhenused="false" qformat="true" name="Normal"&gt;   &lt;w:lsdexception locked="false" priority="9" semihidden="false" unhidewhenused="false" qformat="true" name="heading 1"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 2"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 3"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 4"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 5"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 6"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 7"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 8"&gt;   &lt;w:lsdexception locked="false" priority="9" qformat="true" name="heading 9"&gt;   
&lt;w:lsdexception locked="false" priority="39" name="toc 1"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 2"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 3"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 4"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 5"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 6"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 7"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 8"&gt;   &lt;w:lsdexception locked="false" priority="39" name="toc 9"&gt;   &lt;w:lsdexception locked="false" priority="35" qformat="true" name="caption"&gt;   &lt;w:lsdexception locked="false" priority="10" semihidden="false" unhidewhenused="false" qformat="true" name="Title"&gt;   &lt;w:lsdexception locked="false" priority="1" name="Default Paragraph Font"&gt;   &lt;w:lsdexception locked="false" priority="11" semihidden="false" unhidewhenused="false" qformat="true" name="Subtitle"&gt;   &lt;w:lsdexception locked="false" priority="22" semihidden="false" unhidewhenused="false" qformat="true" name="Strong"&gt;   &lt;w:lsdexception locked="false" priority="20" semihidden="false" unhidewhenused="false" qformat="true" name="Emphasis"&gt;   &lt;w:lsdexception locked="false" priority="59" semihidden="false" unhidewhenused="false" name="Table Grid"&gt;   &lt;w:lsdexception locked="false" unhidewhenused="false" name="Placeholder Text"&gt;   &lt;w:lsdexception locked="false" priority="1" semihidden="false" unhidewhenused="false" qformat="true" name="No Spacing"&gt;   &lt;w:lsdexception locked="false" priority="60" semihidden="false" unhidewhenused="false" name="Light Shading"&gt;   &lt;w:lsdexception locked="false" priority="61" semihidden="false" unhidewhenused="false" name="Light List"&gt;   &lt;w:lsdexception locked="false" priority="62" semihidden="false" unhidewhenused="false" name="Light Grid"&gt;   &lt;w:lsdexception locked="false" priority="63" 
&lt;/w:LatentStyles&gt; &lt;/xml&gt;&lt;![endif]--&gt;There is a growing need for evidence 
of the effectiveness of products and services being sold to schools.  A new release of SIIA’s product evaluation guidelines is now available at the &lt;a style="color: rgb(51, 51, 255);" href="https://www.sellingtoschools.com/products/product-evaluation-research-guidelines-publishers-developers"&gt;Selling to Schools website&lt;/a&gt; (with continued free access to SIIA members), to help guide publishers in measuring the effectiveness of the tools they are selling to schools.   &lt;p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;line-height: normal"&gt;&lt;span style="font-size:100%;"&gt;&lt;/span&gt;&lt;/p&gt;It’s been almost a decade since NCLB made its call for “scientifically-based research” (SBR), but the calls for research haven’t faded away.  This is because resources available to schools have diminished over that time, heightening the importance of cost-benefit trade-offs in spending.&lt;br /&gt;NCLB has focused attention on test score achievement, and this metric is becoming more pervasive, e.g., through a tie to teacher evaluation and through linkages to dropout risk.  While NCLB fostered a compliance mentality (product specs had to have a check mark next to SBR), the need to ensure that funds are not wasted is now leading to a greater interest in research results.  Decision-makers are now very interested in whether specific products will be effective, or how well they have been working, in their districts.&lt;br /&gt;&lt;br /&gt;Fortunately, the data available for evaluations of all kinds is getting better and easier to access.  The US Department of Education has poured hundreds of millions of dollars into state data systems.  These investments make data available to states and drive the cleaning and standardizing of data from districts. 
At the same time, districts continue to invest in data systems and warehouses.  While still not a trivial task, getting the data needed to determine whether an investment paid off, in terms of increased student achievement or attendance, has become much easier for school district researchers over the last decade.&lt;br /&gt;&lt;br /&gt;The reauthorization of ESEA (i.e., NCLB) is maintaining the pressure to evaluate education products.  We are still a long way from the draft reauthorization introduced in Congress becoming law, but the initial indications are quite favorable to the continued production of product effectiveness evidence.  The language has changed somewhat.  Look for the phrase “evidence-based.”  Along with the term “scientifically-valid,” this new language is actually more sophisticated and potentially more effective than the old SBR neologism.  Bob Slavin, one of the reviewers of the SIIA guidelines, says in his &lt;a style="color: rgb(51, 51, 255);" href="http://blogs.edweek.org/edweek/sputnik/2011/10/evidence_of_evidence_in_senate_esea.html"&gt;Ed Week blog&lt;/a&gt;  that “This is not the squishy ‘based on scientifically-based evidence’ of NCLB. This is the real McCoy.”  It is notable that the definition of “evidence-based” goes beyond just setting rules for the design of research, such as the SBR focus on the single dimension of “internal validity,” for which randomization gets the top rating.  It now also asks how generalizable the research is, that is, its “external validity”: does it have any relevance for decision-makers?&lt;br /&gt;&lt;br /&gt;One of the important goals of the SIIA guidelines for product effectiveness research is to improve the credibility of publisher-sponsored research.  It is important that educators see it as more than just “market research” producing biased results.  In this era of reduced budgets, schools need to have tangible evidence of the value of products they buy.  
By following the SIIA’s guidelines, publishers will find it easier to achieve that credibility.&lt;br /&gt;&lt;br /&gt;&lt;p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;line-height: normal"&gt;&lt;span style="font-size:12.0pt;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:0in;margin-bottom:.0001pt;line-height: normal"&gt;&lt;span style="font-size:12.0pt;"&gt; &lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1785119190778573344?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1785119190778573344/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2011/12/need-for-product-evaluations-continues.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1785119190778573344'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1785119190778573344'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2011/12/need-for-product-evaluations-continues.html' title='Need for Product Evaluations Continues to Grow'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-841096551109723200</id><published>2011-09-08T15:14:00.000-07:00</published><updated>2011-09-09T14:44:08.975-07:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Stagnant Scores”'/><category scheme='http://www.blogger.com/atom/ns#' term='EETT'/><category scheme='http://www.blogger.com/atom/ns#' term='“In Classroom of Future'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;Matt Richtel&quot;'/><category scheme='http://www.blogger.com/atom/ns#' term='“Grading the Digital School”'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;new york times&quot;'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;measurable value of technology&quot;'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;Enhancing Education Through Technology&quot;'/><title type='text'>Comment on the NY Times: “In Classroom of Future, Stagnant Scores”</title><content type='html'>The New York Times is running a series of front-page articles on &lt;a href="http://www.nytimes.com/2011/09/04/technology/technology-in-schools-faces-questions-on-value.html?_r=2&amp;amp;scp=1&amp;amp;sq=%22digital%20school%22&amp;amp;st=cse" target="_blank"&gt;“Grading the Digital School.”&lt;/a&gt;  The first one ran Labor Day weekend and raised the question as to  whether there’s any evidence that would persuade a school board or  community to allocate extra funds for technology. With the demise of the  Enhancing Education Through Technology (EETT) program, federal funds  dedicated to technology will no longer be flowing into states and  districts. Technology will have to be measured against any other  discretionary purchase. The resulting internal debates within schools and their communities about the expense vs. 
value  of technology promise to have interesting implications and are worth  following closely.&lt;br /&gt;&lt;p&gt;The first article by Matt Richtel revisits a debate that has been  going on for decades between those who see technology as the key to  “21st Century learning” and those who point to the dearth of evidence  that technology makes any measurable difference to learning. It’s time  to try to reframe this discussion in terms of what can be measured. And  in considering what to measure, and in honor of Labor Day, we raise a  question that is often ignored: what role do teachers play in  generating the measurable value of technology?&lt;/p&gt;        &lt;p&gt;Let’s start with the most common argument in favor of technology,  even in the absence of test score gains. The idea is that technology  teaches skills “needed in a modern economy,” and these are not measured  by the test scores used by state and federal accountability systems.  Karen Cator, director of the US Department of Education’s Office of  Educational Technology, is quoted as saying (in reference to the lack of improvement in test scores), “...look at all the other  things students are doing: learning to use the Internet to research,  learning to organize their work, learning to use professional writing  tools, learning to collaborate with others.” Presumably, none of these  things directly impact test scores. The problem with this perennial  argument is that many other things that schools keep track of should provide indicators of improvement. If, as a result of technology,  students are more excited about learning or more engaged in  collaborating, we could look for an improvement in attendance, a  decrease in dropouts, or students signing up for more challenging  courses.&lt;/p&gt;        &lt;p&gt;Information on student behavioral indicators has become easier to  obtain since the standardization of state data systems. 
There are some  basic study designs that use comparisons among students within the  district or between those in the district and those elsewhere in the  state. This approach uses statistical modeling to identify trends and  control for demographic differences, but is not beyond the capabilities  of many school district research departments&lt;sup&gt;1&lt;/sup&gt; or the resources available to the technology vendors. (Empirical has  conducted research for many of the major technology providers, often  focusing on results for a single district interested in obtaining  evidence to support local decisions.) Using behavioral or other  indicators, a district such as the one in the Times article can answer its  own questions. Data from the technology systems themselves can be used  to identify users and non-users and to confirm the extent of usage and  implementation. It is also valuable to examine whether some students  (those in most need or those already doing okay) or some teachers  (veterans or novices) receive greater benefit from the technology. This information may help the district  focus resources where they do the most good.&lt;/p&gt;        &lt;p&gt;A final thought about where to look for impacts of technologies comes from a graph of the school district’s budget. While spending on  technology and salaries have both declined over the last three years,  spending on salaries is still about 25 times as great as spending on technologies. Any discussion of where to find an impact of technology  must consider labor costs, which are the district’s primary investment. We might ask whether a small investment in technology would allow the  district to reduce the number of teachers by, for example, allowing a  small increase in the number of students each teacher can productively  handle. Alternatively, we might ask whether technology can make a  teacher more effective, by whatever measures of effective teaching the  district chooses to use, with their current students. 
We might ask whether technologies result in keeping young teachers on  the job longer or encouraging them to take the initiative on more challenging  assignments.&lt;/p&gt;        &lt;p&gt;It may be a mistake to look for a direct impact of technology on  test scores (aside from technologies aimed specifically at that goal),  but it is also a mistake to assume the impact is, in principle, not  measurable. We need a clear picture of how various technologies are  expected to work and where we can look for the direct and indirect  effects. An important role of technology in the modern economy is providing people with actionable evidence. It would be ironic if  education technology were inherently opaque to educational decision  makers.&lt;br /&gt;— DN&lt;/p&gt;        &lt;p&gt;&lt;span class="reference"&gt;&lt;span style="font-size:78%;"&gt;&lt;sup id="footnote1"&gt;1&lt;/sup&gt; &lt;/span&gt;&lt;span style="font-size:78%;"&gt;Or, we would  hope, the New York Times. Sadly, the article provides a graph of trends  in math and reading for the district highlighted in the story compared  to trends for the state. The graphic is meant to show that the district  is doing worse than the state average. 
But the article never suggests  that we should consider the population of the particular district and  whether it is doing better or worse than one would expect, controlling  for demographics, available resources, and other characteristics.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-841096551109723200?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/841096551109723200/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2011/09/comment-on-times-in-classroom-of-future.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/841096551109723200'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/841096551109723200'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2011/09/comment-on-times-in-classroom-of-future.html' title='Comment on the NY Times: “In Classroom of Future, Stagnant Scores”'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5480990684403749168</id><published>2011-04-19T15:52:00.000-07:00</published><updated>2011-04-19T15:57:06.185-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='R and E'/><category scheme='http://www.blogger.com/atom/ns#' term='FERPA'/><category scheme='http://www.blogger.com/atom/ns#' 
term='RFP'/><category scheme='http://www.blogger.com/atom/ns#' term='evaluation of state and local education programs and policies'/><category scheme='http://www.blogger.com/atom/ns#' term='local research'/><category scheme='http://www.blogger.com/atom/ns#' term='REL'/><category scheme='http://www.blogger.com/atom/ns#' term='directors of research evaluation'/><category scheme='http://www.blogger.com/atom/ns#' term='Regional Educaion Labs'/><title type='text'>A Conversation About Building State and Local Research Capacity</title><content type='html'>&lt;p&gt;&lt;a href="http://ies.ed.gov/director/biography.asp" target="_blank"&gt;John Q. Easton&lt;/a&gt;,  director of the Institute of Education Sciences (IES), came to New  Orleans recently to participate in the annual meeting of the &lt;a href="http://aera.net/" target="_blank"&gt;American Educational Research Association&lt;/a&gt;.   At one of his stops, he was the featured speaker at a meeting of the  Directors of Research and Evaluation (DRE), an organization composed of  school district research directors. (DRE is affiliated with AERA and was  recently incorporated as a 501(c)(3).)  John started his remarks by  pointing out that for much of his career he was a school district  research director and felt great affinity with the group.  He introduced  the directions that IES was taking, especially how it was approaching  working with school systems. He spent most of the hour fielding  questions and engaging in discussion with the participants.  
Several  interesting points came out of the conversation about roles for the  researchers who work for education agencies.&lt;/p&gt;        &lt;blockquote style="font-weight: bold; color: rgb(51, 51, 255);"&gt;&lt;p&gt;“...in parallel to building a research culture in  districts, it will be necessary to build a practitioner culture among  researchers.”&lt;/p&gt;&lt;/blockquote&gt;        &lt;p&gt;Historically, most IES research grant programs have been aimed at  university or other academic researchers.  It is noteworthy that even  in a program for “&lt;a href="http://ies.ed.gov/funding/ncer_rfas/stateandlocal.asp" target="_blank"&gt;Evaluation of State and Local Education Programs and Policies&lt;/a&gt;,”  grants have been awarded only to universities and large research firms.  There is no expectation that researchers working for the state or local  agency would be involved in the research beyond the implementation of  the program.  The  &lt;a href="https://www.fbo.gov/index?s=opportunity&amp;amp;mode=form&amp;amp;id=45e0c97bbd9b6ac09b05df2b59428f10&amp;amp;tab=core" target="_blank"&gt;RFP&lt;/a&gt;  for the next generation of Regional Education Labs (REL) contracts may  help to change that. The new RFP expects the RELs to work closely with  education agencies to define their research questions and to assist  alliances of state and local agencies in developing their own research  capacity.&lt;/p&gt;    &lt;p&gt;Members of the audience noted that, as district directors of  research, they often spend more time reviewing research proposals from  students and professors at local colleges who want to conduct research  in their schools than actually answering questions initiated by  the district.  Funded researchers treat the districts as the “human  subjects,” paying incentives to participants and sometimes paying for  data services. 
But the districts seldom participate in defining the  research topic, conducting the studies, or benefiting directly from the  reported findings.  The new mission of the RELs to build local capacity  will be a major shift. &lt;/p&gt;    &lt;p&gt;Some in the audience pointed out reasons to be skeptical that this  REL agenda would be possible.  How can we build capacity if research  and evaluation departments across the country are being cut?  In fact,  very little is known about the number of state or district practitioners  whose capacity for research and evaluation could be built by applying  the REL resources. (Perhaps a good first research task for the RELs  would be to conduct a national survey to measure the existing capacity.)&lt;/p&gt;    &lt;p&gt;John made a good point in reply: IES and the RELs have to work  with the district leadership, not just the R&amp;amp;E departments, to make  this work.  The leadership has to have a more analytic view.  They need  to see the value of having an R&amp;amp;E department that goes beyond test  administration and is able to obtain evidence to support local  decisions. By cultivating a research culture in the district, evaluation  could be routinely built into new program implementations from the  beginning. The value of the research would be demonstrated in the  improvements resulting from informed decisions. Without a district  leadership team that values research to find out what works for the  district, internal R&amp;amp;E departments will not be seen as an important  capacity. &lt;/p&gt;    &lt;p&gt;Some in the audience pointed out that in parallel to building a  research culture in districts, it will be necessary to build a  practitioner culture among researchers.  It would be straightforward for  IES to require that research grantees and contractors engage the  district R&amp;amp;E staff in the actual work, not just have them review the research  plan and sign the FERPA agreement.  
Practitioners ultimately hold the  expertise in how the programs and research can be implemented  successfully in the district, thus improving the overall quality and  relevance of the research.&lt;br /&gt;   —DN&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5480990684403749168?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5480990684403749168/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2011/04/conversation-about-building-state-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5480990684403749168'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5480990684403749168'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2011/04/conversation-about-building-state-and.html' title='A Conversation About Building State and Local Research Capacity'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-4991748355312159672</id><published>2011-04-01T10:16:00.000-07:00</published><updated>2011-04-01T11:28:38.023-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='modal effect'/><category scheme='http://www.blogger.com/atom/ns#' term='social policy'/><category scheme='http://www.blogger.com/atom/ns#' term='lee cronbach'/><category 
scheme='http://www.blogger.com/atom/ns#' term='generalizability'/><category scheme='http://www.blogger.com/atom/ns#' term='i3'/><category scheme='http://www.blogger.com/atom/ns#' term='generalization'/><category scheme='http://www.blogger.com/atom/ns#' term='local evaluation'/><title type='text'>Looking Back 35 Years to Learn about Local Experiments</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://empiricaleducation.com/images/cronbach.jpg"&gt;&lt;img style="float: left; margin: 0pt 10px 10px 0pt; cursor: pointer; width: 233px; height: 159px;" src="http://empiricaleducation.com/images/cronbach.jpg" alt="" border="0" /&gt;&lt;/a&gt;With the growing interest among federal agencies in building local capacity for research, we took another look at an &lt;a href="http://psycnet.apa.org/journals/amp/30/2/116/"&gt;article by Lee Cronbach published in 1975&lt;/a&gt;&lt;span style="font-size:78%;"&gt;&lt;span style="text-decoration: underline;"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;a href="https://mail.google.com/a/empiricaleducation.com/?ui=2&amp;amp;view=bsp&amp;amp;ver=ohhl4rw8mbn4#12f11f39ddce3cbe__ftn1" name="12f11f39ddce3cbe__ftnref1" title=""&gt;&lt;/a&gt;. We found it has a lot to say about conducting local experiments and implications for generalizability.&lt;span&gt;  &lt;/span&gt;Cronbach worked for much of his career at Empirical’s neighbor, Stanford University, and his work has had a direct and indirect influence on our thinking.&lt;span&gt;  &lt;/span&gt;Some may interpret Cronbach’s work as stating that randomized trials of educational interventions have no value because of the complexity of interactions between subjects, contexts, and the experimental treatment. In any particular context, these interactions are infinitely complex, forming a “hall of mirrors” (as he famously put it, p. 
119), making experimental results, which at most can address a small number of lower-order interactions, irrelevant.&lt;span&gt;  &lt;/span&gt;We don’t read it that way.&lt;span&gt;  &lt;/span&gt;Rather, we see powerful insights as well as cautions for conducting the kinds of field experiments that are beginning to show promise for providing educators with useful evidence.&lt;span&gt;  &lt;/span&gt;      &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;We presented these ideas at the &lt;a href="http://www.sree.org/conferences/2011/program/"&gt;Society for Research on Educational Effectiveness conference&lt;/a&gt; in March, building the presentation around a set of memorable quotes from the 1975 article.&lt;span&gt;  &lt;/span&gt;Here we highlight some of the main ideas.&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;Quote #1: &lt;i&gt;“When we give proper weight to local conditions, any generalization is a working hypothesis, not a conclusion…positive results obtained with a new procedure for early education in one community warrant another community trying it. But instead of trusting that those results generalize, the next community needs its own local evaluation”&lt;/i&gt; (p. 125).&lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;Practitioners are making decisions for their local jurisdiction.&lt;span&gt;  &lt;/span&gt;An experiment conducted elsewhere (including one conducted over many locales, with the results averaged) provides a useful starting point, but not “proof” that it will or will not work in the same way locally. 
Experiments give us a working hypothesis concerning an effect, but it has to be tested against local conditions at the appropriate scale of implementation.&lt;span&gt;  &lt;/span&gt;This brings to mind California’s experience with class size reduction following the famous experiment in Tennessee, and how the working hypothesis corroborated through the experiment did not transfer to a different context.&lt;span&gt;  &lt;/span&gt;We also see applicability of Cronbach’s ideas in the Investing in Innovation (i3) program, where initial evidence is being taken as a warrant to scale up an intervention, but where the grants included funding for research under new conditions, in which implementation may head in unanticipated directions and lead to new effects. &lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;Quote #2:&lt;span&gt;  &lt;/span&gt;&lt;i&gt;“Instead of making generalization the ruling consideration in our research, I suggest that we reverse our priorities. An observer collecting data in one particular situation… will give attention to whatever variables were controlled, but he will give equally careful attention to uncontrolled conditions .… As results accumulate, a person who seeks understanding will do his best to trace how the uncontrolled factors could have caused local departures from the modal effect. That is, generalization comes late, and the exception is taken as seriously as the rule”&lt;/i&gt; (pp. 124-125).&lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;Finding or even seeking out conditions that lead to variation in the treatment effect facilitates external validity, as we build an account of the variation. The fact that an estimate of average impact is not robust across conditions should not be seen as a threat to generalizability. 
We should spend some time looking at the ways that the intervention interacts differently with local characteristics, in order to determine which factors account for heterogeneity in the impact and which ones do not.&lt;span&gt;  &lt;/span&gt;Though this activity is exploratory and not necessarily anticipated in the design, it provides the basis for understanding how the treatment plays out, and why its effect may not be constant across settings. Over time, generalizations can emerge, as we compile an account of the different ways in which the treatment is realized and the conditions that suppress or accentuate its effects.&lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;Quote #3: &lt;span&gt; &lt;/span&gt;&lt;i&gt;“Generalizations decay”&lt;/i&gt; (p. 122). &lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;"&gt;In the social policy arena, and especially with the rapid development of technologies, we can’t expect interventions to stay constant.&lt;span&gt;  &lt;/span&gt;And we certainly can’t expect the contexts of implementation to be the same over many years.&lt;span&gt;  &lt;/span&gt;The call for quicker turnaround in our studies is therefore necessary, not just because decision-makers need to act, but because any finding may have a short shelf life.&lt;span&gt;  &lt;/span&gt;&lt;span&gt;  &lt;/span&gt;- AJ &amp;amp; DN&lt;/p&gt;  &lt;div&gt;&lt;br /&gt;&lt;hr align="left"  width="33%" style="font-size:78%;"&gt;    &lt;div&gt;  &lt;p&gt;&lt;span style="font-size:78%;"&gt;&lt;a href="https://mail.google.com/a/empiricaleducation.com/?ui=2&amp;amp;view=bsp&amp;amp;ver=ohhl4rw8mbn4#12f11f39ddce3cbe__ftnref1" name="12f11f39ddce3cbe__ftn1" title=""&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="line-height: 115%;"&gt;[1]&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;/span&gt; Cronbach, L. J. (1975). Beyond the two disciplines of scientific psychology. 
&lt;i&gt;American Psychologist&lt;/i&gt;, 30(2), 116-127.&lt;/p&gt;  &lt;/div&gt;  &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-4991748355312159672?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/4991748355312159672/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2011/04/looking-back-35-years-to-learn-about.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4991748355312159672'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4991748355312159672'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2011/04/looking-back-35-years-to-learn-about.html' title='Looking Back 35 Years to Learn about Local Experiments'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-2613536234118970541</id><published>2010-11-30T11:46:00.000-08:00</published><updated>2010-11-30T11:52:48.860-08:00</updated><title type='text'>Recognizing Success</title><content type='html'>When the Obama-Duncan administration approaches teacher evaluation, the emphasis is on recognizing success.  
We heard that cle&lt;span style="font-family: georgia;"&gt;arly in Arne Duncan’s comments on the release of teacher value-added modeling (VAM) data for LA Unified by the &lt;/span&gt;&lt;span style="font-style: italic; font-family: georgia;"&gt;LA Times&lt;/span&gt;&lt;span style="font-family: georgia;"&gt;.  He’s quoted as saying, "What's there to hide? In education, we've been scared to talk about success."  Since VAM is often thought of as a method for weedin&lt;/span&gt;g out low performing teachers, Duncan’s statement referencing success casts the use of VAM in a more positive light.  Therefore we want to raise the issue here: how do you know when you’ve found success?  The general belief is that you’ll recognize it when you see it.  But sorting through a multitude of variables is not a straightforward process, and that’s where research methods and statistical techniques can be useful. Below we illustrate how this plays out in teacher and in program evaluation.&lt;br /&gt;&lt;br /&gt;As we report in our &lt;a href="http://empiricaleducation.com/index.php#nov_17_10"&gt;news story&lt;/a&gt;, Empirical is participating in the Gates Foundation project called Measures of Effective Teaching (MET). This project is known for its focus on value-added modeling (VAM) of teacher effectiveness.  It is also known for having collected 13,000 hours of video from 3,000 teachers’ classrooms—an astounding accomplishment.  Research partners from many top institutions hope to be able to identify the observable correlates for teachers whose students perform at high levels as well as for teachers whose students do not. (The MET project tested all the students with an “alternative assessment” in addition to using the conventional state achievement tests.)  With this massive sample that includes both data about the students and videos of teachers, researchers can identify classroom practices that are consistently associated with student success. 
Empirical’s role in MET is to build a web-based tool that enables school system decision-makers to make use of the data to improve their own teacher evaluation processes. Thus they will be able to build on what’s been learned when conducting their own mini-studies aimed at improving their local observational evaluation methods.&lt;br /&gt;&lt;br /&gt;When the MET project recently had its “leads” meeting in Washington, DC, the assembled researchers, developers, school administrators, and union leaders were treated to an after-dinner speech and Q&amp;amp;A by Joanne Weiss. Joanne is now Arne Duncan’s chief of staff, after having directed the Race to the Top program (and, before that, having been involved in many Silicon Valley educational innovations).  The approach of the current administration to teacher evaluation, emphasizing that it is about recognizing success, carries over into program evaluation. This attitude was clear in Joanne’s presentation, in which she declared an intention to “shine a light on what is working.”  The approach is part of their thinking about the reauthorization of ESEA, where more flexibility is given to local decision-makers to develop solutions, while the federal legislation is more about establishing achievement goals such as being the leader in college graduation.&lt;br /&gt;&lt;br /&gt;Hand in hand with providing flexibility to find solutions, Joanne also spoke of the need to build “local capacity to identify and scale up effective programs.”  We welcome the idea that school districts will be free to try out good ideas and identify those that work.  This kind of cycle of continuous improvement is very different from the idea, incorporated in NCLB, that researchers will determine what works and disseminate these facts to the practitioners.  
Joanne spoke about continuous improvement in the context of teachers and principals, where on a small scale it may be possible to recognize successful teachers and programs without research methodologies. While a teacher’s perception of student progress in the classroom may be aided by regular assessments, the determination of success seldom calls for research design.  We advocate for a broader scope, and maintain that a cycle of continuous improvement is just as much needed at the district and state levels.  At those levels, we are talking about identifying successful schools or successful programs where research and statistical techniques are needed to direct the light onto what is working.  Building research capacity at the district and state level will be a necessary accompaniment to any plan to highlight successes.  And, of course, research can’t be motivated purely by the desire to document the success of a program.  We have to be equally willing to recognize failure. The administration will have to take seriously the local capacity building to achieve the hoped-for identification and scaling up of successful programs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-2613536234118970541?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/2613536234118970541/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/11/recognizing-success.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2613536234118970541'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2613536234118970541'/><link rel='alternate' type='text/html' 
href='http://empiricaleducation.blogspot.com/2010/11/recognizing-success.html' title='Recognizing Success'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5975193568996814214</id><published>2010-09-08T11:35:00.000-07:00</published><updated>2010-09-08T12:31:11.271-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='value-added modeling'/><category scheme='http://www.blogger.com/atom/ns#' term='productive collaborations'/><category scheme='http://www.blogger.com/atom/ns#' term='VAM'/><category scheme='http://www.blogger.com/atom/ns#' term='VAM calculations'/><category scheme='http://www.blogger.com/atom/ns#' term='academic growth'/><category scheme='http://www.blogger.com/atom/ns#' term='statistical techniques'/><category scheme='http://www.blogger.com/atom/ns#' term='Economic Policy Institute'/><category scheme='http://www.blogger.com/atom/ns#' term='merit pay'/><category scheme='http://www.blogger.com/atom/ns#' term='Arne Duncan'/><title type='text'>2010-2011: The Year of the VAM</title><content type='html'>If you haven’t heard about Value-Added Modeling (VAM) in relation to the controversial teacher ratings in Los Angeles and subsequent brouhaha in the world of education, chances are that you’ll hear about it in the coming year.&lt;br /&gt;&lt;br /&gt;VAM is a family of statistical techniques for estimating the contribution of a teacher or of a school to the academic growth of students. 
Recently, the &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.latimes.com/news/local/la-me-teachers-value-20100815,0,258862,full.story"&gt;LA Times&lt;/a&gt;&lt;/span&gt; obtained the longitudinal test score records for all the elementary school teachers and students in LA Unified and had a RAND economist (working as an independent consultant) run the calculations. The result was a “score” for all LAUSD elementary school teachers. Note that the economist who did the calculations wrote up a &lt;a href="http://www.latimes.com/media/acrobat/2010-08/55538493.pdf"&gt;technical report&lt;/a&gt; on how it was done and the specific questions his research was aimed at answering.&lt;br /&gt;&lt;br /&gt;Reactions to the idea that a teacher could be evaluated using a set of test scores—in this case from the California Standards Test—were swift and divisive. The concept was denounced by the teachers’ union, with the &lt;a href="http://www.latimes.com/news/local/la-me-teachers-react-20100816,0,6701929.story"&gt;local leader calling for a boycott&lt;/a&gt;. Meanwhile, the US Secretary of Education, Arne Duncan, made headlines by commenting favorably on the idea. The &lt;span style="font-style:italic;"&gt;&lt;a href="http://articles.latimes.com/2010/aug/16/local/la-me-0817-teachers-react-20100817"&gt;LA Times&lt;/a&gt;&lt;/span&gt; quotes him as saying “What’s there to hide? In education, we’ve been scared to talk about success.”&lt;br /&gt;&lt;br /&gt;There is a tangle of issues here, along with exaggerations, misunderstandings, and confusion between research techniques and policy decisions. This column will address some of the issues over the coming year. We also plan to announce some of our own contributions to the VAM field in the form of project news.&lt;br /&gt;&lt;br /&gt;The major hot-button issues include appropriate usage (e.g., for part or all of the input to merit pay decisions) and technical failings (e.g., biases in the calculations). 
Of course, these two issues are often linked; for example, many argue that biases may make VAM unfair for individual merit pay. The recent &lt;a href="http://www.epi.org/publications/entry/bp278"&gt;brief from the Economic Policy Institute&lt;/a&gt;, authored by an impressive team of researchers (several of them our friends and mentors from neighboring Stanford), makes a well-reasoned case for not using VAM as the only input to high-stakes decisions. While their arguments are persuasive with respect to VAM as the lone criterion for awarding merit pay or firing individual teachers, we still see a broad range of uses for the technique, along with considerable challenges.&lt;br /&gt;&lt;br /&gt;For today, let’s look at one issue that we find particularly interesting: how to handle teacher collaboration in a VAM framework. In a recent &lt;a href="http://www.edweek.org/ew/articles/2010/09/01/02marshall.h30.html"&gt;&lt;span style="font-style:italic;"&gt;Education Week&lt;/span&gt; commentary&lt;/a&gt;, Kim Marshall argues that any use of test scores for merit pay is a losing proposition. One of the many reasons he cites is its potentially negative impact on collaboration.&lt;br /&gt;&lt;br /&gt;A problem with an exercise like that conducted by the &lt;span style="font-style:italic;"&gt;LA Times&lt;/span&gt; is that there are organizational arrangements that do not come into the calculations. For example, we find that team teaching within a grade at a school is very common. A teacher with an aptitude for teaching math may take another teacher’s students for a math period, while sending her own kids to the other teacher for reading. These informal arrangements are not part of the official school district roster. They can be recorded (with some effort) during the current year but are lost for prior years. Mentoring is a similar situation, wherein the value provided to the kids is distributed among members of their team of teachers. 
We don’t know how much difference collaborative or mentoring arrangements make to individual VAM scores, but one fear in using VAM in setting teacher salaries is that it will militate against productive collaborations and reduce overall achievement.&lt;br /&gt;&lt;br /&gt;Some argue that, because VAM calculations do not properly measure or include important elements, VAM should be disqualified from playing any role in evaluation. We would argue that, although they are imperfect, VAM calculations can still be used as a component of an evaluation process. Moreover, continued improvements can be made in testing, in professional development, and in the VAM calculations themselves. In the case of collaboration, what is needed is a way for a principal to record and evaluate the collaborations and mentoring so that the information can be worked into the overall evaluation and even into the VAM calculation. In such an instance, it would be the principal at the school, not an administrator at the district central office, who can make the most productive use of the VAM calculations. With knowledge of the local conditions and potential for bias, the building leader may be in the best position to make personnel decisions.&lt;br /&gt;&lt;br /&gt;VAM can also be an important research tool, with consistently high and/or low scores serving as a guide for observing classroom practices that are likely to be worth promoting through professional development or program implementations. We’ve seen VAM used this way, for example, by the research team at &lt;a href="http://www.wcpss.net/evaluation-research/reports/2010/1001eff_teach.pdf"&gt;Wake County Public Schools in North Carolina&lt;/a&gt; in identifying strong and weak practices in several content areas. This is clearly a rich area for continued research.&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-style:italic;"&gt;LA Times&lt;/span&gt; has helped to catapult the issue of VAM onto the national radar. 
It has also sparked a discussion of how school data can be used to support local decisions—which can’t be a bad thing.&lt;br /&gt;— DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5975193568996814214?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5975193568996814214/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/09/2010-2011-year-of-vam.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5975193568996814214'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5975193568996814214'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/09/2010-2011-year-of-vam.html' title='2010-2011: The Year of the VAM'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-3923151768347826113</id><published>2010-06-03T09:39:00.000-07:00</published><updated>2010-06-03T14:09:29.063-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='relevant'/><category scheme='http://www.blogger.com/atom/ns#' term='rigorous research'/><category scheme='http://www.blogger.com/atom/ns#' term='product evaluation'/><category scheme='http://www.blogger.com/atom/ns#' term='educational decision-makers'/><category scheme='http://www.blogger.com/atom/ns#' term='SIIA'/><category 
scheme='http://www.blogger.com/atom/ns#' term='credible research'/><title type='text'>Making Vendor Research More Credible</title><content type='html'>The latest evidence that &lt;em&gt;research can be both rigorous and relevant&lt;/em&gt; was the subject of an &lt;a href="http://siia.net/index.php?option=com_docman&amp;amp;task=doc_download&amp;amp;gid=2574&amp;amp;Itemid=318"&gt;announcement&lt;/a&gt; that the Software and Information Industry Association (SIIA) made last month about their new &lt;a href="http://siia.net/presentations/education/SIIA_EvaluationGuidelines_EdTechProduct.pdf"&gt;guidelines&lt;/a&gt; for conducting effectiveness research. The document is aimed at SIIA members, most of whom are executives of education software and technology companies and not necessarily schooled in research methodology. The main goal in publishing the guidelines is to improve the quality, and therefore the credibility, of research sponsored by the industry. The document provides SIIA members with things to keep in mind when contracting for research or using research in marketing materials. The document also has value for educators, especially those responsible for purchasing decisions. That’s an important point that I’ll get back to.&lt;br /&gt;&lt;br /&gt;One thing to make clear in this blog entry is that while your humble blogger (DN) is given credit as the author, the Guidelines actually came from a working group of SIIA members who put in many months of brainstorming, discussion, and review. DN’s primary contribution was just to organize the ideas, ensure they were technically accurate, and put them into easy-to-understand language. &lt;br /&gt;&lt;br /&gt;Here’s a taste of some of the ideas contained in the 22 guidelines:&lt;br /&gt;&lt;br /&gt;• With a few exceptions, all research should be reported regardless of the result. Cherry-picking just the studies with strong positive results distorts the facts and in the long run hurts credibility. 
One lesson that might be taken from this is that conducting several small studies may be preferable to trying to prove a product effective (or not) in a single study.&lt;br /&gt;• Always provide a link to the full report. Too often in marketing materials (including those of advocacy groups, not just publishers) a fact such as “8th grade math achievement increased from 31% in 2004 to 63% in 2005,” is offered with no citation. In this specific case, the fact was widely cited but after considerable digging could be traced back to a report described by the project director as “anecdotal”. &lt;br /&gt;• Be sure to take implementation into account. In education, all instructional programs require setting up complex systems of teacher-student interaction, which can vary in numerous ways. Issues of how research can support the process and what to do with inadequate or outright failed implementation must be understood by researchers and consumers of research.&lt;br /&gt;• Watch out for the control condition. In education there are no placebos. In almost all cases we are comparing a new program to whatever is in place. Depending on how well the existing program works, the program being evaluated may appear to have an impact or not. This calls for careful consideration of where to test a product and understandable concern by educators as to how well a particular product tested in another district will perform against what is already in place in their district.&lt;br /&gt;&lt;br /&gt;The Guidelines are not just aimed at industry. SIIA believes that as decision-makers at schools begin to see a commitment to providing stronger research, their trust in the results will increase. It is also in the educators’ interest to review the guidelines because they provide a reference point for what actionable research should look like. 
Ultimately, the Guidelines provide educators with help in conducting their own research, whether it is on their own or in partnership with the education technology providers. -DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-3923151768347826113?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/3923151768347826113/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/06/making-vendor-research-more-credible.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/3923151768347826113'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/3923151768347826113'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/06/making-vendor-research-more-credible.html' title='Making Vendor Research More Credible'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-4051862742469032757</id><published>2010-03-29T16:50:00.000-07:00</published><updated>2010-03-29T16:55:42.295-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scientifically Based Research'/><category scheme='http://www.blogger.com/atom/ns#' term='effectiveness'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='Debra Viadero'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Blueprint for Reform'/><category scheme='http://www.blogger.com/atom/ns#' term='scientific research'/><category scheme='http://www.blogger.com/atom/ns#' term='reauthorization of ESEA'/><title type='text'>Research: From NCLB to Obama’s Blueprint for ESEA</title><content type='html'>We can finally put “Scientifically Based Research” to rest. The term that appeared more than 100 times in NCLB appears zero times in the Obama administration’s &lt;a href="http://www.ed.gov/policy/elsec/leg/blueprint/blueprint.pdf"&gt;Blueprint for Reform&lt;/a&gt;, which is the document outlining its approach to the reauthorization of ESEA. The term was always an awkward neologism, coined presumably to avoid simply saying “scientific research.” It also allowed NCLB to contain an explicit definition to be enforced—a definition stipulating not just any scientific activities, but research aimed at coming to causal conclusions about the effectiveness of some product, policy, or laboratory procedure.&lt;br /&gt;&lt;br /&gt;A side effect of the SBR focus has been the growth of a compliance mentality among both school systems and publishers. Schools needed some assurance that a product was backed by SBR before they would spend money, while textbooks were ranked in terms of the number of SBR-proven elements they contained.&lt;br /&gt;&lt;br /&gt;Some have wondered if the scarcity of the word “research” in the new Blueprint might signal a retreat from scientific rigor and the use of research in educational decisions (see, for example, &lt;a href="http://blogs.edweek.org/edweek/inside-school-research/2010/03/in_2001_the_last_time.html"&gt;Debra Viadero’s blog&lt;/a&gt;). Although the approach is indeed different, the new focus makes a stronger case for research and extends its scope into decisions at all levels.&lt;br /&gt;&lt;br /&gt;The Blueprint shifts the focus to effectiveness. 
The terms “effective” or “effectiveness” appear about 95 times in the document. “Evidence” appears 18 times. And the compliance mentality is specifically called out as something to eliminate.&lt;br /&gt;&lt;br /&gt;“We will ask policymakers and educators at all levels to carefully analyze the impact of their policies, practices, and systems on student outcomes. ... And across programs, we will focus less on compliance and more on enabling effective local strategies to flourish.” (p. 35)&lt;br /&gt;&lt;br /&gt;Instead of the stiff definition of SBR, we now have a call to “policymakers and educators at all levels to carefully analyze the impact of their policies, practices, and systems on student outcomes.” Thus we have a new definition for what’s expected: carefully analyzing impact. The call does not go out to researchers per se, but to policymakers and educators at all levels. This is not a directive from the federal government to comply with the conclusions of scientists funded to conduct SBR. Instead, scientific research is everybody’s business now.&lt;br /&gt;&lt;br /&gt;Carefully analyzing the impact of practices on student outcomes is scientific research. For example, conducting research carefully requires making sure the right comparisons are made. A study that is biased by comparing two groups with very different motivations or resources is not a careful analysis of impact. A study that simply compares the averages of two groups without any statistical calculations can mistakenly identify a difference when there is none, or vice versa. A study that takes no measure of how schools or teachers used a new practice—or that uses tests of student outcomes that don’t measure what is important—can’t be considered a careful analysis of impact. Building the capacity to use adequate study design and statistical analysis will have to be on the agenda of the ESEA if the Blueprint is followed.&lt;br /&gt;&lt;br /&gt;Far from reducing the role of research in the U.S. 
education system, the Blueprint for ESEA actually advocates a radical expansion. The word “research” is used only a few times, and “science” is used only in the context of STEM education. Nonetheless, the call for widespread careful analysis of the evidence of effective practices that impact student achievement broadens the scope of research, turning all policymakers and educators into practitioners of science. — DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-4051862742469032757?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/4051862742469032757/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/03/research-from-nclb-to-obamas-blueprint.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4051862742469032757'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4051862742469032757'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/03/research-from-nclb-to-obamas-blueprint.html' title='Research: From NCLB to Obama’s Blueprint for ESEA'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5339290570839312760</id><published>2010-02-23T11:25:00.000-08:00</published><updated>2010-02-23T11:35:39.283-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='EETT'/><category scheme='http://www.blogger.com/atom/ns#' term='rigorous evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='Office of Educational Technology'/><category scheme='http://www.blogger.com/atom/ns#' term='ed-tech'/><category scheme='http://www.blogger.com/atom/ns#' term='Department of Education'/><category scheme='http://www.blogger.com/atom/ns#' term='Title IID'/><title type='text'>Stimulating Innovation and Evidence</title><content type='html'>After a massive infusion of stimulus money into K-12 technology through the Title IID “Enhancing Education Through Technology” (EETT) grants, known also as “ed-tech” grants, the administration is planning to cut funding for the program in future budgets.&lt;br /&gt;&lt;br /&gt;Well, they’re not exactly “cutting” funding for technology, but consolidating the dedicated technology funding stream into a larger enterprise, awkwardly named the “Effective Teaching and Learning for a Complete Education” program. For advocates of educational technology, here’s why this may not be so much a blow as a challenge and an opportunity.&lt;br /&gt;&lt;br /&gt;Consider the approach stated at the White House “&lt;a href="http://www.whitehouse.gov/omb/factsheet_key_child_ed/"&gt;fact sheet&lt;/a&gt;”:&lt;br /&gt;&lt;br /&gt;“The Department of Education funds dozens of programs that narrowly limit what states, districts, and schools can do with funds. Some of these programs have little evidence of success, while others are demonstrably failing to improve student achievement. 
The President’s Budget eliminates six discretionary programs and consolidates 38 K-12 programs into 11 new programs that emphasize using competition to allocate funds, giving communities more choices around activities, and using rigorous evidence to fund what works...Finally, the Budget dedicates funds for the rigorous evaluation of education programs so that we can scale up what works and eliminate what does not.”&lt;br /&gt;&lt;br /&gt;From this, technology advocates might worry that policy is being guided by the findings of “no discernable impact” from a number of federally funded technology evaluations (including the evaluation mandated by the EETT legislation itself).&lt;br /&gt;&lt;br /&gt;But this is not the case. The &lt;a href="http://www.whitehouse.gov/sites/default/files/edtech%20final.pdf"&gt;White House&lt;/a&gt; declares, “The President strongly believes that technology, when used creatively and effectively, can transform education and training in the same way that it has transformed the private sector.”&lt;br /&gt;&lt;br /&gt;The administration is not moving away from the use of computers, electronic whiteboards, data systems, Internet connections, web resources, instructional software, and so on in education. Rather, the intention is that these tools are integrated, where appropriate and effective, into all of the other programs.&lt;br /&gt;&lt;br /&gt;This does put technology funding on a very different footing. It is no longer in its own category. Where school administrators are considering funding from the “Effective Teaching and Learning for a Complete Education” program, they may place a technology option up against an approach to lower class size, a professional development program, or other innovations that may integrate technologies as a small piece of an overall intervention. Districts would no longer write proposals to EETT to obtain financial support to invest in technology solutions. 
Technology vendors will increasingly be competing for the attention of school district decision-makers on the basis of the comparative effectiveness of their solution, not just in comparison to other technologies but in comparison to other innovative solutions. The administration has clearly signaled that innovative and effective technologies will be looked upon favorably. It has also signaled that effectiveness is the key criterion.&lt;br /&gt;&lt;br /&gt;As an Empirical Education team prepares for a visit to Washington, DC for the conference of the &lt;a href="http://www.cosn.org/Events/CoSNConference/tabid/5523/Default.aspx"&gt;Consortium for School Networking&lt;/a&gt; and the &lt;a href="http://www.siia.net/etgf/2010/"&gt;Software and Information Industry Association&lt;/a&gt;’s EdTech Government Forum (we are active members of both organizations), we have to consider our message to the education technology vendors and school system technology advocates. (Coincidentally, we will also be presenting research at the annual conference of the &lt;a href="http://www.sree.org/conferences/2010/index.php?r=2675"&gt;Society for Research on Educational Effectiveness&lt;/a&gt;, also held in DC that week.) As a research company, we are constrained from taking an advocacy role; in principle, we have to maintain that the effectiveness of any intervention is an empirical issue. But we do see the infusion of short-term stimulus funding into educational technology through the EETT program as an opportunity for schools and publishers. Working jointly to gather the evidence from the technologies put in place this year and next will put schools and publishers in a strong position to advocate for continued investment in the technologies that prove effective.&lt;br /&gt;&lt;br /&gt;While it may have seemed so in 1993 when the U.S. Department of Education’s Office of Educational Technology was first established, technology can no longer be considered inherently innovative. 
The proposed federal budget is asking educators and developers to innovate to find effective technology applications. The stimulus package is giving the short term impetus to get the evidence in place. — DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5339290570839312760?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5339290570839312760/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/02/stimulating-innovation-and-evidence.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5339290570839312760'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5339290570839312760'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/02/stimulating-innovation-and-evidence.html' title='Stimulating Innovation and Evidence'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1518082506167120042</id><published>2010-01-08T09:39:00.001-08:00</published><updated>2010-01-08T09:39:53.036-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RELEVANCE'/><category scheme='http://www.blogger.com/atom/ns#' term='RIGOR'/><category scheme='http://www.blogger.com/atom/ns#' term='SELECTION BIAS'/><category scheme='http://www.blogger.com/atom/ns#' term='RANDOMIZING 
UNITS'/><category scheme='http://www.blogger.com/atom/ns#' term='INTERPRETING RESULTS'/><category scheme='http://www.blogger.com/atom/ns#' term='RANDOMIZATION'/><category scheme='http://www.blogger.com/atom/ns#' term='DEFINING RIGOR'/><category scheme='http://www.blogger.com/atom/ns#' term='INSTITUTE OF EDUCATION SCIENCES'/><category scheme='http://www.blogger.com/atom/ns#' term='EXPERIMENTAL DESIGNS'/><title type='text'>Rigor AND Relevance</title><content type='html'>One of the conversations at the Institute of Education Sciences (the federal research agency) in 2010 is about rigor. How do we adhere to strict rules about what is accepted as scientific evidence while making the work sponsored by the agency more relevant to educators, as the director, John Easton, wants to do?&lt;br /&gt;&lt;br /&gt;The conflict between rigor and relevance arises for a number of reasons that we will illustrate in this entry. The basic problem arises when rigor is defined in terms of specific methodologies such as randomized experiments or a specific criterion such as a 95% confidence interval. Defining rigor by such procedural rules restricts the body of evidence to a small number of studies and to a narrow range of questions that can be answered with the methods that would be considered acceptable. Our position is not that the education sciences have to become less rigorous in order to become more relevant. Instead, our position is that the concept of scientific rigor is being misunderstood.&lt;br /&gt;&lt;br /&gt;Rigor, in ordinary English, is used to suggest rigidly following rules and procedures. However, because blind adherence to procedures is inappropriate in any area of science, the usage within the education sciences needs clarification and realignment. Our suggestion to IES is to focus on the underlying scientific principles rather than the procedures and criteria derived from the principles. Here are some examples. 
&lt;br /&gt;&lt;br /&gt;The standard rules of research require that a positive outcome identified in a study be very unlikely to be an artifact of a particular sample. There is a very important principle behind this that must be rigorously understood by researchers, and the appropriate statistical calculations must be applied. The rigor, however, is in understanding the trade-off between two kinds of mistake: taking a false positive result for a real result (a Type I error), and erroneously rejecting a positive effect of a new program as not statistically significant when, in fact, there is a real difference (a Type II error). Scientific practice favors protecting against the first kind of mistake and conventionally sets the bar high. But changing the trade-off to favor avoiding the mistake of considering a program ineffective when it is really effective would not constitute less rigor. Faced with a very serious problem, a policy maker may prefer the risk of spending money on something that might not work rather than rejecting a promising program that narrowly missed the conventional threshold for statistical significance.&lt;br /&gt;&lt;br /&gt;Randomization provides another example. The fundamental principle that has to be understood is how results of quantitative studies can be biased by confounding and how controlling for the effects of confounders produces a more accurate estimate of the treatment effect. While randomizing units (e.g., teachers, grade-level teams, schools) into treatment and control groups is recognized as the gold standard for controlling for the effects of potential confounding variables so as to isolate the impact of treatment, rigor is not accomplished by restricting education science to randomized experiments. A relevant study can often benefit from the use of observational data stored in school district information systems. 
Rigor would then consist of understanding how other designs and statistical controls can be appropriately applied to reduce potential bias (and when statistics can’t fix a bad design). There is nothing rigorous in discarding a dataset outright because it has not been created in a fully controlled experimental setting or because it is not free of measurement error.&lt;br /&gt;&lt;br /&gt;While controlling selection bias through experimental designs and statistical adjustments must be understood by education scientists, it is also essential to attend to the context of the study and the range of its generalizability—what we can usefully conclude from the research. The experiment itself may have interfered with usual processes (a situation called ecological invalidity) such as when teacher-level randomization breaks up the existing team teaching within a grade-level team. We need a record of differences in program implementation that shows the relationship between quality of implementation and student performance and also prevents us from mistaking attributes of better-implementing teachers for attributes of the program. While the world of schools can be a messy place to conduct research, taking implementation issues seriously in the study design does not equal less rigor. &lt;br /&gt;&lt;br /&gt;Ultimately, it comes down to knowing what we can say to the stakeholders, whether they are educators, publishers, or government agencies. What can be said derives from rigorous application of research principles and, to some extent, calls upon the art of careful audience-sensitive communication. It is not more rigorous to leave out the results of post-hoc explorations. Rigor in education science includes framing the results with appropriate cautions about preliminary findings, limitations on generalization, and results that are interesting and warrant continued tracking or more targeted investigations. 
Making progress in education science calls for rigor, and rigor includes clear communication and the participation of stakeholders in interpreting results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1518082506167120042?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1518082506167120042/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/01/rigor-and-relevance_08.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1518082506167120042'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1518082506167120042'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/01/rigor-and-relevance_08.html' title='Rigor AND Relevance'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5892723113627528098</id><published>2009-11-17T09:39:00.000-08:00</published><updated>2009-11-17T09:52:24.327-08:00</updated><title type='text'>New ED Research Agenda Taking Shape</title><content type='html'>We’ve heard administration officials say that the stimulus programs provide a laboratory for ideas that can be built into the ESEA (aka NCLB) reauthorization, as well as into the reauthorization of the Education Sciences Act. 
So we are studying closely the RFAs, draft RFAs, and other guidance from the US Department of Education for stimulus programs such as Race to the Top (R2T), Investing in Innovation (i3), Enhancing Education Through Technology (EETT), and the State Longitudinal Data Systems (SLDS) looking for clues about the new research agenda. They are not hard to find.&lt;br /&gt;&lt;br /&gt;As a general trend, there is no doubt that the new administration is seriously committed to evidence-based policy. Peter Orszag of the White House budget office has recently called for (and made the case for, in his &lt;a href="http://empiricaleducation.blogspot.com/2009/07/problem-with-national-experiments.html"&gt;blog&lt;/a&gt;) systematic evaluations of federal policies consistent with the president’s promise to “restore science to its rightful place.” But how does this play out with ED?&lt;br /&gt;&lt;br /&gt;First, we are seeing a major shift from a static notion of “scientifically based research” (SBR) to a much more dynamic approach to continuous improvement. In NCLB there was constant reference to SBR as a necessary precondition for spending ESEA funds on products, programs, or services. In some cases, it meant that the product’s developers had to have consulted rigorous research. In other cases, it was interpreted as there having to be rigorous research showing the product itself was effective. But in either case, the SBR had to precede the purchase.&lt;br /&gt;&lt;br /&gt;Evidence of a more dynamic approach is found in all of the competition-based stimulus programs. Take for example the discussion of “instructional improvement systems.” While this term usually refers to classroom-based systems for formative testing with feedback to the teacher allowing differentiation of instruction, it is used in a broader sense in the current RFAs and guidance documents. 
The definition provided in the R2T draft RFA reads as follows (bullets and highlights added for clarity):&lt;br /&gt;&lt;br /&gt;“Instructional improvement systems means technology-based tools and other strategies that provide&lt;br /&gt;&lt;br /&gt;    * teachers,&lt;br /&gt;    * principals,&lt;br /&gt;    * &lt;span style="background-color: yellow;"&gt;and administrators&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;with meaningful support and actionable data to systemically manage continuous instructional improvement, including activities such as:&lt;br /&gt;&lt;br /&gt;    * instructional planning;&lt;br /&gt;    * gathering information (e.g., through formative assessments (as defined in this notice), interim assessments (as defined in this notice), summative assessments, and looking at student work and other student data);&lt;br /&gt;    * analyzing information with the support of rapid-time (as defined in this notice) reporting;&lt;br /&gt;    * using this information to inform decisions on appropriate next steps;&lt;br /&gt;    * &lt;span style="background-color: yellow;"&gt;and evaluating the effectiveness of the actions taken.&lt;/span&gt;”&lt;br /&gt;&lt;br /&gt;It is important to notice, first of all, that tools are provided to administrators, not just to teachers. Moreover, the final activity in the cycle is to evaluate the effectiveness of the actions. (Joanne Weiss, who heads up the R2T program, uses the same language, including effectiveness evaluation by district administrators, in a &lt;a href="http://www.ed.gov/news/speeches/2009/09/09102009.html" target="_blank"&gt;recent speech&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;We have pointed out &lt;a href="http://empiricaleducation.com/evidence2008.php#oct08"&gt;in a previous entry&lt;/a&gt; that the same cycle of needs analysis, action, and evaluation that works for teachers in the classroom also works for district-level administrators. 
The same assessments that help teachers differentiate instruction can, in many cases, be aggregated up to the school and district level where broader actions, programs, and policies can be implemented and evaluated based on initial identification of the needs. An important difference exists between these parallel activities at the classroom and central office level. At the district level, where larger datasets extend over a longer period, evaluation design and statistical analysis are called for. In fact this level of information calls for scientifically based research.&lt;br /&gt;&lt;br /&gt;Research is now viewed as integral to the cycle of continuous improvement. Research may be carried out by the district’s or state’s own research department or data may be made available to outside researchers as called for in the SLDS and other RFAs. The fundamental difference now is that the research conducted and published before federal funds are used is not the only relevant research. Of course, ED strongly prefers (and at the highest level of funding in i3 requires) that programs have prior evidence. But now the further gathering of evidence is required both in the sense of a separate evaluation and in the sense that funding is to be put toward continuous improvement systems that build research into the innovation itself.&lt;br /&gt;&lt;br /&gt;Our recent &lt;a href="http://www.empiricaleducation.com/index.php#oct_26_09"&gt;news item&lt;/a&gt; about the i3 program takes note of other important ideas about the research agenda we can expect to influence the reauthorization of ESEA. It is worth noting that the methods called for in i3 are also those most appropriate and practical for local district evaluations of programs. We welcome this new perspective on research considered as a part of the cycle of continuous instructional improvement. 
— DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5892723113627528098?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5892723113627528098/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/11/new-ed-research-agenda-taking-shape.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5892723113627528098'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5892723113627528098'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/11/new-ed-research-agenda-taking-shape.html' title='New ED Research Agenda Taking Shape'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5067601049449250995</id><published>2009-09-22T12:11:00.000-07:00</published><updated>2009-09-22T14:05:45.245-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='continuous improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='effectiveness'/><category scheme='http://www.blogger.com/atom/ns#' term='Jim  Shelton'/><category scheme='http://www.blogger.com/atom/ns#' term='innovation'/><category scheme='http://www.blogger.com/atom/ns#' term='i3'/><category scheme='http://www.blogger.com/atom/ns#' term='scientific evidence'/><title type='text'>Research as 
Innovation</title><content type='html'>Many of us heard Jim Shelton, the ED Assistant Deputy Secretary for Innovation and Improvement, speak to the education publishing industry last week about the $650 million fund now called “Investing in Innovation” (i3). Through i3, Shelton wants to fund the scaling up of innovations having some evidence that they’re worth investing in. These i3 grants could be as large as $50 million.&lt;br /&gt;&lt;br /&gt;With that amount at stake, it makes sense for government funders to look for some track record of scientifically documented success. The frequent references in ED documents to processes of “continuous improvement” as part of innovations suggest that proposers would do well to supplement the limited evidence for their innovation by showing how scientific evidence can be generated as an ongoing part of a funded project, that is, how in-course corrections and improvements can be made to the innovation as it is being put into place in a school system.&lt;br /&gt;&lt;br /&gt;In his speech to the education industry, Shelton complained about the low quality of the evidence currently being put forward. Although some publishers have taken the initiative and done serious tests of their products, there has never been a strong push for them to produce evidence of effectiveness.&lt;br /&gt;&lt;br /&gt;School systems usually haven’t demanded such evidence, partly because there are often more salient decision criteria and partly because little qualified evidence exists, even for programs that are effective. Moreover, district decision makers may find studies of a product conducted in schools that are different from their schools to have marginal relevance, regardless of how “rigorously” the studies were conducted.&lt;br /&gt;&lt;br /&gt;The ED appears to recognize that it will be counter-productive for grant programs such as i3 to depend entirely on the pre-existing scientific evidence. 
An alternative research model based on continuous improvement may help states and districts to succeed with their i3 proposals—and with their projects, once funded. &lt;br /&gt;&lt;br /&gt;Now that improved state and district data systems are increasing the ability of school systems to quickly reference several years of data on students and teachers, i3 can start looking at how rigorous research is built into the innovations they fund—not just the one-time evaluation typically built into federal grant proposals.&lt;br /&gt;&lt;br /&gt;This kind of research for continuous improvement is an innovation in itself—an innovation that may start with the “data-driven decision making” mode in which data are explored to identify an area of weakness or a worrisome trend. But the real innovation in research will consist of states and districts building their own capacity to evaluate whether the intervention they decided to implement actually strengthened the area of weakness or arrested the worrisome trend they identified and chose to address. Perhaps it did so for some schools but not others, or maybe it caught on with some teachers but not with all. The ability of educators to look at this progress in relation to the initial goals completes the cycle of continuous improvement and sets the stage for refocusing, tweaking, or fully redesigning the intervention under study.&lt;br /&gt;&lt;br /&gt;We predict that i3 reviewers, rather than depending solely on strong existing evidence, will look for proposals that also include a plan for continuous improvement that can be part of how the innovation assures its success. In this model, research need not be limited to the activity of an “external evaluator” that absorbs 10% of the grant. 
Instead, routine use of research processes can be an innovation that builds the internal capacity of states and districts for continuous improvement.&lt;br /&gt;  -DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5067601049449250995?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5067601049449250995/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/09/research-as-innovation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5067601049449250995'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5067601049449250995'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/09/research-as-innovation.html' title='Research as Innovation'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-6393792056693839511</id><published>2009-09-11T11:42:00.000-07:00</published><updated>2009-09-11T11:46:46.210-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='program effectiveness'/><category scheme='http://www.blogger.com/atom/ns#' term='methodological rigor'/><category scheme='http://www.blogger.com/atom/ns#' term='dissemination model'/><category scheme='http://www.blogger.com/atom/ns#' term='usability'/><category scheme='http://www.blogger.com/atom/ns#' term='state 
education agencies'/><category scheme='http://www.blogger.com/atom/ns#' term='IES'/><category scheme='http://www.blogger.com/atom/ns#' term='Regional Education Labs'/><category scheme='http://www.blogger.com/atom/ns#' term='John Easton'/><category scheme='http://www.blogger.com/atom/ns#' term='local relevance'/><category scheme='http://www.blogger.com/atom/ns#' term='randomized experiments'/><title type='text'>Easton Sets a New Agenda for IES</title><content type='html'>John Easton, now officially confirmed as the director of the Institute of Education Sciences, gave a brief talk July 24th to explain his agenda to the directors and staff of the Regional Education Labs. This is a particularly pivotal time, not only because the Obama administration is setting an aggressive direction for changes in the K-12 schools, but also because Easton is starting his six-year term just as IES is preparing the RFP for the re-competition for the 10 RELs. (The budget for the RELs accounts for about 11% of the approximately $600 million IES budget.)&lt;br /&gt;&lt;br /&gt;Easton made five points.&lt;br /&gt;&lt;br /&gt;First, he is not retreating from the methodological rigor that was the hallmark of his predecessor, Russ Whitehurst. This simply means that IES will not be funding poorly designed research that does not have the proper controls to support conclusions the researcher wants to assert. Randomized control is still the strongest design for effectiveness studies, although weaker designs are recognized as having value.&lt;br /&gt;&lt;br /&gt;Second, there has to be more emphasis on relevance and usability for practitioners. IES can’t ignore how decisions are made and what kind of evidence can usefully inform them. He sees this as requiring a new focus on school systems as learning organizations. 
This becomes a topic for research and development.&lt;br /&gt;&lt;br /&gt;Third, although randomized experiments will still be conducted, there needs to be a stronger tie to what is then done with the findings. In a research and development process, rigorous evaluation should be built in from the start and should relate more specifically to the needs of the practitioners who are part of the R&amp;D process. In this sense, the R&amp;D process should be linked more directly to the needs of the practitioners.&lt;br /&gt;&lt;br /&gt;Fourth, IES will move away from the top-down dissemination model in which researchers seem to complete a study and then throw the findings over the wall to practitioners. Instead, researchers should engage practitioners in the use of evidence, understanding that the value of research findings comes from their application, not simply from their being released or published. IES will take on the role of facilitating the use of evidence.&lt;br /&gt;&lt;br /&gt;Fifth, IES will take on a stronger role in building capacity to conduct research at the local level and within state education agencies. There’s a huge opportunity presented by the investment (also through IES) in state longitudinal data systems. The combination of state systems and the local district systems makes gathering the data to answer policy questions and questions about program effectiveness much easier. The education agencies, however, often need help in framing their questions, applying an appropriate design, and deploying the necessary and appropriate statistics to turn the data into evidence.&lt;br /&gt;&lt;br /&gt;These five points form a coherent picture of a research agency that will work more closely through all phases of the research process with practitioners, who will be engaged in developing the findings and putting them into practice. 
This suggests new roles for the Regional Education Labs in helping their constituencies to answer questions pertinent to their local needs, in engaging them more deeply in using the evidence found, and in building their local capacity to answer their own questions. The quality of work will be maintained, and the usability and local relevance will be greatly increased. &lt;br /&gt;   — DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-6393792056693839511?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/6393792056693839511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/09/easton-sets-new-agenda-for-ies.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6393792056693839511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6393792056693839511'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/09/easton-sets-new-agenda-for-ies.html' title='Easton Sets a New Agenda for IES'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-2193980375310682822</id><published>2009-07-09T08:35:00.000-07:00</published><updated>2009-07-09T08:41:46.672-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OMB'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Department of Education'/><category scheme='http://www.blogger.com/atom/ns#' term='cross-validation model'/><category scheme='http://www.blogger.com/atom/ns#' term='program implementations'/><category scheme='http://www.blogger.com/atom/ns#' term='evidence-based policy decisions'/><category scheme='http://www.blogger.com/atom/ns#' term='Peter R. Orszag'/><title type='text'>The Problem with National Experiments</title><content type='html'>We welcome the statement of the director of the Office of Management and Budget (OMB), Peter R. Orszag, issued as a &lt;a href="http://www.whitehouse.gov/omb/blog/09/06/08/BuildingRigorousEvidencetoDrivePolicy/" target="_blank"&gt;blog entry&lt;/a&gt;, calling for the use of evidence.&lt;br /&gt; &lt;br /&gt;&lt;em&gt;&amp;ldquo;I am trying to put much more emphasis on evidence-based policy decisions here at OMB.  Wherever possible, we should design new initiatives to build rigorous data about what works and then act on evidence that emerges &amp;mdash; expanding the approaches that work best, fine-tuning the ones that get mixed results, and shutting down those that are failing.&amp;rdquo;&lt;/em&gt;&lt;br /&gt; &lt;br /&gt;This suggests a continuous process of improving programs based on evaluations built into the fabric of program implementations, which sounds very valuable. Our concern, however, at least in the domain of education, is that Congress or the Department of Education will contract for a national experiment to prove a program or policy effective. In contrast, we advocate a more localized and distributed approach based on the argument Donald Campbell made in the early 70s in his classic paper &amp;ldquo;The Experimenting Society&amp;rdquo; (updated in 1988). He observes that &amp;ldquo;the U.S. 
Congress is apt to mandate an immediate, nationwide evaluation of a new program to be done by a single evaluator, once and for all, subsequent implementations to go without evaluation.&amp;rdquo; Instead, he describes a &amp;ldquo;contagious cross-validation model for local programs&amp;rdquo; and recommends a much more distributed approach that would &amp;ldquo;support adoptions that included locally designed cross-validating evaluations, including funds for appropriate comparison groups not receiving the treatment.&amp;rdquo; Using such a model, he predicts that &amp;ldquo;After five years we might have 100 locally interpretable experiments.&amp;rdquo; (p.303)&lt;br /&gt; &lt;br /&gt;Dr. Orszag&amp;rsquo;s adoption of the &lt;a href="http://www.evidencebasedpolicy.org/docs/TopTierProjectOverview5.19.09.pdf" target="_blank"&gt;&amp;ldquo;top tier&amp;rdquo;&lt;/a&gt; language from the &lt;a href="http://coalition4evidence.org/wordpress/" target="_blank"&gt;Coalition for Evidence Based Policy&lt;/a&gt; is buying into the idea that an educational program can be proven effective in a single large scale randomized experiment. There are several weaknesses in this approach.&lt;br /&gt; &lt;br /&gt;First, the education domain is extremely diverse and, without the &amp;ldquo;100 locally interpretable experiments,&amp;rdquo; it is unlikely that educators would have an opportunity to see a program at work in a sufficient number of contexts to begin to build up generalizations. Moreover, as local educators and program developers improve their programs, additional rounds of testing are called for (and even the &amp;ldquo;top tier&amp;rdquo; programs should engage in continuous improvement).&lt;br /&gt; &lt;br /&gt;Second, the information value of local experiments is much higher for the decision-maker who will always be concerned with performance in his or her school or district. 
National experiments generate average impact estimates, while giving little information about any particular locale.  Because concern with achievement gaps between specific populations differs across communities, it follows that, in a local experiment, reducing a specific gap&amp;mdash;not the overall average effect&amp;mdash;may well be the effect of primary interest.&lt;br /&gt; &lt;br /&gt;Third, local experiments are vastly less expensive than nationally contracted experiments, even while obtaining comparable statistical power. Local experiments can easily be one-tenth the cost of national experiments, so conducting 100 of them is quite feasible. (We say more about the reasons for the cost differential in a separate &lt;a href="/pdfs/school_capacity.pdf" target="_blank"&gt;policy brief&lt;/a&gt;.)  Better yet, local experiments can be completed in a more timely manner&amp;mdash;it need not take five years to accumulate a wealth of evidence. Ironically, one factor making national experiments expensive, as well as slow, is the review process required by OMB!&lt;br /&gt; &lt;br /&gt;So while we applaud Dr. Orszag&amp;rsquo;s leadership in promoting evidence-based policy decisions, we will continue to be interested in how this affects state and local agencies. We hope that, instead of contracting for national experiments, the OMB and other federal agencies can help state and local agencies to build evaluation for continuous improvement into the implementation of federally funded programs. If nothing else, it helps to have OMB publicly making evidence-based decisions. &amp;mdash;DN&lt;br /&gt; &lt;br /&gt;Campbell, D. T. (1988). The Experimenting Society. In E. S. Overman (Ed.), &lt;em&gt;Methodology and epistemology for social science: Selected papers&lt;/em&gt; (p. 303). 
Chicago: University of Chicago Press.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-2193980375310682822?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/2193980375310682822/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/07/problem-with-national-experiments.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2193980375310682822'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2193980375310682822'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/07/problem-with-national-experiments.html' title='The Problem with National Experiments'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-7910874975637830305</id><published>2009-06-09T09:50:00.000-07:00</published><updated>2009-06-09T09:59:15.327-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='reform agenda'/><category scheme='http://www.blogger.com/atom/ns#' term='rigorous evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='rational economic model'/><category scheme='http://www.blogger.com/atom/ns#' term='program implementation'/><category scheme='http://www.blogger.com/atom/ns#' term='Hanushek'/><category scheme='http://www.blogger.com/atom/ns#' term='achievement 
gaps'/><category scheme='http://www.blogger.com/atom/ns#' term='smaller scale evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='funding-achievement puzzle'/><title type='text'>It’s Not the Money, It’s What You Spend It On</title><content type='html'>Our neighbor from the Hoover Institution, Eric (Rick) Hanushek, who also currently chairs the National Education Sciences Board, has just published a very interesting book (with co-author Alfred Lindseth) on the financing of schools&lt;sup&gt;1&lt;/sup&gt;. It provides a very readable narrative of the last couple of decades’ court decisions about how much money it should take to provide an equitable and adequate K-12 education. The authors’ basic thesis is that the amounts of money schools spend are generally unrelated to increases in achievement, unless one considers what the money is spent on. Clearly, if spending were focused on policies and programs that lead to achievement gains and to decreases in the achievement gaps between populations, things would improve. But court-ordered increases in education spending have seldom used credible estimates of the likely impact of various programs, even though the programs' costs were used in calculating how much an equitable or adequate education will cost. The authors document in fascinating detail the irrationality of the process of producing these cost estimates.&lt;br /&gt;&lt;br /&gt;Hanushek and Lindseth propose that, where administrators and teachers are accountable and rewarded for results, they will consider the trade-offs in efficiency of spending money one way or another. For example, smaller class size may lead to better results but, if the same money were spent to increase teacher quality, the results may be much more substantial. This proposal, of course, depends on there being sufficient evidence that various programs, policies, or approaches have a measurable impact. 
And they further acknowledge that getting this information is not a matter of running one-time experimental evaluations. The wide variation of populations, resources, and standards in US school systems means that a large number of smaller scale evaluations are called for. If states and school districts were to get into the habit of routinely pilot testing programs locally (and collecting and analyzing the data systematically) before scaling up within the district or state, the gains in efficiency could be substantial.&lt;br /&gt;&lt;br /&gt;Hanushek and Lindseth do not address the question of how local evaluations of sufficient quality and quantity can be paid for. If one depends solely on the Institute of Education Sciences for grants and contracts, the process will be slow and the resources inadequate. Setting aside a certain percentage of federal grants to states and districts for evaluations is often unproductive because the evaluations are not designed or timed to provide feedback for continuous improvement. Too often, educators and administrators treat the evaluation as a requirement that takes money from the program. We have argued elsewhere that integrating research into program implementation at the local level calls for building local school district capacity for rigorous evaluations. It also calls for a reform agenda that changes how decisions are connected both to explorations of district data and to locally generated evidence as to whether programs and policies are having the desired impact. This is different from contracting with an evaluator once the program is under way because the plan for the evaluation is part of the plan for implementation. 
Directing a good portion of the program funds to a process of continuous improvement will make the program more efficient and provide educators with the hundreds of studies that will begin to accumulate the kind of evidence that they need to make a rational choice about what programs are worth trying out in their own locale.&lt;br /&gt;&lt;br /&gt;Educators, especially those who spend their days engaged with children in a classroom, may find the rational economic model on which the authors’ proposals are based unsatisfying and perhaps simplistic. Most people don’t go into education because they are maximizing their economic return. Nonetheless, it is hard to find a rationale for retaining teachers who are demonstrably ineffective beyond the traditional practice of union solidarity that militates against differentiation of skills among its members. The authors' arguments are thought-provoking in that they demonstrate in rich narrative detail the obvious irrationality of considering only the amount of money put into schools and not considering the effectiveness of the programs, policies, and approaches that the money is spent on. —DN&lt;br /&gt;&lt;br /&gt;&lt;sup&gt;1&lt;/sup&gt; Hanushek, E.A. &amp; Lindseth, A.A. (2009). 
&lt;em&gt;Schoolhouses, courthouses and statehouses: Solving the funding-achievement puzzle in America’s schools.&lt;/em&gt; Princeton NJ: Princeton University Press.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-7910874975637830305?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/7910874975637830305/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/06/its-not-money-its-what-you-spend-it-on.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7910874975637830305'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7910874975637830305'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/06/its-not-money-its-what-you-spend-it-on.html' title='It’s Not the Money, It’s What You Spend It On'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1443435484424360594</id><published>2009-05-13T12:33:00.000-07:00</published><updated>2009-05-14T09:05:12.111-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ARRA'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;Ensure transparency and accountability&quot;'/><category scheme='http://www.blogger.com/atom/ns#' term='Compliance anxiety'/><category scheme='http://www.blogger.com/atom/ns#' term='Redrock 
Reports'/><category scheme='http://www.blogger.com/atom/ns#' term='Stimulus funds'/><title type='text'>Compliance Anxiety</title><content type='html'>Stimulus funds are beginning to flow. But not as quickly as needed to provide a boost to the economy. One source of hesitation might be called “compliance anxiety.” People in school systems know that the Department of Education is looking for bold innovations and progress toward lasting reforms of the schools (&lt;a href="http://www.ed.gov/policy/gen/leg/recovery/guidance/uses.doc"&gt;see, for example, the recently published suggestions&lt;/a&gt;), but are not sure exactly what is going to be asked of them in terms of accounting for the funds they spend. The third guiding principle of ARRA calls for K-12 districts to “ensure transparency, reporting, and accountability.” This is meant to prevent fraud and abuse, to support the most effective uses of ARRA funds, and to accurately measure and track results.&lt;br /&gt;&lt;br /&gt;Over the past few weeks, in webinars and similar venues, educators have been asking what this means. Many are hesitant to commit funds without knowing what evidence of compliance will be called for. The following quotes were compiled by Jennifer House, Ph.D., Founder of &lt;a href="http://redrockreports.com/"&gt;Redrock Reports&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Superintendent of a large suburban district&lt;/strong&gt;: “We just need to know what kind of data needs to be collected for the accountability portion of ARRA—especially funds in the State Fiscal Stabilization Fund.”&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Superintendent of an urban district&lt;/strong&gt;: “When is the Department of Education going to tell us what data they need for the accountability and reporting requirements of ARRA?”&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Title I director of a major urban district&lt;/strong&gt;: “I know what I need to do for Title I reporting. 
Is there any other data I need to collect to report on the use and impact of the ARRA funds?”&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;IDEA director of a large suburban district&lt;/strong&gt;: “What other data is needed about ARRA funds”&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Paraphrase of seven questions from a single MDR webinar&lt;/strong&gt;: “When will we hear what the accountability requirements are for ARRA?”&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;CIO of a large suburban district&lt;/strong&gt;: “We need to accommodate the data that needs to be collected in our system for ARRA. When do we get the word?”&lt;br /&gt;&lt;br /&gt;These educators need to know what is meant by “accurately measure and track results.” Will this information just be used to audit who was paid for what? Or will ED be calling for a measure of results in terms of impact on schools, teachers, and student achievement?&lt;br /&gt;&lt;br /&gt;State Education Agencies are asked for “baseline data that demonstrates the State’s current status in each of the four education reform areas.” Will the states and districts be asked for subsequent data showing an improvement over baseline?&lt;br /&gt;&lt;br /&gt;Educators have heard that, in the near future, ED will describe specific data metrics that states will use to make transparent their status in the four education reform areas for the purpose of “showing how schools are performing and helping schools improve.” They expect that this will not be a one-time data collection; instead, they expect an element of tracking to help them with continuous improvement.&lt;br /&gt;&lt;br /&gt;ED has a one-time opportunity to move education toward an evidence-based enterprise on a massive scale by calling for evidence of outcomes—not just the starting baseline. Conditions are ripe for quickly and easily promoting a major reform in how districts measure their own results. Educators already expect this. A simple time series design is all that is needed. 
Training and support for this can be readily supplied through existing IES funding mechanisms.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1443435484424360594?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1443435484424360594/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/05/compliance-anxiety.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1443435484424360594'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1443435484424360594'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/05/compliance-anxiety.html' title='Compliance Anxiety'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-4714158293236963535</id><published>2009-04-06T15:33:00.000-07:00</published><updated>2009-04-08T16:54:10.976-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='The Experimenting Society'/><category scheme='http://www.blogger.com/atom/ns#' term='cross-validation model'/><category scheme='http://www.blogger.com/atom/ns#' term='Consortium on Chicago School Research'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='John Easton'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><title type='text'>The Advantages of Research on Local Problems</title><content type='html'>The nomination of John Q. Easton as the new head of IES highlights a debate that has been going on for quite a long time. As Donald Campbell noted in the early 70s in his classic paper “The Experimenting Society” (updated in 1988), “The U.S. Congress is apt to mandate an immediate, nationwide evaluation of a new program to be done by a single evaluator, once and for all, subsequent implementations to go without evaluation.”  In contrast, he describes a “contagious cross-validation model for local programs” and recommends a much more distributed approach that would “support adoptions that included locally designed cross-validating evaluations, including funds for appropriate comparison groups not receiving the treatment.” Using such a model, he predicts that “After five years we might have 100 locally interpretable experiments.” (p.303)  The work of the Consortium on Chicago School Research, which Easton has led, has a local focus on Chicago schools consistent with the idea that experiments should be locally interpretable. Elsewhere, we have argued that local experiments can also be vastly less expensive; thus having 100 of them is quite feasible. These experiments can also be completed in a more timely manner—it need not take five years to accumulate a wealth of evidence. We welcome a change in orientation at IES from organizing single large national experiments to the more useful, efficient, and practical model of supporting many rigorous local experiments.  –DN&lt;br /&gt;&lt;br /&gt;Campbell, D. T. (1988). The Experimenting Society. In E. S. Overman (Ed.), &lt;em&gt;Methodology and epistemology for social science: Selected papers&lt;/em&gt; (p. 303). 
Chicago: University of Chicago Press.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-4714158293236963535?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/4714158293236963535/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/04/advantages-of-research-on-local.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4714158293236963535'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4714158293236963535'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/04/advantages-of-research-on-local.html' title='The Advantages of Research on Local Problems'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-8355835901652337586</id><published>2009-03-10T11:18:00.000-07:00</published><updated>2009-03-10T11:26:45.433-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='incentive projects'/><category scheme='http://www.blogger.com/atom/ns#' term='educational reform'/><category scheme='http://www.blogger.com/atom/ns#' term='stimulus plan'/><category scheme='http://www.blogger.com/atom/ns#' term='education technology'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><title type='text'>Stimulating Times!</title><content type='html'>Very 
soon, an additional $80 billion or so will be flowing into K-12 schools. Spending it in a way that will improve education will be a challenge. While the majority of the funds are intended to ward off teacher layoffs or to supplement existing funding streams—and will therefore keep things mostly as they are—other large chunks of money are aimed at promoting change. It is hard to keep the zeros straight, but it appears that $650 million goes to education technology; $250 million to state data systems; $200 million to teacher incentive projects; and billions to what Secretary Duncan is calling a “Race to the Top” fund. Of course, in the current context, a mere billion dollars seems like small change. But it is enormous, considering that it has to be spent quickly, and there is little precedent for how to do the spending.&lt;br /&gt;&lt;br /&gt;Mike Smith, a senior advisor to Secretary Duncan, recently outlined the goals of the stimulus plan this way: (1) get the money out fast; (2) create jobs; and (3) stimulate reform. While some hope that this funding amount will constitute a new default level, we are being warned to expect a “cliff” when the one-time funding is exhausted. Thus a wise use of this money would be for new jobs that have the effect of creating something that won’t have to be paid for at the same level on an ongoing basis. Repairing a school building illustrates the idea. The work can start quickly and generate short-term jobs and, once completed, the repaired building will be around for a long time before another infusion of repair money is needed. &lt;br /&gt;&lt;br /&gt;What is the analogy in the domain of education reform? If a school system uses stimulus money to purchase services with a fixed annual cost, the service may have to be discontinued when the money is spent. 
On the other hand, if the school system purchases two years of professional development, then teachers may be able to carry their new practices forward without continuing operational costs. Similarly, investing in data systems and building school district capacity for analyzing local data and for evaluating programs could pay off in sustained systemic improvements in district decision processes.&lt;br /&gt;&lt;br /&gt;The stimulus package does contain funds for data systems, an investment that will pay off beyond the initial implementation. By themselves, data systems are technical capacities, not reforms; alone they do not change how educators can generate useful evidence that can improve instruction. With the stimulus funding, we now have an opportunity to put in place local district capacities for data use, not just at the classroom level but also at the level of district and state policies and strategies.&lt;br /&gt;&lt;br /&gt;To harness the stimulus funds for reforms involving data systems, we need to look at the instances in the stimulus package where data would actually be used. This leads us to the places calling for evaluations. For example, the $650 million in technology grants will flow through the mechanism of Title II D, which includes the suggestion to use funds to “support the rigorous evaluation of programs funded under this part, particularly regarding the impact of such programs on student academic achievement…” Thus these funds may be used to enhance the use of data systems to conduct local evaluations of the technologies, thereby building the capacity of districts to generate useful evidence. &lt;br /&gt;&lt;br /&gt;In a section on teacher incentive programs, the stimulus bill specifically calls for “a rigorous national evaluation.” We don’t think this should be interpreted as a single large national evaluation. In our view, a large number of local evaluations would be a more productive use of the funds, as long as each is rigorous. 
Doing so—that is, distributing the available funds to the local districts implementing the incentive programs—stimulates district hiring (or retaining district staff who specialize in evaluations).&lt;br /&gt;&lt;br /&gt;We should be viewing data use and evidence generation in districts as an important part of the reform agenda and not just as an isolated technical problem of data warehousing. By distributing program evaluation to districts, we can fulfill the goals of creating (or retaining) jobs while stimulating reform that will endure beyond the two years when the extraordinary funds are available. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-8355835901652337586?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/8355835901652337586/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/03/stimulating-times.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8355835901652337586'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8355835901652337586'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/03/stimulating-times.html' title='Stimulating Times!'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-4052006597101704183</id><published>2009-02-13T14:26:00.000-08:00</published><updated>2009-03-10T11:24:14.880-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Education Week'/><category scheme='http://www.blogger.com/atom/ns#' term='Scientifically Based'/><category scheme='http://www.blogger.com/atom/ns#' term='Arne Duncan'/><title type='text'>Education Week Reports that 'Scientifically Based' is Giving Way to 'Development' and 'Innovation'</title><content type='html'>&lt;p&gt;The headline in the January 28 issue of &lt;a href="http://www.edweek.org/ew/articles/2009/01/28/19rd_ep.h28.html?r=1815474517" target="_blank"&gt;Education Week&lt;/a&gt; suggests that the pendulum is swinging from obtaining rigorous scientific evidence to providing greater freedom for development and innovation.&lt;/p&gt;&lt;p&gt;Is there reason to believe this is more than a war of catch phrases?  Does supporting innovative approaches take resources away from &amp;ldquo;scientifically based research&amp;rdquo;?  A January 30 interview with Arne Duncan, our new Secretary of Education, by CNN’s Campbell Brown is revealing.  She asked him about the innovative program in Chicago that pays students for better grades.  Here is how the conversation went:&lt;/p&gt;&lt;div id="indent"&gt;&lt;p&gt;Duncan: ...in every other profession we recognize, reward and incent excellence. I think we need to do more of that in education.&lt;/p&gt;&lt;p&gt;CNN: For the students specifically, you think money is the way to do that? It&amp;rsquo;s the best incentive?&lt;/p&gt;&lt;p&gt;Duncan: I don&amp;rsquo;t think it is the best incentive; I think it's one incentive.  This is a pilot program we started this fall so it&amp;rsquo;s very early on. 
But so far the data is very encouraging&amp;mdash;so far the students&amp;rsquo; attendance rates have gone up, students&amp;rsquo; grades have gone up, and these are communities where the drop out rate has been unacceptably high and whatever we can do to challenge that status quo. When children drop out today, Campbell as you know, they are basically condemned to social failure.  There are no good jobs out there so we need to be creative; we need to push the envelope. I don&amp;rsquo;t know if this is the right answer. We&amp;rsquo;ve got a control group.&lt;/p&gt;&lt;p&gt;CNN: But is it something that you would like to try across the country, to have other schools systems adopt?&lt;/p&gt;&lt;p&gt;Duncan: Again, Campbell, this is...we are about four months into it in Chicago.  We have a control group where this is not going on, so we&amp;rsquo;re going to follow what the data tells us. And if it&amp;rsquo;s successful, we&amp;rsquo;ll look to expand it. If it&amp;rsquo;s not successful, we&amp;rsquo;ll stop doing it. We want to be thoughtful but I think philosophically I am pro pushing the envelope, challenging the status quo, and thinking outside the box...&lt;/p&gt;&lt;/div&gt;&lt;script src="http://i.cdn.turner.com/cnn/.element/js/2.0/video/evp/module.js?loc=dom&amp;vid=/video/bestoftv/2009/01/30/cb.duncan.interview.cnn" type="text/javascript"&gt;&lt;/script&gt;&lt;noscript&gt;Embedded video from &lt;a href="http://www.cnn.com/video"&gt;CNN Video&lt;/a&gt;&lt;/noscript&gt;&lt;p&gt;Read more from the interview &lt;a href="http://www.cnn.com/2009/POLITICS/01/30/campbell.brown.duncan/index.html#cnnSTCText" target="_blank"&gt;here.&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Notice that he is calling for innovation: &amp;ldquo;pushing the envelope challenging the status quo, thinking outside the box.&amp;rdquo;  But he is not divorcing innovation from rigorously controlled effectiveness research. 
He is also looking at preliminary findings of changes such as attendance rates that can be detected early.  The &amp;ldquo;scientifically based&amp;rdquo; research is built into the innovation&amp;rsquo;s implementation at the earliest stage. And it is used as a basis for decisions about expansion. &lt;/p&gt;&lt;p&gt;While there may be reason to increase funding for development, we can&amp;rsquo;t divorce rigorous research from development; nor should we consider experimental evaluations as activities that kick in after an innovation has been fielded. We are skeptical that there is a real conflict between scientific research and innovation. The basic problem for research and development in education is not too much attention to rigorous research, but too few resources overall going into education. The graphic included in the Ed Week story makes clear that the R&amp;amp;D investment in education is minuscule compared to R&amp;amp;D for the military, energy, and health sectors (health gets 100 times as much as education, whereas the category &amp;ldquo;other&amp;rdquo; gets 16 times as much).&lt;/p&gt;&lt;p&gt;The Department of Defense, of course, gets a large piece of the pie, and we often see the Defense Advanced Research Projects Agency (DARPA) held up as an example of a federal agency devoted to addressing innovative, and often futuristic, requirements. The Internet started as one such engineering project, the ARPANET. In this case, researchers needed to access information flexibly from computers situated around the country, and the innovative distributed approach turned out to be massively scalable and robust. Although we don&amp;rsquo;t always see that research is built in as a continuous part of engineering development, every step was in fact an experiment. While testing an engineering concept may take a few minutes of data collection, in education, the testing can be more cumbersome. 
Cognitive development and learning take months or years to generate measurable gains, and experiments need careful ways to eliminate confounders that often don&amp;rsquo;t trouble engineering projects.  Education studies are also not as amenable to clever technical solutions (although the vision of something as big as the Internet coming in to disrupt and reconfigure education is tantalizing).&lt;/p&gt;&lt;p&gt;It is always appealing to see the pendulum swing with a changing of the guard. In the current transition, what is coming about looks more like a synthesis in which research and development are no longer treated as separate&amp;mdash;and certainly not seen as competitors. —DN&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-4052006597101704183?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/4052006597101704183/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2009/02/education-week-reports-that.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4052006597101704183'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4052006597101704183'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2009/02/education-week-reports-that.html' title='Education Week Reports that &apos;Scientifically Based&apos; is Giving Way to &apos;Development&apos; and &apos;Innovation&apos;'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-4446813221651773976</id><published>2009-01-01T12:22:00.000-08:00</published><updated>2010-01-08T12:47:44.239-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Jim Goodnight'/><category scheme='http://www.blogger.com/atom/ns#' term='technology infrastructure'/><category scheme='http://www.blogger.com/atom/ns#' term='technology curricula'/><category scheme='http://www.blogger.com/atom/ns#' term='school technology'/><category scheme='http://www.blogger.com/atom/ns#' term='CoSN'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='Keith Krueger'/><category scheme='http://www.blogger.com/atom/ns#' term='research organization'/><title type='text'>Role of Technology as Infrastructure for Schools</title><content type='html'>We are excited and optimistic about the New Year. It will be a time of great challenges as well as critical transitions and important debates about the future of education in this country. The emerging proposal for a massive stimulus package gives reason both for optimism and caution. Thus far the package has included repairing school buildings, improving their broadband connections, and bringing in more technology.&lt;br /&gt;&lt;br /&gt;The Consortium for School Networking (CoSN), of which Empirical Education is a member and long time supporter, advocates for technology for schools. In an article entitled “&lt;a href="http://newsmanager.commpartners.com/cosnc/issues/2008-12-30/8.html"&gt;Why Obama Can’t Ignore Ed Tech&lt;/a&gt;”, Jim Goodnight, founder and CEO of SAS, and Keith Krueger, CEO of CoSN, argue for investing in education technology as a way to support “21st century learning” while creating jobs in the technology and telecommunications sectors. 
They also suggest that the investment will lead school districts to hire staff members specializing in technical and technology curricula, a function they note as currently being “vastly understaffed.”&lt;br /&gt;&lt;br /&gt;As a research organization, we have to maintain a cautious attitude about claims, such as those in the CoSN article, that technology products will reduce discipline problems and dropout rates generally. We do agree that an investment in school technology will call for increased staffing—that is, creating jobs—which is the primary goal of the stimulus package.&lt;br /&gt;&lt;br /&gt;But we believe there is a better argument for an investment in technical infrastructure. Network and data warehouse technologies inherently provide the mechanisms for measuring whether the investments are making a difference. Combined with online formative testing, automatic generation of usage data, and analytic tools, these technologies will put schools in a position to keep technology accountable for promised results. Using technology as a tool for tracking results of the stimulus package will, of course, create jobs. It will call for the creation of additional positions for data coaches, data analysts, trainers, and staff to handle the test administration, data cleaning, and communication functions.&lt;br /&gt;&lt;br /&gt;The fear that a stimulus package will just throw money at the problem is justified. Yes, it will provide jobs and benefits to certain industries in the short term, whereas any lasting improvement may be elusive. While building a new bridge employs construction workers and the lasting benefit can be measured, for example, by improved traffic, the lasting benefits of school technology may seem more subtle. We would argue, to the contrary, that a technology infrastructure for schools contains its own mechanism for accountability. 
The argument for school technology should drive home the notion that schools can be capable of determining whether the stimulus investment is having an impact on learning, discipline, graduation rates, and other measurable outcomes. Policy makers will not have to depend on promises of new forms of learning when they can put in place a technology infrastructure that provides school decision-makers with the information about whether the investment is making a difference. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-4446813221651773976?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/4446813221651773976/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2010/01/role-of-technology-as-infrastructure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4446813221651773976'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/4446813221651773976'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2010/01/role-of-technology-as-infrastructure.html' title='Role of Technology as Infrastructure for Schools'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-5066101297059238721</id><published>2008-12-08T12:52:00.000-08:00</published><updated>2010-01-11T10:55:39.918-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='what works Clearinghouse'/><category scheme='http://www.blogger.com/atom/ns#' term='experimental program'/><category scheme='http://www.blogger.com/atom/ns#' term='focusing evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='H.S. Bloom'/><category scheme='http://www.blogger.com/atom/ns#' term='local program evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='best evidence encyclopedia'/><category scheme='http://www.blogger.com/atom/ns#' term='EXPERIMENTAL DESIGNS'/><title type='text'>Focusing Evaluations on Achievement Gaps</title><content type='html'>The standard design for experimental program evaluations in educational settings may not be doing justice to the questions that matter most to district decision makers. In many sites where we have worked, the most important question had to do with a gap between two populations within the district. For example, one district’s improvement plan specifically targeted the gap in science achievement between black students and white students. In another, there was a specific concern with the performance of new, and often uncertified, teachers compared to experienced teachers. NCLB, with its requirement for disaggregating the performance of specific subgroups, has reinforced this perspective. 
A new science curriculum that has a modest positive impact on performance across the district could be rejected if it had the effect of increasing the gap between the two populations of concern.&lt;br /&gt;&lt;br /&gt;When a new program favors one kind of student or teacher over another, we call it an interaction, that is, an interaction between the experimental “treatment” and some pre-existing “trait” of the population involved. In experimental design, we call these characteristics of the people or the setting moderators because they are seen as moderating the impact of the new program. Moderators are often considered secondary or even exploratory outcomes in experimental program evaluations, which are designed primarily to find out whether the new program makes an overall difference for the study population as a whole. Who gets and doesn’t get the program can be manipulated experimentally. By contrast, the moderator is a pre-existing characteristic that (usually) can’t be manipulated. While the experiment focuses on a specific program (treatment), any number of moderators can be examined after the fact.&lt;br /&gt;&lt;br /&gt;Many of our experiments in school systems are aimed at answering a question of local interest. In this case, we often find that the most important question concerns an interaction rather than the average impact of the experimental intervention itself. The potential moderator of interest, such as minority status, under-achievement, or certification can be specified in advance, based on the identified gap in performance the new program was intended to address in the first place. When the interaction is the primary outcome of interest, its status goes beyond even the emphasis that many experts put on interactions as a means for getting a fuller picture of the effectiveness of an intervention (Cook, 2002; Shadish, Cook, &amp; Campbell, 2002). 
But because investigations of interactions are usually exploratory and not the primary question (except perhaps for the specific setting in which the experiment took place), it is difficult to generalize across studies of the same intervention about the moderating effects of certain variables. Research reviews that synthesize multiple studies of the same intervention, such as those found on the &lt;a href="http://ies.ed.gov/ncee/wwc/"&gt;What Works Clearinghouse&lt;/a&gt; and &lt;a href="http://www.bestevidence.org/"&gt;Best Evidence Encyclopedia&lt;/a&gt;, are not concerned with interactions, even if an individual study finds one to be quite substantial. This is unfortunate because, in many studies that find no overall impact for a program, we may discover that it is differentially effective for an important subgroup. It would therefore be useful, for example, to examine whether the moderating effect of a certain variable varies more than is expected by chance across experimental settings. This would indicate whether the moderating effect is robust or whether it depends on local circumstances.&lt;br /&gt;&lt;br /&gt;This situation points to the importance of conducting local program evaluations that can focus on the achievement gap of greatest concern. Fortunately, recent theoretical work by Howard Bloom of MDRC (Bloom, 2005) indicates that statistical power for detecting differences among subgroups of students in the impact of an intervention (that is, the interaction) can be larger than for detecting a net impact of the same size for that program. This means that a local experiment primarily interested in an interaction can be smaller, and less expensive, than a traditional experiment looking for an overall average effect. 
The need for information about gaps, as well as the possible greater efficiency of studying gaps, provides support for a strategy of conducting relatively small experiments to answer questions of local interest to a school district (Newman, 2008). Smaller, less expensive experimental program evaluations focused on moderating effects can provide more valuable information to decision makers than large-scale experiments intended for broad generalization, which cannot provide useful evidence for all interactions of interest to schools.&lt;br /&gt;&lt;br /&gt;Empirical Education is now engaged in research to empirically verify Bloom’s observation about statistical power; we expect to be reporting the results next spring. —DN&lt;br /&gt;&lt;br /&gt;Bloom, H. S. (2005). Randomizing groups to evaluate place-based programs. In H. S. Bloom (Ed.), Learning More From Social Experiments. New York, NY: Sage.&lt;br /&gt;&lt;br /&gt;Cook, T. D. (2002). Randomized experiments in educational policy research: A critical examination of the reasons the education evaluation community has offered for not doing them. Educational Evaluation and Policy Analysis, 24, 175-199.&lt;br /&gt;&lt;br /&gt;Newman, D. (2008). Toward School Districts Conducting Their Own Rigorous Program Evaluations: Final Report on the “Low Cost Experiments to Support Local School District Decisions” Project. Empirical Education Research Reports. Palo Alto, CA: Empirical Education Inc. &lt;br /&gt;&lt;br /&gt;Shadish, W. R., Cook, T. D., &amp; Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin. 
&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-5066101297059238721?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/5066101297059238721/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/12/focusing-evaluations-on-achievement.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5066101297059238721'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/5066101297059238721'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/12/focusing-evaluations-on-achievement.html' title='Focusing Evaluations on Achievement Gaps'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-6307522325921847832</id><published>2008-11-05T10:56:00.000-08:00</published><updated>2010-01-11T11:05:06.098-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OEEI'/><category scheme='http://www.blogger.com/atom/ns#' term='Brookings Institution'/><category scheme='http://www.blogger.com/atom/ns#' term='Barack Obama'/><category scheme='http://www.blogger.com/atom/ns#' term='educational innovation'/><category scheme='http://www.blogger.com/atom/ns#' term='new administration'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' 
term='scientific research'/><category scheme='http://www.blogger.com/atom/ns#' term='INSTITUTE OF EDUCATION SCIENCES'/><title type='text'>Climate Change: Innovation</title><content type='html'>Congratulations to Barack Obama on his sweeping victory. We can expect a change of policy climate with a new administration bringing new players and new policy ideas to the table. The appointment of a new director of the Institute of Education Sciences will provide an early opportunity to set direction for research and development. Reauthorization of NCLB and related legislation — including negotiating the definition and usage of “scientific research” — will be another, although pundit consensus was that this change will take two more years, given the urgency of fixing the economy and resolving the war in Iraq. But already change is in the air with proposals for dramatic shifts in priorities. Here we raise a question about the big new idea that is getting a lot of play: innovation.&lt;br /&gt;&lt;br /&gt;Educational innovation being called for includes funding for research and development [R&amp;D (with a capital D for a focus on new ideas)], acquisition of school technology, and funding for dissemination of new charter school models. The Brookings Institution recently published a policy paper Changing the Game: The Federal Role in Supporting 21st Century Educational Innovation by Sara Mead and Andy Rotherham. The paper imagines a new part of the US Department of Education called the Office of Educational Entrepreneurship and Innovation (OEEI) that would be charged with the job of implementing “a game-changing strategy [that] requires the federal government to make new types of investments, form new partnerships with philanthropy and the nonprofit sector, and act in new ways to support the growth of entrepreneurship and innovation within the public education system” (p34). The authors see this as complementary to standards-based reform, which is yielding diminishing returns. 
“To reach the lofty goals that standards-based reform has set, we need more than just pressure. We need new models of organizing schooling and new tools to support student learning that are dramatically more effective or efficient than what schools doing today” (p35).&lt;br /&gt;&lt;br /&gt;As an entrepreneurial education business, we applaud the idea behind the envisioned OEEI. The question for us arises when we think about how OEEI would know whether a game-changing model is “dramatically more effective or efficient.” How will the OEEI decide which entrepreneurs should receive or continue to receive funds? Although the authors call for a “relentless focus on results,” they do not say how results would be measured. The venture capital (VC) model bases success on return on investment. Many VC investments fail but, if a good percentage succeeds, the overall monetary return to the VC is positive. While venture philanthropies often work the same way, the profits go back into supporting more entrepreneurs instead of back to the investors. Scaling up profitably is a sufficient sign of success. Perhaps we can assume that parents, communities, and school systems would not choose to adopt new products if they were ineffective or inefficient. If this were true, then scaling up would be an indirect indication of educational effectiveness. Will positive results for innovations in the marketplace be sufficient, or should there perhaps be a role for research to determine their effectiveness?&lt;br /&gt;&lt;br /&gt;The authors suggest a $300 million per year “Grow What Works” fund of which less than 5% would be set aside for “rigorous independent evaluations of the results achieved by the entrepreneurs” (p48). Similarly, their suggestion for a program like the Defense Advanced Research Projects Agency (DARPA) would allow only up to 10%. Budgeting research at this level is unlikely to have much influence over what is likely to be an overwhelming imperative for market success. 
Moreover, what will be the role of independent evaluations if they fail to show the innovation to be dramatically more effective or efficient? Funding research as a set-aside from a funded program is always an uphill battle because it appears to take money away from the core activity. So let’s be innovative and call this R&amp;D with the intention of empowering both the R and the D. Rather than offer a token concession to the research community, build ongoing formative research and impact evaluations into the development and scale-up processes themselves. This may more closely resemble the “design-engineering-development” activities that Tony Bryk describes.&lt;br /&gt;&lt;br /&gt;Integrating the R with the D will have two benefits. First, it will provide information to federal and private funding agencies on the progress toward whatever measurable goal is set for an innovation. Second, it will help the parents, communities, and school systems make informed decisions about whether the innovation will work locally. The important partner here is the school district, which can take an active role in evaluation as well as development. School districts are the entities that ultimately have to decide whether the innovations are more effective and efficient than what they already do. They are also the ones with all the student, teacher, financial, and other data needed to conduct quasi-experiments or interrupted time series studies. If an agency like OEEI is created, it should insist that school districts become partners in the R&amp;D for innovations they consider introducing. 
—DN &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-6307522325921847832?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/6307522325921847832/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/11/climate-change-innovation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6307522325921847832'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6307522325921847832'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/11/climate-change-innovation.html' title='Climate Change: Innovation'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-7493441001147937606</id><published>2008-10-11T11:34:00.000-07:00</published><updated>2010-01-11T11:42:41.129-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistical analysis'/><category scheme='http://www.blogger.com/atom/ns#' term='Ellen Mandinach'/><category scheme='http://www.blogger.com/atom/ns#' term='multi-year longitudinal data'/><category scheme='http://www.blogger.com/atom/ns#' term='statistical calculations'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Driven School Improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='data-driven decision making'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Margaret Honey'/><title type='text'>Needed: A Developmental Approach to District Data Use</title><content type='html'>The recent publication of Data Driven School Improvement: Linking Data and Learning, edited by Ellen Mandinach and Margaret Honey, takes a useful step toward documenting innovative practices at the classroom, school, district, and state levels. The book’s 14 chapters (and 30 authors) avoid the advocacy orientation frequently found in discussions of data-driven decision making (D3M). Case studies provide rich detail that is often missing in other discussions. It is useful to get a sense of some of the actual questions that were addressed and that motivated the setup of the technology of data warehouses, assessment tools, and dashboards.&lt;br /&gt;&lt;br /&gt;The book provides a good introduction to a complicated field that is currently attracting much attention from practitioners and researchers, as well as from technology vendors. In some ways, however, it does not go deep enough in providing a framework for understanding the topic. While one of the key chapters provides a conceptual framework in terms of a set of processes and related skills such as collecting, analyzing, and prioritizing data, the framework is static in the sense that there is no account of, or theory about, how teachers, principals, or district administrators might acquire these skills or come to be interested in using them. Without a developmental theory, we can’t predict what processes or skills are likely to be prerequisites for others or how processes can be scaffolded, for example, by using some of the useful technologies described in several of the chapters. Many of the examples of how data are used can be loosely described as data mining and moving toward identifying needs, gaps, or problems. Situations where statistical analysis (beyond averages of descriptive data) is called for are mentioned only occasionally. 
Such a question might have asked for a comparison of what happened when a new program was put in place compared to what would have happened without the program as well as compared to the level of need identified for which the program was considered a solution. The chapters for the most part keep the discussion at a level that does not call for a statistical test or an examination of a correlation. This may be reasonable when considering decisions within a classroom, but is an oversimplification when it comes to decisions considered at the district central office.&lt;br /&gt;&lt;br /&gt;It is reasonable to posit stages in a developmental sequence where descriptive needs assessment would be a logical first step before moving on to more complex analyses that would, for example, introduce statistical controls. On a technical level, it is reasonable to consider data on a single school year to be both more readily available to school district administrators and also to address more straightforward questions than multi-year longitudinal data. For example, a question about mean differences among ethnic groups calls for simpler analytic tools than a question about changes over time in the size of the gap between groups. Both may feed into a needs analysis, but the latter calls for statistical calculations that go beyond a simple comparison. Similarly, a question about whether a new program had an impact not only calls for statistical machinery but requires the introduction of experimental design in setting up an appropriate comparison. Again it is reasonable to posit that incorporating research design into the “data-driven” decisions is a more advanced stage that builds upon the tools and processes that explore correlations to identify potential areas of need. A developmental theory of data-driven school improvement may provide a basis for tools, supports, and professional development for school district personnel that can accelerate adoption of these valuable processes. 
A developmental theory would provide a guide for starting where they are and for providing the scaffold to a next level that builds incrementally on what is already in place. —DN&lt;br /&gt;&lt;br /&gt;Mandinach, E., &amp; Honey, M. (Eds.). (2008). Data Driven School Improvement: Linking Data and Learning. New York: Teachers College Press.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-7493441001147937606?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/7493441001147937606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/10/needed-developmental-approach-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7493441001147937606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7493441001147937606'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/10/needed-developmental-approach-to.html' title='Needed: A Developmental Approach to District Data Use'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1603297479941110521</id><published>2008-09-05T12:27:00.000-07:00</published><updated>2010-01-11T12:33:25.685-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistical technologies'/><category scheme='http://www.blogger.com/atom/ns#' term='American
education policy'/><category scheme='http://www.blogger.com/atom/ns#' term='value-added” analyses'/><category scheme='http://www.blogger.com/atom/ns#' term='data systems'/><title type='text'>Where Advocating Accountability Falls Short</title><content type='html'>Calling for greater accountability continues to be a theme in American education policy. Recently, Senator Barack Obama made this proposition: “I’ll recruit an army of new teachers, and pay them higher salaries, and give them more support. And in exchange, I’ll ask for higher standards and more accountability” (August 28, 2008).&lt;br /&gt;&lt;br /&gt;Although the details of policy positions are not generally provided in political speeches, this one is worth pulling apart to see what might be the issues in implementing such a policy.&lt;br /&gt;&lt;br /&gt;First, what is the accountability that there will be more of? In this case we may presume that, since accountability is linked specifically to teacher salaries, teachers will be held accountable. Is this appropriate—or even possible—as federal policy? The educational enterprise can be held accountable at many levels. While teachers have face-to-face contact with students who may do well or poorly, the team of teachers working at a grade level or in a small school could be collectively accountable. Moving up a level, a principal could be held accountable for the school’s results. And the district superintendent and the state schools chief can also be accountable for results in their jurisdictions. From purely an accountability point of view, teachers are not necessarily the best focus for federal policy. Certainly, recruiting and incentive efforts can be federally funded, but it seems at best awkward to legislate sanctions for individual teachers based on holding them accountable for their individual performance in raising their students’ scores.&lt;br /&gt;&lt;br /&gt;It is currently possible technically to hold teachers accountable. 
Database and statistical technologies are now available to link teacher identities to student records. District data systems routinely provide links between teachers and their rosters of students and, in many cases, these are extended longitudinally. Many state data systems have also begun providing unique teacher IDs so that linkages to the achievement of individual students can be tracked. And drawing on these longitudinal linkages, “value-added” analyses are being used to quantify the contribution over time of individual teachers to students in their classes.&lt;br /&gt;&lt;br /&gt;However, as an approach to federal policy, taking a top-down tactic—making superintendents and principals accountable—may be better than promoting technologies that attempt to measure individual teachers. (The technical controversies about the statistics used in some versions of value-added analysis are a noteworthy topic that we’ll save for another day.) A more productive approach may be to focus on the disincentives for teachers to collaborate or help one another when accountability is at the individual level. We find that many teachers report teaching students other than those officially registered in their classes. Frequently these are informal arrangements that can increase aggregate achievement for the students involved but muddy the district or state records for individual teachers. A school may be a more appropriate unit of accountability and, in that local context, data on individual teachers can more accurately be evaluated. The school principals will know both their schools’ standing among other schools in the district and will have school-level data to help in making staffing decisions. 
The central office staff, in turn, will have a broader view of the progress of the individual schools on which to base decisions about allocation of resources.&lt;br /&gt;&lt;br /&gt;For any achievement-based accountability approach—whether at the district, school, or teacher level—it is important to understand achievement in relation to challenges related to, for example, the economic status of the district or neighborhood or the prior preparation of the students in the classroom. We must also consider the growth of the students, not just their proficiency status at the end of the year. These considerations require statistical calculations, not just counting up percentages of proficient students. And once we begin looking at analyses such as trajectories of some schools compared to others facing similar challenges, we can take the next step and begin tracking the success of interventions, professional development programs, and other local policies aimed at addressing areas of weakness or supporting teachers who are not helping their students make the kind of progress the school is looking for. The data systems and the analytic tools needed to track a teacher’s or a school’s progress over time can also be turned to guiding resource allocations and interventions and, as a next logical step, providing the capability of tracking whether the additional interventions, support, or professional development are having the desired impact.&lt;br /&gt;&lt;br /&gt;Given the capacities of current data systems, how might a policy involving greater support and greater accountability for teachers be implemented? Here is one example. Federal funding to districts for principal leadership training could be tied to district-level labor contracts giving the building leaders greater control over personnel decisions. 
The leadership training will include the interpretation and use of longitudinal data, constituting tools for comparing the principal’s school to others facing similar challenges. Professional learning communities for the principals can be part of this leadership program, assisting district teams to work through ideas for interventions. While the achievement-based accountability measures of teacher performance can be used as one of the factors in building-level decisions, the leadership training would include how to use the data systems for tracking both teacher job performance and the impact of support and training on that performance. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1603297479941110521?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1603297479941110521/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/09/where-advocating-accountability-falls.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1603297479941110521'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1603297479941110521'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/09/where-advocating-accountability-falls.html' title='Where Advocating Accountability Falls Short'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-2031842776632469256</id><published>2008-06-01T12:35:00.000-07:00</published><updated>2010-01-14T11:39:53.939-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rigorous evidence'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='data-driven decision-making research'/><category scheme='http://www.blogger.com/atom/ns#' term='scientific evidence'/><title type='text'>How Do Districts Use Evidence?</title><content type='html'>The research journal &lt;a href="http://epx.sagepub.com/"&gt;Educational Policy&lt;/a&gt; published an article this month that is important for understanding how data and evidence are used at the school district level: &lt;a href="http://epx.sagepub.com/cgi/content/abstract/22/4/578"&gt;“Evidence-Based Decision Making in School District Central Offices”&lt;/a&gt; by Meredith Honig and Cynthia Coburn, both alumnae of Stanford’s Graduate School of Education (Honig &amp; Coburn, 2008). Most of the data-driven decision-making research (and most decision-making based on data) occurs at the classroom level; teachers get immediate and actionable information about individual students. But Honig and Coburn are talking about central office administrators. Data at the district level are more complicated and, as the authors document, infused with political complications. When district leaders are making decisions about products or programs to adopt, evidence of the scientific sort is at best one element among many.&lt;br /&gt;&lt;br /&gt;Honig and Coburn review three decades of research and, after eliminating purely anecdotal and obviously advocacy pieces, they found 52 books and articles of substantial value.
What they document parallels our own experience at Empirical Education in many respects. That is, rigorous evidence, once it is gathered through either reading scientific reviews or conducting local program evaluations, is never used “directly.” It is not a matter of the evidence dictating the decision. They document that scientific evidence is incorporated into a wide range of other kinds of information and evidence. These may include teacher feedback, implementation issues, past experience, or what the neighboring district superintendent said about it—all of which are legitimate sources of information and need to be incorporated into the thinking about what to do. This “working knowledge” is practical and “mediates” between information sources and decisions.&lt;br /&gt;&lt;br /&gt;The other aspect of decision-making that Honig and Coburn address involves the organizational or political context of evidence use. In many cases the decision to move forward has been made before the evaluation is complete or even started; thus the evidence from it is used (or ignored) to support that decision or to maintain enthusiasm. As in any policy organization or administrative agency, there is a strong element of advocacy in how evidence is filtered and used. The authors suggest that this filtering for advocacy can be beneficial in helping administrators make the case for programs that could be beneficial.&lt;br /&gt;&lt;br /&gt;In other words, there is a cognitive/organizational reality that “mediates” between evidence and policy decisions. The authors contrast this reality with the position they attribute to federal policy makers and the authors of NCLB that scientific evidence ought to be used “directly” or instrumentally to make decisions. 
In fact, they see the federal policy as arguing that “these other forms of evidence are inappropriate or less valuable than social science research evidence and that reliance on these other forms is precisely the pattern that federal policy makers should aim to break” (p. 601). This is where their argument is weakest. The contrast they set up between the idea of practical knowledge mediating between evidence and decisions and the idea that evidence should be used directly is a false dichotomy. The “advocate for direct use of evidence” is a straw man. There are certainly researchers and research methodologists who do not study and are not familiar with how evidence is used in district decisions. But not being experts in decision processes does not make them advocates for a particular process called “direct.” The federal policy is not aimed at decision processes. Instead, it aims to raise the standards of evidence in formal research that claims to measure the impact of programs so that, when such evidence is integrated into decision processes and weighed against practical concerns of local resources, local conditions, local constraints, and local goals, the information value is positive. Federal policy is not trying to remove decision processes; it is trying to remove research reports that purport to provide research evidence but actually come to unwarranted conclusions because of poor research design, incorrect statistical calculations, or bias.&lt;br /&gt;&lt;br /&gt;We should also not mistake Honig and Coburn’s descriptions of decision processes for descriptions of deep, underlying, and unchangeable human cognitive tendencies. It is certainly possible for district decision-makers to learn to be better consumers of research, to distinguish weak advocacy studies from stronger designs, and to identify whether a particular report can be usefully generalized to their local conditions.
We can also anticipate an improvement in the level of the conversation between districts’ evaluation departments, curriculum departments, and IT people so that local evaluations are conducted to answer critical questions and to provide useful information that can be integrated with other local considerations into a decision. —DN&lt;br /&gt;&lt;br /&gt;Honig, M. I. &amp; Coburn, C. (2008). Evidence-Based Decision Making in School District Central Offices. Educational Policy, 22(4), 578-608.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-2031842776632469256?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/2031842776632469256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/06/how-do-districts-use-evidence.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2031842776632469256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/2031842776632469256'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/06/how-do-districts-use-evidence.html' title='How Do Districts Use Evidence?'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-6621557177965736988</id><published>2008-05-14T12:52:00.000-07:00</published><updated>2010-01-14T13:02:03.598-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='rigorous evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='Randomized control'/><category scheme='http://www.blogger.com/atom/ns#' term='focusing evaluations'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='Mark Lipsey'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='rigorous program evaluation'/><title type='text'>What Makes Randomization Hard to Do?</title><content type='html'>The question came up at the recent &lt;a href="http://ies.ed.gov/ncer/whatsnew/conferences/08sdl_intervention/"&gt;workshop&lt;/a&gt; held in Washington DC for school district researchers to learn more about rigorous program evaluation: “Why is the strongest research design often the hardest to make happen?” There are very good theoretical reasons to use randomized control when trying to evaluate whether a school district’s instructional or professional development program works. What we want to know is whether involving students and teachers in some program will result in outcomes that are better than if those same students and teachers were not involved in the program. The workshop presenter, Mark Lipsey of Vanderbilt University, pointed out that if we had a time machine we could observe how well the students and teachers achieved with the program, then go back in time, withhold the program — thus creating the science fiction alternate universe — and watch how they did without it. We can’t do that, so the next best thing is to find a group that is just like the one with the program and see how they do.
By choosing who gets a program and who doesn’t get it from a pool of volunteer teachers (or schools) using a coin toss (or another random method), we can be sure that self-selection had nothing to do with group assignment and that, at least on average, the only difference between members of the two groups is that one group won the coin toss and the other didn’t. Most other methods introduce potential bias that can change the results.&lt;br /&gt;&lt;br /&gt;Randomized control can work where the district is doing a small pilot and has only enough materials for some of the teachers, where resources call for a phased implementation starting with a small number of schools, or where slots in a program are going to be allocated by lottery anyway. To many people, the coin toss (or other lottery method) just doesn’t seem right. Any number of other criteria could be suggested as a better rationale for assigning the program: some students are needier, some teachers may be better able to take advantage of it, and so on. But the whole point is to avoid exactly those kinds of criteria and make the choice entirely random. The coin toss itself highlights the decision process, creating a concern that it will be hard to justify, for example, to a parent who wants to know why his kid’s school didn’t get the program.&lt;br /&gt;&lt;br /&gt;Our own experience with random assignment has not been so negative. Most districts will agree to it, although some do refuse on principle. When we begin working with the teachers face-to-face, there is usually camaraderie about tossing the coin, especially when it is between two teachers paired up because of their similarity on characteristics they themselves identify as important (we’ve also found this pairing method helps give us more precise estimates of the impact). The main problem we find with randomization, if it is being used as part of a district’s own local program evaluation, is the pre-planning that is required.
Typically, decisions as to which schools get the program first or which teachers will be selected to pilot the program are made before consideration is given to doing a rigorous evaluation. In most cases, the program is already in motion or the pilot is coming to a conclusion before the evaluation is designed. At that point in the process, the best method will be to find a comparison group from among the teachers or schools that were not chosen or did not volunteer for the program (or to look outside the district for comparison cases). The prior choices introduce selection bias that we can attempt to compensate for statistically; still, we can never be sure our adjustments eliminate the bias. In other words, in our experience the primary reason that randomization is harder than weaker methods is that it requires that the evaluation design and the program implementation plan are coordinated from the start. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-6621557177965736988?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/6621557177965736988/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/05/what-makes-randomization-hard-to-do.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6621557177965736988'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/6621557177965736988'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/05/what-makes-randomization-hard-to-do.html' title='What Makes Randomization Hard to Do?'/><author><name>Empirical 
Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-7720367510570413495</id><published>2008-04-14T13:12:00.000-07:00</published><updated>2010-01-14T13:25:32.017-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='CoSN'/><category scheme='http://www.blogger.com/atom/ns#' term='Experimental control'/><category scheme='http://www.blogger.com/atom/ns#' term='D3M'/><category scheme='http://www.blogger.com/atom/ns#' term='data-driven decision making'/><category scheme='http://www.blogger.com/atom/ns#' term='Data warehouses'/><title type='text'>Data-Driven Decision Making—Applications at the District Level</title><content type='html'>Data warehouses and data-driven decision making were major topics of discussion at the &lt;a href="http://www.cosn.org/"&gt;Consortium for School Networking conference&lt;/a&gt; March 9-11 in Washington DC that Empirical Education staff attended. This conference has a sizable representation by Chief Information Officers from school districts as well as a long tradition of supporting instructional applications of technology. Clearly with the onset of the accountability provisions of NCLB, the growing focus has been on organizing and integrating such school district data as test scores, class rosters, and attendance. While the initial motivation may have been to provide the required reports to the next level up, there continues to be a lively discussion of functionality within the district. 
The notion behind data-driven decision making (D3M) is that educators can make more productive decisions if those decisions are based on this growing source of knowledge. Most of the attention has focused on teachers using data on students to make instructional decisions for individuals. At the CoSN conference, one speaker claimed that teachers’ use of data for classroom decisions was the true meaning of D3M; uses at the district level to inform decisions were at best of secondary importance. We would like to argue that the applications at the district level should not be minimized.&lt;br /&gt;&lt;br /&gt;To start with, we should note that there is little evidence that giving teachers access to warehoused testing data is effective in improving achievement. We are involved in two experimental studies on this topic, but more should be undertaken if we are going to understand the conditions for success with this technology. We are intrigued by the possibility that, with several waves of data during the year, teachers become action researchers, working through the following steps: 1) seeing where specific students are having trouble, 2) trying out intervention techniques with these children or groups, and 3) examining the results within a few months (or weeks). Thus the technique would be based not just on teacher impressions but on assessments that provide a measurement of student growth relative to standards and to the other students in the class. If a technique isn’t working, the teacher will move to another. And the cycle continues.&lt;br /&gt;&lt;br /&gt;D3M can be used in a similar three-step process at the district level, but this is much rarer. At the district level D3M is most often used diagnostically to identify areas of weakness, for example, to identify schools that are doing worse than they should or to identify achievement gaps between categories of students. This is like the first step in the teacher D3M.
District planners may then make decisions about acquiring new instructional programs, providing PD to certain teachers, replacing particular staff, and so on. This is like the teacher’s second step. What we see far less frequently at the district level is the teacher’s third step: looking at the results so as to measure whether the new program is having the desired effect. In the district decision context this step requires a certain amount of planning and research design. Experimental control is not as important in the classroom because the teacher will likely be aware of any other plausible explanations for a student’s change. On the scale of a district pilot program or new intervention, research design elements are needed to distinguish any difference from what might have happened anyway or to exclude selection bias. Also, where the decision potentially impacts a large number of schools, teachers, and students, statistical calculations are needed to determine the size of the difference and the level of confidence the decision makers can have that the result is not just a matter of chance. We encourage the proponents of D3M to consider the importance of its application at the district level to take advantage, on a larger scale, of processes that happen in the classroom every day.
—DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-7720367510570413495?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/7720367510570413495/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/04/data-driven-decision-makingapplications.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7720367510570413495'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/7720367510570413495'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/04/data-driven-decision-makingapplications.html' title='Data-Driven Decision Making—Applications at the District Level'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-520411224107150489</id><published>2008-03-14T13:26:00.000-07:00</published><updated>2010-01-14T13:42:44.355-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Roberto and Brad'/><category scheme='http://www.blogger.com/atom/ns#' term='what works Clearinghouse'/><category scheme='http://www.blogger.com/atom/ns#' term='Miller-McKeon'/><category scheme='http://www.blogger.com/atom/ns#' term='FERPA'/><category scheme='http://www.blogger.com/atom/ns#' term='Lugar-Bingaman'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Reading First program'/><category scheme='http://www.blogger.com/atom/ns#' term='Nina Rees'/><category scheme='http://www.blogger.com/atom/ns#' term='FERA'/><title type='text'>Making Way for Innovation: An Open Email to Two Congressional Staffers Working on NCLB</title><content type='html'>Roberto and Brad, it was a pleasure hearing your commentary at the February 20 Policy Forum “Using Evidence for a Change” and having a chance to meet you afterward. Roberto, we promised you a note summarizing the views expressed by several on the panel and raised in the question period.&lt;br /&gt;&lt;br /&gt;We can contrast two views of research evident at the policy forum:&lt;br /&gt;&lt;br /&gt;The first view holds that, because research is so expensive and difficult, only the federal government can afford it and only highly qualified professional researchers can be entrusted with it. The goal of such research activities is to obtain highly precise and generalizable evidence. In this view, practitioners (at the state, district, or school level) are put in the role of consumers of the evidence.&lt;br /&gt;&lt;br /&gt;The second view holds that research should be made a routine activity within any school district contemplating a significant investment in an instructional or professional development program. Since all the necessary data are readily at hand (and without FERPA restrictions), it is straightforward for district personnel to conduct their own simple comparison group study. The result would be reasonably accurate local information on the program’s impact in the setting. In this view, practitioners are producers of the evidence.&lt;br /&gt;&lt;br /&gt;The approach suggested by the second view is far more cost effective than the first, as well as more timely. It is also driven directly by the immediate needs of districts.
While each individual study would pertain only to a local implementation, in combination, hundreds of such studies can be collected and published by organizations like the What Works Clearinghouse or by consortia of states or districts. Turning practitioners into producers of evidence also removes the brakes on innovation identified in the policy forum. With practitioners as evidence producers, schools can adopt “unproven” programs as long as they do so as a pilot that can be evaluated for its impact on student achievement.&lt;br /&gt;&lt;br /&gt;A few tweaks to NCLB will be necessary to turn practitioners into producers of evidence:&lt;br /&gt;&lt;br /&gt;1. Currently NCLB implicitly takes the “practitioners as consumers of evidence” view in requiring that the scientifically based research be conducted prior to a district’s acquisition of a program. We have already published a &lt;a href="http://www.empiricaleducation.com/evidence.php#congress"&gt;blog&lt;/a&gt; entry analyzing the changes to the SBR language in the Miller-McKeon and Lugar-Bingaman proposals and how minor modifications could remove the implicit “consumers” view. These are tweaks such as, for example, changing a phrase that calls for:&lt;br /&gt;“including integrating reliable teaching methods based on scientifically valid research”&lt;br /&gt;to a call for&lt;br /&gt;“including integrating reliable teaching methods based on, or evaluated by, scientifically valid research.”&lt;br /&gt;&lt;br /&gt;2. Make clear that a portion of the program funds is to be used in piloting new programs so they can be evaluated for their impact on student achievement. Consider a provision similar to the “priority” idea that Nina Rees persuaded ED to use in awarding its competitive programs.&lt;br /&gt;&lt;br /&gt;3. Build in a waiver provision such as that proposed by the Education Sciences Board that would remove some of the risk to a failing district in piloting a new promising program. 
This “pilot program waiver” should cover consequences of failure for the participating schools for the period of the pilot. The waiver should also remove requirements that NCLB program funds be used only for the lowest scoring students, since this would preclude having the control group needed for a rigorous study.&lt;br /&gt;&lt;br /&gt;The view of “practitioners as consumers of evidence” is widely unpopular. It is viewed by decision-makers as inviting the inappropriate construction of an approved list, as was revealed in the Reading First program. It is seen as restricting local innovation by requiring compliance with the proclamations of federal agencies. In the end, science is reduced to a check box on the district requisition form. If education is to become an evidence-based practice, we have to start with the practitioners. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-520411224107150489?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/520411224107150489/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/03/making-way-for-innovation-open-email-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/520411224107150489'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/520411224107150489'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/03/making-way-for-innovation-open-email-to.html' title='Making Way for Innovation: An Open Email to Two Congressional Staffers Working on NCLB'/><author><name>Empirical 
Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1100911897879538349</id><published>2008-02-14T13:43:00.000-08:00</published><updated>2010-01-14T13:52:01.786-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='professional development programs'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><title type='text'>Outcomes—Who Cares About Them?</title><content type='html'>This should be obvious about research in education: If teachers or administrators don’t care about the outcomes we measure, then no matter how elegantly we design and analyze experiments and present their findings, they won’t mean much.&lt;br /&gt;&lt;br /&gt;A straightforward—simplistic, perhaps—approach to making an experiment meaningful is to measure whether the program we are testing has an impact on the same test scores to which the educators are held accountable. If the instructional or professional development program won’t help the school move more students into the proficient category, then it is not worth the investment of time and money.&lt;br /&gt;&lt;br /&gt;Well, maybe. Suppose the high-stakes test is a poor assessment of the state standards for skills like problem-solving or communication? As researchers, we’ve found ourselves in this quandary.&lt;br /&gt;&lt;br /&gt;At the other end of the continuum, many experimental studies use outcome measures explicitly designed to measure the program being studied. One strategy is to test both the program and the comparison group on material that was taught only to the program group. 
Although this may seem like an unfair bias, it can be a reasonable approach for what we would call an “efficacy” study—an experiment that is trying to determine whether the program has any effect at all under the best of circumstances (similar to using a placebo in medicine). Still, it is certainly important for the consumers of research not to mistake the impact measured in such studies for the impact they can expect to see on their high-stakes test.&lt;br /&gt;&lt;br /&gt;Currently, growth models are being discussed as better ways to measure achievement. It is important to keep in mind that these techniques do not solve the problem of mismatch between standards and tests. If the test doesn’t measure what is important, then the growth model just becomes a way to measure progress on a scale that educators don’t believe in. Insofar as growth models extend high-stakes testing into measuring the amount of student growth for which each individual teacher is responsible, the disconnect just grows.&lt;br /&gt;&lt;br /&gt;One technique that experimental studies can take advantage of without waiting for improvements in testing is the measurement of outcomes that consist of changes in classroom processes. We call these “mediators” because the process changes result from the experimental manipulation, they happen over time before the final outcome is measured, and in theory they represent a possible mechanism by which the program has an impact on the final outcome. For example, in testing an inquiry-based math program, we can measure—through surveys or observations—the extent to which classroom processes such as inquiry and hands-on activities appear more (or less) among the program or comparison teachers. This is best done where teachers (or schools) have been assigned randomly to program or comparison groups. And it is essential that we are measuring some factor that could be observed in both conditions. 
Differences in the presence of a mediator can often be measured long before the results of outcome tests are available, giving school administrators an initial indication of the new program’s success. The relationship of the program’s impact on the mediator and its impact on the test outcome can also tell us something about how the test impact came about.&lt;br /&gt;&lt;br /&gt;Testing is far from perfect. Improvements in the content of what is tested, combined with technical improvements that can lower the cost of delivery and speed the turn-around of results to the students and teachers, will benefit both school accountability and research on the effectiveness of instructional and professional development programs. In the meantime, consumers of research have to consider whether an outcome measure is something they care about. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1100911897879538349?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1100911897879538349/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/03/outcomeswho-cares-about-them.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1100911897879538349'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1100911897879538349'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/03/outcomeswho-cares-about-them.html' title='Outcomes—Who Cares About Them?'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1934566217614398216</id><published>2008-01-14T13:52:00.000-08:00</published><updated>2010-01-14T14:01:32.538-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistical technique'/><category scheme='http://www.blogger.com/atom/ns#' term='educational innovation'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='AYP'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='John Merrow'/><title type='text'>What’s Unfair about a Margin of Error?</title><content type='html'>We think that TV newsman John Merrow is mistaken when, in an &lt;a href="http://www.edweek.org/ew/index.html"&gt;Education Week&lt;/a&gt; opinion piece (“Learning Without Loopholes”, December 4, 2007), he says it is inappropriate for states to use a “margin of error” in calculating whether schools have cleared an AYP hurdle. To the contrary, we would argue that schools don’t use this statistical technique as much as they should.&lt;br /&gt;&lt;br /&gt;Merrow documents a number of cynical methods districts and states use for gaming the AYP system so as to avoid having their schools fall into “in need of improvement” status. One alleged method is the statistical technique familiar in reporting opinion surveys where a candidate’s lead is reported to be within the margin of error. Even though there may be a 3-point gap, statistically speaking, with a plus-or-minus 5-point margin of error, the difference between the candidates may actually be zero. In the case of a school, the same idea may be applied to AYP. 
Let’s say that the amount of improvement needed to meet AYP for the 4th grade population were 50 points (on the scale of the state test) over last year’s 4th grade scores. But let’s imagine that the 4th grade scores averaged only 35 points higher. In this case, the school appears to have missed the AYP goal by 15 points. However, if the margin of error were set at plus-or-minus 20 points, we would not have the confidence to conclude that there’s a difference between the goal and the measured value.&lt;br /&gt;&lt;br /&gt;(Margin of Error bar graph) What is a margin of error or “confidence interval”? First of all, we assume there is a real value that we are estimating using the sample. Because we don’t have perfect knowledge, we try to make a fair estimate with some specified level of confidence. We want to know how far the average score that we got from the sample (e.g., of voters or of our 4th grade students) could possibly be from the real average. If we were, hypothetically, to go back and take lots of new samples, we assume they would be spread out around the real value. But because we have only one sample to work with, we do a statistical calculation based on the size of the sample, the nature of the variability among scores, and our desired level of confidence to establish an interval around our estimated average score. With the 80% confidence interval that we illustrated, we are saying that there’s a 4-in-5 chance that the true value we’re trying to estimate is within that interval. 
If we need greater confidence (for example, if we need to be sure that the real score is within the interval 95 out of 100 times), we have to make the interval wider.&lt;br /&gt;&lt;br /&gt;Merrow argues that, while using this statistical technique to get an estimated range is appropriate for opinion polls, where a sample of 1,000 voters from a much larger pool is used and we are figuring by how much the result may change if we had a different sample of 1,000 voters, the technique is not appropriate for a school, where we are getting a score for all the students. After all, we don’t use a margin of error in the actual election; we just count all the ballots. In other words, there is no “real” score that we are estimating. The school’s score is the real score.&lt;br /&gt;&lt;br /&gt;We disagree. An important difference between an election and a school’s mean achievement score is that the achievement score, in the AYP context, implies a causal process: Being in need of improvement implies that the teachers, the leadership, or other conditions at the school need to be improved and that doing so will result in higher student achievement. While ultimately it is the student test scores that need to improve, the actions to be taken under NCLB pertain to the staff and other conditions at the school. If the staff is to blame for the poor conditions, we can’t blame them for a range of variations at the student level. This is where we see the uncertainty coming in.&lt;br /&gt;&lt;br /&gt;First consider the way we calculate AYP. With the current “status model” method, we are actually comparing an old sample (last year’s 4th graders) with a new sample (this year’s 4th graders) drawn from the same neighborhood. Do we want to conclude that the building staff would perform the same with a different sample of students? Consider also that the results may have been different if the 4th graders were assigned to different teachers in the school. 
Moreover, with student mobility and testing differences that occur depending on the day the test is given, additional variations must be considered. But more generally, if we are predicting that “improvements” in the building staff will change the result, we are trying to characterize these teachers in general, in relation to any set of students. To be fair to those who are expected to make change happen, we want to represent fairly the variation in the result that is outside the administrator’s and teachers’ control, and not penalize them if the difference between what is observed and what is expected can be accounted for by this variation.&lt;br /&gt;&lt;br /&gt;The statistical methods for calculating a confidence interval (CI) around such an estimate, while not trivial, are well established. The CI helps us to avoid concluding there is a difference (e.g., between the AYP goal and the school’s achievement) when it is reasonably possible that no difference exists. The same technique applies if a district research director is asked whether a professional development program made a difference. The average score for students of the teachers who took the program may be higher than the average scores of students of (otherwise equivalent) teachers who didn’t. But is the difference large enough to be clearly distinct from zero? Did the size of the difference escape the margin of error? Without properly doing this statistical calculation, the district may conclude that the program had some value when the differences were actually just in the noise.&lt;br /&gt;&lt;br /&gt;While the U.S. Department of Education is correct to approve the use of CIs, there is still an issue of using CIs that are far wider than justified. The width of a CI is a matter of choice and depends on the activity. Most social science research uses a 95% CI. 
This is the threshold for the so-called “statistical significance,” and it means that the likelihood is less than 5% that a difference as large as or larger than the one observed would have occurred if the real difference (between the two candidates, between the AYP goal and the school’s achievement, or between classes taught by teachers with or without professional development) were actually zero. In scientific work, there is a concern to avoid declaring there is evidence for a difference when there is actually no difference. Should schools be more or less stringent than the world of science?&lt;br /&gt;&lt;br /&gt;Merrow points out that many states have set their CI at a much more stringent 99%. This makes the CI so wide that the observed difference between the AYP goal and the measured scores would have to be very large before we say there is a difference. In fact, we’d expect such a difference to occur by chance alone only 1% of the time. In other words, the measured score would have to be very far below the AYP goal before we’d be willing to conclude that the difference we’re seeing isn’t due to chance. As Merrow points out, this is a good idea if the education agency considers NCLB to be unjust and punitive and wants to avoid schools being declared in need of improvement. But imagine what the “right” CI would be if NCLB gave schools additional assistance when identified as below target. It is still reasonable to include a CI in the calculation, but perhaps 80% would be more appropriate.&lt;br /&gt;&lt;br /&gt;The concept of a confidence interval is essential as schools move to data-driven decision making. Statistical calculations are often entirely missing from data-mining tools, and chance differences end up being treated as important. There are statistical methods such as including pretest scores in the statistical equation for making calculations more precise and for narrowing the CI. 
Growth modeling, for example, allows us to use student-level (as opposed to grade-average) pretest scores to increase precision. School district decisions should be based on good measurement and a reasonable allowance for chance differences. —DN/AJ&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1934566217614398216?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1934566217614398216/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2008/01/whats-unfair-about-margin-of-error.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1934566217614398216'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1934566217614398216'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2008/01/whats-unfair-about-margin-of-error.html' title='What’s Unfair about a Margin of Error?'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-1046203702870377924</id><published>2007-12-15T10:58:00.000-08:00</published><updated>2010-01-15T11:13:17.472-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Citizens for Responsibility and Ethics in Washington (CREW)'/><category scheme='http://www.blogger.com/atom/ns#' term='scientifically-based evidence'/><title type='text'>What Happens When a Publisher Doesn’t Have Scientific Evidence?</title><content type='html'>A letter from &lt;a href="http://www.citizensforethics.org/node/30099"&gt;Citizens for Responsibility and Ethics in Washington (CREW)&lt;/a&gt; to the Inspector General of the U.S. Department of Education raises important issues. Although the letter is written in a very careful, thorough, and lawyerly manner, no doubt most readers will notice right away that the subject of the letter is the business practices of Ignite!, the company run by the president’s brother Neil.&lt;br /&gt;&lt;br /&gt;CREW documents that Ignite! has sold quite a few units of Curriculum on Wheels (COW) to schools in Texas and elsewhere and that these were purchased with NCLB funds. They also document that there is no accessible scientific evidence that COWs are effective. Given the NCLB requirement that funds be used for programs that have scientifically-based evidence of effectiveness, there appears to be a problem. The question we want to raise is: whose problem is this?&lt;br /&gt;&lt;br /&gt;The media report that Mr. Bush has responded to the issues. For example, this explanation appears in &lt;a href="http://www.eschoolnews.com/2007/11/09/bush-ed-tech-firm-under-fire/"&gt;eSchool News (Nov. 17, 2007)&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;    * In his interview with eSchool News, Bush said the watchdog group has misinterpreted the federal statute.&lt;br /&gt;    * “We’re proud we have a product that has the science of learning built into its design, with tons of anecdotal evidence,” the Ignite! founder said. 
“But we don’t yet have efficacy studies that meet the What Works Clearinghouse standards–in fact, I challenge you to find any educational curriculum that has met that standard.”&lt;br /&gt;&lt;br /&gt;Mr. Bush appears to suggest that NCLB requires only that products incorporate scientific principles. This suggestion is doubtful, outside Reading First, which had its own rules. With respect to actually showing scientifically valid evidence of effectiveness, he concedes that none exists for COWs, but points to the fact that his company’s competitors also lack that kind of evidence.&lt;br /&gt;&lt;br /&gt;We came to two conclusions about CREW’s contentions: First, their letter suggests that Ignite! did something wrong in selling its product without scientific evidence. A perspective we want to suggest is that nothing in NCLB calls for vendors to base their products on the “science of learning,” let alone produce WWC-qualified experimental evidence of effectiveness. Nowhere is it stated that vendors are not allowed to sell ineffective products. Education is not like the market for medical products, in which the producers have to prove effectiveness to get FDA approval to begin marketing. NCLB rules apply to school systems that are using federal funds to purchase programs like COW. The IG investigation has to be directed to the state and local agencies that allow this to happen. We think that the investigators will quickly discover that these agencies have not been given much guidance as to how to interpret the requirements. (Of course with Reading First, the Department took a hands-on approach to approving only particular products whose effectiveness was judged to be scientifically based, but this approach was exceptional.)&lt;br /&gt;&lt;br /&gt;Our second conclusion is that the current law is unenforceable because there is insufficient scientific evidence about the effectiveness of the products and services for which agencies want to use their NCLB funds. 
The law needs to be modified. But the solution is not to water down the provisions (e.g., by allowing anecdotal evidence if that’s all that is available) or remove them altogether as some suggest. The idea behind having evidence that an instructional program works is a good one. The law has to address how the evidence can be produced while supporting local innovation and choice. State and local agencies will need the funds to conduct proper evaluations. Most importantly, the law has to allow agencies to adopt “unproven” programs under the condition that they assist in producing the evidence to support their continued usage.&lt;br /&gt;&lt;br /&gt;CREW’s letter misses the mark. But an investigation by the IG may help to ignite a reconsideration of how schools can get the evidence they need. —DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-1046203702870377924?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/1046203702870377924/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2007/12/what-happens-when-publisher-doesnt-have.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1046203702870377924'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/1046203702870377924'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2007/12/what-happens-when-publisher-doesnt-have.html' title='What Happens When a Publisher Doesn’t Have Scientific Evidence?'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-445309708908424927</id><published>2007-10-15T11:13:00.000-07:00</published><updated>2010-01-15T11:27:41.716-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='random assignment experiments'/><category scheme='http://www.blogger.com/atom/ns#' term='Miller-McKeon Draft'/><category scheme='http://www.blogger.com/atom/ns#' term='Scientifically Based Research'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='Reading First program'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='scientific research'/><category scheme='http://www.blogger.com/atom/ns#' term='Senators Lugar and Bingaman'/><title type='text'>Congress Grapples with the Meaning of “Scientific Research”</title><content type='html'>Good news and bad news. As reported recently in &lt;a href="http://www.edweek.org/ew/index.html"&gt;Education Week&lt;/a&gt; (Viadero, 2007, October 17), pieces of legislation currently being put forward contain competing definitions for scientific research. The good news is that we may finally be getting rid of the obtuse and cumbersome term “Scientifically Based Research.” Instead we find some of the legislation using the ordinary English phrase “scientific research” (without the legalese capitalization). 
So far, the various proposals for NCLB reauthorization are sticking with the idea that school districts will find scientific evidence useful in selecting effective instructional programs and are mostly just tweaking the definition.&lt;br /&gt;&lt;br /&gt;So why is the definition of scientific research important? This gets to the bad news. It is important because the definition—whatever it turns out to be—will determine which programs are, in effect, on an approved list for purchase with NCLB funds.&lt;br /&gt;&lt;br /&gt;Let’s take a look at two candidate definitions, just focusing on the more controversial provisions.&lt;br /&gt;&lt;br /&gt;    * The Education Sciences Reform Act of 2002 says that research meeting its “scientifically based research standards” makes “claims of causal relationships only in random assignment experiments or other designs (to the extent such designs substantially eliminate plausible competing explanations for the obtained results)”&lt;br /&gt;    * However, the current House proposal (the Miller-McKeon Draft) defines “principles of scientific research” as guiding research that (among other things) makes “strong claims of causal relationships only in research designs that eliminate plausible competing explanation for observed results, which may include but shall not be limited to random assignment experiments.”&lt;br /&gt;&lt;br /&gt;Both say essentially the same thing, but the new wording takes the primacy off random assignment and puts it on eliminating plausible competing explanations. We see the change as a concession to researchers who find random assignment too difficult to pull off. These researchers are not, however, relieved of the requirement to eliminate competing explanations (for which randomized control remains the most effective method). 
Meanwhile, another bill, introduced recently by Senators Lugar and Bingaman, takes a radically different approach to a definition.&lt;br /&gt;&lt;br /&gt;    * This bill defines what it means for a reading program to be “research-proven” and proposes the requirements for the actual studies that would “prove” that the program is effective. Among the minimum criteria described in the proposal are:&lt;br /&gt;&lt;br /&gt;    * The program must be evaluated in not less than two studies in which:&lt;br /&gt;    * The study duration was not less than 12 weeks.&lt;br /&gt;    * The sample size of each study is not less than five classes or 125 students per treatment (10 classes or 250 students overall). Multiple smaller studies may be combined to reach this sample size collectively.&lt;br /&gt;    * The median difference between program and control group students across all qualifying studies is not less than 20 percent of student-level standard deviation, in favor of the program students.&lt;br /&gt;&lt;br /&gt;As soon as legislation tries to be this specific, counterexamples immediately leap to mind. For example, we are currently conducting a study of a reading program that fits the last two points but, because the program is designed as a 10-week intervention, it can never become research-proven under this definition. Another oddity is that the size of the impact and the size of the sample are specified, but not the level of confidence required—it is unlikely we would have any confidence in a finding of a 0.2 effect size with only 10 classrooms in the study. Perhaps the most unacceptable part of this definition is the term “research-proven.” This is far too strong and absolute. It suggests that as soon as two small studies are completed, the program gets a perpetual green light for district purchases under NCLB.&lt;br /&gt;&lt;br /&gt;As odd as this definition may be, we can understand why it was introduced. 
The most prevalent interpretation of the requirement for “Scientifically Based Research” in NCLB has been that the program under consideration should have been written and developed based on findings derived from scientific research. It was not required that the program itself have any scientific evidence of effectiveness. The Lugar-Bingaman proposal calls for scientific tests of the program itself. In Reading First, programs that had actual evidence of effectiveness were famously left off the approved list, while programs that simply claimed to be designed based on prior scientific research were put on. This proposal will help to level the playing field. To avoid the traps that open up when specific designs are legislated, perhaps the law could call for the convening of a broadly representative panel to hash out the differences between competing sets of criteria rather than enshrine one abbreviated set in federal law.&lt;br /&gt;&lt;br /&gt;But even with consensus on the review criteria for acceptable research (and for explaining the trade-offs to the consumers of the research reviews at the state and local level), we are still left with an approved list: a set of programs with sufficient scientific evidence of effectiveness to be purchased. Meanwhile, new programs (books, software, professional development, interventions, etc.) that have not yet been “proven” are becoming available every day.&lt;br /&gt;&lt;br /&gt;There is a relatively simple fix that would help democratize the process for states and districts that want to try something because it looks promising but has not yet been “proven” in a sufficient number of other districts. Wherever the law says that a program must have scientific research behind it, also allow the state or district to conduct the necessary scientific research as part of the federal funding.
So, for example, where the Miller-McKeon Draft calls for&lt;br /&gt;&lt;br /&gt;“a description of how the activities to be carried out by the eligible partnership will be based on a review of scientifically valid research,”&lt;br /&gt;&lt;br /&gt;simply change that to&lt;br /&gt;&lt;br /&gt;“a description of how the activities to be carried out by the eligible partnership will be based on a review of, or evaluation using, scientifically valid research.”&lt;br /&gt;&lt;br /&gt;Similarly, a call for&lt;br /&gt;&lt;br /&gt;“including integrating reliable teaching methods based on scientifically valid research”&lt;br /&gt;&lt;br /&gt;can instead be a call for&lt;br /&gt;&lt;br /&gt;“including integrating reliable teaching methods based on, or evaluated by, scientifically valid research.”&lt;br /&gt;&lt;br /&gt;This opens the way for districts to try things they think should work for them while helping to increase the total amount of research available for evaluating the effectiveness of promising new programs. Most importantly, it turns the static approved list into a process for continuous research and improvement.
—DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-445309708908424927?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/445309708908424927/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2007/10/congress-grapples-with-meaning-of.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/445309708908424927'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/445309708908424927'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2007/10/congress-grapples-with-meaning-of.html' title='Congress Grapples with the Meaning of “Scientific Research”'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-8814869972631919809</id><published>2007-09-15T11:28:00.000-07:00</published><updated>2010-01-15T11:31:38.697-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='what works Clearinghouse'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='Kathleen Kennedy Manzo'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='Bob Slavin'/><title type='text'>Ed Week: “Federal Reading Review Overlooks Popular Texts”</title><content 
type='html'>The August 29, 2007 issue of Education Week reports the release of the What Works Clearinghouse’s review of beginning reading programs. Out of nearly 900 studies that were reviewed, only 51 met the WWC standards, an average of about two studies for each reading program included. (120 other reading programs were examined in 850 studies deemed methodologically unacceptable.) The article, written by Kathleen Kennedy Manzo, notes that the major textbook offerings, on which districts spend hundreds of millions of dollars, did not have acceptable research available. Bob Slavin, an accomplished researcher and founder of the Success for All program (which got a middling rating on the WWC scale), also noted that the programs reviewed were mostly supplementary and smaller intervention programs, rather than the more comprehensive school-wide programs.&lt;br /&gt;&lt;br /&gt;Why is there this apparent bias in what is covered in WWC reviews? Is it in the research base or in the approach that the WWC takes to reviews? It is a bit of both. First, it is easier to find an impact of a program when it is supplemental and it is being compared to classrooms that do not have that supplement. This is especially true where the intervention is intense and targeted to a subset of the students. In contrast, consider trying to test a basal reading program. What does the control group have? Probably the prior version of the same basal or some other basal. Both programs may be good tools for helping teachers teach students to read, but the difference between the two is very hard to measure. In such an experiment, the “treatment” program would have “no discernible effect” (the WWC category for no measurable impact). Unlike a medical experiment where the control group gets a placebo, we can’t find a control group that has no reading program at all.
Probably the major reason there is so little rigorous research on textbook programs is that districts usually have no choice: they have to buy one or another. Research on supplementary programs, in contrast, can inform a discretionary decision and so has more value to the decision-maker.&lt;br /&gt;&lt;br /&gt;While it may be hard to answer whether one textbook program is more effective than another, a better question may be whether one works better for specific populations, such as inexperienced teachers or English learners. It is a useful question if you are deciding on a text for your particular district, but it is not a question that is addressed in WWC reviews.&lt;br /&gt;&lt;br /&gt;Another characteristic of WWC reviews is that the metric of impact is the same whether it is a small experiment on a highly defined intervention or a very large experiment on a comprehensive intervention. As researchers, we know that it is easier to show a large impact in a small targeted experiment. It is difficult to test something like Success for All that requires school-wide commitment. At Empirical Education, we suggest to educators that WWC is a good starting point to find out what research has been conducted on interventions of interest. But the WWC reviews are not a substitute for trying out the intervention in your own district. In a local experimental pilot, the control group is your current program. Your research question is whether the intervention is sufficiently more effective than your current program for the teachers or students of interest to make it worth the investment.
—DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-8814869972631919809?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/8814869972631919809/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2007/09/ed-week-federal-reading-review.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8814869972631919809'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8814869972631919809'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2007/09/ed-week-federal-reading-review.html' title='Ed Week: “Federal Reading Review Overlooks Popular Texts”'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7197809943968080149.post-8132570018893971295</id><published>2007-06-15T11:31:00.000-07:00</published><updated>2010-01-15T11:37:44.656-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Dr. 
Chris Dede'/><category scheme='http://www.blogger.com/atom/ns#' term='SETDA'/><category scheme='http://www.blogger.com/atom/ns#' term='empirical education'/><category scheme='http://www.blogger.com/atom/ns#' term='CoSN'/><category scheme='http://www.blogger.com/atom/ns#' term='Denis Newman'/><category scheme='http://www.blogger.com/atom/ns#' term='methodological standards'/><category scheme='http://www.blogger.com/atom/ns#' term='ISTE'/><title type='text'>National Study of Educational Software a Disappointment</title><content type='html'>The recent &lt;a href="http://ies.ed.gov/ncee/pubs/20074005/index.asp"&gt;report&lt;/a&gt; on the effectiveness of reading and mathematics software products provides strong evidence that, on average, teachers who are willing to pilot a software product and try it out in their classroom for most of a year are not likely to see much benefit in terms of student reading or math achievement. What does this tell us about whether schools should continue purchasing instructional software systems such as those tested? Unfortunately, not as much as it could have. The study was conducted under the constraint of having to report to Congress, which appropriates funds for national programs, rather than to the school district decision-makers, who make local decisions based on a constellation of school performance, resource, and implementation issues. Consequently we are left with no evidence either way as to the impact of software when purchased and supported by a district and implemented systematically.&lt;br /&gt;&lt;br /&gt;By many methodological standards, the study, which cost more than $10 million, is quite strong. The use of random assignment of teachers to take up the software or to continue with their regular methods, for example, assures that bias from self-selection did not play a role as it does in many other technology studies. 
In our opinion, the main weakness of the study was that it spread the participating teachers out over a large number of districts and schools and tested each product in only one grade. This approach encompasses a broad sample of schools but often leaves an individual teacher as the lone implementer in the school and one of only a few in the district. This potentially reduces the support that would normally be provided by school leadership and district resources, as well as the mutual support of a team of teachers in the building.&lt;br /&gt;&lt;br /&gt;We believe that a more appropriate and informative experiment would focus on the implementation in one or a small number of districts and in a limited number of schools. In this way, we can observe an implementation and measure characteristics such as how professional development is organized and how teachers are helped (or not helped) to integrate the software with district goals and standards. While this approach allows us to observe only a limited number of settings, it provides a richer picture that can be evaluated as a small set of coherent implementations. The measures of impact, then, can be associated with a realistic context.&lt;br /&gt;&lt;br /&gt;Advocates for school technology have pointed out limitations of the national study. Often the suggestion is that a different approach or focus would have demonstrated the value of educational technology. For example, a &lt;a href="http://www.iste.org/Template.cfm?Section=Home&amp;CONTENTID=16605&amp;TEMPLATE=/ContentManagement/ContentDisplay.cfm"&gt;joint statement from CoSN, ISTE, and SETDA&lt;/a&gt; released April 5, 2007, quotes Dr. Chris Dede, Wirth Professor in Learning Technologies at Harvard University: “In the past five years, emerging interactive media have provided ways to bring new, more powerful pedagogies and content to classrooms.
This study misestimates the value of information and communication technologies by focusing exclusively on older approaches that do not take advantage of current technologies and leading edge educational methods.” While Chris is correct that the research did not address cutting-edge technologies, it did test software that has been and, in most cases, continues to be successful in the marketplace. It is unlikely that technology advocates would call for taking the older approaches off the market. (Note that Empirical Education is a member of and active participant in CoSN.)&lt;br /&gt;&lt;br /&gt;Decision-makers need some basis for evaluating the software that is commercially available. We can’t expect federally funded research to provide sufficiently targeted or timely evidence. This is why we advocate for school districts getting into the routine of piloting products on a small scale before a district-wide implementation. If the pilots are done systematically, they can be turned into small-scale experiments that inform the local decision. Hundreds of such experiments can be conducted quite cost-effectively as vendor-district collaborations and will have the advantage of testing exactly the product, professional development, and support for implementation under exactly the conditions that the decision-maker cares about.
—DN&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7197809943968080149-8132570018893971295?l=empiricaleducation.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://empiricaleducation.blogspot.com/feeds/8132570018893971295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://empiricaleducation.blogspot.com/2007/06/national-study-of-educational-software.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8132570018893971295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7197809943968080149/posts/default/8132570018893971295'/><link rel='alternate' type='text/html' href='http://empiricaleducation.blogspot.com/2007/06/national-study-of-educational-software.html' title='National Study of Educational Software a Disappointment'/><author><name>Empirical Education</name><uri>http://www.blogger.com/profile/12947834552180195906</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
