Assessment Literacy: Item Statistics

At the 2018 Connecting Communities of Education Stakeholders Conference in Greensboro, I was pleased to see several sessions on using Schoolnet data to identify skill weaknesses and to form student remediation groups. The presenter began by listing the reasons teachers struggle with data analysis:

  • Not enough time
  • Not knowing how to interpret the data
  • Too many numbers, and
  • Can’t get past the negative data (results).

The presenter then outlined the steps and strategies essential for making the most of the Schoolnet student scores. These included looking at the overall average score on the assessment and the percentage of students who correctly answered each item. I completely endorse the approach the presenter recommended for examining the assessment data and identifying students in need. I also recommend that someone at the district level look at the item statistics and communicate information about “weak” items to teachers, to avoid misinterpretation of the student assessment results.

Digging Deeper Into the Scores

I sat next to an instructional coach who was following along with her own district’s benchmark data, and she shared her reports with me for discussion. As we looked at the student scores, I asked if she had examined the item discrimination values. She was not familiar with the term. I explained that item discrimination is the ability of an item to differentiate high-performing students from low-performing students. For example, suppose an item has a p-value of .50, meaning that 50% of the students answered it correctly; we could conclude it is an item of medium difficulty. However, if that same item has a very low (or negative) discrimination value, it means that the higher-performing students got the item incorrect while the lower-performing students got the answer correct.

Here is the sample data for the item:
  • Answer A (incorrect) – selected by 40% of the high-performing students.
  • Answer B (correct) – selected by 42% of the lower-performing students.
  • Answer C (incorrect) – selected by 10% of the lower-performing students.
  • Answer D (incorrect) – selected by 8% of the lower-performing students.

As you can see in this oversimplified example, the item was probably answered correctly by chance, and it did not add to our understanding of student performance on that skill. The question to be asked is: What led the high-performing students to select answer A?

  1. Was the item worded in such a way that led high performing students to mistakenly select answer A?
  2. Was there some gap between what was taught and what was learned that resulted in the high-performing students’ error?
Item Discrimination Statistics

The statistical procedure for computing item discrimination is called a point biserial correlation. The procedure transforms each response to the item from A–D into a 0 (incorrect) or a 1 (correct). A correlation is then computed between the 0/1 item score and the total percent correct score. For the example above, the point biserial correlation is -0.511.
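The computation can be sketched in a few lines of Python. The response data below is made up for illustration; the function simply correlates the 0/1 item score with the total score, which is all a point biserial correlation is.

```python
# Hypothetical data: 10 students' scores on one item (1 = correct,
# 0 = incorrect) and their total test scores. Values are made up.
from statistics import mean, pstdev

item  = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
total = [35, 88, 92, 40, 75, 30, 45, 90, 38, 50]

def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total score."""
    mx, my = mean(item_scores), mean(total_scores)
    cov = mean((x - mx) * (y - my)
               for x, y in zip(item_scores, total_scores))
    return cov / (pstdev(item_scores) * pstdev(total_scores))

print(round(point_biserial(item, total), 3))
```

In this invented data the students who answered the item correctly have low total scores, so the correlation comes out strongly negative, just like the problem item discussed above.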

 When more students in the lower performing group than in the upper performing group select the right answer to an item, the item actually has negative validity. Assuming that the criterion itself has validity, the item is not only useless but is actually serving to decrease the validity of the test.
See:   for an excellent discussion of this topic.

My Recommendations:
  1. Design your assessments so that each answer choice will provide meaningful information about the students’ understanding of the underlying skill.
  2. Include enough items on the assessment so that the skills are adequately sampled. The instructional coach showed me a 15-item assessment on which the average percent correct was 45%. With so few items, decisions were being made about students who got only 5–7 items correct.
  3. Field test each benchmark with a few students to identify problems before administering the test to hundreds of students.
  4. Always look at the p-value and item discrimination values for each item.
  5. Ask students why they selected an answer, especially if the item has a low discrimination value.
  6. Train the teachers in how to interpret the data.
    1. Provide protocols to guide the teachers in the interpretation process.
    2. Provide item analysis information; for example, if a student selected the incorrect foil A, it probably means there is a misunderstanding of a particular concept. That way, all students who got that item incorrect by answering A can be remediated on that misunderstanding.
    3. Provide the data in a way that it is easy to view and manipulate.
    4. Have teachers COLLABORATIVELY examine their data in a shared discussion so that one teacher can scaffold other teachers’ understanding of the test results.
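The item-analysis idea in recommendation 6 can be sketched in code. The student names, answer choices, and keyed answer below are all hypothetical; the point is just to group students by the foil they chose so a shared misunderstanding can be remediated together.

```python
from collections import defaultdict

# Hypothetical responses to a single item; names, answers, and the
# keyed answer "B" are invented for illustration.
responses = {"Ana": "A", "Ben": "B", "Cal": "A", "Dee": "C", "Eli": "B", "Flo": "A"}
correct = "B"

# Group students by the answer they chose, so everyone who picked the
# same incorrect foil can be remediated on the same misunderstanding.
by_choice = defaultdict(list)
for student, answer in responses.items():
    by_choice[answer].append(student)

p_value = len(by_choice[correct]) / len(responses)  # proportion correct
remediation_group_A = by_choice["A"]                # all chose foil A

print(p_value)
print(remediation_group_A)
```

The same tallies give the p-value for free, and each non-empty foil group becomes a candidate remediation group with a shared likely misconception.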

Dr. Lewis Johnson
Lead Consultant
Data Smart LLC

PowerSchool Enterprise Reporting

At the 2018 Connecting Communities of Education Stakeholders Conference, I was very pleased to be in a session where Mark Housner from NCDPI demonstrated the use of Oracle Application Express (APEX) for PowerSchool Enterprise Reporting. All of the reports a user can create are interactive-style reports and allow some manipulation of the data. SQL queries can be created with the query builder, combining data from tables in the ER categories, and turned into reports.

There are two applications within the APEX site when you log in. The system will allow you to create other applications; however, when the site is refreshed, those added applications are removed. Therefore, it is best to create new pages and reports within the two existing applications, using the tables accessible in Enterprise Reporting. If you are interested in using this functionality in ER, I suggest learning basic SQL by taking the Udemy course titled Oracle SQL – Step by Step SQL. A Udemy course on Application Express is also quite helpful. These two courses are the next best thing to spending thousands of dollars attending online courses on SQL and APEX.

A limitation of the ER APEX environment is that you cannot add data to your queries that is not included in the PowerSchool ER categories, such as mClass data and EVAAS exports (e.g., projected scores). For this reason, a complete data reporting solution really needs to live outside the Enterprise Reporting environment. That is why Data Smart LLC builds its data mart sites using Application Express in a secure hosted environment, with a custom security system controlling access to the data and an unlimited number of data tables.

Have fun with APEX in Enterprise Reporting, and if you want to know more about how APEX can eliminate spreadsheet reporting for testing data, contact

select Table1.FNAME, Table2.LNAME,
       Table1.FNAME || ', ' || Table2.LNAME
  from Table1
 inner join Table2 on Table1.SID = Table2.SID
 where Table1.FNAME = 'Lew';

Dr. Lew Johnson
Application Express Developer

Seeing Your Data

For the past few weeks, I have had the opportunity to spend more time learning instead of just “doing” data. It has been great to step back, re-connect with a few of the data programs I have used, and dig into one I have known about for quite some time. The program is expensive as a single tool for reporting data, and its cost kept me from making it a priority to learn. The program I am referring to is Tableau ( ).

Tableau is primarily a data visualization tool, which means it specializes in reporting data by presenting it in graphic form rather than in table form. The data marts created by Data Smart LLC generally have a mix of tables and visualizations. Due to some limitations within Application Express ( ), not all graphic styles are available, such as box plots. While most types of graphs are available in APEX, and can be made dynamic through auto-updating and filtering, I like box plots for examining test score data.

My solution is to create a Tableau visualization using the test score data without identifiers (PII), publish it to the public server, and then link that box plot graphic to a report region in APEX. This works fine.

But let’s return to the importance and utility of data visualizations. On one page of a report, I list the statistics of the test for the district and then for the school. I also provide histograms so the user can see the distribution of scores. But when the two distributions are combined, the data begins to tell a story and becomes much easier to understand.

APEX Example
Below is the graphic of the school and the district made in APEX. You can immediately see that this school did rather well on the test compared to the district average. Now you can see the distributions and begin to think more deeply about the data. That is what visualizations do! Then you want to dig deeper into the particulars of the data.

[Histogram graphic]
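The combined histogram boils down to bucketing both score sets into the same bins so the two distributions can be overlaid. A minimal sketch, with made-up scores and bin edges:

```python
# Bucket school and district scores into SHARED bins so the two
# distributions can be overlaid in one chart. All values are invented.
def histogram(scores, edges):
    """Count scores falling in [edges[i], edges[i+1]) for each bin."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return counts

edges = [0, 20, 40, 60, 80, 101]          # shared bin edges, 0-100 scale
district = [35, 48, 52, 55, 61, 67, 70, 74, 82]
school   = [58, 63, 69, 72, 77, 81, 88]

print(histogram(district, edges))
print(histogram(school, edges))
```

Whatever charting tool draws the overlay, using one set of bin edges for both groups is what lets the viewer compare the school’s distribution to the district’s at a glance.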

In summary, as data reporters and data analysts, we should not be confined to one data reporting tool. We also need more than one format for reporting and presenting the data. We should use a variety of tools and make data presentations that truly inform the user.

The Student Side of Data

Sharing Data with Students
It is common for school districts to compile and analyze student test scores in an effort to understand what is happening in the schools and to identify areas which need attention to improve performance. One practice, which I strongly support, is the reporting of student performance to students in a manner which helps them individually understand their strengths and weaknesses, so they can see the relationship between what they do and the results. Additionally, it is worthwhile to let each student know specifically what skills he/she needs to work on to improve.

This practice of sharing student performance is essential for engaging the student in the education process. By engaging the student in this way, educators can address the other side of the learning coin: the affective component. It has gained attention now as Social-Emotional Learning. Aspects of Social-Emotional Learning include, but are not limited to: creating a growth mindset, developing learning self-efficacy, and fostering a sense of belonging in school.

Interaction Effects
When teaching pre-service teachers, I presented a list of education strategies based on lists similar to those in Marzano’s “What Works” and Hattie’s “Visible Learning”. The key feature of the message was the interaction effect of using some of these strategies together. For example, if strategy 1 resulted in 2 units of improved student performance, and strategy 2 resulted in 2 units of improved student performance, then through the use of BOTH, the interaction effect could yield 6 units of improved performance. This is the same concept of interaction effects found in statistics.
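The arithmetic above can be written as a tiny two-way model with an interaction term. The coefficients below are the made-up units from the example, not measured effect sizes:

```python
# Toy two-way model: gain = b1*s1 + b2*s2 + b12*s1*s2, where s1 and s2
# indicate (0/1) whether each strategy is used. The coefficients are
# the invented "units" from the example, not real effect sizes.
def gain(s1, s2, b1=2, b2=2, b12=2):
    return b1 * s1 + b2 * s2 + b12 * s1 * s2

print(gain(1, 0))  # strategy 1 alone -> 2
print(gain(0, 1))  # strategy 2 alone -> 2
print(gain(1, 1))  # both together  -> 6, more than 2 + 2
```

The interaction coefficient b12 is exactly the extra gain that appears only when both strategies are used together, which is the statistical meaning of an interaction effect.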

Therefore, the question at hand is: “What are we waiting for?” There is evidence that tapping into the Social-Emotional aspects of learning is powerful. We have the tools to support and measure these student characteristics. If these data points could be collected and viewed within the context of student performance, we could have evidence that teaching “both sides of the coin” works and, more importantly, we could intervene by understanding which students need attention to optimize their performance.

A Resource
A significant resource for building capacity in this area is the information and surveys from the Panorama Education website. On this site you will find the information and tools needed to get started with the affective side of your student population. Use this information to formulate a plan for addressing this powerful component, which will serve as a catalyst for improved student growth and performance.