A copula-based consistency analysis of education indicators

In this paper we investigate the consistency of quality indicators of the Brazilian public educational system. According to the newspaper Estado de São Paulo (Brazil) of January 18, 2017, only 7.3% of students in the third year of high school have an adequate level of mathematics, which shows the relevance of evaluating and assessing the Brazilian educational system. We explore the dependence between two indicators during the years 2013, 2014 and 2015: (i) the mean of the proportions (in two subjects, Portuguese and Mathematics) of students under the basic level (SARESP classification) and (ii) the failure rate. Indicators (i) and (ii) are the basis for defining the educational quality of public schools for the population of young people between 14 and 17 years old. The inspection is carried out through Bayesian estimation of the parameters of the Asymmetric Cubic Sections (ACS) copula. We show that the dependence profile behaves in a very unstable way from year to year, although during those years there were no substantial changes that would justify such instability. Through the copula we compute conditional probabilities of tail events, namely the probability of (i) assuming high values conditioned on a threshold in (ii), and we verify that an inversion occurred in the concordance/discordance between (i) and (ii). In 2013 the probability increases as the threshold in (ii) increases (concordance); in 2014 the threshold in (ii) is almost irrelevant to the probability; and in 2015 the probability decreases as the threshold in (ii) increases (discordance). The inspection of the tail dependence makes it possible to expose some kind of manipulation aimed, for instance, at maintaining a global index, the índice de desenvolvimento da educação de São Paulo (IDESP), used to classify the educational institutions.


Introduction
With information from institutions available almost continuously, it is now possible to regularly review the processes that affect the life of those institutions, as is the case for institutions related to health, safety and education, among others. Reviewing such processes is a healthy practice. To improve their performance, institutions usually adopt internal strategies, such as measuring their processes over some period of time using indices defined by consensus. Some sectors have consolidated indices that can be used to identify changes in performance. In general, indices are constructed with the intention of reproducing reality, summarizing it in one or a few values that have a simple interpretation and are easy to calculate.
In the present study we investigate the relationship between two indicators of the Brazilian educational system. According to the newspaper Estado de São Paulo (Brazil) of January 18, 2017, only 7.3% of students in the third year of high school have an adequate level of mathematics, which shows the relevance of constantly inspecting the performance of the educational system. We restrict our study to the intermediate level (students 14 to 17 years old) of public schools in the region of Guarulhos, for the years 2013, 2014 and 2015. Guarulhos is a city in the state of São Paulo. The city of São Paulo, capital of the state, is the third most populated city in America, behind New York and Mexico City. It is also the city with the largest Gross Domestic Product (GDP) in Latin America, which makes it a city of reference. Because of its constant expansion, the city of São Paulo has been growing toward several municipalities of the state, for instance toward Guarulhos in the northeast. Guarulhos is the second most populous city in the state of São Paulo.
In this study we inspect two indicators, denoted by X and Y. These indicators are components of a global index used in the state of São Paulo, the índice de desenvolvimento da educação de São Paulo (IDESP), created in 2007 (http://idesp.edunet.sp.gov.br/). Here X is the annual proportion of students classified below the baseline, per school, and Y is the annual failure rate, per school. Educational policies in the state of São Paulo encourage the monitoring of schools by means of several indices, among them the IDESP. The goal is to achieve an IDESP value equal to or higher than five by 2030, since, according to the Organisation for Economic Co-operation and Development (OECD), this value would put public schools in Brazil on the same level as schools of excellence in OECD member countries; see [1] for more details. We see in Figures 1 and 2 that schools in Guarulhos show a low IDESP value in comparison with the goal, despite the constant efforts made to improve their performance.
Since 2007, the year of its creation, the IDESP has not shown a progressive evolution, which has led us to inspect its most influential components, X and Y. Students can be classified at four levels: (i) under the basic level, (ii) basic level, (iii) adequate level, and (iv) advanced level, defined from an annual assessment called Sistema de Avaliação de Rendimento Escolar do Estado de São Paulo (SARESP). Students under the basic level demonstrate insufficient mastery of the contents, skills and abilities desirable for the school grade in which they find themselves. For details see the next two sections of this paper. From an ideal and simplistic perspective, the variables X and Y should exhibit a linear/concordant relationship. We do not observe this in our data (as can be concluded from Table 1), which leads us to study and model the dependence between X and Y under a more general approach. We use the Asymmetric Cubic Sections (ACS) copula to describe the dependence between X and Y, and we estimate the parameters of the model under a Bayesian perspective, year by year. This procedure allows us to construct annual estimates of $\mathrm{Prob}(U > u \mid V > v)$ and of the expected value $E(U \mid V > v)$, where U are the ranks of X scaled to [0,1] and V are the ranks of Y scaled to [0,1]. In general terms, these quantities allow us to compare, year by year, the impact of high Y values on the values of X. More precisely, given that high failure rates have been observed, we see how they affect the probability of high rates of students below the baseline and how they affect the mean rate of students below the baseline. The ACS family has already shown good performance in applications in the area, see for example [2] and [3]. It is also compatible with our data which, as we shall see, show very low correlation. Moreover, this family is analytically simple to treat, which facilitates its computational implementation.
In Section 2 we introduce the real problem and describe the data. Section 3 presents the model and the results. Finally, we present our general conclusions in the Conclusion section, which is followed by the acknowledgments and the references.

Index of education development of São Paulo State
In this section we explain the construction of the IDESP and we show the reasons that lead us to study two quantities that contribute to its definition.
The SARESP system aims to evaluate the educational quality of the schools and not the performance of each student directly. The system provides four levels of classification (under the basic, basic, adequate and advanced), and those levels are used to compose the IDESP, which serves as a measure of improvement in the quality of education in the state. The levels serve to diagnose the reality of the students of a given school, so that the teachers of that school can use the results to develop projects aimed at recovering the skills not developed by that particular group of students. In the SARESP system, each student is classified into one of the four levels separately in two subjects, Portuguese and Mathematics. For each subject and each school, the proportion of students in each level is computed: $a_m, a_p$ for under the basic; $b_m, b_p$ for basic; $c_m, c_p$ for adequate; and $d_m, d_p$ for advanced. The quantities with subscript m (p) refer to Mathematics (Portuguese), with $a_m + b_m + c_m + d_m = 1$ and $a_p + b_p + c_p + d_p = 1$. Formally, the IDESP index, denoted by g, is defined from a performance component, which is the mean of its Mathematics and Portuguese values, with $D = 10$, and from f, the proportion of approved students. For instance, when the proportion $a_m = 1$, the other proportions are zero ($b_m = c_m = d_m = 0$) and the Mathematics component equals 0 (low quality in Mathematics). When $d_m = 1$, the other proportions are zero ($a_m = b_m = c_m = 0$) and the Mathematics component equals 10 (high quality in Mathematics). This means that high values of g indicate that the school shows a good overall performance. That is, as expected, high proportions under the basic level and high failure rates are indicators of poor performance and imply low values of g. Each year, the schools receive individual goals to be achieved, defined in terms of the IDESP. These goals are generated by the Education Secretary (http://www.educacao.sp.gov.br/) and are based on the IDESP result of the previous year. When a school reaches the growth goal totally or partially, all the school staff is awarded a monetary complement, by merit, known as the education bonus. If the school has high failure rates and a high number of students under the basic level, it tends to have a low educational indicator and consequently does not receive the bonus. If this continues for three consecutive years, the school becomes a priority unit and, as a consequence, can undergo pedagogic interventions and detailed monitoring by the regional institution in charge, until the school changes its indicators. In the Guarulhos region this function is exercised by two sectors: Diretoria de ensino Guarulhos Norte (http://deguarulhosnorte.educacao.sp.gov.br/) and Diretoria de ensino Guarulhos Sul (http://deguarulhossul.educacao.sp.gov.br/).
What usually occurs is that schools have high numbers of students classified under the basic level, which should also lead to high failure rates. But in order to mitigate the impact this would have on the indicator, the school regulates the failure rates, always keeping the same pattern in the indicator, regardless of the number of students under the basic level. That is, what regulates the promotion of students is not how much they learn but a structural reality arising from methods of external control. This prospect is worrying, because every year some students are promoted to the next grade without knowing the minimum required in the previous one. Consequently, it also becomes more difficult to recover these students, given the large accumulated lag caused by this automatic promotion. It is not in a school's interest to report high failure rates consistent with the number of students below the basic level: besides lowering the index and directly affecting the bonus, such rates would also mean more work for the school, since a recovery plan designed for these students would have to be carried out.
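The incentive described in this section can be illustrated with a small sketch. It is not the official IDESP computation: it assumes that each subject's performance component has the lag-weighted form (1 - (3a + 2b + c)/3) * 10, chosen only because it reproduces the two boundary cases quoted above (0 when a_m = 1, 10 when d_m = 1), and that g is the product of the mean component and the proportion f of approved students. The function names and the example proportions are hypothetical.

```python
def performance_component(a, b, c, d, scale=10.0):
    """Quality score of one subject from the four level proportions.

    ASSUMPTION: lag weights 3, 2, 1, 0 for the levels under the basic,
    basic, adequate and advanced; this form is not taken from the paper.
    """
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("level proportions must sum to 1")
    lag = 3 * a + 2 * b + 1 * c + 0 * d
    return (1.0 - lag / 3.0) * scale


def idesp_sketch(math_levels, port_levels, approved):
    """Hypothetical index: mean subject score times the approval proportion."""
    score_m = performance_component(*math_levels)
    score_p = performance_component(*port_levels)
    return 0.5 * (score_m + score_p) * approved


# Hypothetical school: same learning levels, two different approval rates.
levels = (0.5, 0.3, 0.15, 0.05)
print(idesp_sketch(levels, levels, approved=0.75))
print(idesp_sketch(levels, levels, approved=0.95))
```

Under these assumptions the second value is larger than the first although the level proportions are unchanged, which is the mechanism by which regulating the failure rate can offset a high proportion of students under the basic level.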

Performance levels and fail rates
The data set consists of two scores X and Y recorded for each school at the intermediate level (from 14 to 17 years old): X = proportion of students classified under the basic level and Y = proportion of fails for that school. We have annual data from 2013 to 2015. Each school i receives a value $x_i = \frac{a_m(i) + a_p(i)}{2}$, which is the arithmetic mean of the proportions of students under the basic level in the two subjects. For the second variable, each school i receives the value $y_i$, which is the proportion of fails, by year. State schools participating in this study are listed in http://www.ime.unicamp.br/~veronica/schools.htm. See also the behavior of the IDESP for those years and those schools in Figures 1 and 2. Figure 3 shows the plots of the data X versus Y for the three years.
Since our focus is to identify the dependence between X and Y, we appeal to the concepts derived from Sklar's theorem (see [4]). If X and Y have joint distribution H, with marginal distributions F and G respectively, then there exists a copula C such that, for values x and y, $H(x, y) = C(F(x), G(y))$. Since we are studying the dependence between X and Y, the marginal distributions F and G have nothing to report on the relationship between X and Y. Also, if we define U = F(X) and V = G(Y), the concordance/discordance between X and Y is preserved by U and V, since F and G are non-decreasing functions. A natural representation of the values of U (and V, respectively) is given by the empirical ranks of the observations of X (and Y, respectively) scaled to [0,1]. For this purpose, we compute the pseudo-observations $\hat{u}_i = \hat{F}(x_i)\,\frac{n}{n+1}$ and $\hat{v}_i = \hat{G}(y_i)\,\frac{n}{n+1}$, where i = 1, . . ., n, $\hat{F}$ and $\hat{G}$ are the empirical distributions of X and Y, respectively, and n denotes the number of observations (schools).
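The pseudo-observations can be computed directly from their definition, since the empirical distribution evaluated at x_i, multiplied by n/(n+1), equals the number of observations not exceeding x_i divided by n+1. The sketch below uses hypothetical data and a hypothetical function name; tied values receive the same (maximum) rank, which is one possible convention and an assumption here.

```python
def pseudo_observations(values):
    """Return F_hat(x_i) * n / (n + 1) for each x_i (F_hat: empirical CDF)."""
    n = len(values)
    out = []
    for x in values:
        count_le = sum(1 for y in values if y <= x)  # equals n * F_hat(x)
        out.append(count_le / (n + 1))
    return out


# Hypothetical proportions under the basic level for four schools.
x = [0.2, 0.5, 0.1, 0.9]
print(pseudo_observations(x))
```

Dividing by n+1 rather than n keeps every pseudo-observation strictly inside (0,1), which avoids evaluating a copula density on the boundary of the unit square.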
In Table 1 we report Spearman's correlation coefficients between X and Y, year by year. The values of Spearman's correlation coefficient $\rho$ are low, although both variables are related to unsatisfactory performance and, for coherence, should be associated. The coefficient even fails to capture the dependence by showing a negative value in 2015. The results of Table 1 only mean that no linear relation is identified between the ranks of the observations, so the alternative is to use a non-linear model to represent the dependence between the rates. Thus, the focus of our study is the identification of the copula C. To reach this identification we restrict the possibilities for C to a sufficiently flexible family.
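The point that a low Spearman coefficient does not rule out dependence can be checked on a toy example. The sketch below computes the coefficient as the Pearson correlation of average ranks; the data are hypothetical and constructed so that y is a deterministic, non-monotone function of x.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = sum((p - mx) ** 2 for p in rx) ** 0.5
    sy = sum((q - my) ** 2 for q in ry) ** 0.5
    return cov / (sx * sy)


x = list(range(1, 10))
y = [(t - 5) ** 2 for t in x]  # fully determined by x, yet not monotone
print(spearman_rho(x, y))
```

In this example y is completely determined by x, but the decreasing and increasing halves cancel in the rank correlation, which is why a model richer than a single linear-in-ranks summary is needed.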

Model and results
Here we introduce the model explored in this paper. The model is a family of copulas that is a perturbation of the independence case, C(u, v) = uv. With this choice we also seek to accommodate situations of low correlation, such as those shown in Table 1.

Given the correlation spectrum allowed by the Farlie-Gumbel-Morgenstern family and the results of Table 1, our data could be compatible with this model. Thus, if the estimates of the parameters a and b were similar, we could argue that the dependence between U and V is well represented by the Farlie-Gumbel-Morgenstern model.
Seeking to explore stochastic-functional relationships between U and V, [5] presents a method of constructing copulas with cubic cross-sections; one of these models is given by Definition 3.1. For instance, if we fix $v = v_0$ in Definition 3.1, we obtain an expression that is cubic in u. Analogously, if we set $u = u_0$, the expression in Definition 3.1 is a cubic in v. In terms of the modeling process, these cubic forms aim to give greater flexibility to the type of dependence between U and V, which is more general than a linear dependence.
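As a minimal sketch of a copula with cubic sections, the code below assumes that the ACS family has the two-parameter form C(u, v) = uv + uv(1-u)(1-v)[(a-b)v(1-u) + b], with |b| <= 1 and (b - 3 - sqrt(9 + 6b - 3b^2))/2 <= a <= 1. Both the expression and the parameter region are assumptions made for illustration and should be checked against Definition 3.1; the function names are hypothetical. Under this form, a = b gives the Farlie-Gumbel-Morgenstern copula.

```python
import math


def acs_copula(u, v, a, b):
    """Assumed ACS form: independence uv plus a cubic perturbation."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)


def acs_density(u, v, a, b):
    """Mixed partial derivative d^2 C / (du dv) of the assumed form."""
    return (1
            + (a - b) * (1 - u) * (1 - 3 * u) * v * (2 - 3 * v)
            + b * (1 - 2 * u) * (1 - 2 * v))


def in_parameter_space(a, b):
    """Assumed admissible region for (a, b)."""
    if abs(b) > 1:
        return False
    root = math.sqrt(max(0.0, 9 + 6 * b - 3 * b * b))
    return (b - 3 - root) / 2 <= a <= 1


def fgm_copula(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula, used for comparison when a = b."""
    return u * v * (1 + theta * (1 - u) * (1 - v))


print(acs_copula(0.3, 0.6, 0.4, 0.4), fgm_copula(0.3, 0.6, 0.4))
print(in_parameter_space(-1.0, 0.5), in_parameter_space(0.0, 1.5))
```

For fixed v the first function is a polynomial of degree three in u, and for fixed u it is a polynomial of degree three in v, which is the cubic-sections property referred to above.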
Given a specific year, we compute the likelihood function of the sample of size n, that is, $L(a, b) = \prod_{i=1}^{n} c(\hat{u}_i, \hat{v}_i \mid a, b)$, where the function c is given by equation (1). Assuming a non-informative prior distribution on $(a, b) \in \Theta$, $\pi(a, b) \propto 1$, the posterior distribution of (a, b) is proportional to the likelihood function. We use a non-informative prior distribution on (a, b) in order to limit the impact of the prior distribution on the posterior distribution of (a, b). We also observe that the complexity of the parametric space $\Theta$ (see Definition 3.1) could hinder the use of an informative prior distribution without a very solid basis. For literature linking copula theory and Bayesian estimation, see [6] and [3]. The Bayesian estimates of a and b under a quadratic loss function, for each year, are shown in Table 2. In 2015, five schools did not participate in the study: Profa. Alice Chuery, Conselheiro Crispiniano, Hugo de Aguiar, Profa. Ilia Zilda Innocenti Blanco and Vila Any. We see that in none of the three cases does the model indicate the Farlie-Gumbel-Morgenstern copula, since the estimates of a and b look very different. A Bayesian approach is appropriate in these cases for several reasons, among them the moderate sample size available for a frequentist estimation of two parameters and the constraints on the parameters a and b. We estimate the probability $\mathrm{Prob}(U > u \mid V > v)$ and the expected value $E(U \mid V > v)$ by means of the values reported in Table 2, as follows.
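Under a flat prior and quadratic loss, the Bayesian estimate is the posterior mean, that is, the likelihood-weighted average of (a, b) over the parameter space. The sketch below approximates it on a regular grid. It is not the computation behind Table 2: the density and the parameter region are the assumed cubic-sections form C(u, v) = uv + uv(1-u)(1-v)[(a-b)v(1-u) + b], the grid approximation is a choice made here for simplicity, and the data are synthetic.

```python
import math


def acs_density(u, v, a, b):
    """Density of the assumed ACS form (an assumption, see lead-in)."""
    return (1
            + (a - b) * (1 - u) * (1 - 3 * u) * v * (2 - 3 * v)
            + b * (1 - 2 * u) * (1 - 2 * v))


def in_parameter_space(a, b):
    """Assumed admissible region for (a, b)."""
    if abs(b) > 1:
        return False
    root = math.sqrt(max(0.0, 9 + 6 * b - 3 * b * b))
    return (b - 3 - root) / 2 <= a <= 1


def posterior_mean(us, vs, grid=41):
    """Posterior mean of (a, b) under a flat prior, by grid approximation."""
    points, logliks = [], []
    for i in range(grid):
        a = -3.0 + 4.0 * i / (grid - 1)      # a ranges over [-3, 1]
        for j in range(grid):
            b = -1.0 + 2.0 * j / (grid - 1)  # b ranges over [-1, 1]
            if not in_parameter_space(a, b):
                continue
            dens = [acs_density(u, v, a, b) for u, v in zip(us, vs)]
            if min(dens) <= 0.0:
                continue                     # zero likelihood at this point
            points.append((a, b))
            logliks.append(sum(math.log(d) for d in dens))
    top = max(logliks)                       # subtract the max for stability
    weights = [math.exp(l - top) for l in logliks]
    total = sum(weights)
    a_hat = sum(w * p[0] for w, p in zip(weights, points)) / total
    b_hat = sum(w * p[1] for w, p in zip(weights, points)) / total
    return a_hat, b_hat


# Synthetic pseudo-observations: perfectly concordant, then discordant ranks.
n = 20
us = [(i + 1) / (n + 1) for i in range(n)]
print(posterior_mean(us, us))
print(posterior_mean(us, us[::-1]))
```

The grid sum stands in for the integral over the parameter space; a finer grid or a sampling scheme could replace it without changing the idea.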
If X and Y are continuous with cumulative distributions F and G respectively, and U = F(X) and V = G(Y) have 2-copula C, then $\mathrm{Prob}(U > u \mid V > v) = \frac{1 - u - v + C(u, v)}{1 - v}$. Then, using Definition 3.1, we define the estimation of $\mathrm{Prob}(U > u \mid V > v)$ as $\hat{P}(U > u \mid V > v) = \frac{1 - u - v + C(u, v \mid (\hat{a}, \hat{b}))}{1 - v}$, $u, v \in [0, 1]$ (2). Equation (3) follows as a consequence. Computing the partial derivative of the copula given by Definition 3.1, we obtain from equation (3) the expression on which the proposed estimation, equation (4), is based.
Returning to the real problem, we expect the variables X = proportion of students classified under the basic level and Y = proportion of fails for that school to behave in a way compatible with what they are measuring. To investigate in detail the coherence of the dependence between X and Y observed year after year, we first focus on the conditional dependence between tail events, estimated by equation (2); then we present a more traditional study of the mean value of U (ranks of X) conditioned on thresholds in V (ranks of Y), estimated by equation (4).
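The conditional tail probability in equation (2) can be evaluated for any copula once C is available. The sketch below does so for the assumed cubic-sections form C(u, v) = uv + uv(1-u)(1-v)[(a-b)v(1-u) + b], which is an assumption to be checked against Definition 3.1, and compares the result with the simplification that this form admits, (1-u) + u(1-u)v[(a-b)(1-u)v + b]. The parameter values are hypothetical and are not those of Table 2.

```python
def acs_copula(u, v, a, b):
    """Assumed ACS form (an assumption, see lead-in)."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)


def cond_tail_prob(u, v, a, b):
    """Prob(U > u | V > v) = (1 - u - v + C(u, v)) / (1 - v), for v < 1."""
    if not 0.0 <= v < 1.0:
        raise ValueError("v must lie in [0, 1)")
    return (1 - u - v + acs_copula(u, v, a, b)) / (1 - v)


def cond_tail_prob_closed(u, v, a, b):
    """Same quantity after cancelling the factor (1 - v) in the assumed form."""
    return (1 - u) + u * (1 - u) * v * ((a - b) * (1 - u) * v + b)


a_hyp, b_hyp = -1.0, 0.5  # hypothetical values inside the parameter region
for v in (0.0, 0.5, 0.9):
    print(v, cond_tail_prob(0.9, v, a_hyp, b_hyp),
          cond_tail_prob_closed(0.9, v, a_hyp, b_hyp))
```

Under this assumed form the conditional probability is a quadratic polynomial in v for each fixed u, and it reduces to 1 - u when a = b = 0, the independence case.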

Conditional tail dependence
The most reasonable behavior of (2) is to show an increasing tendency in the upper tail. That is, high values of U are expected to concentrate with high values of V. We show what the estimates reveal for certain values of u (thresholds on the ranks of X) across all possible values of v (thresholds on the ranks of Y). The behavior of (2), year by year, is illustrated in Figures 4 and 5 for the cases u = 0.5, 0.7 and 0.9. See Table 3 for other values of u.
In 2013, (2) is given by a concave quadratic curve. As u increases, (2) comes to consist only of the increasing part of the curve and its concavity becomes less pronounced, revealing an almost linear and increasing aspect in the case u = 0.9. The curves (2) of 2014 and 2015 are convex quadratic curves. For 2014, as u grows the curve becomes almost constant, a fact that can also be verified by inspecting Table 3 (case 2014) and is best visualized in Figure 6. For instance, given any threshold v, the probability of U > 0.9 is almost constant. In practical terms this means that large proportions of students below the basic level do not depend on the failure rate, which is an evident and extreme contradiction. In 2015, as u grows the curve loses its convexity and exhibits an almost linear and decreasing behavior for large threshold values in U (see also Table 3). That is, the higher the threshold in V, the smaller the chance of U exceeding values close to 1.
Since the dependence between X and Y is the same as the dependence between U and V, we see that the relationship between X and Y deteriorated concretely from 2013 to 2015, passing through conditional independence (in 2014) and reaching conditional discordance (in 2015), which does not make sense given the meaning of the variables.
Figure 6. $\hat{P}(U > u \mid V > v)$ according to equation (2), for $v \in [0, 1]$ and year 2014.
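The concave, flat and decreasing shapes described in this section can be related to the parameters with a short sketch. It assumes the cubic-sections form C(u, v) = uv + uv(1-u)(1-v)[(a-b)v(1-u) + b], under which the conditional probability (2) equals (1-u) + u(1-u)v[(a-b)(1-u)v + b]; its curvature in v then has the sign of a - b, and for u close to 1 its slope in v is dominated by b. The parameter pairs below are hypothetical and are not the estimates of Table 2.

```python
def slope_in_v(u, v, a, b):
    """Derivative in v of the assumed conditional probability (2)."""
    return u * (1 - u) * (2 * (a - b) * (1 - u) * v + b)


def curvature_in_v(u, a, b):
    """Second derivative in v of the assumed conditional probability (2)."""
    return 2 * (a - b) * u * (1 - u) ** 2


def describe_curve(u, a, b, steps=100):
    """Classify the curve v -> P(U > u | V > v) on a grid of thresholds v."""
    curv = curvature_in_v(u, a, b)
    shape = "concave" if curv < 0 else "convex" if curv > 0 else "linear"
    slopes = [slope_in_v(u, k / steps, a, b) for k in range(steps + 1)]
    if all(s > 0 for s in slopes):
        trend = "increasing"
    elif all(s < 0 for s in slopes):
        trend = "decreasing"
    else:
        trend = "non-monotone"
    return shape, trend


# Hypothetical parameter pairs, chosen only to mimic the qualitative shapes.
for a, b in ((-1.0, 0.5), (0.5, -0.5)):
    for u in (0.5, 0.9):
        print((a, b), u, describe_curve(u, a, b))
```

With the first pair the curve is concave and becomes increasing over the whole range of v once u is large; with the second pair it is convex and decreasing for u = 0.9, which mirrors the qualitative contrast between concordance and discordance discussed above.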

Central tendency
To build a global view of the behavior of U (ranks of students classified under the basic level) conditioned on values of V (ranks of fails) that exceed a threshold v, we estimate $E(U \mid V > v)$ by equation (4). When comparing the 3 years, a similar behavior of (4) is expected, since we are inspecting consecutive years in which no changes occurred in the educational system. Figure 7 and Table 4 show the results.
We note that the relationship between U and V exhibits different behaviors over these 3 years: one curve is concave and two are convex (see also Fig. 4). This fact shows the lack of robustness of the dependence between X and Y. We can compare the behavior of $\hat{E}(U \mid V > v)$ with the conditional probability $\hat{P}(U > u_0 \mid V > v)$, where $u_0$ is the value corresponding to the median of X, as listed in Table 5.
We verify that the functional behavior of $\hat{E}(U \mid V > v)$ (Fig. 7) and $\hat{P}(U > u_0 \mid V > v)$ (Fig. 8) is similar, as already anticipated when comparing Figures 4(a) and 7. In Table 4 we show the values given by equation (4), and in Table 6 the values of $\hat{P}(U > u_0 \mid V > v)$. These results lead us back to Table 1, where Spearman's correlation coefficient exposes its fragility. In the same way, it is to be expected that the mean values computed here do not clearly point out what is happening in the tail region of $[0,1]^2$, where we are interested in tracking the concordance/discordance between U and V. This fact justifies the previously developed conditional study.
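Because U takes values in [0,1], the conditional mean can be written as the integral over u of Prob(U > u | V > v). The sketch below uses this identity, which is not necessarily the route taken to obtain equation (4), together with the assumed cubic-sections form C(u, v) = uv + uv(1-u)(1-v)[(a-b)v(1-u) + b]. Under that assumption the integral evaluates to 1/2 + bv/6 + (a-b)v^2/12, and the code checks this against a midpoint-rule integration. The parameter values are hypothetical.

```python
def acs_copula(u, v, a, b):
    """Assumed ACS form (an assumption, see lead-in)."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)


def cond_tail_prob(u, v, a, b):
    """Prob(U > u | V > v) for v < 1."""
    return (1 - u - v + acs_copula(u, v, a, b)) / (1 - v)


def cond_mean_numeric(v, a, b, steps=2000):
    """E(U | V > v) as the midpoint-rule integral of the tail probability."""
    h = 1.0 / steps
    return h * sum(cond_tail_prob((k + 0.5) * h, v, a, b) for k in range(steps))


def cond_mean_closed(v, a, b):
    """Closed form of the same integral under the assumed copula."""
    return 0.5 + b * v / 6 + (a - b) * v * v / 12


a_hyp, b_hyp = -1.0, 0.5  # hypothetical values, not those of Table 2
for v in (0.0, 0.3, 0.9):
    print(v, cond_mean_numeric(v, a_hyp, b_hyp), cond_mean_closed(v, a_hyp, b_hyp))
```

In this closed form the parameters enter divided by 6 and 12, so the conditional mean stays close to 1/2 even when the tail probabilities differ markedly, which is consistent with the remark that mean values do not clearly show what happens in the tail region.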

Conclusion
In this paper we explore the dependence between two indicators during the years 2013, 2014 and 2015: (i) the mean of the proportions (in Portuguese and Mathematics) of students under the basic level (SARESP classification) and (ii) the failure rate. The data come from around 100 public schools of the city of Guarulhos, the second largest city of the state of São Paulo. The dependence is inspected through Bayesian estimation of the parameters of the ACS copula, a model adopted for its flexibility. We show that the dependence profile behaves in a very unstable way from year to year, although during those years there were no substantial changes that would justify such variable behavior. The Bayesian point estimates of the parameters indicate this instability (see Table 2), which is confirmed by the influence of those estimates on the conditional mean curve given by equation (4). The mean value of the ranks of (i) conditioned on a threshold in (ii) behaves very differently across the 3 years. As the indications of Table 1 suggest, global measures such as the conditional mean value (4) may not be appropriate to identify what is happening. It is suspected that some kind of handling may exist in (i) and/or (ii), due to structural aspects of the educational system, which could explain the difference in dependence profiles seen, for example, in Figure 7. To understand the relation between (i) and (ii), we inspect the conditional dependence in different upper tail regions of $[0,1]^2$ for the marginal ranks of (i) and (ii) scaled to [0,1]. The behavior of the tail events given by equation (2) is represented in Figure 5. In 2013 the conditional probability behaves as expected: the higher the threshold on the failure rate, the higher the probability that the classification under the basic level exceeds 90%. In 2014, the thresholds on the failure rate do not influence the probability that the classification under the basic level exceeds 90%. In 2015, the higher the threshold on the failure rate, the lower the probability that the classification under the basic level exceeds 90%. That is to say, the concordance between (i) and (ii) verified in 2013 is inverted into discordance in 2015, precisely at the most critical values, which are high failure rates and high proportions under the basic level.
Based on the study, we perceive the need to review the use of global indices such as the IDESP for the development of policies to control the quality of education. As illustrated in Figure 1, the IDESP appears to exhibit some stability or very slight improvement and at the same time is able to mask aspects that are relevant and decisive for quality in education. More precisely, it allows the effects of relevant indicators, such as (i) and (ii), to be mitigated.
Table 6. $\hat{P}(U > u_0 \mid V > v)$ according to equation (2), years 2013, 2014 and 2015, with $u_0$ given by Table 5.