https://eeecon.uibk.ac.at/~zeileis/Achim Zeileis2019-07-01T08:08:41+02:00Research homepage of Achim Zeileis, Universität Innsbruck. <br/>Department of Statistics, Faculty of Economics and Statistics. <br/>Universitätsstr. 15, 6020 Innsbruck, Austria<br/>Tel: +43/512/507-70403Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Jekyllhttps://eeecon.uibk.ac.at/~zeileis/news/power_partitioning/The power of unbiased recursive partitioning2019-07-01T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/The significance tests underlying the unbiased tree algorithms CTree, MOB, and GUIDE are embedded into a unifying framework. This allows the relative strengths and weaknesses to be assessed in a variety of setups, highlighting the advantages of score-based tests (as in CTree/MOB) vs. residual-based tests (as in GUIDE).<p>The significance tests underlying the unbiased tree algorithms CTree, MOB, and GUIDE are embedded into a unifying framework. This allows the relative strengths and weaknesses to be assessed in a variety of setups, highlighting the advantages of score-based tests (as in CTree/MOB) vs. residual-based tests (as in GUIDE).</p> <h2 id="citation">Citation</h2> <p>Schlosser L, Hothorn T, Zeileis A (2019). <em>“The Power of Unbiased Recursive Partitioning: A Unifying View of CTree, MOB, and GUIDE”</em>, arXiv:1906.10179, arXiv.org E-Print Archive. <a href="https://arXiv.org/abs/1906.10179">https://arXiv.org/abs/1906.10179</a></p> <h2 id="abstract">Abstract</h2> <p>A core step of every algorithm for learning regression trees is the selection of the best splitting variable from the available covariates and the corresponding split point. Early tree algorithms (e.g., AID, CART) employed greedy search strategies, directly comparing all possible split points in all available covariates. However, subsequent research showed that this is biased towards selecting covariates with more potential split points. 
Therefore, unbiased recursive partitioning algorithms have been suggested (e.g., QUEST, GUIDE, CTree, MOB) that first select the covariate based on statistical inference using p-values that are adjusted for the possible split points. In a second step, a split point optimizing some objective function is selected within the chosen split variable. However, different unbiased tree algorithms obtain these p-values from different inference frameworks and their relative advantages or disadvantages are not yet well understood. Therefore, three popular approaches are considered here: classical categorical association tests (as in GUIDE), conditional inference (as in CTree), and parameter instability tests (as in MOB). First, these are embedded into a common inference framework encompassing parametric model trees, in particular linear model trees. Second, it is assessed how different building blocks from this common framework affect the power of the algorithms to select the appropriate covariates for splitting: observation-wise goodness-of-fit measure (residuals vs. model scores), dichotomization of residuals/scores at zero, and binning of possible split variables. 
This shows that specifically the goodness-of-fit measure is crucial for the power of the procedures, with model scores without dichotomization performing much better in many scenarios.</p> <h2 id="software">Software</h2> <p>CRAN package: <a href="https://CRAN.R-project.org/package=partykit">https://CRAN.R-project.org/package=partykit</a><br /> Development version with some extensions enabled: <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/partykit_1.2-4.2.tar.gz">partykit_1.2-4.2.tar.gz</a><br /> Replication materials: <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/simulation.zip">simulation.zip</a></p> <h2 id="main-results">Main results</h2> <p>The manuscript compares three so-called unbiased recursive partitioning algorithms that employ statistical inference to adjust for the number of possible splits in a split variable: GUIDE (<a href="http://www.jstor.org/stable/24306967">Loh 2002</a>), CTree (<a href="https://doi.org/10.1198/106186006x133933">Hothorn <em>et al.</em> 2006</a>), MOB (<a href="https://doi.org/10.1198/106186008x319331">Zeileis <em>et al.</em> 2008</a>).</p> <p>First, it is pointed out what the similarities and the differences in the algorithms are, specifically with respect to the split variable selection through statistical tests. Second, the power of these tests is studied for a “stump”, i.e., a single split only. Third, the capability of the entire algorithm (including a pruning strategy) to recover the correct partition in a “tree” with two splits is investigated.</p> <p>In all cases, the three algorithms are employed to learn <em>model-based</em> trees where in each leaf of the tree a linear regression model is fitted with intercept β<sub>0</sub> and slope β<sub>1</sub>. 
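</p> <p>As a minimal sketch of such a model-based tree (not the simulation code from the manuscript; data and variable names are made up for illustration), a linear model tree can be fitted with MOB via the <code class="highlighter-rouge">lmtree()</code> function from <code class="highlighter-rouge">partykit</code>:</p> <pre><code class="language-r">## illustrative data: the slope changes when split variable z1 exceeds 0.5,
## z2 is a pure noise variable
library("partykit")
set.seed(1)
d <- data.frame(x = runif(200), z1 = runif(200), z2 = runif(200))
d$y <- ifelse(d$z1 > 0.5, 1 + 2 * d$x, 1 - 2 * d$x) + rnorm(200, sd = 0.3)

## linear model y ~ x in each leaf, z1 and z2 as candidate split variables
tr <- lmtree(y ~ x | z1 + z2, data = d)
print(tr)  ## should split in z1, not in the noise variable z2
</code></pre> <p>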
The simulations then vary whether only the intercept β<sub>0</sub> or the slope β<sub>1</sub> or both differ in the data.</p> <h3 id="building-blocks-of-the-algorithms-and-their-test-statistics">Building blocks of the algorithms and their test statistics</h3> <p>All three algorithms proceed by first fitting the model (here: linear regression by OLS) in a given subgroup (or node) of the tree. Then they extract some kind of goodness-of-fit measure (either residuals or full model scores) and test whether this measure is associated with any of the split variables. The variable with the highest association (i.e., lowest p-value) is employed for splitting and then the procedure is repeated recursively in the resulting subgroups.</p> <p>For “pruning” the tree to the right size one can either first grow a larger tree and then prune those splits that are not relevant enough (post-pruning). Or the algorithm can stop splitting when the association test is not significant anymore (pre-pruning).</p> <p>The default combinations of fitted model type, test type, and pruning strategy for the three algorithms are given in the following table.</p> <table> <thead> <tr> <th style="text-align: left">Algorithm</th> <th style="text-align: left">Fit</th> <th style="text-align: left">Test</th> <th style="text-align: left">Pruning</th> </tr> </thead> <tbody> <tr> <td style="text-align: left">CTree</td> <td style="text-align: left">Non-parametric</td> <td style="text-align: left">Conditional inference</td> <td style="text-align: left">Pre</td> </tr> <tr> <td style="text-align: left">MOB</td> <td style="text-align: left">Parametric</td> <td style="text-align: left">Score-based fluctuation</td> <td style="text-align: left">Pre (or post with AIC/BIC)</td> </tr> <tr> <td style="text-align: left">GUIDE</td> <td style="text-align: left">Parametric</td> <td style="text-align: left">Residual-based chi-squared</td> <td style="text-align: left">Post (cost-complexity pruning)</td> </tr> </tbody> 
</table> <p>Thus, the main difference is the testing strategy, but the pruning is also relevant. Although the tests come from very different motivations at first sight, they are actually not that different. When assessing the association with the split variable, the following three properties are most relevant:</p> <ul> <li><em>Goodness-of-fit measure:</em> Either the full model scores are used, which are here bivariate with one component for the intercept (= residuals) and one component for the slope, or only the residuals.</li> <li><em>Dichotomization of residuals/scores:</em> Are the numeric values used for the residuals/scores? Or are they dichotomized at zero?</li> <li><em>Categorization of split variables:</em> Are the numeric values used for the split variables? Or are they binned at the quartiles?</li> </ul> <p>An overview of the corresponding settings for the three algorithms is given in the following table. Additionally, the tests differ somewhat in how they aggregate across the possible splits considered. 
Either in a sum-of-squares statistic or a maximally-selected statistic.</p> <table> <thead> <tr> <th style="text-align: left">Algorithm</th> <th style="text-align: left">Scores</th> <th style="text-align: left">Dichotomization</th> <th style="text-align: left">Categorization</th> <th style="text-align: left">Statistic</th> </tr> </thead> <tbody> <tr> <td style="text-align: left">CTree</td> <td style="text-align: left">Model scores</td> <td style="text-align: left">–</td> <td style="text-align: left">–</td> <td style="text-align: left">Sum of squares</td> </tr> <tr> <td style="text-align: left">MOB</td> <td style="text-align: left">Model scores</td> <td style="text-align: left">–</td> <td style="text-align: left">–</td> <td style="text-align: left">Maximally selected</td> </tr> <tr> <td style="text-align: left">GUIDE</td> <td style="text-align: left">Residuals</td> <td style="text-align: left">X</td> <td style="text-align: left">X</td> <td style="text-align: left">Sum of squares</td> </tr> </tbody> </table> <p>Subsequently, these algorithms are compared in two simulation studies. More details and more simulation studies can be found in the manuscript. In addition to the three default algorithms, a modified GUIDE algorithm using model scores instead of residuals (GUIDE+scores) is considered.</p> <h3 id="properties-of-the-tests-for-a-single-split-or-stump">Properties of the tests for a single split (or stump)</h3> <p>Clearly, the different choices made in the construction influence the inference properties of the significance tests. Hence, in a first step we investigate the power properties of the tests when there is only one split in one of the split variables (among further noise variables). 
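</p> <p>The two aggregation schemes can be mimicked (though not exactly reproduced in the algorithms’ default forms) with the <code class="highlighter-rouge">strucchange</code> package, which computes the empirical fluctuation process of the model scores ordered by a split variable and aggregates it with different functionals; the toy data here are made up:</p> <pre><code class="language-r">## score-based fluctuation tests for a linear model y ~ x,
## ordered by the candidate split variable z1
library("strucchange")
set.seed(2)
d <- data.frame(x = runif(200), z1 = runif(200))
d$y <- ifelse(d$z1 > 0.5, 1 + 2 * d$x, 1 - 2 * d$x) + rnorm(200, sd = 0.3)
scus <- gefp(y ~ x, fit = lm, order.by = ~ z1, data = d)
sctest(scus, functional = supLM(0.1))  ## maximally-selected statistic
sctest(scus, functional = meanL2BB)    ## sum-of-squares-type statistic
</code></pre> <p>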
The split can pertain either to the intercept β<sub>0</sub> only, to the slope β<sub>1</sub> only, or to both.</p> <p>The plot below shows the probability of selecting the true split variable (Z<sub>1</sub>) with the minimal p-value against the magnitude of the difference in the regression coefficients (δ). For a split in the middle of the data (50%) pertaining only to the intercept β<sub>0</sub> (top left panel) all tests perform almost equivalently. However, if the split only affects the slope β<sub>1</sub> (middle column) it is much better to use score-based tests rather than residual-based tests (as in GUIDE), which cannot pick up changes that do not affect the conditional mean. Moreover, if the split occurs not in the middle (50% quantile, top row) but in the tails (90% quantile, bottom row) it is better to use a maximally-selected statistic (as in MOB) rather than a sum-of-squares statistic.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/results-stump.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/results-stump.png" alt="results: stump" /></a></p> <h3 id="properties-of-the-algorithms-for-a-tree-with-two-splits">Properties of the algorithms for a tree with two splits</h3> <p>One could argue that the power properties of the tests may be crucial when pre-pruning (based on statistical significance) is used. However, when combined with cost-complexity post-pruning it may not be so important to have particularly high power. As long as the power for the true split variables is higher than for the noise variables, it might be sufficient to select the correct split variable.</p> <p>This is assessed in a simulation for a tree with two splits, each depending on a difference of magnitude δ in the two regression coefficients. The adjusted Rand index is used to assess how well the partition found by the tree conforms with the true partition. 
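</p> <p>As a small illustration of the adjusted Rand index itself (the partitions below are made up; <code class="highlighter-rouge">mclust</code> provides one of several ARI implementations in R):</p> <pre><code class="language-r">## hypothetical true partition of eight observations vs. one recovered
## by a tree that misassigns a single observation
library("mclust")
true_part <- c(1, 1, 1, 2, 2, 3, 3, 3)
tree_part <- c(1, 1, 2, 2, 2, 3, 3, 3)
adjustedRandIndex(true_part, true_part)  ## 1: identical partitions
adjustedRandIndex(true_part, tree_part)  ## below 1: imperfect recovery
</code></pre> <p>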
The columns of the display below are for splits that occur in the middle of the data vs. later in the sample (left to right).</p> <p>And indeed it can be shown that post-pruning (bottom row) mitigates many of the power deficits of the testing strategies compared to significance-based pre-pruning (top row). However, it is still clearly better to use a score-based test (as in CTree, MOB, and GUIDE+scores) than a residual-based test (as in GUIDE). Also, pre-pruning may even lead to slightly better results than post-pruning when based on a powerful test.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/results-tree.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-07-01-power_partitioning/results-tree.png" alt="results: tree" /></a></p> <h3 id="summary">Summary</h3> <p>Using several simulation setups, we have shown that in many circumstances CTree, MOB, and GUIDE perform very similarly for recursive partitioning based on linear regression models. However, in some settings score-based tests clearly outperform residual-based tests (the latter may even lack power altogether). To some extent cost-complexity post-pruning can mitigate power deficits of the testing strategy, but pre-pruning typically works just as well as long as the significance test is sufficiently powerful.</p> <p>Furthermore, other simulations in the manuscript show that dichotomization of residuals/scores should be avoided as it reduces the power of the tests. Note that avoiding it is very easy in GUIDE: Instead of chi-squared tests one can simply use one-way ANOVA tests. 
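</p> <p>A toy version of these two residual-based tests in base R (the residuals and the split variable below are made up) shows how little changes between them:</p> <pre><code class="language-r">## residuals from a node's fitted model and a candidate split variable
set.seed(3)
res <- rnorm(100)
z <- runif(100)
## bin the split variable at its quartiles
zcat <- cut(z, quantile(z, 0:4 / 4), include.lowest = TRUE)

chisq.test(table(res > 0, zcat))  ## residuals dichotomized at zero
oneway.test(res ~ zcat)           ## numeric residuals via one-way ANOVA
</code></pre> <p>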
Finally, in the appendix of the manuscript it is shown that maximally-selected statistics (as in MOB) work better for abrupt splits late in the sample while the sum-of-squares statistics (from CTree and GUIDE) work better for smooth(er) transitions.</p>2019-07-01T00:00:00+02:00https://eeecon.uibk.ac.at/~zeileis/news/fifawomen2019/Hybrid Machine Learning Forecasts for the 2019 FIFA Women's World Cup2019-06-05T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Using a random forest ensemble learner we obtain probabilistic forecasts for the FIFA Women's World Cup in France: The clear favorite is defending champion USA followed, with some margin, by host France, England, and Germany.<p>Using a random forest ensemble learner we obtain probabilistic forecasts for the FIFA Women's World Cup in France: The clear favorite is defending champion USA followed, with some margin, by host France, England, and Germany.</p> <div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> The 2019 FIFA Women's World Cup will take place in France from 7 June to 7 July 2019. 24 of the best teams from 6 confederations compete to determine the new World Champion. Football fans worldwide are curious what the most likely outcome of the tournament will be. Hence we employ a machine learning approach yielding probabilistic forecasts for all possible matches which can then be used to explore the likely course of the tournament by simulation. 
</div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.fifa.com/womensworldcup/" alt="2019 FIFA Women's World Cup web page"><img src="https://upload.wikimedia.org/wikipedia/en/e/e9/2019_FIFA_Women%27s_World_Cup.svg" alt="2019 FIFA Women's World Cup logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The forecast is based on a hybrid random forest learner that combines three main sources of information: An ability estimate for every team based on historic matches; an ability estimate for every team based on odds from 18 bookmakers; further team covariates (e.g., age, team structure) and country-specific socio-economic factors (population, GDP). The random forest is learned using the FIFA Women’s World Cups in 2011 and 2015 as training data and then applied to current information to obtain a forecast for the 2019 FIFA Women’s World Cup. The random forest actually provides the predicted number of goals for each team in all possible matches in the tournament so that a bivariate Poisson distribution can be used to compute the probabilities for a <em>win</em>, <em>draw</em>, or <em>loss</em> in such a match. Based on these match probabilities the entire tournament can be simulated 100,000 times yielding winning probabilities for each team. The results show that defending champions United States are the clear favorite with a winning probability of 28.1% followed by host France with a winning probability of 14.3%, England with 13.3%, and Germany with 12.9%. 
The winning probabilities for all teams are shown in the barchart below with more information linked in the interactive full-width version.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_win.html"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>The full study is available in a recent <a href="https://arXiv.org/abs/1906.01131">working paper</a> which has been conducted by an international team of researchers: <a href="https://www.statistik.tu-dortmund.de/datenanalyse0.html">Andreas Groll</a>, <a href="https://users.ugent.be/~chley/">Christophe Ley</a>, <a href="http://schauberger.userweb.mwn.de/">Gunther Schauberger</a>, <a href="https://telefoonboek.ugent.be/en/people/802002400279">Hans Van Eetvelde</a>, <a href="https://eeecon.uibk.ac.at/~zeileis/">Achim Zeileis</a>. It actually provides a hybrid approach that combines three state-of-the-art forecasting methods:</p> <ul> <li> <p><em>Historic match abilities:</em><br /> An ability estimate is obtained for every team based on “retrospective” data, namely 3418 historic matches of 167 international women’s teams over the last 8 years. A <em>bivariate Poisson model</em> with team-specific fixed effects is fitted to the number of goals scored by both teams in each match. However, rather than equally weighting all matches to obtain <em>average</em> team abilities (or team strengths) over the entire history period, an exponential weighting scheme is employed. This assigns more weight to more recent results and thus yields an estimate of <em>current</em> team abilities. 
More details can be found in <a href="https://doi.org/10.1177%2F1471082X18817650">Ley, Van de Wiele, Van Eetvelde (2019)</a>.</p> </li> <li> <p><em>Bookmaker consensus abilities:</em><br /> Another ability estimate for every team is obtained based on “prospective” data, namely the odds of 18 international bookmakers that reflect their expert expectations for the tournament. Using the <em>bookmaker consensus model</em> of <a href="https://dx.doi.org/10.1016/j.ijforecast.2009.10.001">Leitner, Zeileis, Hornik (2010)</a> the bookmaker odds are first adjusted for the bookmakers’ profit margins (“overround”) and then averaged (on a logit scale) to obtain a consensus for the winning probability of each team. To adjust for the effects of the tournament draw (that might have led to easier or harder groups for some teams), an “inverse” simulation approach is used to infer which team abilities are most likely to lead to these winning probabilities.</p> </li> <li> <p><em>Hybrid random forest:</em><br /> Finally, machine learning is used to combine the two ability estimates above along with a broad range of further relevant covariates, yielding refined probabilistic forecasts for each match. Specifically, the <em>hybrid random forest</em> approach of <a href="https://arXiv.org/abs/1806.03208">Groll, Ley, Schauberger, Van Eetvelde (2019)</a> is used to combine the two highly-informative ability estimates with further team-specific information that may or may not be relevant to the team’s performance. The covariates considered comprise team-specific details (e.g., FIFA rank, average age, confederation, team structure, …) as well as country-specific socio-economic factors (population and GDP per capita). By learning a large ensemble of 5,000 regression trees, the relative importances of all the covariates can be inferred automatically. 
The resulting predicted number of goals for each team (averaged over all trees) can then finally be used to simulate the entire tournament 100,000 times.</p> </li> </ul> <h2 id="match-probabilities">Match probabilities</h2> <p>Using the hybrid random forest an expected number of goals is obtained for both teams in each possible match. The covariate information used for this is the difference between the two teams in each of the variables listed above, i.e., the difference in historic match abilities (on a log scale), the difference in bookmaker consensus abilities (on a log scale), difference in mean age of the teams, etc. Assuming a bivariate Poisson distribution with the expected numbers of goals for both teams, we can compute the probability that a certain match ends in a <em>win</em>, a <em>draw</em>, or a <em>loss</em>.</p> <p>The following heatmap shows the <em>win</em> probabilities in each possible match between a pair of teams with green vs. pink signalling probabilities above vs. below 50%, respectively. The corresponding <em>loss</em> probability is displayed when changing the roles of the teams (i.e., switching rows and columns in the matrix below). 
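</p> <p>As a sketch of this computation (simplified to independent Poisson margins rather than the bivariate Poisson used in the paper, and with made-up expected goals), the <em>win</em>/<em>draw</em>/<em>loss</em> probabilities follow directly from the two expected numbers of goals:</p> <pre><code class="language-r">## win/draw/loss probabilities from the expected goals of both teams;
## p[i, j] = P(team A scores i-1 goals, team B scores j-1 goals)
wdl <- function(mu_a, mu_b, maxgoals = 15) {
  p <- outer(dpois(0:maxgoals, mu_a), dpois(0:maxgoals, mu_b))
  c(win  = sum(p[lower.tri(p)]),  ## A scores more than B
    draw = sum(diag(p)),
    loss = sum(p[upper.tri(p)]))
}
round(wdl(1.8, 1.1), 3)  ## expected goals 1.8 vs. 1.1 are made up
</code></pre> <p>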
The tooltips for each match in the interactive version of the graphic also print the three <em>win</em>, <em>draw</em>, and <em>loss</em> probabilities.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_match.html"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>As every single match can be simulated with the pairwise probabilities above, it is also straightforward to simulate the entire tournament (here: 100,000 times) providing “survival” probabilities for each team across the different stages.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_surv.html"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-06-05-fifawomen2019/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <h2 id="odds-and-ends">Odds and ends</h2> <p>All our forecasts are probabilistic, clearly below 100%, and thus by no means certain - even if the favorite United States has clearly the highest winning probability of all participating teams. However, recall that a single poor performance in the playoffs is sufficient to drop out of the tournament. For example, this happened to host and clear favorite Germany in 2011 (with a winning probability of almost 40% according to the bookmakers) when they lost to Japan 0-1 in extra time in the quarter-finals. 
Japan then went on to become FIFA Women’s World Champion for the first time.</p> <p>Another interesting observation is that the bookmakers see both the United States and France almost on par with bookmaker consensus probabilities of 18.1% and 18.7%, respectively. Clearly, the bookmakers (and presumably their customers) expect that France’s home advantage will play an important role. In contrast, our hybrid random forest does not find the home advantage to be an important factor and hence forecasts a much higher winning probability for the United States (28.1%) than for France (14.3%). This is due to the home advantage not having played an important role in our learning data: Germany in 2011 and Canada in 2015 both dropped out in the quarter-finals.</p> <p>Finally, when considering the bookmaker consensus, it is also worth pointing out that the bookmakers seem to be less confident about their odds for the Women’s World Cup compared to the Men’s World Cup. This is reflected by the increased overround that assures the bookmakers’ profit margins. While for men’s tournaments this overround is typically around 15% (which the bookmakers keep and do not pay out), for the FIFA Women’s World Cup 2019 it is a sizeable 25% on average and thus ten percentage points higher.</p> <p>This overround is also the main reason why we recommend against betting based on the results presented here. It assures that the best chances of making money based on sports betting lie with the bookmakers. Instead we recommend betting only privately among friends and colleagues - or simply enjoy the exciting matches we are surely about to see in France!</p> <h2 id="working-paper">Working paper</h2> <p>Groll A, Ley C, Schauberger G, Van Eetvelde H, Zeileis A (2019). <em>“Hybrid Machine Learning Forecasts for the FIFA Women’s World Cup 2019”</em>, arXiv:1906.01131, arXiv.org E-Print Archive. 
<a href="https://arXiv.org/abs/1906.01131">https://arXiv.org/abs/1906.01131</a></p>2019-06-05T00:00:00+02:00https://eeecon.uibk.ac.at/~zeileis/news/model4you/Personalized treatment effects with model4you2019-05-16T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Personalized treatment effects can be estimated easily with model-based trees and model-based random forests using the R package model4you.<p>Personalized treatment effects can be estimated easily with model-based trees and model-based random forests using the R package model4you.</p> <h3 id="citation">Citation</h3> <p>Heidi Seibold, Achim Zeileis, Torsten Hothorn (2019). “model4you: An R Package for Personalised Treatment Effect Estimation.” <em>Journal of Open Research Software</em>, <strong>7</strong>(17), 1-6. <a href="https://dx.doi.org/10.5334/jors.219">doi:10.5334/jors.219</a></p> <h3 id="abstract">Abstract</h3> <p>Typical models estimating treatment effects assume that the treatment effect is the same for all individuals. Model-based recursive partitioning allows to relax this assumption and to estimate stratified treatment effects (model-based trees) or even personalised treatment effects (model-based forests). With model-based trees one can compute treatment effects for different strata of individuals. The strata are found in a data-driven fashion and depend on characteristics of the individuals. Model-based random forests allow for a similarity estimation between individuals in terms of model parameters (e.g. intercept and treatment effect). The similarity measure can then be used to estimate personalised models. 
The R package model4you implements these stratified and personalised models in the setting with two randomly assigned treatments with a focus on ease of use and interpretability so that clinicians and other users can take the model they usually use for the estimation of the average treatment effect and with a few lines of code get a visualisation that is easy to understand and interpret.</p> <h3 id="software">Software</h3> <p><a href="https://CRAN.R-project.org/package=model4you">https://CRAN.R-project.org/package=model4you</a></p> <h3 id="illustration">Illustration</h3> <p>The correlation between exam group and exam performance in an introductory mathematics exam (for business and economics students) is investigated using tree-based stratified and personalized treatment effects. Group 1 took the exam in the morning and group 2 started the exam with slightly different exercises after the first group finished. Potential sources of heterogeneity in the group effect include gender, field of study, whether the exam was taken (and failed) previously, and prior performance in online “tests” earlier in the semester. Performance in both the written exam and the online tests is captured by percentage of correctly solved exercises.</p> <p>Overall, it seems that the split into two different exam groups was fair: The second group had only a slightly lower performance by around 2 or 3 percentage points, suggesting that the exam in the second group was only very slightly more difficult. However, when investigating the heterogeneity of this group effect with a model-based tree it turns out that this distinguishes the students by their performance in the online tests. The largest difference between the two exam groups is in the students who did very well in the online tests (more than 92.3 percent correct), where the second-group students performed worse by 13.3 percentage points. 
So the split into the two exam groups seems to have been not fully fair for those very good students.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-05-16-model4you/model4you-tree.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-05-16-model4you/model4you-tree.png" alt="model4you tree" /></a></p> <p>To refine the assessment further, a model-based forest can be estimated. This reveals that the dependence of the group effect on the performance in the online tests is even more pronounced. This is shown in the dependence plots and beeswarm plots below with the group treatment effect on the y-axis and the performance in the online tests on the x-axis.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-05-16-model4you/model4you-beeswarm.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-05-16-model4you/model4you-beeswarm.png" alt="model4you beeswarm treatment effect" /></a></p> <p>To fit the simple linear base model in R, <code class="highlighter-rouge">lm()</code> can be used. The subsequent tree based on this model can be obtained with <code class="highlighter-rouge">pmtree()</code> from <code class="highlighter-rouge">model4you</code> and the forest with <code class="highlighter-rouge">pmforest()</code>. Example code is shown below, the full replication code for the entire analysis and graphics is included in the manuscript.</p> <pre><code class="language-{r}">bmod_math <- lm(pcorrect ~ group, data = MathExam) tr_math <- pmtree(bmod_math, control = ctree_control(maxdepth = 2)) forest_math <- pmforest(bmod_math) </code></pre>2019-05-16T00:00:00+02:00https://eeecon.uibk.ac.at/~zeileis/news/endrainbow/At the end of the rainbow2019-03-07T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Fully saturated RGB rainbow colors are still widely used in scientific visualizations despite their widely-recognized disadvantages. 
A recent wild-caught example is presented, showing its limitations along with a better HCL-based alternative palette.<p>Fully saturated RGB rainbow colors are still widely used in scientific visualizations despite their widely-recognized disadvantages. A recent wild-caught example is presented, showing its limitations along with a better HCL-based alternative palette.</p> <h2 id="endrainbow">#endrainbow</h2> <p>The go-to palette in many software packages is - or used to be until rather recently - the so-called rainbow: a palette created by changing the hue in highly-saturated RGB colors. This has been widely recognized as having a number of disadvantages including: abrupt shifts in brightness, misleading for viewers with color vision deficiencies, too flashy to look at for a longer time. As part of our R software project <a href="https://eeecon.uibk.ac.at/~zeileis/news/colorspace/">colorspace</a> we therefore started collecting typical (ab-)uses of the RGB rainbow palette on our web site <a href="http://colorspace.R-Forge.R-project.org/articles/endrainbow.html">http://colorspace.R-Forge.R-project.org/articles/endrainbow.html</a> and suggest better HCL-based color palettes.</p> <p>Here, we present the most recent addition to that example collection, a map of influenza severity in Germany, published by the influenza working group of the <a href="https://influenza.rki.de/">Robert Koch-Institut</a>. 
Along with the original map and its poor choice of colors we:</p> <ul> <li>highlight its problems by desaturation to grayscale and by emulating color vision deficiencies,</li> <li>suggest a proper sequential HCL-based palette, and</li> <li>provide the R code that can extract and replace the color palette in a PNG graphics file.</li> </ul> <h2 id="influenza-in-germany">Influenza in Germany</h2> <p>The shaded map below was taken from the web site of the <a href="https://influenza.rki.de/">Robert Koch-Institut</a> (Arbeitsgemeinschaft Influenza) and it shows the severity of influenza in Germany in week 8, 2019. The original color palette (left) is the classic rainbow ranging from “normal” (blue) to “strongly increased” (red). As all colors in the palette are very flashy and highly-saturated it is hard to grasp intuitively which areas are most affected by influenza. Also, the least interesting “normal” areas stand out as blue is the darkest color in the palette.</p> <p>As an alternative, a proper multi-hue sequential HCL palette is used on the right. This has smooth gradients and the overall message can be grasped quickly, giving focus to the high-risk regions depicted with dark/colorful colors. 
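</p> <p>The alternative palette shown on the right can be generated directly with the <code class="highlighter-rouge">colorspace</code> package; “Purple-Yellow” is one of its predefined multi-hue sequential HCL palettes (the number of colors, 99, is chosen here only to match a fine-grained legend):</p> <pre><code class="language-r">## a multi-hue sequential HCL palette from light to dark/colorful
library("colorspace")
pal <- sequential_hcl(99, palette = "Purple-Yellow")
swatchplot(pal)  ## display the palette
</code></pre> <p>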
However, the extremely sharp transitions between “normal” and “strongly increased” areas (e.g., in the North and the East) might indicate some overfitting in the underlying smoothing for the map.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow.png" alt="influenza-rainbow" width="49%" /></a> <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow.png" alt="influenza-purpleyellow" width="49%" /></a></p> <p>Converting all colors to grayscale brings out even more clearly why the overall picture is so hard to grasp with the original palette: The gradients are discontinuous, switching several times between bright and dark. Thus, it is hard to identify the high-risk regions, whereas this is natural and straightforward with the HCL-based sequential palette.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-gray.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-gray.png" alt="influenza-rainbow-gray" width="49%" /></a> <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-gray.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-gray.png" alt="influenza-purpleyellow-gray" width="49%" /></a></p> <p>Emulating green-deficient vision (deuteranopia) confirms the problems visible in the desaturated version above but reveals further flaws in the original palette: The wrong areas in the map “pop out”, making the map extremely hard to use for viewers with red-green deficiency. 
The HCL-based palette, on the other hand, is equally accessible to color-deficient viewers and to those with full color vision.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-deutan.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-deutan.png" alt="influenza-rainbow-deutan" width="49%" /></a> <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-deutan.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-deutan.png" alt="influenza-purpleyellow-deutan" width="49%" /></a></p> <h2 id="replication-in-r">Replication in R</h2> <p>The desaturated and deuteranope versions of the original image <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow.png">influenza-rainbow.png</a> (a screenshot of the RKI web page) are relatively easy to produce using the <code class="highlighter-rouge">colorspace</code> function <code class="highlighter-rouge">cvd_emulator("influenza-rainbow.png")</code>. Internally, this reads the RGB colors for all pixels in the PNG, converts them with the <code class="highlighter-rouge">colorspace</code> functions <code class="highlighter-rouge">desaturate()</code> and <code class="highlighter-rouge">deutan()</code>, respectively, and saves the PNG again. Below we also do this “by hand”.</p> <p>What is more complicated is the replacement of the original rainbow palette with a properly balanced HCL palette (without access to the underlying data). Luckily, the image contains a legend from which the original palette can be extracted. 
Subsequently, it is possible to index all colors in the image, replace them, and write out the PNG again.</p> <p>As a first step, we read the original PNG image using the R package <a href="https://CRAN.R-project.org/package=png">png</a>, returning a height x width x 4 array containing the three RGB (red/green/blue) channels plus a channel for alpha transparency. Then, this is turned into a height x width matrix containing color hex codes using the base <code class="highlighter-rouge">rgb()</code> function:</p> <pre><code class="language-{r}">img <- png::readPNG("influenza-rainbow.png")
img <- matrix(
  rgb(img[,,1], img[,,2], img[,,3]),
  nrow = nrow(img), ncol = ncol(img)
)
</code></pre> <p>Using a manual search, we find a column of pixels from the palette legend (column 630) and thin it to obtain only 99 colors:</p> <pre><code class="language-{r}">pal_rain <- img[96:699, 630]
pal_rain <- pal_rain[seq(1, length(pal_rain), length.out = 99)]
</code></pre> <p>For replacement, we use a slightly adapted <code class="highlighter-rouge">sequential_hcl()</code> palette that was suggested by <a href="https://dx.doi.org/10.1175/BAMS-D-13-00155.1">Stauffer <em>et al.</em> (2015)</a> for a precipitation warning map. The <code class="highlighter-rouge">"Purple-Yellow"</code> palette is currently only in version 1.4-1 of the package on <a href="https://R-Forge.R-project.org/R/?group_id=20">R-Forge</a>, but other sequential HCL palettes could also be used here.</p> <pre><code class="language-{r}">library("colorspace")
pal_hcl <- sequential_hcl(99, "Purple-Yellow", p1 = 1.3, c2 = 20)
</code></pre> <p>Now for replacing the RGB rainbow colors with the sequential colors, the following approach is taken: The original image is indexed by matching the color of each pixel to the closest of the 99 colors from the rainbow palette. Furthermore, to preserve the black borders and the gray shadows, 50 shades of gray are also offered for the indexing. 
To match pixel colors to palette colors, a simple Manhattan distance (sum of absolute distances) is used in the CIELUV color space:</p> <pre><code class="language-{r}"># 50 shades of gray
pal_gray <- gray(0:50/50)
## LUV coordinates for image and palette
img_luv <- coords(as(hex2RGB(as.vector(img)), "LUV"))
pal_luv <- coords(as(hex2RGB(c(pal_rain, pal_gray)), "LUV"))
## Manhattan distance matrix
dm <- matrix(NA, nrow = nrow(img_luv), ncol = nrow(pal_luv))
for(i in 1:nrow(pal_luv)) dm[, i] <- rowSums(abs(t(t(img_luv) - pal_luv[i,])))
idx <- apply(dm, 1, which.min)
</code></pre> <p>Now each element of the <code class="highlighter-rouge">img</code> hex color matrix can be easily replaced by indexing a new palette with 99 colors (plus 50 shades of gray) using the <code class="highlighter-rouge">idx</code> vector. This is what the <code class="highlighter-rouge">pal_to_png()</code> function below does, writing the resulting matrix to a PNG file. The function is somewhat quick and dirty, makes no sanity checks, and assumes <code class="highlighter-rouge">img</code> and <code class="highlighter-rouge">idx</code> are in the calling environment.</p> <pre><code class="language-{r}">pal_to_png <- function(pal = pal_hcl, file = "influenza.png", rev = FALSE) {
  ret <- img
  pal <- if(rev) c(rev(pal), rev(pal_gray)) else c(pal, pal_gray)
  ret[] <- pal[idx]
  ret <- coords(hex2RGB(ret))
  dim(ret) <- c(dim(img), 3)
  png::writePNG(ret, target = file)
}
</code></pre> <p>With this function, we can easily produce the PNG graphic with the desaturated palette and the deuteranope version:</p> <pre><code class="language-{r}">pal_to_png(desaturate(pal_rain), "influenza-rainbow-gray.png")
pal_to_png( deutan(pal_rain), "influenza-rainbow-deutan.png")
</code></pre> <p>The analogous graphics for the HCL-based <code class="highlighter-rouge">"Purple-Yellow"</code> palette are generated by:</p> <pre><code class="language-{r}">pal_to_png( pal_hcl, "influenza-purpleyellow.png")
pal_to_png(desaturate(pal_hcl), "influenza-purpleyellow-gray.png")
pal_to_png( deutan(pal_hcl), "influenza-purpleyellow-deutan.png")
</code></pre> <h2 id="further-remarks">Further remarks</h2> <p>Given that we have now extracted the <code class="highlighter-rouge">pal_rain</code> palette and set up the <code class="highlighter-rouge">pal_hcl</code> alternative, we can also use the <code class="highlighter-rouge">colorspace</code> function <code class="highlighter-rouge">specplot()</code> to understand how the perceptual properties of the colors change across the two palettes. For the HCL-based palette, hue/chroma/luminance change smoothly from dark/colorful purple to a light yellow. In contrast, in the original RGB rainbow chroma and, more importantly, luminance change non-monotonically and rather abruptly:</p> <pre><code class="language-{r}">specplot(pal_rain)
specplot(pal_hcl)
</code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-spectrum.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-rainbow-spectrum.png" alt="influenza-rainbow-spectrum" width="49%" /></a> <a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-spectrum.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-purpleyellow-spectrum.png" alt="influenza-purpleyellow-spectrum" width="49%" /></a></p> <p>Given that the colors in the image are now indexed and the gray shades are in a separate subvector, we can easily <code class="highlighter-rouge">rev</code>-erse the order in both subvectors. 
This yields a black background with white letters and we can use the <code class="highlighter-rouge">"Inferno"</code> palette that works well on dark backgrounds:</p> <pre><code class="language-{r}">pal_to_png(sequential_hcl(99, "Inferno"), "influenza-inferno.png", rev = TRUE) </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-inferno.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-03-07-endrainbow/influenza-inferno.png" alt="influenza-inferno" width="49%" /></a></p> <p>For more details on the limitations of the rainbow palette and further pointers see “<a href="http://www.climate-lab-book.ac.uk/2014/end-of-the-rainbow/">The End of the Rainbow</a>” by Hawkins <em>et al.</em> (2014) or “<a href="https://dx.doi.org/10.1175/BAMS-D-13-00155.1">Somewhere over the Rainbow: How to Make Effective Use of Colors in Meteorological Visualizations</a>” by Stauffer <em>et al.</em> (2015) as well as the <a href="https://twitter.com/hashtag/endrainbow">#endrainbow</a> hashtag on Twitter.</p>2019-03-07T00:00:00+01:00https://eeecon.uibk.ac.at/~zeileis/news/hclwizard/Online color apps at hclwizard.org2019-02-15T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/The hclwizard.org web page has been relaunched, hosting three online color apps based on the HCL (Hue-Chroma-Luminance) color model: a palette constructor, a color vision deficiency emulator, and a color picker.<p>The hclwizard.org web page has been relaunched, hosting three online color apps based on the HCL (Hue-Chroma-Luminance) color model: a palette constructor, a color vision deficiency emulator, and a color picker.</p> <h2 id="hcl-wizard-somewhere-over-the-rainbow">HCL wizard: Somewhere over the rainbow</h2> <p>The web page <a href="http://hclwizard.org/">http://hclwizard.org/</a> had originally been started to accompany the manuscript: “<a href="https://doi.org/10.1175/BAMS-D-13-00155.1">Somewhere over the 
Rainbow: How to Make Effective Use of Colors in Meteorological Visualizations</a>” by Stauffer <em>et al.</em> (2015, <em>Bulletin of the American Meteorological Society</em>) to facilitate the adoption of color palettes using the HCL (Hue-Chroma-Luminance) color model. It was realized using the R package <a href="http://colorspace.R-Forge.R-project.org/">colorspace</a> in combination with <a href="https://shiny.RStudio.com/">shiny</a>.</p> <p>After the <a href="https://eeecon.uibk.ac.at/~zeileis/news/colorspace/">major update of the colorspace package</a> the <a href="http://hclwizard.org/">http://hclwizard.org/</a> site has also just been relaunched, now hosting all three shiny color apps from the package:</p> <p><a href="http://hclwizard.org/"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-02-15-hclwizard/hclwizard-apps.png" alt="HCL wizard color apps" /></a></p> <h2 id="palette-creator">Palette creator</h2> <p>This app allows users to design new palettes interactively: qualitative palettes, sequential palettes with single or multiple hues, and diverging palettes (composed from two single-hue sequential palettes). The underlying HCL coordinates can be modified, starting out from a wide range of pre-defined palettes. The resulting palette can be assessed in various kinds of displays and exported in different formats.</p> <p><a href="http://hclwizard.org/hclwizard/"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-02-15-hclwizard/palette-creator.gif" alt="HCL wizard palette creator" /></a></p> <h2 id="color-vision-deficiency-emulator">Color vision deficiency emulator</h2> <p>This app allows users to assess how well the colors in an uploaded graphics file (png/jpg/jpeg) work for viewers with color vision deficiencies. 
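</p> <p>The same emulations are also available programmatically in the <code class="highlighter-rouge">colorspace</code> package; a minimal sketch applying the package’s conversion functions to an arbitrary base palette (here R’s built-in <code class="highlighter-rouge">rainbow()</code>):</p> <pre><code class="language-{r}">library("colorspace")
pal <- rainbow(7)
deutan(pal)      ## emulate deuteranopia (green deficient)
protan(pal)      ## emulate protanopia (red deficient)
tritan(pal)      ## emulate tritanopia (blue deficient)
desaturate(pal)  ## convert to grayscale
</code></pre> <p>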
Different kinds of color blindness can be emulated: deuteranope (green deficient), protanope (red deficient), tritanope (blue deficient), monochrome (grayscale).</p> <p><a href="http://hclwizard.org/cvdemulator/"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-02-15-hclwizard/deficiency-emulator.gif" alt="Color vision deficiency emulator" /></a></p> <h2 id="color-picker">Color picker</h2> <p>In addition to the palette creator app described above, this app provides a more traditional color picker. Sets of individual colors can be selected (and exported) by navigating different views of the HCL color space.</p> <p><a href="http://hclwizard.org/hclcolorpicker/"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-02-15-hclwizard/color-picker.gif" alt="HCL color picker" /></a></p>2019-02-15T00:00:00+01:00https://eeecon.uibk.ac.at/~zeileis/news/lagsarlmtree/Spatial lag model trees2019-01-21T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Economic growth models are recursively partitioned to assess heterogeneity in growth and convergence across EU regions while adjusting for spatial dependencies. Accompanied by R package lagsarlmtree, combining partykit::mob and spdep::lagsarlm.<p>Economic growth models are recursively partitioned to assess heterogeneity in growth and convergence across EU regions while adjusting for spatial dependencies. Accompanied by R package lagsarlmtree, combining partykit::mob and spdep::lagsarlm.</p> <h2 id="citation">Citation</h2> <p>Martin Wagner, Achim Zeileis (2019). “Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach.” <em>German Economic Review</em>, <strong>20</strong>(1), 67-82. 
<a href="https://doi.org/10.1111/geer.12146">doi:10.1111/geer.12146</a> [ <a href="https://eeecon.uibk.ac.at/~zeileis/papers/Wagner+Zeileis-2019.pdf">pdf</a> ]</p> <h2 id="abstract">Abstract</h2> <p>We use model-based recursive partitioning to assess heterogeneity of growth and convergence processes based on economic growth regressions for 255 European Union NUTS2 regions from 1995 to 2005. Spatial dependencies are taken into account by augmenting the model-based regression tree with a spatial lag. The starting point of the analysis is a human-capital-augmented Solow-type growth equation similar in spirit to Mankiw <em>et al.</em> (1992, <em>The Quarterly Journal of Economics</em>, <strong>107</strong>, 407-437). Initial GDP and the share of highly educated in the working age population are found to be important for explaining economic growth, whereas the investment share in physical capital is only significant for coastal regions in the PIIGS countries. For all considered spatial weight matrices recursive partitioning leads to a regression tree with four terminal nodes with partitioning according to (i) capital regions, (ii) non-capital regions in or outside the so-called PIIGS countries and (iii) inside the respective PIIGS regions furthermore between coastal and non-coastal regions. The choice of the spatial weight matrix clearly influences the spatial lag parameter while the estimated slope parameters are very robust to it. 
This indicates that accounting for heterogeneity is an important aspect of modeling regional economic growth and convergence.</p> <h2 id="software">Software</h2> <p><a href="https://CRAN.R-project.org/package=lagsarlmtree">https://CRAN.R-project.org/package=lagsarlmtree</a></p> <h2 id="heterogeneity-of-regional-growth-in-the-eu">Heterogeneity of regional growth in the EU</h2> <p>The growth model to be assessed for heterogeneity is a linear regression model for the average growth rate of real GDP per capita (<em>ggdpcap</em>) as the dependent variable with the following regressors:</p> <ul> <li>Initial real GDP per capita in logs (<em>gdpcap0</em>), sometimes simply referred to as initial income, to capture potential β-convergence.</li> <li>Investment share in GDP (<em>shgfcf</em>) to capture physical capital accumulation.</li> <li>Shares of high and of medium educated in the labor force (<em>shsh</em> and <em>shsm</em>) as measures of human capital.</li> </ul> <p>Thus, a human-capital-augmented version of the Solow model is employed, inspired by the by now classical work of Mankiw <em>et al.</em> (1992). 
The well-known data sets from Sala-i-Martin <em>et al.</em> (2004) and Fernandez <em>et al.</em> (2001) are employed below for estimation.</p> <p>To assess whether a single growth regression model with stable parameters across all EU regions is sufficient, splitting the data by the following partitioning variables is considered:</p> <ul> <li>Log of initial real GDP per capita itself as a partitioning variable as a simple device to check for the presence of initial income driven convergence clubs.</li> <li>Two measures for traffic accessibility of the region, one for accessibility via rail (<em>accessrail</em>) and one via the road network (<em>accessroad</em>).</li> <li>A dummy variable for capital regions (<em>capital</em>).</li> <li>Dummy variables for border regions (<em>regborder</em>) and coastal regions (<em>regcoast</em>).</li> <li>Dummy for EU Objective 1 regions (<em>regobj1</em>).</li> <li>Two dummy variables for Central and Eastern European countries (<em>cee</em>) and Portugal/Ireland/Italy/Greece/Spain (<em>piigs</em>).</li> </ul> <p>To adjust for spatial dependencies a spatial lag term with inverse distance weights is considered here. 
Other weight specifications lead to very similar estimated tree structures and regression coefficients, though.</p> <pre><code class="language-{r}">library("lagsarlmtree") data("GrowthNUTS2", package = "lagsarlmtree") data("WeightsNUTS2", package = "lagsarlmtree") tr <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm | gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs, data = GrowthNUTS2, listw = WeightsNUTS2$invw, minsize = 12, alpha = 0.05) print(tr) ## Spatial lag model tree ## ## Model formula: ## ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm | gdpcap0 + accessrail + ## accessroad + capital + regboarder + regcoast + regobj1 + ## cee + piigs ## ## Fitted party: ## [1] root ## | [2] capital in no ## | | [3] piigs in no: n = 176 ## | | (Intercept) gdpcap0 shgfcf shsh shsm ## | | 0.11055 -0.01171 0.00208 0.02195 0.00179 ## | | [4] piigs in yes ## | | | [5] regcoast in no: n = 13 ## | | | (Intercept) gdpcap0 shgfcf shsh shsm ## | | | 0.1606 -0.0159 -0.0469 0.0789 -0.0234 ## | | | [6] regcoast in yes: n = 39 ## | | | (Intercept) gdpcap0 shgfcf shsh shsm ## | | | 0.07348 -0.01106 0.09156 0.11668 0.00942 ## | [7] capital in yes: n = 27 ## | (Intercept) gdpcap0 shgfcf shsh shsm ## | 0.2056 -0.0223 -0.0075 0.0411 0.0528 ## ## Number of inner nodes: 3 ## Number of terminal nodes: 4 ## Number of parameters per node: 5 ## Objective function (residual sum of squares): 0.0155 ## ## Rho (from lagsarlm model): ## rho ## 0.837 </code></pre> <p>The resulting linear regression tree can be visualized with p-values from the parameter stability tests displayed in the inner nodes and a scatter plot of GDP per capita growth (<em>ggdpcap</em>) vs. 
(log) initial real GDP per capita (<em>gdpcap0</em>) in the terminal nodes:</p> <pre><code class="language-{r}">plot(tr, tp_args = list(which = 1)) </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-21-lagsarlmtree/tree-spatial.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-21-lagsarlmtree/tree-spatial.png" alt="Growth tree" /></a></p> <p>It is most striking that the speed of β-convergence is much higher for the 27 capital regions. More details about differences in the other regressors are shown in the table below. Finally, it is of interest which variables were <em>not</em> selected for splitting in the tree, i.e., are not associated with significant parameter instabilities: initial income, the border dummy, and Objective 1 regions, among others.</p> <table> <thead> <tr> <th style="text-align: right">Node</th> <th style="text-align: right">n</th> <th style="text-align: left">Partitioning</th> <th style="text-align: left">variables</th> <th style="text-align: left"> </th> <th style="text-align: left">Regressor</th> <th style="text-align: left">variables</th> <th style="text-align: left"> </th> <th> </th> <th> </th> <th> </th> </tr> <tr> <th style="text-align: right"> </th> <th style="text-align: right"> </th> <th style="text-align: left">capital</th> <th style="text-align: left">piigs</th> <th style="text-align: left">regcoast</th> <th style="text-align: left">(Const.)</th> <th style="text-align: left"><em>gdpcap0</em></th> <th style="text-align: left"><em>shgfcf</em></th> <th><em>shsh</em></th> <th><em>shsm</em></th> <th> </th> </tr> </thead> <tbody> <tr> <td style="text-align: right">3</td> <td style="text-align: right">176</td> <td style="text-align: left">no</td> <td style="text-align: left">no</td> <td style="text-align: left">–</td> <td style="text-align: left"> 0.111 <br /> (0.016)</td> <td style="text-align: left">–0.0117 <br /> (0.0016)</td> <td style="text-align: left">–0.0021 <br /> (0.0077)</td> 
<td> 0.022 <br /> (0.011)</td> <td> 0.0018 <br /> (0.0068)</td> <td> </td> </tr> <tr> <td style="text-align: right">5</td> <td style="text-align: right">13</td> <td style="text-align: left">no</td> <td style="text-align: left">yes</td> <td style="text-align: left">no</td> <td style="text-align: left"> 0.161 <br /> (0.128)</td> <td style="text-align: left">–0.0159 <br /> (0.0135)</td> <td style="text-align: left">–0.0469 <br /> (0.0815)</td> <td> 0.079 <br /> (0.059)</td> <td>–0.0234 <br /> (0.0660)</td> <td> </td> </tr> <tr> <td style="text-align: right">6</td> <td style="text-align: right">39</td> <td style="text-align: left">no</td> <td style="text-align: left">yes</td> <td style="text-align: left">yes</td> <td style="text-align: left"> 0.073 <br /> (0.056)</td> <td style="text-align: left">–0.0111 <br /> (0.0059)</td> <td style="text-align: left"> 0.0916 <br /> (0.0420)</td> <td> 0.117 <br /> (0.029)</td> <td> 0.0094 <br /> (0.0218)</td> <td> </td> </tr> <tr> <td style="text-align: right">7</td> <td style="text-align: right">27</td> <td style="text-align: left">yes</td> <td style="text-align: left">–</td> <td style="text-align: left">–</td> <td style="text-align: left"> 0.206 <br /> (0.031)</td> <td style="text-align: left">–0.0223 <br /> (0.0029)</td> <td style="text-align: left">–0.0075 <br /> (0.0259)</td> <td> 0.041 <br /> (0.020)</td> <td> 0.0528 <br /> (0.0117)</td> <td> </td> </tr> </tbody> </table> <p>For more details see the full manuscript. 
Replication materials for the entire analysis from the manuscript are available as a demo in the package:</p> <pre><code class="language-{r}">demo("GrowthNUTS2", package = "lagsarlmtree") </code></pre>2019-01-21T00:00:00+01:00https://eeecon.uibk.ac.at/~zeileis/news/colorspace/colorspace: New tools for colors and palettes2019-01-14T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/A major update (version 1.4.0) of the R package colorspace has been released to CRAN, enhancing many of the package's capabilities, e.g., more refined palettes, named palettes, ggplot2 color scales, visualizations for assessing palettes, shiny and Tcl/Tk apps, color vision deficiency emulation, and much more.<p>A major update (version 1.4.0) of the R package colorspace has been released to CRAN, enhancing many of the package's capabilities, e.g., more refined palettes, named palettes, ggplot2 color scales, visualizations for assessing palettes, shiny and Tcl/Tk apps, color vision deficiency emulation, and much more.</p> <h3 id="overview">Overview</h3> <p>The <em>colorspace</em> package provides a broad toolbox for selecting individual colors or color palettes, manipulating these colors, and employing them in various kinds of visualizations. Version 1.4.0 has just been released on CRAN, containing many new features and contributions from new co-authors. A new web site presenting and documenting the package has been launched at <a href="http://colorspace.R-Forge.R-project.org/">http://colorspace.R-Forge.R-project.org/</a>.</p> <p>At the core of the package there are various utilities for computing with color spaces (as the name conveys). Thus, the package helps to map various three-dimensional representations of color to each other. 
A particularly important mapping is the one from the perceptually-based and device-independent color model HCL (Hue-Chroma-Luminance) to standard Red-Green-Blue (sRGB) which is the basis for color specifications in many systems based on the corresponding hex codes (e.g., in HTML but also in R). For completeness further standard color models are included as well in the package: <code class="highlighter-rouge">polarLUV()</code> (= HCL), <code class="highlighter-rouge">LUV()</code>, <code class="highlighter-rouge">polarLAB()</code>, <code class="highlighter-rouge">LAB()</code>, <code class="highlighter-rouge">XYZ()</code>, <code class="highlighter-rouge">RGB()</code>, <code class="highlighter-rouge">sRGB()</code>, <code class="highlighter-rouge">HLS()</code>, <code class="highlighter-rouge">HSV()</code>.</p> <p>The HCL space (= polar coordinates in CIELUV) is particularly useful for specifying individual colors and color palettes as its three axes match those of the human visual system very well: Hue (= type of color, dominant wavelength), chroma (= colorfulness), luminance (= brightness).</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/hcl-properties-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/hcl-properties-1.png" alt="HCL axes" /></a></p> <p>The <em>colorspace</em> package provides three types of palettes based on the HCL model:</p> <ul> <li><em>Qualitative:</em> Designed for coding categorical information, i.e., where no particular ordering of categories is available and every color should receive the same perceptual weight. Function: <code class="highlighter-rouge">qualitative_hcl()</code>.</li> <li><em>Sequential:</em> Designed for coding ordered/numeric information, i.e., where colors go from high to low (or vice versa). 
Function: <code class="highlighter-rouge">sequential_hcl()</code>.</li> <li><em>Diverging:</em> Designed for coding numeric information around a central neutral value, i.e., where colors diverge from neutral to two extremes. Function: <code class="highlighter-rouge">diverging_hcl()</code>.</li> </ul> <p>To aid choice and application of these palettes there are: scales for use with <em>ggplot2</em>; shiny (and tcltk) apps for interactive exploration; visualizations of palette properties; accompanying manipulation utilities (like desaturation, lighten/darken, and emulation of color vision deficiencies).</p> <p>More detailed overviews and examples are provided in the articles:</p> <ul> <li><a href="http://colorspace.R-Forge.R-project.org/articles/color_spaces.html">Color Spaces: S4 Classes and Utilities</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/hcl_palettes.html">HCL-Based Color Palettes</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/ggplot2_color_scales.html">HCL-Based Color Scales for <em>ggplot2</em></a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/palette_visualization.html">Palette Visualization and Assessment</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/hclwizard.html">Apps for Choosing Colors and Palettes Interactively</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/color_vision_deficiency.html">Color Vision Deficiency Emulation</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/manipulation_utilities.html">Color Manipulation and Utilities</a></li> <li><a href="http://colorspace.R-Forge.R-project.org/articles/approximations.html">Approximating Palettes from Other Packages</a></li> </ul> <h3 id="installation">Installation</h3> <p>The stable release version of <em>colorspace</em> is hosted on the Comprehensive R Archive Network (CRAN) at <a 
href="https://CRAN.R-project.org/package=colorspace">https://CRAN.R-project.org/package=colorspace</a> and can be installed via</p> <pre><code class="language-{r}">install.packages("colorspace") </code></pre> <p>The development version of <em>colorspace</em> is hosted on R-Forge at <a href="https://R-Forge.R-project.org/projects/colorspace/">https://R-Forge.R-project.org/projects/colorspace/</a> in a Subversion (SVN) repository. It can be installed via</p> <pre><code class="language-{r}">install.packages("colorspace", repos = "http://R-Forge.R-project.org") </code></pre> <p>For Python users a beta re-implementation of the full <em>colorspace</em> package in Python 2/Python 3 is also available, see <a href="https://github.com/retostauffer/python-colorspace">https://github.com/retostauffer/python-colorspace</a>.</p> <h3 id="choosing-hcl-based-color-palettes">Choosing HCL-based color palettes</h3> <p>The <em>colorspace</em> package ships with a wide range of predefined color palettes, specified through suitable trajectories in the HCL (hue-chroma-luminance) color space. A quick overview can be gained easily with <code class="highlighter-rouge">hcl_palettes()</code>:</p> <pre><code class="language-{r}">library("colorspace") hcl_palettes(plot = TRUE) </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/hcl-palettes-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/hcl-palettes-1.png" alt="HCL palettes" /></a></p> <p>Using the names from the plot above and a desired number of colors in the palette a suitable color vector can be easily computed, e.g.,</p> <pre><code class="language-{r}">q4 <- qualitative_hcl(4, "Dark 3") q4 ## [1] "#E16A86" "#909800" "#00AD9A" "#9183E6" </code></pre> <p>The functions <code class="highlighter-rouge">sequential_hcl()</code>, and <code class="highlighter-rouge">diverging_hcl()</code> work analogously. 
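</p> <p>For example, analogous sequential and diverging color vectors can be computed as follows; this is a minimal sketch, where <code class="highlighter-rouge">"Purples 3"</code> and <code class="highlighter-rouge">"Blue-Red"</code> are palette names from the predefined set shown by <code class="highlighter-rouge">hcl_palettes()</code>:</p> <pre><code class="language-{r}">## four-color sequential and diverging palettes, analogous to q4 above
s4 <- sequential_hcl(4, "Purples 3")
d4 <- diverging_hcl(4, "Blue-Red")
</code></pre> <p>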
Additionally, their hue/chroma/luminance parameters can be modified, making it easy to customize each palette. Moreover, the <code class="highlighter-rouge">choose_palette()</code>/<code class="highlighter-rouge">hclwizard()</code> apps provide convenient user interfaces to do the customization interactively. Finally, even more flexible diverging HCL palettes are provided by <code class="highlighter-rouge">divergingx_hcl()</code>.</p> <h3 id="usage-with-base-graphics">Usage with base graphics</h3> <p>The color vectors returned by the HCL palette functions can usually be passed directly to most base graphics functions, typically through the <code class="highlighter-rouge">col</code> argument. Here, the <code class="highlighter-rouge">q4</code> vector created above is used in a time series display:</p> <pre><code class="language-{r}">plot(log(EuStockMarkets), plot.type = "single", col = q4, lwd = 2)
legend("topleft", colnames(EuStockMarkets), col = q4, lwd = 3, bty = "n")
</code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/eustockmarkets-plot-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/eustockmarkets-plot-1.png" alt="EuStockMarkets" /></a></p> <p>As another example, for a sequential palette a spine plot is created below, displaying the proportion of Titanic passengers that survived per class. The <code class="highlighter-rouge">Purples 3</code> palette is used, which is quite similar to the <strong>ColorBrewer.org</strong> palette <code class="highlighter-rouge">Purples</code>. 
Here, only two colors are employed, yielding a dark purple and light gray.</p> <pre><code class="language-{r}">ttnc <- margin.table(Titanic, c(1, 4))[, 2:1] spineplot(ttnc, col = sequential_hcl(2, "Purples 3")) </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/titanic-plot-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/titanic-plot-1.png" alt="Titanic" /></a></p> <h3 id="usage-with-ggplot2">Usage with <em>ggplot2</em></h3> <p>To plug the HCL color palettes into <em>ggplot2</em> graphics, suitable discrete and/or continuous <em>ggplot2</em> color scales are provided. The scales are called via the scheme <code class="highlighter-rouge">scale_<aesthetic>_<datatype>_<colorscale>()</code>, where <code class="highlighter-rouge"><aesthetic></code> is the name of the aesthetic (<code class="highlighter-rouge">fill</code>, <code class="highlighter-rouge">color</code>, <code class="highlighter-rouge">colour</code>), <code class="highlighter-rouge"><datatype></code> is the type of the variable plotted (<code class="highlighter-rouge">discrete</code> or <code class="highlighter-rouge">continuous</code>), and <code class="highlighter-rouge"><colorscale></code> sets the type of the color scale used (<code class="highlighter-rouge">qualitative</code>, <code class="highlighter-rouge">sequential</code>, <code class="highlighter-rouge">diverging</code>, <code class="highlighter-rouge">divergingx</code>).</p> <p>To illustrate their usage, two simple examples are shown using the qualitative <code class="highlighter-rouge">Dark 3</code> and sequential <code class="highlighter-rouge">Purples 3</code> palettes that were also employed above.
First, semi-transparent shaded densities of the sepal length from the iris data are shown, grouped by species.</p> <pre><code class="language-{r}">library("ggplot2") ggplot(iris, aes(x = Sepal.Length, fill = Species)) + geom_density(alpha = 0.6) + scale_fill_discrete_qualitative(palette = "Dark 3") </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/iris-ggplot-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/iris-ggplot-1.png" alt="iris" /></a></p> <p>The sequential palette is used to code the cut levels in a scatter plot of price by carat in the diamonds data (or rather a small subsample thereof). The scale function first generates six colors but then drops the first color because the light gray is too light in this display. (Alternatively, the chroma and luminance parameters could also be tweaked.)</p> <pre><code class="language-{r}">dsamp <- diamonds[1 + 1:1000 * 50, ] ggplot(dsamp, aes(carat, price, color = cut)) + geom_point() + scale_color_discrete_sequential(palette = "Purples 3", nmax = 6, order = 2:6) </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/diamonds-ggplot-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/diamonds-ggplot-1.png" alt="diamonds" /></a></p> <h3 id="palette-visualization-and-assessment">Palette visualization and assessment</h3> <p>The <em>colorspace</em> package also provides a number of functions that aid visualization and assessment of its palettes.</p> <ul> <li><code class="highlighter-rouge">demoplot()</code> can display a palette (with an arbitrary number of colors) in a range of typical and somewhat simplified statistical graphics.</li> <li><code class="highlighter-rouge">hclplot()</code> converts the colors of a palette to the corresponding hue/chroma/luminance coordinates and displays them in HCL space with one dimension collapsed.
The collapsed dimension is the luminance for qualitative palettes and the hue for sequential/diverging palettes.</li> <li><code class="highlighter-rouge">specplot()</code> also converts the colors to hue/chroma/luminance coordinates but draws the resulting spectrum in a line plot.</li> </ul> <p>For the qualitative <code class="highlighter-rouge">Dark 3</code> palette from above the following plots can be obtained.</p> <pre><code class="language-{r}">demoplot(q4, "bar") hclplot(q4) specplot(q4, type = "o") </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/allplots-qualitative-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/allplots-qualitative-1.png" alt="demo-hcl-specplot-qualitative" /></a></p> <p>A bar plot would be another typical application for a qualitative palette (instead of the time series and density plot used above). However, a lighter and less colorful palette might be preferable in this situation (e.g., <code class="highlighter-rouge">Pastel 1</code> or <code class="highlighter-rouge">Set 3</code>).</p> <p>The other two displays show that luminance is (almost) constant in the palette while the hue changes linearly along the color “wheel”. Ideally, chroma would have also been constant to completely balance the colors. 
However, at this luminance the maximum chroma differs across hues, so the palette is adjusted to use less chroma for the yellow and green elements.</p> <p>Subsequently, the same assessment is carried out for the sequential <code class="highlighter-rouge">Purples 3</code> palette as employed above.</p> <pre><code class="language-{r}">s9 <- sequential_hcl(9, "Purples 3") demoplot(s9, "heatmap") hclplot(s9) specplot(s9, type = "o") </code></pre> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/allplots-sequential-1.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2019-01-14-colorspace/allplots-sequential-1.png" alt="demo-hcl-specplot-sequential" /></a></p> <p>Here, a heatmap (based on the well-known Maunga Whau volcano data) is used as a typical application for a sequential palette. The elevation of the volcano is brought out clearly, with the dark colors emphasizing the higher elevations.</p> <p>The other two displays show that hue is constant in the palette while luminance and chroma vary. Luminance increases monotonically from dark to light (as required for a proper sequential palette). Chroma is triangular-shaped, which makes the middle colors of the palette easier to distinguish (compared to a monotonic chroma trajectory).</p>2019-01-14T00:00:00+01:00https://eeecon.uibk.ac.at/~zeileis/news/crps_vs_ml/Minimum CRPS vs. maximum likelihood2018-12-17T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/In a new paper in Monthly Weather Review, minimum CRPS and maximum likelihood estimation are compared for fitting heteroscedastic (or nonhomogeneous) regression models under different response distributions. Minimum CRPS is more robust to distributional misspecification while maximum likelihood is slightly more efficient under correct specification.
An R implementation is available in the crch package.<p>In a new paper in Monthly Weather Review, minimum CRPS and maximum likelihood estimation are compared for fitting heteroscedastic (or nonhomogeneous) regression models under different response distributions. Minimum CRPS is more robust to distributional misspecification while maximum likelihood is slightly more efficient under correct specification. An R implementation is available in the crch package.</p> <h3 id="citation">Citation</h3> <p>Manuel Gebetsberger, Jakob W. Messner, Georg J. Mayr, Achim Zeileis (2018). “Estimation Methods for Nonhomogeneous Regression Models: Minimum Continuous Ranked Probability Score versus Maximum Likelihood.” <em>Monthly Weather Review</em>. <strong>146</strong>(12), 4323-4338. <a href="https://doi.org/10.1175/MWR-D-17-0364.1">doi:10.1175/MWR-D-17-0364.1</a></p> <h3 id="abstract">Abstract</h3> <p>Nonhomogeneous regression models are widely used to statistically postprocess numerical ensemble weather prediction models. Such regression models are capable of forecasting full probability distributions and correcting for ensemble errors in the mean and variance. To estimate the corresponding regression coefficients, minimization of the continuous ranked probability score (CRPS) has widely been used in meteorological postprocessing studies and has often been found to yield more calibrated forecasts compared to maximum likelihood estimation. From a theoretical perspective, both estimators are consistent and should lead to similar results, provided the correct distributional assumption about the empirical data. Differences between the estimated values indicate a wrong specification of the regression model. This study compares the two estimators for probabilistic temperature forecasting with nonhomogeneous regression, where results show discrepancies for the classical Gaussian assumption.
The heavy-tailed logistic and Student's t distributions can improve forecast performance in terms of sharpness and calibration, and lead to only minor differences between the estimators employed. Finally, a simulation study confirms the importance of appropriate distribution assumptions and shows that for a correctly specified model the maximum likelihood estimator is slightly more efficient than the CRPS estimator.</p> <h3 id="software">Software</h3> <p><a href="https://CRAN.R-project.org/package=crch">https://CRAN.R-project.org/package=crch</a></p> <p>The function <code class="highlighter-rouge">crch()</code> provides heteroscedastic (or nonhomogeneous) regression models for <code class="highlighter-rouge">"gaussian"</code> (i.e., normally distributed), <code class="highlighter-rouge">"logistic"</code>, or <code class="highlighter-rouge">"student"</code> (i.e., <em>t</em>-distributed) response variables. Additionally, responses may be censored or truncated. Estimation methods include maximum likelihood (<code class="highlighter-rouge">type = "ml"</code>, default) and minimum CRPS (<code class="highlighter-rouge">type = "crps"</code>). Boosting can also be employed for model fitting (instead of full optimization). CRPS computations leverage the excellent <a href="https://CRAN.R-project.org/package=scoringRules">scoringRules</a> package.</p> <h3 id="illustration">Illustration</h3> <p>The plots below show histograms of the PIT (probability integral transform) for various nonhomogeneous regression models yielding probabilistic 1-day-ahead temperature forecasts at an Alpine site (Innsbruck). When the probabilistic forecasts are perfectly calibrated to the actual observations, the PIT histograms should form a straight line at density 1.
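</p> <p>As a brief sketch of how such PIT values can be computed with <code class="highlighter-rouge">crch()</code> (using simulated heteroscedastic Gaussian data rather than the temperature data from the paper; all variable names are illustrative):</p>

```r
library("crch")
set.seed(1)
## simulate a response whose mean and standard deviation both depend on x
x <- runif(500)
d <- data.frame(x = x, y = rnorm(500, mean = 1 + 2 * x, sd = exp(-1 + x)))
## heteroscedastic Gaussian regression (location and scale), fitted by minimum CRPS
m <- crch(y ~ x | x, data = d, dist = "gaussian", type = "crps")
## PIT: evaluate the predictive CDF at the observed responses
pit <- pnorm(d$y, mean = predict(m, type = "location"),
             sd = predict(m, type = "scale"))
hist(pit, breaks = 20, freq = FALSE)
```

<p>For a well-calibrated model these PIT values are approximately uniform on [0, 1], so the histogram should be approximately flat.</p> <p>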
The gray area illustrates the 95% consistency interval around perfect calibration; binning is based on 5% intervals.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-12-17-crps_vs_ml/pit.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-12-17-crps_vs_ml/pit.png" alt="PIT histograms" /></a></p> <p>When a normally distributed or Gaussian response is assumed (left panel), the maximum-likelihood model (solid line) is clearly not well calibrated, as the tails are not heavy enough. (The legend denotes this “LS” because maximizing the likelihood is equivalent to minimizing the so-called log-score.) In contrast, the minimum-CRPS model is reasonably well calibrated.</p> <p>When assuming a Student-t response (right panel), there is little deviation between the two estimation techniques and both are well calibrated.</p> <p>Thus, the source of the differences between CRPS- and ML-based estimation with a Gaussian response comes from assuming a distribution whose tails are not heavy enough. In this situation, minimum-CRPS yields the somewhat more robust model fit, while both estimation techniques lead to very similar results if a more suitable response distribution is adopted. In the latter case, ML is slightly more efficient than minimum-CRPS.</p>2018-12-17T00:00:00+01:00https://eeecon.uibk.ac.at/~zeileis/news/palmtree/Partially additive (generalized) linear model trees2018-10-08T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/The PALM tree algorithm for partially additive (generalized) linear model trees is introduced along with the R package palmtree. One potential application is modeling of treatment-subgroup interactions while adjusting for global additive effects.<p>The PALM tree algorithm for partially additive (generalized) linear model trees is introduced along with the R package palmtree.
One potential application is modeling of treatment-subgroup interactions while adjusting for global additive effects.</p> <h2 id="citation">Citation</h2> <p>Heidi Seibold, Torsten Hothorn, Achim Zeileis (2018). “Generalised Linear Model Trees with Global Additive Effects.” <em>Advances in Data Analysis and Classification</em>. Forthcoming. <a href="https://doi.org/10.1007/s11634-018-0342-1">doi:10.1007/s11634-018-0342-1</a> <a href="http://arxiv.org/abs/1612.07498">arXiv</a></p> <h2 id="abstract">Abstract</h2> <p>Model-based trees are used to find subgroups in data which differ with respect to model parameters. In some applications it is natural to keep some parameters fixed globally for all observations while asking if and how other parameters vary across subgroups. Existing implementations of model-based trees can only deal with the scenario where all parameters depend on the subgroups. We propose partially additive linear model trees (PALM trees) as an extension of (generalised) linear model trees (LM and GLM trees, respectively), in which the model parameters are specified a priori to be estimated either globally from all observations or locally from the observations within the subgroups determined by the tree. Simulations show that the method has high power for detecting subgroups in the presence of global effects and reliably recovers the true parameters. 
Furthermore, treatment-subgroup differences are detected in an empirical application of the method to data from a mathematics exam: the PALM tree is able to detect a small subgroup of students that had a disadvantage in an exam with two versions while adjusting for overall ability effects.</p> <h2 id="software">Software</h2> <p><a href="https://CRAN.R-project.org/package=palmtree">https://CRAN.R-project.org/package=palmtree</a></p> <h2 id="illustration-treatment-differences-in-mathematics-exam">Illustration: Treatment differences in mathematics exam</h2> <p>PALM trees are employed to investigate treatment differences in a mathematics 101 exam (for first-year business and economics students) at Universität Innsbruck. Due to limited availability of seats in the exam room, students could self-select into one of two exam tracks that were conducted back to back with slightly different questions on the same topics. The question is whether this “treatment” of splitting the students into two tracks was fair in the sense that it is on average equally difficult for the two groups. 
To investigate this question, the data are loaded from the <a href="https://CRAN.R-project.org/package=psychotools">psychotools</a> package, points are scaled to percent achieved in [0, 100], and the subset of variables for the analysis is selected:</p> <pre><code class="language-{r}">data("MathExam14W", package = "psychotools") MathExam14W$tests <- 100 * MathExam14W$tests/26 MathExam14W$pcorrect <- 100 * MathExam14W$nsolved/13 MathExam <- MathExam14W[ , c("pcorrect", "group", "tests", "study", "attempt", "semester", "gender")] </code></pre> <p>A naive check could be whether the percentage of correct points (<code class="highlighter-rouge">pcorrect</code>) differs between the two <code class="highlighter-rouge">group</code>s:</p> <pre><code class="language-{r}">ci <- function(object) cbind("Coefficient" = coef(object), confint(object)) ci(lm(pcorrect ~ group, data = MathExam)) ## Coefficient 2.5 % 97.5 % ## (Intercept) 57.60 55.1 60.08 ## group2 -2.33 -5.7 1.03 </code></pre> <p>This shows that the second group achieved on average 2.33 percentage points less than the first group. But the corresponding confidence interval conveys that this difference is not significant.</p> <p>However, it is conceivable that stronger (or weaker) students selected themselves more into one of the two groups. And if the assignment had been random, then the “treatment effect” might have been larger or even smaller. Luckily, an independent measure of the students’ ability is available, namely the percentage of points achieved in the online <code class="highlighter-rouge">tests</code> conducted during the semester prior to the exam. Adjusting for that increases the treatment effect to a decrease of 4.37 percentage points, which is still non-significant, though. This is due to weaker students self-selecting into the second group.
Moreover, the <code class="highlighter-rouge">tests</code> coefficient signals that 1 more percentage point from the online tests leads on average to 0.855 more percentage points in the written exam.</p> <pre><code class="language-{r}">ci(lm(pcorrect ~ group + tests, data = MathExam)) ## Coefficient 2.5 % 97.5 % ## (Intercept) -5.846 -13.521 1.828 ## group2 -4.366 -7.231 -1.502 ## tests 0.855 0.756 0.955 </code></pre> <p>Finally, PALM trees are used to assess whether there are subgroups of differential <code class="highlighter-rouge">group</code> treatment effects when adjusting for a global additive <code class="highlighter-rouge">tests</code> effect. Potential subgroups can be formed from the covariates <code class="highlighter-rouge">tests</code>, type of <code class="highlighter-rouge">study</code> (three-year bachelor vs. four-year diploma), the number of times the students <code class="highlighter-rouge">attempt</code>ed the exam, the number of <code class="highlighter-rouge">semester</code>s, and <code class="highlighter-rouge">gender</code>.
Using <a href="https://CRAN.R-project.org/package=palmtree">palmtree</a> this can be easily carried out:</p> <pre><code class="language-{r}">library("palmtree") palmtree_math <- palmtree(pcorrect ~ group | tests | tests + study + attempt + semester + gender, data = MathExam) print(palmtree_math) ## Partially additive linear model tree ## ## Model formula: ## pcorrect ~ group | tests + study + attempt + semester + gender ## ## Fitted party: ## [1] root ## | [2] attempt <= 1 ## | | [3] tests <= 92.3: n = 352 ## | | (Intercept) group2 ## | | -7.09 -3.00 ## | | [4] tests > 92.3: n = 79 ## | | (Intercept) group2 ## | | 14.0 -14.5 ## | [5] attempt > 1: n = 298 ## | (Intercept) group2 ## | 2.33 -1.70 ## ## Number of inner nodes: 2 ## Number of terminal nodes: 3 ## Number of parameters per node: 2 ## Objective function (residual sum of squares): 253218 ## ## Linear fixed effects (from palm model): ## tests ## 0.787 </code></pre> <p>A somewhat enhanced version of <code class="highlighter-rouge">plot(palmtree_math)</code> is shown below:</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-10-08-palmtree/palmtree-math.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-10-08-palmtree/palmtree-math.png" alt="PALM tree for mathematics exam data" /></a></p> <p>This indicates that for most students the <code class="highlighter-rouge">group</code> treatment effect is indeed negligible. However, for the subgroup of “good” students (with high percentage correct in the online tests) in the first attempt, the exam in the second group was indeed more difficult. 
On average, the students in the second group obtained 14.5 percentage points less than those in the first group.</p> <pre><code class="language-{r}">ci(palmtree_math$palm) ## Coefficient 2.5 % 97.5 % ## (Intercept) -7.088 -16.148 1.971 ## .tree4 21.069 13.348 28.791 ## .tree5 9.421 5.168 13.673 ## tests 0.787 0.671 0.903 ## .tree3:group2 -2.997 -6.971 0.976 ## .tree4:group2 -14.494 -22.921 -6.068 ## .tree5:group2 -1.704 -5.965 2.557 </code></pre> <p>The absolute size of this group difference is still moderate, though, corresponding to about half an exercise out of 13.</p> <h2 id="simulation-study">Simulation study</h2> <p>In addition to the empirical case study, the manuscript also provides an extensive simulation study comparing the performance of PALM trees in treatment-subgroup scenarios to standard linear model (LM) trees, optimal treatment regime (OTR) trees (following Zhang et al. 2012), and the STIMA algorithm (simultaneous threshold interaction modeling algorithm). The study evaluates the methods with respect to (1) finding the correct subgroups, (2) not splitting when there are no subgroups, (3) finding the optimal treatment regime, and (4) correctly estimating the treatment effect.</p> <p>Here we just briefly highlight the results for question (1): Are the correct subgroups found? The figure below shows the mean number of subgroups (over 150 simulated data sets) and the mean adjusted Rand index (ARI) for increasing treatment effect differences Δ<sub>β</sub> and number of observations n.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-10-08-palmtree/palmtree-sim.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-10-08-palmtree/palmtree-sim.png" alt="Simulation results" /></a></p> <p>This shows that PALM trees perform increasingly well and somewhat better with respect to these metrics than the competitors. More details on the different scenarios and corresponding evaluations can be found in the manuscript.
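</p> <p>The adjusted Rand index used above compares the true subgroup memberships with those recovered by a tree; it can be computed directly from the contingency table of the two partitions. A self-contained sketch (for illustration only, not the code from the simulation study):</p>

```r
## adjusted Rand index between two partitions given as label vectors
ari <- function(x, y) {
  tab <- table(x, y)
  a <- sum(choose(tab, 2))            ## pairs grouped together in both partitions
  bx <- sum(choose(rowSums(tab), 2))  ## pairs grouped together in x
  by <- sum(choose(colSums(tab), 2))  ## pairs grouped together in y
  e <- bx * by / choose(sum(tab), 2)  ## expected agreement under random labeling
  (a - e) / ((bx + by) / 2 - e)
}
## identical partitions (up to label switching) yield an ARI of 1
ari(c(1, 1, 1, 2, 2, 2), c(2, 2, 2, 1, 1, 1))
```

<p>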
More replication materials are provided along with the manuscript on the publisher’s web page.</p>2018-10-08T00:00:00+02:00https://eeecon.uibk.ac.at/~zeileis/news/thunderstorm_forecasting/Thunderstorm forecasting with GAMs2018-09-16T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://eeecon.uibk.ac.at/~zeileis/Boosted binary generalized additive models (GAMs) with stability selection and corresponding MCMC-based credibility intervals are discussed in a new MWR paper as a probabilistic forecasting method for the occurrence of thunderstorms.<p>Boosted binary generalized additive models (GAMs) with stability selection and corresponding MCMC-based credibility intervals are discussed in a new MWR paper as a probabilistic forecasting method for the occurrence of thunderstorms.</p> <h2 id="citation">Citation</h2> <p>Thorsten Simon, Peter Fabsic, Georg J. Mayr, Nikolaus Umlauf, Achim Zeileis (2018). “Probabilistic Forecasting of Thunderstorms in the Eastern Alps.” <em>Monthly Weather Review</em>. <strong>146</strong>(9), 2999-3009. <a href="https://dx.doi.org/10.1175/MWR-D-17-0366.1">doi:10.1175/MWR-D-17-0366.1</a></p> <h2 id="abstract">Abstract</h2> <p>A probabilistic forecasting method to predict thunderstorms in the European eastern Alps is developed. A statistical model links lightning occurrence from the ground-based Austrian Lightning Detection and Information System (ALDIS) detection network to a large set of direct and derived variables from a numerical weather prediction (NWP) system. The NWP system is the high-resolution run (HRES) of the European Centre for Medium-Range Weather Forecasts (ECMWF) with a grid spacing of 16 km. The statistical model is a generalized additive model (GAM) framework, which is estimated by Markov chain Monte Carlo (MCMC) simulation. Gradient boosting with stability selection serves as a tool for selecting a stable set of potentially nonlinear terms. 
Three grids from 64 x 64 to 16 x 16 km<sup>2</sup> and five forecast horizons from 5 days to 1 day ahead are investigated to predict thunderstorms during afternoons (1200–1800 UTC). Frequently selected covariates for the nonlinear terms are variants of convective precipitation, convective available potential energy, relative humidity, and temperature in the midlayers of the troposphere, among others. All models, even for a lead time of 5 days, outperform a forecast based on climatology in an out-of-sample comparison. An example case illustrates that coarse spatial patterns are already successfully forecast 5 days ahead.</p> <h2 id="software">Software</h2> <p><a href="https://CRAN.R-project.org/package=bamlss">https://CRAN.R-project.org/package=bamlss</a></p> <h2 id="case-study">Case study</h2> <p>Predicting thunderstorms in complex terrain (like the Austrian Alps) is a challenging task since one of the main forecasting tools, NWP systems, cannot fully resolve convective processes or circulations and exchange processes over complex topography. However, using a boosted binary GAM based on a broad range of NWP outputs, useful forecasts can be obtained up to 5 days ahead. As an illustration, lightning activity for the afternoon of 2015-07-22 is shown in the top-left panel below, indicating thunderstorms in many areas in the west but not the east.
While the corresponding baseline climatology (top middle) has a low probability of thunderstorms for the entire region, the NWP-based probabilistic forecasts (bottom row) highlight increased probabilities already 5 days ahead, becoming much more clear cut when moving to 3 days and 1 day ahead.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-09-16-thunderstorm_forecasting/map.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-09-16-thunderstorm_forecasting/map.png" alt="observed and forecasted occurrence of thunderstorms on 2015-07-22" /></a></p> <p>More precisely, the probability of thunderstorms is predicted based on a binary logit GAM that allows for potentially nonlinear smooth effects in all NWP variables considered. It selects the relevant variables by gradient boosting coupled with stability selection. Effects and 95% credible intervals of the model for day 1 are estimated via MCMC sampling and shown below (on the logit scale). The number in the bottom-right corner of each panel indicates the absolute range of the effect. The x-axes are cropped at the 1% and 99% quantiles of the respective covariate to enhance graphical representation.</p> <p><a href="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-09-16-thunderstorm_forecasting/effects.png"><img src="https://eeecon.uibk.ac.at/~zeileis/assets/posts/2018-09-16-thunderstorm_forecasting/effects.png" alt="stability-selected effects of the boosted binary logit GAM" /></a></p> <p><em>(Note: As the data cannot be shared freely, the customary replication materials unfortunately cannot be provided.)</em></p>2018-09-16T00:00:00+02:00