
Principle-Driven Continuous Integration: Simplifying Failure Discovery and Raising Anti-Pattern Awareness

Public PhD Thesis Defense of Carmine Vassallo

Advisor: Prof. Dr. Harald C. Gall
2nd Advisor: Prof. Dr. Laurie Williams
3rd Advisor: Prof. Dr. Sebastian Proksch

Chair: Prof. Dr. Davide Scaramuzza

Date and time: Friday, September 18, 2020, 16:00 h
Location:  remotely via “Zoom” (link expired)

Extended Abstract: Continuous Integration (CI) is a software development practice that enables developers to build reliable software faster. Given its proven benefits, such as increased developer productivity and higher release frequency, most organizations have started adopting CI. This practice advocates full automation of all build steps (i.e., compilation, testing, and code quality assessment) to create a new version of the software. However, the mere introduction of an automated build infrastructure is not sufficient to practice CI well and to achieve its goals. Organizations also have to foster the application of several principles, such as “commit often”, that reduce conflicts in the team and ensure that the build is continuously executable. Living up to these principles is not easy, especially when developers face tough deadlines. As a consequence, developers tend to deviate from the principles and introduce anti-patterns, which are ineffective solutions to recurrent problems. Anti-patterns appear to be beneficial but, in the end, they let CI decay and lower its effectiveness.

In this dissertation, we characterize the problem of anti-patterns in order to implement solutions that help developers remove the root causes of anti-patterns and, therefore, follow principles. Based on the results of a preliminary study of open-source projects that revealed deviations from core principles, we empirically derive a catalog of 79 anti-patterns encountered by developers in practice, by conducting semi-structured interviews with 13 practitioners and manually analyzing 2,300 posts from a well-known forum (i.e., StackOverflow) where users discuss issues related to the adoption of CI. By interpreting the resulting catalog of anti-patterns, we identify four main causes for their presence: (i) poor knowledge of the prerequisites for adopting CI, (ii) the difficulty of inspecting build failure logs, (iii) the presence of bad configurations, and (iv) the wrong usage of a CI process. While only better coaching in CI can efficiently remove the first cause, we implement several approaches to address the others.

To improve the understandability of build failure logs, we first build a taxonomy of build failures through the manual analysis of errors contained in 34,182 build logs from open-source and closed-source projects, and then we develop a tool called BART that produces summaries for the most common build failure types. We evaluate the performance of our tool in a controlled experiment with 17 developers. To identify violations of CI principles in the form of configuration smells that developers should remove, we propose CD-Linter, a semantic linter for CI/CD configuration files. We evaluate it by opening 145 issues in open-source projects and monitoring the acceptance of our bug reports and the removal of the reported smells over a period of six months. Finally, we implement CI-Odor, an automated reporting tool that leverages information from repository and build history to monitor the presence of bad practices that slowly creep into a project over time. We evaluate its usefulness by sending developers reports produced for 36 open-source projects.

The results of our evaluations show that the proposed approaches are effective at identifying and removing the aforementioned causes of anti-patterns and, consequently, at enforcing a principle-driven continuous integration. BART improves the understandability of the most common build failure types, and developers are faster in solving build failures. In the presence of build summaries, the resolution time is reduced by 23% when solving testing failures, 20% when repairing compilation errors, 43% when fixing missing dependencies, and 62% when dealing with code analysis failures. CD-Linter identifies smells that are relevant for developers. During the 6-month observation period, 53% of the project maintainers react positively to the issues detected by CD-Linter: 9% confirm the validity of the reported problem and 44% fix it. Finally, the reports generated by CI-Odor are useful for monitoring anti-patterns. Many developers (67%) expect a positive effect of using our generated reports on their CI discipline, and the majority (55%) are already willing to integrate CI-Odor into their CI processes.


Today was a Good Day: The Daily Life of Software Developers

Co-Authors: André N. Meyer (University of Zurich), Earl T. Barr (University College London),  Chris Bird (Microsoft Research), Tom Zimmermann (Microsoft Research)

Abstract: What is a good workday for a software developer? What is a typical workday? We seek to answer these two questions to learn how to make good days typical. Concretely, answering these questions will help to optimize development processes and select tools that increase job satisfaction and productivity. Our work adds to a large body of research on how software developers spend their time. We report the results from 5971 responses of professional developers at Microsoft, who reflected on what made their workdays good and typical, and self-reported how they spent their time on various activities at work. We developed conceptual frameworks to help define and characterize developer workdays from two new perspectives: good and typical. Our analysis confirms some findings of previous work, including the fact that developers actually spend little time on development, and developers’ aversion to meetings and interruptions. It also yields new findings, such as that only 1.7% of survey responses mentioned emails as a reason for a bad workday, and that meetings and interruptions are only unproductive during development phases; during phases of planning, specification and release, they are common and constructive. One key finding is the importance of agency: developers’ control over their workday and whether it goes as planned or is disrupted by external factors. We present actionable recommendations for researchers and managers to prioritize process and tool improvements that make good workdays typical. For instance, in light of our finding on the importance of agency, we recommend that, where possible, managers empower developers to choose their tools and tasks.

You may download the pre-print here.

Conceptual Framework characterizing typical developer workdays
Conceptual Framework characterizing good developer workdays

seal @ ICSE 2017

We are very happy to announce that our research group had two papers accepted at ICSE 2017 in Buenos Aires, Argentina.

The first accepted paper is entitled “Analyzing APIs Documentation and Code to Detect Directive Defects” and was written by Yu Zhou, Ruihang Gu, Taolue Chen, Zhiqiu Huang, Sebastiano Panichella and Harald Gall.

Abstract: “Application Programming Interface (API) documents represent one of the most important references for API users. However, it is frequently reported that the documentation is inconsistent with the source code and deviates from the API itself. Such inconsistencies in the documents inevitably confuse the API users hampering considerably their API comprehension and the quality of software built from such APIs.


In this paper, we propose an automated approach to detect defects of API documents by leveraging techniques from program comprehension and natural language processing. Particularly, we focus on the directives of the API documents which are related to parameter constraints and exception throwing declarations. A first-order logic based constraint solver is employed to detect such defects based on the obtained analysis results. We evaluate our approach on parts of well documented JDK 1.8 APIs. Experiment results show that, out of around 2000 API usage constraints, our approach can detect 1146 defective document directives, with a precision rate of 83.1%, and a recall rate of 81.2%, which demonstrates its practical feasibility.”
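For readers who want a single number combining the two rates quoted above, the harmonic mean (F-measure) can be computed as in the following minimal Python sketch. The function name `f_measure` is our own, and the resulting value is derived here for illustration only; it is not a figure reported in the abstract.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as quoted in the abstract above.
f1 = f_measure(0.831, 0.812)
print(f"{f1:.3f}")  # prints 0.821
```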

A preprint of the paper will be available soon.

The second paper is entitled “Recommending and Localizing Code Changes for Mobile Apps based on User Reviews” and was written in collaboration with the University of Salerno. The authors of the paper are: Fabio Palomba, Pasquale Salza, Adelina Ciurumelea, Sebastiano Panichella, Harald Gall, Filomena Ferrucci and Andrea De Lucia.

Abstract: “Researchers have proposed several approaches to extract information from user reviews useful for maintaining and evolving mobile apps. However, most of them just perform automatic classification of user reviews according to specific keywords (e.g., bugs, features). Moreover, they do not provide any support for linking user feedback to the source code components to be changed, thus requiring a manual, time-consuming, and error-prone task.

In this paper, we introduce ChangeAdvisor, a novel approach that analyzes the structure, semantics, and sentiments of sentences contained in user reviews to extract useful (user) feedback from maintenance perspectives and recommend to developers changes to software artifacts. It relies on natural language processing and clustering algorithms to group user reviews around similar user needs and suggestions for change. Then, it involves textual based heuristics to determine the code artifacts that need to be maintained according to the recommended software changes. The quantitative and qualitative studies carried out on 44683 user reviews of 10 open source mobile apps and their original developers showed a high accuracy of ChangeAdvisor in (i) clustering similar user change requests and (ii) identifying the code components impacted by the suggested changes.

Moreover, the obtained results show that ChangeAdvisor is more accurate than a baseline approach for linking user feedback clusters to the source code in terms of both precision (+47%) and recall (+38%).”

A preprint of this paper will also be available soon.

“Reducing Redundancies in Multi-Revision Code Analysis” @ SANER’17

We’re happy to announce that the paper

“Reducing Redundancies in Multi-Revision Code Analysis”

written by Carol V. Alexandru, Sebastiano Panichella and Harald C. Gall, has been accepted into the technical research track of SANER 2017.


Software engineering research often requires analyzing multiple revisions of several software projects, be it to make and test predictions or to observe and identify patterns in how software evolves. However, code analysis tools are almost exclusively designed for the analysis of one specific version of the code, and the time and resources requirements grow linearly with each additional revision to be analyzed. Thus, code studies often observe a relatively small number of revisions and projects. Furthermore, each programming ecosystem provides dedicated tools, hence researchers typically only analyze code of one language, even when researching topics that should generalize to other ecosystems. To alleviate these issues, frameworks and models have been developed to combine analysis tools or automate the analysis of multiple revisions, but little research has gone into actually removing redundancies in multi-revision, multi-language code analysis. We present a novel end-to-end approach that systematically avoids redundancies every step of the way: when reading sources from version control, during parsing, in the internal code representation, and during the actual analysis. We evaluate our open-source implementation, LISA, on the full history of 300 projects, written in 3 different programming languages, computing basic code metrics for over 1.1 million program revisions. When analyzing many revisions, LISA requires less than a second on average to compute basic code metrics for all files in a single revision, even for projects consisting of millions of lines of code.
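The core idea of not repeating work for code that did not change between revisions can be illustrated with a small Python sketch. This is a deliberately simplified illustration and not LISA's actual implementation: the function names, the toy line-count metric, and the in-memory revision format are all assumptions made for the example. It caches a per-file metric by content hash, so a file that is identical in many revisions is analyzed only once.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def count_lines(source: str) -> int:
    """Toy metric (hypothetical): number of non-blank lines in a file."""
    return sum(1 for line in source.splitlines() if line.strip())


def analyze_revisions(
    revisions: List[Dict[str, str]],
    metric: Callable[[str], int] = count_lines,
) -> Tuple[List[Dict[str, int]], int]:
    """Compute `metric` for every file of every revision.

    Each revision maps a file path to its content. Results are cached by
    content hash, so unchanged files are not re-analyzed. Returns the
    per-revision results and the number of metric computations performed.
    """
    cache: Dict[str, int] = {}
    computed = 0
    results: List[Dict[str, int]] = []
    for files in revisions:
        per_file: Dict[str, int] = {}
        for path, source in files.items():
            key = hashlib.sha256(source.encode("utf-8")).hexdigest()
            if key not in cache:
                cache[key] = metric(source)
                computed += 1
            per_file[path] = cache[key]
        results.append(per_file)
    return results, computed


# Three revisions of a two-file project; only one file changes per step.
revisions = [
    {"a.py": "x = 1\n", "b.py": "y = 2\n\nz = 3\n"},
    {"a.py": "x = 1\n", "b.py": "y = 2\n\nz = 4\n"},
    {"a.py": "x = 1\nw = 0\n", "b.py": "y = 2\n\nz = 4\n"},
]
results, computed = analyze_revisions(revisions)
total = sum(len(files) for files in revisions)
print(f"{computed} metric computations for {total} file versions")
```

In this toy history, four distinct file contents exist across six file versions, so two computations are skipped. The approach described in the abstract goes further and also avoids redundancy while reading from version control, during parsing, and in the internal code representation.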

Use and extend LISA:

Or try out LISA using a simple template:


“Analyzing Reviews and Code of Mobile Apps for better Release Planning” @ SANER 2017

We’re happy to announce that the paper “Analyzing Reviews and Code of Mobile Apps for better Release Planning” has been accepted into SANER 2017 as a full paper. The authors of the paper are: Adelina Ciurumelea, Andreas Schaufelbühl, Sebastiano Panichella and Harald Gall.


The mobile applications industry is experiencing unprecedented growth, and developers working in this context face fierce competition in acquiring and retaining users. They have to quickly implement new features and fix bugs, or risk losing their users to the competition. To achieve this goal, they must closely monitor and analyze the user feedback they receive in the form of reviews. However, successful apps can receive up to several thousand reviews per day, so manually analyzing each of them is a time-consuming task.


To help developers deal with the large amount of available data, we manually analyzed the text of 1566 user reviews and defined a high and low level taxonomy containing mobile specific categories (e.g. performance, resources, battery, memory, etc.) highly relevant for developers during the planning of maintenance and evolution activities. Then we built the User Request Referencer (URR) prototype, using Machine Learning and Information Retrieval techniques, to automatically classify reviews according to our taxonomy and to recommend, for a particular review, the source code artifacts that need to be modified to handle the issue described in that review. We evaluated our approach through an empirical study involving the reviews and code of 39 mobile applications. Our results show a high precision and recall of URR in organising reviews according to the defined taxonomy. Furthermore, we discovered during the evaluation that using information about the specific structure of mobile software projects (e.g. how to find source code implementing the UI) improves the source code localization results.

“Using (Bio)Metrics to Predict Code Quality” is currently one of the most downloaded articles in software engineering

We are happy to announce that our paper “Using (Bio)Metrics to Predict Code Quality Online”, written by Sebastian Müller and Thomas Fritz, was one of the most downloaded software engineering articles in June and July 2016. With 1709 downloads in 6 weeks, it ranked second among all ACM software engineering articles. According to ACM, this is the first time that any paper was downloaded more than 1000 times.


Image source: ACM SIGSOFT Software Engineering Notes. Volume 41 Number 4.

The paper investigates the use of biometrics, such as heart rate variability (HRV) or electro-dermal activity (EDA) to determine the difficulty that developers experience while working on real world change tasks and automatically identify code quality concerns while a developer is making a change to the code. It can be accessed here.

ARdoc: App Reviews Development Oriented Classifier @ FSE 2016

We are happy to announce that the paper “ARdoc: App Reviews Development Oriented Classifier” got accepted at the FSE 2016 Demonstrations Track! The authors of the paper are: Sebastiano Panichella, Andrea Di Sorbo, Emitza Guzman, Corrado Aaron Visaggio, Gerardo Canfora and Harald Gall.

The paper presents ARdoc (App Reviews Development Oriented Classifier), a Java tool that automatically recognizes natural language fragments in user reviews that are relevant for developers to evolve their applications. Specifically, natural language fragments are extracted according to a taxonomy of app review categories that are relevant to software maintenance and evolution. The categories were defined in our previous paper entitled “How Can I Improve My App? Classifying User Reviews for Software Maintenance and Evolution” and are: (i) Information Giving, (ii) Information Seeking, (iii) Feature Request and (iv) Problem Discovery. ARdoc implements an approach that merges three techniques: (1) Natural Language Processing, (2) Text Analysis and (3) Sentiment Analysis (SA) to automatically classify useful feedback contained in app reviews important for performing software maintenance and evolution tasks.

Our quantitative and qualitative analysis (involving professional mobile developers) demonstrates that ARdoc correctly classifies feedback in user reviews that is useful from a maintenance perspective, with high precision (ranging between 84% and 89%), recall (ranging between 84% and 89%), and F-Measure (ranging between 84% and 89%). While evaluating our tool, we also found that ARdoc substantially helps to extract important maintenance tasks for real-world applications.

This video provides a short demonstration of ARdoc:

ARdoc is available for download at

What Would Users Change in My App? Summarizing App Reviews for Recommending Software Changes @ FSE 2016

We’re happy to announce that the paper

“What Would Users Change in My App? Summarizing App Reviews for Recommending Software Changes” has been accepted into FSE 2016 as a full paper. The authors of the paper are: Andrea Di Sorbo, Sebastiano Panichella, Carol Alexandru, Junji Shimagaki, Corrado Visaggio, Gerardo Canfora and Harald Gall.

Mobile app developers constantly monitor feedback in user reviews with the goal of improving their mobile apps and better meeting user expectations. Thus, automated approaches have been proposed in literature with the aim of reducing the effort required for analyzing feedback contained in user reviews via automatic classification (or prioritization) according to specific topics (e.g., bugs, features etc.).


In this paper, we introduce SURF (Summarizer of User Reviews Feedback), a novel approach to condense the enormous amount of information that developers of popular apps have to manage due to user feedback received on a daily basis. SURF relies on a conceptual model for capturing user needs useful for developers performing maintenance and evolution tasks. Then it uses sophisticated summarisation techniques for summarizing thousands of reviews and generating an interactive, structured and condensed agenda of recommended software changes. We performed an end-to-end evaluation of SURF on user reviews of 17 mobile apps (5 of them developed by Sony Mobile), involving 23 developers and researchers in total. Results demonstrate high accuracy of SURF in summarizing reviews and the usefulness of the recommended changes. In evaluating our approach we found that SURF helps developers in better understanding user needs, substantially reducing the time required by developers compared to manually analyzing user (change) requests and planning future software changes.

Three Open Positions for Researchers / PhD Students in Software Development for Cloud and Mobile Apps

The software evolution and architecture lab (s.e.a.l.) at the University of Zurich, Switzerland, is seeking applications for three PhD students in the areas of software development for cloud and mobile applications. All positions are fully funded, available immediately, and open until filled.

One student will work with Dr. Philipp Leitner and Prof. Harald Gall on the SNF-funded project “MINCA – Models to Increase the Cost Awareness of Cloud Developers”. The student shall be interested in the intersection of software engineering and cloud computing (distributed systems) research, and be able and willing to pursue empirical research (e.g., repository mining, interview or survey research, concept prototyping, and statistical modelling and analysis). Some more information on this line of research can be found on Philipp Leitner’s web page.

Two students will work with Dr. Sebastiano Panichella and Prof. Harald Gall on the SNF-funded project “SURF-MobileAppsData”. This project focuses on mining mobile apps data available in app stores to support software engineers in better maintaining and evolving these apps. In particular, the goal of mining data of mobile apps is to build an analysis framework and a feedback-driven environment that help developers build better mobile applications. Some more information on this line of research can be found on Sebastiano Panichella’s web page.

Our group consists of 2 professors, 3 post-docs, and 8-10 PhD students, all working on how to improve software developer productivity and software quality. We have a track record of substantial impact at international venues and are well funded, both on the national and European level. We cooperate with researchers around the world, including companies such as Microsoft, Google, IBM, ABB, or SAP.

The Department of Informatics is the competence center for Informatics at the University of Zurich. Ten tenured professors, four assistant professors, and approximately 100 PhD students and postdoctoral researchers instruct and conduct research at the department. Zurich is a leading global city and among the world’s largest financial centers. The city is home to a large number of financial institutions and IT companies. Most of Switzerland’s research and development centers are concentrated in Zurich and the low tax rates attract overseas companies to set up their headquarters there. Quality of life is very high in Zurich.

English is the working and teaching language all over computer science. Germans generally have a reasonable command of English, and in day-to-day life you can easily get along with English as your only language.

Mandatory conditions of employment are:

– Master’s degree (MSc) in Computer Science and/or Software Engineering

– Fluency in English

The position requires relocating to Zurich.

We offer an internationally competitive salary in accordance with University of Zurich regulations. The university aims to increase the number of women in this field. Therefore, women are especially encouraged to apply for this position.

Starting date: The positions are available immediately. Applications are accepted until the positions are filled.

Applications: Please email a statement of interest, a detailed CV, a writing sample (e.g., a published or submitted paper, or your thesis), and at least one letter of reference from a faculty member (either from your home institution or a past collaborator) as one PDF document. Please indicate which position you are applying for by using the tag [CLOUD-Student] or [MOBILE-Student] in your subject line.

More information: Please contact Sebastiano Panichella or Philipp Leitner with any questions.