
Screencast: Fostering Software Developers’ Productivity at Work

This is a screencast of a talk I recently gave at Tasktop. I talked about how we aim to improve developers’ productivity by increasing their awareness of their work, interruptions, habits and goals.

Click here to access the full blogpost by Patrick Anderson from Tasktop.

Find out more about this work:

Sensing Interruptibility in the Office: A Field Study on the Use of Biometric and Computer Interaction Sensors

Knowledge workers experience many interruptions during their work day. Especially when they happen at inopportune moments, interruptions can incur high costs and cause time loss and frustration. Knowing a person’s interruptibility allows optimizing the timing of interruptions and minimizing disruption. Recent advances in technology provide the opportunity to collect a wide variety of data on knowledge workers to predict interruptibility. While prior work predominantly examined interruptibility based on a single data type and in short lab studies, we conducted a two-week field study with 13 professional software developers to investigate a variety of computer interaction, heart-, sleep-, and physical activity-related data. Our analysis shows that computer interaction data is more accurate in predicting interruptibility at the computer than biometric data (74.8% vs. 68.3% accuracy), and that combining both yields the best results (75.7% accuracy). We discuss our findings and their practical applicability, also in light of the qualitative data we collected.

You may access the pre-print here.

Characterizing Software Developers by Perceptions of Productivity

This work was conducted by André Meyer (UZH), Thomas Zimmermann (Microsoft Research) and Thomas Fritz (UBC). The research was published in the industrial papers track of ESEM’17 in Toronto. Thomas Zimmermann will present it on Thursday, November 9th, 2017 at 1pm in Session 4B: Qualitative Research. Download Pre-Print

Studying Developers’ Perceptions of Productivity instead of Measuring it

To meet the ever-growing demand for software, we need new ways of optimizing the productivity of software developers. Existing work has predominantly focused on top-down approaches for defining or measuring productivity, such as lines of code, function points, or completed tasks over time. While these measurements are valuable for comparing certain aspects of productivity, we argue that they miss many other factors that influence the success and productivity of a software developer, such as the fragmentation of their work and their experience. A developer who spends the workday writing a high-quality test case or helping a co-worker would get a bad productivity score with these measurements. Hence, in our previous work we looked at productivity from the bottom up, studying developers’ individual perceptions of productivity. We found that while perceptions of productivity are indeed very individual, they follow certain habitual patterns each day (e.g. morning people, low-at-lunch people, and afternoon people), and there are activities that most developers consider unproductive or productive.

Similar Perceptions of Productivity

This previous work, however, left us wondering whether there are groups of developers with similar perceptions of productivity that can be clustered together. To investigate this, we ran an online survey with 413 professional software developers who currently work at Microsoft (average experience 9.6 years). We asked them to describe productive (Q1) and unproductive (Q2) workdays, to rate their agreement with statements on factors that might affect productivity (Q3), and to rate the interestingness of productivity measures at work (Q4).

We found that developers can roughly be clustered into six groups with similar perceptions: the social, lone, focused, balanced, leading, and goal-oriented developer. This allows us to abstract and simplify the variety of individual perceptions into groups and to optimize productivity for these groups instead of individuals. In the following, I describe the specific characteristics of these groups:

Some just love creative tasks with no clear goal, while others prefer measurable tasks.
  1. The social developers feel productive when helping coworkers, collaborating and doing code reviews. To get things done, they come early to work or work late and try to focus on a single task.
  2. The lone developers avoid disruptions such as noise, email, meetings, and code reviews. They feel most productive when they have little to no social interaction and when they can work on solving problems, fixing bugs or coding features in quiet and without interruptions. To reflect on their work, they are mostly interested in knowing the frequency and duration of the interruptions they encountered. Note that this group of developers is almost the opposite of the first group (the social developer) in how productive they feel when encountering social interactions.
  3. The focused developers feel most productive when they are working efficiently and concentrating on a single task at a time. They feel unproductive when they waste time or spend too much time on a task because they are stuck or working slowly. They are interested in knowing the number of interruptions and the amount of focused time.
  4. The balanced developers are less affected by disruptions. They are less likely to come to work early or to work late. They feel unproductive when tasks are unclear or irrelevant, when they are unfamiliar with a task, or when tasks cause overhead.
  5. The leading developers are more comfortable with meetings and emails and feel less productive with coding activities than other developers. They feel more productive in the afternoon and when they can write and design things. They do not like broken builds and blocking tasks, which prevent them (or the team) from doing productive work.
  6. The goal-oriented developers feel productive when they complete or make progress on tasks. They feel less productive when they multi-task, are goal-less or are stuck. They are more open to meetings and emails than the other clusters, provided these help them achieve their goals. In contrast to group 3 (the focused developer), goal-oriented developers care more about actually getting stuff done (i.e. crossing items off the to-do list), while the focused developer cares more about working efficiently.

Optimizing Productivity for Different Groups of Developers

The six clusters and their characteristics provide relevant insights into groups of developers with similar productivity perceptions that can be used to optimize the work and flow on the team and the individual level. The differences between software developers’ preferred collaboration and work styles show that not all developers are alike, and that the cluster an individual or team belongs to could be a basis for tailoring actions for improving their work and productivity.

For example, on the team level, we could provide quiet, less interruption-prone offices to the lone and focused developers (clusters 2 and 3), and seat the social developers (cluster 1), who feel more comfortable with discussions every now and then, in a separate area. Another example is task assignment: an explorative task for a new product that is very open and has no clear goal might be less suitable for the goal-oriented developer (cluster 6) than for the social and leading developers (clusters 1 and 5), who prefer explorative tasks that require intensive collaboration.

Not everyone feels productive when spending time in meetings.

On the individual level, developers might benefit from tailored user experiences for their (development) tools. Maybe someday we can build virtual assistants, e.g. Cortana/Alexa for Developers, that recommend (or automatically take) actions depending on the developer’s cluster. For example, they could block notifications from email, Slack, and Skype during coding sessions for the lone developer (cluster 2) but allow them for the social developer (cluster 1). Or they could recommend that the focused developer (cluster 3) come to work early to have uninterrupted work time, or suggest that the balanced developer (cluster 4) take a break to avoid boredom and tiredness. Or they could help with scheduling meetings, depending on the user’s preferences.
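To make this idea a bit more concrete, here is a minimal sketch of how such an assistant might map a developer's cluster to actions. Everything in it is hypothetical: the `Recommendation` type, the `recommend` function and the exact rules are illustrative assumptions derived from the examples above, not something we have built.

```python
from dataclasses import dataclass

# Cluster numbers follow the list in this post.
CLUSTERS = {1: "social", 2: "lone", 3: "focused",
            4: "balanced", 5: "leading", 6: "goal-oriented"}


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical result type: what the assistant would do or suggest."""
    block_notifications: bool
    suggestion: str = ""


def recommend(cluster: int, in_coding_session: bool) -> Recommendation:
    """Return an illustrative recommendation for a developer's cluster."""
    if cluster not in CLUSTERS:
        raise ValueError(f"unknown cluster: {cluster}")
    if cluster == 2:
        # Lone developers: block email/chat notifications while coding.
        return Recommendation(block_notifications=in_coding_session)
    if cluster == 3:
        # Focused developers: suggest an early start for uninterrupted time.
        return Recommendation(False, "come to work early for uninterrupted time")
    if cluster == 4:
        # Balanced developers: suggest a break against boredom and tiredness.
        return Recommendation(False, "take a break to avoid boredom and tiredness")
    # Other clusters (e.g. social developers) keep their notifications on.
    return Recommendation(False)
```

A real assistant would of course need to infer the cluster first and let developers override any rule that does not fit them.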

 

In the paper (find a pre-print here) you can find a more detailed explanation of the study method and a much more detailed discussion of the clusters.

 

FlowLight: How a Traffic Light Reduces Interruptions at Work (CHI’17)

We are extremely happy to announce our newest project, FlowLight, a traffic-light-like light that reduces knowledge workers’ interruptions at work and makes them more productive! The research project, published under the title “Reducing Interruptions at Work: A Large-Scale Field Study of FlowLight”, was conducted in close collaboration with researchers at ABB. It also received an Honorable Mention award.

Authors: Manuela Züger, Christopher Corley, André N. Meyer, Boyang Li, Thomas Fritz, David Shepherd, Vinay Augustine, Patrick Francis, Nicholas Kraft, Will Snipes

In the media: Our work was also featured on The Telegraph, Wall Street Journal, GeekWire, NBC News, New Atlas, DigitalTrends, Business Standard, The New Yorker, New Scientist, TechXplore, MailOnline/DailyMail, ScienceDaily, The Times (UK), rework.fm (Podcast), TheLadders, News For Everyone, Evening Express, Yahoo News, India Today, PPP Focus, The Statesman, Radio Canada, LiveAtPC, Cantech Letter, Engineering 360, BT, Telengana Today, Le Matin (French), 20min.ch (German), Radio Energy (German), Die Presse (German), PresseText (German), Tages-Anzeiger (German), CnBeta (Chinese), PopMech (Russian), PcNews (Russian), Teknikan Maailma (Finnish), Utusan (Malaysian), Irish Examiner, Knowridge, CKNW Radio, Thrive Global, Tech.Rizlys, Appsforpcdaily.com, EurekAlert, Lancashire Post, MetroNews, user-experience-blog (DE), Corriere della Sierra (Spanish), Breaking News, UBC News, UBC Science, Sydöstran (Swedish), svt nyheter (Swedish), Sveriges Radio (Swedish) and many other blogs.

Reducing interruptions at the workplace

Previous work has emphasized how bad constant interruptions and fragmentation of work are for knowledge workers’ productivity, the quality of their work, and also their motivation. When we observed knowledge workers at work in a previous study, we realized that signals such as wearing headphones or closing the office door were often used to indicate that they did not want to be interrupted right now. However, this manual approach was often considered quite cumbersome, and not everybody was aware of these signals. Also, the long-term impact on teams and their work was unclear. This is why we developed FlowLight, a physical traffic-light-like LED combined with an automatic interruptibility measure based on computer interaction data.
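To give a feeling for what "an automatic interruptibility measure based on computer interaction data" could look like, here is a deliberately simple sketch. It is not FlowLight's actual algorithm, which this post does not describe: the class name, the sliding window and the threshold are all assumptions made for illustration.

```python
from collections import deque


class InterruptibilityLight:
    """Toy status light driven by recent keyboard/mouse activity (illustrative only)."""

    def __init__(self, window_minutes=3, busy_threshold=100.0):
        # Both parameters are hypothetical, not FlowLight's real settings.
        self.recent_counts = deque(maxlen=window_minutes)
        self.busy_threshold = busy_threshold

    def record_minute(self, input_events):
        """Store the number of keyboard/mouse events seen in the last minute."""
        self.recent_counts.append(input_events)

    def color(self):
        """Return 'red' when average recent activity is high, else 'green'."""
        if not self.recent_counts:
            return "green"
        average = sum(self.recent_counts) / len(self.recent_counts)
        return "red" if average >= self.busy_threshold else "green"


light = InterruptibilityLight(window_minutes=2, busy_threshold=100)
light.record_minute(150)
light.record_minute(120)
print(light.color())  # average of 135 events/minute is above the threshold
```

Averaging over a short window keeps the light from flickering every time someone pauses typing for a moment.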

The Research

In a large-scale and long-term field study with 449 participants from 12 different countries, we found, amongst other results, that FlowLight reduced participants’ interruptions by 46% and increased their awareness of the potential disruptiveness of interruptions. Most participants are still using it today!

These and many other insights can be found in detail in our publication at the CHI’17 conference (pre-print). Below is a video showcasing FlowLight:

This is a first step towards making knowledge workers more aware of, and reducing, interruptions at work. In the future, we plan to add extended computer interaction context and biometric sensing to improve FlowLight’s algorithm, to make it even more accurate.

Presentation & Demo at CHI’17

If you are planning to attend the CHI’17 Conference in Denver next week, make sure to come to our presentation and learn much more about FlowLight! The talk will take place on Monday, 9th 2017, from 11:30am to 12:50pm.

You can find out more about (or soon order) FlowLight on this website.

 


 

“The Work Life of Developers: Activities, Switches and Perceived Productivity” accepted at TSE’17

We are happy to announce that our paper “The Work Life of Developers: Activities, Switches and Perceived Productivity” was accepted for publication in the IEEE Transactions on Software Engineering (TSE) journal. You can access a pre-print here.

This work was conducted by André Meyer (UZH), Laura Barton (UBC), Gail Murphy (UBC), Thomas Zimmermann (Microsoft) and Thomas Fritz (UZH).

Make Developers Productive

Many software development companies strive to enhance the productivity of their engineers. All too often, efforts aimed at improving developer productivity are undertaken without knowledge about how developers spend their time at work and how it influences their own perception of productivity and well-being. For example, a software developer’s work day might be influenced by the tasks that are performed, by the infrastructure and tools used, or by the office environment. Many of these factors result in activity and context switches that can cause fragmented work and, thus, often have a negative impact on the developer’s perceived productivity, quality of output and progress on tasks.

To fill this gap, we ran an in-situ study with professional software developers from different companies, investigating developers’ work practices and their relationship to the developers’ perceptions of productivity more holistically, while also examining individual differences. One of the big questions we set out to answer is whether there are observable trends in how developers perceive productivity and how these trends could potentially be used to quantify it.

In-Situ Study to Investigate Productive Work Days

We deployed a monitoring application that logs developers’ interaction with the computer (e.g. programs used, user input) and asked 20 professional software developers to run it for 2-3 work weeks. We further asked participants to self-report, every 90 minutes, their perceived productivity and the tasks and activities they had performed.

Corroborating earlier findings, we found that developers spend their time on a wide variety of activities and switch regularly between them, resulting in highly fragmented work. The findings further emphasize how individual developers’ work days are. For example, while some participants spread their work days out over as many as 21.4 hours (max), most developers keep more compact work hours, on average 8.4 (SD=1.2) hours per day. Of that time, they spend on average 4.3 (SD=0.5) hours on their computer, and surprisingly little of it on development-related activities (e.g. coding, testing, debugging): only about 30% of that time. The rest of the work day is split up into emails (15%), meetings (10%), web browsing (work related: 11%, unrelated: 6%) and other activities.
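Percentages like these come from summing up the logged time per activity and dividing by the total. As a rough illustration, the following sketch does this for a tiny, made-up log; the `time_shares` function, the `(activity, minutes)` log format and the sample numbers are assumptions for this example, not our monitoring tool's actual data model.

```python
from collections import defaultdict


def time_shares(log):
    """Return {activity: percent of total logged time} for (activity, minutes) entries."""
    totals = defaultdict(float)
    for activity, minutes in log:
        totals[activity] += minutes
    total = sum(totals.values())
    if total == 0:
        return {}
    return {activity: 100.0 * minutes / total
            for activity, minutes in totals.items()}


# Hypothetical sample log, not data from the study.
sample_log = [("coding", 30), ("email", 15), ("coding", 30), ("meeting", 25)]
shares = time_shares(sample_log)
for activity, percent in sorted(shares.items()):
    print(f"{activity}: {percent:.0f}%")
```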

A next step was to investigate the fragmentation of work in more detail: apart from meetings, developers remain only between 0.3 and 2.0 minutes in an activity before switching to another one. These very short times per activity and the variety of activities a developer pursues each day illustrate the high fragmentation of a developer’s work. From participants’ self-reported, perceived productivity we found that although there was a lot of variation between individuals, the plots can be categorized into three broad groups: morning people, afternoon people, and those whose perceived productivity dipped at lunch. Morning people often come to work a little bit earlier and get the most important things done before the crowd arrives. Afternoon people usually arrive later and spend most of their mornings with meetings and emails, and get stuff done in the afternoon, thus feeling more productive then. These results suggest that while information workers in general have diverse perceived productivity patterns, individuals do appear to follow their own habitual patterns each day.
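The time spent in an activity before switching can be derived from a time-ordered event log by merging consecutive events of the same activity into one run and measuring each run's length. Here is a small sketch of that computation; the function names, the `(start_seconds, activity)` event format and the sample events are hypothetical and only meant to illustrate the idea.

```python
def activity_runs(events, end_time):
    """Merge consecutive events of the same activity into (activity, seconds) runs.

    events: time-ordered list of (start_seconds, activity) tuples.
    end_time: timestamp (in seconds) at which the last event ends.
    """
    runs = []
    for i, (start, activity) in enumerate(events):
        stop = events[i + 1][0] if i + 1 < len(events) else end_time
        if runs and runs[-1][0] == activity:
            # Same activity as before: extend the current run, no switch happened.
            runs[-1] = (activity, runs[-1][1] + (stop - start))
        else:
            runs.append((activity, stop - start))
    return runs


def mean_minutes_before_switch(runs):
    """Average run length in minutes, i.e. time in an activity before switching."""
    if not runs:
        return 0.0
    return sum(seconds for _, seconds in runs) / len(runs) / 60.0


# Hypothetical sample events, not data from the study.
events = [(0, "coding"), (30, "coding"), (90, "email"), (120, "browsing")]
runs = activity_runs(events, end_time=240)
print(runs)
print(round(mean_minutes_before_switch(runs), 2), "minutes per activity on average")
```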

Can we somehow quantify productivity?

We built explanatory models (stepwise linear regressions) to describe which factors (of the collected data) contribute to the productivity ratings reported by the study participants. We observe that productivity is a personal matter that varies greatly among individuals. There are some tendencies, however: more user input is most often associated with a positive perception of productivity, while emails, planned meetings and work-unrelated websites are most often associated with a negative one.

Previous work predominantly focused on a single or small set of outcome measures, e.g. the lines of code or function points written. While these measures can be used across developers, e.g. for comparisons, they fail to capture the individual differences in factors that impact the way developers work. This suggests that measures or models that attempt to quantify productivity should take individual differences and what is perceived as productive or not into account, and should capture the developer’s work more holistically rather than by a single outcome measure. Such individual models could then be used to provide better and more tailored support to developers, for instance to foster focus and flow at work. For example, we could help developers avoid interruptions at inopportune moments (see our FlowLight), increase their awareness about work and productivity using a retrospective view, or help them schedule a more productive work day that avoids unproductive patterns as much as possible.

Finally, we examined whether we can predict high and low productivity sessions for individual participants based on the collected data, using logistic regression. The results are promising and suggest that even with a relatively small number of productivity self-reports, it is possible to build personalized, predictive productivity models.
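For readers unfamiliar with logistic regression, here is a minimal, dependency-free sketch of how a personalized model for one participant could be trained and used. The two features (a normalized user-input level and the share of a session spent on email), the sample sessions and the training settings are assumptions made for illustration; the paper describes the features and setup we actually used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights and bias with plain batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - learning_rate * g / n for wj, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict_high(w, b, x):
    """True if the session is predicted to be a high-productivity session."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5


# Hypothetical sessions of one participant: [user input level, share of time in email].
sessions = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1],
            [0.2, 0.8], [0.1, 0.7], [0.3, 0.9]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = self-reported high productivity, 0 = low

w, b = fit_logistic(sessions, labels)
print(predict_high(w, b, [0.85, 0.15]))
print(predict_high(w, b, [0.25, 0.85]))
```

Training one such model per participant, rather than one for everybody, is what makes the prediction personalized.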

Contact André Meyer in case you have any questions or suggestions.

“Using (Bio)Metrics to Predict Code Quality” is currently one of the most downloaded articles in software engineering

We are happy to announce that our paper “Using (Bio)Metrics to Predict Code Quality Online”, written by Sebastian Müller and Thomas Fritz, was one of the most downloaded software engineering articles in June and July 2016. With 1709 downloads in 6 weeks, it ranked second among all ACM software engineering articles. According to ACM, this is the first time that any paper was downloaded more than 1000 times.


Image source: ACM SIGSOFT Software Engineering Notes. Volume 41 Number 4.

The paper investigates the use of biometrics, such as heart rate variability (HRV) or electro-dermal activity (EDA) to determine the difficulty that developers experience while working on real world change tasks and automatically identify code quality concerns while a developer is making a change to the code. It can be accessed here.

Journal of Systems and Software: Eye Gaze and Interaction Contexts for Change Tasks – Observations and Potential

The more we know about software developers’ detailed navigation behavior for change tasks, the better we are able to provide effective tool support. In this article, we extend our work on the fine-granular navigation behavior of developers (see blogpost) and explore the potential of the more detailed and fine-granular data by examining the use of the captured change task context to predict perceived task difficulty and to provide better and more fine-grained navigation recommendations.

 

Check out our Journal article!

seal @ ICSE 2016

We are very happy to announce that our research group got two papers and a technical briefing accepted at ICSE 2016 in Austin, Texas.

The first accepted paper, entitled “The Impact of Test Case Summaries on Bug Fixing Performance: An Empirical Investigation”, was written in collaboration with the University of Delft. The authors of the paper are: Sebastiano Panichella, Annibale Panichella, Moritz Beller, Andy Zaidman and Harald Gall.

Abstract: “Automated test generation tools have been widely investigated with the goal of reducing the cost of testing activities. However, generated tests have been shown not to help developers in detecting and finding more bugs even though they reach higher structural coverage compared to manual testing. The main reason is that generated tests are difficult to understand and maintain.

Test Case Summarizer

Our paper proposes an approach which automatically generates test case summaries of the portion of code exercised by each individual test, thereby improving understandability. We argue that this approach can complement the current techniques around automated unit test generation or search-based techniques designed to generate a possibly minimal set of test cases. In evaluating our approach we found that (1) developers find twice as many bugs, and (2) test case summaries significantly improve the comprehensibility of test cases, which is considered particularly useful by developers.”

A preprint of the paper can be found online.

The second paper is entitled “Using (Bio)Metrics to Predict Code Quality Online” and was written by Sebastian Müller and Thomas Fritz. The paper investigates the use of biometrics, such as heart rate variability (HRV) or electro-dermal activity (EDA) to determine the difficulty that developers experience while working on real world change tasks and automatically identify code quality concerns while a developer is making a change to the code.


A preprint of the paper will be available soon.

Additionally, our technical briefing on “Using Docker Containers to Improve Reproducibility in Software Engineering Research”, by Jürgen Cito and Harald Gall, was accepted. In it, we will present opportunities to aid reproducibility to the SE community.

Preprint: “Interruptibility of Software Developers and its Prediction Using Psycho-Physiological Sensors”

We are excited that our paper “Interruptibility of Software Developers and its Prediction Using Psycho-Physiological Sensors” by Manuela Züger and Thomas Fritz was accepted for CHI 2015, and we would like to share a preprint with you.

Interruptions of knowledge workers are common and can cause a high cost if they happen at inopportune moments. Our paper presents a lab and a field study with a total of 20 software developers, where we examined the use of psycho-physiological sensors to measure interruptibility of a knowledge worker in a real-world context.

The results show that a Naïve Bayes classifier can be used to automatically assess states of a knowledge worker’s interruptibility with high accuracy in the lab as well as in the field. This demonstrates the potential of psycho-physiological sensors to avoid expensive interruptions. For instance, such a classifier could be used to automatically turn off notifications while a knowledge worker’s interruptibility is low.
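As a rough illustration of the idea, here is a minimal Gaussian Naïve Bayes sketch that decides whether notifications should be muted. It is not the classifier or feature set from the paper: the two features (heart rate and a normalized electro-dermal activity level), the sample readings, the labels and all function names are assumptions made for this example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        rows_by_class[yi].append(xi)
    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(columns, means)]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(params):
        log_prior, means, variances = params
        total = log_prior
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


def should_mute_notifications(model, x):
    """Mute notifications while the predicted interruptibility is low."""
    return predict(model, x) == "low"


# Hypothetical readings: [heart rate in bpm, normalized EDA level].
readings = [[60, 0.20], [62, 0.30], [58, 0.25],
            [80, 0.80], [85, 0.90], [78, 0.70]]
interruptibility = ["high", "high", "high", "low", "low", "low"]

model = fit_gaussian_nb(readings, interruptibility)
print(should_mute_notifications(model, [82, 0.85]))
print(should_mute_notifications(model, [61, 0.22]))
```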

The preprint of the paper can be downloaded here.