My story as a Gephi user and contributor

Looking back at 12 years of involvement with this great software

Quick background on Gephi

Two-line summary if you don’t know Gephi. Born around 2007, Gephi is the leading software for the visualization and exploration of networks. It is open source and free. It is used by analysts, professors, journalists, students. It is useful for finding patterns and connecting the dots in rich and unstructured datasets, for instance: finding communities, spotting people with key roles in a crowd, identifying connections between items or products, etc.

Gephi has been downloaded millions of times since its creation, and the original paper describing Gephi has been cited by close to 10,000 academic publications from a huge variety of scientific fields.

The Gephi Code retreat is a one-week event organized by the founding members of Gephi to accelerate the development of the software and help the community of Gephi enthusiasts meet and bond. This is the second year such a retreat has been organized, after a first edition in Copenhagen.

Being a historian, discovering Gephi in 2010

Gephi is the reason I transitioned from being a historian of science to being a social scientist using computational (data-intensive) approaches. When I was a post-doc at the KNAW in the Netherlands in 2008, Gephi had just appeared (there was also GUESS, which I used as well).

I realized that to explore and find insights in networks of scientists (that was my interest at the time), scrolling through big tables in Excel was going to be difficult:

An author in column A, the co-author in column B. How do you make sense of 10,000 rows like that?

Relations between authors in column A and their co-authors in column B describe a network. Networks can be analyzed in formal ways with a body of knowledge called “network analysis” or “social network analysis” (SNA). Researchers in SNA have developed methods to compute metrics on networks, or to generate different types of networks, with dedicated software like UCINET or R packages like igraph.
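To make this concrete, here is a minimal sketch in plain Java of how a two-column table of authors and co-authors can be read as a network. It is only an illustration: the class and method names (CoauthorNetwork, fromRows, degree) and the sample names are hypothetical, and this is not code from Gephi, UCINET or igraph.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only: CoauthorNetwork is a hypothetical name,
// not a class from Gephi or from any network analysis library.
class CoauthorNetwork {

    /** Builds an undirected adjacency map from rows of {author, co-author}. */
    static Map<String, Set<String>> fromRows(List<String[]> rows) {
        Map<String, Set<String>> neighbors = new TreeMap<>();
        for (String[] row : rows) {
            String a = row[0].trim();
            String b = row[1].trim();
            // Skip incomplete rows and self-loops.
            if (a.isEmpty() || b.isEmpty() || a.equals(b)) {
                continue;
            }
            // Each row is one relation, stored in both directions.
            neighbors.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
            neighbors.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
        }
        return neighbors;
    }

    /** Degree of an author = number of distinct co-authors. */
    static int degree(Map<String, Set<String>> neighbors, String author) {
        return neighbors.getOrDefault(author, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        // Made-up sample rows: column A, column B.
        List<String[]> rows = List.of(
                new String[] {"Ada", "Ben"},
                new String[] {"Ada", "Chloe"},
                new String[] {"Ben", "Chloe"},
                new String[] {"Dana", "Ada"});
        Map<String, Set<String>> net = fromRows(rows);
        for (Map.Entry<String, Set<String>> e : net.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Once the rows are stored this way, questions such as "who has the most co-authors?" become simple lookups instead of scrolling through a spreadsheet.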

My need was different though: I needed to “make sense” of the network, in a global and generalist sense. Things like:

  • do we see clearly split sub-groups in the network or is it a compact hairball?
  • are two famous scientists “neighbors” in the network or are they far apart?
  • is the oldest research lab at the periphery or at the center of the network?
  • where are European scientists, relative to their American peers?
  • are psychologists scattered in the network or grouped in one region?
  • where are the scientists who tend to publish on addiction positioned? And those working on animal studies?
  • etc…

I needed a way to ask all these questions and get a broad sense of whether they were relevant. Also, exploring the data would hopefully reveal other patterns of interest that I had not thought of. To do this, I did not need an analytical tool to test hypotheses; I needed a way to explore my data.

Being completely new to the field of social network analysis, my first impulse was to explore the network visually. In 2010, I found Gephi, which was already far more advanced than the classic viz solutions such as Pajek or UCINET’s NetDraw.

The founding team of Gephi - thanks to them!

By the way, it is pretty common for historians to have a similar epiphany with the visualization of networks, and Gephi is a very popular tool in the digital humanities.

(See two publications, 1 and 2, from my research project on neuroeconomics, which used Gephi and VOSviewer, another very good network viz tool.)

Gephi’s killer features

The basic workflow in Gephi is super effective at delivering insights. Each step is a simple one-click operation!

  1. get the big picture: apply a layout to immediately get a sense of the size, patterns of relations, core / periphery structure… that characterize the network
  2. detect communities (in one click): quickly see how a crowd divides into sub-groups, based on the pattern of connections between the members of the group
  3. find central nodes: which member of the group has the most connections? Who is placed “in the middle of the network”?
  4. explore attributes: Gephi can handle the attributes attached to the members of the network. Not just their name, but any textual or numerical attribute you have stored on them. Occupation, gender, city, year of birth…
  5. zoom on sub-regions: what if we filter out all the network except for one group of nodes and their relations: can we identify local communities within this subnetwork? Who are the central actors within this region?

The killer feature of Gephi is that these 5 operations have an intuitive, illustrated expression in Gephi:

  1. The layout (with Force Atlas) deploys / unfolds progressively, showing how the connection micro-patterns progressively mould the final, global result
  2. Visualize different communities by painting them in different colors
  3. Visualize the importance of the nodes by resizing them according to their centrality
  4. Visualize attributes by switching on the display of labels attached to the nodes, or by painting the nodes in colors representing categories (“blue nodes for professors, red nodes for post-docs”).
  5. Explore subregions by hiding the parts of the network you want to ignore - just apply a filter in one click
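As an illustration of the third operation, resizing nodes according to their centrality, here is a minimal sketch in plain Java. It assumes the simplest possible centrality (the degree, meaning the number of connections) and a linear mapping onto a size range. The class name SizeByDegree, the method sizes and the size bounds are hypothetical, and this is not how Gephi implements it internally.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: SizeByDegree is a hypothetical name,
// and the linear mapping is an assumption, not Gephi's actual code.
class SizeByDegree {

    /** Linearly maps each node's degree onto the range [minSize, maxSize]. */
    static Map<String, Double> sizes(Map<String, Integer> degrees,
                                     double minSize, double maxSize) {
        Map<String, Double> out = new TreeMap<>();
        if (degrees.isEmpty()) {
            return out;
        }
        int lo = Collections.min(degrees.values());
        int hi = Collections.max(degrees.values());
        for (Map.Entry<String, Integer> e : degrees.entrySet()) {
            // t is 0 for the least connected node and 1 for the most connected one.
            double t = (hi == lo) ? 0.0 : (e.getValue() - lo) / (double) (hi - lo);
            out.put(e.getKey(), minSize + t * (maxSize - minSize));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up degrees for three nodes.
        Map<String, Integer> degrees = Map.of("Ada", 5, "Ben", 3, "Chloe", 1);
        System.out.println(sizes(degrees, 10.0, 50.0));
    }
}
```

The most connected node gets the largest size and the least connected one the smallest, which is what makes central actors stand out at a glance.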

From enthusiastic user to first steps as a contributor

In the course of using Gephi, I started asking a lot of questions on the now defunct Gephi forum. The community of users is now most active in a private Facebook group with close to 10,000 members (join!), where I continue to be active.

Even with the help I received on the forum (Sébastien Heymann did a lot here!), I was frustrated by the absence of basic functions I needed. Like: paint a node in green if it has the text “green” stored in one of its attributes. How could I get the functions I needed?
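To give an idea of how small such a function is at its core, here is a minimal sketch in plain Java of the rule "paint a node green if one of its attributes contains a given text". It deliberately does not use the Gephi plugin API: the class ColorByAttribute, the method colorNodes, the map-based node representation and the hex colors are all hypothetical choices made for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: ColorByAttribute is a hypothetical name and
// nodes are modelled as plain maps, not as Gephi node objects.
class ColorByAttribute {

    static final String GREEN = "#00AA00";
    static final String DEFAULT_COLOR = "#999999";

    /**
     * Returns one color per node id: GREEN when any attribute value of the
     * node contains the keyword (case-insensitive), DEFAULT_COLOR otherwise.
     */
    static Map<String, String> colorNodes(
            Map<String, Map<String, String>> attributesByNode, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        Map<String, String> colors = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> node : attributesByNode.entrySet()) {
            boolean match = node.getValue().values().stream()
                    .anyMatch(v -> v != null && v.toLowerCase(Locale.ROOT).contains(needle));
            colors.put(node.getKey(), match ? GREEN : DEFAULT_COLOR);
        }
        return colors;
    }

    public static void main(String[] args) {
        // Made-up nodes with a single "topic" attribute.
        Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        nodes.put("n1", Map.of("topic", "Green energy"));
        nodes.put("n2", Map.of("topic", "coal"));
        System.out.println(colorNodes(nodes, "green"));
    }
}
```

In a real plugin, the same test would be applied to the attributes of each node of the graph and the resulting color set on the node itself.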

Gephi is a free and open source product, developed by a very small team of volunteers in their free time. That they could create Gephi and take the time to answer all my questions already felt huge. I had heard of “open source software” but the scale and impact of Gephi made me realize that nothing justified my staying on the sidelines and complaining - if I wanted something, I could at least make an effort and try to contribute.

The first thing I did was to help fellow users on the forum. I was an intensive user of Gephi, so there were plenty of cases when people were stuck on issues that could be solved by simply explaining where to click in Gephi and that sort of thing. It was relatively low effort but it already felt good to give back and also to receive thank-you messages from the users I helped (it should be noted that the Gephi community is really super nice btw - I can’t remember that we ever encountered a troll on the forum or the FB group!)

Starting to develop Gephi plugins

Contributing new functions to Gephi was a step beyond, and it felt really out of reach. Almost since its inception, Gephi has come with a system of plugins that anyone can develop. But as I am not a computer scientist, this sounded too hard. I had some experience in VBA (the programming language for Excel) as a real amateur, but Gephi is developed in Java and that felt like another league. Java even requires installing dedicated software to write code with it. I had tried to download and install one such program (it is called NetBeans, there are others too). Being a real newbie I felt utterly lost: what are the menus and windows in this software even supposed to do? I had no clue.

But as I kept using Gephi for my research projects, the question of developing the features I needed was really nagging. If I just need, you know, a plugin to change the colors of my nodes - surely it can’t be that hard to create?

After some inquiries in 2010, I finally jumped into this strange world of Java in 2011, buying a book for complete beginners and learning how the thing works. By the way, apart from the book, everything is free: the software to code (NetBeans), and the Java language itself. Not costly to try!

I got hooked when I realized Java was a zillion times faster than Excel. With Excel, I was literally waiting for half an hour to apply operations on my rows, only for Excel to freeze and crash in the end. With Java that became a 10-second operation!

I used Java first to create custom gexf files (custom-made networks!!) and finally some plugins for Gephi.
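For readers who have never seen one, a gexf file is an XML description of nodes and edges that Gephi can open. Here is a minimal sketch in plain Java of how such a file can be generated from a list of relations. It is not my original code, nor the gexf library mentioned below: the class MinimalGexf and its methods are hypothetical names, and the XML header follows the GEXF 1.2 draft format as I understand it, so check it against the GEXF documentation before relying on it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: MinimalGexf is a hypothetical name,
// not part of Gephi or of any existing GEXF library.
class MinimalGexf {

    /** Escapes the characters that cannot appear as-is in an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Turns rows of {source label, target label} into a small undirected GEXF document. */
    static String toGexf(List<String[]> edges) {
        // Give each distinct label a numeric id, in order of first appearance.
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (String[] e : edges) {
            ids.putIfAbsent(e[0], ids.size());
            ids.putIfAbsent(e[1], ids.size());
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        // Namespace and version are assumed from the GEXF 1.2 draft.
        xml.append("<gexf xmlns=\"http://www.gexf.net/1.2draft\" version=\"1.2\">\n");
        xml.append("  <graph mode=\"static\" defaultedgetype=\"undirected\">\n");
        xml.append("    <nodes>\n");
        for (Map.Entry<String, Integer> n : ids.entrySet()) {
            xml.append("      <node id=\"").append(n.getValue())
               .append("\" label=\"").append(escape(n.getKey())).append("\"/>\n");
        }
        xml.append("    </nodes>\n");
        xml.append("    <edges>\n");
        int edgeId = 0;
        for (String[] e : edges) {
            xml.append("      <edge id=\"").append(edgeId++)
               .append("\" source=\"").append(ids.get(e[0]))
               .append("\" target=\"").append(ids.get(e[1])).append("\"/>\n");
        }
        xml.append("    </edges>\n");
        xml.append("  </graph>\n");
        xml.append("</gexf>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Made-up sample relations.
        List<String[]> rows = List.of(
                new String[] {"Ada", "Ben"},
                new String[] {"Ben", "Chloe"});
        System.out.print(toGexf(rows));
    }
}
```

Saving the printed text in a file with the .gexf extension should give a network that can be opened in Gephi.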

My coding style was horrendous. I remember showing my code to Francesco Ficarola (I believe), the developer of a Java library that helps create gexf files. I think he said that his eyes burnt. But with time, things improve!

Another big help in learning how to program was Stack Overflow. This is THE Q&A website for programming issues, and browsing back to the first question I asked on it, it really shows that I was transitioning from Excel to Java!

Using Gephi programmatically - the additional benefits

Knowing how to program came with exponential benefits, beyond Gephi, and right away. Java is a versatile language which enabled me to easily analyze text, create web apps, desktop apps, publish courses in fancy formats, grade large groups of students without going mad, teach a course on how to create mobile apps, create silly personal projects, and more.

I wrote tutorials for Gephi, and I have continued using Gephi in most of my research projects to this day (on collusion in European markets, dynamic inter-banking financial networks, and new ways to identify communities on Twitter).

Expanding beyond Gephi and coming back to it

Moving away from Gephi…

In the last few years, I did not prioritize the development or maintenance of the Gephi plugins I had created in 2012-2016, for several reasons. For one thing, my research and admin activities as a professor led me in other directions. Also, the plugins I have developed are pretty complex and their code is of poor quality, which makes them hard to maintain.

In 2020, I found the time to think about how to (re-)deploy my efforts in the area of coding and development, looking at how to achieve a form of long-term impact with my research activities when they are translated and packaged in software form. Create more Gephi plugins? Gather all the functions I have created under the umbrella of a desktop app? Or a web app? I just knew that in the spirit of my Gephi plugins, the interface would be point-and-click, meant for technical users who don’t code.

Things were not linear. I had actually started to create a desktop app called “Nocode app” that would include all the functions that a Gephi user would find interesting, and more: functions for text analysis, for instance. Developing it with JavaFX, I soon realized that:

  1. Yes, JavaFX is a great Java framework for desktop apps, it is super enjoyable to develop with it
  2. but launching, testing and maintaining a desktop app is just daunting
  3. desktop apps are not popular at all compared to web apps

(and yes, Gephi is an exception to this)

So in late 2020 / early 2021, I switched gears and started developing the same functions, but as a web app called Nocode Functions.

To develop this web app, I stick to Java and use the JSF framework (read this blog post I published this summer about this framework!).

And… after one year, it is a success. I personally enjoy developing it a lot; it is easy to deploy, test and maintain, very robust, and scalable to any number of functions I’d need to add in the future.

… and back to Gephi

In late spring / early summer 2022, Mathieu Jacomy made a call for Java devs to participate in the 2022 edition of the Gephi Code retreat. To be honest I still don’t consider myself a “full-fledged” Java dev so I did not answer the call, and was super happy when I was invited to join! 🤗

This code retreat happened in late August, so a week ago. I published a short blog post in the middle of it and I’ll publish a longer blog post about its conclusion. But from the perspective of my journey as a Gephi user / contributor detailed above, the week constitutes an important step:

  • It was the first time in ten years that I spent several days in contact with the Gephi core contributors. Mathieu Bastian in particular has a lot of knowledge of the Gephi code base, which helped me get unstuck / speed up / expand the horizons of the features I could develop. This did not just make my experience as a developer of Gephi plugins more comfortable: it actually helped me overcome difficulties that had kept me away from plugins. I can now come back to developing plugins! And I look forward to participating in future Gephi events of the same nature.
  • I made progress in the way I think of how my contributions are split between Gephi plugins and Nocode Functions. I believe that I will develop Gephi plugins that are specific to Gephi, typical of a desktop app environment and integrated into the Gephi interface by nature. And I will leave other types of functions, which are less “Gephi specific”, to Nocode Functions.

I hope, and I think, that I have found a balance between the multiple components of my activities as a researcher and developer. Last week during the Gephi Code retreat, it certainly felt like it 😊

This is a super long post in the end. I am not sure this is of interest to a general readership, maybe more of a biographical narrative for my grand-children 😀. But if you are a user of Gephi and have read this far, I hope you are now tempted to contribute even more to the community, including with some programming 🚀.

Some of my contributions to Gephi / get in touch

The tutorials for Gephi I have written. Nocode Functions 🔎, the web app that can also serve the needs of the community of Gephi users. It is fully open source. Try it and give some feedback, I would appreciate it!

 Date: September 8, 2022
