Null model

Milena Tsvetkova
Ruth García-Gavilanes
Taha Yasseri

A number of sophisticated statistical methods for analyzing network data already exist, but they are not adapted to large, growing networks observed over a long continuous period of time. Exponential random-graph models (ERGMs) cannot account for the timing of interactions because they express the structural properties of aggregate networks or networks observed at a single moment in time [42]. Extensions such as temporal ERGMs [57] and stochastic actor-based models [58] account for network dynamics but are nevertheless restricted to relatively small networks with a fixed set of nodes and a few “snapshots” over time.

To establish the statistical significance of our observations, we use a null model created by randomizing the underlying network. The randomization needs to preserve the daily pattern and the community structure of the network while removing any systematic clustering in the timing of events in which an individual is involved, as we consider such clustering evidence of a social process. Hence, we do not randomize the network structure but only the timing of events. There are several possible ways to do this [18,59]. First, one can shuffle the entire time sequence of events, but this destroys individual activity patterns and increases the time variation in each individual’s activities, as many editors joined Wikipedia for a limited period only and the shuffle spans almost 11 years. Individual activity patterns can be preserved if the randomization instead repeatedly samples two random nodes with an equal number of events and swaps their event time sequences. The same could be achieved by shuffling within shorter periods of time, say 24 or 168 hours.
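
As an illustration of the first and last options above, the following minimal Python sketch permutes timestamps either across the whole record or only within consecutive fixed-length periods. It is not the authors' code: the event representation `(source, target, time_in_hours)`, the function name `shuffle_timestamps`, and the choice of consecutive non-overlapping bins are assumptions made for the example.

```python
import random
from collections import defaultdict

def shuffle_timestamps(events, bin_hours=None, seed=None):
    """Permute timestamps while keeping every (source, target) pair in place.

    events    : list of (source, target, time_in_hours) tuples (assumed format)
    bin_hours : None shuffles over the entire record; a number (e.g. 24 or 168)
                shuffles only among events that fall in the same fixed bin
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, (_src, _tgt, t) in enumerate(events):
        key = 0 if bin_hours is None else int(t // bin_hours)
        groups[key].append(idx)

    new_times = [t for _src, _tgt, t in events]
    for idxs in groups.values():
        pool = [new_times[i] for i in idxs]
        rng.shuffle(pool)                      # random permutation within the group
        for i, t in zip(idxs, pool):
            new_times[i] = t
    return [(s, d, t) for (s, d, _), t in zip(events, new_times)]
```

With `bin_hours=None` an event may receive a timestamp from years away, which is the distortion described above; with `bin_hours=24` every event stays within its original day.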

In addition to swapping and shuffling within a node, one can swap and shuffle within a dyad, only within a node’s in-links, or only within a node’s out-links. These methods preserve the activity patterns characteristic of dyadic exchange, individual “visibility”, and individual activity, respectively. However, they thereby also restrict the baseline to these narrow scopes. For example, swapping within dyads would compare the occurrence of the AB-AB motif to that of the AB-BA motif but would not answer whether either of them is more common than chance.
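
The three restricted variants differ only in how events are grouped before their timestamps are permuted. The sketch below makes that explicit; the function name `shuffle_within_scope`, the `scope` labels, and the `(source, target, time)` tuple format are hypothetical choices for illustration, not part of the original method description.

```python
import random
from collections import defaultdict

def shuffle_within_scope(events, scope="dyad", seed=None):
    """Permute timestamps only among events that share the same grouping key.

    scope = "dyad": same unordered pair {source, target}
    scope = "in"  : same target (a node's in-links)
    scope = "out" : same source (a node's out-links)
    """
    key_of = {
        "dyad": lambda s, d: frozenset((s, d)),
        "in": lambda s, d: d,
        "out": lambda s, d: s,
    }[scope]

    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, (s, d, _t) in enumerate(events):
        groups[key_of(s, d)].append(idx)

    new_times = [t for _s, _d, t in events]
    for idxs in groups.values():
        pool = [new_times[i] for i in idxs]
        rng.shuffle(pool)
        for i, t in zip(idxs, pool):
            new_times[i] = t
    return [(s, d, t) for (s, d, _), t in zip(events, new_times)]
```

Under `scope="dyad"`, the set of times at which A and B interact never changes; only the direction attached to each time can change. This is why such a baseline can compare AB-AB against AB-BA but cannot say whether either is more common than chance.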

To account for these potential caveats, we choose to use as a baseline a shuffle of the timestamps of events within individuals and within a window of 24 hours. For each link l, we look at the source i and collect other links in which i participated (as either source or target) up to 24 hours before or after l. We then swap l’s time with the time of one of these links selected at random; if the set of candidate links is empty, no swap occurs. We shuffle within individuals to preserve individual activity patterns and to identify each social interaction against this expected sequence of events. We execute the shuffle within a limited time window to preserve the duration of activity per individual. The time window should be at least 24 hours to allow for interactions between editors from different time zones and with different daily routines. We found that time windows between 24 and 240 hours produce similar results. Hence, we chose 24 hours, as this is the time window we use to define the social interactions.
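
The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: events are assumed to be `(source, target, time_in_hours)` tuples, links are visited in list order, and the 24-hour window is evaluated against the current (possibly already swapped) timestamps, a detail the text does not specify. The name `shuffle_within_window` is hypothetical.

```python
import random

def shuffle_within_window(events, window=24.0, seed=None):
    """Swap each link's time with that of a random nearby link of its source.

    events : list of (source, target, time_in_hours) tuples (assumed format)
    window : maximum time distance, in hours, for a candidate link
    """
    rng = random.Random(seed)
    times = [t for _src, _tgt, t in events]

    for idx, (src, _tgt, _t) in enumerate(events):
        t_l = times[idx]
        # Other links in which the source took part, as source or target,
        # up to `window` hours before or after link l.
        candidates = [
            j for j, (s, d, _) in enumerate(events)
            if j != idx and src in (s, d) and abs(times[j] - t_l) <= window
        ]
        if candidates:                         # empty set: no swap occurs
            j = rng.choice(candidates)
            times[idx], times[j] = times[j], times[idx]

    # Sources and targets are untouched; only timestamps have moved.
    return [(s, d, t) for (s, d, _), t in zip(events, times)]
```

The candidate search here scans all events for every link, which is quadratic and only suitable for small examples; a dataset spanning many years would need per-editor, time-sorted indexes.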

In short, our shuffling method does not change the structure of the network (who reverts whom), which implies that it preserves the community structure centered around articles and topics. Neither does the method change the overall sequence of timestamps, thus preserving any natural burstiness of activity due to editors being in the same time zone or due to the occurrence of news-worthy events, for example. In contrast, the method deliberately shuffles the sequence of reverts that an editor is involved in, thus removing any individual behavioral patterns.
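
The two invariants stated here can be checked mechanically on any shuffled copy of the data. The helper below is a hypothetical sketch, assuming events are `(source, target, time)` tuples listed in the same order before and after shuffling.

```python
from collections import Counter

def null_model_invariants(original, shuffled):
    """Report whether a shuffle kept the network structure and the timestamps."""
    # Who reverts whom: every link keeps its source and target.
    structure_preserved = (
        [(s, d) for s, d, _t in original] == [(s, d) for s, d, _t in shuffled]
    )
    # Overall sequence of timestamps: the same times occur, each equally often.
    timestamps_preserved = (
        Counter(t for _s, _d, t in original) == Counter(t for _s, _d, t in shuffled)
    )
    return {
        "structure_preserved": structure_preserved,
        "timestamps_preserved": timestamps_preserved,
    }
```

If either value is false, the randomization has altered more than the order of events within individuals and no longer matches the baseline described above.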
