The assembly of the career data set
This protocol is extracted from research article:
Cross-disciplinary evolution of the genomics revolution
Sci Adv, Aug 15, 2018; DOI: 10.1126/sciadv.aat4211

We selected 155 biology and computing departments in the United States following the 2014 U.S. News & World Report rankings (table S1). We confirmed that all the departments in the set have had active PhD programs since the conception of the Human Genome Project (HGP) in the 1980s. Moreover, the rankings of academic departments are relatively stable over time, as supported by theoretical and empirical evidence drawn from various other social systems in which positive feedback reinforcement mechanisms temper large rank fluctuations (58). Consistent with this, we found no significant differences in the ranks of these 155 departments between the 2014 and 2018 U.S. News & World Report rankings (P > 0.05, Wilcoxon test).
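The rank-stability check can be illustrated with a minimal sketch. The text says only "Wilcoxon test"; the sketch assumes the paired (signed-rank) variant, because the same departments are compared across two ranking years, and it uses the large-sample normal approximation rather than an exact test. The function name and the example ranks are hypothetical and are not taken from table S1.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, p_value). Zero differences are dropped, tied absolute
    differences get average ranks, and the variance is tie-corrected.
    Assumption: this is a sketch, not the authors' exact procedure.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0  # identical samples: no evidence of a shift

    # Rank |differences| from smallest to largest, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, p_value


if __name__ == "__main__":
    # Hypothetical ranks of the same departments in two ranking years.
    ranks_2014 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ranks_2018 = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]
    w, p = wilcoxon_signed_rank(ranks_2014, ranks_2018)
    print(f"W+ = {w}, P = {p:.3f}")
```

The normal approximation is reasonable for a sample of 155 departments but is crude for a toy list like the one above; a statistics library with an exact small-sample test would be preferable there.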

We accessed the home pages of these departments and recorded the faculty listed as of spring 2017. In this master list, we identified the faculty who had Google Scholar (GS) pages and built a database with their GS IDs, h-indices, departments, department rankings, and bibliometric data. We also indexed their NSF and NIH grant data from the corresponding repositories (59). We then applied a name disambiguation algorithm to these faculty and their coauthors to reconcile their identities within and across the faculty GS profiles (appendix S1). Figure 1 provides a visual example of how we constructed the biology-computing college network from the disambiguated faculty data.
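As a rough illustration of the network construction step, the sketch below builds a weighted coauthorship network from publication records whose author identities are assumed to be already disambiguated. It is not the procedure of appendix S1 or Figure 1, which are not reproduced here; the function names, the record layout, and the ID strings are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_network(papers, discipline):
    """Count co-authored papers for every pair of listed faculty.

    papers: iterable of author-ID lists, one list per publication
            (assumed already disambiguated).
    discipline: dict mapping faculty ID -> "biology" or "computing".
    Returns a dict mapping a sorted (id_a, id_b) pair to a paper count.
    Authors who are not in `discipline` are ignored in this sketch.
    """
    edges = defaultdict(int)
    for authors in papers:
        # Keep only listed faculty and remove duplicate IDs within a paper.
        faculty = sorted({a for a in authors if a in discipline})
        for a, b in combinations(faculty, 2):
            edges[(a, b)] += 1
    return dict(edges)


def cross_disciplinary_edges(edges, discipline):
    """Keep only links that join a biology and a computing faculty member."""
    return {
        pair: weight
        for pair, weight in edges.items()
        if discipline[pair[0]] != discipline[pair[1]]
    }


if __name__ == "__main__":
    # Hypothetical faculty IDs and publication records.
    discipline = {"gs_A": "biology", "gs_B": "computing", "gs_C": "biology"}
    papers = [
        ["gs_A", "gs_B", "ext_1"],
        ["gs_A", "gs_B"],
        ["gs_A", "gs_C"],
        ["gs_C", "ext_2"],
    ]
    network = build_coauthor_network(papers, discipline)
    print(network)
    print(cross_disciplinary_edges(network, discipline))
```

Storing each link under a sorted ID pair keeps the network undirected, so a collaboration is counted once regardless of author order on the paper.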

The key motivation behind our data collection methodology for the career data set is that computing researchers typically publish the bulk of their work in refereed conferences, from which they receive most of their citations. Traditional bibliometric databases, such as Scopus and Web of Science (WoS), do not cover citations from many refereed conference publications, whereas GS does, making it the only viable alternative for fair career assessment.

Although the career data set covers a substantial portion of the biology-computing college in the United States, it does not cover all of it, and it does not explicitly cover the international biology-computing college. This limitation is tempered by two factors. First, it is important to clarify that, in our analysis, we are not seeking to measure the impact of the HGP on research outcomes, but rather the impact of cross-disciplinarity on research outcomes. Because the HGP had explicit cross-disciplinary alignment, we expect it to have had its strongest and most direct impact on the adoption of a cross-disciplinary research orientation in the United States. Second, the construction of the mediated association network considerably expands the reach of the career data set, as it includes not only the faculty members in these 155 departments but also all their collaborators, forming a much larger ecosystem. The representational power and validity of this ecosystem find confirmatory evidence in two results from our analysis: (i) The evolution of the rate of cross-disciplinary collaborations in the U.S. biology-computing college mirrors the rate of cross-disciplinary collaborations gleaned via author affiliations in the human genomics literature at large. (ii) The entrance of faculty into the U.S. biology college crests in the early 2000s, which is consistent with the doubling of NIH research funding in the period 1998–2003 (8, 60).
