We prepared samples for high-coverage sequencing (aiming for 20 to 30×) and for low-coverage sequencing (aiming for at least 2× coverage based on a genome size of 830 Mb) on the Illumina HiSeq 2500 platform. The low-coverage target would amount to at least 2.5× coverage if the genome is 650 Mb, as has been suggested (49). The average sequence coverages obtained were in reasonable agreement with the targets (table S5). Only our Polar cod specimen had low coverage (2.8×), whereas the other individuals had 12 to 45× coverage (table S5).
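The relation between the two genome-size assumptions is simple arithmetic: coverage is the total number of sequenced bases divided by genome size, so a fixed sequencing yield gives proportionally higher coverage for a smaller genome. The sketch below illustrates this; the function names are illustrative and not part of the study's pipeline, and the read-pair estimate assumes the 2 × 250 paired-end reads used for the low-coverage runs.

```python
import math


def rescale_coverage(cov: float, assumed_genome_bp: float, alt_genome_bp: float) -> float:
    """Coverage implied by the same sequencing yield under a different genome size."""
    return cov * assumed_genome_bp / alt_genome_bp


def read_pairs_needed(target_cov: float, genome_bp: float, read_len: int) -> int:
    """Number of paired-end read pairs needed to reach a target coverage."""
    bases_needed = target_cov * genome_bp
    return math.ceil(bases_needed / (2 * read_len))


# 2x coverage planned for an 830 Mb genome, re-expressed for a 650 Mb genome
print(f"{rescale_coverage(2.0, 830e6, 650e6):.2f}")  # 2.55

# Read pairs of 2 x 250 bases needed for 2x coverage of an 830 Mb genome
print(read_pairs_needed(2.0, 830e6, 250))  # 3320000
```

This ignores reads lost to quality filtering, duplicates, and unmapped sequence, so it is a lower bound on the sequencing effort actually required.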

Our tissue collection consists primarily of gill tissue but also includes fin clips, muscle tissue, and DNA isolated from blood (50). Tissues are stored in 96% ethanol. We isolated genomic DNA from 10 individuals selected for high-coverage sequencing using the NucleoSpin Tissue Kit (reference 740952.50, Macherey-Nagel), following the manufacturer’s protocol. We isolated genomic DNA from individuals selected for low-coverage sequencing using the E.Z.N.A. Tissue DNA Kit (Omega Bio-tek), following the manufacturer’s protocol.

We quantified the genomic DNA and assessed its purity from the 260/280 and 260/230 absorbance ratios using a NanoDrop 2000 UV-Vis Spectrophotometer (Thermo Fisher Scientific). We also quantified genomic DNA using fluorescent detection with the Qubit dsDNA HS Assay Kit (Life Technologies).

Libraries for the high-coverage sequencing were made at the Bauer Core Facility at Harvard University. The facility used the Covaris S220 (Covaris) to shear the genomic DNA to a target size of 550 base pairs. The Apollo 324 (WaferGen Biosystems) system was used to generate libraries for DNA sequencing. Each library had a single index. The size distribution of the libraries was determined with the Agilent Bioanalyzer (Agilent Technologies). Library sample concentration was determined with quantitative polymerase chain reaction (qPCR) according to an Illumina protocol.

We prepared libraries for low-coverage sequencing using the Nextera DNA Library Preparation Kit (FC-121-1031, Illumina). We used the Nextera Index Kit (FC-121-1031) with dual indices. Index N517 was used instead of N501. We followed the manufacturer’s Nextera protocol. We cleaned the tagmented DNA with the Zymo Purification Kit (ZR-96 DNA Clean & Concentrator-5, Zymo Research). We cleaned and size-selected the PCR products with AMPure XP beads (reference A63881, Beckman Coulter Genomics). We used the modification recommended for PCR clean-up for 2 × 250 runs on the MiSeq, using 25 μl of AMPure XP beads (instead of 30 μl) for each well of the NAP2 plate. We quantified the individual libraries with fluorescent detection using the Quant-iT PicoGreen dsDNA Assay Kit (Quantit kit) on a SpectraMax i3x Multi-Mode Detection Platform (Molecular Devices). We determined the size distribution of 12 randomly chosen libraries using the Agilent Bioanalyzer. We then normalized the multiplexed DNA libraries to concentrations of 2 nM and pooled the libraries. The size distribution of the pooled libraries was determined by the Bauer Core Facility using the Agilent Bioanalyzer. Pooled library sample concentration was determined with qPCR. Pooled libraries were sequenced on the HiSeq 2500 in rapid run mode (paired-end, 2 × 250 cycles) at the Bauer Core Facility at Harvard University.
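Normalizing libraries to 2 nM requires converting a mass concentration from the fluorescence assay into a molar concentration using the mean fragment size from the Bioanalyzer. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names and the example values (3.3 ng/μl, 500 bp, 20 μl final volume) are hypothetical and are not measurements from this study.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/ul to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_volumes(stock_nM: float, target_nM: float, final_volume_ul: float):
    """Return (library volume, diluent volume) in ul, from C1*V1 = C2*V2."""
    if stock_nM < target_nM:
        raise ValueError("library is already below the target concentration")
    library_ul = target_nM * final_volume_ul / stock_nM
    return library_ul, final_volume_ul - library_ul


# Hypothetical library: 3.3 ng/ul with a mean fragment size of 500 bp
stock = ng_per_ul_to_nM(3.3, 500)
library_ul, diluent_ul = dilution_volumes(stock, target_nM=2.0, final_volume_ul=20.0)
print(f"{stock:.1f} nM; mix {library_ul:.1f} ul library with {diluent_ul:.1f} ul diluent")
```

Libraries that fall below the target concentration cannot be normalized by dilution, which is why the helper raises an error rather than returning a negative diluent volume.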

The Bauer Core Facility, through the Department of Informatics and Scientific Applications, returned the base-called data as demultiplexed fastq files. Individuals were sequenced on two lanes of the HiSeq 2500.
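A rough check of raw sequencing yield per individual can be made directly from the demultiplexed fastq files by summing read lengths across lanes and dividing by the assumed genome size. The sketch below is one way to do this and is not necessarily how the coverages in table S5 were computed; it assumes plain four-line fastq records, and the file names in the usage comment are hypothetical.

```python
import gzip
from typing import Iterable, Sequence


def count_bases(lines: Iterable[str]) -> int:
    """Sum sequence lengths in four-line fastq records (sequence is line 2 of 4)."""
    total = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:
            total += len(line.strip())
    return total


def raw_coverage(fastq_paths: Sequence[str], genome_size_bp: float) -> float:
    """Raw (pre-mapping) coverage from one individual's fastq files across lanes."""
    total = 0
    for path in fastq_paths:
        # Read gzip-compressed files transparently, plain text otherwise
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt") as handle:
            total += count_bases(handle)
    return total / genome_size_bp


# Hypothetical usage for one individual sequenced on two lanes:
# raw_coverage(["ind01_L1_R1.fastq.gz", "ind01_L1_R2.fastq.gz",
#               "ind01_L2_R1.fastq.gz", "ind01_L2_R2.fastq.gz"], 830e6)
```

Because this counts every base before trimming, filtering, and mapping, it gives an upper bound on the coverage obtained after alignment.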

