We recently submitted for publication the first draft genome of Cataglyphis niger, which Tal assembled from high-coverage (>100X) Illumina sequencing of a single haploid male. This draft genome was rather complete (>90% BUSCO) but quite fragmented (N50=18Kb). This is because we had only short-insert paired-end libraries: the service provider failed to deliver the long-insert mate-pair libraries we ordered. We also tried 10X sequencing, but the resulting assembly was not much better than Tal's (even they can't explain why!).

So now we have done PacBio sequencing through the GAGA consortium. The results were not optimal: 7Kb average read length and 2 million reads (50X coverage). We're looking into the possible causes (extraction?), and we will probably try again. But even with these data we got a dramatic improvement in N50, from 18Kb to 160Kb. This assembly wasn't easy to achieve; in the end we got there by combining all the different types of sequencing data in a hybrid assembly with SPAdes.
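
For readers less familiar with the N50 statistic quoted above, here is a minimal Python sketch of how it is usually computed: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly. The function name `n50` and the toy contig lengths are made up for illustration; this is not the script we used, and assemblers and QC tools report the value for you. The last lines simply multiply out the read length and read count given above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in kb), not real data.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # 6: 10 + 6 = 16 kb covers at least half of the 30 kb total

# Back-of-the-envelope check of the PacBio yield quoted above:
# ~7 kb average read length x ~2 million reads.
total_bases = 7_000 * 2_000_000
print(total_bases)  # 14000000000, i.e. about 14 Gb of raw sequence
```

Because N50 depends only on the distribution of contig lengths, it says nothing about whether the contigs are correct, which is why we quote it alongside BUSCO completeness.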