The first column contains the labels and the sixth column contains the tokens. You can ignore all the other columns. The current version of the data has the full set of annotations. When you train a model with fewer labels, you need to replace the unused labels with O.

For the model trained with 6 labels, the labels to use are: PERSON, ORGANIZATION, GPE, EVENT, NORP, TIME. (All other labels in the training data should be replaced with O.)

For the model trained with 12 labels, the labels to use are: PERSON, ORGANIZATION, GPE, EVENT, NORP, TIME, FACILITY, LAW, PERCENT, QUANTITY, WORK_OF_ART, PRODUCT. (All other labels in the training data should be replaced with O.)
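
The label replacement described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the column separator (tab here) and the possibility of B-/I- prefixes on labels are guesses about the file format, and the names LABELS_6, LABELS_12, reduce_label and reduce_line are hypothetical helpers, not part of the dataset.

```python
# Label sets taken from the description above.
LABELS_6 = {"PERSON", "ORGANIZATION", "GPE", "EVENT", "NORP", "TIME"}
LABELS_12 = LABELS_6 | {
    "FACILITY", "LAW", "PERCENT", "QUANTITY", "WORK_OF_ART", "PRODUCT",
}


def reduce_label(label, keep):
    """Return the label unchanged if its entity type is in `keep`, else "O"."""
    # Assumption: labels may carry a B-/I- prefix; plain labels also work.
    base = label[2:] if label[:2] in ("B-", "I-") else label
    return label if base in keep else "O"


def reduce_line(line, keep, sep="\t"):
    """Rewrite the first column (the label) of one data line.

    Assumption: columns are separated by `sep` (tab by default).
    Blank lines, such as sentence boundaries, are returned empty.
    """
    line = line.rstrip("\n")
    if not line.strip():
        return line
    cols = line.split(sep)
    cols[0] = reduce_label(cols[0], keep)
    return sep.join(cols)
```

To produce the 6-label training file, apply reduce_line(line, LABELS_6) to every line of the training data and write the results out; use LABELS_12 for the 12-label model. If the file uses a different separator, pass it as sep.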