NES Plankton Classifier 2022 v2.4

Inception V3 model for automated classification of plankton and other particles imaged by the Imaging FlowCytobot (IFCB) on the Northeast U.S. Shelf (NES). Classifies 155 categories including phytoplankton, microzooplankton, detritus, and imaging artifacts.

This model is intended for automated taxonomic classification of IFCB imagery collected on the Northeast U.S. Shelf. It is suitable for operational use in plankton monitoring pipelines. Performance may degrade on IFCB data from other geographic regions or instruments with significantly different optical configurations.

Model Files

| File | Description |
|---|---|
| 20220209_Jan2022_NES_2.4.onnx | IFCB imagery ONNX model for inference (~85 MB) |
| 20220209_Jan2022_NES_2.4.cpu.onnx | Non-GPU-optimized model (historical) |
| labels.json | Integer index → class name mapping |
| config.json | Model architecture summary |
| preprocessor_config.json | Image preprocessing parameters |

The main ("cuda optimized") and "cpu" model versions return equivalent results. Both models run on either CPU or GPU.

The main version folds each BatchNormalization node into the preceding Conv layer, which improves GPU performance. The "cpu" model retains the explicit BatchNormalization nodes and is included for historical reasons. Output results are equivalent, barring minor floating-point score differences.

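| Property | "cpu" model | Main ("cuda optimized") model |
|---|---|---|
| IR version | 6 | 7 |
| Opset | 11 | 12 |
| Total nodes | 331 | 237 |
| BatchNormalization nodes | 94 | 0 |
| Conv nodes with bias | 0 | 94 |
| Initializers | 472 | 190 |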
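
The folding that separates the two files rests on a standard identity: at inference time, BatchNormalization is a per-channel affine transform, so it can be absorbed into the weights and bias of the convolution before it. The NumPy sketch below checks that identity on a small 1×1 convolution. It is an illustration only, not the script used to produce the model files; the helper names (`fold_batchnorm`, `conv1x1`, `batchnorm`), the tensor sizes, and `eps=1e-5` are assumptions.

```python
import numpy as np


def fold_batchnorm(weight, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm statistics into a bias-free conv; returns (weight, bias)."""
    scale = gamma / np.sqrt(var + eps)  # one factor per output channel
    folded_weight = weight * scale[:, None, None, None]
    folded_bias = beta - mean * scale
    return folded_weight, folded_bias


def conv1x1(x, weight, bias=None):
    """1x1 convolution on an (N, C, H, W) tensor, enough to check the identity."""
    y = np.einsum("oc,nchw->nohw", weight[:, :, 0, 0], x)
    if bias is not None:
        y = y + bias[None, :, None, None]
    return y


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNormalization over the channel axis."""
    s = (1, -1, 1, 1)
    normalized = (y - mean.reshape(s)) / np.sqrt(var.reshape(s) + eps)
    return normalized * gamma.reshape(s) + beta.reshape(s)


# Random stand-in tensors (hypothetical sizes, not taken from the model).
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 5, 5))
weight = rng.normal(size=(3, 4, 1, 1))
gamma, beta = rng.normal(size=3), rng.normal(size=3)
mean, var = rng.normal(size=3), rng.uniform(0.5, 2.0, size=3)

# Conv followed by BatchNorm (the "cpu" layout) ...
unfolded = batchnorm(conv1x1(x, weight), gamma, beta, mean, var)

# ... versus a single Conv with bias (the folded layout).
folded_w, folded_b = fold_batchnorm(weight, gamma, beta, mean, var)
folded = conv1x1(x, folded_w, folded_b)

max_abs_diff = float(np.abs(unfolded - folded).max())
print(f"max abs difference: {max_abs_diff:.2e}")
```

The two paths agree up to floating-point rounding, which is consistent with the node counts above: 94 BatchNormalization nodes disappear and 94 Conv nodes gain a bias.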

How to Use

Install Requirements

To run inference on IFCB bins, the ifcb-inference package is recommended. See its repository (https://github.com/WHOIGit/ifcb-inference) for details on installation and inference-runtime options.

If your machine is GPU-enabled, use the [cuda] option (uses onnxruntime-gpu[cuda,cudnn]). Otherwise, use the [cpu] option (uses onnxruntime).

For better data-loading performance with ifcb-infer, the [torch] option is recommended. For lighter deployments in constrained environments, this option can be omitted (a simpler dataloader without additional dependencies will be used instead).

pip install "ifcb-infer[cuda,torch] @ git+https://github.com/WHOIGit/ifcb-inference.git@v0.1.0"

Download model

With the Hugging Face hf command:

# Download the model and labels to current directory using huggingface hf command
hf download sbatchelder/NES-plankton-classifier-2022 20220209_Jan2022_NES_2.4.onnx labels.json --local-dir .

With curl command:

# or
curl -L -O https://huggingface.co/sbatchelder/NES-plankton-classifier-2022/resolve/main/20220209_Jan2022_NES_2.4.onnx
curl -L -O https://huggingface.co/sbatchelder/NES-plankton-classifier-2022/resolve/main/labels.json

Optionally, also download example-data.zip, which provides sample bins for the inference example below. In practice, you would run inference on your own IFCB bins.

hf download sbatchelder/NES-plankton-classifier-2022 example-data.zip --local-dir .
# or 
curl -L -O https://huggingface.co/sbatchelder/NES-plankton-classifier-2022/resolve/main/example-data.zip
# then
unzip example-data.zip

Run inference

# Run inference on ifcb bins
ifcb-infer \
  20220209_Jan2022_NES_2.4.onnx \
  example-data/bins/ \
  --classes labels.json \
  --batch 64

Input / Output Specification

| Property | Value |
|---|---|
| Input image format | Grayscale (IFCB ROI, PIL mode L) |
| Input size | 299 × 299 pixels |
| Preprocessing | Resize → duplicate grayscale channel to 3 channels → divide by 255 |
| Input tensor | float32, shape (batch, 3, 299, 299) |
| Output tensor | float32, shape (batch, 155), raw logits (no softmax applied) |
| Class order | As listed in labels.json |

  • No ImageNet mean/std normalization is applied; pixel values are scaled to [0, 1] only.
  • The model outputs raw logits. ifcb-infer automatically applies softmax to the logits and reports confidence scores.
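
For readers who call the ONNX model without ifcb-infer, the specification above can be sketched in NumPy as follows. This is a minimal illustration, not ifcb-infer's own code. The function names are hypothetical, and the resize step is left to the caller because the interpolation mode is not stated here (it would come from preprocessor_config.json). The logits in the demo are made up and use 3 classes instead of 155 for brevity.

```python
import numpy as np


def to_input_tensor(resized_rois):
    """Turn already-resized 299x299 grayscale ROIs into a (batch, 3, 299, 299) float32 tensor."""
    batch = np.stack([np.asarray(roi, dtype=np.float32) for roi in resized_rois])
    if batch.shape[1:] != (299, 299):
        raise ValueError(f"expected 299x299 ROIs, got {batch.shape[1:]}")
    batch = batch / 255.0  # scale to [0, 1]; no ImageNet mean/std normalization
    return np.repeat(batch[:, None, :, :], 3, axis=1)  # duplicate the gray channel


def softmax(logits):
    """Convert raw logits of shape (batch, classes) into scores that sum to 1 per row."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Two synthetic ROIs: one all white, one all black.
rois = [np.full((299, 299), 255, dtype=np.uint8), np.zeros((299, 299), dtype=np.uint8)]
x = to_input_tensor(rois)
print(x.shape, x.dtype)

# Made-up logits standing in for one row of model output.
logits = np.array([[2.0, 1.0, 0.1]], dtype=np.float32)
scores = softmax(logits)
print(scores.argmax(axis=1), scores.sum(axis=1))
```

The index returned by `argmax` is the integer that labels.json maps to a class name.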

Training Details

| Property | Value |
|---|---|
| Architecture | Inception V3 (pretrained on ImageNet) |
| Training framework | PyTorch |
| Training dataset | NES plankton classifier 2022 dataset |
| Classes | 155 |
| Samples per class | min 20, max 2000 |
| Train / val split | 80 / 20 |
| Image augmentation | Horizontal and vertical flip |
| Batch size | 108 |
| Optimizer | Default (PyTorch Lightning) |
| Epochs | Best at 15 of 26 (early stopping patience=10, max=60) |
| Input resolution | 299 × 299 |
| Training date | 2022-02-15 |
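
The "Best at 15 of 26" entry can be read as patience-based early stopping: training halts once the monitored validation score has failed to improve for 10 consecutive epochs, or at epoch 60 at the latest. The sketch below simulates that rule on a synthetic score curve. The curve is invented for illustration, the function name is hypothetical, and 0-based epoch numbering is an assumption; under those assumptions a best epoch of 15 yields 26 epochs run.

```python
def run_with_early_stopping(scores, patience, max_epochs):
    """Return (best_epoch, epochs_run) for a sequence of per-epoch validation scores."""
    best_score = float("-inf")
    best_epoch = -1
    epochs_run = 0
    for epoch, score in enumerate(scores[:max_epochs]):
        epochs_run = epoch + 1
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs in a row
    return best_epoch, epochs_run


# Synthetic curve: improves through epoch 15 (0-based), then plateaus lower.
synthetic_scores = [0.5 + 0.02 * e for e in range(16)] + [0.7] * 44

best_epoch, epochs_run = run_with_early_stopping(synthetic_scores, patience=10, max_epochs=60)
print(best_epoch, epochs_run)  # 15 26
```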

Performance

Evaluated on a held-out 20% validation split:

| Metric | Value |
|---|---|
| F1 Weighted | 0.9415 |
| F1 Macro | 0.9191 |
| Best epoch | 15 |
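
The two F1 figures differ only in how per-class scores are averaged: macro F1 gives every class equal weight, while weighted F1 weights each class by its number of true samples. The pure-Python sketch below shows the difference on a tiny made-up example. The labels and predictions are invented and the function name is hypothetical; it is not the evaluation code behind the numbers above.

```python
from collections import Counter


def f1_macro_and_weighted(y_true, y_pred):
    """Compute (macro F1, support-weighted F1) from parallel label lists."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return macro, weighted


# Invented example: a common class predicted well, a rare class predicted poorly.
y_true = ["detritus"] * 8 + ["pollen"] * 2
y_pred = ["detritus"] * 8 + ["detritus", "pollen"]

macro, weighted = f1_macro_and_weighted(y_true, y_pred)
print(f"macro={macro:.4f} weighted={weighted:.4f}")
```

In this toy case the rare class drags the macro score below the weighted one. With 20 to 2000 samples per class in the training set, a similar effect is one plausible reason the reported macro F1 is lower than the weighted F1.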

Training and per-class metrics are available in the companion training data repository sbatchelder/NES plankton classifier 2022 dataset.

Classes

All 155 classes:
Acanthoica_quattrospina
Akashiwo
Alexandrium_catenella
Amphidinium
Amylax
Apedinella
Asterionellopsis_glacialis
Bacillaria
Bacillariophyceae
Bacteriastrum
Balanion
Biddulphia
Calciopappus
Calciosolenia_brasiliensis
Cerataulina_pelagica
Tripos
Tripos_furca
Tripos_fusus
Tripos_lineatus
Chaetoceros
Chaetoceros_danicus
Chaetoceros_didymus
Chaetoceros_didymus_TAG_external_flagellate
Chaetoceros_peruvianis
Chaetoceros_similis
Chaetoceros_socialis
Chaetoceros_subtilis
Chaetoceros_tenuissimus
Chaetoceros_throndsenii
Prorocentrum_dentatum
Chrysochromulina
Chrysochromulina_lanceolata
Copepod_nauplii
Corethron_hystrix
Corymbellus
Coscinodiscus
Cylindrotheca
Cylindrotheca_morphotype1
Dactyliosolen_blavyanus
Dactyliosolen_fragilissimus
Delphineis
Dictyocha
Dictyocysta
Didinium
Dinobryon
Dinophyceae
Dinophysis_acuminata
Dinophysis_norvegica
Dinophysis_tripos
Ditylum_brightwellii
Emiliania_huxleyi
Ephemera
Eucampia
Eucampia_morphytype1
Euglena
Euplotes
Euplotes_morphotype1
Eutintinnus
Favella
Gonyaulax
Guinardia_delicatula
Guinardia_delicatula_TAG_internal_parasite
Guinardia_flaccida
Guinardia_striata
Gyrodinium
Hemiaulus
Hemiaulus_membranaceus
Heterocapsa_rotundata
Kryptoperidinium_triquetrum
Karenia
Katodinium_or_Torodinium
Laboea_strobila
Lauderia_annulata
Leegaardiella_ovalis
Leptocylindrus
Leptocylindrus_mediterraneus
Licmophora
Margalefidinium
Mesodinium
Nanoneis
Odontella
Ophiaster
Oxytoxum
Paralia_sulcata
Parvicorbicula_socialis
Phaeocystis
Phaeocystis_debris
Pleuronema
Pleurosigma
Polykrikos
Prorocentrum
Prorocentrum_micans
Prorocentrum_triestinum
Proterythropsis
Protoperidinium
Pseudo-nitzschia
Pseudochattonella_farcimen
Pyramimonas
Pyramimonas_longicauda
Pyramimonas_morphotype1
Acantharia
Rhabdolithes
Rhizosolenia
Scrippsiella
Skeletonema
Stenosemella_morphotype1
Stenosemella_pacifica
Stephanopyxis
Pelagostrobilidium
Strombidium_capitatum
Strombidium_conicum
Strombidium_inclinatum
Strombidium_morphotype1
Strombidium_morphotype2
Strombidium_tintinnodes
Strombidium_wulffi
Syracosphaera_pulchra
Thalassionema
Thalassiosira
Thalassiosira_TAG_external_detritus
Thalassiosira_sp_aff_mala
Tiarina_fusus
Tintinnina
Tintinnidium_mucicola
Tintinnopsis
Tontonia_appendiculariformis
Paratontonia_gracillima
Trichodesmium
Vicicitus_globosus
Warnowia
Amoeba
bead
bubble
camera_spot
Ciliophora
coccolithophorid
Cryptophyta
detritus
detritus_transparent
fecal_pellet
fiber
fiber_TAG_external_detritus
flagellate
flagellate_morphotype1
flagellate_morphotype3
nanoplankton_mix
pennate
pennate_Pseudo-nitzschia
pennate_Thalassionema
pennate_morphotype1
pollen
shellfish_larvae
Bacillariophyceae_morphotype1
unknown2
zooplankton

Citation

If you use this model in your research, please cite the Woods Hole Oceanographic Institution and the Sosik Lab. A formal citation will be added here upon publication.

Acknowledgments

Developed at the Woods Hole Oceanographic Institution. Training data collected by the WHOI IFCB Lab on Northeast U.S. Shelf cruises.
