Plant Taxonomy Meets Plant Counting: A Fine-Grained, Taxonomic Dataset for Counting Hundreds of Plant Species

cs.CV Jinyu Xu, Tianqi Hu, Xiaonan Hu, Letian Zhou, Songliang Cao, Meng Zhang, Hao Lu · Mar 22, 2026

AI Review

Plain-language introduction

Most visual counting benchmarks focus on rigid objects like crowds and vehicles, leaving fine-grained biological counting understudied. This paper introduces TPC-268, a dataset of 10,000 images spanning 268 countable plant categories across 242 species, annotated with full Linnaean taxonomies and biological organization levels. By framing plant counting as class-agnostic counting with taxonomic constraints, the authors provide a testbed for evaluating hierarchical generalization in vision models.

Critical review

Verdict

Bottom line

TPC-268 represents a valuable contribution to fine-grained visual counting, offering the first large-scale benchmark that explicitly couples instance counting with biological taxonomy. The MILP-based partitioning strategy ensures strict taxonomic separation between splits, providing a rigorous zero-shot evaluation protocol that prior datasets lack. However, the paper’s claims about cross-dataset generalization are overstated, and the heavy reliance on mixed data sources (Wikipedia, PlantCLEF, unconstrained Internet images) raises questions about annotation consistency and domain bias that are not fully addressed.

“Major sources are distributed among Wikipedia (34%), PlantCLEF (29%), Internet (14%), Tree Leaf Stomata (11%)...”

paper · Section 3.2

“This indicates that our plant counting dataset is a more challenging task than FSC-147 and the model trained on TPC-268 can generalize to generic objects naturally.”

paper · Section 4.4

What holds up

The taxonomic annotation scheme is well-conceived and systematically implemented. The dataset’s hierarchical structure (kingdom to species) and organization-level categories (tissue, organ, organism, population) enable genuine zero-shot evaluation across biologically meaningful splits. The MILP formulation for data partitioning (Section 3.4, Supplementary A) correctly formalizes the constraints for taxonomic independence while balancing density and scale coverage. The benchmark results in Table 2 provide a clear baseline showing that regression-based methods (LOCA, DAVE) outperform detection-based approaches on this fine-grained domain, and the t-SNE visualization (Fig. 6) convincingly demonstrates that current feature learning fails to capture taxonomic structure.

“The minimal indivisible unit (category) is defined as a species–organization pair... Each pair is treated as an independent category and is assigned to one subset... preventing any instance overlap between different sets.”

paper · Section 3.4
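
The partitioning constraint quoted above can be illustrated with a small sketch. This is not the authors' MILP: it is a brute-force search over a toy instance that enforces the same core rule (each species-organization pair lands in exactly one subset) and minimizes deviation from target split ratios. The category list, image counts, and the 60/20/20 target are hypothetical, and the density and scale balancing terms mentioned in the review are omitted.

```python
from itertools import product

# Hypothetical toy categories: (species, organization level, image count).
CATEGORIES = [
    ("Zea mays", "organ", 120),
    ("Zea mays", "organism", 80),
    ("Triticum aestivum", "organ", 100),
    ("Oryza sativa", "organism", 60),
    ("Malus domestica", "organ", 40),
]
SPLITS = ("train", "val", "test")
TARGET = {"train": 0.6, "val": 0.2, "test": 0.2}  # assumed ratios


def best_partition(categories, splits=SPLITS, target=TARGET):
    """Assign every category to exactly one split, minimizing ratio deviation.

    Exhaustive search is only feasible for tiny inputs; a real instance
    would hand the same binary assignment variables to a MILP solver.
    """
    total = sum(n for _, _, n in categories)
    best, best_cost = None, float("inf")
    for assignment in product(splits, repeat=len(categories)):
        sizes = {s: 0 for s in splits}
        for (_, _, n), s in zip(categories, assignment):
            sizes[s] += n
        if min(sizes.values()) == 0:
            continue  # every split must receive at least one category
        cost = sum(abs(sizes[s] / total - target[s]) for s in splits)
        if cost < best_cost:
            best, best_cost = assignment, cost
    if best is None:
        raise ValueError("no feasible assignment")
    keys = [(species, level) for species, level, _ in categories]
    return dict(zip(keys, best)), best_cost


if __name__ == "__main__":
    assignment, cost = best_partition(CATEGORIES)
    for key, split in assignment.items():
        print(key, "->", split)
    print("ratio deviation:", round(cost, 3))
```

Because a category is the unit of assignment, no images of the same species-organization pair can appear in two subsets, which is the property that makes the evaluation zero-shot at the category level.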

“Overall, regression-based models outperform detection-based models. This indicates that explicit object localization is hindered by the compact spatial arrangement and structural entanglement present in our dataset.”

paper · Section 4.2

Main concerns

The primary concern is data heterogeneity and quality control. While the paper mentions a “rigorous preprocessing pipeline” and “three-round review,” it provides no quantitative metrics for inter-annotator agreement or consistency across the diverse data sources (Wikipedia vs. controlled laboratory imagery). The cross-dataset transfer analysis (Table 3) suffers from confounding: TPC-268 has 10,000 images versus FSC-147’s 6,135, and the performance asymmetry may reflect dataset size rather than inherent generalization capability. The citation of TasselNetV4 as “ISPRS’26” (Table 2) suggests reliance on unpublished or future work, undermining reproducibility. Additionally, the paper claims the dataset includes “critically endangered” species validated against the IUCN Red List, but does not discuss privacy or ethical considerations for releasing precise geolocation data of rare organisms.

“All data undergoes a rigorous preprocessing pipeline... with annotations being manually refined or newly created to ensure quality.”

paper · Section 3.2

“TasselNetV4 [22] ... ISPRS’26”

paper · Table 2
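
One way the missing inter-annotator agreement could be quantified for point annotations is sketched below. This is an assumed metric, not something the paper reports: two annotators' point sets for the same image are matched greedily (closest pairs first) within a pixel radius, and agreement is summarized as an F1 score. The function name and the `radius` parameter are hypothetical.

```python
import math


def point_agreement_f1(points_a, points_b, radius):
    """F1 agreement between two annotators' point sets for one image.

    Points are (x, y) tuples. A pair counts as a match when its distance
    is at most `radius`; each point is matched at most once, closest first.
    """
    if not points_a and not points_b:
        return 1.0  # both annotators agree the image is empty

    # All candidate pairs within the radius, sorted by distance.
    candidates = sorted(
        (math.dist(p, q), i, j)
        for i, p in enumerate(points_a)
        for j, q in enumerate(points_b)
        if math.dist(p, q) <= radius
    )
    used_a, used_b = set(), set()
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)

    matched = len(used_a)
    if matched == 0:
        return 0.0
    precision = matched / len(points_b)
    recall = matched / len(points_a)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    annotator_1 = [(0.0, 0.0), (10.0, 10.0)]
    annotator_2 = [(1.0, 0.0), (50.0, 50.0)]
    print(point_agreement_f1(annotator_1, annotator_2, radius=2.0))
```

Reporting such a score per data source (for example, Wikipedia images versus microscopy) would show whether annotation consistency varies across the heterogeneous sources the review flags.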

Evidence and comparison

The comparison to FSC-147 and FSCD-LVIS in Table 1 is fair regarding scale, though the paper correctly identifies that these datasets lack fine-grained taxonomic structure. However, the evidence for taxonomic utility (Table 4) relies on a single method (CountGD) with text prompts, which is insufficient to support the broad claim that “structured biological knowledge provides a practical inductive bias.” The ablation reports a small improvement with full taxonomy over species name alone (MAE 16.90 vs. 17.53, a gap of 0.63), and no statistical significance test accompanies it. The paper also omits comparison to domain-specific plant datasets like GWHD or MinneApple in the benchmark results, making it difficult to assess whether the CAC paradigm actually improves upon specialized models for specific agricultural use cases.

“As shown in Table 4, this consistent improvement confirms that structured biological knowledge provides a practical inductive bias for the task.”

paper · Section 4.5

“Target Specification ... + full taxonomy ... MAE 16.90”

paper · Table 4
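
The significance testing that the Table 4 comparison lacks could take the form of a paired bootstrap over per-image absolute errors. The sketch below is one standard option, not a procedure from the paper; the function name, the resample count, and the 95% level are assumptions, and the per-image errors would have to come from the authors' evaluation outputs.

```python
import random


def paired_bootstrap_mae_diff(err_a, err_b, n_resamples=2000, seed=0):
    """Bootstrap a 95% interval for MAE(a) - MAE(b) on paired test images.

    err_a and err_b hold per-image absolute counting errors for two prompt
    settings evaluated on the same images, in the same order.
    """
    if len(err_a) != len(err_b) or not err_a:
        raise ValueError("need two non-empty, equally long error lists")
    rng = random.Random(seed)
    n = len(err_a)
    diffs = [a - b for a, b in zip(err_a, err_b)]
    observed = sum(diffs) / n

    # Resample images with replacement, keeping each pair together.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    low = means[int(0.025 * n_resamples)]
    high = means[int(0.975 * n_resamples) - 1]
    return observed, (low, high)


if __name__ == "__main__":
    # Synthetic placeholder errors, for illustration only.
    gen = random.Random(1)
    species_only = [abs(gen.gauss(17.5, 20.0)) for _ in range(500)]
    full_taxonomy = [abs(e - gen.gauss(0.6, 5.0)) for e in species_only]
    diff, (low, high) = paired_bootstrap_mae_diff(species_only, full_taxonomy)
    print("MAE difference:", round(diff, 2), "95% CI:", (round(low, 2), round(high, 2)))
```

If the resulting interval contains zero, the reported 0.63 MAE gap cannot be distinguished from sampling noise on the test split.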

Reproducibility

The authors commit to releasing code and data at https://github.com/tiny-smart/TPC-268, which is essential given the complexity of the MILP partitioning scheme. However, the main paper lacks critical implementation details: hyperparameters for training the benchmarked models (learning rates, batch sizes, augmentation strategies), the exact prompt templates used for CountGD text experiments, and the threshold values for the MILP solver convergence. The annotation protocol (Section 3.3) describes manual refinement but does not specify the number of annotators, their domain expertise qualifications, or the adjudication process for disagreements. Without these details, independent reproduction of the dataset curation or benchmark results would be challenging.

“Dataset and code are available at https://github.com/tiny-smart/TPC-268.”

paper · Abstract

“Annotations for approximately 80% of the images were created from scratch by a professional annotation team over three months.”

paper · Section 3.3

Abstract

Visually cataloging and quantifying the natural world requires pushing the boundaries of both detailed visual classification and counting at scale. Despite significant progress, particularly in crowd and traffic analysis, the fine-grained, taxonomy-aware plant counting remains underexplored in vision. In contrast to crowds, plants exhibit nonrigid morphologies and physical appearance variations across growth stages and environments. To fill this gap, we present TPC-268, the first plant counting benchmark incorporating plant taxonomy. Our dataset couples instance-level point annotations with Linnaean labels (kingdom -> species) and organ categories, enabling hierarchical reasoning and species-aware evaluation. The dataset features 10,000 images with 678,050 point annotations, includes 268 countable plant categories over 242 plant species in Plantae and Fungi, and spans observation scales from canopy-level remote sensing imagery to tissue-level microscopy. We follow the problem setting of class-agnostic counting (CAC), provide taxonomy-consistent, scale-aware data splits, and benchmark state-of-the-art regression- and detection-based CAC approaches. By capturing the biodiversity, hierarchical structure, and multi-scale nature of botanical and mycological taxa, TPC-268 provides a biologically grounded testbed to advance fine-grained class-agnostic counting. Dataset and code are available at https://github.com/tiny-smart/TPC-268.
