In 2020, ʻIKE Solutions was fortunate enough to be part of the CARES Act-funded Aloha Connect Innovation project overseen by the Economic Development Alliance of Hawaii. The opportunity afforded by this program resulted in a step-up in ML functionality for identifying tunas and other pelagics in the Pacific Ocean. During the program, three interns were trained in annotation services to build a machine learning library out of images. As part of the training, excess footage from ʻIKE Solutions partner vessels was used. In the end, the library was trimmed, checked for correctness, and grown to include 539,417 annotations. Additional areas of interest were also built on top of the existing library, resulting in new ML capabilities in an EM application that closed gaps identified by previous years of research.
Previously, our training library only allowed us to identify whole fish to species. Due to this limitation, partial or damaged fish would often be missed when ML was used to extract data from video. Additionally, the metadata created from only being able to identify fish bodies introduced noise into the data, including duplicated fish counts, misidentified individuals, and many false positives where the algorithm would identify inanimate objects as a fish. In the image below (sourced from fishnet.ai), the Yellowfin Tuna was detected and identified to species by the old model, but the occluded fish below it escaped detection:
With our new library, we have been able to cut down on misidentifications and false positives by training the model to place three boxes on each fish: one each on the body, head, and tail. By doing so, we can now sort by confidence and remove the noise, providing only data that has gone through a preliminary check before a human needs to look at it. In the example below, a Mahi gets a computer-generated box on its body, head, and tail, while an occluded Yellowfin Tuna gets a box around its body and tail only, as the head is not visible:
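The confidence-sorting step can be sketched roughly as follows. This is an illustrative example only: the detection record format, the `fish_id` grouping, and the 0.6 confidence cutoff are assumptions for the sketch, not our production pipeline. The idea is that a fish is counted only when at least two of its three part boxes (body, head, tail) survive the confidence filter, which lets an occluded fish through while suppressing single-box false positives.

```python
# Sketch of confidence-based filtering of three-box detections.
# Record format, fish_id grouping, and threshold are assumptions.
from collections import defaultdict

CONF_THRESHOLD = 0.6  # assumed cutoff; tuned per deployment in practice


def count_fish(detections, threshold=CONF_THRESHOLD, min_parts=2):
    """Count a fish only when at least `min_parts` of its three part
    boxes (body, head, tail) clear the confidence threshold. This
    suppresses lone false-positive boxes (e.g. an inanimate object)
    while still counting an occluded fish with a hidden head."""
    parts_by_fish = defaultdict(set)
    for d in detections:
        if d["conf"] >= threshold:
            parts_by_fish[d["fish_id"]].add(d["part"])
    return sum(1 for parts in parts_by_fish.values() if len(parts) >= min_parts)


# A whole fish (3 boxes) and an occluded fish (body + tail) both count;
# a single low-confidence "body" box is dropped as noise.
detections = [
    {"fish_id": 1, "part": "body", "conf": 0.92},
    {"fish_id": 1, "part": "head", "conf": 0.88},
    {"fish_id": 1, "part": "tail", "conf": 0.81},
    {"fish_id": 2, "part": "body", "conf": 0.74},
    {"fish_id": 2, "part": "tail", "conf": 0.69},
    {"fish_id": 3, "part": "body", "conf": 0.31},
]
print(count_fish(detections))  # prints 2
```

In practice the grouping of boxes into individuals would come from the detector's spatial output rather than a pre-assigned ID, but the filtering logic is the same.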
This advancement, accomplished with three interns, represents what previously would have taken years of work. Dedicated annotators are an indispensable part of growing this model, and issues identified in the future can be solved with some old-fashioned hard annotation work to clean up the results. When testing this new model, we also found that the algorithm is optimized for the cameras we offer, meaning a new data set will need to be created if camera technology improves or different cameras are introduced to the system in future years.