Semi-automated development of a dataset for baseball pitch type recognition

  • Dylan Siegler
  • Reed Chen
  • Michael Fasko
  • Shunkun Yang
  • Xiong Luo
  • Wenbing Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

In this paper, we report our work on developing a new dataset for baseball pitch type recognition based on YouTube videos of US Major League Baseball games. The core innovation is a largely automated procedure that extracts relevant clips from the full game and automatically labels them by aligning the infographic information included in the broadcast with the PitchF/X data. We adopted the Needleman-Wunsch algorithm to address the challenges of aligning the two streams of data based on pitch speed, i.e., to minimize gaps and mismatches between the two streams. Manual inspection is used only to select games that include infographic information for clip extraction and to remove erroneous clips to improve the quality of the dataset.
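
The alignment step described in the abstract can be illustrated with a minimal sketch of Needleman-Wunsch global alignment over two pitch-speed sequences. This is not the authors' implementation: the function name `align_speeds`, the speed tolerance `tol`, and the match, mismatch, and gap scores are all assumptions chosen for illustration, and the sample speeds are made up.

```python
def align_speeds(broadcast, pitchfx, gap=-1.0, tol=1.0):
    """Globally align two pitch-speed sequences (mph) with Needleman-Wunsch.

    Returns a list of (i, j) index pairs; None on either side marks a gap,
    i.e. a pitch present in one stream but missing from the other.
    All scoring parameters are illustrative assumptions.
    """
    n, m = len(broadcast), len(pitchfx)

    def score(a, b):
        # Assumed scoring: +1 when speeds agree within `tol` mph, else -1.
        return 1.0 if abs(a - b) <= tol else -1.0

    # dp[i][j]: best score aligning the first i broadcast speeds
    # with the first j PitchF/X speeds; ptr records the move taken.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0], ptr[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        dp[0][j], ptr[0][j] = j * gap, "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (dp[i - 1][j - 1] + score(broadcast[i - 1], pitchfx[j - 1]), "diag"),
                (dp[i - 1][j] + gap, "up"),      # broadcast pitch left unmatched
                (dp[i][j - 1] + gap, "left"),    # PitchF/X pitch left unmatched
            ]
            dp[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


# Hypothetical speeds: the broadcast infographic missed the second pitch.
broadcast_speeds = [95.0, 88.0, 91.0]
pitchfx_speeds = [95.2, 84.1, 88.3, 91.1]
alignment = align_speeds(broadcast_speeds, pitchfx_speeds)
```

In this sketch, a pair with `None` on the broadcast side would correspond to a PitchF/X record with no matching clip, so only fully matched pairs would carry a pitch-type label over to a clip.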
Original language: English
Title of host publication: Communications in Computer and Information Science
Editors: Huansheng Ning
Place of Publication: che
Publisher: Springer
Pages: 345-359
Number of pages: 15
Volume: 1138 CCIS
ISBN (Print): 9789811519246
DOIs
State: Published - Jan 1 2019
Event: 3rd International Conference on Cyberspace Data and Intelligence, Cyber DI 2019, and the International Conference on Cyber-Living, Cyber-Syndrome, and Cyber-Health, CyberLife 2019 - Beijing, China
Duration: Dec 16 2019 to Dec 18 2019

Publication series

Name: Communications in Computer and Information Science
Publisher: Springer
Volume: 1138 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 3rd International Conference on Cyberspace Data and Intelligence, Cyber DI 2019, and the International Conference on Cyber-Living, Cyber-Syndrome, and Cyber-Health, CyberLife 2019
Country/Territory: China
City: Beijing
Period: 12/16/19 to 12/18/19

Keywords

  • Dataset
  • Needleman-Wunsch algorithm
  • Pitch type
  • PitchF/X
  • Video-based human activity recognition
