I'm a huge sports fan, from the Golden State Warriors (shoutout Looney) to the French National soccer team (shoutout Kanté) to the Duke basketball team (shoutout McCain). So naturally, why not build an artificial intelligence model that detects basketballs, shooting motions, the rim... It's both a cool way to learn new tools and a way to make my atrocious shooting form look more like Michael Jordan's: https://www.pierrequereuil.com/classifiers.
I personally feel like AI has been misrepresented with the recent advances in generative models. Sure, ChatGPT generating a picture of a cat in a tuxedo is pretty fun. But detection models that can figure out what's in pictures, transcribe unreadable texts, help comb through medical data to find tumors, translate ASL... I feel like that's a whole lot cooler. They feel somehow more connected to their original statistical definitions than generative ones. So, I decided to apply these to basketball, so that a model could recognize shooting motions, ball paths, and whether the shot was made or not. There are a bunch of options for image classifiers, from scikit-learn to TensorFlow, but I wanted to try something new through the ultralytics library, which specializes in object detection through its version of the YOLO algorithm (You Only Look Once). The actual model workflow is pretty straightforward: I found a couple of datasets on Roboflow Universe, an online ML repository with great integration with ultralytics, and used their guides to train a couple of models with different parameters to find one that could perform well.
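To give a rough idea of what "train a couple of models with different parameters" can look like, here's a minimal sketch using the ultralytics Python API. It isn't my exact training script: the `yolov8n.pt` starting weights, the `basketball/data.yaml` path, the epoch and image-size values, and the helper names `candidate_runs`, `train_and_score`, and `pick_best` are all assumptions made up for illustration.

```python
from itertools import product


def candidate_runs(epochs_options, imgsz_options):
    """Build one settings dict per (epochs, imgsz) combination."""
    return [
        {"epochs": epochs, "imgsz": imgsz}
        for epochs, imgsz in product(epochs_options, imgsz_options)
    ]


def train_and_score(data_yaml, settings, weights="yolov8n.pt"):
    """Train one YOLO model and return its mAP@0.5 on the validation split.

    `weights` is an assumed starting checkpoint, not necessarily the one
    used for the project.
    """
    # Imported here so the pure helpers work without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    model.train(data=data_yaml, **settings)
    metrics = model.val(data=data_yaml)
    return metrics.box.map50


def pick_best(scores):
    """Return the index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical path to a dataset exported from Roboflow in YOLO format.
    data_yaml = "basketball/data.yaml"
    runs = candidate_runs(epochs_options=[25, 50], imgsz_options=[640])
    scores = [train_and_score(data_yaml, run) for run in runs]
    print("Best settings:", runs[pick_best(scores)])
```

Comparing runs on a single number like mAP@0.5 is just one reasonable choice; eyeballing predictions on a few real shots is probably as important.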
I'm of the belief that nowadays, pretty much anyone with a little knowledge of Python can get a good detection model up and running. I am by no means an expert (although check out the Neural Network blog if you're interested in some more detailed discussions of machine learning), and the basketball object detection project worked super well for what I was expecting. However, I also wanted to integrate this tech with my portfolio website. This has always been a pretty major issue for me in my past projects, since I inevitably run into a couple of walls:
Finding Gradio and its hosted Spaces on Hugging Face was a huge relief, since they gave me as many model deployments as I wanted, a clean GUI, and a well-documented JS client that could handle cross-platform predictions. While it was still an uphill battle to integrate (maybe I'll figure out video API calls some day), I'm so glad it worked out as a showcase of image detection possibilities. You can check them all out on Hugging Face, or try them out on this website!
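For anyone curious what the Gradio side of a setup like this might look like, here's a minimal sketch of an app that wraps a trained YOLO model. It's an illustration under assumptions, not the code running in my Spaces: the `best.pt` weights path, the text summary output, and the `summarize` and `build_demo` helper names are all hypothetical.

```python
from collections import Counter


def summarize(labels):
    """Turn a list of detected class names into a short, sorted summary."""
    counts = Counter(labels)
    return ", ".join(f"{n} x {name}" for name, n in sorted(counts.items()))


def build_demo(weights_path="best.pt"):
    """Build a Gradio interface around a trained YOLO model.

    `weights_path` is an assumed location for the trained weights.
    """
    # Imported here so `summarize` works without these packages installed.
    import gradio as gr
    from ultralytics import YOLO

    model = YOLO(weights_path)

    def detect(image):
        result = model.predict(image)[0]
        labels = [result.names[int(c)] for c in result.boxes.cls]
        # plot() draws the boxes and returns a BGR array; flip it to RGB.
        annotated = result.plot()[..., ::-1].copy()
        return annotated, summarize(labels) or "nothing detected"

    return gr.Interface(
        fn=detect,
        inputs=gr.Image(type="pil"),
        outputs=[gr.Image(), gr.Textbox(label="Detections")],
    )


if __name__ == "__main__":
    build_demo().launch()
```

Once an app like this is hosted, a website can send it images and get predictions back through the Gradio client instead of running the model itself.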