
Polanding Twitter Bot

This was a project I worked on with my friend Nadia as our final project in Deep Learning for Computer Vision. We used the Tweepy Twitter API and the TensorFlow Object Detection API to make a Twitter bot that could locate a red-white boundary like the one in the Polish flag. The project writeup can be seen below ☺️


What is ‘polanding’?

According to Urban Dictionary, it’s the process of “looking for the Poland flag in everyday things around you.”

Watch this TikTok video for more info.

Here’s a link to our presentation and a link to the Twitter bot.
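A rough sketch of the bot's reply side, assuming a hypothetical `detect_flag` helper and credentials we obviously can't reproduce here; the Tweepy calls are outlined only in comments, since keys and handles are project-specific:

```python
# Hypothetical reply formatter (illustrative, not the project's exact code).
def format_reply(label, confidence):
    side = label.split("-")[-1]          # e.g. "poland-red-down" -> "down"
    return f"Found a Polish flag (red side: {side}) with {confidence:.0%} confidence!"

# With credentials in hand, replying would look roughly like:
#   import tweepy
#   api = tweepy.API(tweepy.OAuth1UserHandler(key, secret, token, token_secret))
#   for mention in api.mentions_timeline():
#       label, conf = detect_flag(mention)            # hypothetical detector
#       api.update_status(format_reply(label, conf),
#                         in_reply_to_status_id=mention.id)

print(format_reply("poland-red-down", 0.87))
# Found a Polish flag (red side: down) with 87% confidence!
```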

Project Objectives

This project was inspired by the TikTok trend known as ‘polanding.’ In a typical video, a user records themselves finding a Polish flag (a rectangle whose top half is white and bottom half is red) in everyday life, whether out on a walk, in a city, or in their immediate surroundings. The main objective of this project is to identify an instance of the Polish flag in any given image through a series of rotations and/or transformations. We aim to train an image classifier on a bundle of labeled images and then run it on any image a user might feed it. The network should be able to distinguish red from white and pick out where a boundary between the two appears. Ideally, our classifier will take in any given image, pinpoint a rectangle that looks like the Polish flag, and report its percent confidence in that selection.

Methods

Data Collection

First, we collected 208 images from Google Images in which we could visibly pick out an instance of the Polish flag, with varying levels of difficulty. Many were images of simple objects, like stop signs or red-and-white awnings; others were crowded with many things, like a photo of Times Square. We used LabelMe, an online annotation tool, to manually outline each instance of the Polish flag that we spotted by eye. For every instance we found, we assigned one of the four following labels (the ‘poland-red’ prefix indicates where the red half of the rectangle is):

  1. poland-red-right
  2. poland-red-left
  3. poland-red-up
  4. poland-red-down
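To make the label convention concrete, here is a small hypothetical helper (not from the original project) that maps each label to the half of a bounding box expected to be red:

```python
def red_half(label, box):
    """box is (xmin, ymin, xmax, ymax); returns the red sub-rectangle."""
    xmin, ymin, xmax, ymax = box
    xmid, ymid = (xmin + xmax) // 2, (ymin + ymax) // 2
    return {
        "poland-red-right": (xmid, ymin, xmax, ymax),
        "poland-red-left":  (xmin, ymin, xmid, ymax),
        "poland-red-up":    (xmin, ymin, xmax, ymid),
        "poland-red-down":  (xmin, ymid, xmax, ymax),
    }[label]

print(red_half("poland-red-down", (0, 0, 100, 50)))  # (0, 25, 100, 50)
```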

[Screenshot: example LabelMe annotations]

Pre-Trained Model

We used the TensorFlow Object Detection API with Google’s MobileNet architecture, pre-trained on the COCO dataset. We chose an existing API because building something comparable from scratch would require substantially more time than was allotted for this project. The MobileNet architecture is made up of 17 building blocks, followed by a convolutional layer, a global average pooling layer, and a classification layer. Each building block contains three bottleneck convolutional layers: 1) an expansion layer that expands the number of channels from the input, 2) a depthwise convolution layer that filters the expanded channels, and 3) a projection layer that projects the high-dimensional data back down to a tensor with fewer channels.
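As a rough illustration of why this block design is cheap, here is a pure-Python weight count for one such block (the widths and expansion factor are standard MobileNetV2 defaults, an assumption since the text doesn't give exact sizes; biases and batch-norm parameters are omitted):

```python
def bottleneck_params(c_in, c_out, t=6, k=3):
    """Approximate weight counts for the three layers described above:
    expansion (1x1), depthwise (kxk, one filter per channel), projection (1x1)."""
    expanded = t * c_in
    expansion  = c_in * expanded       # 1x1 conv widening the channels
    depthwise  = k * k * expanded      # one kxk filter per expanded channel
    projection = expanded * c_out      # 1x1 conv projecting back down
    return expansion, depthwise, projection

e, d, p = bottleneck_params(32, 16)
print(e, d, p, e + d + p)  # 6144 1728 3072 10944
```

The depthwise layer dominates spatially yet contributes the fewest weights; a full 3x3 convolution at the expanded width of 192 channels would instead need 3 * 3 * 192 * 192 ≈ 331k weights.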

Experimental Evidence

try1

On our first try, we collected images and labeled them using LabelMe. We exported the annotations as XML files, converted them to CSV, and passed those results into a TensorFlow function that produced the TFRecord files needed for our training and validation data.
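The XML-to-CSV step can be sketched with the standard library; the annotation schema below is PascalVOC-style and only an assumption, since the exact LabelMe export format isn't shown in the writeup:

```python
import xml.etree.ElementTree as ET

# Illustrative annotation in PascalVOC style (an assumed schema).
SAMPLE = """<annotation>
  <filename>times_square.jpg</filename>
  <object>
    <name>poland-red-down</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>70</ymax></bndbox>
  </object>
</annotation>"""

def xml_to_rows(xml_text):
    """Flatten one annotation file into CSV-ready rows."""
    root = ET.fromstring(xml_text)
    fname = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        b = obj.find("bndbox")
        rows.append((fname, obj.findtext("name"),
                     int(b.findtext("xmin")), int(b.findtext("ymin")),
                     int(b.findtext("xmax")), int(b.findtext("ymax"))))
    return rows

print(xml_to_rows(SAMPLE))
# [('times_square.jpg', 'poland-red-down', 10, 20, 110, 70)]
```

Each row can then be written out with `csv.writer` and handed to the TFRecord-generating function.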

To train the network, we used Google’s MobileNet V2 architecture pre-trained on the COCO dataset. For our first run, we performed random data augmentation, which would do things like flipping the image horizontally and/or vertically and adjusting brightness and contrast levels. Unfortunately, we couldn’t get our network to converge, and validation accuracy only reached 65%. The results of this try were unreliable: the network would label correctly several times, but also incorrectly several times.
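A minimal NumPy sketch of that flip-and-brightness augmentation, with one caveat worth flagging as an assumption: because the labels encode orientation, a flip must also remap the label (a horizontal flip turns red-left into red-right, and so on), or the augmented labels become wrong:

```python
import numpy as np

H_FLIP = {"poland-red-left": "poland-red-right",
          "poland-red-right": "poland-red-left",
          "poland-red-up": "poland-red-up",
          "poland-red-down": "poland-red-down"}
V_FLIP = {"poland-red-up": "poland-red-down",
          "poland-red-down": "poland-red-up",
          "poland-red-left": "poland-red-left",
          "poland-red-right": "poland-red-right"}

def augment(img, label, rng):
    """Random horizontal/vertical flip plus brightness jitter."""
    if rng.random() < 0.5:
        img, label = img[:, ::-1], H_FLIP[label]   # horizontal flip
    if rng.random() < 0.5:
        img, label = img[::-1, :], V_FLIP[label]   # vertical flip
    img = np.clip(img * rng.uniform(0.8, 1.2), 0, 255)  # brightness scale
    return img, label
```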

We believe the network failed to converge because, when we manually labeled the images in LabelMe, we only outlined one instance of the Polish flag in each image rather than highlighting every occurrence. So the network might find a genuine Polish flag in a given image but, lacking confidence in its decision, move on, when in actuality it would have been correct. This led our network to overfit its training data.

try2

On our second try, we wrote a function that generated noisy red-and-white rectangles for each of the four labels, as shown below:

[Image: examples of generated noisy rectangles]
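A guess at what that generator might have looked like (the colors and noise level here are arbitrary choices, not the project's actual values): a half-red, half-white patch with Gaussian pixel noise, oriented according to its label.

```python
import numpy as np

RED, WHITE = np.array([220, 20, 40]), np.array([245, 245, 245])

def noisy_flag(h, w, label, rng, noise=15.0):
    """Return an (h, w, 3) uint8 patch matching one of the four labels."""
    img = np.empty((h, w, 3), float)
    if label == "poland-red-down":
        img[: h // 2], img[h // 2 :] = WHITE, RED
    elif label == "poland-red-up":
        img[: h // 2], img[h // 2 :] = RED, WHITE
    elif label == "poland-red-left":
        img[:, : w // 2], img[:, w // 2 :] = RED, WHITE
    else:  # poland-red-right
        img[:, : w // 2], img[:, w // 2 :] = WHITE, RED
    img += rng.normal(0, noise, img.shape)          # per-pixel noise
    return np.clip(img, 0, 255).astype(np.uint8)
```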

Once we trained the network on this data, we still ended up with low confidence and relatively high loss. We also saw that our network found very large rectangles in a given image covering many colors; in other words, the classification boundary was very uncertain. So what to do from here? We had fed our network real-life images in try1 and strictly red-white images in try2, yet were actually getting worse results. So, Prof. Belhumeur suggested we combine the two approaches: insert a Polish flag image into a real-life image.

try3

For our third and final try, we took images from the COCO dataset and superimposed Polish flags onto them. While we were confident in this approach, our network still struggled. Our loss failed to dip below 2, and while the validation accuracy improved somewhat, it still didn’t get above 75%.
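The superimposing step amounts to pasting a synthetic flag patch into a background and recording its box as ground truth; a minimal sketch (function and variable names are illustrative, not the project's):

```python
import numpy as np

def paste_flag(background, flag, x, y):
    """Paste a flag patch at (x, y) and return the image plus its box."""
    out = background.copy()
    h, w = flag.shape[:2]
    out[y : y + h, x : x + w] = flag
    box = (x, y, x + w, y + h)   # (xmin, ymin, xmax, ymax) ground truth
    return out, box
```

In practice the patch would be a generated noisy rectangle and the background a COCO image, giving real-world clutter with exact box labels for free.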

References/Resources