Tuesday, January 26, 2010

Idea for the Project session 2

1. Identifying the classes of objects and their number in an image.

Given an image database, this solution attempts to find the objects that are present in each image, as well as how many of each are present.
As humans we are good at recognising people and objects even from partial views: we can recognise a table that is cut off at the edge of the frame, or a person who is facing away from us.
These seem to be things that computers would find very difficult to recognise.
So if we give the players an image, ask them what they see in it, and come up with a game mechanic that makes this fun, we will be generating classification data that can be used to tag the images.
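
To make the "classification data" part a bit more concrete, here is a rough sketch of how answers from several players could be merged into tags for one image. Everything in it is an assumption for illustration: the function name `aggregate_tags`, the shape of the player responses (a dict of label to count per player), and the 50% agreement threshold are all made up, not part of any existing system.

```python
from collections import Counter
from statistics import median_low


def aggregate_tags(responses, min_agreement=0.5):
    """Merge per-player answers for one image into agreed tags.

    responses: list of dicts, one per player, mapping an object label
    to the number of such objects that player reported (hypothetical format).
    A label is kept only if at least `min_agreement` of the players named it;
    its count is the (low) median of the counts reported for it.
    """
    n_players = len(responses)
    if n_players == 0:
        return {}

    # Count how many players mentioned each label at all.
    mentions = Counter()
    for response in responses:
        mentions.update(response.keys())

    tags = {}
    for label, n_mentions in mentions.items():
        if n_mentions / n_players >= min_agreement:
            counts = [r[label] for r in responses if label in r]
            tags[label] = median_low(counts)
    return tags


# Three players looking at the same image.
responses = [
    {"person": 2, "table": 1},
    {"person": 2, "table": 1, "dog": 1},
    {"person": 3},
]
tags = aggregate_tags(responses)
print(tags)
```

Here "dog" is dropped because only one of three players saw it, while "person" and "table" survive. A real game would probably want something smarter than a fixed threshold, but the basic idea is that agreement between independent players is what turns guesses into usable tags.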

2. Tagging a feeling for a song or a radio.

We identify songs with the feelings that they bring to us. However, this is not the classification that we see on music labels.
By asking the player to classify the music he is hearing into one of a set of categories, or a category of his own, we can offer a better way of selecting music.
The categories can be based on feelings alone. The songs can be fed in live from various radio stations across the globe.
This will create a useful database of information.
Further, if a band wants feedback on a song that it has composed, in terms of the feeling it intends to evoke, the song could be made a part of this game.
The band can get instant feedback on how listeners feel about it.
All this needs is an addictive game designed around the concept of music classification. People just hear songs and play the game. Perhaps there need not be a verbal indication of how the song feels, but instead a visually generated world that the player is transported into.
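
As a small sketch of what the "instant feedback" for a band might look like, the feeling labels collected from players could be turned into a simple profile for each song. The function name `mood_profile` and the example labels are hypothetical; the only assumption is that each player submits one free-text feeling per song.

```python
from collections import Counter


def mood_profile(votes):
    """Turn raw feeling labels for one song into a share per feeling.

    votes: list of strings, one per player (hypothetical input format).
    Labels are lower-cased and stripped so "Calm" and "calm " count together.
    """
    cleaned = [v.strip().lower() for v in votes if v.strip()]
    counts = Counter(cleaned)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders feelings from most to least frequent.
    return {mood: n / total for mood, n in counts.most_common()}


votes = ["Calm", "calm", "nostalgic", "calm ", "Joyful"]
profile = mood_profile(votes)
print(profile)
```

A band that intended a song to feel joyful could compare that intention with the profile and see that most listeners in this made-up sample heard it as calm instead.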

3. Separation of noise and actual information (e.g. voice vs. background vehicle sound).

We know that something being dropped in the background while we are talking to someone does not make us turn around, unless the item is large enough or it is a person. We instantly know what is important. A computer, however, finds it difficult to classify what is noise and what is not. So, given a random sound track, it could be a challenge for a person to identify how many people are present, or to tag the different sounds that he can hear in the track. In essence he is eliminating the noise and tagging only what is essential.
For example, if it is a clip of a car moving on the road, the player will rightly identify the car in the background, and perhaps the sound of brakes.
Humans build a context as they hear a track, but computers don't. Humans are context sensitive, hence they can analyse a given sound track in greater depth.

On my way back home I met a friend from CMU who is doing his PhD in image processing. As we talked, one of the things he mentioned caught my attention.

4. Orientation of an object in an image

It is non-trivial to find the orientation of an object in an image. As humans, however, we are gifted with the ability to extrapolate information that is not present in the image. For example, in surveillance camera footage of a person walking through the subway, he would obviously not always be facing the camera. As humans we can figure out that it is the same person no matter what his orientation is, up to a certain point, but this becomes a complicated problem for a computer to solve. So determining the orientation of objects in images could be a challenging problem that can be crowdsourced. Given two images of the same object in different orientations, it is a matter of a few seconds for a human to recognise the object, but a mammoth task for an image recognition algorithm. This also lends itself to a good game for the players, so proceeding along these lines might be a good idea.

Other ideas that we discussed:
Composition or orchestration of music
  - could be used as a teaching tool for children (educational)
  - Musical associations - colors, images


