14/14 Posenet rabbithole

As mentioned previously, my ICM final was to design a Pose-Karaoke experience. For my motivations and background on the project, please go to the post here.

While I had major issues with getting the ICM code to work on the ml5js platform, much of it has been rectified by the maintainers of the code-base and this current example solves pretty much all the problems.

But while I was doing this, I did not have those luxuries. This resulted in me having to understand how JavaScript works and figure out the issues myself. The main issue was that a p5.Image does not have the HTML img tag that ml5js needs to be able to run the algorithm. (Funnily, it works for video. No clue why it is done this way.) This was solved using an image tag. But the problem of taking a screenshot from the live video still remained. I soldiered on and found my redemption in the toDataURL() method.

But this was the easy part.

While starting the project, I did not realise the complexity of comparing 2 poses. A lot of what I had to do relied on being able to compare 2 images and it wasn’t a trivial problem. Trawling through the depths of the internet, I came across this post by Google research where they had worked on a similar problem. This post is a wealth of information on how to compare poses and it was outside my technical ability to be able to incorporate everything in my work. But the chief things that I could incorporate were:

1) Cosine similarity: It is a measure of similarity between two vectors: basically, it measures the angle between them and returns -1 if they point in exactly opposite directions and 1 if they point in exactly the same direction. Importantly, it’s a measure of orientation and not magnitude.

2) L2 normalization: which just means we’re scaling the vector to have a unit norm. This helps in ensuring that the scale does not play a factor in comparison and the 2 images can be compared normally.

The cosine similarity helped my code run faster and the L2 normalization ensured that the relative distance from the camera won’t play a role in the comparison.
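The two ideas above can be sketched in a few lines of plain JavaScript. This is only a minimal illustration, not the code from my project: the helper names (flattenPose, l2Normalize, cosineSimilarity) are made up for this example, and I am assuming a pose is simply a list of keypoints with x and y values. It also does not account for where the person stands in the frame, only for scale.

```javascript
// Hypothetical helper: turns [{x, y}, ...] into a flat vector [x0, y0, x1, y1, ...].
function flattenPose(keypoints) {
  return keypoints.flatMap((p) => [p.x, p.y]);
}

// L2 normalization: scale the vector so that its length (norm) is 1.
function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return vec.slice(); // nothing to scale in an all-zero vector
  return vec.map((v) => v / norm);
}

// Cosine similarity: the dot product of the two unit-length vectors.
// Ranges from -1 (opposite directions) to 1 (same direction).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  const ua = l2Normalize(a);
  const ub = l2Normalize(b);
  return ua.reduce((sum, v, i) => sum + v * ub[i], 0);
}
```

For example, a pose and the same pose with every coordinate doubled (as if the person stood closer to the camera) should score very close to 1, because only the direction of the vector matters after normalization.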

Getting these 2 things to work proved to be a big challenge and once that was done, the comparison went pretty smoothly as seen in the video below:

I ran out of time to build a complete experience for the users, which would involve an engaging UI, but that gives me something to do for the winter break. While I could not match the scope I had set initially, I am very happy that I could dive into algorithmic complexities and solve those issues to make something that works. This gives me a lot of hope for the future and my coding abilities. All in all, time well spent!

(2/14) Please insert disk to continue...

Week-2 in ICM was pretty fun and deep. The class-work focused on manipulating shapes using variables and a short introduction to transformations (Link). As a take-away assignment, we were asked to create a piece in which:

  • One element controlled by the mouse.

  • One element that changes over time, independently of the mouse.

  • One element that is different every time you run the sketch.

While trying to figure out what to do, I chanced upon the clocks assignment which sent me down a deep rabbit-hole. (Thanks Cassie!) Going through John Maeda’s work has always been educational but the clocks piece blew me away. I also realised that it’s a wonderful, self-contained assignment for visual programming newbies to test the limits of their creativity. More details can be found below:

Maeda’s 12 clocks

The JS port of the original 12 clocks by Coding Train

Golan Levin’s assignment based on clocks

Golan Levin’s INSANELY DETAILED lecture on clocks, time-keeping and its representation in New Media. (It’s really worth your time to read through!)

As a first exploration, I decided to focus on being able to capture the current time and represent it using simple shape creation techniques that I learnt in Week 1. I got stuck with trying to calculate the arc angle but Shiffman’s coding challenge on clocks came to the rescue.
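For anyone stuck at the same place, the arc angle boils down to mapping each time value onto a full circle. The snippet below is a rough sketch of that idea and not a copy of my sketch or of the coding challenge: timeToAngles is a name I made up for this example, and it works in radians.

```javascript
// Hypothetical helper: converts a time into arc angles (in radians),
// where a full circle (2 * PI) is 60 seconds, 60 minutes or 12 hours.
function timeToAngles(h, m, s) {
  const TWO_PI = 2 * Math.PI;
  return {
    second: (s / 60) * TWO_PI,
    minute: (m / 60) * TWO_PI,
    hour: ((h % 12) / 12) * TWO_PI,
  };
}

// Assumed usage inside a p5.js draw() loop (not run here):
//   const a = timeToAngles(hour(), minute(), second());
//   // Start at -HALF_PI so that the arc begins at 12 o'clock.
//   arc(200, 200, 300, 300, -HALF_PI, a.second - HALF_PI);
```

At 3:30:15 this gives a quarter circle for the seconds, a half circle for the minutes and a quarter circle for the hours.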

Tycho’s new album-cover. Prints available on request.


The first clock: Link

So now that I was confident of being able to capture and manipulate the time variables, I decided to go for a number representation. I was looking at Shiffman’s background fade sketch and that jittery pattern was quite interesting. I decided to use that as a base for the fill of my number shapes.

The numbers were then drawn with a simple grid and by manipulating the frame rate and opacity, you could achieve a nice blur effect:

Find light in the beautiful sea


I added some interactivity by making the background color change on mouse click. I did not have time to try out more effects (which I am coming to believe will be the leitmotif of projects at ITP) but I was pretty happy with the outcome this time. The full sketch can be found at: Link

While I am pretty happy with the lessons so far, I think that I need to update my ability to manipulate shapes using Maths(!!!). I wonder if there is a “Maths for programming newbies”. Adding interactivity to shapes using co-ordinate based manipulation is not going to get me too far.


Confidence: +3

Missions: 3/3

Secrets: 0

Currently listening: Clocks-Coldplay

Current level: Code-Scavenger. Update in progress (1/14)...

ICM Assignment 1

Status report:

The task, which I chose to accept, was to use the primitive shapes of p5js and to create a screen drawing of my liking.

The constraints that I set on myself for this assignment were:

  • To use the limits of the videos as a constraint instead of using concepts which I know of (loops, variables etc.) to make my life easier.

  • To focus on understanding the limitations of the shapes for creating an image and where it can get really hairy. One of the first questions in my mind was how shapes could be parallel, perpendicular or aligned to each other when it was difficult to calculate the exact positions of their vertices. (For example: A rectangle which is perpendicular/parallel to the hypotenuse of a triangle).

While I was thinking of shapes, I started thinking of Monument Valley and how pretty the game was despite being built on repeating shapes and patterns. It seemed like a good enough template to try things with. Also, the player is tasked with recovering ‘sacred geometry’ in the game which fits into the theme of the assignment quite well. =)

So pretty!

I spent some time playing the game again (Hey! It was ‘research’ *Ahem*) and looking at it as a collection of basic shapes was quite illuminating. I also immediately realised the problem with creating isometric shapes with only co-ordinate values without using math-magic and vectors. Regardless, I decided to press ahead and see where it would take me.

As test subjects, I decided to focus on the main character of the game, Ida and her awesome, mute side-kick, Totem. I felt that they had the right amount of curves and shapes that would make it a challenging trial.

I decided to start with the Totem first because he(?) is awesome. I immediately decided to not do it in an isometric view and go with a flat view instead. In the spirit of the challenge, I decided to focus on the shapes on Totem’s body on the left side as they would be more challenging to pull off.

It started out pretty well…


The first 2 shapes were very simple. I discovered the beginShape() function which made it a breeze. Though, the process of finding out the coordinates was quite onerous. I haven’t done counting like that since I was in High School!

My initial plan was to make the whole grid of boxes of the Totem which can be then printed and shaped into a 3-D paper object. The best laid plans of mice and men…

Seemingly satisfied with my progress, I decided to tackle the shape pattern inside the 2nd box. And the limitations of using a basic coordinate system were immediately laid bare.

Align weird shapes, they said. It will be easy, they said…




I could not, for the life of me, figure out how to maintain a consistent distance at an angle between two shapes. Well, not that much of a hacker, I guess.
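In hindsight, this is exactly where a little vector math helps. The sketch below shows one possible way to do it and is not what I used in the failed sketch: to keep a shape at a constant distance from an angled edge, move every point along the edge's perpendicular (its normal). The function name offsetAlongNormal and the {x, y} point format are assumptions made for this example.

```javascript
// Hypothetical helper: shifts a list of {x, y} points by `distance`
// along the direction perpendicular to the edge edgeStart -> edgeEnd.
// The shifted shape stays parallel to the edge at a constant gap.
function offsetAlongNormal(points, edgeStart, edgeEnd, distance) {
  const dx = edgeEnd.x - edgeStart.x;
  const dy = edgeEnd.y - edgeStart.y;
  const len = Math.hypot(dx, dy);
  if (len === 0) throw new Error("Edge has zero length");

  // Unit normal: the edge direction rotated by 90 degrees.
  const nx = -dy / len;
  const ny = dx / len;

  return points.map((p) => ({
    x: p.x + nx * distance,
    y: p.y + ny * distance,
  }));
}
```

For an edge from (0, 0) to (3, 4) and a distance of 10, every point moves by (-8, 6). A negative distance moves the shape to the other side of the edge. The returned points could then be fed to vertex() calls between beginShape() and endShape().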

The failed sketch can be found at: Link

However, I could not leave Totem unfinished! He(?) has already been through a lot. Since I had 4 surfaces to choose from, I chose a relatively easier one and finished the sketch.



The sketch can be found at: Link

I gave the curve functions a spin and I thought that I understood them. But when it came to re-creating Ida or anything with specific, controlled curves, I failed miserably. I have no idea how to control the curves to flow into shapes that I want.

Ida will have to wait. But, I shall be back!

Major learnings:

  • I might be missing something but shapes seem to not be useful without a relative coordinate system. We invented computers so that we don’t have to manually figure out coordinates. But I have no idea how to do that.

  • I have no idea how to use the curve functions to form any kind of even remotely controlled curve. My respect for the programmers who wrote Illustrator’s Pen tool went up by 100.

  • Must learn curves.
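On the relative coordinate system: as far as I can tell, the idea is to describe a shape once around its own local origin and let a function work out the absolute positions. The snippet below is a small sketch of that idea in plain JavaScript; placeShape is a hypothetical name for this example. p5.js offers translate(), rotate(), push() and pop() for the same purpose.

```javascript
// Hypothetical helper: takes points described relative to a local origin
// (0, 0) and returns their absolute positions after rotating by `angle`
// (in radians) and moving the local origin to `origin`.
function placeShape(localPoints, origin, angle = 0) {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return localPoints.map((p) => ({
    x: origin.x + p.x * c - p.y * s,
    y: origin.y + p.x * s + p.y * c,
  }));
}

// A 10 x 10 box, described once in local coordinates.
const box = [
  { x: 0, y: 0 },
  { x: 10, y: 0 },
  { x: 10, y: 10 },
  { x: 0, y: 10 },
];

// The same box placed at (100, 50), unrotated and rotated by 90 degrees.
const flatBox = placeShape(box, { x: 100, y: 50 });
const turnedBox = placeShape(box, { x: 100, y: 50 }, Math.PI / 2);
```

This way the counting only has to happen once per shape, and moving or rotating a box becomes a change to one origin and one angle.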



Confidence: +1

Ego: -5

Missions: 1/2

Secrets: 1

Currently listening: Tycho-Awake


Logo. Basic. Hyper-talk. C++. Html.

Most computer programming origin stories begin with a young child magically coming upon these weird and strange markings with their first ever computer and that sets them on a journey of discovery, ecstasy and self-loathing across bazaars and cathedrals.

That’s not my story.

But that’s not because I hated computers.

In fact, I loved them too much.

I chatted with people on IRC over a 28 kbps dial-up connection. I saw the first GeoCities website being born. I was amongst the first Indians on Myspace. Hell, I made a Myspace page for a friend’s band with blinking tags(!) and all the visual disasters that only Myspace could allow. (Yes, I am THAT old.)

But I hated programming. Whether it be C, Logo, Basic or HTML, it sucked the living soul out of my body and it was my childhood version of a dementor. I hated my time spent in engineering and happily gave up on it by jumping ship to the ‘easier’ (or so I thought) pastures of UX design.

Which is where life had a sucker-punch waiting for me. We had to pick up Flash to prototype our UX designs and it was love at first sight. Remember the first time you do something in code and it changes visually? HOLY SHIT! (Am I allowed to swear here?)

It’s funny how destiny does that. So it goes.

And the Flash development UI is one of the greatest things ever made. I will fight you on this.

But, back to the story. After picking up Flash, my journey has taken me across different UX/UI jobs where I have managed to call myself a code-scavenger. I can put multiple things together to make it work. I have also been equally frustrated at not being able to articulate the exact intent of a gesture or an interaction and having to resort to hand-waving (A LOT OF HAND-WAVING!) while my developer friends look at me with a mixture of pity and frustration. Which brings me to ITP. A very expensive rehab for a frustrated aspiring programmer.

My intent with the course is pretty basic:

  • Level up on my programming stats. Less Barbarian. More Mage.

  • Learn enough to be able to pick up new languages without sweating buckets. Form a framework to approach any new language and treat it the same as I would treat any new software.

  • Decrease the gap between coming up with interesting, interactive ideas and the ability to express them in a working piece rather than using some hand-wavy wizardry. More demo, less After Effects.

  • Learn enough to be able to program new art and use machine learning in generative pieces and installations. I look at generative and installation artists with envy. I do not want to keep envying them.

Currently listening: Muse-Starlight