Andrey Kurenkov's Web World Jekyll 2018-05-19T07:07:59-07:00 / Andrey Kurenkov / contact@andreykurenkov.com <![CDATA[Task Aware Grasping]]> /projects/research/task-aware-grasping 2018-04-29T00:00:00-07:00 2018-04-29T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>Arxiv paper explaining it all coming soon!</p> <p><a href="/projects/research/task-aware-grasping/">Task Aware Grasping</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on April 29, 2018.</p> <![CDATA[Effective Teamwork - a One Page Guide]]> /writing/project/effective-teamwork 2018-04-29T00:00:00-07:00 2018-04-29T00:00:00-07:00 Andrey Kurenkov www.andreykurenkov.com contact@andreykurenkov.com <blockquote> <p>“Fools learn from experience. I prefer to learn from the experience of others.” <br /> -Otto von Bismarck</p> </blockquote> <p>How this happened: I was helping out a team of undergrads and finding myself frustrated with their workflow. Then it occurred to me that my many years of experience with big team projects, professional software engineering, and grad-level research have led me to have some pretty strong beliefs about how effective teamwork ought to be done.</p> <p>So below is a one-page summary of all the lessons I have learned having done these many team projects (and having been a culprit in bad teamwork innumerable times). Enjoy!</p> <object data="/writing/files/effective-teamwork/EffectiveTeamwork.pdf" type="application/pdf" width="100%" height="100%"> <p><b>Effective Teamwork</b>: This browser does not support PDFs. Please download the PDF to view it: <a href="/pdf/sample-3pp.pdf">Download PDF</a>.</p> </object> <p>Feel free to leave comments <a href="https://docs.google.com/document/d/1-dMVVf5Y0FaCXSW4P4V-vX5TF2yCmgT65S2Xxr2nlVo/edit?usp=sharing">on the Google Doc</a>! 
(pls dont spam thx).</p> <p><a href="/writing/project/effective-teamwork/">Effective Teamwork - a One Page Guide</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on April 29, 2018.</p> <![CDATA[Skynet Today]]> /projects/team_projects/Skynet-Today 2018-04-02T00:00:00-07:00 2018-04-02T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>A lot of work and team building. Visit the site!</p> <p><a href="/projects/team_projects/Skynet-Today/">Skynet Today</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on April 02, 2018.</p> <![CDATA[AlphaGo Zero Is Not A Sign of Imminent Human-Level AI]]> /writing/is-alphago-zero-overrated 2018-03-30T00:00:00-07:00 2018-03-30T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <h2 id="why-alphago-zero-is-great">Why AlphaGo Zero Is Great</h2> <p>Let’s start with the coverage about DeepMind’s recent successor to AlphaGo<sup id="fnref:AlphaGo"><a href="#fn:AlphaGo" class="footnote">1</a></sup>, AlphaGo Zero:</p> <ul> <li><a href="http://fortune.com/2017/10/19/google-alphago-zero-deepmind-artificial-intelligence/">“Google’s New AlphaGo Breakthrough Could Take Algorithms Where No Humans Have Gone”</a>: &gt; “While it sounds like some sort of soda, AlphaGo Zero may represent as much of a breakthrough as its predecessor, since it could presage the development of algorithms with skills that humans do not have. … AlphaGo achieved its dominance in the game of Go by studying the moves of human experts and by playing against itself—a technique known as reinforcement learning. AlphaGo Zero, meanwhile, trained itself entirely through reinforcement learning. 
And, despite starting with no tactical guidance or information beyond the rules of the game, the newer algorithm managed to beat the older AlphaGo by 100 games to zero.”</li> <li><a href="https://www.theverge.com/2017/10/18/16495548/deepmind-ai-go-alphago-zero-self-taught">“DeepMind’s Go-playing AI doesn’t need human help to beat us anymore”</a>: &gt; “The company’s latest AlphaGo AI learned superhuman skills by playing itself over and over”</li> <li><a href="https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own">“‘It’s able to create knowledge itself’: Google unveils AI that learns on its own”</a>: &gt; “In a major breakthrough for artificial intelligence, AlphaGo Zero took just three days to master the ancient Chinese board game of Go … with no human help”</li> <li><a href="https://www.inc.com/lisa-calhoun/google-artificial-intelligence-alpha-go-zero-just-pressed-reset-on-how-we-learn.html">” Google Artificial Intelligence ‘Alpha Go Zero’ Just Pressed Reset On How To Learn”</a>: &gt; “Alpha Go Zero is changing the game for how we solve big problems.”</li> </ul> <p>Point being: AlphaGo Zero (which we’ll go ahead and shorten to AG0) is arguably the most impressive and definitely the most praised<sup id="fnref:praised"><a href="#fn:praised" class="footnote">2</a></sup> recent AI accomplishment<sup id="fnref:unaware"><a href="#fn:unaware" class="footnote">3</a></sup>. Roughly speaking, AG0 is just a <a href="http://theai.wiki/Deep%20Learning">Deep</a> <a href="http://theai.wiki/Neural%20Network">Neural Network</a> that takes the current state of a Go board as input, and outputs a Go move. 
Not only is this much simpler than the original AlphaGo<sup id="fnref:simpler"><a href="#fn:simpler" class="footnote">4</a></sup>, but it is also trained purely through self-play (pitting different AlphaGo Zero neural nets against each other; the original AlphaGo was ‘warmed up’ by training to mimic human expert Go players). It’s not exactly right that it learns ‘with no human help’, since the very rules of Go are hand-coded by humans rather than learned by AlphaGo, but the basic idea that it learns through self-play rather than through any mimicry of human Go players is correct. I’ll let the key researcher behind it expand on that:</p> <figure> <iframe width="560" height="315" src="https://www.youtube.com/embed/tXlM99xPQC8" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <figcaption>DeepMind's own explanation of why AG0 is so exciting. </figcaption> </figure> <p>So, surely DeepMind’s demonstration that an AI algorithm can achieve superhuman Go and Chess play purely through self-play is a testament to the usefulness of such techniques for solving the hard problems of AI? Well, to some extent yes — it has taken the field decades to get here, since the <a href="http://theai.wiki/Branching%20Factor">branching factor</a> of Go does indeed make it a challenging board game. This is also the first time the same Deep Learning algorithm was used to crack both Chess and Go<sup id="fnref:general"><a href="#fn:general" class="footnote">5</a></sup>, rather than being specifically tailored for one game, as was the case with <a href="https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)">Deep Blue</a> (the much heralded machine of IBM that was the first to beat humanity’s best at Chess) and the original AlphaGo. 
Therefore, AG0 is certainly monumental and exciting work (and great PR).</p> <figure> <img src="/content/editorials/images/is-alphago-zero-overrated/history.png" alt="Game History" /> <figcaption>AlphaGo is the culmination of research into Game AI that stretches all the way back to the birth of AI as a research field. So, it is an inarguably great and historic achievement.<b><a href="http://www.andreykurenkov.com/writing/ai/a-brief-history-of-game-ai/">(source)</a></b></figcaption> </figure> <h2 id="why-alphago-zero-is-not-that-great">Why AlphaGo Zero Is Not That Great</h2> <p>With those positive things having been said, some perspective: AG0 is not really a testament to the usefulness of such techniques for solving the hard problems of AI. You see, Go is only hard within the context of the simplest category of AI problems. That is, it is in the category of problems with every property that makes a learning task easy: it is <a href="https://en.wikibooks.org/wiki/Artificial_Intelligence/AI_Agents_and_their_Environments">deterministic, discrete, static, fully observable, fully-known, single-agent, episodic</a>, cheap and easy to simulate, easy to score… Literally the only challenging aspect of Go is its huge branching factor. Predictions that <a href="http://theai.wiki/AGI">AGI (Artificial General Intelligence)</a> is imminent based only on AlphaGo’s success can be safely dismissed — <a href="https://medium.com/@karpathy/alphago-in-context-c47718cb95a5">the real world is vastly more complex than a simple game like Go</a>. Even fairly similar problems that have most but not all of the properties that make a learning task easy, <a href="/content/news/openai-dota-ii/">such as the strategic video game DotA II</a>, are far beyond our grasp right now.</p> <figure> <img src="/content/editorials/images/is-alphago-zero-overrated/venn.svg" alt="Venn" /> <figcaption>A (rough) diagram of AI problem complexity. 
Note that Go and (most) Atari games are in the same league as chess; just about the only distinction is branching factor. The techniques that power AG0 may solve games like Go, but as <a href="http://www.skynettoday.com/news/alphago/">I've written elsewhere in more detail</a> most AI problems are vastly more difficult — categorically different. </figcaption> </figure> <p>Another important thing to understand beyond the categorical simplicity of Go is its narrowness. AG0 is a definite example of <a href="http://theai.wiki/Weak%20AI">Weak AI</a>, also known as narrow AI. Weak AI agents are characterized by only being able to perform one ‘narrow’ task, such as playing a 19 by 19 game of Go. Though AG0 has the impressive ability to learn to play 3 different board games, it does so separately per game<sup id="fnref:separately"><a href="#fn:separately" class="footnote">6</a></sup>. And, it can only learn a very narrow range of games: basically just 2-player grid-based board games without any necessary memorization of prior positions or moves<sup id="fnref:memorization"><a href="#fn:memorization" class="footnote">7</a></sup>.</p> <figure> <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">&quot;Generalized AI is worth thinking about because it stretches our imaginations and it gets us to think about our core values and issues of choice and free will that actually do have significant applications for specialized AI.&quot; - <a href="https://twitter.com/BarackObama?ref_src=twsrc%5Etfw">@BarackObama</a> <a href="https://t.co/VFhJsMXuIq">pic.twitter.com/VFhJsMXuIq</a></p>&mdash; Lex Fridman (@lexfridman) <a href="https://twitter.com/lexfridman/status/976461233443561477?ref_src=twsrc%5Etfw">March 21, 2018</a></blockquote> <script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> <figcaption>In a <a href="https://www.wired.com/2016/10/president-obama-mit-joi-ito-interview/">lengthy interview with Wired</a>, then-president 
Obama displayed an impressively nuanced understanding of the state of AI. If only some unnamed billionaires would communicate to the public similarly... </figcaption> </figure> <p>So, while AG0 works and its achievement is impressive, it is fundamentally similar to Deep Blue in being an expensive system engineered over many years with millions of dollars of investment purely for the task of playing a game — nothing else. Though Deep Blue was great PR for IBM, all that work and investment is not usually seen as having contributed much to the progress of broader AI research, having been ultra-specific to solving the problem of playing Chess. Just as with the algorithms that power AG0, human-tweaked heuristics and sheer computational brute force can definitely be used to solve some challenging problems — but they ultimately did not get us far beyond Chess, not even to Go. We should ask ourselves: can the techniques behind AG0 get us far beyond Go?</p> <figure> <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">&quot;Games (Chess, Go, Dota) represent closed systems, which means we humans filled the machine with a target, with rules. There is no automatic transfer of the knowledge that machines could accumulate in closed systems to open-ended systems.&quot; - Garry Kasparov <a href="https://t.co/ysdV7sG9Qv">pic.twitter.com/ysdV7sG9Qv</a></p>&mdash; Lex Fridman (@lexfridman) <a href="https://twitter.com/lexfridman/status/974624143441350658?ref_src=twsrc%5Etfw">March 16, 2018</a></blockquote> <script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> <figcaption>Garry Kasparov, the very man who faced and ultimately lost to Deep Blue, on the limitations of Deep Blue and implicitly AlphaGo. 
</figcaption> </figure> <p>Probably, yes; the algorithms behind AG0 (Deep Learning and self-play) are inherently more general than human-defined heuristics and brute computation<sup id="fnref:likely"><a href="#fn:likely" class="footnote">8</a></sup>. Still, it is important to understand and remember the parallels between Deep Blue and AG0: <strong>at the end of the day, both Deep Blue and AG0 are narrow AI programs that were built (at least in part) as PR boons for large companies by huge teams at the cost of millions of dollars; they deal with problems which are difficult for humans, but which are also relatively simple for computers.</strong></p> <figure> <img src="/content/editorials/images/is-alphago-zero-overrated/ibm.png" alt="PR" /> <figcaption>"One day after its chess computer defeated Garry Kasparov, the world chess champion, I.B.M. stock surged to a 10-year high and was only a bit shy of its record." <b><a href="http://www.nytimes.com/1997/05/13/business/ibm-s-stock-surges-by-3.6.html">(source)</a></b> </figcaption> </figure> <p>I write this not to be controversial or take away from DeepMind’s fantastic work, but rather to fight against all the unwarranted hype AG0’s success has generated and encourage more conversation about the limitations of deep learning and self-play. More people need to <a href="https://arxiv.org/abs/1801.05667">step up</a> and <a href="https://www.alexirpan.com/2018/02/14/rl-hard.html">say this kind of stuff</a> so that the general public as well as the AI research community are not led astray by hype and PR.</p> <figure> <img src="/content/editorials/images/is-alphago-zero-overrated/hype.png" alt="AI Hype" /> <figcaption>AGI Doomsayers overhype the significance of things like AG0 while people like me try to counter them and bring about disillusionment; meanwhile there are plenty of ethical concerns and potential misuses of AI to worry about already. Let's hope we reach the plateau of productivity soon... 
<a href="https://en.wikipedia.org/wiki/Hype_cycle"><b>(source)</b></a></figcaption> </figure> <p>And all that aside, it should still be asked: might there be a better way for AI agents to learn to play Go? The very name AlphaGo Zero is in reference to the idea that the model learns to play Go <a href="https://deepmind.com/blog/alphago-zero-learning-scratch/">“from scratch”</a>, without any further human input or explanation. But is learning ‘from scratch’ really such a good thing? Imagine you knew nothing about Go and decided to start learning it. You would definitely read the rules, some high level strategies, recall how you played similar games in the past, get some advice… right? And it is indeed at least partially because of the learning ‘from scratch’ <strong>limitation</strong> of AlphaGo Zero that it is not truly impressive compared to human learning: like Deep Blue, it still relies on seeing orders of magnitude more Go games and planning for orders of magnitude more scenarios in any given game than any human ever does.</p> <figure> <img src="/content/editorials/images/is-alphago-zero-overrated/go_gif.gif" alt="Go GIF" /> <figcaption>The progression of AG0's skill. It is certainly impressive that it takes 'just' 3 days of non-stop computation to get to best-human-in-the-world skill. But perhaps we should also note it takes a whole day and orders of magnitude more games than humans get to experience in their lifetimes to get to an ELO score of 0 (which even the weakest human can do easily)? From <a href="https://deepmind.com/blog/alphago-zero-learning-scratch/"><b>"DeepMind's AlphaGo Zero Blog Post"</b></a> </figcaption> </figure> <h2 id="tldr">TLDR</h2> <p>So, let’s sum up: though AlphaGo and AG0’s achievements are historic and impressive, they also represent little if any progress in tackling the truly hard problems of AI (not to mention AGI). 
Still, as with any field all AI researchers stand on the shoulders of their predecessors; though these techniques may not foreshadow the coming of AGI, they are definitely part of the Deep Learning Revolution the field is still in the midst of and the ideas that they are based on will doubtlessly enable future progress. As with Deep Learning as a whole, it is important to appreciate these fantastic accomplishments for the field of AI without losing perspective about their limitations.</p> <div class="footnotes"> <ol> <li id="fn:AlphaGo"> <p>AlphaGo is the program that famously beat humanity’s best Go player Lee Sedol. <a href="#fnref:AlphaGo" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:praised"> <p>At least, praised by non-technical news coverage <a href="#fnref:praised" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:unaware"> <p>For those unaware, AlphaGo Zero is the name of an algorithm discussed in a paper late last year. <a href="#fnref:unaware" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:simpler"> <p>In contrast to AG0, AlphaGo involved several neural nets and features specific to Go. <a href="#fnref:simpler" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:general"> <p>There are many approaches to <a href="https://en.wikipedia.org/wiki/General_game_playing">General Game Playing</a> that cover much more than just Chess and Go, and neither AG0 nor any Deep Learning approach had yet to be compared to those in the standard competitions they are pitted against each other at. <a href="#fnref:general" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:separately"> <p>That is, there is not a single trained neural net that can play the 3 games, but 3 separate neural nets with one for each game. 
<a href="#fnref:separately" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:memorization"> <p>That is, on any given move the board contains all the necessary information to decide on the next move; no memory of the past required. <a href="#fnref:memorization" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:likely"> <p>It’s quite likely that right now researchers and engineers at DeepMind are working hard to demonstrate the next version of AlphaGo, which will presumably learn to play multiple games rather than just one. <a href="#fnref:likely" class="reversefootnote">&#8617;</a></p> </li> </ol> </div> <p><a href="/writing/is-alphago-zero-overrated/">AlphaGo Zero Is Not A Sign of Imminent Human-Level AI</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on March 30, 2018.</p> <![CDATA[The 2018 Best Picture Nominees Ranked, Reviewed, and Reflected Upon]]> /writing/art/oscar-noms-reviewed 2018-03-03T00:00:00-08:00 2018-03-03T00:00:00-08:00 Andrey Kurenkov www.andreykurenkov.com contact@andreykurenkov.com <h2 id="intro">Intro</h2> <p>When the 2018 Oscar best picture nominees were released, I realized I had already seen most of them before they were known to be the nominees purely because I was excited for them as works of art. Then, just for fun, I endeavored to watch the rest and cataloged my personal <a href="https://letterboxd.com/andreykurenkov/list/oscars-best-picture-2018-my-faves/">preferences</a>. But then I realized I could do more. I had reviews and further thoughts on most of these films, and could write up a whole big summary - what you are reading now! 
So without further ado, ordered from least to most liked for the sake of ending on the most positive note, reviews and thoughts on all the best picture nominees:</p> <h1 id="darkest-hour">Darkest Hour</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/darkest.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/darkest.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-020318">Viewing Date: 02/03/18</h2> <h2 id="original-score-355">Original Score: 3.5/5</h2> <h2 id="original-review">Original Review:</h2> <p>A showy movie of romance and splendor - too much romance and splendor, in fact. Especially in its first half, the loud soundtrack, fancy panning/zooming/flying shots, and weird bouts of humor all add up to annoy and exhaust rather than highlight the excellent central performance. This improves in the second half, with show-off shots that are quite nice and serve to highlight the peril Churchill stands in. But then the romantic bit cuts in and ruins that improvement with a bastardized version of history that does not seem realistic and <a href="http://time.com/5183140/gary-oldman-winston-churchill-true-history/">indeed is utterly false</a>. Still, the movie looks great, Oldman crafts a compelling portrayal, and overindulgence combined with a liberal interpretation of history is at least a far lesser sin than being boring or shallow.</p> <h2 id="current-score-355">Current Score: 3.5/5</h2> <h2 id="thoughts">Thoughts:</h2> <p>The only movie among all of these I would not call great. This is an unusual case of a movie on which I am split, as I can find both great and not so great elements within it. 
Sadly, great plus not so great equals not so great.</p> <p>As with the rest of these, my perspective on it has been affected a good deal by external listening and reading, in this case <a href="http://www.theqandapodcast.com/2017/12/darkest-hour-anthony-mccarten-q.html">The Q&amp;A with Jeff Goldsmith</a> which reveals many interesting tidbits such as:</p> <ul> <li>The linked interview is with the screenwriter Anthony McCarten, who has previously written The Theory of Everything. My impression of the critical response to that one was that it was a sort-of schmaltzy Oscar bait flick, which is a broadly correct characterization of this movie as well.</li> <li>The movie’s chief theme is the power of words, but the case of Churchill hardly seems like a good case for that; many figures (MLK Jr. comes to mind) have made far, far more impactful speeches than the ones shown in the film.</li> <li>There is also a focus on the wisdom of having uncertainty as a leader, but once more I don’t think Churchill and Dunkirk are the best possible frames to tell that story.</li> <li>Gary Oldman, who gives a memorable performance despite looking nothing like Churchill, had to go through three and a half hours of makeup each day to be able to look like the portly old Winston. 
Though it certainly is a good portrayal, it seems a bit overkill given that the movie’s showy direction ultimately overshadows the acting and plenty of fine actors who look more like Churchill could have done a fine job as well…</li> </ul> <h1 id="the-post">The Post</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/post.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/post.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-250218">Viewing Date: 25/02/18</h2> <h2 id="original-score-455">Original Score: 4.5/5</h2> <h2 id="original-review-1">Original Review:</h2> <p>A timely and smart allegory for the importance of a free press and the pervasiveness of sexism in the 1970s (and rather obviously, in the present). Strikes just the right balance of gripping-character-driven story and not-at-all-subtle social critique, and Spielberg is as good as ever at the basic craft of cinema. Hanks is a little predictable if solid, but Meryl Streep absolutely owns the role and makes the character come alive. Plenty of critics will just dismiss this as forgettable awards fluff, but when watched without cynicism it’s clear this is an excellently made, thoroughly enjoyable, and unusually intelligent film.</p> <h2 id="current-score-455">Current Score: 4.5/5</h2> <h2 id="thoughts-1">Thoughts:</h2> <p>I have seen this one recently, and unlike Darkest Hour I don’t think it is schmaltzy Oscar-bait at all despite looking like it might be. Spielberg smartly made it a lean and direct message-movie, and its direct address of the importance of the press and <a href="https://youtu.be/45r_klAYPbw?t=2m7s">the reality of sexism in society</a> is still as relevant as it was in the 1970s. And again, Meryl Streep is just so sublime, the movie is worth seeing for that alone. 
I leave you with a fantastic interview that delves into the impressive real life figure Streep portrayed so well:</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/M56VRfUSqkA" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <h1 id="get-out">Get Out</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/get_out.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/get_out.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-210118">Viewing Date: 21/01/18</h2> <h2 id="original-score-45">Original Score: 4/5</h2> <h2 id="original-review-2">Original Review:</h2> <p>A clever, effective, and timely satire horror comedy.</p> <h2 id="current-score-455-1">Current Score: 4.5/5</h2> <h2 id="thoughts-2">Thoughts:</h2> <p>One of the multiple movies among these for which my esteem has only grown. As with others, listening to conversations with its makers played an important role in evolving my view of it. In particular, the <a href="http://www.theqandapodcast.com/2017/02/jordan-peele-get-out-q.html">Q&amp;A with Jeff Goldsmith interview with the writer and director Jordan Peele</a> revealed many subtle and smart aspects:</p> <ul> <li>Grounding the film’s horror in a universal experience (meeting a significant other’s parents) made it relatable for all viewers.</li> <li>Peele has had the idea for many years, and he decided to collaborate with Blumhouse Productions because they encouraged him to have full creative control and craft it in accordance with his vision. 
This has resulted in one of the most original and timely movies of the year.</li> <li>There is more subtle symbolism and implied detail than I realized while watching the film, which is not typically true of such audience pleasers/crowd favorites.</li> </ul> <p>Additionally, others have done an excellent job expounding on how smart the movie actually is:</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/Jdd0JF79q4I" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <iframe width="560" height="315" src="https://www.youtube.com/embed/AJLHsXw-LFI" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <h1 id="lady-bird">Lady Bird</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/lady_bird.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/lady_bird.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-181117">Viewing Date: 18/11/17</h2> <h2 id="original-score-455-1">Original Score: 4.5/5</h2> <h2 id="original-review-3">Original Review:</h2> <p>A great coming-of-age movie with fantastic acting, editing, writing, and… just about everything. A few bits of dialogue feel a bit too clever for their own good, and having a few scenes go slower might’ve been good, but the strength of the central performance and so many individual memorable moments alone make this utterly worth watching.</p> <h2 id="current-score-45">Current Score: 4/5</h2> <h2 id="thoughts-3">Thoughts:</h2> <p>A rare film for which my esteem has somewhat dropped. Some of the writing just feels too clever for its own good (like the ‘my mother made one mistake’ scene notably highlighted in the trailer), and like its protagonist the movie feels like it’s trying hard to be cool and smart but is not being vulnerable and honest in the process. 
Still, a fantastic coming-of-age movie, in particular because of the brilliance of lead actress Saoirse Ronan and writer/director Greta Gerwig - definitely some of the most exciting young talents in film today.</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/iODwgFDvdC0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <h1 id="phantom-thread">Phantom Thread</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/phantom_thread.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/phantom_thread.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-030218">Viewing Date: 03/02/18</h2> <h2 id="original-score-455-2">Original Score: 4.5/5</h2> <h2 id="original-review-4">Original Review:</h2> <p>A surprisingly sly, perverse, and darkly funny period romance. Exquisitely staged, exquisitely acted, exquisitely directed… exquisite.</p> <h2 id="current-score-55">Current Score: 5/5</h2> <h2 id="thoughts-4">Thoughts:</h2> <p>And now we get to the set of movies I unabashedly love! Once again my appreciation has grown due to a <a href="http://www.bbc.co.uk/programmes/b09pn8pl">BBC Film Programme</a> interview:</p> <ul> <li>The key inspiration for the movie, the writer-director Paul Thomas Anderson being sick and taken care of by his wife, is unusual and unexpected. He gets at aspects of romance and relationships not often explored, which is impressive.</li> <li>As the interviewer notes, on a second viewing it’s much easier to see the film as being utterly comedic rather than serious and dramatic like the director’s other work. 
I have been describing this as a ‘sly, intelligent, dark romantic comedy’ to friends, and it is refreshing to see such a non-formulaic romantic comedy.</li> <li>Another unexpected aspect is the richness of the female characters, who may be more fully three dimensional than Daniel Day-Lewis’s excellently portrayed male character. As Anderson says in his interview:</li> </ul> <blockquote> <p>“Daniel is front and center because he is Daniel… I think Audiences go in with expectation this is Daniel’s film, but in fact he is support for the girls who are our protags.”</p> </blockquote> <h1 id="shape-of-water">Shape of Water</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/shape_of_water.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/shape_of_water.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-091217">Viewing Date: 09/12/17</h2> <h2 id="original-score-455-3">Original Score: 4.5/5</h2> <h2 id="original-review-5">Original Review:</h2> <p>Like the fantastical creature at its core, The Shape of Water is beautiful, strange, and exquisitely crafted.</p> <h2 id="current-score-45-1">Current Score: 4/5</h2> <h2 id="thoughts-5">Thoughts:</h2> <p>While watching the movie, I was struck by how unabashedly and unreservedly beautiful it was. So I was glad to hear that was Guillermo del Toro’s intent; as stated on the <a href="http://www.bbc.co.uk/programmes/b09qhbt8">BBC Film Programme</a>:</p> <blockquote> <p>“The overwhelming reaction is the same overwhelming reason why I wanted to make it, which is, can you please show me something beautiful, can you please show me something life affirming, can you please take me out of the news, right now…”</p> </blockquote> <p>That being said, it is also one of the few movies I consider great for which my esteem has lessened. 
A big part of the reason is the lengthy discussion on <a href="https://megaphone.link/PPY7532927140">the Next Picture Show podcast</a>, in which most of the participants felt tepid about the film. I agree with their perspective that the movie feels a little artificial in all its beauty, and that going as far as it does in portraying the romance explicitly undercuts it somewhat. Still, this is a rare simple allegory of sensual delights that is definitely worth seeing.</p> <h1 id="dunkirk">Dunkirk</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/dunkirk.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/dunkirk.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-230717">Viewing Date: 23/07/17</h2> <h2 id="original-score-45-1">Original Score: 4/5</h2> <h2 id="original-review-6">Original Review:</h2> <p>Visceral. Unlike Nolan’s many other neat-idea or plot-puzzle movies, this one is made with the obvious intent of making you feel an experience to your bones. But, some of Nolan’s fondness for fancy intellectually fun structure is here - there is no conventional 3 act structure or protagonist with an arc and instead an ensemble cast going through overlapping but temporally offset narratives that come together by the end. Yet more daringly, the ensemble story has minimalist character development and plot, which leaves it free to almost entirely avoid exposition and fully immerse you in the characters’ dreaded pulse-pounding experiences. 
And does it ever do that - especially in the perfect-for-this-film IMAX, its great big stark melancholy colored images, seemingly never stopping soundtrack, and clearly big budget production all work to grab you and not let you go til the movie is over and you can cry along with the soldiers at the plain sight of English greenery.</p> <p>This all is great and for the most part executed beautifully, except that I am not sure if the 3 narrative threads really work well together, and by the end the cross-cutting between narrative threads gets overdone and lessens the impact of the ending a good deal. And, after that ending there is just not a whole lot to think back on or relate to; it’s a hell of a ride, but not much more. It made me think back to Gravity, which is likewise a non-stop visceral tale of survival, but one that I think managed to still have a resonant theme and message. But damn, is this one hell of a ride.</p> <h2 id="current-score-455-2">Current Score: 4.5/5</h2> <h2 id="thoughts-6">Thoughts:</h2> <p>So many thoughts… as indicated in the review I was unsure if Nolan’s layered fancy structure was actually warranted, but I have since grown to appreciate it more. Once again, much insight comes from an interview on the <a href="http://www.bbc.co.uk/programmes/b08y26qr">BBC Film Programme</a>:</p> <ul> <li>The movie is discussed as ‘The most expensive experimental film in cinema history’, which feels quite right. Just the audacity to do something so intricate and out there in a big budget movie earns a lot of respect from me.</li> <li>The idea of the structure came before writing the script, which makes sense; the script is utterly</li> <li>Nolan describes the movie as being ‘Intensively subjective’, and justifies the structure by its ability to get across the bigger picture of the Dunkirk event while also staying extremely close to the boots-on-the-ground soldiers. 
He has a nice line about not having wanted to cut to a room of generals discussing the situation, and indeed he does not, and the movie is far far stronger for it.</li> <li>With regard to the experimentation of the film, Nolan says that he wants to work at the ‘Edge of what mainstream audience can get enlivened by’ and feels that audiences that don’t treat movies like puzzles but rather just let ‘the experience of the movie wash over them’ will enjoy them most. I feel that Nolan’s movies are far too often treated as puzzles, and liked his perspective on this a lot.</li> <li>Nolan commented that cinema has a unique ability to manipulate perception of time, and that his intent with this film was to ‘pull that part of machinery out’ and have an ‘explicit discussion with the audience’ rather than letting it be subliminal as in most movies.</li> <li>What I appreciated most, and still appreciate most, about the film is its ability to make me feel like I was there. It was evident this was not accidental: Nolan discusses at length how he wanted to make it feel as real as possible by setting things in real physical spaces, and by leveraging the grammar of suspense, ‘present tense narrative’. That is, he focused on just showing the character in a given situation without extra dramatization or characterization; such daring minimalism is rare and was incredibly successful here.</li> <li>A last interesting tidbit is that early in the music composition, Nolan sent the composer a recording of a pocket watch. Unlike most movies, which add music after the movie is made, composition was done from very early on so that the ‘score could be fused with picture and sound effects early on’. This worked fantastically well - as Nolan says, there is a ‘physical fusion of picture and music and sound effects unlike anything else’. 
The music is ‘objective’; just as the screenplay is stripped of all unnecessary dialogue (all conversation except that which is needed to survive), the music is stripped of all unnecessary emotion and is meant to convey the experience of the present moment. But the intelligence and the brilliance of the score go even beyond that, as highlighted in this video:</li> </ul> <iframe width="560" height="315" src="https://www.youtube.com/embed/LVWTQcZbLgY" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <h1 id="call-me-by-your-name">Call Me By Your Name</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/name.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/name.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-210118-1">Viewing Date: 21/01/18</h2> <h2 id="original-score-55">Original Score: 5/5</h2> <h2 id="original-review-7">Original Review:</h2> <p>Luscious, sensual, rapturous - perfectly captures the feeling of being young during a carefree summer of discovery that is equal parts exciting and nerve-racking. And it does so beautifully, artistically; there are so many close ups and frames in this movie that convey an endless depth of emotion and thought without a word being spoken. And it does so intelligently, thematically tying back to the Greeks’ views on sensuality and love; unlike so much of recent 80s-set media, this does not feel nostalgic - it feels timeless.</p> <p>And the music! Sufjan Stevens is such a perfect fit for this, singing in his soft voice of the ‘Mystery of Love’. And the writing! Adapted by the now 89-year-old James Ivory, it feels wise and fully aware of the subtle unspeakable eternal truths of life. 
I credit Ivory too, with the unusual lack of tension over being found out as gay and the character’s fully supportive parents; as discussed <a href="https://www.dailyemerald.com/2018/02/05/call-name-screenwriter-james-ivory-returns-uo-screening-qa/"> in an interview </a>, this felt like a welcome change from more conventional cinema:</p> <blockquote cite="https://www.dailyemerald.com/2018/02/05/call-name-screenwriter-james-ivory-returns-uo-screening-qa/"> ... it’s been very hard to find a gay film which was about happiness and joy and love,” Ivory said during the Q&amp;A after the film. ... The moderator of the event, cinema studies instructor Sergio Rigoletto, said that one of the revolutionary things about “Call Me By Your Name” is that it allows its gay characters to be in love without the impending threat of punishment. Sure, the fear of being found does live in the back of their minds, but their main reasons for not being able to be together are their age differences and the fact that Elio and Oliver only have six weeks of their summer together. </blockquote> <p>So, the movie is joy, happiness, love. And the final shot! Watch this film, if only for the final conversation, and the final shot.</p> <p>“I remember everything”</p> <h2 id="current-score-55-1">Current Score: 5/5</h2> <h2 id="thoughts-7">Thoughts:</h2> <p>Well, my appreciation of this movie has certainly not lessened - everything that I said in that original review, I still feel. But as with Three Billboards, hearing interviews and watching videos has also deepened my appreciation of the movie. In particular, the lengthy and detailed interview of the director and co-author Luca Guadagnino on <a href="http://www.theqandapodcast.com/2017/12/call-me-by-your-name-q.html">The Q&amp;A with Jeff Goldsmith</a> highlights many great elements:</p> <ul> <li>The protagonist is very well portrayed as what he is - a 17 year old kid, a “mini bomb of different contradictions”. 
He has not come to think of himself as belonging to any group or label, and is appropriately excited and afraid of the desires he feels.</li> <li>The interview frankly discusses the issue of the age difference between the 17-year-old protagonist and his 24-year-old lover. For one, the age of consent in Italy is 16, and for another, this has been done before with such classics as Dirty Dancing. But far more importantly, the movie does something many romantic movies without such an age difference do not: show direct and clear disclosure of one’s feelings, portray explicit and enthusiastic consent during sex, and linger on direct discussions over each person’s concerns and desires. And it does so while not losing any sense of romance, passion, or beauty - what a feat!</li> <li>The director would not change anything about the movie if he were to do it over. And he is right! I would not want him to.</li> <li>Sufjan Stevens came into collaboration on the movie’s soundtrack after a large amount of narration was cut out from the original screenplay. Stevens’ songs are meant to act as a sort of third person narrator communicating the universality of the love story we are seeing, and I think that was done perfectly.</li> </ul> <iframe width="560" height="315" src="https://www.youtube.com/embed/KQT32vW61eI" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p>Lastly, what I’ve grown to appreciate even more than I did after just watching the movie is this element of portraying an utterly happy love story (even if it ends with separation, rather than marriage). 
As <a href="https://youtu.be/5u2MAUPbFxo?t=2m45s">stated perfectly by the book’s author in an interview of both him and the movie’s director</a>,</p> <blockquote> <p>“Both the novel and film do one thing that is so essential - there is no accident, there is no death, there is no banning of any sexual proclivity - these are two individuals who have a relationship, and I think it should serve as a model for essentially happy romance.”</p> </blockquote> <iframe width="560" height="315" src="https://www.youtube.com/embed/eE01rqDJ13A" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><br /></p> <h1 id="three-billboards-outside-ebbing-missouri">Three Billboards Outside Ebbing, Missouri</h1> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-03-03-oscar-noms-reviewed/three_billboards.jpg"><img class="postimageactual" src="/writing/images/2018-03-03-oscar-noms-reviewed/three_billboards.jpg" alt="Poster" /></a></p> </div></figure> <h2 id="viewing-date-210118-2">Viewing Date: 21/01/18</h2> <h2 id="original-score-455-4">Original Score: 4.5/5</h2> <h2 id="original-review-8">Original Review:</h2> <p>Challenging, unexpected, dark, funny - a rare treat. Definitely worth seeing.</p> <h2 id="current-score-55-2">Current Score: 5/5</h2> <h2 id="thoughts-8">Thoughts:</h2> <p>A rare movie for which my esteem has only grown, and in just one month.</p> <p>I think what I appreciate so much about this film (and why I think it deserves to win best picture) is that it is far more challenging than any of the other films on here. As eloquently stated by the writer and director Martin McDonagh on <a href="https://www.bbc.co.uk/programmes/b09l203n">the BBC Film Programme</a>, the core of the film is two characters going to war in which neither one is really the bad guy. It would be so easy to make the grieving mother the one to clearly root for and to portray the authorities as incapable and uncaring. 
But as <a href="http://www.denofgeek.com/uk/movies/martin-mcdonagh/54436/martin-mcdonagh-interview-three-billboards-outside-ebbing-missouri">McDonagh says</a>:</p> <blockquote> <p>“Even though she’s technically the hero of the piece she does things that are way out of order, indefensible. That’s part of why I really like her, the character and Frances’ playing of her, she’s really three dimensional and not somebody you could say is the perfect person at all. That’s good characterisation, I think, and makes it hopefully a film you can see more than once. It’s not a simple heroes against villains story.”</p> </blockquote> <p>The very simple and clear setup of this conflict - the titular three billboards - is also utterly inspired and immediately interesting. So it’s unsurprising that such a striking moment <a href="http://www.denofgeek.com/uk/movies/martin-mcdonagh/54436/martin-mcdonagh-interview-three-billboards-outside-ebbing-missouri">is actually drawn from the real world</a>:</p> <blockquote> <p>“I saw something on a bunch of billboards about twenty years ago which is almost identical to what we see on the first two billboards, they’re literally verbatim.”</p> </blockquote> <p>There was some controversy, and there were some negative takes, about this film with regard to whether it ultimately redeemed the racist and violent police officer played by Sam Rockwell. I am fairly sure this controversy will keep this movie from winning Best Picture, but I also just don’t agree with it at all. Like the non-perfection of the central character, the ‘redemption’ of Rockwell’s character is actually completely subverted in opposition to what would typically be done in Hollywood. 
The morally ambiguous and subversive ending is actually one of my favorite things about the movie, and I recommend it to anyone who is a fan of black comedy, complicated explorations of morality, and just great cinema.</p> <iframe width="560" height="315" src="https://www.youtube.com/embed/NbNNNCjm32M" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe> <p><a href="/writing/art/oscar-noms-reviewed/">The 2018 Best Picture Nominees Ranked, Reviewed, and Reflected Upon</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on March 03, 2018.</p> <![CDATA[The Sickness That Is Depression]]> /writing/life/conveying-depression 2018-02-28T00:00:00-08:00 2018-02-28T00:00:00-08:00 Andrey Kurenkov www.andreykurenkov.com contact@andreykurenkov.com <p>Note: best read on a sizeable computer screen</p> <div><button class="btn" data-toggle="collapse" data-target="#foreword"> Foreword &raquo; </button></div> <blockquote class="aside"><p id="foreword" class="collapse" style="height: 0px;"> Like most woke twenty-somethings, I broadly agree with the often expressed notion that mental illness should be de-stigmatized. I am also often annoyed with the ambiguity of empty pronouncements in support of that cause. How does that de-stigmatization actually happen? <br /><br /> I suppose this is my own answer to that question: it happens at least partially through people like me publicly writing about their experience with depression and its associated side effects. And it happens through more people understanding what depression actually is and what people suffering from it have to deal with. <br /><br /> I suffered from a severe episode of clinical depression for the last few months of 2017, from about mid September to just about the end of December. It was so severe that I started having (completely unprecedented) suicidal thoughts, and sought consultation with both a therapist and a psychotherapist. 
Having recovered from it for almost two months now, I can see in hindsight that part of what made it so severe was that I was just not ready to comprehend or accept the scope of depression’s ability to warp my thoughts, outlook, and physical condition. <br /><br /> I can also see in hindsight that, frankly, the stigma affected me also. I did not share that I was afflicted with severe depression with many of my best friends, nor with most of the people I work with. Many of those people would be surprised to find this out, as I often masked any sign of it by acting cordial and energetic. So, this is my admission to having worn that mask myself. <br /><br /> But I hope to do more than just be open and honest about depression; I hope to promote understanding of it. I hope to expose the experience of the sickness that is depression dissected, spread out, annotated. I hope to make it easier to understand for those who have not dealt with it that depression is a profoundly strange and terrible illness, yes <b>illness</b>, and that those suffering from it are dealing with a lot more than just a period of sadness. Hopefully, if you read this those things will come across. 
<br /><br /> </p></blockquote> <p><br /></p> <iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/vWGsVfxvB3E?autoplay=0&amp;loop=1&amp;rel=0&amp;controls=0&amp;playlist=vWGsVfxvB3E" frameborder="0" allow="encrypted-media" allowfullscreen=""></iframe> <figure class="sidefigureright"> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/rothko.jpg" alt="Mark Rothko, Untitled " /> <figcaption>Mark Rothko, Untitled <a href="http://www.artnet.com/artists/mark-rothko/untitled-j7Y_lpV8AsioM5amWt1tjQ2"><b>(Source)</b></a></figcaption> </figure> <figure class="sidefigureleft"> <img class="postimagesmall" style="width:70%;" src="/writing/images/2018-02-26-depression/rothko_brown.jpg" alt="Mark Rothko, Untitled " /> <figcaption>Mark Rothko, Untitled <a href="http://www.artnet.com/artists/mark-rothko/untitled-j7Y_lpV8AsioM5amWt1tjQ2"><b>(Source)</b></a></figcaption> </figure> <p>This is how I remember depression <sup id="fnref:Clinical"><a href="#fn:Clinical" class="footnote">1</a></sup>. <br />It was not exactly like this; there were highs, lows, bad days, good days.</p> <p>But this is how I remember it.</p> <p>Tired. So tired, cold, ambivalent. Anxious, miserable, uncomfortable, unfocused<sup id="fnref:Symptoms"><a href="#fn:Symptoms" class="footnote">2</a></sup>.</p> <p>That’s depression. <br />Except, these are just words. 
<br />Depression cannot be conveyed with just such words.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-02-26-depression/matta_black_idea.jpg"><img class="postimage" src="/writing/images/2018-02-26-depression/matta_black_idea.jpg" alt="Roberto Matta, The Black Idea" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Roberto Matta, The Black Idea</p> </figcaption></figure> <figure class="sidefigureleft"> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/honaker_lost.jpeg" alt="Honaker" /> <figcaption>Edward Honaker, II <a href="http://www.edwardhonaker.com/booktwo/"><b>(Source)</b></a></figcaption> </figure> <figure class="sidefigureright"> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/honaker_headshake.jpg" alt="Honaker" /> <figcaption>Edward Honaker, II <a href="http://www.edwardhonaker.com/booktwo/"><b>(Source)</b></a></figcaption> </figure> <p>You know how being in a bad mood is like a filter that squeezes out the joy from otherwise pleasant things? <br />Imagine being in the worst mood of your life, for months<sup id="fnref:Months"><a href="#fn:Months" class="footnote">3</a></sup>.</p> <p>You know the difference between dragging yourself to work after waking up with a killer hangover, and heading home having just exercised? <br />Being depressed was like a perpetual hangover, and being healthy again was like having a perpetual dopamine high<sup id="fnref:BrainStuff"><a href="#fn:BrainStuff" class="footnote">4</a></sup>.</p> <p>You know how when you are drunk, or high, you think thoughts that seem absurd in retrospect? <br />That happens all the time, constantly, with depression. Eventually I learned to distrust my ‘depressed self’.</p> <p>You know that feeling of getting in a cold shower and having your body revolt, the feeling of ‘<strong>make this stop</strong>’? 
<br />It was like that, every morning, being hit with a wave of formless misery moments after waking up, like ‘oh, right, <strong>this</strong>’.</p> <p>You know how it feels when you are at a social event, and just can’t seem to get along with anyone - alienated? <br />Depression makes you feel like that, even among friends.</p> <p>You know how nice it is to warm up by a fire after being cold, and how painful it is to then leave that fire? <br />Spending time with friends or family felt like that; that warmth soon went away to be replaced by the familiar cold.</p> <p>You know the difference between feeling dead tired and ambivalent about everything except sleep and the feeling of having just awoken on the first day of a vacation? <br />One day I awoke and felt that I really was healed of depression, and it was like that.</p> <p>You know that feeling of doing something you could not care less about, just wanting it to be done with? <br />Being alive felt like that.</p> <p>I had supportive family, friends, security, and prospects. <br />I exercised, meditated, slept well, took on a lighter workload, had a good diet. <br />But it did not matter. It still took me months, and medication, to improve<sup id="fnref:Medication"><a href="#fn:Medication" class="footnote">5</a></sup>. <br />I still had suicidal thoughts<sup id="fnref:SideEffect"><a href="#fn:SideEffect" class="footnote">6</a></sup>.</p> <p>That’s depression. <br />Except, it was not a feeling. <br />It was a state, a condition, a sickness. 
A sickness I was stuck with.</p> <figure class="sidefigureleft"> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/depression_comics_normal.jpg" alt="Depression Comix" /> <figcaption><a href="https://www.depressioncomix.com/?"><b>Depression Comix, Being</b></a></figcaption> </figure> <figure class="sidefigureright"> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/depression_comics_future.jpg" alt="Depression Comix" /> <figcaption><a href="https://www.depressioncomix.com/?"><b>Depression Comix, To See A Future</b></a></figcaption> </figure> <p><br /></p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-02-26-depression/honaker_white.jpg"><img class="postimage" src="/writing/images/2018-02-26-depression/honaker_white.jpeg" alt="Edward Honaker, II" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Edward Honaker, II</p> </figcaption></figure> <p><br /><br /></p> <p>It was like I was no longer really myself. <br />I did not have interests, energy, drive - all that was left was a dull miserable husk of myself.</p> <p>It was like I was just play-acting as Andrey. <br />The play-acting turned out to be easy, and occasionally it made me feel better.</p> <p>It was like a temporary state of insanity. <br /> If you have ever been in love, the feeling of temporary insanity was like that.</p> <figure class="sidefigureleft"> <br /> <img class="postimagesmall" src="/writing/images/2018-02-26-depression/bacon_untitled_crouching_figures.jpg" alt="Bacon" /> <figcaption>Francis Bacon, Two Figures At a Window</figcaption> </figure> <figure class="sidefigureright"> <br /> <img class="postimagesmall" src="/writing/images/2018-02-26-depression/bacon_figure_in_sea.jpg" style="width:80%;" alt="Bacon" /> <figcaption>Francis Bacon, Figure In Sea</figcaption> </figure> <p>It was like sleep was my only goal, my only escape, every day. <br />I just wanted to close my eyes. 
I just wanted to lie down still in a dark silent room.</p> <p>It was like being up high and struggling for breath, or underwater running out of air. <br /> I thought about thinking about suicide - it became mundane and tiresome.</p> <p>It was like I could no longer see a future, hated the present, was haunted by the past. <br />I wanted nothing, was incapable of wanting to strive for anything.</p> <p><a href="https://youtu.be/IHnHpX77oKs">It was like life was just a thing that I did</a>. <br /> And that thing had lost its appeal.</p> <p>It was like there was some horrible droning static always there. <br /> The music that once brought me joy just added to that static.</p> <p>It was like I was some Dickensian street urchin, scraping by on every piece of happiness. <br />Every free meal I got felt like an odd victory that I devoured.</p> <p>It was like life had become some modern piece of art, inscrutable and cold, nonsensical. <br />Life never felt more absurd, more grotesque, more pointless <sup id="fnref:Absurd"><a href="#fn:Absurd" class="footnote">7</a></sup>.</p> <p>It was like all my anxieties and insecurities grew to a mammoth size and crushed me. <br /> I am and was hard working, fit, smart, gregarious. And still I felt like a worthless worm.</p> <p>It was like a dream where I walked around in a snow storm wearing only t-shirt and shorts, in a big dead city surrounded by towering gray skyscrapers. <br />And somehow this is supposed to make sense.</p> <p>It was like no matter how hard I tried, I could not feel good. <br /> Often the best I could do was numb the pain. My failure to improve made me feel worse. <br /> I meditated, exercised, socialized, went for long walks, read, watched movies. <br /> All of it helped, but none of it made the depression end<sup id="fnref:Self-Care"><a href="#fn:Self-Care" class="footnote">8</a></sup>. It just did not make sense, being so helpless.</p> <p>That’s depression. <br /> Except, it did make sense. 
<br /> I was afflicted with one of the most common but least understood of sicknesses <sup id="fnref:Common"><a href="#fn:Common" class="footnote">9</a></sup>.</p> <figure class="sidefigureleft"> <br /> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/oswaldo_el_grito_1.jpeg" style="width:73%;" alt="Bacon" /> <figcaption>Oswaldo Guayasamín, El Grito 1</figcaption> </figure> <figure class="sidefigureright"> <br /> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/oswaldo_el_grito_3.jpg" alt="Bacon" /> <figcaption>Oswaldo Guayasamín, El Grito 3</figcaption> </figure> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-02-26-depression/oswaldo_waiting.jpg"><img class="postimage" src="/writing/images/2018-02-26-depression/oswaldo_waiting.jpg" alt="History" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Oswaldo Guayasamín, Waiting</p> </figcaption></figure> <figure class="sidefigureleft"> <br /> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/picasso_weeping_woman.jpg" alt="Bacon" /> <figcaption>Pablo Picasso, The Weeping Woman</figcaption> </figure> <figure class="sidefigureright"> <br /> <img class="postimagesmaller" src="/writing/images/2018-02-26-depression/picasso_crossed_Arms.jpg" alt="Bacon" /> <figcaption>Pablo Picasso, Woman with Folded Arms</figcaption> </figure> <p>It did make sense, just not to my sick mind.</p> <p>It did make sense that I needed treatment and when I got it I got better. <br /> Most don’t<sup id="fnref:Treatment"><a href="#fn:Treatment" class="footnote">10</a></sup>.</p> <p>It did make sense that I spoke of it infrequently. 
<br /> I knew it was a sickness, yet I still felt ashamed, like I had allowed myself to fall to a stupid first world ‘illness’ <sup id="fnref:FirstWorld"><a href="#fn:FirstWorld" class="footnote">11</a></sup>.</p> <p>It did make sense that I could not think clearly, could not make decisions easily. <br /> I was lucky to have had others to help me seek treatment and not to give up on my aspirations.</p> <p>It did make sense that I got sick. <br /> I had stress, burnout, family history <sup id="fnref:Family"><a href="#fn:Family" class="footnote">12</a></sup>, prior episodes <sup id="fnref:Recurrence"><a href="#fn:Recurrence" class="footnote">13</a></sup>, failures, mistakes, changes. <br /> So much to explain it<sup id="fnref:NoReason"><a href="#fn:NoReason" class="footnote">14</a></sup>. <br /> And still I doubted I was truly depressed, and delayed getting treatment.</p> <p>That’s depression. <br /> Except, that <strong>was</strong> my depression. <br /> I got treatment, rest, support, and I am no longer sick.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-02-26-depression/tarkovsky_candle.png"><img class="postimage" src="/writing/images/2018-02-26-depression/tarkovsky_candle.png" alt="History" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Andrei Tarkovsky, Nostalghia (frame)</p> </figcaption></figure> <p>Having healed, I relish just being alive, just being myself.</p> <p>Having healed, my nightmares mostly revolve around having a relapse.</p> <p>Having healed, my energy and enthusiasm feel boundless.</p> <p>Having healed, I am still taking antidepressants, still seeing a therapist, doing all I can to remain healthy.</p> <p>Having healed, the sounds of music often make me indescribably ecstatic.</p> <p>Having healed, I understand depression better, and hate and fear and loathe it like few other things<sup id="fnref:Romanticizing"><a href="#fn:Romanticizing" class="footnote">15</a></sup>.</p> 
<p>Having healed, the anxieties and uncertainties and insecurities that so haunted me before barely bother me.</p> <p>Having healed, what I remember most vividly is not the suicidal thoughts, the inability to focus, the anxiety, the shame; what I remember most is lying in my room at 8PM, and feeling like there was nothing else at all that could make me feel better - that sensation of nothing else but this quiet, this stillness, this darkness.</p> <p>Having healed, I write this, reflecting on that period of surreal sickness. <br /> I see the need to de-stigmatize and explain my experience far more than I did before<sup id="fnref:Stigma"><a href="#fn:Stigma" class="footnote">16</a></sup>.</p> <p>So don’t feel bad for me. <br /> I seriously feel superhuman, having healed of this sickness. <br /> Just, try to understand.</p> <hr /> <p>While reading this, if you felt you related and may be going through depression, <strong>please</strong> seek consultation with a mental health professional and talk to your friends and family, or at least complete a short <a href="https://depression.org.nz/is-it-depression-anxiety/self-test/">self-test</a> to evaluate whether talking to others is likely warranted. If you are currently struggling with depression, <a href="https://www.everydayhealth.com/depression/guide/resources/">this is a list of resources that might help</a> and if you are in a truly bad place please be aware of the <a href="https://suicidepreventionlifeline.org/">suicide prevention lifeline</a>. When I was depressed it felt like it was never going to end, but it did, and I fully attribute my recovery to seeking treatment and support. Trust me, it may feel like it will never end, but it will.</p> <hr /> <div class="footnotes"> <ol> <li id="fn:Clinical"> <p>That is, <a href="https://www.webmd.com/depression/guide/major-depression#1">clinical</a> depression, which I was diagnosed with. 
It is a <a href="https://www.ncbi.nlm.nih.gov/books/NBK64063/">diagnosable condition</a>: “<br />Diagnostic criteria:<br />Depressed mood and/or loss of interest or pleasure in life activities for at least 2 weeks and at least five of the following symptoms that cause clinically significant impairment in social, work, or other important areas of functioning almost every day<br />1.Depressed mood most of the day.<br />2.Diminished interest or pleasure in all or most activities.<br />3.Significant unintentional weight loss or gain.<br />4.Insomnia or sleeping too much.<br />5.Agitation or psychomotor retardation noticed by others.<br />6.Fatigue or loss of energy.<br />7.Feelings of worthlessness or excessive guilt.<br />8.Diminished ability to think or concentrate, or indecisiveness.<br />9.Recurrent thoughts of death (APA, 2000, p. 356).” <a href="#fnref:Clinical" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Symptoms"> <p>And these were just my symptoms. <br />Again, <a href="https://www.webmd.com/depression/guide/detecting-depression#1">the list goes on</a>: <br />Fatigue, <br />Feelings of guilt, worthlessness, and helplessness, <br />Pessimism and hopelessness, <br />Insomnia, early-morning wakefulness, or sleeping too much, <br />Irritability, <br />Restlessness, <br />Loss of interest in things once pleasurable, including sex, <br />Overeating, or appetite loss, <br />Aches, pains, headaches, or cramps that won’t go away, <br />Digestive problems that don’t get better, even with treatment, <br />Persistent sad, anxious, or “empty” feelings, <br />Suicidal thoughts or attempts. <a href="#fnref:Symptoms" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Months"> <p>This is important to understand. 
<a href="http://www.mydr.com.au/mental-health/depression-q-and-a">Depression is <strong>not</strong> a temporary mood</a>: “It is natural to temporarily feel ‘down in the dumps’ from time to time, especially if you are going through an upheaval, loss or stressful situation. Some people refer to this as ‘feeling depressed’. However, if these feelings are intense and persist over weeks or months and if they stop you enjoying or even doing your normal activities, it’s likely that you have depression. Depression is a serious illness that can have a great impact on your everyday life. It’s not something you can normally ‘just snap out of’.” <a href="#fnref:Months" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:BrainStuff"> <p>The science is <a href="https://www.scientificamerican.com/article/is-depression-just-bad-chemistry/">still unclear</a> on the precise physical causes for clinical depression, and indeed it is most likely caused <a href="https://www.thecut.com/2018/01/antidepressants-and-the-chemical-imbalance-of-depression.html">by a combination of factors</a>: “it’s accurate that depression is not just a matter of a “chemical imbalance” in the brain, but instead is a combination of internal and external factors, from a person’s genes to their environment… It’s true that some people don’t benefit from antidepressants at all. Others can’t do without them. They aren’t some magic bullet cure-all, but neither are they some sinister moneymaking scheme perpetuated by shady pharmaceutical companies and uncaring professionals.”. Still, as someone who was baffled by my healthy lifestyle and rest not curing me and as someone who greatly benefited from antidepressants, my personal experience makes me feel chemical imbalance is part of the equation. <a href="#fnref:BrainStuff" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Medication"> <p>Antidepressants often get a bad rep, but they (Cymbalta in my case) seem to have been truly effective for me. 
Still, <a href="https://www.webmd.com/depression/features/are-antidepressants-effective#1">they are not the solution for everyone</a>: “But a report recently published in The Journal of the American Medical Association showed that the drugs work best for very severe cases of depression and have little or no benefit over placebo (inactive pills) in less serious cases.” Certainly <a href="https://www.goodtherapy.org/blog/can-depression-be-cured-without-medication-1117144">mild to moderate depression can be healed without medication</a>, and in fact that has worked for me in the past, but in a severe case such as mine it should really be considered. <a href="#fnref:Medication" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:SideEffect"> <p>Although it is weird to consider thoughts a side effect of illness, <a href="http://www.dbsalliance.org/site/PageServer?pagename=education_statistics_depression">it’s true</a>: “Depression is the cause of over two-thirds of the 30,000 reported suicides in the U.S. each year. (White House Conference on Mental Health, 1999)” <a href="#fnref:SideEffect" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Absurd"> <p>I had been prone to existential dread in the past, but the sheer nauseating sense of <a href="https://en.wikipedia.org/wiki/Absurdism">The Absurd</a> was abhorrent; I felt exactly like <a href="https://en.wikipedia.org/wiki/The_Myth_of_Sisyphus">Sisyphus</a>: “life is meaningless and nonsensical, but humans strive constantly for meaning and sense in it… Once stripped of its common romanticism, the world is a foreign, strange and inhuman place.” <a href="#fnref:Absurd" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Self-Care"> <p>I got a lot of well-intentioned but unhelpful advice from people, ranging from being checked for some sort of vitamin deficiency to just trying to chill. 
I was proactive in trying to get better and I think a lot of things contributed to my recovery, but at the same time it is important to understand that my condition was a sickness. Just as with a cold or a broken bone, no amount of effort or rest or positive thinking on my part would make me feel better right away; time was necessary, and medication helped. <a href="#fnref:Self-Care" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Common"> <p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2169519/">Seriously, it is one of the most common forms of sickness there is</a>: “Major Depressive Disorder is one of the most common forms of psychopathology, one that will affect approximately one in six men and one in four women in their lifetimes.” That <a href="http://www.dbsalliance.org/site/PageServer?pagename=education_statistics_depression">means that</a> “Major depressive disorder affects approximately 14.8 million American adults, or about 6.7 percent of the U.S. population age 18 and older, in a given year. (Archives of General Psychiatry, 2005 Jun; 62(6): 617-27)”. <a href="#fnref:Common" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Treatment"> <p><a href="http://time.com/3615353/depression-seeking-treatment/">Most who really need treatment don’t get it</a>: <br />“But what was most concerning to study co-author Laura Pratt, an epidemiologist at the NCHS, was that 65% of people with severe symptoms of depression were not getting help from a mental health professional.” This, despite <a href="http://www.dbsalliance.org/site/PageServer?pagename=education_statistics_depression">it being true that</a> “Up to 80% of those treated for depression show an improvement in their symptoms generally within four to six weeks of beginning medication, psychotherapy, attending support groups or a combination of these treatments. (National Institute of Health, 1998). 
Despite its high treatment success rate, nearly two out of three people suffering with depression do not actively seek nor receive proper treatment. (DBSA, 1996)”. <a href="#fnref:Treatment" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:FirstWorld"> <p>Which, by the way, <a href="https://www.washingtonpost.com/news/worldviews/wp/2013/11/07/a-stunning-map-of-depression-rates-around-the-world/?utm_term=.0a6535d7a636">is inaccurate</a>. <a href="#fnref:FirstWorld" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Family"> <p><a href="https://www.healthline.com/health/depression/genetic#genetics">Family history</a>: “Research has also shown that people with parents or siblings who have depression are up to three times more likely to have the condition.” <a href="#fnref:Family" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Recurrence"> <p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2169519/">Recurrence</a>: “[Major depressive disorder] is usually highly recurrent, with at least 50% of those who recover from first episode of depression having one or more additional episodes in their lifetime, and approximately 80% of those with a history of two episodes having another recurrence.” <a href="#fnref:Recurrence" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:NoReason"> <p>Granted, depression can also strike <a href="https://www.everydayhealth.com/hs/major-depression/why-am-i-depressed/">with no obvious life events to cause it</a>. Indeed, whatever was going on in my life was not nearly bad enough to justify the pain I was feeling. <a href="#fnref:NoReason" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Romanticizing"> <p>There is a troubling tendency to <a href="https://www.theodysseyonline.com/stop-romanticizing-depression">romanticize</a> depression, <a href="https://www.theatlantic.com/health/archive/2013/10/social-media-is-redefining-depression/280818/">especially on social media</a>. 
It is disturbing, misguided, and harmful. Recently there have also been <a href="https://www.theodysseyonline.com/13-reasons-why-unacceptable">popular portrayals of mental illness</a> that got it completely wrong, which is even more odious. <a href="#fnref:Romanticizing" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:Stigma"> <p>And it’s not just a hunch either - from <a href="http://journals.sagepub.com/stoken/rbtfl/dDpyhM2zRi.Fg/full">“The Impact of Mental Illness Stigma on Seeking and Participating in Mental Health Care”</a>: “Mental health literacy seems to have a promising effect on care seeking. Individuals who better recognize their mental illness and its manifestations, as well as treatment options to address its varied impressions, might better avail themselves of those options (Jorm, 2012).” <a href="#fnref:Stigma" class="reversefootnote">&#8617;</a></p> </li> </ol> </div> <p><a href="/writing/life/conveying-depression/">The Sickness That Is Depression</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on February 28, 2018.</p> <![CDATA[Habits and Tools, Old and New]]> /writing/life/habits 2018-01-26T15:19:34-08:00 2018-01-26T15:19:34-08:00 www.andreykurenkov.com contact@andreykurenkov.com <h2 id="the-prelude">The Prelude</h2> <blockquote> <p>“We are what we repeatedly do. Excellence, then, is not an act, but a habit.” <br /> -Aristotle (except actually <a href="http://blogs.umb.edu/quoteunquote/2012/05/08/its-a-much-more-effective-quotation-to-attribute-it-to-aristotle-rather-than-to-will-durant/">Will Durant</a>)</p> </blockquote> <p>Common wisdom, right? Yet, somehow it took me until recently to seriously take stock of my time management and habits. In high school and undergrad, my time management could be boiled down to “do the next thing”. 
I more or less kept adding work until it was just about too much, and then did whatever needed doing to keep up.</p> <p>Turns out, that’s not a great way to go about things. After graduating, I had a glut of free time on my hands and little idea of what to do with it all. So I started getting more methodical by consistently making time for writing, exercising, and all the fun splendor of life. And I found various tools - apps, subscriptions, things - to make that much easier. Now, a few years later, I no longer have a glut of free time but still want to do all this stuff (blog writing, being healthy, etc). And so it finally hit me that I need to double down on this habits and routine thing; my no calendar, no todo list, no consistency lifestyle will not work anymore.</p> <p>So, despite not being a New Year’s resolutions kind of person, the beginning of this year was marked for me with a whole lot of reflection about what habits I wanted to commit to going forward, and what tools I would use to fulfill that commitment. This, on top of several years of slowly building a set of habits and tools I had not used prior to graduating undergrad. So, here you are, reading my reflection on all these wonderful habits and tools. If that sounds boring, stop here, but if not - I truly do think these things have made a positive impact in my life, and recommend you consider them for your own.</p> <h2 id="the-habits">The Habits</h2> <h3 id="sleep">Sleep</h3> <h4 id="old-get-enough-sleep">Old: get enough sleep</h4> <p>When I’ve described my often surprising workloads, people have often asked me ‘how much do you sleep’ or commented that I must sleep very little. No. Going back many years, to at least the sophomore year of college, I recognized the importance of sleep and the stupidity of all nighters. 
This may have translated to averaging only 7 or 6 hours most weeks - but average 7 or 6 hours I did.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/sleep_duration.png"><img class="postimagesmall" style="width:40%;" src="/writing/images/2018-01-26-habits/sleep_duration.png" alt="Sleep Duration" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Not so consistent… but on average not bad.</p> </figcaption></figure> <h4 id="new-wake-up-consistently-night-ritual">New: wake up consistently, night ritual</h4> <p>This is a tough one. Since the end of high school, I have never managed to wake up at a consistent time. Not even when I started working as a software engineer - my initial enthusiasm for showing up early quickly faded and I went back to winging it based on how I felt on the day. Turns out, not waking up at a consistent time makes it hard to have a consistent morning routine, and a morning routine is pretty important for having a routine at all. And, it makes it harder to wake up at all. So, time for fancy alarm clocks, lights that turn on at the same time as the alarm, all that.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/sleep_from_to.png"><img class="postimagesmall" style="width:40%;" src="/writing/images/2018-01-26-habits/sleep_from_to.png" alt="Sleep From To" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Pretty crazy, right?</p> </figcaption></figure> <p>But, more importantly, I have decided to have better sleep. Specifically, to fall asleep more easily and calmly. And so, I decided to stick to the whole array of recommended pre-sleep activities - dim the lights, don’t look at screens, drink special sleep-time tea with honey, read. And it’s great! 
Committing to spending the last half hour prior to sleep in this fashion has made me feel far more tranquil as I go to sleep than I used to.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/night_ritual.jpg"><img class="postimagesmall" style="width:40%;" src="/writing/images/2018-01-26-habits/night_ritual.jpg" alt="Sleep ritual" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Highly recommended</p> </figcaption></figure> <h3 id="excercise">Exercise</h3> <h4 id="old-boxingfitstar">Old: Boxing, Fitstar</h4> <p>You know what’s great? Fitness classes. Seriously, I’ve not once exercised as rigorously by myself as when led by a good instructor. Practically every time I went to a boxing gym I felt that I was barely able to keep up. So, I continue taking kickboxing classes to this day and hope to not stop.</p> <p>You know what else is great? Not having to go to the gym to exercise. And you don’t! All one needs to exercise is a hard floor (though a mat does help). Going back even to my undergrad days, I was often unable to make it to the gym. So I just exercised at home, and kept in quite good shape doing so.</p> <p>But what’s better than either of these? The best of both! Last year I realized apps like FitStar (and Nike Training Club, and Skimble) could be used to exercise at home while still pushing me to my limit like trainers do. Of course, they are not quite as effective, but still better than exercising driven by my motivation alone.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/fitstar.png"><img class="postimagesmall" style="width:40%;" src="/writing/images/2018-01-26-habits/fitstar.png" alt="Sleep Duration" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Fitstar. 
It works exactly as you</p> </figcaption></figure> <h4 id="new-morning-kick-off">New: Morning Kick Off</h4> <p>All too often, work would get heavy, I would have to pick my battles, and I’d miss the gym for days, even weeks. That does not feel good, does not seem good, and is not good. So my new idea? Exercise for 10-15 minutes right upon waking up. Specifically, do a 10-15 minute Fitstar exercise. Not only does that keep me from feeling like I’ve gone for weeks without exercising, but it also makes it easier for me to wake up and get going. Not easy to keep up, but worth the effort.</p> <h3 id="diet">Diet</h3> <h4 id="old-soylent">Old: Soylent</h4> <p>I already <a href="http://www.andreykurenkov.com/writing/art/why-i-consume-soylent/">made my case for Soylent</a> (TL;DR - convenient, healthy, affordable, futuristic, makes real food taste even better), and years later the case still holds up. More than holds up, actually - with the introduction of Soylent 2.0 and Soylent Coffee and Cocoa, the convenience and flavor have both seen a significant boost in the past few years. Most people I tell this to still scoff, but whenever I am so inconvenienced as to have to eat real food for lunch I am reminded of how much time it takes just to get food and how sluggish I feel afterwards.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/soylent.jpg"><img class="postimageactual" src="/writing/images/2018-01-26-habits/soylent.jpg" alt="Soylent" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>A photo from years ago. I still have about this many bottles at home at the start of every month.</p> </figcaption></figure> <h4 id="new-meal-squares">New: Meal squares</h4> <p>But, let’s face it, drinking two bottles of nutritious goo can get a bit repetitious. So, I was most excited by my recent discovery of MealSquares. 
These little nutrition bricks are dry, dense, and chalky - but also contain bits of chocolate, and go down fantastically when paired with a drink such as tea (or Soylent!).</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/food.jpg"><img class="postimageactual" src="/writing/images/2018-01-26-habits/food.jpg" alt="Food" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>My Current Lunch</p> </figcaption></figure> <h3 id="mental-health">Mental Health</h3> <h4 id="old-meditation">Old: Meditation</h4> <p>This too, <a href="http://www.andreykurenkov.com/writing/life/some-thoughts-on-meditation/">I have written about</a>. More and more, I am convinced that meditation will gradually become as commonly acknowledged as a crucial component of a healthy life as physical exercise. Nowadays it’s easier than ever to start - Calm, Headspace, and many other apps tout their ability to teach you the magical skill of meditation. I used both Calm and Headspace early on, and do recommend them for getting started. But I quickly found their idea of ‘listen to lectures about life’ to be pretty far removed from actual meditation, and now largely rely on the far simpler Insight Timer.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/meditation.png"><img class="postimagesmaller" src="/writing/images/2018-01-26-habits/meditation.png" alt="Meditation" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>I have not been the best at this… but trying to get better.</p> </figcaption></figure> <h4 id="new-gratitude-journaling">New: Gratitude Journaling</h4> <p>I started keeping a journal more than a year ago now. Not for mental health reasons, just for… fun. 
Something I found when catching up on this journaling (sometimes whole weeks at a time, sometimes even more) is that trying to recall some fun event from a while ago almost made me feel like how I felt when it actually happened. Furthermore, consistently writing in this journal made me appreciate the mundane little milestones of life just a little more.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/journal.png"><img class="postimagesmaller" src="/writing/images/2018-01-26-habits/journal.png" alt="Journal" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Just about the year mark, these started showing up. I use the Journey app, which I recommend, though there are many good options.</p> </figcaption></figure> <p>So I was not surprised to find out there was another surprisingly simple scientifically backed way of maintaining mental health - <a href="https://en.wikipedia.org/wiki/Gratitude_journal">gratitude journaling</a>. It used to be this would seem too touchy-feely for me, but no more. My plan, given my whole experience with journaling, is to stick to the weekly cadence by writing down some things I am grateful for from the past week. Too early to tell if I’ll stick to this, but I am eager to try it out for this next year.</p> <h3 id="productivity">Productivity</h3> <h4 id="old-email-cleaning-calendar-onenote-habitbull">Old: Email Cleaning, Calendar, OneNote, HabitBull</h4> <p>This stuff is boring but useful, so let’s keep it brief:</p> <ul> <li>I try to keep my email inbox nearly empty, so every email still not deleted or categorized to a folder is an ongoing TODO item. Having an empty inbox is rare, and feels great. 
I’ve tried to use Inbox’s whole email snoozing stuff off and on, but find it’s overkill most of the time.</li> <li>I used to be terrible about keeping a calendar, but forgetting some significant things last year made me come around - I now track my meetings, classes, everything.</li> <li>I’ve tried different approaches to note taking, idea tracking, info dumping, all that. OneNote’s is the one I prefer by far. The idea of having a notebook with various sections and pages within each section just works wonderfully for a great many applications.</li> </ul> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/onenote.png"><img class="postimagesmall" src="/writing/images/2018-01-26-habits/onenote.png" alt="OneNote" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>All the stuff I have been meaning to get around to writing for years, and have yet to write.</p> </figcaption></figure> <ul> <li>I first started thinking about my habits years ago, and figured there must be a to-do app that captures the entirely intuitive ‘secret’ (<a href="https://lifehacker.com/281626/jerry-seinfelds-productivity-secret">often attributed to Jerry Seinfeld</a>) that having a streak to keep going is a good motivator. Today that is indeed the case, there are dozens of apps for this, but a few years ago HabitBull (now called Habit Tracker) was one of very few. I started using it years ago and still use it today.</li> </ul> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/habitbull.png"><img class="postimagesmaller" src="/writing/images/2018-01-26-habits/habitbull.png" alt="HabitBull" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>A good week!</p> </figcaption></figure> <h4 id="new-todoist-rescuetime">New: todoist, rescuetime</h4> <p>Despite having this email and OneNote and Calendar system, I was still not as productive as I wanted. 
I found that I never managed to make time for stuff like writing for this blog, and often forgot smaller chores until far too late. So this year I had a new idea: spend 5-10 minutes in the morning reflecting on my goals for the day, and jot them down in a todo app. I had always found todo apps pretty useless, as I inevitably got behind and abandoned the endeavour. So this time, I plan to stick to keeping the todo items on at most a day-week timeline, and striving for zero tasks just as with my email inbox. This has already paid off in just a few weeks, with me being better able to keep track of various small chores as well as get around to things like writing the very text you are reading.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/todoist.png"><img class="postimagesmaller" src="/writing/images/2018-01-26-habits/todoist.png" alt="Todoist" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>My TODO app, Todoist. Works like a charm.</p> </figcaption></figure> <p>As you may have noticed, this post has quite a few metrics and graphs. Partly this is just for fun, but I do also think that the knowledge that these metrics and graphs are out there serves as extra motivation to follow through on my ambitions. Inspired somewhat by <a href="http://karpathy.github.io/2014/08/03/quantifying-productivity/">Andrej Karpathy</a>, I decided to take this to the next level this year with <a href="https://www.rescuetime.com/">rescuetime</a>. Basically, it logs every single thing I do on my phone and computer, and tells me how much time I spend doing things that are productive (like this) or unproductive (like browsing reddit). 
I honestly doubt it being there will change my behavior much, but then again I have felt that it has gotten me to visit distracting websites less already so I may be wrong there.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2018-01-26-habits/rescuetime1.png"><img class="postimagesmall" src="/writing/images/2018-01-26-habits/rescuetime1.png" alt="Rescutime" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>My first week with this, tracked.</p> </figcaption></figure> <h2 id="closure">Closure</h2> <p>So there you have it, an infodump on all the habits I have cultivated and plan to cultivate, along with the best tools for said cultivation. Hopefully, if you were so inclined as to read all this, a few of these might prove a fruitful addition to your own life.</p> <p><a href="/writing/life/habits/">Habits and Tools, Old and New</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on January 26, 2018.</p> <![CDATA[DeformNet Grasp Transfer]]> /projects/research/deformnet-grasp-transfer 2017-11-11T00:00:00-08:00 2017-11-11T00:00:00-08:00 www.andreykurenkov.com contact@andreykurenkov.com <p>A project that did not ultimately pan out quite as successfully as one might have hoped - but that’s research! Getting to present the work at CoRL was still great, though.</p> <p><a href="/projects/research/deformnet-grasp-transfer/">DeformNet Grasp Transfer</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on November 11, 2017.</p> <![CDATA[Some Thoughts on Meditation]]> /writing/life/some-thoughts-on-meditation 2017-10-22T16:19:34-07:00 2017-10-22T16:19:34-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>I’ve had a rough few months. Work, work, work, a failed research project, broken laptops, moving, PhD applications, financial mishaps… it’s been a stressful time. Bummer. 
But, it’s no big deal, this too shall pass, etc - the bummerness of the situation is not the point. The point is that as has happened before and will happen again, this current tired predicament has given me motivation to take up something I have been meaning to do for a while: writing about meditation.</p> <p>Meditation - or at least the entirely secular non new-agey kind - has long intrigued me as a seemingly scientifically backed method for being calmer, healthier, and perhaps wiser. Particularly for a habitually working-too-much-and-taking-life-too-seriously type such as myself, it’d clearly be the height of rationality to go ahead and take up such an objectively good practice. Still, I did not start trying it out until two years ago after first hearing about and then reading a <a href="http://www.10percenthappier.com/mindfulness-meditation-the-basics/">best-seller</a> specifically aimed at selling anti-new-age skeptics like myself on the idea. The book was alright, a sort of shallow crowd pleaser, but it did convince my rational self it was time to do this thing.</p> <p>So now that I’ve done it on and off for a couple of years, I feel compelled to write this: it works. It calms me down, improves my focus, gives me ideas, and is all around a worthwhile practice. Sitting up with eyes closed and attention focused on the breath, something so simple, invariably leads to an unmistakable change in my physical and emotional state. So much so that it has become a regular go-to de-stressing practice akin to reading, exercise, or spending time with good friends. And as the last few months have been rough, it has been particularly helpful, and I have become particularly motivated to speak of its helpfulness.</p> <p>Mind you, none of this is profound. Meditation has not changed me deeply, has not inched me closer to enlightenment, none of that. 
And what I refer to as meditation is just about the most basic and straightforward practice that can be called that, nothing at all advanced. And on top of that, much like exercise, I am not even that good about consistently doing it.</p> <p>But, like exercise, whenever I do it I am impressed with how undeniably beneficial it is for me and wish I did it more. It really is a foolproof way of becoming less stressed, especially when the stress is a big heavy pile that’s been building up too long. And it’s so easy, especially in our modern there-is-an-app-for-everything times. I rather liked <a href="https://www.calm.com/">Calm</a> when getting started with it, and <a href="https://www.headspace.com/">Headspace</a> is similarly good (nowadays I use <a href="https://insighttimer.com/">Insight Timer</a>).</p> <p>So. If you’ve read this far, all this obviously goes to say - maybe you should try it too. I am certainly glad I did.</p> <p><a href="/writing/life/some-thoughts-on-meditation/">Some Thoughts on Meditation</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on October 22, 2017.</p> <![CDATA[DeformNet, Or A Tale of Broken Chairs]]> /writing/project/deformnet-or-a-tale-broken-chairs 2017-10-18T16:19:34-07:00 2017-10-18T16:19:34-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>I spent about half of this year so far devoted to my first major research project at Stanford - DeformNet. The gist of the project was to create 3D models of objects based on a single 2D image of the object, by deforming the 3D model of a similar object. It took months to get working properly, but all that work eventually led to a paper - “DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image” - with its very own <a href="https://arxiv.org/abs/1708.04672">Arxiv post</a> and <a href="https://deformnet-site.github.io/DeformNet-website/">project page</a>. But, it also led to much more. 
It led to horrific, bizarre, and delightful failures.</p> <p>All research projects, and projects in general, face a sequence of failures before their eventual success (or demise). Unlike the eventual success, the failures are typically barely catalogued, and certainly not freely shared with the world. But not DeformNet! Due to its nature as computer vision research, and in particular research on deforming 3D models of objects, DeformNet generated uniquely broken and surreal imagery in its many months of development. So much so, that I was inspired to capture and share said imagery with my friends on Facebook. And now, dear reader, I shall share it with you! Without further ado, or any further context or explanation, please enjoy:</p> <figure class="figure"><figcaption class="figure__caption"><p>Beautiful Arrows</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/2016-4-15-a-brief-history-of-game-ai/16797445_1438988406114168_9212772470262986498_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16797445_1438988406114168_9212772470262986498_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Spread 1</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/2016-4-15-a-brief-history-of-game-ai/16807616_1442442285768780_7897771771545166109_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16807616_1442442285768780_7897771771545166109_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Spread 2</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/2016-4-15-a-brief-history-of-game-ai/16830718_1442442282435447_7909518336960080635_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16830718_1442442282435447_7909518336960080635_n.jpg" alt="History" 
/></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Spread 3</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/16831179_1442442279102114_4063186804650971445_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16831179_1442442279102114_4063186804650971445_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Spread 4</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/16832243_1442442312435444_2730660855479305093_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16832243_1442442312435444_2730660855479305093_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Bent Out of Shape</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/2016-4-15-a-brief-history-of-game-ai/16716040_1442443249102017_6471384390721714604_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16716040_1442443249102017_6471384390721714604_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Split</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/16864203_1447351921944483_8878398950718576730_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16864203_1447351921944483_8878398950718576730_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Modernist Chair</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/16806900_1452334068112935_5140427923114788905_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16806900_1452334068112935_5140427923114788905_n.jpg" alt="History" /></a></p> 
</div></figure> <figure class="figure"><figcaption class="figure__caption"><p>The First</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17021757_1455820011097674_3621251967274797044_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17021757_1455820011097674_3621251967274797044_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Perfect Correspondence</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17202800_1459767327369609_4492442434375103005_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17202800_1459767327369609_4492442434375103005_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Color Spiral</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/16939623_1459767330702942_4073595802082156353_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/16939623_1459767330702942_4073595802082156353_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Stretch</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17309979_1467615326584809_4257681561614777357_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17309979_1467615326584809_4257681561614777357_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Chair Growths</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17621968_1483241391688869_9139772162230816003_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17621968_1483241391688869_9139772162230816003_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption 
class="figure__caption"><p>Chair Growths 2</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17622114_1483241395022202_135454706992583524_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17622114_1483241395022202_135454706992583524_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Chair Hallucinations 1</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17435863_1483241398355535_2328838522499450722_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17435863_1483241398355535_2328838522499450722_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Chair Hallucinations 1</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17621744_1483241425022199_7862745114641406764_o.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17621744_1483241425022199_7862745114641406764_o.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Foreshortening</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18058151_1518098891536452_6182864029848832977_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18058151_1518098891536452_6182864029848832977_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Relaxing Beach Chair</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17992160_1518098894869785_3643613128276992654_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17992160_1518098894869785_3643613128276992654_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption 
class="figure__caption"><p>Unfriendly Chair</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18118890_1518098898203118_7556374425933627554_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18118890_1518098898203118_7556374425933627554_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Elephant Chair</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/17992348_1518098921536449_8297748936321495132_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/17992348_1518098921536449_8297748936321495132_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Perfect Correspondence 2</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18194139_1521245274555147_6387873167953340635_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18194139_1521245274555147_6387873167953340635_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Uncomfortable Chair</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18423960_1536082143071460_5562000232042538980_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18423960_1536082143071460_5562000232042538980_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Chill Plane</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18519900_1542003065812701_69998419652910827_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18519900_1542003065812701_69998419652910827_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Alien 
Couch</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18486356_1542003069146034_8832229143760253451_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18486356_1542003069146034_8832229143760253451_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Melted Car</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18527562_1542003072479367_757465147288662663_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18527562_1542003072479367_757465147288662663_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Old Couch</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18581869_1542003169146024_415512911450005262_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18581869_1542003169146024_415512911450005262_n.jpg" alt="History" /></a></p> </div></figure> <figure class="figure"><figcaption class="figure__caption"><p>Success!</p> </figcaption><div class="figure__main"> <p><a href="/writing/images/18556281_1542003282479346_5657773445370981127_n.jpg"><img class="postimageactual" src="/writing/images/2017-10-15-deformnet-or-a-tale-broken-chairs/18556281_1542003282479346_5657773445370981127_n.jpg" alt="History" /></a></p> </div></figure> <p><a href="/writing/project/deformnet-or-a-tale-broken-chairs/">DeformNet, Or A Tale of Broken Chairs</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on October 18, 2017.</p> <![CDATA[KerasJS3D]]> /projects/major_projects/KerasJS3D 2017-08-18T00:00:00-07:00 2017-08-18T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>Not much to say, click on the demo link (hopefully it still works…)! 
A fun little project.</p> <p><a href="/projects/major_projects/KerasJS3D/">KerasJS3D</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on August 18, 2017.</p> <![CDATA[DeformNet]]> /projects/research/deformnet 2017-08-11T00:00:00-07:00 2017-08-11T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>I was not the one to have the original idea, but mostly the one to grind out and iterate a lot to get it working. See the Arxiv post and website for more details.</p> <p><a href="/projects/research/deformnet/">DeformNet</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on August 11, 2017.</p> <![CDATA[Moving On, Looking Back]]> /writing/life/moving-on-looking-back 2017-07-28T16:19:34-07:00 2017-07-28T16:19:34-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>For a while now, I have been living two lives at once: that of a full-time software engineer working at Oracle, and that of a CS Masters grad student at Stanford. In one life, finishing the first release of a brand new product with a small team of esoteric engineers scattered across a few different states. In the other, trying to keep up with classwork and delve deeper into research in AI. Granted, I was officially only a grad student part time, but it certainly did not feel like it.</p> <p>Well, no more. As of two weeks ago, my time at Oracle has come to an end - I am now a full-time grad student at Stanford. Being the sentimental guy that I am, I decided to write up a little blog post to commemorate the occasion - and hopefully this will also mark a return to writing more stuff on this site, as I have not done in more than a year now.</p> <p>It was an odd choice, going to Oracle. I had focused extensively on robotics research and AI in my undergrad, and had a perfect little plan to spend a bit of time working on robotics in industry before most likely going back to grad school. 
Long story short, the robotics interviews did not pan out and I went with the backup option of just doing some data-related software engineering.</p> <p>Though without a hint of relation to AI, the job at Oracle did offer the exciting prospect of working on a prototype of a product rather than small iterative feature development or bug fixing as would be the case in almost all entry level positions. But, within a short period there it became clear to me the progress being made in robotics and AI is far too exciting to not get back into that world. So I applied to Stanford’s HCP program, which allows unusually enterprising young men such as myself to get a CS Master’s degree while working full time.</p> <p>The plan was to stick around Oracle long enough to hopefully see our product through to release, while getting back into research and exploring possibilities for working in AI. And the plan worked! I managed to survive doing both long enough to <a href="http://www.oracle.com/technetwork/server-storage/sun-unified-storage/downloads/systems-manager-zfs-3711217.html">see our product through to a beta release</a>. At the same time, I have become a research assistant at the Stanford AI lab (my name is listed <a href="http://cvgl.stanford.edu/people.html">here</a> and everything).</p> <p>And here I am, now. Research and working with robots has felt like the right thing for me to do since I’ve gotten back to it, so I expect to continue with this particular focus indefinitely. 
Now, there is just the small matter of figuring out exactly what specific contributions I can make in this field… all in good time.</p> <p><a href="/writing/life/moving-on-looking-back/">Moving On, Looking Back</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on July 28, 2017.</p> <![CDATA[OSM]]> /projects/team_projects/oracle-systems-manager 2017-07-14T00:00:00-07:00 2017-07-14T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>As documented in my <a href="http://127.0.0.1:4000/writing/moving-on-looking-back/">more sentimental</a> post on this, my choice to go work at Oracle for a while was a bit odd but ultimately rewarding. Working on a prototype and eventually an entire new product from scratch was a rare opportunity and a rewarding project. We prototyped a microservices driven concept on the side for about nine months with fancy new tools such as Docker and the various popular messaging queues and NoSQL DBs, before being given the green light to go ahead and develop the monitoring and management tool in Java as a proper product. In just about a year, we had finished a polished mostly-bug-free complex product that did a lot of data crunching and visualization. And we did this as a small team of about a dozen engineers, so I got to have broad and significant influence over the development of the backend. 
Even if I eventually left to go back to research (and the project was ultimately scrapped, after beta release), I think fondly of how we started with zero code and built a really solid management tool in just a year.</p> <p><a href="/projects/team_projects/oracle-systems-manager/">OSM</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on July 14, 2017.</p> <![CDATA[ObjectCropBot]]> /projects/hacks/objectcropbot 2017-02-18T00:00:00-08:00 2017-02-18T00:00:00-08:00 www.andreykurenkov.com contact@andreykurenkov.com <p>I had already had experience with DeepMask when starting on this, since my project for Stanford’s AI class was to modify DeepMask to see if it could be used to crop objects with a single click. My conclusion from that was that I was still better off using Facebook’s approach, but my experience with AWS for that project came in handy. Specifically, I reused my previous AWS DeepMask computing solution: paying for an AWS EC2 instance with a GPU, installing the relevant dependencies on it, and making it host a REST server. I had to modify the DeepMask code slightly, but for the most part I just set it up to run on the cloud and used Facebook’s pretrained models.</p> <p>The bulk of the remaining work was in implementing the http://objectcropbot.com/ web demo. I made this with basic HTML/CSS/JS parts, except for the excellent open source visual cropping library Cropper.js. After an all-nighter tweaking the demo to look and feel right, I set it up to be hosted on GitHub and arranged the domain to be what it is after buying it from NameCheap.</p> <p><a href="/projects/hacks/objectcropbot/">ObjectCropBot</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on February 18, 2017.</p> <![CDATA[IMDB Data Viz]]> /projects/hacks/imdb-data-viz 2017-02-18T00:00:00-08:00 2017-02-18T00:00:00-08:00 www.andreykurenkov.com contact@andreykurenkov.com <p>See the linked writeup. 
A fun little project - gotta love classes that let the student decide on what to work on for their project.</p> <p><a href="/projects/hacks/imdb-data-viz/">IMDB Data Viz</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on February 18, 2017.</p> <![CDATA[DeepCrop]]> /projects/major_projects/deepcrop 2016-12-16T01:26:22-08:00 2016-12-16T01:26:22-08:00 www.andreykurenkov.com contact@andreykurenkov.com <p>See the quite in-depth attached documents. In the end the outcome was not that impressive, but it was quite fun to do and a good chance to play with Deep Learning.</p> <p><a href="/projects/major_projects/deepcrop/">DeepCrop</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on December 16, 2016.</p> <![CDATA[IMDB Data Visualizations with D3 + Dimple.js]]> /writing/project/visualizing-imdb-data-with-d3 2016-08-10T16:19:34-07:00 2016-08-10T16:19:34-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p><em>Notes: not optimized for mobile (or much else). Full page version <strong><a href="/writing/files/2016-08-10-visualizing-imdb-data-with-d3/standalone_page.html">here</a></strong>, visualization code <strong><a href="https://github.com/andreykurenkov/imdb-data-viz">here</a></strong>. I don’t get into the technical aspects here, but feel free to take a look.</em></p> <div id="genreChartContainer" class="chartContainer"> <script type="text/javascript"> /*Start on 1915 because before then too few movies are listed to make them a fair comparison to modern times*/ var start_year = 1915; /*End on 2013 due to a strange dive towards zero in 2014 and 2015 I cannot explain or guarantee is not due to flawed data. 
At first I included the dip but received feedback it is best to remove it to avoid confusion, and then removed it.*/ var end_year = 2013; //Get from localhost, perhaps change to github later var data_source = "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/yearly_data.tsv"; var name = "IMDB Yearly Movie And Genre Counts (1915-2013)"; createGenreChart("#genreChartContainer", data_source, name, start_year, end_year); </script> </div> <form class="form" id="genreToggleForm"> <div class="switch-field"> <!-- <div class="switch-title">Display Type</div> --> <input type="radio" id="switch_left" name="switch" value="yes" checked="" /> <label for="switch_left">Counts</label> <input type="radio" id="switch_right" name="switch" value="no" /> <label for="switch_right">Percents</label> </div> </form> <p>And there it is! IMDB data<sup id="fnref:gotten_with"><a href="#fn:gotten_with" class="footnote">1</a></sup> visualized with <a href="https://d3js.org/">D3</a>, or more precisely with the D3-powered <a href="http://dimplejs.org/">Dimple.js</a>. The data is minimally cleaned up by filtering for movies that have at least one vote and associated length information, and info on TV episodes or shows is also not included, but the data is otherwise directly (after parsing) from IMDB. The legend is interactive (try clicking the rectangles!).</p> <p>As you can see this chart visualizes the number of genre movie releases between 1915 and 2013<sup id="fnref:why_years"><a href="#fn:why_years" class="footnote">2</a></sup>, as well as the total number of movies in those years. A single movie may be associated with zero, one, or multiple genres and so the ‘Total Movies’ line corresponds to actual movie counts and every colored-in area represents the number of movies tagged with that genre for that year. 
The clear conclusion is that there has been an explosion in film production from the 90s onward, for which I have some theories<sup id="fnref:theories"><a href="#fn:theories" class="footnote">3</a></sup> but no definitive explanation. Beyond the big takeaway there are a multitude of possible smaller conclusions regarding the relative popularity of genres and movies overall, which was really my intent in making such an open-ended visualization.</p> <p>There is a ton more that can be done with the data. The direction I decided to go with it was to explore various aspects of more recent data rather than more aspects related to change over time. I would love to eventually add controls to view any year range for all the following charts<sup id="fnref:nontrivial"><a href="#fn:nontrivial" class="footnote">4</a></sup>, but they still reveal some interesting aspects about modern movie production and IMDB metrics.</p> <p>An obvious place to start is with looking at how rating data is distributed, and the answer is delightfully normal:</p> <div id="ratingChartContainer" class="chartContainer"> <script type="text/javascript"> createLineChart("#ratingChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/rating_data.tsv", false, "IMDB Average Movie Rating Distribution (2003-2013) ", "rating", false, "Average IMDB User Rating"); </script> </div> <p>Yep, a bell curve-ish<sup id="fnref:bell_curve"><a href="#fn:bell_curve" class="footnote">5</a></sup> kinda shape! Not overly surprising to see that most movies are rated as mediocre/good and the frequency flattens out at either extreme. 
Next, a slightly more fun shape from the length distribution:</p> <div id="lengthChartContainer" class="chartContainer"> <script type="text/javascript"> createLineChart("#lengthChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/length_data.tsv", true, "IMDB Movie Length Distribution (2003-2013)", "length", false, "Length (minutes)", "max"); </script> </div> <p>Ah, what a nice regularly spiky shape<sup id="fnref:fourier"><a href="#fn:fourier" class="footnote">6</a></sup>. It’s logical that most movies hit the 90-minute mark, though it seems likely that simplified data entry also brings about the periodicity here. The chart is a bit of a mess as a line graph, so it makes sense to clean it up by binning the data quite a bit more:</p> <div id="lengthBinChartContainer" class="chartContainer"> <script type="text/javascript"> createHistChart("#lengthBinChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/length_data_hist.tsv", true, "IMDB Movie Length Distribution (2003-2013) ", "length", false, "Length (minutes)"); </script> </div> <p>And there it is, hiding in that data was another sort-of bell curve. Except of course for that first bar - IMDB apparently has a large number of shorter 0-20 minute film entries as well. No doubt short films are part of this, though it’s unclear why there are quite so many. As with many aspects of the data, it could be explored more deeply and filtered more thoroughly to focus on a specific subset of films. But, that’s for another day. 
For now I continued my visualization quest by looking into the vote distribution:</p> <div id="votesChartContainer" class="chartContainer"> <script type="text/javascript"> createLineChart("#votesChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/votes_data.tsv", true, "IMDB Movie Vote Count Distribution (2003-2013) ", "votes", true, "IMDB User Vote Count"); </script> </div> <p>Yes astute reader<sup id="fnref:corny"><a href="#fn:corny" class="footnote">7</a></sup>, that is indeed a log-scale on the x axis. Unsurprisingly, the number of votes for any given film declines exponentially - very few of those thousands of movies in the first graph are blockbusters<sup id="fnref:again"><a href="#fn:again" class="footnote">8</a></sup>. As with the histogram above the continuous data is in fact binned for counting, but in this case there are enough bins that it makes sense to smooth out into a line. Once again the data can also be shown via a histogram with fewer bins:</p> <div id="votesBinChartContainer" class="chartContainer"> <script type="text/javascript"> createHistChart("#votesBinChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/votes_data_hist.tsv", true, "IMDB Movie Vote # Distribution (2003-2013) ", "votes", true, "IMDB User Vote Count"); </script> </div> <p>Lastly, I explored the distribution of budgets within the data <sup id="fnref:budgets"><a href="#fn:budgets" class="footnote">9</a></sup>. I was originally inspired to look into movie data based on <a href="http://flavorwire.com/492985/how-the-death-of-mid-budget-cinema-left-a-generation-of-iconic-filmmakers-mia">an article</a> that discussed the death of mid-budget-cinema, and of course I wanted to look into the data and see the phenomenon myself. 
The result once again demands a log-scale and reveals a certain periodicity:</p> <div id="budgetChartContainer" class="chartContainer"> <script type="text/javascript"> createLineChart("#budgetChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/budget_data.tsv", true, "IMDB Movie Budget Distribution (2003-2013) ", "budget", true, "Budget (USD)", "average"); </script> </div> <p>The data does<sup id="fnref:plural"><a href="#fn:plural" class="footnote">10</a></sup> not seem to back the notion of mid-budget movies dying, since one peak is at about 1m, but then again as said before the data is not particularly carefully filtered. There being a ton of less-than-one-million budget movies certainly explains how such an explosion in movie production might have been possible in the past twenty years. That guess shall hopefully be further explored in future posts, but for now I will finish with a final simplified histogram:</p> <div id="budgetBinChartContainer" class="chartContainer"> <script type="text/javascript"> createHistChart("#budgetBinChartContainer", "/writing/files/2016-08-10-visualizing-imdb-data-with-d3/data/budget_data_hist.tsv", true, "IMDB Movie Budget Distribution (2003-2013) ", "budget", true, "Budget (USD)"); </script> </div> <h2 id="what-i-learned">What I Learned</h2> <p>And now time for everyone’s favorite part of the book report. In truth I prepared the genre chart for an online Udacity class, <a href="https://www.udacity.com/course/data-visualization-and-d3js--ud507">Data Visualization with D3.JS</a>. 
I completed the class as part of Udacity’s Data Analyst Nanodegree, and as with my <a href="http://www.andreykurenkov.com/writing/fun-visualizations-of-stackoverflow/">previous</a> <a href="http://www.andreykurenkov.com/writing/power-of-ipython-pandas-scikilearn/">posts</a> based off projects for the nanodegree I felt that I learned<sup id="fnref:learned"><a href="#fn:learned" class="footnote">11</a></sup> a useful technology and got the chance to complete a fun project with it worthy of cataloguing. I have a few key takeaways from having now completed the project:</p> <ul> <li>It’s way easier to do data exploration and visualization via RStudio or IPython than D3. Perhaps this is not true for others, but I was surprised by how low level and unstreamlined D3 is for typical visualization tasks. Of course this is part of its power and the reason that higher-abstraction libraries like Dimple.js got built, but on balance I still felt that using JavaScript, HTML, and a browser was not nearly as elegant as RStudio. As someone only mildly experienced with web-dev, the prior classes on R and Pandas+IPython made me feel much more empowered to play with data easily.</li> <li>D3 allows for interactivity, but interactivity is not always needed. This one is rather obvious but why not still spell it out. All but the genre chart here could have comfortably been PNG files (as in my previous visualization posts), and not lost much. Still, allowing for interactivity does open up a considerable number of possibilities and in particular is good for open-ended data visualization without a single particularly concrete point.</li> <li>Aggregation over hundreds of thousands of data points in JS is probably a bad idea. The year-grouped data for the genre graph was originally completely computed in JS when I submitted my project to Udacity. This took painful seconds, which was not helped by my meek laptop. 
Again unsurprising, but I did feel a tinge of annoyance at realizing I would be best off writing a Python script to pre-process my data into multiple CSV files ready for charting without much manipulation.</li> </ul> <h2 id="bloopers">Bloopers</h2> <p>You did not ask for them, and I delivered. Here are a couple of silly moments during the creation of this<sup id="fnref:high_five"><a href="#fn:high_five" class="footnote">12</a></sup>. Hope you enjoyed!</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/oops.png"><img class="postimageactual" src="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/oops.png" alt="oops" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>Making small ordering mistakes unsurprisingly had major glitchy implications…</p> </figcaption></figure> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/great.png"><img class="postimageactual" src="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/great.png" alt="great" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>… which were worse in some cases than others.</p> </figcaption></figure> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/step1.png"><img class="postimageactual" src="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/step1.png" alt="step1" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>For a while I thought to plot the binned data with tiny little cute bins via step interpolation…</p> </figcaption></figure> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/step2.png"><img class="postimageactual" src="/writing/images/2016-08-10-visualizing-imdb-data-with-d3/step2.png" alt="step2" /></a></p> </div><figcaption 
class="figure__caption" style="padding-top:0;"><p>… but evidently changed my mind.</p> </figcaption></figure> <h2 id="notes">Notes</h2> <div class="footnotes"> <ol> <li id="fn:gotten_with"> <p>(gotten with <a href="https://github.com/andreykurenkov/data-movies">this code</a>) <a href="#fnref:gotten_with" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:why_years"> <p>The cutoff years are chosen due to there being very few movies comparable to modern films prior to 1915, and the data possibly being incomplete post 2013 <a href="#fnref:why_years" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:theories"> <p>Primarily that films are cheaper to make due to digital technology and that IMDB tracks more modern movies better <a href="#fnref:theories" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:nontrivial"> <p>This is nontrivial for various boring reasons and I have too many side-projects as it is… <a href="#fnref:nontrivial" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:bell_curve"> <p>Fine, a somewhat offset and wobbly bell curve, but still looks pretty good. <a href="#fnref:bell_curve" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fourier"> <p>Don’t you just want to take the Fourier transform of it? <a href="#fnref:fourier" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:corny"> <p>Too corny? I shan’t apologize, this is my site! <a href="#fnref:corny" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:again"> <p>Again, something warranting a deeper dive someday. 
<a href="#fnref:again" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:budgets"> <p>Many movies did not have associated budget data, but that still left thousands that did <a href="#fnref:budgets" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:plural"> <p>I know, ‘do’ is grammatically correct here, but then natural speech is largely nonsensical so who cares <a href="#fnref:plural" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:learned"> <p>Well, learned a little… <a href="#fnref:learned" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:high_five"> <p>Wow, did you actually read all the text and are still reading it? High five. <a href="#fnref:high_five" class="reversefootnote">&#8617;</a></p> </li> </ol> </div> <p><a href="/writing/project/visualizing-imdb-data-with-d3/">IMDB Data Visualizations with D3 + Dimple.js</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on August 10, 2016.</p> <![CDATA[VividVr]]> /projects/hacks/vividvr 2016-06-17T00:00:00-07:00 2016-06-17T00:00:00-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>The presentation slides or photos below sum it up; basically we tried to glue together some cutting-edge open source projects to get something quite novel. Trying to do so in 24 hours proved tough, and we only got as far as running the separate programs, not the full pipeline. 
Something like this may still be worth exploring, though I don’t think this is the best way to go about it.</p> <p><a href="/projects/hacks/vividvr/">VividVr</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on June 17, 2016.</p> <![CDATA[The Power of IPython Notebook + Pandas + and Scikit-learn]]> /writing/project/power-of-ipython-pandas-scikilearn 2016-06-10T19:19:34-07:00 2016-06-10T19:19:34-07:00 www.andreykurenkov.com contact@andreykurenkov.com <p>IPython Notebook, Numpy, Pandas, MongoDB, R — for the better part of a year now, I have been trying out these technologies as part of Udacity’s <a href="https://www.udacity.com/course/data-analyst-nanodegree--nd002">Data Analyst Nanodegree</a>. My undergrad education barely touched on data visualization or more broadly data science, and so I figured being exposed to the aforementioned technologies would be fun. And fun it has been, with R’s powerful IDE-powered data munging and visualization techniques having been particularly revelatory. I learned enough of R to create <a href="/writing/fun-visualizations-of-stackoverflow/">some complex visualizations</a>, and was impressed by how easy it is to import data into its dataframe representation and then transform and visualize that data. 
I also thought RStudio’s paradigm of continuously intermixed code editing and execution was superior to my habitual workflow of just endlessly cycling between tweaking and executing of Python scripts.</p> <figure class="figure"><div class="figure__main"> <p><a href="/writing/images/2016-06-10-power-of-ipython-pandas-scikitlearn/rstudio.png"><img class="postimageactual" src="/writing/images/2016-06-10-power-of-ipython-pandas-scikitlearn/rstudio.png" alt="History" /></a></p> </div><figcaption class="figure__caption" style="padding-top:0;"><p>The RStudio IDE</p> </figcaption></figure> <p>Still, R is a not-quite-general-purpose-language and I hit upon multiple instances in which simple things were hard to do. In such times, I could not help but miss the powers of Python, a language I have tons of experience with and which is about as general purpose as it gets. Luckily, the courses also covered the equivalent of an R implementation for Python: the Python Data Analysis Library, Pandas. This let me use the features of R I now liked — dataframes, powerful plotting methods, elegant methods for transforming data — with Python’s lovely syntax and libraries I already knew and loved. And soon I got to do just that, using both Pandas and the supremely good Machine Learning package Scikit-learn for the final project of <a href="https://www.udacity.com/course/intro-to-machine-learning--ud120">Udacity’s Intro to Machine Learning Course</a>. Not only that, but I also used IPython Notebook for RStudio-esque intermixed code editing and execution and nice PDF output.</p> <p>I had such a nice experience with this combination of tools that I decided to dedicate a post to it, and what follows is mostly a summation of that experience. 
Reading it should be sufficient to get a general idea of why these tools are useful, whereas a much more detailed introduction and tutorial for Pandas can be found elsewhere (for instance <a href="http://nbviewer.jupyter.org/github/fonnesbeck/pytenn2014_tutorial/blob/master/Part%201.%20Data%20Wrangling%20with%20Pandas.ipynb">here</a>). Incidentally, this whole post was written in IPython Notebook and the source of that <a href="http://www.andreykurenkov.com/writing/files/2016-06-10-power-of-ipython-pandas-scikilearn/post.ipynb">can be found here</a> with the produced HTML <a href="http://www.andreykurenkov.com/writing/files/2016-06-10-power-of-ipython-pandas-scikilearn/post.html">here</a>.</p> <h2 id="data-summarization">Data Summarization</h2> <p>First, a bit about the project. The task was to first explore and clean a given dataset, and then train classification models using it. The dataset contained dozens of features about roughly 150 important employees from the <a href="https://en.wikipedia.org/wiki/Enron_scandal">notoriously corrupt</a> company Enron, who were classified as either a “Person of Interest” or not based on the outcome of investigations into Enron’s corruption. It’s a tiny dataset and not what I would have chosen, but such were the instructions. The data was provided in a bunch of Python dictionaries, and at first I just used a Python script to change it into a CSV and started exploring it in RStudio. But it soon dawned on me that I would be much better off just working entirely in Python, and the following code is taken verbatim from my final project submission.</p> <p>And so, the code. 
Following some imports and a ‘%matplotlib notebook’ magic command to allow plotting within IPython, I loaded the data using pickle and printed out some basic things about it (not yet using Pandas):</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="kn">import</span> <span class="nn">matplotlib</span>
<span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">display</span>
<span class="o">%</span><span class="n">matplotlib</span> <span class="n">notebook</span></code></pre></div> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">enron_data</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s">&quot;./ud120-projects/final_project/final_project_dataset.pkl&quot;</span><span class="p">,</span> <span class="s">&quot;rb&quot;</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">&quot;Number of people: </span><span class="si">%d</span><span class="s">&quot;</span><span class="o">%</span><span class="nb">len</span><span class="p">(</span><span class="n">enron_data</span><span class="o">.</span><span class="n">keys</span><span class="p">()))</span>
<span class="k">print</span><span class="p">(</span><span class="s">&quot;Number of features per person: </span><span class="si">%d</span><span class="s">&quot;</span><span 
class="o">%</span><span class="nb">len</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">enron_data</span><span class="o">.</span><span class="n">values</span><span class="p">())[</span><span class="mi">0</span><span class="p">]))</span>
<span class="k">print</span><span class="p">(</span><span class="s">&quot;Number of POI: </span><span class="si">%d</span><span class="s">&quot;</span><span class="o">%</span><span class="nb">sum</span><span class="p">([</span><span class="mi">1</span> <span class="k">if</span> <span class="n">x</span><span class="p">[</span><span class="s">&#39;poi&#39;</span><span class="p">]</span> <span class="k">else</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">enron_data</span><span class="o">.</span><span class="n">values</span><span class="p">()]))</span></code></pre></div> <pre><code>Number of people: 146
Number of features per person: 21
Number of POI: 18
</code></pre> <p>But working with this set of dictionaries would not be nearly as fast or easy as working with a Pandas dataframe, so I soon converted it to one and went ahead and summarized all the features with a single method call:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="o">.</span><span class="n">from_dict</span><span class="p">(</span><span class="n">enron_data</span><span class="p">)</span>
<span class="k">del</span> <span class="n">df</span><span class="p">[</span><span class="s">&#39;TOTAL&#39;</span><span class="p">]</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">transpose</span><span class="p">()</span>
<span class="n">numeric_df</span> <span class="o">=</span> <span class="n">df</span><span 
class="o">.</span><span class="n">apply</span><span class="p">(</span><span class="n">pd</span><span class="o">.</span><span class="n">to_numeric</span><span class="p">,</span> <span class="n">errors</span><span class="o">=</span><span class="s">&#39;coerce&#39;</span><span class="p">)</span> <span class="k">del</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;email_address&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">describe</span><span class="p">()</span></code></pre></div> <div class="post_table_div"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>bonus</th> <th>deferral_payments</th> <th>deferred_income</th> <th>director_fees</th> <th>exercised_stock_options</th> <th>expenses</th> <th>from_messages</th> <th>from_poi_to_this_person</th> <th>from_this_person_to_poi</th> <th>loan_advances</th> <th>long_term_incentive</th> <th>other</th> <th>poi</th> <th>restricted_stock</th> <th>restricted_stock_deferred</th> <th>salary</th> <th>shared_receipt_with_poi</th> <th>to_messages</th> <th>total_payments</th> <th>total_stock_value</th> </tr> </thead> <tbody> <tr> <th>count</th> <td>81.000000</td> <td>38.000000</td> <td>48.000000</td> <td>16.000000</td> <td>101.000000</td> <td>94.000000</td> <td>86.000000</td> <td>86.000000</td> <td>86.000000</td> <td>3.000000</td> <td>65.000000</td> <td>92.000000</td> <td>145</td> <td>109.000000</td> <td>17.000000</td> <td>94.000000</td> <td>86.000000</td> <td>86.000000</td> <td>1.240000e+02</td> <td>125.000000</td> </tr> <tr> <th>mean</th> <td>1201773.074074</td> <td>841602.526316</td> <td>-581049.812500</td> <td>89822.875000</td> <td>2959559.257426</td> <td>54192.010638</td> <td>608.790698</td> <td>64.895349</td> <td>41.232558</td> <td>27975000.000000</td> <td>746491.200000</td> <td>465276.663043</td> <td>0.124138</td> <td>1147424.091743</td> <td>621892.823529</td> <td>284087.542553</td> 
<td>1176.465116</td> <td>2073.860465</td> <td>2.623421e+06</td> <td>3352073.024000</td> </tr> <tr> <th>std</th> <td>1441679.438330</td> <td>1289322.626180</td> <td>942076.402972</td> <td>41112.700735</td> <td>5499449.598994</td> <td>46108.377454</td> <td>1841.033949</td> <td>86.979244</td> <td>100.073111</td> <td>46382560.030684</td> <td>862917.421568</td> <td>1389719.064851</td> <td>0.330882</td> <td>2249770.356903</td> <td>3845528.349509</td> <td>177131.115377</td> <td>1178.317641</td> <td>2582.700981</td> <td>9.488106e+06</td> <td>6532883.097201</td> </tr> <tr> <th>min</th> <td>70000.000000</td> <td>-102500.000000</td> <td>-3504386.000000</td> <td>3285.000000</td> <td>3285.000000</td> <td>148.000000</td> <td>12.000000</td> <td>0.000000</td> <td>0.000000</td> <td>400000.000000</td> <td>69223.000000</td> <td>2.000000</td> <td>False</td> <td>-2604490.000000</td> <td>-1787380.000000</td> <td>477.000000</td> <td>2.000000</td> <td>57.000000</td> <td>1.480000e+02</td> <td>-44093.000000</td> </tr> <tr> <th>25%</th> <td>425000.000000</td> <td>79644.500000</td> <td>-611209.250000</td> <td>83674.500000</td> <td>506765.000000</td> <td>22479.000000</td> <td>22.750000</td> <td>10.000000</td> <td>1.000000</td> <td>1200000.000000</td> <td>275000.000000</td> <td>1209.000000</td> <td>0</td> <td>252055.000000</td> <td>-329825.000000</td> <td>211802.000000</td> <td>249.750000</td> <td>541.250000</td> <td>3.863802e+05</td> <td>494136.000000</td> </tr> <tr> <th>50%</th> <td>750000.000000</td> <td>221063.500000</td> <td>-151927.000000</td> <td>106164.500000</td> <td>1297049.000000</td> <td>46547.500000</td> <td>41.000000</td> <td>35.000000</td> <td>8.000000</td> <td>2000000.000000</td> <td>422158.000000</td> <td>51984.500000</td> <td>0</td> <td>441096.000000</td> <td>-140264.000000</td> <td>258741.000000</td> <td>740.500000</td> <td>1211.000000</td> <td>1.100246e+06</td> <td>1095040.000000</td> </tr> <tr> <th>75%</th> <td>1200000.000000</td> <td>867211.250000</td> 
<td>-37926.000000</td> <td>112815.000000</td> <td>2542813.000000</td> <td>78408.500000</td> <td>145.500000</td> <td>72.250000</td> <td>24.750000</td> <td>41762500.000000</td> <td>831809.000000</td> <td>357577.250000</td> <td>0</td> <td>985032.000000</td> <td>-72419.000000</td> <td>308606.500000</td> <td>1888.250000</td> <td>2634.750000</td> <td>2.084663e+06</td> <td>2606763.000000</td> </tr> <tr> <th>max</th> <td>8000000.000000</td> <td>6426990.000000</td> <td>-833.000000</td> <td>137864.000000</td> <td>34348384.000000</td> <td>228763.000000</td> <td>14368.000000</td> <td>528.000000</td> <td>609.000000</td> <td>81525000.000000</td> <td>5145434.000000</td> <td>10359729.000000</td> <td>True</td> <td>14761694.000000</td> <td>15456290.000000</td> <td>1111258.000000</td> <td>5521.000000</td> <td>15149.000000</td> <td>1.035598e+08</td> <td>49110078.000000</td> </tr> </tbody> </table> </div> <p>This high level summarization of data is one example of what Pandas can do for you. But the main strength is in how easy it is to manipulate the data and derive new things from it. The project instructed me to first summarize some things about the data, and then handle outliers. The summary indicated a large standard deviation for many of the features, and also a lot of missing values in the data for various features. First I dropped features with almost no non-null values, such as loan_advances and restricted_stock_deferred. 
Then, to investigate whether any features are particularly bad in terms of outliers, I went ahead and computed, for each entry in the data, how many standard deviations each feature value lies from that feature’s mean, and easily got summary statistics for this data as well:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">del</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;loan_advances&#39;</span><span class="p">]</span>
<span class="k">del</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;restricted_stock_deferred&#39;</span><span class="p">]</span>
<span class="k">del</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;director_fees&#39;</span><span class="p">]</span>
<span class="n">std</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">apply</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">x</span><span class="o">.</span><span class="n">mean</span><span class="p">())</span> <span class="o">/</span> <span class="n">x</span><span class="o">.</span><span class="n">std</span><span class="p">())</span>
<span class="n">std</span> <span class="o">=</span> <span class="n">std</span><span class="o">.</span><span class="n">fillna</span><span class="p">(</span><span class="n">std</span><span class="o">.</span><span class="n">mean</span><span class="p">())</span>
<span class="n">std</span><span class="o">.</span><span class="n">describe</span><span class="p">()</span></code></pre></div> <div class="post_table_div"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>bonus</th> <th>deferral_payments</th> <th>deferred_income</th> 
<th>exercised_stock_options</th> <th>expenses</th> <th>from_messages</th> <th>from_poi_to_this_person</th> <th>from_this_person_to_poi</th> <th>long_term_incentive</th> <th>other</th> <th>poi</th> <th>restricted_stock</th> <th>salary</th> <th>shared_receipt_with_poi</th> <th>to_messages</th> <th>total_payments</th> <th>total_stock_value</th> </tr> </thead> <tbody> <tr> <th>count</th> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> <td>145.000000</td> </tr> <tr> <th>mean</th> <td>0.612134</td> <td>0.670659</td> <td>0.690552</td> <td>0.558364</td> <td>0.739307</td> <td>0.487468</td> <td>0.694769</td> <td>0.532234</td> <td>0.670577</td> <td>0.444004</td> <td>0.657200</td> <td>0.525893</td> <td>0.568830</td> <td>0.794256</td> <td>0.648079</td> <td>0.287221</td> <td>0.547885</td> </tr> <tr> <th>std</th> <td>0.587181</td> <td>0.371822</td> <td>0.409188</td> <td>0.689763</td> <td>0.537626</td> <td>0.669599</td> <td>0.549542</td> <td>0.648923</td> <td>0.491393</td> <td>0.711333</td> <td>0.751724</td> <td>0.735294</td> <td>0.659254</td> <td>0.462087</td> <td>0.582615</td> <td>0.884946</td> <td>0.774945</td> </tr> <tr> <th>min</th> <td>0.001230</td> <td>0.001025</td> <td>0.002415</td> <td>0.040311</td> <td>0.005314</td> <td>0.028674</td> <td>0.010294</td> <td>0.032302</td> <td>0.027083</td> <td>0.000058</td> <td>0.375173</td> <td>0.044846</td> <td>0.025148</td> <td>0.037736</td> <td>0.041484</td> <td>0.003077</td> <td>0.014143</td> </tr> <tr> <th>25%</th> <td>0.380270</td> <td>0.670659</td> <td>0.611358</td> <td>0.346078</td> <td>0.510059</td> <td>0.310038</td> <td>0.481671</td> <td>0.342075</td> <td>0.546392</td> <td>0.297679</td> <td>0.375173</td> <td>0.302841</td> <td>0.250755</td> <td>0.605495</td> 
<td>0.455283</td> <td>0.130231</td> <td>0.296228</td> </tr> <tr> <th>50%</th> <td>0.612134</td> <td>0.670659</td> <td>0.690552</td> <td>0.470558</td> <td>0.739307</td> <td>0.324161</td> <td>0.694769</td> <td>0.412024</td> <td>0.670577</td> <td>0.334411</td> <td>0.375173</td> <td>0.417338</td> <td>0.568830</td> <td>0.794256</td> <td>0.648079</td> <td>0.196170</td> <td>0.423551</td> </tr> <tr> <th>75%</th> <td>0.612134</td> <td>0.670659</td> <td>0.690552</td> <td>0.558364</td> <td>0.817162</td> <td>0.487468</td> <td>0.694769</td> <td>0.532234</td> <td>0.670577</td> <td>0.444004</td> <td>0.375173</td> <td>0.525893</td> <td>0.568830</td> <td>0.847365</td> <td>0.648079</td> <td>0.271301</td> <td>0.508700</td> </tr> <tr> <th>max</th> <td>4.715491</td> <td>4.332032</td> <td>3.103078</td> <td>5.707630</td> <td>3.786101</td> <td>7.473631</td> <td>5.324312</td> <td>5.673526</td> <td>5.097756</td> <td>7.119750</td> <td>2.647054</td> <td>6.051404</td> <td>4.669820</td> <td>3.687066</td> <td>5.062584</td> <td>10.638201</td> <td>7.004259</td> </tr> </tbody> </table> </div> <p>This result suggested that most features have large outliers (larger than 3 standard deviations). 
In order to be careful not to remove any useful data, I manually inspected all rows with large outliers to see any values that seem appropriate for removal:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">outliers</span> <span class="o">=</span> <span class="n">std</span><span class="o">.</span><span class="n">apply</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">&gt;</span> <span class="mi">5</span><span class="p">)</span><span class="o">.</span><span class="n">any</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="n">outlier_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">index</span><span class="o">=</span><span class="n">numeric_df</span><span class="p">[</span><span class="n">outliers</span><span class="p">]</span><span class="o">.</span><span class="n">index</span><span class="p">)</span> <span class="k">for</span> <span class="n">col</span> <span class="ow">in</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">columns</span><span class="p">:</span> <span class="n">outlier_df</span><span class="p">[</span><span class="nb">str</span><span class="p">((</span><span class="n">col</span><span class="p">,</span><span class="n">col</span><span class="o">+</span><span class="s">&#39;_std&#39;</span><span class="p">))]</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">numeric_df</span><span class="p">[</span><span class="n">outliers</span><span class="p">][</span><span class="n">col</span><span class="p">],</span><span class="n">std</span><span class="p">[</span><span 
class="n">outliers</span><span class="p">][</span><span class="n">col</span><span class="p">]))</span> <span class="n">display</span><span class="p">(</span><span class="n">outlier_df</span><span class="p">)</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">drop</span><span class="p">(</span><span class="s">&#39;FREVERT MARK A&#39;</span><span class="p">,</span><span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">df</span><span class="o">.</span><span class="n">drop</span><span class="p">(</span><span class="s">&#39;FREVERT MARK A&#39;</span><span class="p">,</span><span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span></code></pre></div> <div class="post_table_div"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>('bonus', 'bonus_std')</th> <th>('deferral_payments', 'deferral_payments_std')</th> <th>('deferred_income', 'deferred_income_std')</th> <th>('exercised_stock_options', 'exercised_stock_options_std')</th> <th>('expenses', 'expenses_std')</th> <th>('from_messages', 'from_messages_std')</th> <th>('from_poi_to_this_person', 'from_poi_to_this_person_std')</th> <th>('from_this_person_to_poi', 'from_this_person_to_poi_std')</th> <th>('long_term_incentive', 'long_term_incentive_std')</th> <th>('other', 'other_std')</th> <th>('poi', 'poi_std')</th> <th>('restricted_stock', 'restricted_stock_std')</th> <th>('salary', 'salary_std')</th> <th>('shared_receipt_with_poi', 'shared_receipt_with_poi_std')</th> <th>('to_messages', 'to_messages_std')</th> <th>('total_payments', 'total_payments_std')</th> <th>('total_stock_value', 'total_stock_value_std')</th> </tr> </thead> <tbody> <tr> <th>DELAINEY DAVID W</th> <td>(3000000.0, 1.24731398542)</td> <td>(nan, 0.67065886001)</td> <td>(nan, 0.690552246623)</td> <td>(2291113.0, 0.121547846815)</td> <td>(86174.0, 0.6936264325)</td> 
<td>(3069.0, 1.3363193564)</td> <td>(66.0, 0.0127001697143)</td> <td>(609.0, 5.67352642171)</td> <td>(1294981.0, 0.635622582522)</td> <td>(1661.0, 0.333603873451)</td> <td>(True, 2.64705431598)</td> <td>(1323148.0, 0.078107486712)</td> <td>(365163.0, 0.457714373186)</td> <td>(2097.0, 0.781228126919)</td> <td>(3093.0, 0.394602217763)</td> <td>(4747979.0, 0.22391802188)</td> <td>(3614261.0, 0.0401335784062)</td> </tr> <tr> <th>FREVERT MARK A</th> <td>(2000000.0, 0.553678511813)</td> <td>(6426990.0, 4.33203246439)</td> <td>(-3367011.0, 2.95725609803)</td> <td>(10433518.0, 1.35903759241)</td> <td>(86987.0, 0.711258803121)</td> <td>(21.0, 0.319272057897)</td> <td>(242.0, 2.03617142019)</td> <td>(6.0, 0.352068179278)</td> <td>(1617011.0, 1.00881008801)</td> <td>(7427621.0, 5.00989337561)</td> <td>(False, 0.375173052658)</td> <td>(4188667.0, 1.3518014845)</td> <td>(1060932.0, 4.38570296241)</td> <td>(2979.0, 1.5297529467)</td> <td>(3275.0, 0.465071080146)</td> <td>(17252530.0, 1.54183664695)</td> <td>(14622185.0, 1.72513602468)</td> </tr> <tr> <th>HIRKO JOSEPH</th> <td>(nan, 0.612134343218)</td> <td>(10259.0, 0.644790923106)</td> <td>(nan, 0.690552246623)</td> <td>(30766064.0, 5.05623412708)</td> <td>(77978.0, 0.515871316129)</td> <td>(nan, 0.487467982744)</td> <td>(nan, 0.694769235346)</td> <td>(nan, 0.532233915598)</td> <td>(nan, 0.670576589457)</td> <td>(2856.0, 0.332743987428)</td> <td>(True, 2.64705431598)</td> <td>(nan, 0.52589323995)</td> <td>(nan, 0.568830375372)</td> <td>(nan, 0.794256482633)</td> <td>(nan, 0.648079292459)</td> <td>(91093.0, 0.266895026444)</td> <td>(30766064.0, 4.19630821004)</td> </tr> <tr> <th>KAMINSKI WINCENTY J</th> <td>(400000.0, 0.556138245963)</td> <td>(nan, 0.67065886001)</td> <td>(nan, 0.690552246623)</td> <td>(850010.0, 0.383592797689)</td> <td>(83585.0, 0.637476115725)</td> <td>(14368.0, 7.47363149225)</td> <td>(41.0, 0.274724723819)</td> <td>(171.0, 1.29672636328)</td> <td>(323466.0, 0.490226746415)</td> <td>(4669.0, 
0.331439407211)</td> <td>(False, 0.375173052658)</td> <td>(126027.0, 0.454000599932)</td> <td>(275101.0, 0.0507338450054)</td> <td>(583.0, 0.503654613618)</td> <td>(4607.0, 0.980810226817)</td> <td>(1086821.0, 0.161950156636)</td> <td>(976037.0, 0.363704047455)</td> </tr> <tr> <th>LAVORATO JOHN J</th> <td>(8000000.0, 4.71549135347)</td> <td>(nan, 0.67065886001)</td> <td>(nan, 0.690552246623)</td> <td>(4158995.0, 0.21810105193)</td> <td>(49537.0, 0.100958023148)</td> <td>(2585.0, 1.07342360688)</td> <td>(528.0, 5.32431220222)</td> <td>(411.0, 3.69497297064)</td> <td>(2035380.0, 1.4936409531)</td> <td>(1552.0, 0.33368230657)</td> <td>(False, 0.375173052658)</td> <td>(1008149.0, 0.0619063591605)</td> <td>(339288.0, 0.311636142127)</td> <td>(3962.0, 2.3639931937)</td> <td>(7259.0, 2.00764222154)</td> <td>(10425757.0, 0.822328102755)</td> <td>(5167144.0, 0.277836132837)</td> </tr> <tr> <th>LAY KENNETH L</th> <td>(7000000.0, 4.02185587986)</td> <td>(202911.0, 0.495369827029)</td> <td>(-300000.0, 0.29833016899)</td> <td>(34348384.0, 5.70763022327)</td> <td>(99832.0, 0.98984158372)</td> <td>(36.0, 0.311124462355)</td> <td>(123.0, 0.668028926971)</td> <td>(16.0, 0.252141237305)</td> <td>(3600000.0, 3.30681561025)</td> <td>(10359729.0, 7.11975001798)</td> <td>(True, 2.64705431598)</td> <td>(14761694.0, 6.05140425399)</td> <td>(1072321.0, 4.44999996622)</td> <td>(2411.0, 1.0477097521)</td> <td>(4273.0, 0.851488248598)</td> <td>(103559793.0, 10.6382007936)</td> <td>(49110078.0, 7.00425896119)</td> </tr> <tr> <th>MARTIN AMANDA K</th> <td>(nan, 0.612134343218)</td> <td>(85430.0, 0.586488215565)</td> <td>(nan, 0.690552246623)</td> <td>(2070306.0, 0.16169859209)</td> <td>(8211.0, 0.997237664333)</td> <td>(230.0, 0.205748893335)</td> <td>(8.0, 0.654125583284)</td> <td>(0.0, 0.412024344462)</td> <td>(5145434.0, 5.09775639018)</td> <td>(2818454.0, 1.6932755666)</td> <td>(False, 0.375173052658)</td> <td>(nan, 0.52589323995)</td> <td>(349487.0, 0.36921495869)</td> <td>(477.0, 
0.593613378808)</td> <td>(1522.0, 0.21367570973)</td> <td>(8407016.0, 0.609562657351)</td> <td>(2070306.0, 0.196202351233)</td> </tr> <tr> <th>SHAPIRO RICHARD S</th> <td>(650000.0, 0.382729377561)</td> <td>(nan, 0.67065886001)</td> <td>(nan, 0.690552246623)</td> <td>(607837.0, 0.427628659031)</td> <td>(137767.0, 1.81257710587)</td> <td>(1215.0, 0.329276547308)</td> <td>(74.0, 0.104676135645)</td> <td>(65.0, 0.237500778364)</td> <td>(nan, 0.670576589457)</td> <td>(705.0, 0.33429178227)</td> <td>(False, 0.375173052658)</td> <td>(379164.0, 0.341483782727)</td> <td>(269076.0, 0.0847481963923)</td> <td>(4527.0, 2.84349038551)</td> <td>(15149.0, 5.06258356331)</td> <td>(1057548.0, 0.165035387918)</td> <td>(987001.0, 0.362025768533)</td> </tr> <tr> <th>WHITE JR THOMAS E</th> <td>(450000.0, 0.521456472283)</td> <td>(nan, 0.67065886001)</td> <td>(nan, 0.690552246623)</td> <td>(1297049.0, 0.302304844785)</td> <td>(81353.0, 0.58906842664)</td> <td>(nan, 0.487467982744)</td> <td>(nan, 0.694769235346)</td> <td>(nan, 0.532233915598)</td> <td>(nan, 0.670576589457)</td> <td>(1085463.0, 0.446267416662)</td> <td>(False, 0.375173052658)</td> <td>(13847074.0, 5.64486498335)</td> <td>(317543.0, 0.188873972681)</td> <td>(nan, 0.794256482633)</td> <td>(nan, 0.648079292459)</td> <td>(1934359.0, 0.072623789327)</td> <td>(15144123.0, 1.80502999986)</td> </tr> </tbody> </table> </div> <p>Looking through these, I found one instance of a valid outlier - Mark A. Frevert (CEO of Enron), and removed him from the dataset.</p> <p>I should emphasize the benefits of doing all this in IPython Notebook. Being able to tweak parts of the code without reexecuting all of it and reloading all the data made iterating on ideas much faster, and iterating on ideas fast is essential for exploratory data analysis and development of machine learned models. 
It’s no accident that the Matlab IDE and RStudio, both tools commonly used in the sciences for data processing and analysis, have essentially the same structure. I did not understand the benefits of IPython Notebook when I was first made to use it for class assignments in college, but now it has finally dawned on me that it fills the same role as those IDEs and became popular because it is similarly well suited for working with data.</p> <h2 id="feature-visualization-engineering-and-selection">Feature Visualization, Engineering and Selection</h2> <p>The project also instructed me to choose a set of features, and to engineer some of my own. In order to get an initial idea of possible promising features and how I could use them to create new features, I computed the correlation of each feature with the Person of Interest classification:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">corr</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">corr</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">&#39;</span><span class="se">\n</span><span class="s">Correlations between features to POI:</span><span class="se">\n</span><span class="s"> &#39;</span> <span class="o">+</span><span class="nb">str</span><span class="p">(</span><span class="n">corr</span><span class="p">[</span><span class="s">&#39;poi&#39;</span><span class="p">]))</span></code></pre></div> <pre><code>Correlations between features to POI:
bonus                       0.306907
deferral_payments          -0.075632
deferred_income            -0.334810
exercised_stock_options     0.513724
expenses                    0.064293
from_messages              -0.076108
from_poi_to_this_person     0.183128
from_this_person_to_poi     0.111313
long_term_incentive         0.264894
other                       0.174291
poi                         1.000000
restricted_stock            0.232410
salary                      0.323374
shared_receipt_with_poi     0.239932
to_messages                 0.061531
total_payments              0.238375
total_stock_value           0.377033
Name: poi, dtype: 
float64 </code></pre> <p>The results indicated that ‘exercised_stock_options’, ‘total_stock_value’, and ‘bonus’ are the most promising features. Just for fun, I went ahead and plotted these features to see if I could visually verify their significance:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">numeric_df</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">column</span><span class="o">=</span><span class="s">&#39;exercised_stock_options&#39;</span><span class="p">,</span><span class="n">by</span><span class="o">=</span><span class="s">&#39;poi&#39;</span><span class="p">,</span><span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span><span class="n">sharex</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span><span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;exercised_stock_options by POI&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">numeric_df</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">column</span><span class="o">=</span><span class="s">&#39;total_stock_value&#39;</span><span class="p">,</span><span class="n">by</span><span class="o">=</span><span class="s">&#39;poi&#39;</span><span class="p">,</span><span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span><span class="n">sharex</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span><span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">plt</span><span 
class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;total_stock_value by POI&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">numeric_df</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">column</span><span class="o">=</span><span class="s">&#39;bonus&#39;</span><span class="p">,</span><span class="n">by</span><span class="o">=</span><span class="s">&#39;poi&#39;</span><span class="p">,</span><span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span><span class="n">sharex</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span><span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;bonus by POI&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <p>As well as one that is not strongly correlated:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">numeric_df</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">column</span><span class="o">=</span><span class="s">&#39;to_messages&#39;</span><span class="p">,</span><span class="n">by</span><span class="o">=</span><span class="s">&#39;poi&#39;</span><span class="p">,</span><span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span><span class="n">sharex</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span><span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span 
class="n">suptitle</span><span class="p">(</span><span class="s">&quot;to_messages by POI&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <p>The correlations and plots above indicated that the stock features (exercised_stock_options, total_stock_value, and restricted_stock) and, to a lesser extent, the payment features (total_payments, salary, bonus, and expenses) are all correlated with being a Person of Interest. I therefore created new features as sums and ratios of these. Pandas made this incredibly easy thanks to vectorized operations; Numpy could do much the same, but I think Pandas’ DataFrame construct makes it especially convenient.</p> <p>It was also easy to fix problems with the data before training any machine learning models. So that the dataset could be used with Scikit-learn for training and evaluation, I replaced null values with the mean of each feature. I also scaled all features to the range 0 to 1, to better work with Support Vector Machines:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="c">#Get rid of label</span> <span class="k">del</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;poi&#39;</span><span class="p">]</span> <span class="n">poi</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="s">&#39;poi&#39;</span><span class="p">]</span> <span class="c">#Create new features</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;stock_sum&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;exercised_stock_options&#39;</span><span class="p">]</span> <span class="o">+</span>\ <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;total_stock_value&#39;</span><span class="p">]</span> <span class="o">+</span>\ <span
class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;restricted_stock&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;stock_ratio&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;exercised_stock_options&#39;</span><span class="p">]</span><span class="o">/</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;total_stock_value&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;money_total&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;salary&#39;</span><span class="p">]</span> <span class="o">+</span>\ <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;bonus&#39;</span><span class="p">]</span> <span class="o">-</span>\ <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;expenses&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;money_ratio&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;bonus&#39;</span><span class="p">]</span><span class="o">/</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;salary&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;email_ratio&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;from_messages&#39;</span><span class="p">]</span><span class="o">/</span><span class="p">(</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;to_messages&#39;</span><span 
class="p">]</span><span class="o">+</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;from_messages&#39;</span><span class="p">])</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;poi_email_ratio_from&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;from_poi_to_this_person&#39;</span><span class="p">]</span><span class="o">/</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;to_messages&#39;</span><span class="p">]</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;poi_email_ratio_to&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;from_this_person_to_poi&#39;</span><span class="p">]</span><span class="o">/</span><span class="n">numeric_df</span><span class="p">[</span><span class="s">&#39;from_messages&#39;</span><span class="p">]</span> <span class="c">#Fill in NA values with the mean of each feature</span> <span class="n">numeric_df</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="o">.</span><span class="n">fillna</span><span class="p">(</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">mean</span><span class="p">())</span> <span class="c">#Scale each feature to the range 0-1</span> <span class="n">numeric_df</span> <span class="o">=</span> <span class="p">(</span><span class="n">numeric_df</span><span class="o">-</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">min</span><span class="p">())</span><span class="o">/</span><span class="p">(</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">max</span><span class="p">()</span><span class="o">-</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">min</span><span
class="p">())</span></code></pre></div> <p>Then, I scored features using Scikit-learn’s SelectKBest to get an ordering of them to test with multiple algorithms afterward. Pandas Dataframes can be used directly with Scikit-learn, which is another great benefit of it:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">from</span> <span class="nn">sklearn.feature_selection</span> <span class="kn">import</span> <span class="n">SelectKBest</span> <span class="n">selector</span> <span class="o">=</span> <span class="n">SelectKBest</span><span class="p">()</span> <span class="n">selector</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">numeric_df</span><span class="p">,</span><span class="n">poi</span><span class="o">.</span><span class="n">tolist</span><span class="p">())</span> <span class="n">scores</span> <span class="o">=</span> <span class="p">{</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">columns</span><span class="p">[</span><span class="n">i</span><span class="p">]:</span><span class="n">selector</span><span class="o">.</span><span class="n">scores_</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">numeric_df</span><span class="o">.</span><span class="n">columns</span><span class="p">))}</span> <span class="n">sorted_features</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">scores</span><span class="p">,</span><span class="n">key</span><span class="o">=</span><span class="n">scores</span><span class="o">.</span><span class="n">get</span><span class="p">,</span> <span class="n">reverse</span><span class="o">=</span><span class="bp">True</span><span 
class="p">)</span> <span class="k">for</span> <span class="n">feature</span> <span class="ow">in</span> <span class="n">sorted_features</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s">&#39;Feature </span><span class="si">%s</span><span class="s"> has value </span><span class="si">%f</span><span class="s">&#39;</span><span class="o">%</span><span class="p">(</span><span class="n">feature</span><span class="p">,</span><span class="n">scores</span><span class="p">[</span><span class="n">feature</span><span class="p">]))</span></code></pre></div> <pre><code>Feature exercised_stock_options has value 30.528310
Feature total_stock_value has value 22.901164
Feature stock_sum has value 16.090150
Feature salary has value 14.428640
Feature poi_email_ratio_to has value 13.619580
Feature bonus has value 11.771121
Feature money_total has value 11.005135
Feature deferred_income has value 9.058555
Feature total_payments has value 8.334006
Feature restricted_stock has value 7.335986
Feature long_term_incentive has value 6.448285
Feature shared_receipt_with_poi has value 6.340473
Feature other has value 4.067974
Feature money_ratio has value 3.781568
Feature from_poi_to_this_person has value 3.626045
Feature email_ratio has value 2.176411
Feature from_this_person_to_poi has value 1.318493
Feature poi_email_ratio_from has value 1.279491
Feature from_messages has value 0.613342
Feature expenses has value 0.543049
Feature to_messages has value 0.400295
Feature deferral_payments has value 0.223368
Feature stock_ratio has value 0.013109
</code></pre> <p>It appeared that several of my features are among the most useful, as ‘poi_email_ratio_to’, ‘stock_sum’, and ‘money_total’ are all ranked highly. 
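</p> <p>A note on what those values are: SelectKBest was created with no arguments above, so it falls back on its default scoring function, f_classif, which computes a one-way ANOVA F-statistic for each feature against the class labels. The following is only a rough pure-Python sketch of that statistic, not Scikit-learn’s implementation; the helper name anova_f and the toy numbers are made up for illustration:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across the label groups."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total = len(values)
    n_groups = len(groups)
    grand_mean = sum(values) / n_total
    # Variation of the group means around the overall mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    # Variation of the samples around the mean of their own group
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))

# Toy data (hypothetical): one feature that separates the classes, one that does not
is_poi = [0, 0, 0, 0, 1, 1, 1, 1]
separating = [1.0, 2.0, 1.5, 2.5, 8.0, 9.0, 8.5, 9.5]
overlapping = [1.0, 9.0, 2.0, 8.0, 1.5, 8.5, 2.5, 9.5]

scores = {'separating': anova_f(separating, is_poi),
          'overlapping': anova_f(overlapping, is_poi)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
</code></pre></div>
<p>A feature whose values differ a lot between the two classes, relative to how much they vary within each class, gets a large score, which is what puts it near the top of the ranking.</p> <p>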
But since the dataset is so small, I had no need to discard any features up front, and went ahead with testing several classifiers on several feature sets.</p> <h1 id="training-and-evaluating-models">Training and Evaluating Models</h1> <p>Proceeding with the project, I selected three algorithms to test and compare: Naive Bayes, Decision Trees, and Support Vector Machines. Naive Bayes is a good baseline for any ML task, and the other two are well suited to binary classification with many features and can both be tuned automatically using Scikit-learn’s search classes. A word on Scikit-learn: it is simply a very well designed machine learning toolkit, with great compatibility with Numpy (and therefore also Pandas) and an elegant API that makes it easy to try out different models, evaluate features, and do just about anything else one might want short of Deep Learning.</p> <p>I think the code that follows will attest to that. I tested those three algorithms with a variable number of features, from only the top-ranked one up to all of them, in the order given by the SelectKBest scores. Because the dataset is so small, I could afford an extensive validation scheme and averaged the results over multiple random train/test splits to get a better indication of the strength of each algorithm. I also evaluated precision and recall in addition to accuracy, since those were to be the project’s performance metrics. 
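</p> <p>As an aside on those two metrics, here is a minimal pure-Python sketch of how they are defined for the positive (POI) class. The helper name precision_recall and the toy labels are made up for illustration, and the division assumes Python 3; in the project itself the numbers come from Scikit-learn’s precision_score and recall_score, which expect the true labels as the first argument and the predictions as the second:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">def precision_recall(true_labels, predicted_labels):
    """Precision and recall for the positive class (label 1)."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = true_pos / (true_pos + false_pos)  # share of flagged people who are real POIs
    recall = true_pos / (true_pos + false_neg)     # share of real POIs who were flagged
    return precision, recall

# Toy labels (hypothetical): 2 real POIs among 10 people, 4 people flagged
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
flagged = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(truth, flagged)
swapped = precision_recall(flagged, truth)
print(precision, recall, swapped)
</code></pre></div>
<p>Passing the two lists in the opposite order exchanges the two numbers, so the argument order is worth double-checking whenever precision and recall are reported separately.</p> <p>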
And all it took to do all that was maybe 50 lines of code:</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">from</span> <span class="nn">sklearn.naive_bayes</span> <span class="kn">import</span> <span class="n">GaussianNB</span> <span class="kn">from</span> <span class="nn">sklearn.svm</span> <span class="kn">import</span> <span class="n">SVC</span> <span class="kn">from</span> <span class="nn">sklearn.grid_search</span> <span class="kn">import</span> <span class="n">RandomizedSearchCV</span><span class="p">,</span> <span class="n">GridSearchCV</span> <span class="kn">from</span> <span class="nn">sklearn.tree</span> <span class="kn">import</span> <span class="n">DecisionTreeClassifier</span> <span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">precision_score</span><span class="p">,</span> <span class="n">recall_score</span><span class="p">,</span> <span class="n">accuracy_score</span> <span class="kn">from</span> <span class="nn">sklearn.cross_validation</span> <span class="kn">import</span> <span class="n">StratifiedShuffleSplit</span> <span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="kn">import</span> <span class="nn">warnings</span> <span class="n">warnings</span><span class="o">.</span><span class="n">filterwarnings</span><span class="p">(</span><span class="s">&#39;ignore&#39;</span><span class="p">)</span> <span class="n">gnb_clf</span> <span class="o">=</span> <span class="n">GridSearchCV</span><span class="p">(</span><span class="n">GaussianNB</span><span class="p">(),{})</span> <span class="c">#No params to tune for Naive Bayes, wrapped in GridSearchCV only for convenience</span> <span class="n">svc_clf</span> <span class="o">=</span> <span class="n">SVC</span><span class="p">()</span> <span class="n">svc_search_params</span> <span class="o">=</span> <span class="p">{</span><span class="s">&#39;C&#39;</span><span class="p">:</span> <span
class="n">scipy</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">expon</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="s">&#39;gamma&#39;</span><span class="p">:</span> <span class="n">scipy</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">expon</span><span class="p">(</span><span class="n">scale</span><span class="o">=.</span><span class="mi">1</span><span class="p">),</span> <span class="s">&#39;kernel&#39;</span><span class="p">:</span> <span class="p">[</span><span class="s">&#39;linear&#39;</span><span class="p">,</span><span class="s">&#39;poly&#39;</span><span class="p">,</span><span class="s">&#39;rbf&#39;</span><span class="p">],</span> <span class="s">&#39;class_weight&#39;</span><span class="p">:[</span><span class="s">&#39;balanced&#39;</span><span class="p">,</span><span class="bp">None</span><span class="p">]}</span> <span class="n">svc_search</span> <span class="o">=</span> <span class="n">RandomizedSearchCV</span><span class="p">(</span><span class="n">svc_clf</span><span class="p">,</span> <span class="n">param_distributions</span><span class="o">=</span><span class="n">svc_search_params</span><span class="p">,</span> <span class="n">n_iter</span><span class="o">=</span><span class="mi">25</span><span class="p">)</span> <span class="n">tree_clf</span> <span class="o">=</span> <span class="n">DecisionTreeClassifier</span><span class="p">()</span> <span class="n">tree_search_params</span> <span class="o">=</span> <span class="p">{</span><span class="s">&#39;criterion&#39;</span><span class="p">:[</span><span class="s">&#39;gini&#39;</span><span class="p">,</span><span class="s">&#39;entropy&#39;</span><span class="p">],</span> <span class="s">&#39;max_leaf_nodes&#39;</span><span class="p">:[</span><span class="bp">None</span><span class="p">,</span><span 
class="mi">25</span><span class="p">,</span><span class="mi">50</span><span class="p">,</span><span class="mi">100</span><span class="p">,</span><span class="mi">1000</span><span class="p">],</span> <span class="s">&#39;min_samples_split&#39;</span><span class="p">:[</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">],</span> <span class="s">&#39;max_features&#39;</span><span class="p">:[</span><span class="mf">0.25</span><span class="p">,</span><span class="mf">0.5</span><span class="p">,</span><span class="mf">0.75</span><span class="p">,</span><span class="mf">1.0</span><span class="p">]}</span> <span class="n">tree_search</span> <span class="o">=</span> <span class="n">GridSearchCV</span><span class="p">(</span><span class="n">tree_clf</span><span class="p">,</span> <span class="n">tree_search_params</span><span class="p">,</span> <span class="n">scoring</span><span class="o">=</span><span class="s">&#39;recall&#39;</span><span class="p">)</span> <span class="n">search_methods</span> <span class="o">=</span> <span class="p">[</span><span class="n">gnb_clf</span><span class="p">,</span><span class="n">svc_search</span><span class="p">,</span><span class="n">tree_search</span><span class="p">]</span> <span class="n">average_accuracies</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">0</span><span class="p">],[</span><span class="mi">0</span><span class="p">],[</span><span class="mi">0</span><span class="p">]]</span> <span class="n">average_precision</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">0</span><span class="p">],[</span><span class="mi">0</span><span class="p">],[</span><span class="mi">0</span><span class="p">]]</span> <span class="n">average_recall</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">0</span><span class="p">],[</span><span class="mi">0</span><span 
class="p">],[</span><span class="mi">0</span><span class="p">]]</span> <span class="n">num_splits</span> <span class="o">=</span> <span class="mi">10</span> <span class="n">train_split</span> <span class="o">=</span> <span class="mf">0.9</span> <span class="n">indices</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">StratifiedShuffleSplit</span><span class="p">(</span><span class="n">poi</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span> <span class="n">num_splits</span><span class="p">,</span> <span class="n">test_size</span><span class="o">=</span><span class="mi">1</span><span class="o">-</span><span class="n">train_split</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span> <span class="n">best_features</span> <span class="o">=</span> <span class="bp">None</span> <span class="n">max_score</span> <span class="o">=</span> <span class="mi">0</span> <span class="n">best_classifier</span> <span class="o">=</span> <span class="bp">None</span> <span class="n">num_features</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">num_features</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="nb">len</span><span class="p">(</span><span class="n">sorted_features</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span><span class="p">):</span> <span class="n">features</span> <span class="o">=</span> <span class="n">sorted_features</span><span class="p">[:</span><span class="n">num_features</span><span class="p">]</span> <span class="n">feature_df</span> <span class="o">=</span> <span class="n">numeric_df</span><span class="p">[</span><span class="n">features</span><span class="p">]</span> <span class="k">for</span> <span 
class="n">classifier_idx</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">):</span> <span class="n">sum_values</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">]</span> <span class="c">#Only do parameter search once, too wasteful to do a ton</span> <span class="n">search_methods</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">feature_df</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">indices</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">],:],</span> <span class="n">poi</span><span class="p">[</span><span class="n">indices</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]]</span><span class="o">.</span><span class="n">tolist</span><span class="p">())</span> <span class="n">classifier</span> <span class="o">=</span> <span class="n">search_methods</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">best_estimator_</span> <span class="k">for</span> <span class="n">split_idx</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_splits</span><span class="p">):</span> <span class="n">train_indices</span><span class="p">,</span> <span class="n">test_indices</span> <span class="o">=</span> <span class="n">indices</span><span class="p">[</span><span class="n">split_idx</span><span class="p">]</span> <span class="n">train_data</span> <span class="o">=</span> <span class="p">(</span><span 
class="n">feature_df</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">train_indices</span><span class="p">,:],</span><span class="n">poi</span><span class="p">[</span><span class="n">train_indices</span><span class="p">]</span><span class="o">.</span><span class="n">tolist</span><span class="p">())</span> <span class="n">test_data</span> <span class="o">=</span> <span class="p">(</span><span class="n">feature_df</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="n">test_indices</span><span class="p">,:],</span><span class="n">poi</span><span class="p">[</span><span class="n">test_indices</span><span class="p">]</span><span class="o">.</span><span class="n">tolist</span><span class="p">())</span> <span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">train_data</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span><span class="n">train_data</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="n">predicted</span> <span class="o">=</span> <span class="n">classifier</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">test_data</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="n">sum_values</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+=</span><span class="n">accuracy_score</span><span class="p">(</span><span class="n">test_data</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span><span class="n">predicted</span><span class="p">)</span> <span class="n">sum_values</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">+=</span><span class="n">precision_score</span><span class="p">(</span><span class="n">test_data</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span><span class="n">predicted</span><span class="p">)</span> <span
class="n">sum_values</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">+=</span><span class="n">recall_score</span><span class="p">(</span><span class="n">test_data</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span><span class="n">predicted</span><span class="p">)</span> <span class="n">avg_acc</span><span class="p">,</span><span class="n">avg_prs</span><span class="p">,</span><span class="n">avg_recall</span> <span class="o">=</span> <span class="p">[</span><span class="n">val</span><span class="o">/</span><span class="n">num_splits</span> <span class="k">for</span> <span class="n">val</span> <span class="ow">in</span> <span class="n">sum_values</span><span class="p">]</span> <span class="n">average_accuracies</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">avg_acc</span><span class="p">)</span> <span class="n">average_precision</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">avg_prs</span><span class="p">)</span> <span class="n">average_recall</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">avg_recall</span><span class="p">)</span> <span class="n">score</span> <span class="o">=</span> <span class="p">(</span><span class="n">avg_prs</span><span class="o">+</span><span class="n">avg_recall</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span> <span class="k">if</span> <span class="n">score</span><span class="o">&gt;</span><span
class="n">max_score</span> <span class="ow">and</span> <span class="n">avg_prs</span><span class="o">&gt;</span><span class="mf">0.3</span> <span class="ow">and</span> <span class="n">avg_recall</span><span class="o">&gt;</span><span class="mf">0.3</span><span class="p">:</span> <span class="n">max_score</span> <span class="o">=</span> <span class="n">score</span> <span class="n">best_features</span> <span class="o">=</span> <span class="n">features</span> <span class="n">best_classifier</span> <span class="o">=</span> <span class="n">search_methods</span><span class="p">[</span><span class="n">classifier_idx</span><span class="p">]</span><span class="o">.</span><span class="n">best_estimator_</span> <span class="k">print</span><span class="p">(</span><span class="s">&#39;Best classifier found is </span><span class="si">%s</span><span class="s"> </span><span class="se">\n\</span> <span class="s"> with score (recall+precision)/2 of </span><span class="si">%f</span><span class="se">\n\</span> <span class="s"> and feature set </span><span class="si">%s</span><span class="s">&#39;</span><span class="o">%</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">best_classifier</span><span class="p">),</span><span class="n">max_score</span><span class="p">,</span><span class="n">best_features</span><span class="p">))</span></code></pre></div> <pre><code>Best classifier found is DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, max_features=0.25, max_leaf_nodes=25, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, presort=False, random_state=None, splitter='best') with score (recall+precision)/2 of 0.370000 and feature set ['exercised_stock_options', 'total_stock_value', 'stock_sum', 'salary', 'poi_email_ratio_to', 'bonus'] </code></pre> <p>Then, I could go right back to Pandas to plot the results. 
Sure, I could do this with matplotlib just as well, but the flexibility and simplicity of the ‘plot’ function call on a DataFrame is more elegant in my opinion.</p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">results</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="o">.</span><span class="n">from_dict</span><span class="p">({</span><span class="s">&#39;Naive Bayes&#39;</span><span class="p">:</span> <span class="n">average_accuracies</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="s">&#39;SVC&#39;</span><span class="p">:</span><span class="n">average_accuracies</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">&#39;Decision Tree&#39;</span><span class="p">:</span><span class="n">average_accuracies</span><span class="p">[</span><span class="mi">2</span><span class="p">]})</span> <span class="n">results</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xlim</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="nb">len</span><span class="p">(</span><span class="n">sorted_features</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">),</span><span class="n">ylim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;Classifier accuracy by # of features&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">results</span> <span class="o">=</span> <span 
class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="o">.</span><span class="n">from_dict</span><span class="p">({</span><span class="s">&#39;Naive Bayes&#39;</span><span class="p">:</span> <span class="n">average_precision</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="s">&#39;SVC&#39;</span><span class="p">:</span><span class="n">average_precision</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">&#39;Decision Tree&#39;</span><span class="p">:</span><span class="n">average_precision</span><span class="p">[</span><span class="mi">2</span><span class="p">]})</span> <span class="n">results</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xlim</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="nb">len</span><span class="p">(</span><span class="n">sorted_features</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">),</span><span class="n">ylim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;Classifier precision by # of features&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <div class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">results</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="o">.</span><span class="n">from_dict</span><span class="p">({</span><span class="s">&#39;Naive Bayes&#39;</span><span class="p">:</span> <span class="n">average_recall</span><span class="p">[</span><span class="mi">0</span><span 
class="p">],</span> <span class="s">&#39;SVC&#39;</span><span class="p">:</span><span class="n">average_recall</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">&#39;Decision Tree&#39;</span><span class="p">:</span><span class="n">average_recall</span><span class="p">[</span><span class="mi">2</span><span class="p">]})</span> <span class="n">results</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">xlim</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="nb">len</span><span class="p">(</span><span class="n">sorted_features</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">),</span><span class="n">ylim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">&quot;Classifier recall by # of features&quot;</span><span class="p">)</span></code></pre></div> <p><img src="" /></p> <p>As the output of my code shows, the best algorithm was consistently found to be a Decision Tree, so I could finally finish up the project by submitting that as my model.</p> <h2 id="conclusion">Conclusion</h2> <p>I did not much care for the project’s dataset and overall structure, but I still greatly enjoyed completing it because of how fun it was to combine Pandas data processing with Scikit-learn model training, with IPython Notebook making the whole workflow even more fluid. 
While this is not at all a well-written introduction or tutorial for these packages, I do hope that this write-up about a single project I finished using them might inspire some readers to try them out as well.</p> <p><a href="/writing/project/power-of-ipython-pandas-scikilearn/">The Power of IPython Notebook + Pandas + and Scikit-learn</a> was originally published by Andrey Kurenkov at <a href="">Andrey Kurenkov's Web World</a> on June 10, 2016.</p>