{"id":130,"date":"2020-08-07T19:14:00","date_gmt":"2020-08-07T17:14:00","guid":{"rendered":"http:\/\/egert.org\/blog\/?p=130"},"modified":"2020-11-13T15:43:53","modified_gmt":"2020-11-13T14:43:53","slug":"ranked-6th-at-walmart-data-science-competition","status":"publish","type":"post","link":"https:\/\/egert.org\/blog\/2020\/08\/07\/ranked-6th-at-walmart-data-science-competition\/","title":{"rendered":"Ranked 6th at Walmart Data Science Competition"},"content":{"rendered":"\n<p>Before Covid-19 hit in France, my colleague <a href=\"https:\/\/www.linkedin.com\/in\/rafael-de-rezende-8237a927\/\" data-type=\"URL\" data-id=\"https:\/\/www.linkedin.com\/in\/rafael-de-rezende-8237a927\/\">Rafael de Rezende<\/a> asked me if I wanted to join his team to compete in the <a href=\"https:\/\/www.kaggle.com\/c\/m5-forecasting-uncertainty\">Kaggle M5 Data Science competition<\/a> on predicting Walmart sales.<\/p>\n\n\n\n<p>The data set consisted of the sales history of US stores in three states, covering three product categories (e.g. FOOD). We got no information on the specific items, which were labeled only with identifiers such as &#8216;FOOD-123&#8217;. 
Our sales data was cut off at a certain point in time, and we had to predict the following weeks of sales as a probability distribution.<\/p>\n\n\n\n<p>My role was primarily business analysis: using Python in Jupyter notebooks, I worked out the impact of aspects of the time series such as the day of the week, monthly seasonality, and calendar events such as Christmas (whose impact varied a lot depending on the representation of religions in the different states), as well as the effect of food stamp distribution, which varied greatly by state.<\/p>\n\n\n\n<p>The team then used these insights to build a multi-stage state-space model (with the states inactive and active) and ran Monte Carlo simulations to generate our predictions as negative binomial probability distributions.<\/p>\n\n\n\n<p>If you want to know more:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.kaggle.com\/c\/m5-forecasting-uncertainty\/discussion\/163559\">Rafael&#8217;s summary on Kaggle<\/a><\/li><li><a href=\"https:\/\/blog.lokad.com\/journal\/2020\/7\/2\/ranked-6th-out-of-909-teams-m5-competition\/\">Lokad&#8217;s blog post<\/a><\/li><li>Rafael was a guest on a <a href=\"https:\/\/tv.lokad.com\/journal\/2020\/7\/9\/walmart-forecasting-competition-post-game-analysis\/\">Lokad TV<\/a> episode about the competition (23 min)<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Before Covid-19 hit in France, my colleague Rafael de Rezende asked me if I wanted to join his team to compete in the Kaggle M5 Data Science competition on predicting Walmart sales. The data set consisted of the sales history of US stores in three states, covering three product categories (e.g. FOOD). 
We [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":133,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-130","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/posts\/130"}],"collection":[{"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/comments?post=130"}],"version-history":[{"count":3,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/posts\/130\/revisions"}],"predecessor-version":[{"id":134,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/posts\/130\/revisions\/134"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/media\/133"}],"wp:attachment":[{"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/media?parent=130"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/categories?post=130"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/egert.org\/blog\/wp-json\/wp\/v2\/tags?post=130"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}