Data Science Resources

In this post, I collect online data science resources, such as online classes and blog posts, that I personally find very useful. (Ratings: ⭐ useful, ⭐⭐ very well done, ⭐⭐⭐ recommend absolutely)

Coursera: IBM Data Science Professional Certificate ⭐⭐
This class is a perfect introduction for anyone entering the field. It does not assume much prior knowledge, although some programming experience is helpful. Even if most of the material is not completely new to you, as was the case for me and probably is for most people, it is a great refresher: if you already know some topics, you will get through the exercises and labs more quickly, i.e. you won’t lose money on familiar material.

Pricing: around 35€ / month.

Pros:

  • Very comprehensive: covers theoretical methodology, SQL, Python tutorials, visualisation, etc.
  • Nicely done quizzes and labs.
  • The community is quite active in the forums: Your questions will be answered quickly.
  • The class gives an overview of the IBM data science framework around Watson Studio.
  • Not very expensive. Since you pay per month, the more time you can invest, the less you spend.

Cons:

  • Some of the individual courses within the certificate are a bit redundant.
  • Lecturers have varying presentational skills.
  • The final project for the last class (Predicting car accident severity based on a police accident report data set) was less well curated.
  • It does not go beyond IBM tools.
  • There is little information or material on how to “production-ize” a machine learning project.

Coursera: Deep Learning Specialization (deeplearning.ai) ⭐⭐⭐
This class picks up where the IBM classes stopped. Once you have mastered the foundations of machine learning, such as logistic regression, you move on to more complex concepts such as neural networks. I find this class extremely well curated. This doesn’t come as a surprise: Andrew Ng, the course lecturer, is also a co-founder of Coursera, so it makes sense that he is able to leverage the platform to its fullest.

Pricing: 41€/month

Pros:

  • Andrew Ng’s tutorial videos are extremely well done and easy to follow.
  • The labs are interactive (automatically graded) and count toward your final grade. (Usually on Coursera, the grade comes from quizzes and peer-reviewed assignments.)
  • The community support by the organizer team is very quick and comprehensive.

Cons:

  • The forums are sometimes cluttered with spam, which makes it hard to find a relevant post. This is probably a general Coursera problem.

I’m currently enrolled in this class, so this verdict is not final.

Repost: Can an algorithm beat VOGUE in predicting fashion trends?

What fashion’s next big trend will be is a three-digit-billion-dollar question. It is, however, a question that I personally thought I wouldn’t have much to contribute to, given that I work at a technology company where the biggest trend seems to be goofy meme t-shirts. So, naturally, our approach to fashion trends is completely non-traditional and does not rely on our own fashion expertise (which is probably a good thing).

Traditionally, trend scouting starts by closely following high fashion designers and fashion shows, whose ideas may later be picked up by the mainstream fashion industry.

Other trends can come from films and TV series, as well as from influences on YouTube and in the blogosphere. Determining exactly what will sell in the coming seasons used to be a question only industry experts with decades of experience could answer. Back then, it was more of a guessing game as to when a trend wave would pick up and how long it would last before slowly ebbing away or being completely overtaken by another trend.

At Lokad, our data scientists have developed algorithms to revolutionize this trend scouting process, allowing for large-scale, reliable and accurate sales predictions based on prior seasons’ sales history. Furthermore, while trend scouting may be the holy grail of purchasing decisions for fashion organisations, identifying statistical noise is equally important and usually completely overlooked. Excluding noise in demand, that is, observed “links” between products that occur at random without any particular trend connection, can boost the effectiveness of any buying decision.

Finding out what actually drives sales, that is, which attributes of a product make customers buy it, is somewhat elusive. Defining a product is, for now, more of an art than a science. At Lokad, our quantitative approach allows us to study precisely what has triggered sales, be it a fancy product name, a combination of price and value of material, or maybe a certain visible accessory that made a customer choose one product over other similar products.

In particular, Lokad can leverage similarities between products to predict sales quantities of products that have never been sold anywhere, by comparing them to past sales of other products. To look for similarities between products, Lokad analyses not only product attributes such as materials or colours, but also envisions leveraging the purchase history of customers.
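
To make the idea of attribute-based similarity more concrete, here is a minimal Python sketch. It is not Lokad’s actual method: it assumes that each product is described by a simple set of attribute tags, that similarity is measured by the share of common attributes (Jaccard similarity), and that the forecast is a similarity-weighted average over the most similar past products. The function names and the sales figures are invented for illustration.

```python
# Hypothetical sketch: forecast sales of a never-sold product from the
# sales of the most similar past products. Not Lokad's actual algorithm.

def jaccard(a, b):
    """Share of attributes two products have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def forecast_new_product(new_attributes, history, k=2):
    """Similarity-weighted average of the k most similar past products.

    history: list of (attributes, units_sold) pairs for past products.
    Returns None if no past product shares any attribute.
    """
    scored = [(jaccard(new_attributes, attrs), units) for attrs, units in history]
    # Keep the k most similar products that share at least one attribute.
    comparable = [pair for pair in scored if pair[0] > 0]
    top = sorted(comparable, key=lambda pair: pair[0], reverse=True)[:k]
    if not top:
        return None
    total_weight = sum(sim for sim, _ in top)
    return sum(sim * units for sim, units in top) / total_weight


# Invented example data (not real sales figures).
history = [
    ({"dress", "silk", "red"}, 120),
    ({"dress", "cotton", "red"}, 200),
    ({"jacket", "leather", "black"}, 80),
]
forecast = forecast_new_product({"dress", "silk", "blue"}, history, k=2)
print(round(forecast, 1))
```

In this toy example, the new blue silk dress is compared mostly to the red silk dress and, with less weight, to the red cotton dress; the leather jacket shares no attribute and is ignored.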

Identifying early trend adopters within a customer base allows Lokad to obtain a next season “preview”: for example, imagine you have that one very fashionable colleague who always wears a certain colour before everyone else does. With big data crunching tools, Lokad can identify this colleague and, through their current shopping habits, pick up what many others will want to buy next season.
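
One simple way to picture the search for early adopters is sketched below in Python. This is only an illustration under strong assumptions, not a description of Lokad’s tooling: purchases are given as (customer, product, day) records, the “typical” purchase day of a product is taken to be the median, and a customer’s score is the average number of days by which they buy ahead of that median. All names and data are made up.

```python
# Hypothetical sketch: score customers by how early they buy products
# compared with the typical (median) buyer. Data and names are invented.
from collections import defaultdict
from statistics import mean, median


def early_adopter_scores(purchases):
    """Average number of days each customer buys ahead of the median buyer.

    purchases: list of (customer, product, day) tuples, day as an integer
    day index. A positive score means the customer tends to buy early.
    """
    days_by_product = defaultdict(list)
    for _, product, day in purchases:
        days_by_product[product].append(day)
    typical_day = {p: median(days) for p, days in days_by_product.items()}

    leads = defaultdict(list)
    for customer, product, day in purchases:
        leads[customer].append(typical_day[product] - day)
    return {customer: mean(values) for customer, values in leads.items()}


purchases = [
    ("ana", "yellow_coat", 1), ("ben", "yellow_coat", 30), ("cem", "yellow_coat", 40),
    ("ana", "wide_jeans", 5), ("ben", "wide_jeans", 25), ("cem", "wide_jeans", 20),
]
scores = early_adopter_scores(purchases)
earliest = max(scores, key=scores.get)
print(earliest, scores[earliest])
```

Here the customer with the highest score plays the role of the fashionable colleague: what they are buying today is a candidate signal for what others may buy later.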

In the end, however, there is no way around a team of fashion experts who can distinguish the mega trends from the “ordinary” ones. With Lokad, this expert group can apply its insights at a much larger scale, backed by statistically sound computational power, minimizing costs for any brand or marketplace. This is what we refer to as augmented human intelligence, putting the enablement of the expert group as the ultimate goal of our work.

This article was first published on LinkedIn on October 14, 2017. 

Disclaimer: Katharina currently works as a supply chain scientist at Lokad.