Do you also have Peter’s Problem?

Pet·ers Pro·blem. n

The set of problems that arise when one’s online profile is incorrect: unsuitable online advertising and ratings, which may in turn limit one’s access to information and services that are regulated by these ratings and by profile-based clustering.

In his latest novel (1), the German cabaret artist Marc-Uwe Kling envisions an absurdist post-modern future in a country that, for marketing reasons, chose to rename itself “Qualityland” and whose inhabitants (also for marketing reasons) must use only superlatives when speaking about it.

The protagonist, Peter Arbeitsloser (Peter Unemployed; to increase transparency, children in Qualityland take their parents’ job title at the time of birth as their surname), suffers from a particular problem. The major online actors have gotten him wrong: TheShop (“the world’s most popular online retailer”), which sends you products you will like based on your profile; the social media platform Everybody (“everybody is on Everybody!”); and QualityPartner, the dating service that chooses your partner for you. Peter receives things he has no use for, is made to hang out with people he does not like, and is dating someone he cannot stand. All of this happens because people have completely surrendered to the suggestions and recommendations of the underlying algorithms, making it virtually impossible to consciously make one’s own choices.

Now Peter has been misclassified by all of these platforms. The most obvious consequences, such as TheShop sending him highly inappropriate dolphin-shaped items, seem like nothing more than a nuisance at first glance. Much more problematic is that Qualityland’s society has decided to rely completely on one’s online profile for any sort of aptitude assessment, such as recruiting, and for one’s social standing. Much like in the Black Mirror episode Nosedive, one’s profile is condensed into a personal score which functions as a shortcut to one’s worth in society.

Peter, at some point, drops to a 9 out of 100 and therefore falls into the “useless” category of society, which bans him from most of daily life. (For real-life examples of this, see (4).)

So, what does Peter’s tale want us to do? To stop using our favorite online services is hardly an option. But taking ownership of our profiles, and of the purposes they are used for, might be worthwhile.

The twist, however, is where this highly amusing piece of fiction reaches its limits: having Peter’s problem becomes less and less likely. Already today, access to someone’s “likes” on social media makes algorithms far more capable of judging that person’s personality than their acquaintances, friends and family are. (2)

While the current mood tends toward more transparency about what our data is used for, it must still be clear that once we create data points in public, conclusions can be drawn from them. This also includes data we produce not quite voluntarily, that is, data collection we cannot easily opt out of, such as face recognition in public (see again (4)).

The far greater challenge for us is to decide what we, both as citizens and as consumers, would like to do with this data and how we would like to steer our ownership of it.

For organizations collecting these personal profiles, a transparent usage policy for these profiles, that is,

a clearly defined data model/structure,
an explicit formulation of the intent of usage, visible to the user,
and a defined data life cycle from collection to an option of deletion (e.g. flow chart style),

becomes, in my opinion, more and more important: first, for legal reasons (5), and second, in order to continue providing any “data driven” added value of quality (pun intended) based on correct and consciously given data.
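To make the three points above a bit more tangible, here is a minimal Python sketch of what such a profile record could look like. All names (`ProfileAttribute`, `Stage`, `purge_expired`, the `retention_days` field) are hypothetical and chosen for illustration only; a real usage policy would of course need legal review.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Stage(Enum):
    """Hypothetical life cycle stages of a stored profile attribute."""
    COLLECTED = "collected"
    IN_USE = "in_use"
    DELETED = "deleted"


@dataclass
class ProfileAttribute:
    name: str             # what is stored, e.g. "favorite_category"
    value: str            # the stored value itself
    purpose: str          # intent of usage, meant to be shown to the user
    collected_on: date    # start of the life cycle
    retention_days: int   # assumed retention period before deletion
    stage: Stage = Stage.COLLECTED

    def expires_on(self) -> date:
        return self.collected_on + timedelta(days=self.retention_days)

    def delete(self) -> None:
        # Drop the value but keep the record, so the deletion stays traceable.
        self.value = ""
        self.stage = Stage.DELETED


def purge_expired(attributes, today):
    """Delete every attribute whose retention period has run out."""
    for attr in attributes:
        if attr.stage is not Stage.DELETED and today >= attr.expires_on():
            attr.delete()
    return attributes
```

The point of the sketch is only that the purpose and the deletion date live right next to the data, so they cannot be forgotten when the data is used.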

Notes:

(1) For now, Qualityland is only available in German – but Marc-Uwe Kling’s first series The Kangaroo Chronicles is already available in English.

(2) Private traits and attributes are predictable from digital records of human behavior. Michal Kosinski, David Stillwell, and Thore Graepel. PNAS April 9, 2013. 110 (15) 5802-5805; https://doi.org/10.1073/pnas.1218772110

(3) If you’re interested in what kind of conclusions Google and co have drawn from your online behavior, check out the following links:

(4) The Chinese government also seems to be quite ambitious in this direction.

(5) The GDPR has been in force since May 25, 2018.

Math today, Lokad tomorrow

On June 12, 2018, I will be paying a short visit to my university, TU Darmstadt, to give a talk in the seminar ‘Heute Mathe Morgen ?’ about my Paris-based company Lokad:

June 12, 2018, 1:30 pm in S2|15 51

Anyone interested in learning more about:

deep learning / machine learning in industry,
quantitative logistics optimization,
what finite l_1 sequences have to do with logistics,
working in France,

or anyone who might be interested in

working in France, for example as a Supply Chain Scientist like me,
a 6-month internship in Paris, or
an in-company PhD in the field of machine learning,

is warmly welcome!

Please share!

On June 12th I have the great honor to be back in Darmstadt! I’ll be speaking in the alumni seminar ‘Heute Mathe Morgen ?’

Deep Learning: What it is and how it relates to supply chains

Disclaimer: In this post I’m going to write about how we use Deep Learning in my company, Lokad.

When you follow the news about deep learning, you might have come across exciting breakthroughs such as algorithms that are able to colorize black-and-white photographs, or automatic real-life translations of text in pictures taken with a phone app. While these are all pretty cool applications, they do not immediately suggest direct use cases for most traditional businesses. At Lokad, our goal is to translate the stunning reach of deep learning capabilities into the real world, to optimize supply chains everywhere.

So, before going into detail on how we do that, let me quickly and very roughly summarize what deep learning actually entails, without getting too technical.

First of all, deep learning is a flavor of machine learning. Regular, non-machine-learning algorithms require full prior knowledge of the task (and no training data whatsoever). An expert-knowledge approach to demand forecasting would require you to specify in advance all specific rules and patterns, such as

“All articles that have category=Spring will peak in May and slowly die down until October.”

This, however, may only be true for some products in this category. It is also possible that there are subcategories that behave a bit differently, and so on. Combining such rules with a moving average forecast already yields an overall understanding of future demand that is not so far from reality.
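As a rough illustration of such an expert-knowledge approach, the Python sketch below combines a moving average with one hand-written seasonal rule. The multipliers in `SPRING_PROFILE` are invented for the example (they merely peak in May and fade until October), and the function names are hypothetical.

```python
def moving_average(history, window=3):
    """Average of the last `window` observations (history must be non-empty)."""
    recent = history[-window:]
    return sum(recent) / len(recent)


# Hypothetical expert rule for category "Spring":
# demand peaks in May (month 5) and slowly dies down until October (month 10).
SPRING_PROFILE = {5: 1.5, 6: 1.4, 7: 1.3, 8: 1.2, 9: 1.1, 10: 1.0}


def rule_based_forecast(history, category, month, window=3):
    """Moving average baseline, adjusted by the hard-coded seasonal rule."""
    base = moving_average(history, window)
    if category == "Spring":
        return base * SPRING_PROFILE.get(month, 1.0)
    return base


# Example: last three months sold 10, 20 and 30 units.
print(rule_based_forecast([10, 20, 30], "Spring", month=5))
print(rule_based_forecast([10, 20, 30], "Winter", month=5))
```

Every exception (a subcategory that behaves differently, a product that peaks earlier) would need another branch or another table like `SPRING_PROFILE`, which is exactly where the maintenance burden comes from.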

However, this approach has the following downsides:

It does not embrace uncertainty: in our experience, risk and uncertainty are crucial for supply chains, since it is mostly the boundary cases that can be either very profitable or, if ignored, very costly.
You have to maintain and manage the complexity of your rule set: an approach is only as powerful as the set of rules it applies, and maintaining rules is very costly. For each rule in the algorithm, we calculate an initial cost of about 1 man-day for implementation, testing and proper documentation, plus about half a day of recurring maintenance. Assuming you keep refining your rules and therefore have to readjust the old ones, this yields a cost of about 8k € per rule over a five-year period. It is worth noting that this figure applies to a single rule and does not take into account the exponential increase in complexity that arises when dealing with more complex product portfolios. Even demand patterns for small businesses usually exhibit dozens of influences, making their maintenance incredibly costly.
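The cost estimate can be checked with a few lines of Python. The salary, working days and vacation days are taken from the note at the end of this post; the assumption that the half day of maintenance recurs once per month is mine and is not stated in the text, but it is the cadence that roughly reproduces the 8k € figure.

```python
ANNUAL_SALARY_EUR = 58_000   # from the note at the end of the post
WORKING_DAYS = 261           # working days in 2018
VACATION_DAYS = 30


def day_rate(salary=ANNUAL_SALARY_EUR, working_days=WORKING_DAYS,
             vacation_days=VACATION_DAYS):
    """Cost of one man-day in euros."""
    return salary / (working_days - vacation_days)


def rule_cost(rate, initial_days=1.0, maintenance_days_per_month=0.5, months=60):
    """Cost of one rule: initial effort plus (assumed monthly) maintenance."""
    return rate * (initial_days + maintenance_days_per_month * months)


rate = day_rate()                 # roughly 251 €, rounded to 250 € in the note
print(round(rate))
print(rule_cost(250.0))           # 31 man-days at 250 € each
```

With these assumptions, one rule costs 31 man-days over five years, which is in the same ballpark as the 8k € quoted above.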

Now imagine that there is a technology that could, like a human child, learn on its own to deduce patterns from data and could thus independently predict how your portfolio of products develops throughout a year.

Just like a child in development, a deep learning algorithm will try to make sense of the world by trying to deduce correlations from observations. It will test them and discard those that do not make any sense for the remaining data.

Again following our analogy: like a child learning to make sense of the world, a deep learning algorithm consumes lots and lots of data, and the key lies in grasping the information that is actually relevant. While a child in a big city might at first be completely overwhelmed by all the different colors, noises and smells, it will later learn that the traffic lights, in combination with the noises of approaching cars, are what matters most when crossing a street. The same mechanism is in place for deep learning. The algorithm may process a vast amount of data and needs to find out the essence of what drives demand.

Figuring out what is important and what is not happens by repeating similar situations several times, just as you would repeat correct traffic behavior with a child. A human brain processes its sensory input and its reactions in a highly parallel way, so that it is able to react quickly to urgent new data, such as a car that is approaching while you are crossing the street.

With the rise of big data, parallelization also became a key topic, driving the efficiency and, in fact, the feasibility of a “human-like” autonomous learning process.

At Lokad, we actually use the parallelized computing power of high-end gaming graphics cards in our cloud servers to efficiently run our optimization for our clients, processing a portfolio of 10,000 products with five years of sales data in less than half an hour while largely outperforming any conventional moving-average-based algorithms (or even Lokad’s own earlier-generation machine learning forecasts) in accuracy.

Lokad then uses the demand forecasting results, which come in a probabilistic format, to optimize the supply chain decisions, taking into account economic drivers such as one’s stance on growth vs. profitability. With these analyses, Lokad directly delivers the best supply chain decisions, such as purchase orders or dispatching decisions. “Best” here refers to the economic driver setup (i.e. growth vs. profitability, opportunity costs, etc.) that has been put in place for the supply chain decisions. The approach scales with the business as one’s portfolio and demand patterns become more complex, making any hard-coded demand forecasting rules that need to be maintained by a human completely obsolete.
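To give an idea of how a probabilistic forecast and economic drivers can be combined into a decision, here is a deliberately tiny Python sketch. It is a textbook-style expected-profit calculation and not a description of Lokad’s actual optimization; the function names and the two drivers (`unit_margin`, `unit_holding_cost`) are assumptions made for the example.

```python
def expected_profit(order_qty, demand_probs, unit_margin, unit_holding_cost):
    """Expected profit of ordering `order_qty` units.

    demand_probs maps each possible demand level to its probability.
    """
    total = 0.0
    for demand, prob in demand_probs.items():
        sold = min(order_qty, demand)
        leftover = order_qty - sold
        total += prob * (sold * unit_margin - leftover * unit_holding_cost)
    return total


def best_order_quantity(demand_probs, unit_margin, unit_holding_cost):
    """Order quantity with the highest expected profit."""
    candidates = range(0, max(demand_probs) + 1)
    return max(
        candidates,
        key=lambda q: expected_profit(q, demand_probs, unit_margin, unit_holding_cost),
    )


# Probabilistic forecast: 20% chance of no demand, 50% of one unit, 30% of two.
forecast = {0: 0.2, 1: 0.5, 2: 0.3}
print(best_order_quantity(forecast, unit_margin=10, unit_holding_cost=5))
print(best_order_quantity(forecast, unit_margin=10, unit_holding_cost=1))
```

The same forecast leads to a different order quantity once the cost of unsold stock changes, which is the sense in which the “best” decision depends on the economic drivers and not on the forecast alone.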

Notes:

Average developer salary in Germany: 58k €. With 261 working days minus 30 days of vacation in 2018, this yields a man-day rate of about 250 €.

Technology: A tool or an agent?

Last week, the CEO of Google, Sundar Pichai, presented the capabilities of their new digital assistant, Google Duplex, in a quite impressive keynote. They showed how the assistant was able to make calls to book appointments or make restaurant reservations while carrying on quite natural-sounding conversations. In fact, it was able to understand and react to ambiguity and could deal with information that was not quite what it had originally asked for: quite a leap forward from the very limited interactions and tasks that, for example, Apple’s Siri used to be able to perform.

Sparked by this presentation, this article juxtaposes the two dominant views on the role of technology currently espoused by the main technology providers:

One view is that technology mainly serves as a tool for people, as a sort of “bicycle of the mind” that enhances human capabilities. This view is notably demonstrated by the philosophies of Apple and Microsoft. Both companies were among the main drivers of personal computing and come from the same era, the 1970s.

The other view is that technology is supposed to act as an agent, that is, to carry out tasks for you independently. This view is attributed to Google and Facebook, companies from the internet era that are much younger than Apple and Microsoft. In fact, this is exactly what the new abilities of Duplex are meant to show: it is able to carry out tedious tasks, such as making appointments, on your behalf. Autonomously driving cars also fall into this second category.

One of the main implications of these different viewpoints is their different ethical setup. In the tool model, the users (the ones literally using the tools) are always in control and therefore assume responsibility for the actions carried out with these tools. In the second model, where technology has its own agency, the question of responsibility becomes less clear. What if this technological agent, say your autonomous digital assistant or your autonomous car, does something that goes against your intentions? Who is at fault? You, the technology provider, or maybe the tech agent itself? What if it actually acted upon something that you wanted, but maybe only subconsciously?

Therefore, defining the exact scope, the precise intentions and possible means of the actions performed by such an agent seems crucial. In my opinion, having these three things transparent is what we as users should demand from all our “agent providers” aka Google and Facebook.

Repost: Can an algorithm beat VOGUE in predicting fashion trends?

What fashion’s next big trend will be is a three-digit-billion-dollar question. It is, however, not a question that I personally thought I would have much to contribute to, given that I work in a technology company, where the biggest trend seems to be goofy meme t-shirts. So, naturally, our approach to fashion trends is completely non-traditional and does not rely on our own fashion expertise (which is probably a good thing).

Traditionally, trend scouting starts by closely following high fashion designers and fashion shows, which may be picked up later by the mainstream fashion industry.

Other trend sources include films and TV series, as well as influences from YouTube and the blogosphere. Determining exactly what will sell in the next seasons used to be a question only industry experts with decades of experience could answer. Even then, it was more of a guessing game when a trend wave would pick up and how long it would last before slowly ebbing out or being completely consumed by another trend.

At Lokad, our data scientists have developed algorithms to revolutionize this trend scouting process, allowing for large-scale, reliable and accurate sales predictions based on prior seasons’ sales history. Furthermore, while trend scouting may be the holy grail of purchasing decisions for fashion organisations, identifying statistical noise is equally important and usually completely overlooked. Excluding noise in demand, that is, observed “links” between products that occur at random without any particular trend connection, can boost the effectiveness of any buying decision.

Finding out what actually drives sales, that is, which attributes of a product make customers buy it, is somewhat elusive. For now, defining a product is more of an art than a science. At Lokad, our quantitative approach allows us to study precisely what has triggered sales, be it a fancy product name, a combination of price and value of material, or maybe a certain visible accessory that made a customer choose one product over other similar products.

In particular, Lokad can leverage similarities between products to predict sales quantities of products that have never been sold anywhere, by comparing them to past sales of other products. To look for similarities between products, Lokad analyses not only product attributes such as materials or colours, but also envisions leveraging the purchase history of customers.
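A very simplified way to picture attribute-based similarity is sketched below in Python: the sales of a new product are estimated as a similarity-weighted average of the sales of past products. The Jaccard similarity on attribute sets and the function names are assumptions chosen for illustration; this is not Lokad’s actual model.

```python
def jaccard(a, b):
    """Share of attributes two products have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def forecast_new_product(new_attrs, catalog):
    """Similarity-weighted average of past sales.

    catalog is a list of (attributes, past_sales) pairs for known products.
    Returns None if no known product shares any attribute with the new one.
    """
    weighted = [(jaccard(new_attrs, attrs), sales) for attrs, sales in catalog]
    total_weight = sum(weight for weight, _ in weighted)
    if total_weight == 0:
        return None
    return sum(weight * sales for weight, sales in weighted) / total_weight


catalog = [
    ({"wool", "red", "hat"}, 100),
    ({"wool", "blue", "scarf"}, 200),
    ({"cotton", "green", "shirt"}, 40),
]
print(forecast_new_product({"wool", "red", "scarf"}, catalog))
```

In this toy example the shirt shares no attribute with the new red wool scarf and therefore does not influence the estimate at all.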

Identifying early trend adopters within a customer base allows Lokad to obtain a next season “preview”: for example, imagine you have that one very fashionable colleague who always wears a certain colour before everyone else does. With big data crunching tools, Lokad can identify this colleague and through their current shopping habits can pick up what many others will want to buy next season.
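One naive way to look for that fashionable colleague in purchase data is sketched below in Python: each customer is scored by how many weeks, on average, they bought a product before its sales peaked. The data layout and the function name are hypothetical, and a real analysis would have to separate genuine early adopters from random noise.

```python
from collections import defaultdict


def early_adopter_scores(purchases, peak_week):
    """Average number of weeks a customer buys ahead of a product's sales peak.

    purchases: iterable of (customer, product, week_of_purchase)
    peak_week: dict mapping each product to the week its sales peaked
    """
    leads = defaultdict(list)
    for customer, product, week in purchases:
        leads[customer].append(peak_week[product] - week)
    return {customer: sum(v) / len(v) for customer, v in leads.items()}


purchases = [
    ("anna", "p1", 2), ("anna", "p2", 5),
    ("ben", "p1", 10), ("ben", "p2", 12),
]
peak_week = {"p1": 10, "p2": 11}

scores = early_adopter_scores(purchases, peak_week)
print(max(scores, key=scores.get))
```

Customers with a consistently high score are candidates for the “preview” described above: what they buy now may hint at what many others will want later.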

In the end, however, there is no way around a team of fashion experts who can distinguish the mega trends from the “ordinary” ones. With Lokad, this expert group can leverage their insights on a much larger scale, backed by statistically sound computational power, minimizing costs for any brand or marketplace. This is what we refer to as augmented human intelligence, putting the enablement of the expert group as the ultimate goal of our work.

This article was first published on LinkedIn on October 14, 2017.

Disclaimer: Katharina currently works as a supply chain scientist at Lokad.

Bits and Pieces

A personal blog on topics related to Data Science

Month: May 2018