The First Year of Building a BI Team at Stylight
Looking back on one year of progress, results and lessons learned.
What is the business value of freeing up over 120 hours of wasted decision-maker time every week? This is one of the impressive results that Dr. Konstantin Wemhöner and his team have accomplished in their first year of collective work, establishing a dedicated BI team at their company.
Konstantin is Head of Business Intelligence at Stylight, an internet startup headquartered in Munich. When he joined in June 2014, the company was already over 100 people strong, operating globally with offices in London and New York. In the following interview, he shares some details about the first year at the company, lessons learned along the way and his plans for the future.
(If you would like to get a quick impression of key takeaways from the interview, check out this summary of my personal highlights.)
Stylight is Europe’s leading fashion aggregator operating at a global level with one goal: to help aspiring women evolve their style through the power of shoppable inspiration. Today Stylight is available in 15 countries all around the globe. Its headquarters are in Munich, with two branch offices in London, United Kingdom, and New York, USA.
Stylight was founded in Munich (Germany) in November 2008 by the four friends and students Anselm Bauer, Benjamin Günther, Max-Josef Meier and Sebastian Schuon. Their goal was to make the fashion landscape more accessible for everyone. The four created an easy-to-use fashion search tool for an entrepreneurship program. This enterprise has grown into a successful e-commerce company with a team of over 200 employees from all around the world. Stylight currently has over 350 partner shops, 6,000 brands on offer and more than 10 million visits per month worldwide. Investors of Stylight include Holtzbrinck Ventures, Tengelmann Ventures and Seven Ventures.
For more information, go to www.stylight.com.
Dr. Konstantin Wemhöner studied human biology and economics. He did his PhD in neurosciences on the role of ion channels in cardiac arrhythmia and epilepsy. From consulting, he transitioned to media optimization (TV, print and online), and business development for a mid-sized pharmaceutical company. In 2014 he joined Stylight to lead the Business Intelligence department.
In The Beginning
What triggered the formation of a dedicated BI team?
My impression is that people realized that data as a dedicated entity in the company should be valued more. Most people were working with data already, but no one had time to focus on what was possible by connecting all available data. Having a dedicated workforce with an overview and who are able to consult others on data-related topics was the intention behind building the team.
When you started at the company one year ago, what was the general situation?
The idea that led to my hiring was the need for someone who understood the business aspects of data but also had a solid foundation in statistical evaluation. Understanding the significance of observed events, for example – what events must be taken seriously and what might be explained by random movement. When I arrived, Stylight already employed one data scientist who was handling existing data. His work was mostly project-based, building custom tools and performing certain analyses. BI as a department did not exist back then. The initial BI team was the two of us: me in the role of an analyst, and him filling the data scientist/data engineer roles.
The data was stored in different systems at the time. Production databases for internal and external services were located on MySQL machines, and we relied a lot on Google Analytics. Apart from that, each department had its own online reporting tools, which were working completely independently of the company’s other tools. Analyses were performed by combining data from different data sources, which was often done manually. This was especially complicated when trying to answer larger questions on topics that needed to be viewed above a single department level.
How important was data for decisions at Stylight when you started?
There’s a quote by one of the company founders which I like very much: “When four guys start a fashion company, they must rely on data.” One guiding principle at Stylight is that decisions should be founded on data. That doesn’t mean that data completely controls all actions, but it does imply that you are not doing things blindly. Sometimes, it’s worth making a decision that has a short-term negative impact according to the data if the expected long-term value of the decision warrants it.
When I started, people did not have access to as much relevant information, either because there were no dedicated resources to do the necessary data work or because making the connections across important pieces of data was not possible from a day-to-day operational point of view. Too much manual work was involved when handling data. Making connections across the boundaries of departments or even seeing the opportunities to link data for a better view was challenging.
What first steps did you take after joining the company?
I needed to find out what pieces were already in place. There were two main parts:
The first was a financial report that needed to be sent out on a daily basis. It contained many metrics that were meant to communicate the economic success of the company. The responsibility for providing these reports was passed to me. My first job was getting a quick understanding of where the numbers came from, what they meant and making the connection between the numbers and the business model that generated them.
The second part was getting an understanding of what data was available in the company. I had to determine who was producing which parts of it, and have a deeper understanding of each. It was a whole lot of data.
I also had to understand the company’s internal operations, like the contexts in which people were working, the positions of the departments in the value chain, and how they cooperated and communicated with each other. Finally, I needed to answer the question of how we in BI could make things measurable across the company and inside each department.
We have made good progress in the past year, but the process is still ongoing. BI, at least in my opinion and in the form in which we understand the term at the company, is in a great position to assist departments to communicate with each other.
What qualities of the company have most influenced your work?
First, we are an internet-based company, which implies that many aspects are pretty technical in nature and every department is already used to working with data. This helps to make BI work simpler, because we can just say that we need access to data that already exists instead of saying, “We need specific data from you, so please get back to us when you pull all the pieces together.” Of course, sometimes we still have to track down data. When we launched our magazine, for example, some details were not being tracked in the best way possible. Another example is our current process of tracking our site users. There is a lot of value in knowing who your customers are and providing them with the best possible user experience. Data can do a lot to achieve that.
One of our assets is the fact that the founders are deeply rooted in the tech scene in Munich and throughout Germany. Their connections with other internet companies make it easier for us to get in touch with our peers in the BI area. I can’t stress enough how helpful that is.
What was usually the most work-intensive part of dealing with data?
The most work-intensive part was aggregation – actively searching for the parts of the data that one actually wanted to see. That meant trying to get an overview and an understanding of data that lived outside of your own department, or finding out what data was available in the company and what the meaning behind the numbers was. A clear definition of certain metrics was lacking, especially because some metrics that were used across the company had different data origins in single departments and were calculated independently. There was much confusion, and much effort went into understanding why the numbers of one department deviated from the numbers that others had arrived at, although both wanted to measure the same things.
What were the biggest challenges in the past year?
When I started, I was relatively new to the field of BI in general. During my university studies, I acquired useful skills, such as analyzing data and understanding statistics, but not an understanding of the technical aspect of BI.
The engineers on the team build technical solutions and maintain them afterwards, and they are also responsible for thinking about the details of what is being built. If I just went ahead and dictated that I needed a certain data warehouse with a specific ETL tool and some more technical details, they would probably still build it, but a) they would be less motivated, and b) I might be imposing an inferior solution because I lacked the level of understanding that an engineer has.
It was a challenge for me to get to a sufficient level of technical understanding so that I was able to participate in discussions about technical aspects. Gaining acceptance among engineers, not only from my team but in general, was not easy. I was a non-engineer who had to join the discussion and communicate on a sufficiently technical level. Acquiring enough knowledge to get to a point where engineers felt comfortable with my understanding took a while. My goal was to have an engineer who talked to me think something along the lines of, “He understands what’s happening here. He doesn’t get all the details, but we can talk about it and it helps both sides.”
How has working with data changed and what are the implications?
We are enabling cross-department views on day-to-day operations. We are supporting coworkers by taking care of analytical work that can be automated so our coworkers have more time to focus on their main job goals, which they are actually good at and love doing, instead of copy-pasting data every morning. People can use this expanded information that has not been available before in their daily work.
The feedback I’m getting in general is that our work as a BI team and the value that we helped create is a factor in that success.
What were noteworthy “small wins”?
Those were among the first projects where we were able to create value for the company. There are multiple examples.
Our Business Development department relies on a certain kind of report. Previously, everybody from the department had to aggregate data manually from around four to five data sources on a daily basis. They had to search for the data they needed and enter it, everything by hand. By automating a big chunk of this effort, the data was ready for them each morning without much effort. That saved about half an hour every day for each team member. With ten people involved, it’s easy to estimate how much working time was lost every single week.
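The estimate hinted at above can be worked out in a few lines. This is only a back-of-the-envelope sketch: the half hour per day and the ten people come from the interview, while the five-day working week is an assumption that is not stated anywhere in it.

```python
# Rough estimate of the weekly time saved by automating the daily report.
# From the interview: about half an hour per day, ten people involved.
# Assumption (not stated in the interview): a five-day working week.
HOURS_SAVED_PER_PERSON_PER_DAY = 0.5
TEAM_SIZE = 10
WORKDAYS_PER_WEEK = 5  # assumed


def weekly_hours_saved(hours_per_day: float, people: int, days: int) -> float:
    """Return the total hours saved per week across the whole team."""
    return hours_per_day * people * days


hours = weekly_hours_saved(
    HOURS_SAVED_PER_PERSON_PER_DAY, TEAM_SIZE, WORKDAYS_PER_WEEK
)
print(f"Estimated time saved: {hours:.0f} hours per week")
```

Under these assumptions, this single report accounts for roughly 25 hours per week.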
Another example has to do with product processing. The goal there is to apply “tags” to products which come from different online shops and lack information that we need. These tags might describe the brand, the pattern or the color of a product. Those are subsequently listed on the site and are used for navigation, for search and to divide everything into categories. We created a visualization in Tableau, which made it possible to see on a daily basis a very high-level view of the global state of tagged products. From there you could navigate the information and look at the products in each country or in single categories to get a better understanding. If you wanted to get this level of insight and overview previously, even with simple queries, it would require about three to four hours of manual effort per day. After we were done, the data was ready to be explored each morning. After about fifteen minutes of interaction, the person managing product processing knew how the previous day had gone and was better able to plan how products or resources should be best allocated.
There is one thing which I would rather have done differently in hindsight. When we started out, we planned our approach and decided that we wanted to go through the company, department by department, following the value chain of the company. That decision resulted in us focusing on working with a single department and creating significantly less value for all the other departments in the long run, apart from small requests. A more sensible approach would have been to get an overview of what low-hanging fruit there is across the company and to help each department achieve at least one quick win. That would have resulted in everybody getting a bit of relief as soon as possible. Currently, that’s the approach we are executing.
In summary, what has been achieved in the past year?
From a global perspective, we helped the company arrive at a clear understanding of what BI is and what BI can do for everybody. We’re now at a point where if somebody encounters data-related problems or needs help, we are directly addressed and are happy to provide support.
From my point of view, our colleagues now better understand the impact of using data as a foundation for their actions. Sometimes, if they don’t have the data, they get in touch with us right away since there is an understanding that BI can help everybody in the company. This works for business units which are very close to the available data naturally, but also for projects like content campaigns. We can use our data to help them research current trends and choose related topics or use some of our findings as starting points for stand-alone articles.
So, we were able to create an understanding of what BI can do for the company and give departments the opportunity to actually use it. From the technical side, I believe we have developed an understanding of what our setup should look like. We have a clear vision of not only what the BI department wants to provide in the future, but also what it will not deal with. For example, reporting is not a goal in and of itself. Our purpose is not to create reports nobody uses. Instead, people from single departments should be enabled to get all the numbers that they currently need by themselves. That’s where we want to arrive. After we are done connecting everything to our new data warehouse, the next steps will lead us closer to self-service BI.
We want to make the numbers accessible and reliable. Temporary data inconsistencies will happen in any company – over and over. BI is there to validate that those numbers are consistent. If you are losing 10% of the click data for technical reasons, you might not realize this when looking at that issue from a business point of view. You might look at the report and think, “Well, that must have been a bad day.” Our task is to make sure that the data quality is high.
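A data-quality check of the kind described above could look roughly like the following sketch. It is not Stylight’s actual implementation: the function name, the seven-day window, the 10% threshold and the click counts are all made up for illustration. The idea is simply to compare each day’s click count against the median of the preceding days and flag days that fall suspiciously short.

```python
from statistics import median


def flag_suspicious_days(daily_clicks, window=7, max_drop=0.10):
    """Return the indices of days whose click count lies at least
    `max_drop` below the median of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_clicks)):
        baseline = median(daily_clicks[i - window:i])
        if baseline > 0 and daily_clicks[i] <= baseline * (1 - max_drop):
            flagged.append(i)
    return flagged


# Made-up click counts; the last day loses a bit more than 10% of its clicks.
clicks = [1000, 1010, 990, 1005, 995, 1000, 1002, 998, 880]
print(flag_suspicious_days(clicks))  # prints [8]
```

A baseline this naive ignores weekly patterns and campaigns, so a flagged day is only a prompt to check whether the drop has a technical cause or really was “a bad day”.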
How does your work influence the working days of your non-BI colleagues?
In many cases, we assist where departments hit their knowledge limit or encounter a new problem during their work. We try to take a bit of workload off them so they can focus on their daily business while we take the time to dig deeper. Often, we are operating between departments, especially when problems occur or new concepts need to be established.
We encourage discussions and in some way even force some to happen. I believe, as an operations department such as BI, you have a better feeling for the fact that some processes which seem to be specific to that department actually reach far beyond single departments. Let me clarify: the departments know this for a fact, but the daily awareness just isn’t there if you are not facing it on a daily basis.
Naturally, our work influences people if the reports we provide are not up to date. That’s a pretty negative impact, especially because people are relying heavily on what we deliver to them and base their work on it. Of course, they can do their jobs without having the reports, just as they did before, but now that they know what is possible and how useful it can be, they don’t really want to miss it in their daily work because it is genuinely helpful and beneficial.
What aspects of your work are most important to you?
One aspect which is very important to me is the interaction with other departments. It’s crucial to find ways to get everybody who is directly involved with various projects completely on board. This is especially true in a company that is steadily growing and where new people are joining on a regular basis. You must make sure that everybody stays on the same page. Of course, you want the departments to have a degree of independence as they are working on distinct things.
It’s important to start out with very clear explanations of how data should be passed to us. If I were approaching the topic over again with the knowledge that I now have, one of the first aspects I would take care of is communicating this more clearly: Look, this is the data warehouse we are going to use, that’s the data handover point, and we will need to get the following information. An example would be the number of products which are listed on the site – we don’t really care how the data is generated, but we need to have it available. It’s about defining “until here you are responsible, and starting from that point we take care of things and expect them to look a certain way.”
What is your most recent result?
Something that we are currently working on is to move our Redshift data warehouse cluster from a testing phase into production. It will be immensely helpful once it’s finished.
From an analytical point of view, we are currently supporting the launch of the magazine by automating reports and making the underlying queries sufficiently performant. This way, everybody will actually want to look at the data daily without having to deal with frustrating waiting times. Instead, when you arrive in the morning, the information is already there and you can see how things are going.
To put it into perspective, waiting for a query to complete is not the worst case. What would be really frustrating is when part of the data simply fails to reach you as a user. If this happens, you will be looking at it once per week at most, or not at all. In the areas where people are working less with data on a daily basis, it’s especially important that they develop an intuitive understanding that the information is not just there to look fancy, but to help them with their daily work. Getting people to make working with data part of their daily routine and helping them realize the value of it is very important.
This leads to more eyes on the data, which is a definite upside. In case there are some strange events, such as a reduction in sessions or a rising bounce rate, people will notice and help everybody react faster. The data has to be accurate and be delivered in a timely fashion as a prerequisite.
What are the greatest challenges for you right now?
First, a certain amount of communication and negotiation is needed, depending on the department. It takes time to establish an understanding that it’s the department’s job to hand their part of the data over to us, and we only can take over from a certain point. We are not the people who are running around and retrieving everything.
Additionally, now that everything is in a kind of mid-stage of development, the expectations about what we can deliver are sometimes a bit higher than reality permits. This means that we are managing expectations but also telling people what the long-term perspective is. We want to keep morale up but make sure that everybody understands what stage we’re at, where we are heading, and at what pace.
The topic of self-service BI is a larger challenge at the moment. We’re trying to give people the necessary tools and deliver training so they will be able to make the best possible use of them.
You already started on the training efforts with internal Google Analytics workshops. What were the results? Has it reduced the number of questions people have?
There were even more requests, but that’s great! That shows me that people are making good use of the knowledge and are actively using the available tools. Getting asked more elaborate questions means that people are approaching the limits of their current knowledge and are interested in going further. More interest and more questions were the results.
Most importantly, this means that people are more aware that we can help them in these matters, and that we like doing it. That has to do with how I see BI: you have to have a certain service mentality, because you are basically working to support others. Your impact on their work is what makes your work become good work. In that sense, it’s important that a) you are approachable for everybody, and b) that you completely understand that your work generates value by empowering others to perform better. You don’t see the results right away. You may have created a new kind of report, but you will not be the one who uses it to achieve an increase of 100k Euro in revenue – only indirectly. People’s direct feedback is the most immediate way to measure your work in the short term.
You are currently working on growing your team. How are you doing?
Stylight is growing at a fast pace across all departments. In the short time since I started here, we have almost doubled in employee numbers. Our growth means we are in constant need of good people, including in our BI team. However, as we are a fashion company, a person looking from outside might not immediately realize that this company might also need data scientists. In my opinion, many potential applicants for BI are therefore still not aware of the exciting field of work Stylight could provide them with. We have to work on that.
What are your long-term goals?
A concept I have frequently encountered is that BI often happens in three steps, if we disregard work at the data level. The first step is, as a starting point, to make it possible for each user and each department to take care of their reporting needs themselves. The second step is to provide guidance in addition to what has been put into place and optimized until then. And in the third step, when everything listed previously is covered to about 90%, which is probably a realistic expectation, we can say: “Now BI is able to work exploratively and will help you generate ideas.” You do your daily work, and we take a look at the “crazy shit”.
For example, we look at issues like user clickout behaviour (that’s what we earn money on) in correlation with the weather, day of the week or time of day. The result of such an analysis might be that we decide something like “on a Saturday afternoon, at 13 degrees Celsius, we don’t need to bid quite so high on PPC advertisements compared to a Sunday evening around 20:00.” Such work needs time, and you don’t know right away if it will lead to value being created. But those are the things that somebody in a data scientist or analyst role, I believe, really wants to do.
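To make this kind of exploratory question a little more concrete, here is a minimal sketch of how clickout rates might be compared across segments. Everything in it is hypothetical: the record layout, the helper `clickout_rate_by` and the eight sample sessions are invented for illustration, and a real analysis would need far more data plus a check of statistical significance before anyone changed a bid.

```python
from collections import defaultdict

# Hypothetical session records:
# (weekday, hour, temperature_celsius, clicked_out)
sessions = [
    ("Sat", 14, 13, False),
    ("Sat", 14, 13, False),
    ("Sat", 15, 13, True),
    ("Sat", 15, 14, False),
    ("Sun", 20, 12, True),
    ("Sun", 20, 12, True),
    ("Sun", 20, 11, False),
    ("Sun", 21, 11, True),
]


def clickout_rate_by(records, key):
    """Group records using `key` and return the clickout rate per group."""
    totals = defaultdict(int)
    clickouts = defaultdict(int)
    for record in records:
        group = key(record)
        totals[group] += 1
        clickouts[group] += int(record[3])
    return {group: clickouts[group] / totals[group] for group in totals}


# Compare clickout rates by weekday.
rates = clickout_rate_by(sessions, key=lambda r: r[0])
for weekday, rate in rates.items():
    print(f"{weekday}: {rate:.0%} of sessions ended in a clickout")
```

Passing a different `key`, for example `lambda r: (r[0], r[1])` for weekday and hour, or a temperature bucket, gives the other cuts mentioned above without changing the helper.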
There are three goals we would like to achieve. The first is that every part of the company is taking care of their own data. There are clearly defined interfaces where data can be collected from outside of the department. The second is to provide self-service BI for everybody in the company, from the management level who can make active use of it to people who don’t strictly need it for their daily work. The third is providing internal BI consulting, which enables data scientists and analysts to tackle problems which are challenging and very interesting for them.
What do you think could be realistically achieved in the coming year?
That we will have all business-critical data sources connected to our central data warehouse and will have trained a bigger part of the company in the usage of our self-service tools. We want to achieve a clear understanding of how departments will be responsible for their own reporting and from what point we will step in. Also, we want a certain degree of freedom in our own work – being able to investigate exploratively from time to time, even if those inquiries might be fruitless.
What could have gone better?
Assimilating the mindset earlier, that data should be kept centralized and having people pass their data on to the BI team. I also wish that I had taken about half a day of time to work on a more strategic level on a regular basis. I was very involved in the day-to-day operations because I was filling the role of the analyst in the team. I should have moved a step back, looked at what was happening from a different perspective and thought about things strategically.
Is there something you wish you had either stopped or started doing much earlier?
One thing I wish I had started earlier was to grow the team. There was certainly a need to get more people on the team. We were working at the limit of our backlog pretty much constantly. I should have seen hiring as a more urgent matter earlier because it takes time to find good people.
As for what I could have stopped: me being engaged in operative activities. It would have helped to find somebody who would be responsible for performing the analyses. As mentioned previously, it would have helped me to let the department take care of immediate issues and find time to think strategically.
Before using Redshift, you had a time when you were using Tableau to aggregate and manage data from different data sources. How did this work out?
We took two to three months to evaluate different solutions. We wanted to be able to integrate data and create visualizations. We ended up using Tableau for both. Tableau can connect to many different types of data sources through its many database connector types. There is Google Analytics, for example, but there are also connectors for MySQL and many others. JDBC or ODBC is supported for custom connectors, or you can feed it with an ETL job.
Because it is possible to rename and connect data across data sources and to restructure pretty much anything inside of the application, we misinterpreted its power and wrongly tried to use it as a kind of data warehouse. We actually tried to maintain data inside of Tableau, which is something I really would not recommend. At the beginning of this year we switched to Redshift data sources. The huge performance increase and the peace of mind of knowing that no data will be lost have improved our situation a hundredfold.
At what point do you think it makes sense to start building a dedicated BI team?
Well, that depends on the business model and the exact definition of “BI team”. In my opinion, especially if you are an internet startup or a technical company, you should get somebody who understands what data means and what value it has on board as early as possible. From my current perspective, if I were to found a company, I would want to have somebody on the founding team who will be able to take care of this. For me, a perfect team would consist of three people: somebody who understands finances, someone who is extremely good technically, and a person who takes care of the data. As early as possible.
What’s important when working on improving the team itself internally?
Communication. There are things which are frustrating at times, and you have to make sure that everybody understands that such phases happen. Yes, there are times when the work might be less enjoyable, but the situation will improve in the long run. Once again, communication is important. Everybody on the team needs to keep their eyes open and constantly be on the lookout for potential problems and problem statements. On our team, I encourage everybody to voice their concerns as soon as they see something that might be going wrong or that the rest of the team can help solve.
A pattern that I often see is that startups who are in possession of data and want to start to make use of it go ahead and hire a data scientist who ends up having to deal with data engineering issues instead of working with the data. Would it have helped you to have a data engineer on your team from the get-go?
Yes, definitely. Luckily, the data scientist on my team was able to handle data engineering problems. But I don’t believe that those were the kinds of tasks that he likes the most.
I recently read an article about Zynga, namely the details about how they are defining data flows before anything goes live. Their approach is to answer “What do we want to measure?” beforehand. They are in the business of offering a digital product, as is Stylight, and put much thought into what will be needed to be optimized for it to perform well. It’s not an approach like “we start out and then we will see what can be tweaked”, but instead they plan in advance for the details that need to be measured and make sure that it is possible. Only then do they start working on the game.
In my opinion, it’s very helpful to have somebody on board who has a higher architectural understanding and who enjoys handling and managing data sources. I also believe that it’s a very specific kind of job which you must enjoy as well as be good at. Like a dentist who is genuinely happy and constantly thinks to himself, “This is the best profession ever,” but someone else looks at him and thinks, “That poor person has to look into mouths all day.” I definitely think that having someone dedicated to taking care of data engineering is very beneficial.
What were the most important things you did that got you to where you are today?
Networking, communicating and being persistent. When you are the head of a department, you are responsible for it. You can only be responsible if you have the control over the decisions being made. With regard to this, you should get to know what the team wants. You should not always give in if there is an opposing wind in the company, up to some point.
We are an ops department, we are a service department, but we also have a vision of how we want to work and what the department should be like. It’s quite important to be flexible, but not give in if demands go too far against the setting we need to do our best work. Of course, this vision has to be reviewed depending on what is happening as a whole in the company. BI should not be solely seen as a tool which can be used, but an active unit that needs to be given space to develop, and not just to be shaped passively by external influences.
What advice would you give to someone who is in a situation similar to the one you started from?
Start to network and communicate as soon as possible. Find out what others in the field are doing and what works for them.
Many things will need to be built and maintained in the long run. Make sure that the skills of the people on your team cover everything that is needed to make this possible. Those people should be genuinely interested in their respective areas. In my opinion, a dedicated analyst should join the team only at a later stage, but it would be best to have someone who can take care of the data engineering aspect and someone who can help to build the reporting infrastructure.
Seek exchange with peers so you are better able to judge what’s possible and what’s realistic. Collect information early on about what data is already present in the company. Look at what problems you might be able to address, but make sure to also think about problems completely detached from the available data. Fill up your backlog and prioritize.
Thanks for reading! I hope that you have received value from this interview. If you want to get in touch with Konstantin and talk data, feel free to drop him a mail at firstname.lastname@example.org. If you are currently building a data team at your startup and would like your first data scientist to be happy and productive, drop me your email address below and I will send a few helpful resources your way.