

The internet would have you believe that Data Science is simultaneously the sexiest job of the 21st century and the role most likely to become obsolete. Confused? Yes, me too.

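That doesn't bother me at all. As Data Scientists we love probability, and we're never certain about anything. This means we're not very good at explaining what we actually do. What I can do is tell you what it's like to be a Data Scientist at KrakenFlex.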


The best part of being a Data Scientist at KrakenFlex is the problems that we get to work on. They're very complex problems that no one has solved before, and they contribute directly towards tackling climate change.

I am part of a team that is currently developing the software and algorithms that control electric car charging overnight in order to use the greenest and cheapest electricity (http://octopus.energy/intelligent-octopus/). This involves prediction (how many cars will need charging and how much energy do they need?) and optimisation (when should we charge them?). Both are extremely challenging data science problems, and both are new problems for the energy industry. Electrification in transport is a trend that brings a totally different electricity demand than we've experienced before. What's most exciting is that this demand can be flexible and, at KrakenFlex, we live for flexibility!

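What is flexibility? Think of someone plugging in their electric car when they get home from work at 6pm. At 6pm the electricity grid is under pressure, with lots of people using electricity as everyone gets home from work and cooks dinner. This person's car most likely doesn't need to charge straight away. Instead, we can wait until the middle of the night, when demand is much lower and everyone is asleep. This is what I mean by flexibility: the ability to turn things on and off at different points in the day. This is going to become more important because of how many electric cars we will soon have on our roads, coupled with the electrification of heat as we move away from traditional gas boilers.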

By moving demand away from the peak (between 4pm and 7pm) we can help reduce the number of occasions when fossil fuel power plants need to be powered up to meet demand. A few hundred thousand cars charging at home at the same time would need the equivalent power of a large power station.
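
To make the idea of shifting demand concrete, here is a minimal Python sketch. It is an illustration only, not how our production system works: the `Slot` class, the `schedule_charging` function, the 7 kW charger and the prices are all assumptions made up for this example. It simply fills the cheapest half-hour slots first until the car has the energy it needs.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    start: str    # half-hour slot start time, e.g. "18:00"
    price: float  # hypothetical price in pence per kWh


def schedule_charging(slots, energy_needed_kwh, charger_kw=7.0, slot_hours=0.5):
    """Greedily pick the cheapest slots until the energy need is met.

    Returns a dict mapping slot start time -> kWh delivered in that slot.
    """
    per_slot_kwh = charger_kw * slot_hours  # most energy one slot can deliver
    plan = {}
    remaining = energy_needed_kwh
    # Cheapest slots first; sorted() is stable, so ties keep their time order.
    for slot in sorted(slots, key=lambda s: s.price):
        if remaining <= 1e-9:
            break
        delivered = min(per_slot_kwh, remaining)
        plan[slot.start] = delivered
        remaining -= delivered
    return plan


# Made-up prices: expensive at the evening peak, cheap overnight.
slots = [
    Slot("18:00", 35.0),
    Slot("18:30", 34.0),
    Slot("23:30", 12.0),
    Slot("00:00", 9.0),
    Slot("00:30", 8.0),
    Slot("01:00", 8.5),
]
plan = schedule_charging(slots, energy_needed_kwh=10.0)
print(plan)
```

With these numbers the whole 10 kWh lands in the overnight slots and nothing is drawn at 6pm. A real optimiser also has to respect when the car is plugged in, when the driver needs it, and the carbon intensity of the grid, which is where the hard part begins.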

Our other Data Scientists at KrakenFlex are also working on cool, complex problems, including how to optimise grid-scale batteries, researching how a local energy market might work, and building new functionality into our optimisation tooling.


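Team names. Data Scientists are spread across KrakenFlex in cross-functional teams. I am part of the "Funky Gibbons", home to 3 software engineers, 2 Data Scientists and our product guru. We also have a Data Science guild where all the Data Scientists come together to share knowledge or help each other out. Working this way provides lots of great learning opportunities. KrakenFlex encourages Data Scientists to code like developers and helps them get there through the abundance of pair programming opportunities (where you team up and code with someone else). Not to be outdone, the Data Science guild has its own logo, which I still haven't figured out, but there's a unicorn in it. We can't be accused of taking ourselves too seriously!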

And as we work in energy flexibility, we also get to work flexibly between home and the office.


If I were to try and break down a typical workflow for a Data Scientist at KrakenFlex, it would look something like this:

Step 1 - Discovery

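This is where we figure out the details of the problem we're trying to solve. We use Miro, which is like an online whiteboard, and we ask lots of questions to tease out what we really want to do. We tend to do this as a team, which makes it interactive and conceptually fun. It is also important to understand why we're solving the problem (in case it doesn't need solving at all).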

Step 2 - Proof of Concept

The next stage of the process for the Data Scientists will likely involve using a Jupyter Notebook to test out the ideas suggested in discovery and start building a proof of concept. This is where we can really start to dig into the details of how an approach or algorithm works and whether the results are likely to meet our requirements. This stage is often very iterative and we will regularly give results to the product team or customer for feedback. We can also make use of components of our optimisation framework, which we have been building out since the company was founded. There aren't rules here; we're just looking for the best algorithm or approach to solve the problem.

Step 3 - Build

The third stage is likely to involve building out the best version from the proof of concept work into the main code base. This is where we will start working closely with the software engineers again to build out the interactions, data feeds and tests that will reliably catch any bugs in our implementation. This work will no longer take place in a notebook, but it will still be Python based.

Step 4 - Monitoring

Once the product is in production, we can't just down tools and relax. We need to think about how well our solution is performing (how much carbon and money we saved) and monitor this over time. Alongside our software developers we also look at how best to store and query our data so that other colleagues can make use of it. Take our vehicle charging optimisation, for example: a customer might get in touch about a problem with their car charging overnight. It's important that the customer support team can easily access any data to help them explain to the customer what went wrong and provide a good customer experience.
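
As a rough illustration of the "how much carbon and money we saved" question, the sketch below compares an optimised charging profile with a plug-in-and-charge-immediately baseline over the same time slots. The `savings` function and every number in it are assumptions invented for this example, not our real monitoring code or real prices.

```python
def savings(baseline_kwh, optimised_kwh, price_p_per_kwh, carbon_g_per_kwh):
    """Compare two charging profiles over the same time slots.

    Every argument is a list with one value per slot.
    Returns (pence saved, grams of CO2 saved); positive values mean the
    optimised profile did better than the baseline.
    """
    lengths = {len(baseline_kwh), len(optimised_kwh),
               len(price_p_per_kwh), len(carbon_g_per_kwh)}
    if len(lengths) != 1:
        raise ValueError("all inputs must cover the same slots")
    cost_saved = sum(
        (b - o) * p
        for b, o, p in zip(baseline_kwh, optimised_kwh, price_p_per_kwh)
    )
    carbon_saved = sum(
        (b - o) * c
        for b, o, c in zip(baseline_kwh, optimised_kwh, carbon_g_per_kwh)
    )
    return cost_saved, carbon_saved


# Illustrative numbers only: one evening-peak slot, then one overnight slot.
baseline = [7.0, 0.0]    # charge as soon as the car is plugged in
optimised = [0.0, 7.0]   # same energy, moved overnight
prices = [30.0, 10.0]    # pence per kWh (made up)
carbon = [250.0, 100.0]  # grams of CO2 per kWh (made up)

cost_saved, carbon_saved = savings(baseline, optimised, prices, carbon)
print(cost_saved, carbon_saved)
```

Tracking a pair of numbers like this per customer per night is also the kind of data a support team could look up when someone asks why their car charged when it did.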


The Funky Gibbons have a soft spot for online board games (there's nothing like trying to play Pictionary with a touchpad). Or, like last week, we head to a social to see who's a darts ace (naming no names). We also regularly hold retros where, as a team, we celebrate what went well and propose ways we can improve based on the last sprint. Every project is technically different and we're constantly making tweaks to the way we work too.


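If you want to get into data science, my advice is always simple. Find a problem that interests you and start trying to solve it with data. Start with Python, and start with a Jupyter Notebook. I think working on your own problem is much more motivating and more representative of the challenges you'll face in reality. For example, I love energy and sport, so I would consider something like: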

  • Use your own smart meter data to predict your electricity usage.

  • Use the exercise data from your smartwatch to analyse your heart rate and training performance.
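
If the smart meter idea appeals, a first attempt could be as small as the Python sketch below. It assumes you have exported your readings as (hour of day, kWh) pairs; the `seasonal_naive_forecast` name and the readings are made up for illustration. It predicts each hour of tomorrow as the average of that hour on previous days, which is a simple baseline to beat before reaching for anything fancier.

```python
from collections import defaultdict
from statistics import mean


def seasonal_naive_forecast(readings):
    """Forecast each hour of the next day as the mean of that hour on past days.

    `readings` is a list of (hour_of_day, kwh) tuples covering one or more days.
    Returns a dict mapping hour_of_day -> forecast kWh.
    """
    by_hour = defaultdict(list)
    for hour, kwh in readings:
        by_hour[hour].append(kwh)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


# Made-up readings for two days, three hours each, purely for illustration.
readings = [
    (7, 0.4), (18, 1.2), (23, 0.2),  # day 1
    (7, 0.6), (18, 1.0), (23, 0.2),  # day 2
]
forecast = seasonal_naive_forecast(readings)
print(forecast)
```

From here you could compare the forecast with what actually happened the next day, then see whether weekday, weather or holidays help you do better.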

There are lots of different things you could do. Alternatively, you can find problems that are more structured and already set out in formal courses. These are great, but remember that part of the job of a Data Scientist is understanding the problem that needs solving. I encourage you not to be put off by what might seem like a mountain of knowledge you need to acquire. You'll be surprised at how quickly you can start contributing.

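You may have noticed that I haven't mentioned machine learning, statistics or AI. Sure, these are all terms relevant to the world of a KrakenFlex Data Scientist, but there is more to what we do than these buzzwords might suggest. Whether or not you believe what the internet says about Data Science, I hope sharing a little bit about a day in the life of a Data Scientist at KrakenFlex has been helpful. There's a tidal wave of data science problems that need solving to get the world to net zero, and if this sounds exciting to you, you won't be disappointed.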

