My views on anything

One model to predict them all (failures that is)


Having worked on various predictive maintenance scenarios over the past three years, I have noticed two common pitfalls. The first is a lack of business understanding, in particular of the cost of being right versus the cost of being wrong (I will write about that later). The second is the idea of a single model that can predict everything, under all conditions, no matter what. I often see the latter when working with novice data scientists. They spend most of their time re-training their model in pursuit of a perfect F1 or accuracy score instead of knowing when to stop and rethink their strategy. This problem becomes even more severe when you deal with very imbalanced datasets.
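
To make the imbalance point concrete, here is a minimal sketch in base R. The numbers are made up for illustration (1,000 observations, 10 of them failures), and the "model" is a hypothetical one that never predicts a failure at all.

```r
# Hypothetical labels: 1000 observations, of which 10 are failures (1 = failure).
actual <- c(rep(1, 10), rep(0, 990))

# A trivial "model" that never predicts a failure.
predicted <- rep(0, 1000)

accuracy <- mean(predicted == actual)

# Confusion-matrix counts for the failure class.
tp <- sum(predicted == 1 & actual == 1)
fp <- sum(predicted == 1 & actual == 0)
fn <- sum(predicted == 0 & actual == 1)

recall <- tp / (tp + fn)
# Precision is 0/0 when nothing is predicted positive; report 0 by convention.
precision <- if (tp + fp == 0) 0 else tp / (tp + fp)
f1 <- if (precision + recall == 0) 0 else 2 * precision * recall / (precision + recall)

cat(sprintf("accuracy = %.2f, recall = %.2f, F1 = %.2f\n", accuracy, recall, f1))
# prints: accuracy = 0.99, recall = 0.00, F1 = 0.00
```

An accuracy of 0.99 looks excellent, yet this model catches none of the failures, which is why chasing a single headline metric on imbalanced data can be misleading.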

Why does this matter?

Each device and each sensor has its own unique data footprint, with its own noise distribution. If your pool is sufficiently large, you will be able to detect generic trends and make generalised predictions. However, you may lose the subtle differences between machines along the way, and with them predictive power. Had you performed some clustering analysis up front, you might have decided that 3 or 5 models would serve your problem much better. Or perhaps simply knowing whether your data is anomalous is already enough to drive business value.
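
As a rough sketch of what such an up-front clustering could look like, the base R example below groups machines by summary features using k-means. Everything in it is an assumption for illustration: the data is simulated, and the feature names (sensor_mean, sensor_sd) and the choice of k-means are mine, not a recommendation for any particular dataset.

```r
set.seed(42)

# Simulated per-machine summary features (hypothetical): three groups of
# 20 machines, each group with a different signal level and noise level.
n_per_group <- 20
machines <- data.frame(
  sensor_mean = c(rnorm(n_per_group, 10, 0.5),
                  rnorm(n_per_group, 20, 0.5),
                  rnorm(n_per_group, 30, 0.5)),
  sensor_sd   = c(rnorm(n_per_group, 1, 0.1),
                  rnorm(n_per_group, 3, 0.1),
                  rnorm(n_per_group, 5, 0.1))
)

# Standardise so that both features contribute equally to the distance.
features <- scale(machines)

# Total within-cluster sum of squares for k = 1..6 (for an "elbow" inspection).
wss <- sapply(1:6, function(k) kmeans(features, centers = k, nstart = 10)$tot.withinss)

# Fit the chosen number of clusters and attach the assignment to each machine.
fit <- kmeans(features, centers = 3, nstart = 10)
machines$cluster <- fit$cluster

print(round(wss, 1))
print(table(machines$cluster))
```

Where the within-cluster sum of squares stops dropping sharply is a hint at how many groups of machines, and therefore how many models, might be worth considering.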

What could be a solution?

As always, there is no single best solution. There will still be a trade-off in getting a model generalised enough to bring to production. What I have observed so far, at several customers, is that a combination of weak learners outperforms most of these highly specialised models. This effect is often even stronger when we take these “weak” models into field trials and/or production. Odds are your training data was never complete, and multiple weak models give you the flexibility to deal with that incompleteness.
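
For readers who want to see the mechanics of combining weak learners, here is a minimal sketch in base R. It is not the setup used at any customer: the data is simulated, each weak learner is a hypothetical single-feature logistic regression, and the combination rule (averaging predicted probabilities) is just one of several options. It shows how the pieces fit together and makes no claim about performance.

```r
set.seed(1)

# Simulated data (hypothetical): three sensor features and a binary failure label.
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- rbinom(n, 1, plogis(x1 + x2 + x3 - 1))
dat <- data.frame(y, x1, x2, x3)

train <- dat[1:350, ]
test  <- dat[351:500, ]

# Each weak learner only sees a single feature.
formulas <- list(y ~ x1, y ~ x2, y ~ x3)
learners <- lapply(formulas, function(f) glm(f, data = train, family = binomial))

# One column of predicted failure probabilities per learner.
probs <- sapply(learners, function(m) predict(m, newdata = test, type = "response"))

# Combine by averaging the probabilities, then threshold at 0.5.
ensemble_prob <- rowMeans(probs)
ensemble_pred <- as.integer(ensemble_prob > 0.5)

print(table(predicted = ensemble_pred, actual = test$y))
```

Because each learner depends on only part of the input, a missing or drifting sensor degrades one vote rather than the whole prediction, which is one way to read the flexibility argument above.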

Visual Object Tagging Tool and Microsoft Cognitive Toolkit


The Visual Object Tagging Tool (VoTT) offers a great set of features to kickstart your Fast R-CNN modelling with the Microsoft Cognitive Toolkit (formerly known as CNTK). It offers an end-to-end solution, from tagging your data through to validating your deep learning model. After loading a set of images into VoTT, you tag them, and the tool lets you export them in a format ready for your Microsoft Cognitive Toolkit experiment.


Links

VoTT on Git: https://github.com/CatalystCode/VOTT

Fast R-CNN documentation: https://docs.microsoft.com/en-us/cognitive-toolkit/Object-Detection-using-Fast-R-CNN

 

Digital Transformation of Services at Tetra Pak


Digital transformation is driving changes in behaviour and business models globally. I’ve had the pleasure of working with Tetra Pak Services and seeing first-hand how data science and connected devices are shaping our future.

 

The video below shows a demo (from Hannover Messe 2017) of what our solution looks like.

Using Microsoft Azure Blob Storage from within R using AzureSMR


One of the great new features of AzureSMR is read and write access to Azure Blob Storage. It works in much the same way as it does from Python.

Shameless copy from the README:

In order to access Storage Blobs you need to have a key. Use azureSAGetKey() to get a key, or alternatively supply your own key. When you provide your own key you no longer need to use azureAuthenticate(), since the API uses a different authentication approach.

sKey <- azureSAGetKey(sc, resourceGroup = "Analytics", storageAccount = "analyticsfiles")

To list containers in a storage account, use azureListContainers():

azureListContainers(sc, storageAccount = "analyticsfiles", containers = "Test")

To list blobs in a container, use azureListStorageBlobs():

azureListStorageBlobs(sc, storageAccount = "analyticsfiles", container = "test")

To write a blob, use azurePutBlob():

azurePutBlob(sc, storageAccount = "analyticsfiles", container = "test",
             contents = "Hello World",
             blob = "HELLO")

To read a blob in a container, use azureGetBlob():

azureGetBlob(sc, storageAccount = "analyticsfiles", container = "test",
             blob = "HELLO",
             type = "text")

AzureSMR: handle your Azure subscription with R


A great new package for people who use Microsoft Azure as their platform of choice and love R. With AzureSMR you can manage the following services:

  • Azure Blob: List, Read and Write to Blob Services
  • Azure Resources: List, Create and Delete Azure Resources. Deploy ARM templates.
  • Azure VM: List, Start and Stop Azure VMs
  • Azure HDI: List and Scale Azure HDInsight Clusters
  • Azure Hive: Run Hive queries against an HDInsight Cluster
  • Azure Spark: List and create Spark jobs/Sessions against an HDInsight Cluster (Livy)

Install it from your interactive shell:


#Install devtools
if(!require("devtools")) install.packages("devtools")
devtools::install_github("Microsoft/AzureSMR")
library(AzureSMR)

GitHub: https://github.com/Microsoft/AzureSMR

Source: http://blog.revolutionanalytics.com/2016/12/azuresmr.html

The emerging technology hype cycle by Gartner for 2016 covers three key technology trends:


Transparently immersive experiences: Technology will continue to become more human-centric to the point where it will introduce transparency between people, businesses and things. This relationship will become much more entwined as the evolution of technology becomes more adaptive, contextual and fluid within the workplace, at home, and interacting with businesses and other people.

The perceptual smart machine age: Smart machine technologies will be the most disruptive class of technologies over the next 10 years due to radical computational power, near-endless amounts of data, and unprecedented advances in deep neural networks that will allow organizations with smart machine technologies to harness data in order to adapt to new situations and solve problems that no one has encountered previously.

The platform revolution: Emerging technologies are revolutionizing the concepts of how platforms are defined and used. The shift from technical infrastructure to ecosystem-enabling platforms is laying the foundations for entirely new business models that are forming the bridge between humans and technology.

http://www.gartner.com/newsroom/id/3412017

 
