17 Feb On location at Wibautstraat 202, 1091 GS Amsterdam, The Netherlands

Streamlining Data Science Workflows with a Feature Catalog

February 17, 2023 / 3:30 pm - 4:15 pm

With the democratization of data via data lakes, data science teams increasingly rely on custom model pipelines for data preprocessing and feature engineering. As a result, it becomes difficult to reuse features, or even to compare similar features, across teams. This is a significant challenge: it leads to duplicated work and to ambiguous definitions that cause confusion and risk wrong conclusions. For example, two data science teams might both report an average click rate to the marketing team, where one team excludes clicks made by robots and the other doesn’t. Unaware of this difference in interpretation, the marketing team could make badly misguided decisions.

Host

Roel Bertens, Data Scientist, Xebia